basetenlabs/Workshop-TRT-LLM

Language: Python

Stars: 25

Forks: 15

Open issues: 0

Created: 2024-06-18T18:31:33Z

Pushed: 2024-06-26T04:11:03Z

Default branch: main

Fork: no

Archived: no

README:

AI Engineer World's Fair TensorRT-LLM Workshop

View Slides

Welcome to *From model weights to API endpoint with TensorRT-LLM* presented at The AI Engineer World's Fair!

We're your hosts, Pankaj Gupta and Philip Kiely from Baseten, and we're thrilled to have you here today.

This workshop has three live coding components, which correspond to numbered folders:

1. Building a TensorRT engine manually with TensorRT-LLM
2. Building an engine automatically on deployment with Truss
3. Benchmarking deployed models

Specific instructions for each component are in the respective folders' READMEs.
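
As a rough illustration of what the benchmarking component measures, here is a minimal sketch of computing tokens per second (TPS) and time to first token from a streaming generation call. It is not taken from the workshop folders: `measure_tps` and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in for a real deployed endpoint that streams tokens.

```python
import time
from typing import Callable, Iterable, Optional


def measure_tps(generate: Callable[[str], Iterable[str]], prompt: str) -> dict:
    """Time one streaming generation call and report basic throughput numbers."""
    start = time.perf_counter()
    first_token_at: Optional[float] = None
    n_tokens = 0
    for _ in generate(prompt):
        if first_token_at is None:
            # Record when the first token arrives (time to first token).
            first_token_at = time.perf_counter()
        n_tokens += 1
    total = time.perf_counter() - start
    return {
        "tokens": n_tokens,
        "ttft_s": None if first_token_at is None else first_token_at - start,
        "total_s": total,
        "tps": n_tokens / total if total > 0 else 0.0,
    }


def fake_generate(prompt: str):
    # Hypothetical stand-in for a deployed model: yields one "token" per word
    # with a small artificial delay. Swap in a real streaming client here.
    for word in prompt.split():
        time.sleep(0.001)
        yield word


if __name__ == "__main__":
    stats = measure_tps(fake_generate, "from model weights to API endpoint")
    print(stats)
```

A real benchmark would repeat this over many prompts and concurrency levels; the sketch only shows how the two headline numbers are derived from a single request.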

Let's get some TPS!

— Pankaj and Philip