What does this repo signal mean?

Scaleway published scaleway/ai-pulse-nvidia-trt-llm (Python). A repository signal like this one surfaces tooling, evaluation, infrastructure, or model-adjacent work before it appears in a launch post. High-signal details: repo scaleway/ai-pulse-nvidia-trt-llm · language Python. onlylabs links this event to 1 captured evidence page and 6 related repo signals.

Scaleway Repo: scaleway/ai-pulse-nvidia-trt-llm

Captured source

GitHub/github.com/scaleway/ai-pulse-nvidia-trt-llm

scaleway/ai-pulse-nvidia-trt-llm repository metadata

published Nov 14, 2023 · seen 5d · captured 8h · http 200 · method plain

scaleway/ai-pulse-nvidia-trt-llm

Description: Sources and datasets to deploy NVIDIA TRT-LLM on Scaleway Ecosystem

Language: Python

License: Apache-2.0

Stars: 1

Forks: 1

Open issues: 0

Created: 2023-11-14T14:36:24Z

Pushed: 2023-12-29T16:41:17Z

Default branch: main

Fork: no

Archived: no

README: ![ai pulse banner](./docs/images/common/ai-pulse-banner.jpeg)

Efficient deployment and inference of GPU-accelerated LLMs

Introduction

NVIDIA TensorRT-LLM, which will be part of NVIDIA AI Enterprise, is open-source software that delivers state-of-the-art performance for LLM serving on NVIDIA GPUs. It consists of the TensorRT deep learning compiler and includes optimized kernels, pre- and post-processing steps, and multi-GPU/multi-node communication primitives. In this repository, you will find the sources NVIDIA used to introduce TensorRT-LLM at the first edition of Scaleway AI Pulse.

Guide Presentation

The workshop aims to introduce TensorRT-LLM features and capabilities and to walk through the steps needed to build and run your model in TensorRT-LLM on both a single GPU and multiple GPUs. We also use Triton Inference Server and the TensorRT-LLM Backend to deploy the engines generated by TensorRT-LLM.
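To make the deployment step concrete, here is a minimal sketch of how a client might query an engine once Triton Inference Server is serving it. It is not taken from the workshop sources. It assumes the server listens on `localhost:8000` and exposes a model named `ensemble` through Triton's HTTP generate endpoint, as in the TensorRT-LLM Backend examples; the helper names `build_generate_request` and `generate` are hypothetical, and your deployment may use a different host, port, model name, or set of input fields.

```python
import json
import urllib.request

# Assumptions (not from the workshop sources): Triton listens on localhost:8000
# and serves a model named "ensemble", as in the TensorRT-LLM Backend examples.
TRITON_URL = "http://localhost:8000"
MODEL_NAME = "ensemble"


def build_generate_request(prompt, max_tokens=64, base_url=TRITON_URL, model=MODEL_NAME):
    """Build (without sending) a POST request for Triton's generate endpoint."""
    payload = {"text_input": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=f"{base_url}/v2/models/{model}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, max_tokens=64, timeout=60.0):
    """Send the request and return the generated text; needs a running server."""
    request = build_generate_request(prompt, max_tokens)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # "text_output" is the field used by the backend examples; adjust if yours differs.
    return body.get("text_output", "")
```

With a server running, `generate("What is machine learning?", max_tokens=20)` would return the completion text. Keeping request construction separate from sending makes the payload easy to inspect without a GPU or a live endpoint.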

Getting Started

Let's start by setting up the Scaleway prerequisites and the complete environment. Go to [Setup](./docs/01-setup.md).