rtp-llm

Alibaba’s high-performance LLM inference engine (CUDA-focused) for production serving of diverse decoder architectures.

Why it is included

Listed under the #llm tag in TAAFT’s repository listings as Alibaba’s open-source, serving-oriented inference stack.

Intended for GPU inference teams evaluating alternatives to vLLM or Triton for datacenter LLM APIs.

Alternatives: vLLM · TensorRT-LLM · SGLang