vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Tags: GitHub · AI Tools · Python · Free
★ 76,801 Stars
15,657 Forks
76,801 Watchers
4 Views
Apr 2026 Last Update