2 results found

4.9k
32.3k
apache-2.0
263
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
3,880 commits to main branch, last one 7 hours ago
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.
Created 2024-01-09
1,315 commits to main branch, last one 3 days ago