2 results found

6.9k
44.9k
apache-2.0
378
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
5,842 commits to main branch, last one 5 hours ago
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark performance across instance types and serving stack options.
Created 2024-01-09
1,503 commits to main branch, last one 4 days ago