3 results found

2.8k
20.3k
apache-2.0
194
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
1,444 commits to main branch, last one 13 hours ago
PygmalionAI's large-scale inference engine
Created 2023-06-23
632 commits to main branch, last one 15 days ago
Foundation model benchmarking tool. Run any model on Amazon SageMaker and benchmark its performance across instance types and serving stack options.
Created 2024-01-09
359 commits to main branch, last one 18 hours ago