3 results found
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
1,444 commits to main branch, last one 13 hours ago
PygmalionAI's large-scale inference engine
Created 2023-06-23
632 commits to main branch, last one 15 days ago
Foundation model benchmarking tool. Run any model on Amazon SageMaker and benchmark its performance across instance types and serving stack options.
Created 2024-01-09
359 commits to main branch, last one 18 hours ago