2 results found
A high-throughput and memory-efficient inference and serving engine for LLMs
Created: 2023-02-09
3,511 commits to main branch, last one 13 hours ago
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.
Created: 2024-01-09
1,213 commits to main branch, last one 19 hours ago