4 results found

A high-throughput and memory-efficient inference and serving engine for LLMs
apache-2.0 · 29.7k stars
Created 2023-02-09
3,290 commits to main branch, last one 7 hours ago
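The description and creation date of this first result match the vLLM project. Assuming that is the repository in question, a minimal offline-inference sketch against vLLM's documented Python API might look like the following; the model name, prompts, and sampling settings are illustrative placeholders, not taken from the search result.

```python
# Minimal offline-inference sketch with vLLM's Python API (pip install vllm).
# Model name, prompts, and sampling settings are illustrative only.
from vllm import LLM, SamplingParams

prompts = [
    "Explain paged attention in one sentence.",
    "What does a KV cache store?",
]

# Sampling settings; tune temperature/top_p/max_tokens for your workload.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)

# Loads the model weights and allocates KV-cache memory on the available GPUs.
llm = LLM(model="facebook/opt-125m")

# Batched generation; the engine schedules the requests internally.
outputs = llm.generate(prompts, sampling_params)

for output in outputs:
    print(output.prompt)
    print(output.outputs[0].text)
```

In recent vLLM releases the same engine can also be exposed as an OpenAI-compatible HTTP server (e.g. `vllm serve <model>`), which is the "serving" half of the description above.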

Large-scale LLM inference engine
Created 2023-06-23
801 commits to main branch, last one 2 days ago

Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.
Created 2024-01-09
1,129 commits to main branch, last one 9 hours ago

This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It addresses the basic implementation requirements as well as ways...
Created 2021-08-11
108 commits to main branch, last one 18 days ago