4 results found
- Filter by Primary Language:
- Python (2)
- Jupyter Notebook (1)
- Shell (1)
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
3,511 commits to main branch, last one 13 hours ago
Large-scale LLM inference engine
Created 2023-06-23
825 commits to main branch, last one 21 hours ago
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.
Created 2024-01-09
1,213 commits to main branch, last one 20 hours ago
This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It addresses the basic implementation requirements as well as ways...
Created 2021-08-11
108 commits to main branch, last one about a month ago