Search Results - RepositoryStats

4 results found Sort:

Filter by Primary Language:
C++ (1)
Jupyter Notebook (1)
Python (1)
Shell (1)
+

vllm vllm-project

6.6k

43.2k

apache-2.0

357

A high-throughput and memory-efficient inference and serving engine for LLMs

Created 2023-02-09

5,531 commits to main branch, last one 17 hours ago

aphrodite-engine aphrodite-engine

149

1.4k

agpl-3.0

Large-scale LLM inference engine

tpu cuda lora rocm intel api-rest inferentia inference-engine machine-learning speculative-decoding

Created 2023-06-23

1,253 commits to main branch, last one 3 days ago

foundation-model-benchmarking-tool aws-samples

233

mit-0

Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.

g5 g6 p5 g6e p4d llama2 llama3 bedrock deepseek trainium benchmark sagemaker inferentia deepseek-r1 benchmarking generative-ai foundation-models evaluation-metrics

Created 2024-01-09

1,491 commits to main branch, last one 26 days ago

guidance-for-machine-learning-inference-on-aws aws-solutions-library-samples

mit-0

This Guidance demonstrates how to deploy a machine learning inference architecture on Amazon Elastic Kubernetes Service (Amazon EKS). It addresses the basic implementation requirements as well as ways...

ml graviton3 inferentia eks-cluster do-framework mlops-workflow

Created 2021-08-11

108 commits to main branch, last one 5 months ago