50 results found
A high-throughput and memory-efficient inference and serving engine for LLMs
Created
2023-02-09
3,511 commits to main branch, last one 13 hours ago
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
Created
2019-04-02
3,365 commits to main branch, last one a day ago
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
Created
2018-05-03
221 commits to master branch, last one 12 days ago
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on a...
Created
2020-07-21
12,120 commits to master branch, last one 6 months ago
Standardized Serverless ML Inference Platform on Kubernetes
Created
2019-03-27
1,630 commits to master branch, last one 2 days ago
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys...
Created
2019-01-07
509 commits to master branch, last one 3 months ago
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Created
2023-07-22
344 commits to main branch, last one 3 days ago
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Created
2023-10-20
840 commits to main branch, last one 18 hours ago
🏕️ Reproducible development environment
Created
2022-04-11
1,113 commits to main branch, last one 4 months ago
AICI: Prompts as (Wasm) Programs
Created
2023-09-26
1,616 commits to main branch, last one 10 days ago
MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates t...
Created
2019-09-01
5,566 commits to development branch, last one 21 hours ago
Hopsworks - Data-Intensive AI platform with a Feature Store
Created
2018-07-26
6,412 commits to master branch, last one 17 days ago
The simplest way to serve AI/ML models in production
Created
2022-07-06
1,266 commits to main branch, last one a day ago
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine
Created
2021-03-13
390 commits to main branch, last one 2 days ago
Model Deployment at Scale on Kubernetes 🦄️
Created
2021-07-30
639 commits to main branch, last one 6 months ago
A scalable inference server for models optimized with OpenVINO™
Created
2018-09-26
2,610 commits to main branch, last one 22 hours ago
A throughput-oriented high-performance serving framework for LLMs
Created
2024-08-19
43 commits to main branch, last one 2 months ago
Python + Inference - Model Deployment library in Python. Simplest model inference server ever.
Created
2022-04-04
55 commits to main branch, last one about a year ago
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Created
2023-12-27
1,607 commits to main branch, last one about a month ago
Securely share and store AI/ML projects as OCI artifacts in your container registry.
Created
2024-02-02
609 commits to main branch, last one 20 hours ago
FastAPI Skeleton App to serve machine learning models production-ready.
Created
2020-02-15
12 commits to main branch, last one 6 months ago
Learn to serve Stable Diffusion models on cloud infrastructure at scale. This Lightning App shows load-balancing, orchestrating, pre-provisioning, dynamic batching, GPU-inference, micro-services worki...
This repository has been archived
Created
2022-08-23
314 commits to main branch, last one about a year ago
This repository has no description...
Created
2023-06-14
347 commits to main branch, last one 9 days ago
Serverless LLM Serving for Everyone.
Created
2024-01-23
93 commits to main branch, last one 2 days ago
BentoDiffusion: A collection of diffusion models served with BentoML
Created
2023-06-12
56 commits to main branch, last one 20 days ago
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
Created
2024-03-01
123 commits to main branch, last one 2 days ago
A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.
Created
2020-01-23
372 commits to master branch, last one 5 months ago
Common library for serving TensorFlow, XGBoost and scikit-learn models in production.
This repository has been archived
Created
2018-03-12
505 commits to main branch, last one about a year ago
A scalable, high-performance serving system for federated learning models
Created
2019-09-10
157 commits to master branch, last one 4 months ago
ClearML - Model-Serving Orchestration and Repository Solution
Created
2021-04-12
140 commits to main branch, last one 4 months ago