51 results found

4.9k
32.3k
apache-2.0
263
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
3,880 commits to main branch, last one a day ago
797
7.2k
apache-2.0
76
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
Created 2019-04-02
3,406 commits to main branch, last one 2 days ago
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
Created 2018-05-03
221 commits to master branch, last one about a month ago
787
4.2k
apache-2.0
117
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on a...
Created 2020-07-21
12,120 commits to master branch, last one 7 months ago
1.1k
3.7k
apache-2.0
66
Standardized Serverless ML Inference Platform on Kubernetes
Created 2019-03-27
1,652 commits to master branch, last one a day ago
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys...
Created 2019-01-07
509 commits to master branch, last one 4 months ago
216
2.7k
apache-2.0
22
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Created 2023-07-22
399 commits to main branch, last one a day ago
149
2.3k
apache-2.0
33
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Created 2023-10-20
867 commits to main branch, last one 4 days ago
159
2.1k
apache-2.0
23
🏕️ Reproducible development environment
Created 2022-04-11
1,113 commits to main branch, last one 5 months ago
78
2.0k
mit
24
AICI: Prompts as (Wasm) Programs
Created 2023-09-26
1,616 commits to main branch, last one about a month ago
256
1.5k
apache-2.0
28
MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates t...
Created 2019-09-01
5,737 commits to development branch, last one 2 days ago
39
1.4k
other
15
Olares: An Open-Source Sovereign Cloud OS for Local AI
Created 2024-04-29
753 commits to main branch, last one a day ago
145
1.2k
agpl-3.0
36
Hopsworks - Data-Intensive AI platform with a Feature Store
Created 2018-07-26
6,412 commits to master branch, last one about a month ago
The simplest way to serve AI/ML models in production
Created 2022-07-06
1,298 commits to main branch, last one 2 days ago
61
805
apache-2.0
13
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine
Created 2021-03-13
393 commits to main branch, last one 10 days ago
69
792
other
19
Model Deployment at Scale on Kubernetes 🦄️
Created 2021-07-30
639 commits to main branch, last one 7 months ago
212
688
apache-2.0
32
A scalable inference server for models optimized with OpenVINO™
Created 2018-09-26
2,668 commits to main branch, last one 17 hours ago
27
667
apache-2.0
8
A throughput-oriented high-performance serving framework for LLMs
Created 2024-08-19
43 commits to main branch, last one 3 months ago
61
586
apache-2.0
13
An open source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI artifact.
Created 2024-02-02
645 commits to main branch, last one 5 days ago
87
564
apache-2.0
41
Python + Inference - Model Deployment library in Python. Simplest model inference server ever.
Created 2022-04-04
55 commits to main branch, last one 2 years ago
53
556
apache-2.0
13
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Created 2023-12-27
1,607 commits to main branch, last one 2 months ago
Production-ready FastAPI skeleton app for serving machine learning models.
Created 2020-02-15
12 commits to main branch, last one 7 months ago
Learn to serve Stable Diffusion models on cloud infrastructure at scale. This Lightning App shows load-balancing, orchestrating, pre-provisioning, dynamic batching, GPU-inference, micro-services worki...
This repository has been archived
Created 2022-08-23
314 commits to main branch, last one about a year ago
37
388
apache-2.0
13
Serverless LLM Serving for Everyone.
Created 2024-01-23
111 commits to main branch, last one 4 days ago
65
385
apache-2.0
15
This repository has no description...
Created 2023-06-14
348 commits to main branch, last one 27 days ago
25
342
apache-2.0
12
BentoDiffusion: A collection of diffusion models served with BentoML
Created 2023-06-12
56 commits to main branch, last one about a month ago
32
256
apache-2.0
19
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
Created 2024-03-01
126 commits to main branch, last one 3 days ago
37
224
apache-2.0
6
A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.
Created 2020-01-23
372 commits to master branch, last one 6 months ago
40
139
apache-2.0
11
ClearML - Model-Serving Orchestration and Repository Solution
Created 2021-04-12
142 commits to main branch, last one 5 days ago
33
139
apache-2.0
24
Common library for serving TensorFlow, XGBoost and scikit-learn models in production.
This repository has been archived
Created 2018-03-12
505 commits to main branch, last one about a year ago