43 results found

2.8k
20.3k
apache-2.0
194
A high-throughput and memory-efficient inference and serving engine for LLMs
Created 2023-02-09
1,444 commits to main branch, last one 12 hours ago
753
6.7k
apache-2.0
74
The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!
Created 2019-04-02
3,108 commits to main branch, last one 24 hours ago
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
Created 2018-05-03
219 commits to master branch, last one 14 days ago
769
4.1k
apache-2.0
114
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on a...
Created 2020-07-21
12,120 commits to master branch, last one 18 days ago
978
3.2k
apache-2.0
63
Standardized Serverless ML Inference Platform on Kubernetes
Created 2019-03-27
1,499 commits to master branch, last one 4 days ago
156
1.9k
apache-2.0
22
🏕️ Reproducible development environment
Created 2022-04-11
1,109 commits to main branch, last one 19 days ago
171
1.9k
apache-2.0
20
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Created 2023-07-22
220 commits to main branch, last one 5 days ago
72
1.8k
mit
19
AICI: Prompts as (Wasm) Programs
Created 2023-09-26
1,341 commits to main branch, last one 11 days ago
116
1.7k
apache-2.0
29
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Created 2023-10-20
706 commits to main branch, last one 2 days ago
239
1.3k
apache-2.0
25
MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates t...
Created 2019-09-01
4,756 commits to development branch, last one a day ago
143
1.1k
agpl-3.0
32
Hopsworks - Data-Intensive AI platform with a Feature Store
Created 2018-07-26
6,395 commits to master branch, last one 3 days ago
The simplest way to serve AI/ML models in production
Created 2022-07-06
976 commits to main branch, last one 14 hours ago
70
770
other
19
Model Deployment at Scale on Kubernetes 🦄️
Created 2021-07-30
639 commits to main branch, last one 23 days ago
48
712
apache-2.0
13
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine
Created 2021-03-13
365 commits to main branch, last one 11 days ago
196
641
apache-2.0
30
A scalable inference server for models optimized with OpenVINO™
Created 2018-09-26
2,338 commits to main branch, last one 23 hours ago
85
558
apache-2.0
41
Python + Inference - Model Deployment library in Python. Simplest model inference server ever.
Created 2022-04-04
55 commits to main branch, last one about a year ago
29
404
apache-2.0
11
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
Created 2023-12-27
776 commits to main branch, last one 10 days ago
Learn to serve Stable Diffusion models on cloud infrastructure at scale. This Lightning App shows load-balancing, orchestrating, pre-provisioning, dynamic batching, GPU-inference, micro-services worki...
This repository has been archived
Created 2022-08-23
314 commits to main branch, last one 8 months ago
FastAPI Skeleton App to serve machine learning models production-ready.
Created 2020-02-15
12 commits to main branch, last one 28 days ago
24
324
apache-2.0
12
OneDiffusion: Run any Stable Diffusion models and fine-tuned weights with ease
Created 2023-06-12
47 commits to main branch, last one 2 days ago
47
247
apache-2.0
11
This repository has no description...
Created 2023-06-14
282 commits to main branch, last one 18 hours ago
37
224
apache-2.0
6
A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.
Created 2020-01-23
371 commits to master branch, last one 3 months ago
20
201
apache-2.0
9
Tools for easing the handoff between AI/ML and App/SRE teams.
Created 2024-02-02
386 commits to main branch, last one 14 hours ago
17
148
apache-2.0
14
JetStream is a throughput- and memory-optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
Created 2024-03-01
82 commits to main branch, last one 2 days ago
33
138
apache-2.0
25
Common library for serving TensorFlow, XGBoost and scikit-learn models in production.
This repository has been archived
Created 2018-03-12
505 commits to main branch, last one about a year ago
77
134
apache-2.0
29
A scalable, high-performance serving system for federated learning models
Created 2019-09-10
155 commits to master branch, last one 6 months ago
40
128
apache-2.0
11
ClearML - Model-Serving Orchestration and Repository Solution
Created 2021-04-12
138 commits to main branch, last one 2 months ago
Serving PyTorch models with TorchServe 🔥
Created 2020-11-01
111 commits to main branch, last one 3 years ago
Deploy DL/ML inference pipelines with minimal extra code.
Created 2020-04-09
458 commits to master branch, last one about a month ago
13
88
apache-2.0
4
The official Python package for NimbleBox. Exposes all APIs as CLIs and contains modules to make ML 🌸
Created 2021-07-29
396 commits to master branch, last one 11 months ago