Search Results - RepositoryStats

4.9k

32.3k

apache-2.0

263

A high-throughput and memory-efficient inference and serving engine for LLMs

amd gpt hpu llm tpu xpu cuda rocm llama mlops llmops pytorch trainium inference inferentia llm-serving transformer model-serving

Created 2023-02-09

3,880 commits to main branch, last one a day ago

BentoML bentoml

797

7.2k

apache-2.0

76

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

llm mlops llmops python multimodal llm-serving ai-inference deep-learning generative-ai llm-inference model-serving ml-engineering machine-learning inference-platform model-inference-service

Created 2019-04-02

3,406 commits to main branch, last one 2 days ago

Deep-Learning-in-Production ahkarami

687

4.3k

unknown

148

In this repository, I will share some useful notes and references about deploying deep learning-based models in production.

Created 2018-05-03

221 commits to master branch, last one about a month ago

FedML FedML-AI

787

4.2k

apache-2.0

117

FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on a...

mlops edge-ai ai-agent deep-learning model-serving inference-engine machine-learning model-deployment federated-learning on-device-training distributed-training

Created 2020-07-21

12,120 commits to master branch, last one 7 months ago

kserve kserve

1.1k

3.7k

apache-2.0

66

Standardized Serverless ML Inference Platform on Kubernetes

k8s genai istio mlops kserve knative pytorch sklearn xgboost kubeflow kubernetes tensorflow service-mesh hacktoberfest llm-inference model-serving machine-learning model-interpretability artificial-intelligence

Created 2019-03-27

1,652 commits to master branch, last one a day ago

AI-System-School HuaizhengZhang

313

2.7k

mit

125

🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys...

genai mlsys llmsys ai-infra model-serving model-training large-language-models

Created 2019-01-07

509 commits to master branch, last one 4 months ago

lightllm ModelTC

216

2.7k

apache-2.0

22

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

gpt llm nlp llama deep-learning model-serving openai-triton

Created 2023-07-22

399 commits to main branch, last one a day ago

lorax predibase

149

2.3k

apache-2.0

33

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

gpt llm lora llama llmops pytorch fine-tuning llm-serving transformers llm-inference model-serving

Created 2023-10-20

867 commits to main branch, last one 4 days ago

envd tensorchord

159

2.1k

apache-2.0

23

🏕️ Reproducible development environment

mlops docker llmops buildkit hacktoberfest model-serving mlops-workflow developer-tools development-environment

Created 2022-04-11

1,113 commits to main branch, last one 5 months ago

aici microsoft

78

2.0k

mit

24

AICI: Prompts as (Wasm) Programs

ai llm rust wasm llmops wasmtime inference llm-serving transformer llm-framework llm-inference model-serving language-model

Created 2023-09-26

1,616 commits to main branch, last one about a month ago

mlrun mlrun

256

1.5k

apache-2.0

28

MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates t...

mlops python workflow kubernetes data-science model-serving mlops-workflow data-engineering machine-learning experiment-tracking

Created 2019-09-01

5,737 commits to development branch, last one 2 days ago

Olares beclab

39

1.4k

other

15

Olares: An Open-Source Sovereign Cloud OS for Local AI

mcp nas edge-ai homelab local-ai ai-agents ai-privacy homeserver kubernetes self-hosted model-serving home-automation

Created 2024-04-29

753 commits to main branch, last one a day ago

hopsworks logicalclocks

145

1.2k

agpl-3.0

36

Hopsworks - Data-Intensive AI platform with a Feature Store

ml aws gcp azure mlops kserve python pyspark hopsworks governance serverless data-science feature-store model-serving machine-learning feature-management feature-engineering

Created 2018-07-26

6,412 commits to master branch, last one about a month ago

truss basetenlabs

75

933

mit

19

The simplest way to serve AI/ML models in production

falcon whisper wizardlm packaging easy-to-use open-source inference-api model-serving inference-server machine-learning stable-diffusion artificial-intelligence

Created 2022-07-06

1,298 commits to main branch, last one 2 days ago

mosec mosecorg

61

805

apache-2.0

13

A high-performance ML model serving framework, offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

cv gpu jax llm tts rust mlops mxnet python pytorch tensorflow llm-serving deep-learning hacktoberfest model-serving nerual-network machine-learning machine-learning-platform

Created 2021-03-13

393 commits to main branch, last one 10 days ago

Yatai bentoml

69

792

other

19

Model Deployment at Scale on Kubernetes 🦄️

k8s mlops bentoml kubernetes model-serving machine-learning model-deployment

Created 2021-07-30

639 commits to main branch, last one 7 months ago

model_server openvinotoolkit

212

688

apache-2.0

32

A scalable inference server for models optimized with OpenVINO™

ai dag edge cloud serving openvino inference kubernetes deep-learning model-serving machine-learning

Created 2018-09-26

2,668 commits to main branch, last one 17 hours ago

Nanoflow efeslab

27

667

apache-2.0

8

A throughput-oriented high-performance serving framework for LLMs

llm cuda llama2 inference llm-serving model-serving

Created 2024-08-19

43 commits to main branch, last one 3 months ago

kitops jozu-ai

61

586

apache-2.0

13

An open source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI artifact.

Created 2024-02-02

645 commits to main branch, last one 5 days ago

pinferencia underneathall

87

564

apache-2.0

41

Python + Inference - Model Deployment library in Python. Simplest model inference server ever.

Created 2022-04-04

55 commits to main branch, last one 2 years ago

rtp-llm alibaba

53

556

apache-2.0

13

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

gpt llm llama llmops inference llm-serving model-serving

Created 2023-12-27

1,607 commits to main branch, last one 2 months ago

fastapi-ml-skeleton eightBEC

83

414

apache-2.0

6

FastAPI Skeleton App to serve machine learning models production-ready.

python fastapi python3 model-serving machine-learning

Created 2020-02-15

12 commits to main branch, last one 7 months ago

stable-diffusion-deploy Lightning-Universe

39

392

apache-2.0

20

Learn to serve Stable Diffusion models on cloud infrastructure at scale. This Lightning App shows load-balancing, orchestrating, pre-provisioning, dynamic batching, GPU-inference, micro-services worki...

model-serving stable-diffusion

This repository has been archived

Created 2022-08-23

314 commits to main branch, last one about a year ago

ServerlessLLM ServerlessLLM

37

388

apache-2.0

13

Serverless LLM Serving for Everyone.

cuda pytorch model-serving model-as-a-service serverless-inference large-language-models huggingface-transformers

Created 2024-01-23

111 commits to main branch, last one 4 days ago

xFasterTransformer intel

65

385

apache-2.0

15

This repository has no description...

llm qwen xeon intel llama chatglm inference transformer model-serving

Created 2023-06-14

348 commits to main branch, last one 27 days ago

BentoDiffusion bentoml

25

342

apache-2.0

12

BentoDiffusion: A collection of diffusion models served with BentoML

ai lora kubernetes fine-tuning model-serving diffusion-models stable-diffusion

Created 2023-06-12

56 commits to main branch, last one about a month ago

JetStream AI-Hypercomputer

32

256

apache-2.0

19

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

gpt gpu jax llm tpu gemma llama mlops llama2 llmops pytorch inference transformer llm-inference model-serving large-language-models

Created 2024-03-01

126 commits to main branch, last one 3 days ago

chitra aniketmaurya

37

224

apache-2.0

6

A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.

mlops python fastapi gradcam pytorch tensorflow deep-learning hacktoberfest image-dataset model-serving visualization bounding-boxes image-processing machine-learning model-deployment object-detection model-visualization image-classification model-interpretation

Created 2020-01-23

372 commits to master branch, last one 6 months ago

clearml-serving allegroai

40

139

apache-2.0

11

ClearML - Model-Serving Orchestration and Repository Solution

ai mlops devops triton clearml serving kubernetes serving-ml deep-learning model-serving machine-learning tensorflow-serving serving-pytorch-models triton-inference-server

Created 2021-04-12

142 commits to main branch, last one 5 days ago

zoltar spotify

33

139

apache-2.0

24

Common library for serving TensorFlow, XGBoost and scikit-learn models in production.

java xgboost tensorflow model-serving machine-learning

This repository has been archived

Created 2018-03-12

505 commits to main branch, last one about a year ago