Statistics for topic inference
RepositoryStats tracks 636,567 GitHub repositories; 332 of these are tagged with the inference topic. The most common primary language for repositories using this topic is Python (141). Other languages include C++ (61), Jupyter Notebook (31), Rust (12), TypeScript (12), and Go (11).
[Chart: stargazers over time for topic inference]
Most starred repositories for topic inference
Trending repositories for topic inference
A high-throughput and memory-efficient inference and serving engine for LLMs (vLLM; see the sketch after this list)
SGLang is a fast serving framework for large language models and vision language models (see the sketch after this list).
Cross-platform, customizable ML solutions for live and streaming media.
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Community maintained hardware plugin for vLLM on Ascend
Minimal code and examples for running inference with the Sapiens foundation human models in PyTorch
Replace OpenAI GPT with another LLM in your app by changing a single line of code (see the sketch after this list). Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any...
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Deep learned, NVIDIA-accelerated 3D object pose estimation
The .NET library to consume 100+ APIs: OpenAI, Anthropic, Google, DeepSeek, Cohere, Mistral, Azure, xAI, Perplexity, Groq, Ollama, LocalAI, and many more!
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation
⚡ Bhumi – The fastest AI inference client for Python, built with Rust for unmatched speed, efficiency, and scalability 🚀
Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild
AI productivity tool - free and open source, improving user productivity while protecting privacy and data security. Provides efficient and convenient AI solutions, with built-in local exclusive ChatGPT, Phi, DeepSee...
Implement Llama 3 inference step by step: grasp the core concepts, master the process derivation, and implement the code.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective (see the sketch after this list).
📖 A curated list of Awesome Diffusion Inference Papers with codes: Sampling, Caching, Multi-GPUs, etc. 🎉🎉
PocketGroq is a powerful Python library that simplifies integration with the Groq API, offering advanced features for natural language processing, web scraping, and autonomous agent capabilities. Key ...
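
The vLLM entry above describes a high-throughput serving engine; here is a minimal offline-inference sketch, assuming vLLM is installed and using an illustrative small model name that is not taken from this page:

from vllm import LLM, SamplingParams

# Load a model into vLLM's engine; the model name is a placeholder.
llm = LLM(model="facebook/opt-125m")
params = SamplingParams(temperature=0.8, max_tokens=64)

# generate() batches prompts and returns one RequestOutput per prompt.
outputs = llm.generate(["What does an inference engine do?"], params)
for out in outputs:
    print(out.outputs[0].text)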
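For the SGLang entry, a short frontend sketch follows, assuming an SGLang server is already running locally; the endpoint URL and the question are assumptions:

import sglang as sgl

# Define a structured program; sgl.gen fills in the model's answer.
@sgl.function
def qa(s, question):
    s += sgl.user(question)
    s += sgl.assistant(sgl.gen("answer", max_tokens=128))

# Point the frontend at a locally launched SGLang server (assumed port).
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))

state = qa.run(question="Why does batching matter for LLM serving?")
print(state["answer"])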
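The Xinference entry claims you can swap OpenAI for a local model by changing a single line; a hedged sketch of that swap is below, relying on Xinference's OpenAI-compatible endpoint, with the port and model name as assumptions:

from openai import OpenAI

# The single changed line: point the client at a local Xinference endpoint
# instead of api.openai.com (port 9997 is a commonly used default).
client = OpenAI(base_url="http://localhost:9997/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="my-local-model",  # hypothetical model name launched in Xinference
    messages=[{"role": "user", "content": "Hello from a local LLM"}],
)
print(resp.choices[0].message.content)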
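Finally, for the DeepSpeed entry, a sketch of its inference path via deepspeed.init_inference, assuming a CUDA GPU and an illustrative small model; kernel injection only applies to architectures DeepSpeed supports:

import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # placeholder model
model = AutoModelForCausalLM.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

# Wrap the model with DeepSpeed's inference engine; optimized kernels
# replace supported modules when replace_with_kernel_inject=True.
engine = deepspeed.init_inference(model, dtype=torch.float16, replace_with_kernel_inject=True)

inputs = tokenizer("Distributed inference is", return_tensors="pt").to(engine.module.device)
out = engine.module.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))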