Statistics for topic llm-inference
RepositoryStats tracks 584,796 GitHub repositories, of which 159 are tagged with the llm-inference topic. The most common primary language for repositories using this topic is Python (90). Other languages include Jupyter Notebook (15) and C++ (13).
Stargazers over time for topic llm-inference
Most starred repositories for topic llm-inference
Trending repositories for topic llm-inference
A list of software that allows searching the web with the assistance of AI.
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
Arch is an intelligent prompt gateway. Engineered with (fast) LLMs for the secure handling, robust observability, and seamless integration of prompts with your APIs - outside business logic. Built by ...
Build responsible, controlled and transparent applications on top of LLM models!
☸️ Easy, advanced inference platform for large language models on Kubernetes. 🌟 Star to support our work!
Pure C++ implementation of several models for real-time chatting on your computer (CPU)
ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
Code examples and resources for DBRX, a large language model developed by Databricks
Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.
Streamlines and simplifies prompt design for both developers and non-technical users with a low code approach.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
LLM (Large Language Model) FineTuning
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference