Statistics for topic llm-inference
RepositoryStats tracks 595,856 GitHub repositories; 171 of these are tagged with the llm-inference topic. The most common primary language for repositories using this topic is Python (97). Other languages include Jupyter Notebook (16) and C++ (13).
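As a rough cross-check on the figures above, here is a minimal sketch (not RepositoryStats' own pipeline) of querying GitHub's public search API for the same topic. Unauthenticated requests are rate-limited, so the totals returned may drift from the numbers reported here.

```python
# Minimal sketch: pull topic-level figures from GitHub's public search API.
# Unauthenticated requests are rate-limited; pass a token via the
# Authorization header for heavier use.
import requests

resp = requests.get(
    "https://api.github.com/search/repositories",
    params={"q": "topic:llm-inference", "sort": "stars", "order": "desc", "per_page": 10},
    headers={"Accept": "application/vnd.github+json"},
    timeout=30,
)
resp.raise_for_status()
data = resp.json()

print(f"Repositories tagged llm-inference: {data['total_count']}")
for repo in data["items"]:
    lang = repo["language"] or "n/a"
    print(f"{repo['full_name']:40} {repo['stargazers_count']:>8,} stars  {lang}")
```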
Stargazers over time for topic llm-inference
Most starred repositories for topic llm-inference
Trending repositories for topic llm-inference
A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use. (See the local-inference sketch after this list.)
A list of software that allows searching the web with the assistance of AI.
Yet Another Language Model: LLM inference in C++/CUDA, no libraries except for I/O
Notes on LLMs, covering model inference, transformer model structure, and LLM framework code analysis
Code examples and resources for DBRX, a large language model developed by Databricks
Arch is an intelligent gateway for agents. Engineered with (fast) LLMs for the secure handling, rich observability, and seamless integration of prompts with your APIs - all outside business logic. Bui...
Streamlines and simplifies prompt design for both developers and non-technical users with a low code approach.
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
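For the GPT4All entry above, a minimal sketch of local inference with its Python bindings (pip install gpt4all); the model filename is only an example from GPT4All's catalogue, and any supported .gguf model name can be substituted.

```python
# Minimal local-inference sketch using the gpt4all Python bindings.
from gpt4all import GPT4All

# Downloads the model on first use, then runs fully on-device (CPU by default).
# The filename is an example; substitute any model supported by GPT4All.
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")

# chat_session() keeps conversational context across generate() calls.
with model.chat_session():
    reply = model.generate("Explain KV-cache reuse in one paragraph.", max_tokens=200)
    print(reply)
```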