6 results found

A native macOS app that allows users to chat with a local LLM that can respond with information from files, folders and websites on your Mac without installing any other software. Powered by llama.cpp...
Created 2024-10-09
446 commits to main branch, last one a day ago
125 · 134 · License: unknown · 129
Ollama load-balancing server | A high-performance, easy-to-configure open-source load balancer optimized for Ollama workloads. It helps improve application availability and response speed while ensuring efficient use of system resources.
Created 2025-03-10
8 commits to main branch, last one about a month ago
A higher-performance OpenAI LLM service than vLLM serve: a pure C++ high-performance OpenAI LLM service implemented with GPRS+TensorRT-LLM+Tokenizers.cpp, supporting chat and function calling, AI agents, d...
Created 2024-08-21
158 commits to master branch, last one 8 days ago
4 · 94 · License: apache-2.0 · 9
Official PyTorch implementation for Hogwild! Inference: Parallel LLM Generation with a Concurrent Attention Cache
Created 2025-04-08
25 commits to main branch, last one a day ago
A user-friendly command-line/SDK tool that makes it quick and easy to deploy open-source LLMs on AWS
Created 2025-01-25
103 commits to main branch, last one a day ago
0 · 28 · License: unknown · 1
🔥🔥🔥Breaking long thought processes of o1-like LLMs, such as DeepSeek-R1, QwQ
Created 2025-02-17
17 commits to main branch, last one about a month ago