4 results found

223
2.8k
apache-2.0
22
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Created 2023-07-22
434 commits to main branch, last one 4 days ago
Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
Created 2023-10-17
277 commits to main branch, last one 2 months ago
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Created 2023-07-13
160 commits to main branch, last one 22 hours ago
Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.
Created 2024-02-29
1,132 commits to deepauto/dev branch, last one a day ago