4 results found
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Created: 2023-07-22
399 commits to main branch, last one 8 hours ago
Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
Created: 2023-10-17
277 commits to main branch, last one 24 days ago
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Created: 2023-07-13
158 commits to main branch, last one about a month ago
Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.
Created: 2024-02-29
904 commits to main branch, last one 16 days ago