8 results found
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Created
2020-07-21
3,610 commits to master branch, last one 24 hours ago
Large Language Models for All, 🦙 Cult and More, Stay in touch!
Created
2023-03-30
27 commits to main branch, last one about a year ago
🦖 X—LLM: Cutting Edge & Easy LLM Finetuning
Created
2023-11-10
62 commits to main branch, last one 11 months ago
Advanced Quantization Algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs"
Created
2024-01-04
305 commits to main branch, last one a day ago
Run any Large Language Model behind a unified API
Created
2023-04-02
31 commits to main branch, last one 11 months ago
🪶 Lightweight OpenAI drop-in replacement for Kubernetes
This repository has been archived
Created
2023-05-23
154 commits to main branch, last one 9 months ago
A guide on how to use GPTQ models with LangChain
Created
2023-05-11
8 commits to main branch, last one about a year ago