156 results found

5.6k
45.8k
apache-2.0
252
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Created 2023-05-28
2,738 commits to main branch, last one a day ago
1.9k
18.8k
apache-2.0
183
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Created 2023-03-15
556 commits to main branch, last one 11 months ago
1.3k
15.1k
mit
134
Faster Whisper transcription with CTranslate2
Created 2023-02-11
246 commits to master branch, last one 12 days ago
1.6k
10.9k
mit
128
[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 Online docs: https://ufund-me.github.io/Qbot ✨ qbot-mini: https://github.com/Charmve/iQuant
Created 2022-11-23
143 commits to main branch, last one 4 months ago
Accessible large language models via k-bit quantization for PyTorch.
Created 2021-06-04
849 commits to main branch, last one a day ago
490
5.3k
other
129
Lossy PNG compressor — pngquant command based on libimagequant library
Created 2009-09-17
1,207 commits to main branch, last one 2 months ago
512
4.8k
mit
30
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Created 2023-04-13
769 commits to main branch, last one 15 days ago
804
4.4k
apache-2.0
130
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
This repository has been archived
Created 2018-04-24
643 commits to master branch, last one about a year ago
342
3.7k
mit
58
Fast inference engine for Transformer models
Created 2019-09-23
2,194 commits to master branch, last one 4 days ago
181
3.1k
other
55
Sparsity-aware deep learning inference runtime for CPUs
Created 2020-12-14
1,052 commits to main branch, last one 8 months ago
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Created 2019-12-02
162 commits to master branch, last one about a year ago
448
2.9k
apache-2.0
162
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
This repository has been archived
Created 2018-05-17
957 commits to master branch, last one 2 years ago
516
2.8k
apache-2.0
54
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
Created 2021-07-20
1,190 commits to main branch, last one 25 days ago
Base pretrained models and datasets in pytorch (MNIST, SVHN, CIFAR10, CIFAR100, STL10, AlexNet, VGG16, VGG19, ResNet, Inception, SqueezeNet)
Created 2017-04-28
17 commits to master branch, last one 4 years ago
206
2.6k
apache-2.0
34
Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJ...
Created 2023-03-19
593 commits to main branch, last one 6 months ago
263
2.4k
apache-2.0
32
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Created 2020-07-21
3,739 commits to master branch, last one 12 hours ago
Run Mixtral-8x7B models in Colab or consumer desktops
Created 2023-12-15
86 commits to master branch, last one about a year ago
398
2.3k
other
49
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Created 2020-04-21
2,659 commits to develop branch, last one 23 hours ago
476
2.2k
mit
40
micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), High-Bit (>2b) (DoReFa/Quantization and Training of Neural Networks for Efficient Integer-Ari...
Created 2019-12-04
295 commits to master branch, last one 3 years ago
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research; we are continuously improving the project. Welcome to PR the works (pape...
Created 2018-10-18
301 commits to master branch, last one 28 days ago
235
1.9k
bsd-3-clause
43
PyTorch native quantization and sparsity for training and inference
Created 2023-11-03
1,168 commits to main branch, last one 4 days ago
A Python package that extends official PyTorch to easily obtain better performance on Intel platforms
Created 2020-04-15
2,493 commits to main branch, last one 5 days ago
249
1.7k
apache-2.0
16
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Created 2021-12-30
291 commits to master branch, last one about a year ago
351
1.6k
apache-2.0
90
PaddleSlim is an open-source library for deep model compression and architecture search.
Created 2019-12-16
1,246 commits to develop branch, last one 3 months ago
236
1.6k
apache-2.0
21
OpenMMLab Model Compression Toolbox and Benchmark.
Created 2021-12-22
229 commits to main branch, last one about a year ago
325
1.5k
apache-2.0
117
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Created 2018-10-31
837 commits to master branch, last one about a month ago
102
1.5k
mit
22
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Created 2023-03-30
443 commits to master branch, last one 9 days ago
67
1.3k
unknown
7
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
Created 2023-09-12
72 commits to main branch, last one 3 months ago
207
1.3k
other
31
Brevitas: neural network quantization in PyTorch
Created 2018-07-10
1,406 commits to master branch, last one 5 months ago
Efficient computing methods developed by Huawei Noah's Ark Lab
Created 2019-09-04
157 commits to master branch, last one 4 months ago