156 results found
- Filter by Primary Language:
- Python (108)
- Jupyter Notebook (18)
- C++ (7)
- C (4)
- Cuda (3)
- JavaScript (1)
- Kotlin (1)
- C# (1)
- Ruby (1)
- Rust (1)
- Go (1)
- Tcl (1)
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Created
2023-05-28
2,738 commits to main branch, last one a day ago
Chinese LLaMA & Alpaca large language models + local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)
Created
2023-03-15
556 commits to main branch, last one 11 months ago
Faster Whisper transcription with CTranslate2
Created
2023-02-11
246 commits to master branch, last one 12 days ago
[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ :news: qbot-mini: https://github.com/Charmve/iQuant
Created
2022-11-23
143 commits to main branch, last one 4 months ago
Accessible large language models via k-bit quantization for PyTorch.
Created
2021-06-04
849 commits to main branch, last one a day ago
Lossy PNG compressor — pngquant command based on libimagequant library
Created
2009-09-17
1,207 commits to main branch, last one 2 months ago
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Created
2023-04-13
769 commits to main branch, last one 15 days ago
Neural Network Distiller by Intel AI Lab: a Python package for neural network compression research. https://intellabs.github.io/distiller
This repository has been archived
Created
2018-04-24
643 commits to master branch, last one about a year ago
Fast inference engine for Transformer models
Created
2019-09-23
2,194 commits to master branch, last one 4 days ago
Sparsity-aware deep learning inference runtime for CPUs
Created
2020-12-14
1,052 commits to main branch, last one 8 months ago
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Created
2019-12-02
162 commits to master branch, last one about a year ago
A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks
This repository has been archived
Created
2018-05-17
957 commits to master branch, last one 2 years ago
🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools
Created
2021-07-20
1,190 commits to main branch, last one 25 days ago
Base pretrained models and datasets in pytorch (MNIST, SVHN, CIFAR10, CIFAR100, STL10, AlexNet, VGG16, VGG19, ResNet, Inception, SqueezeNet)
Created
2017-04-28
17 commits to master branch, last one 4 years ago
Build, customize and control your own LLMs. From data pre-processing to fine-tuning, xTuring provides an easy way to personalize open-source LLMs. Join our discord community: https://discord.gg/TgHXuSJ...
Created
2023-03-19
593 commits to main branch, last one 6 months ago
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
Created
2020-07-21
3,739 commits to master branch, last one 12 hours ago
Run Mixtral-8x7B models in Colab or consumer desktops
Created
2023-12-15
86 commits to master branch, last one about a year ago
AIMET is a library that provides advanced quantization and compression techniques for trained neural network models.
Created
2020-04-21
2,659 commits to develop branch, last one 23 hours ago
micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), High-Bit (>2b) (DoReFa/Quantization and Training of Neural Networks for Efficient Integer-Ari...
Topics: bnn, twn, onnx, dorefa, pruning, pytorch, tensorrt, xnor-net, quantization, network-slimming, group-convolution, model-compression, network-in-network, tensorrt-int8-python, convolutional-networks, neuromorphic-computing, integer-arithmetic-only, batch-normalization-fuse, post-training-quantization, quantization-aware-training
Created
2019-12-04
295 commits to master branch, last one 3 years ago
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. You are welcome to PR the works (pape...
Created
2018-10-18
301 commits to master branch, last one 28 days ago
PyTorch native quantization and sparsity for training and inference
Created
2023-11-03
1,168 commits to main branch, last one 4 days ago
A Python package that extends the official PyTorch to make it easy to obtain better performance on Intel platforms
Created
2020-04-15
2,493 commits to main branch, last one 5 days ago
PPL Quantization Tool (PPQ) is a powerful offline neural network quantization tool.
Created
2021-12-30
291 commits to master branch, last one about a year ago
PaddleSlim is an open-source library for deep model compression and architecture search.
Created
2019-12-16
1,246 commits to develop branch, last one 3 months ago
OpenMMLab Model Compression Toolbox and Benchmark.
Created
2021-12-22
229 commits to main branch, last one about a year ago
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Created
2018-10-31
837 commits to master branch, last one about a month ago
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
Created
2023-03-30
443 commits to master branch, last one 9 days ago
Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
Created
2023-09-12
72 commits to main branch, last one 3 months ago
Brevitas: neural network quantization in PyTorch
Created
2018-07-10
1,406 commits to master branch, last one 5 months ago
Efficient computing methods developed by Huawei Noah's Ark Lab
Created
2019-09-04
157 commits to master branch, last one 4 months ago