2 results found

A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers, and Sentence-Transformers, with full support for Optimum's hardware optimizations and quantization schemes.
Created 2023-04-26
640 commits to main branch, last one 17 days ago
License: apache-2.0
SOTA weight-only quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
Created 2024-01-04
209 commits to main branch, last one 13 hours ago