2 results found

A unified multi-backend utility for benchmarking Transformers, Timm, PEFT, Diffusers, and Sentence-Transformers, with full support for Optimum's hardware optimizations and quantization schemes.
Created 2023-04-26
640 commits to main branch, last one 17 days ago
License: apache-2.0
SOTA weight-only quantization algorithm for LLMs. This is the official implementation of "Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs".
Created 2024-01-04
209 commits to main branch, last one 13 hours ago