3 results found

A PyTorch quantization backend for Optimum.
Created 2023-09-19
716 commits to main branch, last one 3 days ago
Production-ready LLM compression/quantization toolkit with accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
Created 2024-06-17
2,040 commits to main branch, last one 20 hours ago
Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.
Created 2022-03-16
154 commits to master branch, last one 2 years ago