Search Results - RepositoryStats

258

2.3k

apache-2.0

33

SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime

awq fp4 gptq int4 int8 pruning mxformat sparsity sparsegpt auto-tuning smoothquant quantization low-precision large-language-models knowledge-distillation post-training-quantization quantization-aware-training

Created 2020-07-21

3,699 commits to master branch, last one 2 days ago

neural-speed intel

38

349

apache-2.0

8

An innovative library for efficient LLM inference via low-bit quantization

This repository has been archived

Created 2023-11-20

345 commits to main branch, last one 3 months ago

yolov5_tensorrt_int8_tools Wulingtian

42

177

unknown

2

TensorRT int8 quantization of YOLOv5 ONNX models

int8 onnx yolov5 tensorrt

Created 2021-01-31

4 commits to master branch, last one 3 years ago

yolov5_tensorrt_int8 Wulingtian

26

166

unknown

3

TensorRT int8 quantized deployment of the yolov5s model; measured at 3.3 ms per frame!

int8 yolov5 tensorrt

Created 2021-01-31

3 commits to master branch, last one 3 years ago

RepVGG_TensorRT_int8 Wulingtian

15

61

unknown

2

RepVGG TensorRT int8 quantization; measured inference takes under 1 ms per frame!

int8 repvgg tensorrt

Created 2021-02-04

8 commits to master branch, last one 3 years ago

Tensorrt-int8-quantization-pipline xuanandsix

3

56

unknown

1

A simple pipeline for int8 quantization based on TensorRT.

int8 yolox tensorrt quantization classifaction

Created 2022-08-22

23 commits to main branch, last one 2 years ago

YOLOv8-ONNX-TensorRT the0807

3

42

agpl-3.0

1

👀 Apply YOLOv8 exported with ONNX or TensorRT (FP16, INT8) to a real-time camera

fp16 int8 onnx yolov8 tensorrt computer-vision object-detection

Created 2024-05-14

73 commits to main branch, last one 7 months ago

nanodet_tensorrt_int8 Wulingtian

7

37

unknown

2

nanodet int8 quantization; measured inference takes 2 ms per frame!

int8 nanodet tensorrt

Created 2021-02-09

4 commits to master branch, last one 3 years ago