16 results found

[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
Created 2019-03-27
48 commits to master branch, last one 4 months ago
[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
Created 2020-01-05
109 commits to master branch, last one about a year ago
[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Created 2018-12-01
67 commits to master branch, last one 4 months ago
114
430
mit
17
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Created 2019-06-15
17 commits to master branch, last one 12 months ago
86
370
mit
20
[CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Created 2019-06-14
15 commits to master branch, last one 3 years ago
A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.
Created 2021-04-26
528 commits to main branch, last one 5 months ago
[ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Created 2020-05-01
18 commits to master branch, last one 4 months ago
26
305
unknown
13
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Created 2024-01-31
12 commits to main branch, last one 4 months ago
56
274
unknown
17
[CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework
Created 2019-11-20
21 commits to master branch, last one 4 years ago
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
Created 2020-09-22
1,540 commits to ibert branch, last one 3 years ago
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Created 2019-04-08
7 commits to master branch, last one 3 years ago
Efficient 3D Backbone Network for Temporal Modeling
Created 2020-12-28
6 commits to main branch, last one 3 years ago
17
93
apache-2.0
3
[KDD'22] Learned Token Pruning for Transformers
Created 2021-07-01
7 commits to ltp/main branch, last one about a year ago
8
63
unknown
2
S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration (CVPR 2021)
Created 2021-07-12
10 commits to main branch, last one 3 years ago
5
53
unknown
3
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models".
Created 2023-06-13
17 commits to main branch, last one 8 months ago
[MICCAI 2021] BiX-NAS: Searching Efficient Bi-directional Architecture for Medical Image Segmentation
Created 2021-06-26
7 commits to main branch, last one 2 years ago