16 results found
- Filter by Primary Language:
- Python (15)
- C++ (1)
[ICCV 2019] TSM: Temporal Shift Module for Efficient Video Understanding
Created 2019-03-27
48 commits to master branch, last one 4 months ago
[ICLR 2020] Once for All: Train One Network and Specialize it for Efficient Deployment
Created 2020-01-05
109 commits to master branch, last one about a year ago
[ICLR 2019] ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
Created 2018-12-01
67 commits to master branch, last one 4 months ago
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Created 2019-06-15
17 commits to master branch, last one 12 months ago
[CVPR 2019, Oral] HAQ: Hardware-Aware Automated Quantization with Mixed Precision
Created 2019-06-14
15 commits to master branch, last one 3 years ago
A DNN inference latency prediction toolkit for accurately modeling and predicting the latency on diverse edge devices.
Created 2021-04-26
528 commits to main branch, last one 5 months ago
[ACL'20] HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
Created 2020-05-01
18 commits to master branch, last one 4 months ago
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
Created 2024-01-31
12 commits to main branch, last one 4 months ago
[CVPR'20] ZeroQ: A Novel Zero Shot Quantization Framework
Created 2019-11-20
21 commits to master branch, last one 4 years ago
[ICML'21 Oral] I-BERT: Integer-only BERT Quantization
Created 2020-09-22
1,540 commits to ibert branch, last one 3 years ago
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Created 2019-04-08
7 commits to master branch, last one 3 years ago
Efficient 3D Backbone Network for Temporal Modeling
Created 2020-12-28
6 commits to main branch, last one 3 years ago
[KDD'22] Learned Token Pruning for Transformers
Created 2021-07-01
7 commits to ltp/main branch, last one about a year ago
S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration (CVPR 2021)
Created 2021-07-12
10 commits to main branch, last one 3 years ago
Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models".
Created 2023-06-13
17 commits to main branch, last one 8 months ago
[MICCAI 2021] BiX-NAS: Searching Efficient Bi-directional Architecture for Medical Image Segmentation
Created 2021-06-26
7 commits to main branch, last one 2 years ago