Search Results - RepositoryStats

718

4.2k

unknown

52

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

pytorch ghostnet imagenet tensorflow transformer model-compression pretrained-models vision-transformer efficient-inference convolutional-neural-networks

Created 2019-11-16

153 commits to master branch, last one about a month ago

LLMCompiler SqueezeAILab

123

1.7k

mit

24

[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

llm nlp llms llama llama2 llm-agent llm-agents transformer llm-framework function-calling efficient-inference large-language-models parallel-function-call natural-language-processing

Created 2023-12-06

53 commits to main branch, last one 9 months ago

EfficientFormer snap-research

92

1.0k

other

35

EfficientFormerV2 [ICCV 2023] & EfficientFormer [NeurIPS 2022]

pytorch imagenet detection transformer transformers deep-learning mobile-devices efficient-inference semantic-segmentation efficient-neural-networks

Created 2022-06-02

23 commits to main branch, last one about a year ago

AdderNet huawei-noah

185

958

bsd-3-clause

24

Code for paper "AdderNet: Do We Really Need Multiplications in Deep Learning?"

pytorch cvpr2020 imagenet efficient-inference convolutional-neural-networks

Created 2020-02-25

31 commits to master branch, last one 3 years ago

DeepCache horseee

43

886

apache-2.0

14

[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

training-free diffusion-models stable-diffusion model-compression efficient-inference

Created 2023-12-01

125 commits to master branch, last one 9 months ago

SqueezeLLM SqueezeAILab

45

685

mit

18

[ICML 2024] SqueezeLLM: Dense-and-Sparse Quantization

llm llama localllm transformer quantization small-models text-generation model-compression efficient-inference large-language-models post-training-quantization natural-language-processing

Created 2023-06-12

50 commits to main branch, last one about a year ago

LightGaussian VITA-Group

65

677

other

32

[NeurIPS 2024 Spotlight] "LightGaussian: Unbounded 3D Gaussian Compression with 15x Reduction and 200+ FPS", Zhiwen Fan, Kevin Wang, Kairun Wen, Zehao Zhu, Dejia Xu, Zhangyang Wang

nurips neurips-2024 3d-reconstruction gaussian-splatting efficient-inference

Created 2023-11-26

73 commits to main branch, last one 3 months ago

Awesome-Quantization-Papers Zhen-Dong

47

591

mit

16

List of papers related to neural network quantization in recent AI conferences and journals.

papers awesome-list quantization edge-computing neural-networks diffusion-models model-compression efficient-inference large-language-models

Created 2022-01-01

35 commits to main branch, last one 21 days ago

KVQuant SqueezeAILab

30

340

unknown

10

[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

llm llama mistral localllm localllama compression transformer quantization small-models efficient-model text-generation model-compression efficient-inference large-language-models natural-language-processing

Created 2024-01-31

12 commits to main branch, last one 9 months ago

speculative-decoding lucidrains

19

253

mit

9

Explorations into some recent techniques surrounding speculative decoding

transformers deep-learning efficient-inference artificial-intelligence

Created 2023-08-27

74 commits to main branch, last one 3 months ago

SMSR SYSU-SAIL

30

239

unknown

7

[CVPR 2021] Exploring Sparsity in Image Super-Resolution for Efficient Inference

sparsity super-resolution efficient-inference

Created 2020-07-26

38 commits to master branch, last one 3 years ago

picollm Picovoice

12

233

apache-2.0

8

On-device LLM Inference Powered by X-Bit Quantization

llm llms gemma llama llama2 llama3 mistral mixtral compression self-hosted quantization generative-ai llm-inference language-model language-models model-compression efficient-inference large-language-model natural-language-processing

Created 2024-04-09

61 commits to main branch, last one 6 days ago

DS-Net changlin31

19

229

unknown

9

(CVPR 2021, Oral) Dynamic Slimmable Network

pruning dynamic-pruning network-pruning dynamic-networks model-compression efficient-inference

Created 2021-03-23

12 commits to main branch, last one 3 years ago

ELAN xindongzhang

20

222

apache-2.0

7

[ECCV 2022] Efficient Long-Range Attention Network for Image Super-resolution

transformer super-resolution efficient-inference

Created 2022-03-12

8 commits to main branch, last one 2 years ago

Awesome-Generation-Acceleration xuyang-liu16

6

202

unknown

8

📚 Collection of awesome generation acceleration resources.

text-to-image text-to-video diffusion-models image-generation video-generation model-acceleration efficient-inference efficient-deep-learning

Created 2024-07-14

230 commits to main branch, last one a day ago

AsyncDiff czg1225

12

196

apache-2.0

3

[NeurIPS 2024] AsyncDiff: Parallelizing Diffusion Models by Asynchronous Denoising

text-to-image text-to-video training-free diffusion-models stable-diffusion efficient-inference distributed-computing inference-acceleration

Created 2024-05-31

64 commits to main branch, last one about a month ago

DeciWatch cure-lab

16

179

apache-2.0

10

[ECCV 2022] Official implementation of the paper "DeciWatch: A Simple Baseline for 10x Efficient 2D and 3D Pose Estimation"

eccv pytorch eccv2022 efficiency 2d-human-pose deep-learning pose-estimation 3d-body-recovery 3d-pose-estimation body-reconstruction efficient-inference human-pose-estimation efficient-neural-networks

Created 2022-04-25

47 commits to main branch, last one 2 years ago

SoT SimonAytes

21

107

mit

5

Official code repository for Sketch-of-Thought (SoT)

ai llm prompting llm-inference efficient-inference

Created 2025-03-03

21 commits to main branch, last one 15 days ago

learning-to-cache horseee

3

101

unknown

2

[NeurIPS 2024] Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching

diffusion-models efficient-inference

Created 2024-06-03

22 commits to main branch, last one 9 months ago

BigLittleDecoder kssteven418

10

90

apache-2.0

5

[NeurIPS'23] Speculative Decoding with Big Little Decoder

llm decoding fast-inference efficient-inference speculative-decoding speculative-execution

Created 2023-02-10

11,217 commits to main branch, last one about a year ago

STR RAIVNLab

11

89

apache-2.0

6

Soft Threshold Weight Reparameterization for Learnable Sparsity

cnn str icml icml2020 imagenet sparsity icml-2020 soft-thresholding learnable-sparsity resource-efficient efficient-inference edge-machine-learning sparsity-optimization soft-threshold-reparameterization

Created 2020-04-11

53 commits to master branch, last one 3 years ago

graphless-neural-networks snap-research

21

88

mit

7

[ICLR 2022] Code for Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation (GLNN)

gnn pytorch scalability distillation deep-learning graph-algorithm efficient-inference graph-neural-networks knowledge-distillation

Created 2021-10-27

33 commits to main branch, last one 5 months ago

AAR qiuk2

6

67

unknown

7

[Official Implementation] Acoustic Autoregressive Modeling 🔥

audio-tokenizer efficient-inference next-scale-prediction autoregressive-generation

Created 2024-08-16

28 commits to main branch, last one 7 months ago

AdaptiveDiffusion Alpha-Innovator

3

64

apache-2.0

3

[NeurIPS'24] Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy

training-free diffusion-models stable-diffusion adaptive-inference model-acceleration efficient-inference

Created 2024-10-11

17 commits to master branch, last one 2 months ago

fast_robust_early_exit raymin0223

7

57

unknown

2

Fast and Robust Early-Exiting Framework for Autoregressive Language Models with Synchronized Parallel Decoding (EMNLP 2023 Long)

nlp llms early-exiting efficient-inference autoregressive-models

Created 2023-06-26

19 commits to main branch, last one 6 months ago

Partially-Observed-TreeCRFs FranxYao

7

52

unknown

4

Implementation of AAAI 21 paper: Nested Named Entity Recognition with Partially Observed TreeCRFs

crf tree-crf sum-product tree-structure efficient-inference sum-product-algorithm named-entity-recognition nested-named-entity-recognition

Created 2020-12-10

10 commits to main branch, last one 3 years ago

AdaMML IBM

9

50

apache-2.0

5

Official implementation of AdaMML. https://arxiv.org/abs/2105.05165.

deep-learning computer-vision efficient-inference multimodal-learning

This repository has been archived

Created 2021-10-07

16 commits to main branch, last one 3 years ago

lzu tchittesh

5

47

mit

1

Code for Learning to Zoom and Unzoom (CVPR 2023)

3d-detection spatial-attention autonomous-driving efficient-inference

Created 2023-03-16

7 commits to main branch, last one about a year ago

TinyML-Benchmark-NNs-on-MCUs bharathsudharsan

11

35

mit

2

Code for WF-IoT paper 'TinyML Benchmark: Executing Fully Connected Neural Networks on Commodity Microcontrollers'

tflite tinyml tfmicro arduinio cmsis-nn mcu-boards armcortexm0 armcortexm4 armcortexm7 c-code-generator machine-learning tinyml-benchmark raspberry-pi-pico efficient-inference

Created 2021-05-08

40 commits to main branch, last one 2 years ago

LayerMerge snu-mllab

1

29

mit

6

Official PyTorch implementation of "LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging" (ICML'24)

efficient-inference neural-network-pruning efficient-deep-learning neural-network-compression

Created 2024-05-31

10 commits to main branch, last one 8 months ago