Trending repositories for topic model-compression
Awesome Knowledge-Distillation. Knowledge distillation papers (2014-2021), organized by category.
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyperparameter tuning.
Pretrained language models and related optimization techniques developed by Huawei Noah's Ark Lab.
A collection of computer vision projects and tools.
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously being improved. Pull requests adding works (papers, repositories) missed by the repo are welcome.
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Efficient computing methods developed by Huawei Noah's Ark Lab
Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"
The official implementation of the paper <MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression>
[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
PyTorch implementation of various Knowledge Distillation (KD) methods (a generic soft-target KD loss sketch follows this list).
A list of papers, docs, and code about efficient AIGC. This repo aims to provide information for efficient AIGC research, covering both language and vision, and is continuously being improved. Pull requests adding works (papers, repositories) missed by the repo are welcome.
Official Repo for SparseLLM: Global Pruning of LLMs (NeurIPS 2024)
Gathers research papers, corresponding code (if available), reading notes, and other related materials about hot 🔥 fields in deep-learning-based computer vision.
[ICLR 2021] HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
Knowledge distillation for text classification with PyTorch: Chinese text classification with BERT and XLNet teacher models and a BiLSTM student model.
Awesome machine learning model compression research papers, quantization, tools, and learning material.
List of papers related to neural network quantization in recent AI conferences and journals.
A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
[CVPR 2024 Highlight] Logit Standardization in Knowledge Distillation
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
😎 A curated list of tensor decomposition resources for model compression.
Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"
a collection of computer vision projects&tools. 计算机视觉方向项目和工具集合。
The official implementation of the paper <MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression>
The official implementation of the paper "What Matters in Transformers? Not All Attention is Needed".
A list of papers, docs, codes about efficient AIGC. This repo is aimed to provide the info for efficient AIGC research, including language and vision, we are continuously improving the project. Welcom...
List of papers related to neural network quantization in recent AI conferences and journals.
[CVPR 2024 Highlight] Logit Standardization in Knowledge Distillation
A list of papers, docs, and code about diffusion distillation. This repo collects various distillation methods for diffusion models. Pull requests adding works (papers, repositories) missed by the repo are welcome.
Resources of our survey paper "A Comprehensive Survey on AI Integration at the Edge: Techniques, Applications, and Challenges"
Vocabulary Trimming (VT) is a model compression technique that reduces a multilingual LM's vocabulary to a target language by deleting irrelevant tokens from its vocabulary (see the sketch after this list). This repository contains a...
The official implementation of the paper "Demystifying the Compression of Mixture-of-Experts Through a Unified Framework".
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
[NeurIPS 2024] SlimSAM: 0.1% Data Makes Segment Anything Slim
PyTorch Lightning implementation of the paper Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. This repository allows reproducing the paper's main findings... (a toy sketch of the pruning and weight-sharing stages follows this list).
Less is More: Task-aware Layer-wise Distillation for Language Model Compression (ICML2023)
This is a collection of our research on efficient AI, covering hardware-aware NAS and model compression.
[ICML 2023] UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers.
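Illustrative sketches of a few of the techniques named above follow. For the SVD-LLM entry, a generic truncated-SVD factorization conveys the kind of low-rank weight compression such methods build on; the sketch below is a plain rank-r SVD of a weight matrix, not the paper's truncation-aware procedure, and all shapes are illustrative.

```python
# Generic truncated-SVD low-rank factorization of a weight matrix, included
# only to illustrate the decomposition SVD-based LLM compression builds on;
# it is NOT the truncation-aware procedure from SVD-LLM.
import torch

def low_rank_factorize(weight: torch.Tensor, rank: int):
    """Split W (out x in) into two thin matrices A (out x r) and B (r x in)."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # fold the singular values into A
    B = Vh[:rank, :]
    return A, B

W = torch.randn(512, 2048)                 # illustrative layer shape
A, B = low_rank_factorize(W, rank=64)
approx = A @ B                             # one big matmul becomes two thin ones
rel_err = torch.linalg.norm(W - approx) / torch.linalg.norm(W)
print(A.shape, B.shape, f"relative error: {rel_err.item():.3f}")
# Parameters drop from 512*2048 to 64*(512+2048) when rank << min(out, in).
```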
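For the knowledge-distillation entries (e.g. the BERT/XLNet-to-BiLSTM text-classification repo), here is a minimal sketch of the standard soft-target distillation loss. The temperature and mixing weight are illustrative defaults, not values taken from any of the repositories above.

```python
# Minimal sketch of the classic soft-target knowledge-distillation loss:
# KL divergence between temperature-softened teacher and student
# distributions, blended with the usual cross-entropy on hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.7):
    """Blend soft-target KL divergence with hard-label cross-entropy."""
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean")
    kd = kd * (temperature ** 2)           # standard gradient rescaling
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce

# Example: an 8-example batch over 5 classes with random logits.
student_logits = torch.randn(8, 5)
teacher_logits = torch.randn(8, 5)
labels = torch.randint(0, 5, (8,))
print(distillation_loss(student_logits, teacher_logits, labels))
```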
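For the Vocabulary Trimming entry, a minimal sketch of the core idea: keep only the embedding rows for tokens that actually occur in a target-language corpus and remap token ids accordingly. The token-id list below is a placeholder, and this is not the repository's actual API.

```python
# Minimal sketch of vocabulary trimming: drop embedding rows for tokens
# that never appear in the target-language corpus, then remap ids.
import torch

def trim_embeddings(embedding: torch.Tensor, used_token_ids):
    """Return a smaller embedding matrix plus an old-id -> new-id map."""
    keep = sorted(set(used_token_ids))
    id_map = {old: new for new, old in enumerate(keep)}
    trimmed = embedding[torch.tensor(keep)]
    return trimmed, id_map

# Toy example: a 10-token vocabulary where only 4 tokens are ever used.
full_embedding = torch.randn(10, 8)   # (vocab_size, hidden_dim)
used = [0, 3, 3, 7, 9]                # ids seen in the target corpus (hypothetical)
small_embedding, id_map = trim_embeddings(full_embedding, used)
print(small_embedding.shape)          # torch.Size([4, 8])
print(id_map)                         # {0: 0, 3: 1, 7: 2, 9: 3}
```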
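For the Deep Compression entry, a toy sketch of the first two stages (magnitude pruning, then k-means weight sharing). The retraining loops and the Huffman-coding stage are omitted, and the sparsity level and codebook size are illustrative, not the paper's settings.

```python
# Toy sketch of Deep Compression's pruning and weight-sharing stages:
# zero out small-magnitude weights, then cluster the survivors so each
# weight only needs a small codebook index.
import numpy as np
from sklearn.cluster import KMeans

def prune_by_magnitude(weights: np.ndarray, sparsity: float = 0.8):
    """Zero out the smallest-magnitude weights until `sparsity` is reached."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) > threshold
    return weights * mask, mask

def quantize_shared_weights(weights: np.ndarray, mask: np.ndarray, n_clusters: int = 16):
    """Cluster surviving weights and replace each with its cluster centroid."""
    survivors = weights[mask].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(survivors)
    quantized = weights.copy()
    quantized[mask] = km.cluster_centers_[km.labels_].ravel()
    return quantized, km.cluster_centers_.ravel()

w = np.random.randn(64, 64).astype(np.float32)
pruned, mask = prune_by_magnitude(w, sparsity=0.8)
quantized, codebook = quantize_shared_weights(pruned, mask, n_clusters=16)
print(f"fraction kept: {mask.mean():.2f}, codebook size: {codebook.size}")
```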