9 results found
Running large language models on a single GPU for throughput-oriented scenarios.
This repository has been archived
Created 2023-02-15
107 commits to main branch, last one 4 months ago
Run Mixtral-8x7B models in Colab or consumer desktops
Created 2023-12-15
86 commits to master branch, last one about a year ago
PyTorch native quantization and sparsity for training and inference
Created 2023-11-03
1,145 commits to main branch, last one a day ago
A QoE-Oriented Computation Offloading Algorithm based on Deep Reinforcement Learning (DRL) for Mobile Edge Computing (MEC) | This algorithm captures the dynamics of the MEC environment by integrating ...
Created 2023-07-31
193 commits to main branch, last one 18 hours ago
LLM Inference on consumer devices
Created 2024-12-25
149 commits to v0.1.0 branch, last one 5 days ago
DPDK infrastructure for software acceleration. Currently working on RX and ACL pre-filter
Created 2019-06-10
107 commits to master branch, last one 4 years ago
DPU-Powered File System Virtualization over virtio-fs
Created 2022-07-20
371 commits to master branch, last one about a year ago
A collection of tests for the Open vSwitch HW offload.
Created 2020-05-27
5,649 commits to master branch, last one 4 months ago
ZO2 (Zeroth-Order Offloading): Full Parameter Fine-Tuning 175B LLMs with 18GB GPU Memory
Created 2025-01-07
163 commits to main branch, last one 3 days ago