Search Results - RepositoryStats

Liger-Kernel linkedin

284

4.7k

bsd-2-clause

47

Efficient Triton Kernels for LLM Training

llms phi3 llama gemma2 llama3 triton mistral finetuning llm-training triton-kernels

Created 2024-08-06

430 commits to main branch, last one 16 hours ago

kernl ELS-RD

96

1.6k

apache-2.0

27

Kernl lets you run PyTorch transformer models several times faster on GPU with a single line of code, and is designed to be easily hackable.

cuda triton pytorch cuda-kernel transformer

Created 2022-08-05

122 commits to main branch, last one about a year ago

SageAttention thu-ml

73

1.2k

apache-2.0

25

Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.

llm cuda mlsys triton attention quantization video-generation efficient-attention inference-acceleration

Created 2024-10-03

84 commits to main branch, last one a day ago

containerpilot TritonDataCenter

134

1.1k

mpl-2.0

81

A service for autodiscovery and configuration of applications running in containers

consul docker joyent triton containers orchestration containerpilot service-discovery

Created 2015-10-22

697 commits to master branch, last one 4 years ago

Tigress_protection JonathanSalwan

143

823

unknown

38

Playing with the Tigress software protection. Break some of its protections and solve their reverse engineering challenges. Automatic deobfuscation using symbolic execution, taint analysis and LLVM.

llvm triton tigress deobfuscation taint-analysis symbolic-execution reverse-engineering tigress-protections solution-tigress-challenge

Created 2016-10-28

39 commits to master branch, last one about a year ago

awesome-llm-and-aigc coderonion

57

638

unknown

14

🚀🚀🚀 A collection of some awesome public projects about Large Language Model (LLM), Vision Language Model (VLM), Vision Language Action (VLA), AI Generated Content (AIGC), the related Datasets and Applica...

Created 2023-02-15

160 commits to main branch, last one 5 days ago

attorch BobMcDear

28

523

mit

9

A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.

cuda openai triton pytorch deep-learning openai-triton machine-learning

Created 2023-07-13

164 commits to main branch, last one 29 days ago

FlagGems FlagOpen

73

456

apache-2.0

19

FlagGems is an operator library for large language models implemented in Triton Language.

triton pytorch triton-kernels

Created 2024-03-21

445 commits to master branch, last one 5 hours ago

acer-predator-turbo-and-rgb-keyboard-linux-module JafarAkhondali

78

426

gpl-3.0

20

Linux kernel module to support Turbo mode and RGB Keyboard for Acer Predator notebook series

led rgb acer linux turbo helios triton rgb-led predator hacktoberfest

Created 2021-05-13

202 commits to main branch, last one 3 months ago

triton-resources rkinas

20

306

unknown

5

A curated list of resources for learning and exploring Triton, OpenAI's programming language for writing efficient GPU code.

cuda triton

Created 2024-12-07

83 commits to main branch, last one 10 days ago

DI-hpc opendilab

7

225

apache-2.0

4

OpenDILab RL HPC OP Lib, including CUDA and Triton kernel

hpc cuda lstm triton pytorch reinforcement-learning

Created 2021-07-05

10 commits to main branch, last one 8 months ago

awesome-cuda-triton-hpc coderonion

27

221

unknown

5

🔥🔥🔥 A collection of some awesome public CUDA, cuBLAS, cuDNN, CUTLASS, TensorRT, TensorRT-LLM, Triton, TVM, MLIR and High Performance Computing (HPC) projects.

Created 2023-02-23

31 commits to main branch, last one 5 days ago

Dna Colton1skees

20

220

gpl-3.0

5

LLVM based static binary analysis framework

x86 llvm binary lifter triton x86-64 llvm-ir analysis deobfuscation static-analysis program-analysis instruction-semantics

Created 2022-03-12

146 commits to master branch, last one 5 months ago

trident kakaobrain

12

183

apache-2.0

3

A performance library for machine learning applications.

ai python triton library pytorch performance deep-learning machine-learning

This repository has been archived

Created 2023-04-30

234 commits to main branch, last one about a year ago

bspwm-dots mmsaeed509

10

165

gpl-3.0

5

Ozoz dotfiles for bspwm, i3WM

acer arch i3wm rofi bspwm linux turbo helios neovim triton polybar dotfiles exodiaos neofetch predator archlinux

Created 2022-04-16

261 commits to master branch, last one 8 months ago

clearml-serving clearml

42

146

apache-2.0

10

ClearML - Model-Serving Orchestration and Repository Solution

ai mlops devops triton clearml serving kubernetes serving-ml deep-learning model-serving machine-learning tensorflow-serving serving-pytorch-models triton-inference-server

Created 2021-04-12

143 commits to main branch, last one 2 months ago

isaac_ros_object_detection NVIDIA-ISAAC-ROS

32

143

apache-2.0

3

NVIDIA-accelerated, deep learned model support for image space object detection

gpu ros ros2 jetson nvidia triton tensorrt inference ros2-humble deep-learning machine-learning object-detection

Created 2022-03-22

43 commits to main branch, last one 21 days ago

Savior novioleo

28

137

bsd-2-clause

7

(WIP) The deployment framework aims to provide a simple, lightweight, fast integrated, pipelined deployment framework for algorithm service that ensures reliability, high concurrency and scalability of...

rpa triton workflow deployment distributed deeplearning

Created 2021-03-07

408 commits to master branch, last one 3 years ago

hip-attention DeepAuto-AI

14

127

other

8

Training-free Post-training Efficient Sub-quadratic Complexity Attention. Implemented with OpenAI Triton.

triton attention hip-attention openai-triton attention-mechanism sub-quadratic-attention

Created 2024-02-29

1,250 commits to deepauto/dev branch, last one a day ago

isaac_ros_dnn_inference NVIDIA-ISAAC-ROS

16

111

apache-2.0

4

NVIDIA-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU

ai dnn gpu ros tao ros2 jetson nvidia triton tensorrt ros2-humble deeplearning deep-learning tensorrt-inference triton-inference-server

Created 2021-10-13

46 commits to main branch, last one 21 days ago

flashattention2-custom-mask alexzhang13

11

102

apache-2.0

4

Triton implementation of FlashAttention2 that adds Custom Masks.

triton attention triton-lang cuda-kernels deep-learning flash-attention flash-attention-2 attention-mechanism

Created 2024-07-20

18 commits to main branch, last one 7 months ago

fastDeploy notAI-tech

17

97

mit

7

Deploy DL/ML inference pipelines with minimal extra code.

Created 2020-04-09

503 commits to master branch, last one 4 months ago

triton triton

9

67

mit

12

Triton Operating System

nix linux nixos atomic triton nixpkgs systemd packages atomic-updates operating-system linux-distribution declarative-language

Created 2015-06-02

94,340 commits to master branch, last one 5 years ago

triton-cn hyperai

6

60

mit

4

Triton Documentation in Chinese Simplified / Triton 中文文档

gpu nvidia openai triton translation deep-learning documentation machine-learning chinese-simplified

Created 2024-09-19

57 commits to master branch, last one 3 months ago

triton-bn ergrelet

4

58

apache-2.0

7

Binary Ninja plugin that can be used to apply Triton's dead store elimination pass on basic blocks or functions.

cpp triton binary-ninja deobfuscation binary-ninja-plugin reverse-engineering

Created 2022-06-06

21 commits to main branch, last one 8 months ago

redis-nvidia-recsys redis-developer

6

57

bsd-3-clause

5

Three examples of recommendation system pipelines with NVIDIA Merlin and Redis

dlrm redis triton nvidia-merlin vector-search recommendation vector-database recommender-system

Created 2022-11-21

10 commits to master branch, last one 2 years ago

nixos-nvidia-cuda-python-docker-compose suvash

5

41

unknown

2

A step-by-step guide to setting up Nvidia GPUs with CUDA support running on Docker (and Compose) containers on NixOS host

jax nixos nvidia python triton pytorch nvidia-smi tensorflow nvidia-cuda deep-learning nvidia-docker docker-compose

Created 2021-08-07

24 commits to main branch, last one 8 days ago

fast-audiomentations Lallapallooza

1

33

mit

2

⚡ Blazing fast audio augmentation in Python, powered by GPU for high-efficiency processing in machine learning and audio analysis tasks.

dsp gpu audio python triton pytorch audio-effects augmentations machine-learning data-augmentation audio-augmentation audio-data-augmentation

Created 2024-01-19

2 commits to main branch, last one about a year ago

Triformer dame-cell

0

32

mit

1

Transformers components but in Triton

gpt2 triton transformer-architecture

Created 2024-10-14

255 commits to main branch, last one 2 days ago

Triton_RAT WhiteeRabbit

4

25

mit

1

🦎Triton_RAT is free and easy to use, one of the best remote administration tools written in Python, fully integrated with Telegram🦎

rat python triton python3 windows pc-control windows-rat telegram-bot remote-control windows-python remote-access-tool windows-python-rat remote-administrator-tool remote-administration-tool remote-administration-trojan telegram-remote-administration-tool

Created 2024-05-27

60 commits to release branch, last one about a month ago