15 results found

169 · 560 · bsd-3-clause · 10
[VLDB'22] Anomaly Detection using Transformers, self-conditioning and adversarial training.
Created 2021-03-01
110 commits to main branch, last one about a year ago
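TranAD's full recipe (self-conditioning plus adversarial training) does not fit in a few lines, but a minimal reconstruction-error anomaly scorer built on a Transformer encoder conveys the underlying idea. The sketch below is purely illustrative and not taken from the repository; the class name, dimensions, and window sizes are made up here.

    import torch
    import torch.nn as nn

    class ReconstructionScorer(nn.Module):
        """Toy reconstruction-based anomaly scorer (not TranAD itself):
        encode a window of multivariate readings, reconstruct it, and use
        the per-window reconstruction error as the anomaly score."""
        def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
            super().__init__()
            self.embed = nn.Linear(n_features, d_model)
            layer = nn.TransformerEncoderLayer(d_model, n_heads, dim_feedforward=128, batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, n_layers)
            self.decode = nn.Linear(d_model, n_features)

        def forward(self, x):                      # x: (batch, window, n_features)
            recon = self.decode(self.encoder(self.embed(x)))
            return ((recon - x) ** 2).mean(dim=(1, 2))  # one score per window

    scorer = ReconstructionScorer(n_features=8)
    windows = torch.randn(16, 32, 8)               # 16 windows of 32 timesteps each
    print(scorer(windows).shape)                    # torch.Size([16])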
Build llama3 inference step by step: grasp the core concepts, work through the derivations, and implement the code.
Created 2025-02-19
9 commits to main branch, last one 26 days ago
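One of the core pieces such a llama3 walk-through has to cover is rotary position embedding (RoPE). A minimal, self-contained sketch of applying RoPE to a query tensor, using the common base of 10000 and the rotate-half convention (not necessarily this repository's exact code):

    import torch

    def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
        """Rotary position embedding on x of shape (batch, seq, heads, head_dim).
        Channel pairs are rotated by a position-dependent angle."""
        _, seq, _, d = x.shape
        half = d // 2
        freqs = base ** (-torch.arange(0, half, dtype=torch.float32) / half)      # (half,)
        angles = torch.arange(seq, dtype=torch.float32)[:, None] * freqs[None, :]  # (seq, half)
        cos = angles.cos()[None, :, None, :]   # broadcast over batch and heads
        sin = angles.sin()[None, :, None, :]
        x1, x2 = x[..., :half], x[..., half:]
        return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

    q = torch.randn(2, 10, 8, 64)
    print(apply_rope(q).shape)   # torch.Size([2, 10, 8, 64])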
PyTorch implementations of various attention mechanisms for deep learning researchers.
Created 2020-03-21
89 commits to master branch, last one 3 years ago
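Most of the attention variants collected in repositories like the one above reduce to scaled dot-product attention at their core. A generic PyTorch sketch (not code from the repository):

    import math
    import torch

    def scaled_dot_product_attention(q, k, v, mask=None):
        """q, k, v: (batch, seq, d_k); returns (context, attention_weights)."""
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))   # (batch, seq_q, seq_k)
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        return weights @ v, weights

    q = k = v = torch.randn(2, 5, 16)
    ctx, attn = scaled_dot_product_attention(q, k, v)
    print(ctx.shape, attn.shape)   # torch.Size([2, 5, 16]) torch.Size([2, 5, 5])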
127 · 507 · mpl-2.0 · 25
Deep Xi: A deep learning approach to a priori SNR estimation implemented in TensorFlow 2/Keras. For speech enhancement and robust ASR.
Created 2018-05-25
425 commits to master branch, last one 3 years ago
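Deep Xi estimates the a priori SNR ξ = σ_s² / σ_d² per time-frequency bin. One standard way to turn such an estimate into enhancement (a generic illustration, not necessarily Deep Xi's exact pipeline) is a Wiener-style gain G = ξ / (1 + ξ) applied to the noisy magnitude spectrum:

    import numpy as np

    def wiener_enhance(noisy_mag: np.ndarray, xi: np.ndarray) -> np.ndarray:
        """Apply the Wiener gain G = xi / (1 + xi) to the noisy magnitude spectrum.
        noisy_mag, xi: arrays of shape (frames, freq_bins); xi is the estimated
        a priori SNR on a linear scale, e.g. 10 ** (xi_dB / 10)."""
        gain = xi / (1.0 + xi)
        return gain * noisy_mag

    # toy example: 100 frames x 257 bins, a priori SNR of 5 dB everywhere
    noisy_mag = np.abs(np.random.randn(100, 257))
    xi = np.full((100, 257), 10 ** (5 / 10))
    print(wiener_enhance(noisy_mag, xi).shape)   # (100, 257)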
35 · 359 · mit · 5
Exploring attention weights in transformer-based models with linguistic knowledge.
Created 2020-10-30
186 commits to master branch, last one about a year ago
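Inspecting attention weights as in the entry above typically starts by asking the model to return them; with the Hugging Face transformers library this is the output_attentions flag, along the lines of the sketch below (the model name bert-base-uncased is just an example, not a claim about the repository):

    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    inputs = tokenizer("Attention weights can carry linguistic signal.", return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, output_attentions=True)

    # outputs.attentions: one (batch, heads, seq, seq) tensor per layer
    for layer_idx, attn in enumerate(outputs.attentions):
        print(layer_idx, attn.shape)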
"Attention, Learn to Solve Routing Problems!"[Kool+, 2019], Capacitated Vehicle Routing Problem solver
Created 2020-06-24
89 commits to master branch, last one 4 years ago
This repository contains various types of attention mechanisms, such as Bahdanau attention, soft attention, additive attention, and hierarchical attention, implemented in PyTorch, TensorFlow, and Keras.
Created 2018-07-04
59 commits to master branch, last one 3 years ago
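Of the variants listed above, Bahdanau (additive) attention scores each encoder state against the decoder state with a small feed-forward network, score(s, h_i) = vᵀ tanh(W_s s + W_h h_i). A minimal PyTorch version, written here for illustration rather than taken from the repository:

    import torch
    import torch.nn as nn

    class AdditiveAttention(nn.Module):
        """Bahdanau-style attention: score(s, h_i) = v^T tanh(W_s s + W_h h_i)."""
        def __init__(self, dec_dim: int, enc_dim: int, attn_dim: int):
            super().__init__()
            self.w_dec = nn.Linear(dec_dim, attn_dim, bias=False)
            self.w_enc = nn.Linear(enc_dim, attn_dim, bias=False)
            self.v = nn.Linear(attn_dim, 1, bias=False)

        def forward(self, dec_state, enc_states):
            # dec_state: (batch, dec_dim), enc_states: (batch, src_len, enc_dim)
            scores = self.v(torch.tanh(
                self.w_dec(dec_state).unsqueeze(1) + self.w_enc(enc_states)
            )).squeeze(-1)                                            # (batch, src_len)
            weights = torch.softmax(scores, dim=-1)
            context = (weights.unsqueeze(-1) * enc_states).sum(dim=1)  # (batch, enc_dim)
            return context, weights

    attn = AdditiveAttention(dec_dim=32, enc_dim=64, attn_dim=48)
    ctx, w = attn(torch.randn(4, 32), torch.randn(4, 10, 64))
    print(ctx.shape, w.shape)   # torch.Size([4, 64]) torch.Size([4, 10])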
A Faster PyTorch Implementation of Multi-Head Self-Attention
Created 2020-07-28
6 commits to master branch, last one 2 years ago
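For reference alongside the multi-head self-attention entry above, a compact generic implementation splits the model dimension across heads and runs scaled dot-product attention per head (again a sketch, not the repository's code):

    import math
    import torch
    import torch.nn as nn

    class MultiHeadSelfAttention(nn.Module):
        def __init__(self, d_model: int, n_heads: int):
            super().__init__()
            assert d_model % n_heads == 0
            self.n_heads, self.d_head = n_heads, d_model // n_heads
            self.qkv = nn.Linear(d_model, 3 * d_model)
            self.out = nn.Linear(d_model, d_model)

        def forward(self, x):                                   # x: (batch, seq, d_model)
            b, s, _ = x.shape
            q, k, v = self.qkv(x).chunk(3, dim=-1)
            # reshape each to (batch, heads, seq, d_head)
            q, k, v = (t.view(b, s, self.n_heads, self.d_head).transpose(1, 2) for t in (q, k, v))
            scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
            ctx = torch.softmax(scores, dim=-1) @ v             # (batch, heads, seq, d_head)
            return self.out(ctx.transpose(1, 2).reshape(b, s, -1))

    mhsa = MultiHeadSelfAttention(d_model=64, n_heads=8)
    print(mhsa(torch.randn(2, 10, 64)).shape)                   # torch.Size([2, 10, 64])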
Multi^2OIE: Multilingual Open Information Extraction Based on Multi-Head Attention with BERT (Findings of ACL: EMNLP 2020)
Created 2020-09-17
24 commits to master branch, last one 2 years ago
10 · 49 · apache-2.0 · 4
Self-Supervised Vision Transformers for multiplexed imaging datasets
Created 2023-01-16
33 commits to master branch, last one 8 months ago
10 · 47 · unknown · 2
several types of attention modules written in PyTorch for learning purposes
Created 2023-06-28
45 commits to main branch, last one 5 months ago
An implementation of the original Transformer from scratch, with informative comments on each block.
Created 2024-06-15
41 commits to main branch, last one 9 months ago
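A from-scratch Transformer like the one above is organised around the encoder block of "Attention Is All You Need" (self-attention, add & norm, feed-forward, add & norm). A condensed post-norm block using PyTorch's built-in nn.MultiheadAttention, shown here only as a point of comparison and not as the repository's code:

    import torch
    import torch.nn as nn

    class EncoderBlock(nn.Module):
        """One post-norm Transformer encoder block: self-attention -> add & norm
        -> position-wise feed-forward -> add & norm."""
        def __init__(self, d_model: int = 512, n_heads: int = 8, d_ff: int = 2048, dropout: float = 0.1):
            super().__init__()
            self.attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
            self.ffn = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            self.norm1, self.norm2 = nn.LayerNorm(d_model), nn.LayerNorm(d_model)
            self.drop = nn.Dropout(dropout)

        def forward(self, x, key_padding_mask=None):
            attn_out, _ = self.attn(x, x, x, key_padding_mask=key_padding_mask)
            x = self.norm1(x + self.drop(attn_out))
            x = self.norm2(x + self.drop(self.ffn(x)))
            return x

    block = EncoderBlock()
    print(block(torch.randn(2, 16, 512)).shape)   # torch.Size([2, 16, 512])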
Performance of the C++ interface of FlashAttention and FlashAttention-2 in large language model (LLM) inference scenarios.
Created 2023-08-16
1 commit to master branch, last one 23 days ago
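On the Python side, a rough analogue of such a benchmark is timing torch.nn.functional.scaled_dot_product_attention, which can dispatch to a FlashAttention kernel on supported GPUs and dtypes. The shapes and iteration count below are arbitrary, and the numbers are not comparable to the repository's C++ measurements:

    import time
    import torch
    import torch.nn.functional as F

    def time_sdpa(batch=1, heads=32, seq=2048, d_head=128, iters=20, device="cuda"):
        """Rough timing of scaled_dot_product_attention; on recent GPUs with fp16
        inputs this can use a FlashAttention kernel under the hood."""
        q = torch.randn(batch, heads, seq, d_head, device=device, dtype=torch.float16)
        k, v = torch.randn_like(q), torch.randn_like(q)
        F.scaled_dot_product_attention(q, k, v, is_causal=True)   # warm-up
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            F.scaled_dot_product_attention(q, k, v, is_causal=True)
        torch.cuda.synchronize()
        print(f"{(time.perf_counter() - start) / iters * 1e3:.2f} ms per call")

    if torch.cuda.is_available():
        time_sdpa()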
Decoding Attention is specially optimized for MHA, MQA, GQA, and MLA, using CUDA cores for the decoding stage of LLM inference.
Created 2024-08-14
2 commits to master branch, last one 13 days ago
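In the decoding stage a single new query token attends over the cached keys and values; under grouped-query attention (GQA) several query heads share one KV head, with MHA and MQA as the two extremes (as many KV heads as query heads, or just one). A plain PyTorch sketch of that step, kernel-level optimisations aside and with illustrative shapes:

    import math
    import torch

    def gqa_decode_step(q, k_cache, v_cache):
        """One decoding step with grouped-query attention.
        q:        (batch, n_q_heads, 1, d_head)   -- the new token's queries
        k_cache:  (batch, n_kv_heads, seq, d_head)
        v_cache:  (batch, n_kv_heads, seq, d_head)
        Each group of n_q_heads // n_kv_heads query heads shares one KV head."""
        n_q, d = q.shape[1], q.shape[-1]
        group = n_q // k_cache.shape[1]
        # expand KV heads so every query head sees its group's K/V
        k = k_cache.repeat_interleave(group, dim=1)          # (batch, n_q, seq, d)
        v = v_cache.repeat_interleave(group, dim=1)
        scores = q @ k.transpose(-2, -1) / math.sqrt(d)      # (batch, n_q, 1, seq)
        return torch.softmax(scores, dim=-1) @ v             # (batch, n_q, 1, d)

    q = torch.randn(1, 32, 1, 128)         # 32 query heads, one new token
    k_cache = torch.randn(1, 8, 256, 128)  # 8 KV heads, 256 cached positions
    v_cache = torch.randn(1, 8, 256, 128)
    print(gqa_decode_step(q, k_cache, v_cache).shape)   # torch.Size([1, 32, 1, 128])

Setting the number of KV heads equal to the number of query heads recovers MHA; setting it to 1 recovers MQA.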