2 results found Sort:

No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)
Created 2022-02-04
4 commits to main branch, last one 3 years ago
1
27
unknown
1
AdaTask: A Task-Aware Adaptive Learning Rate Approach to Multi-Task Learning. AAAI, 2023.
Created 2023-03-24
5 commits to master branch, last one about a year ago