2 results found Sort:
No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models (ICLR 2022)
Created
2022-02-04
4 commits to main branch, last one 3 years ago
AdaTask: A Task-Aware Adaptive Learning Rate Approach to Multi-Task Learning. AAAI, 2023.
Created
2023-03-24
5 commits to master branch, last one about a year ago