10 results found
What can Large Language Models do in chemistry? A comprehensive benchmark on eight tasks
Created
2023-05-21
62 commits to main branch, last one 4 months ago
Official repository of MMGenBench
Created
2024-11-18
6 commits to main branch, last one 11 days ago
BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models
Created
2024-08-21
79 commits to main branch, last one 3 months ago
Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)
Created
2023-07-24
1,067 commits to main branch, last one 2 months ago
How good are LLMs at chemistry?
Created
2023-05-16
1,083 commits to main branch, last one about a month ago
Language Model for Mainframe Modernization
Created
2024-08-02
30 commits to main branch, last one 3 months ago
CompBench evaluates the comparative reasoning of multimodal large language models (MLLMs) with 40K image pairs and questions across 8 dimensions of relative comparison: visual attribute, existence, st...
Created
2024-07-23
4 commits to main branch, last one 3 months ago
The data and implementation for the experiments in the paper "Flows: Building Blocks of Reasoning and Collaborating AI".
Created
2023-08-02
6 commits to main branch, last one 9 months ago
Restore safety in fine-tuned language models through task arithmetic
Created
2024-02-17
83 commits to main branch, last one 8 months ago
Training and Benchmarking LLMs for Code Preference.
Created
2024-10-22
10 commits to main branch, last one 17 days ago