65 results found

965
10.7k
other
34
🪢 Open source LLM engineering platform: LLM Observability, metrics, evals, prompt management, playground, datasets. Integrates with OpenTelemetry, Langchain, OpenAI SDK, LiteLLM, and more. 🍊YC W23
Created 2023-05-18
4,030 commits to main branch, last one 14 hours ago
488
6.8k
apache-2.0
62
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Created 2023-05-10
1,602 commits to main branch, last one 10 hours ago
514
6.3k
mit
20
Test your prompts, agents, and RAGs. Red teaming, pentesting, and vulnerability scanning for LLMs. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command ...
Created 2023-04-28
4,238 commits to main branch, last one 21 hours ago
531
6.1k
apache-2.0
27
The LLM Evaluation Framework
Created 2023-08-10
4,806 commits to main branch, last one 14 hours ago
403
5.5k
other
37
AI Observability & Evaluation
Created 2022-11-09
4,909 commits to main branch, last one 24 hours ago
318
4.5k
apache-2.0
35
🐢 Open-Source Evaluation & Testing for AI & LLM systems
Created 2022-03-06
10,306 commits to main branch, last one 2 days ago
426
4.3k
apache-2.0
40
The LLM vulnerability scanner
Created 2023-05-10
1,924 commits to main branch, last one 11 hours ago
304
3.9k
apache-2.0
32
AutoRAG: An Open-Source Framework for Retrieval-Augmented Generation (RAG) Evaluation & Optimization with AutoML-Style Automation
Created 2024-01-10
849 commits to main branch, last one 18 hours ago
363
3.6k
apache-2.0
22
🧊 Open source LLM observability platform. One line of code to monitor, evaluate, and experiment. YC W23 🍓
Created 2023-01-31
3,747 commits to main branch, last one 18 hours ago
The LLM's practical guide: From the fundamentals to deploying advanced LLM and RAG apps to AWS using LLMOps best practices
Created 2024-04-09
199 commits to main branch, last one about a month ago
308
2.6k
mit
29
The open-source LLMOps platform: prompt playground, prompt management, LLM evaluation, and LLM observability all in one place.
Created 2023-04-26
12,083 commits to main branch, last one 2 days ago
113
1.9k
apache-2.0
11
Laminar - open-source all-in-one platform for engineering AI products. Create a data flywheel for your AI app. Traces, Evals, Datasets, Labels. YC S24.
Created 2024-08-29
540 commits to main branch, last one a day ago
206
1.3k
apache-2.0
18
Agentic LLM Vulnerability Scanner / AI red teaming kit 🧪
Created 2024-04-11
581 commits to main branch, last one 15 days ago
Prompty makes it easy to create, manage, debug, and evaluate LLM prompts for your AI applications. Prompty is an asset class and format for LLM prompts designed to enhance observability, understandab...
Created 2024-04-22
451 commits to main branch, last one 10 hours ago
53
523
apache-2.0
9
A powerful tool for automated LLM fuzzing. It is designed to help developers and security researchers identify and mitigate potential jailbreaks in their LLM APIs.
Created 2024-12-03
187 commits to main branch, last one a day ago
Awesome-LLM-Eval: a curated list of tools, datasets/benchmarks, demos, leaderboards, papers, docs, and models, mainly for evaluating foundation LLMs, aiming to explore the technical boundaries of generative AI.
Created 2023-04-26
263 commits to main branch, last one 6 months ago
34
489
apache-2.0
4
Data-Driven Evaluation for LLM-Powered Applications
Created 2023-12-08
106 commits to main branch, last one 3 months ago
Awesome papers involving LLMs in Social Science.
Created 2023-10-15
161 commits to main branch, last one 3 days ago
A curated list of 🌌 Azure OpenAI, 🦙 Large Language Models (incl. RAG, Agent), and references with memos.
Created 2023-04-13
190 commits to main branch, last one a day ago
Build, Improve Performance, and Productionize your LLM Application with an Integrated Framework
Created 2024-03-12
419 commits to main branch, last one 4 months ago
Python SDK for running evaluations on LLM generated responses
Created 2023-11-22
791 commits to main branch, last one 9 days ago
All-in-one Web Agent framework for post-training. Start building with a few clicks!
Created 2024-06-06
526 commits to main branch, last one 2 months ago
26
227
apache-2.0
5
A list of LLMs Tools & Projects
Created 2023-05-09
70 commits to main branch, last one 11 days ago
LangFair is a Python library for conducting use-case level LLM bias and fairness assessments
Created 2024-09-20
319 commits to main branch, last one 3 days ago
A comprehensive set of LLM benchmark scores and provider prices.
Created 2024-09-07
87 commits to main branch, last one about a month ago
A comprehensive guide to LLM evaluation methods designed to assist in identifying the most suitable evaluation techniques for various use cases, promote the adoption of best practices in LLM assessmen...
Created 2024-04-02
393 commits to main branch, last one 2 days ago
Framework for LLM evaluation, guardrails and security
Created 2024-03-02
9 commits to main branch, last one 7 months ago
Superpipe - optimized LLM pipelines for structured data
Created 2024-02-07
99 commits to main branch, last one 10 months ago
Initiative to evaluate and rank the most popular LLMs across common task types based on their propensity to hallucinate.
Created 2023-11-15
7 commits to main branch, last one 7 months ago