10 results found
AI Observability & Evaluation
Created 2022-11-09; 4,134 commits to main branch, last one a day ago.
Python SDK for AI agent monitoring, LLM cost tracking, benchmarking, and more. Integrates with most LLMs and agent frameworks, including CrewAI, LangChain, AutoGen, AG2, and CamelAI.
Created 2023-08-15; 597 commits to main branch, last one 4 days ago.
Laminar - open-source, all-in-one platform for engineering AI products. Create a data flywheel for your AI app. Traces, Evals, Datasets, Labels. YC S24.
Created 2024-08-29; 325 commits to main branch, last one 21 hours ago.
The TypeScript AI framework.
Created 2024-08-06; 6,318 commits to main branch, last one 3 hours ago.
🥤 RAGLite is a Python toolkit for Retrieval-Augmented Generation (RAG) with PostgreSQL or SQLite
Created 2024-06-10; 63 commits to main branch, last one 27 days ago.
Test your LLM-powered apps with TypeScript. No API key required.
Created 2024-11-12; 345 commits to main branch, last one 4 days ago.
Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
Created 2024-08-08; 488 commits to main branch, last one 2 days ago.
[NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding
Topics: evals, gpt-4, reasoning, gemini-pro, navigation, perception, neurips-2024, summarization, visual-reasoning, benchmark-dataset, egocentric-videos, spatial-intelligence, multiple-choice-questions, long-context-understanding, video-language-understanding, multimodal-large-language-models, 1-hour-video-language-understanding, long-form-video-language-understanding
Created 2024-11-27; 9 commits to main branch, last one 24 days ago.
A library for evaluating Retrieval-Augmented Generation (RAG) systems using traditional methods.
Created 2024-05-21; 31 commits to main branch, last one 5 months ago.
Evalica, your favourite evaluation toolkit
Created 2024-06-15; 345 commits to master branch, last one 7 days ago.