deshwalmahesh / PHUDGE

Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric or reference answer, with absolute or relative grading, and much more. It also contains a list of the available tools, methods, repos, code, etc. for hallucination detection, LLM evaluation, grading, and much more.

Date Created 2024-05-11 (8 months ago)

Commits 30 (last one 6 months ago)

Stargazers 48 (0 this week)

Watchers 1 (0 this week)

Forks 7

License unknown

Ranking

RepositoryStats indexes 609,066 repositories. Of these, deshwalmahesh/PHUDGE is ranked #488,789 (20th percentile) for total stargazers and #554,446 for total watchers. GitHub reports the primary language for this repository as Jupyter Notebook; among repositories using this language, it is ranked #13,049/18,075.
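The percentile figure above can be reproduced from the rank and the index size. The sketch below assumes the percentile means the share of indexed repositories ranked below this one; RepositoryStats does not state its formula on this page, so both that definition and the helper name `percentile_from_rank` are illustrative assumptions.

```python
def percentile_from_rank(rank: int, total: int) -> float:
    """Percentage of indexed repositories ranked below `rank`.

    Assumed definition, not confirmed by RepositoryStats.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank) / total


# Stargazer rank and index size as reported on this page.
stars_pct = percentile_from_rank(488_789, 609_066)
print(round(stars_pct))  # 20
```

Under this assumed definition, the stargazer rank works out to roughly 19.7, which rounds to the 20th percentile reported above.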

deshwalmahesh/PHUDGE is also tagged with popular topics. For these, it is ranked: pytorch (#5,101/6,095), ai (#3,260/4,341), llm (#2,355/3,111), nlp (#2,089/2,463), ml (#512/635)

Other Information

deshwalmahesh/PHUDGE has GitHub issues enabled; there is 1 open issue and 0 closed issues.

Homepage URL: https://arxiv.org/abs/2405.08029

All Topics

ai ml llm nlp sota judge phi-3 pytorch evaluation finetuning hallucination custom-dataset llm-evaluation feedback-collection hallucination-detection

Star History

GitHub stargazers over time

Watcher History

GitHub watchers over time; collection started in '23

Recent Commit History

30 commits on the default branch (main) since Jan '22

Yearly Commits

Commits to the default branch (main) per year

Languages

The primary language is Jupyter Notebook, but there are also others...

updated: 2025-01-10 @ 12:57pm, id: 799117410 / R_kgDOL6GQYg