Trending repositories for topic information-retrieval
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. W...
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
AdalFlow: The library to build & auto-optimize LLM applications.
A list of software that allows searching the web with the assistance of AI.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a c...
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
The code used to train and run inference with the ColPali architecture.
Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
Chatbot for documentation that lets you chat with your data. Privately deployable, it provides AI knowledge sharing and integrates knowledge into your AI workflow
Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & S...
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖
Telegram group scraper tool. Fetches all information about group members.
Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.
A list of software that allows searching the web with the assistance of AI.
AdalFlow: The library to build & auto-optimize LLM applications.
The code used to train and run inference with the ColPali architecture.
Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
Perform forward and backward citation chasing as part of an evidence synthesis project
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
A minimalist yet highly performant, lightweight, lightning fast, multisource, multimodal and local embedding solution, built in Rust.
Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & S...
RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing hig...
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. W...
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases (NeurIPS D&B 2024)
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖
A list of software that allows searching the web with the assistance of AI.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. W...
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
AdalFlow: The library to build & auto-optimize LLM applications.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a c...
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
The code used to train and run inference with the ColPali architecture.
Chatbot for documentation that lets you chat with your data. Privately deployable, it provides AI knowledge sharing and integrates knowledge into your AI workflow
Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal vector embeddings.
A list of software that allows searching the web with the assistance of AI.
The code used to train and run inference with the ColPali architecture.
AdalFlow: The library to build & auto-optimize LLM applications.
Coeus 🌐 is an OSINT ToolBox empowering users with tools for effective intelligence gathering from open sources. From social media monitoring 📱 to data analysis 📊, it offers a centralized platform f...
Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
A minimalist yet highly performant, lightweight, lightning fast, multisource, multimodal and local embedding solution, built in Rust.
A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal vector embeddings.
Code, datasets, and checkpoints for the paper "Improving Passage Retrieval with Zero-Shot Question Generation (EMNLP 2022)"
The idea is to calculate the similarity between the resume and the job description and then return the resumes with the highest similarity score.
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Open-source ecosystem for building AI-powered conversational solutions using RAG, agents, FSMs, and LLMs.
Perform forward and backward citation chasing as part of an evidence synthesis project
A list of software that allows searching the web with the assistance of AI.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. W...
AdalFlow: The library to build & auto-optimize LLM applications.
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a c...
A list of software that allows searching the web with the assistance of AI.
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
The code used to train and run inference with the ColPali architecture.
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
Chatbot for documentation that lets you chat with your data. Privately deployable, it provides AI knowledge sharing and integrates knowledge into your AI workflow
Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖
Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Track any IP address with IP-Tracer. IP-Tracer is developed for Linux and Termux. You can retrieve information about any IP address using IP-Tracer.
A list of software that allows searching the web with the assistance of AI.
Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
All-in-One: Text Embedding, Retrieval, Reranking and RAG in Transformers
The code used to train and run inference with the ColPali architecture.
AdalFlow: The library to build & auto-optimize LLM applications.
Coeus 🌐 is an OSINT ToolBox empowering users with tools for effective intelligence gathering from open sources. From social media monitoring 📱 to data analysis 📊, it offers a centralized platform f...
A minimalist yet highly performant, lightweight, lightning fast, multisource, multimodal and local embedding solution, built in Rust.
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal vector embeddings.
Open-source ecosystem for building AI-powered conversational solutions using RAG, agents, FSMs, and LLMs.
Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & S...
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
🔎 SimilaritySearchKit is a Swift package providing on-device text embeddings and semantic search functionality for iOS and macOS applications.
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases (NeurIPS D&B 2024)
The code used to train and run inference with the ColPali architecture.
AgentSearch is a framework for powering search agents and enabling customizable local search.
A list of software that allows searching the web with the assistance of AI.
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases (NeurIPS D&B 2024)
A minimalist yet highly performant, lightweight, lightning fast, multisource, multimodal and local embedding solution, built in Rust.
This is the repository for our paper "INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning"
This is the repository for the generative information retrieval survey.
Coeus 🌐 is an OSINT ToolBox empowering users with tools for effective intelligence gathering from open sources. From social media monitoring 📱 to data analysis 📊, it offers a centralized platform f...
This is the repo for the survey of Bias and Fairness in IR with LLMs.
All-in-One: Text Embedding, Retrieval, Reranking and RAG in Transformers
Gen-AI Chat for Teams - Think ChatGPT if it had access to your team's unique knowledge.
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. W...
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of a c...
AdalFlow: The library to build & auto-optimize LLM applications.
Chatbot for documentation that lets you chat with your data. Privately deployable, it provides AI knowledge sharing and integrates knowledge into your AI workflow
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖
The code used to train and run inference with the ColPali architecture.
Unified embedding generation and search engine. Also available on cloud - cloud.marqo.ai
Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & S...
Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
A compute framework for building Search, RAG, Recommendations and Analytics over complex (structured+unstructured) data, with ultra-modal vector embeddings.
Minimalist web-searching platform with an AI assistant that runs directly from your browser. Uses WebLLM, Wllama and SearXNG. Demo: https://felladrin-minisearch.hf.space
STaRK: Benchmarking LLM Retrieval on Textual and Relational Knowledge Bases (NeurIPS D&B 2024)
The RAG Experiment Accelerator is a versatile tool designed to expedite and facilitate experiments and evaluations using Azure Cognitive Search and the RAG pattern.
Fast lexical search implementing BM25 in Python using NumPy, Numba and SciPy
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
A list of software that allows searching the web with the assistance of AI.
🍁 Sycamore is an LLM-powered search and analytics platform for unstructured data.
Up to 200x Faster Dot Products & Similarity Metrics — for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for both AVX2, AVX-512, NEON, SVE, & S...
Open-source ecosystem for building AI-powered conversational solutions using RAG, agents, FSMs, and LLMs.
Efficient Retrieval Augmentation and Generation Framework
Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
A large-scale multilingual dataset for information retrieval, with thorough human annotations across 18 diverse languages.
Up to 10x faster strings for C, C++, Python, Rust, and Swift, leveraging NEON, AVX2, AVX-512, and SWAR to accelerate search, sort, edit distances, alignment scores, etc 🦖