Trending repositories for topic bioinformatics
Quickly filter FASTQ files by matching sequences to a set of regex patterns
Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc.
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
multiPrime is a mismatch-tolerant minimal primer set design tool for large and diverse sequences (e.g. viruses). A web-based test version is available (http://multiPrime.cn).
Unix, R, and Python tools for genomics and data science
Empower Large Language Models (LLMs) using Knowledge Graph-based Retrieval-Augmented Generation (KG-RAG) for knowledge-intensive tasks
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
Download sequencing data and metadata from GSA, SRA, ENA, and DDBJ databases.
Rapids_singlecell: A GPU-accelerated tool for scRNA analysis. Offers seamless scverse compatibility for efficient single-cell data processing and analysis.
A Python library for multi-omics analysis, including bulk, single-cell, and spatial RNA-seq.
Foldseek enables fast and sensitive comparisons of large structure sets.
An accurate and sensitive bacterial plasmid identification tool based on deep machine-learning of shared k-mers and genomic features.
Democratizing ML-powered DNA analysis through efficient on-device computation and interpretive tools.
A genome completeness evaluation tool based on miniprot
Declarative creation of composable visualizations for Python (complex heatmaps, UpSet plots, OncoPrints, and more)
Protein-protein, protein-peptide and protein-DNA docking framework based on the GSO algorithm
Kun-peng: an ultra-fast, low-memory footprint and accurate taxonomy classifier for all
Official git repository for Biopython (originally converted from CVS)
A curated list of awesome Bioinformatics libraries and software.
FastOMA is a scalable software package for inferring orthology relationships.
Local version of the virus identification and analysis web server (tool set)
GraffiTE is a pipeline that finds polymorphic transposable elements in genome assemblies and/or long reads, and genotypes the discovered polymorphisms in read sets using genome-graphs.
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
Python library to facilitate genome assembly, annotation, and comparative genomics
Rust crates for working with Workflow Description Language (WDL) documents.
Feature-rich Python implementation of the tximport package for gene count estimation.
Awesome-Biomolecule-Language-Cross-Modeling: a curated list of resources for the paper "Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey"
CodonTransformer: The ultimate tool for codon optimization, optimizing DNA sequences for heterologous protein expression across 164 species.
A curated list of awesome curated lists of software and resources in bioinformatics and affiliated areas
A bioinformatics tool written in Rust to find palindromic sequences in DNA
Fast AlphaFold-Multimer based pipeline for Protein-Protein Interaction (PPI) screening
k-mer-based feature extraction tool for bioinformatics, metagenomics, AI/ML, and more
Circular visualization in Python (Circos Plot, Chord Diagram, Radar Chart)
A cross-platform and ultrafast toolkit for FASTA/Q file manipulation
Multiple Protein Structure Alignment at Scale with FoldMason
What should perfect bioinformatic tools be like?
A bioinformatics workflow engine built on top of the Workflow Description Language (WDL).
Open-ST: profile and analyze tissue transcriptomes in 3D with high resolution in your lab
Fiora is an in silico fragmentation algorithm for small compounds that produces simulated tandem mass spectra (MS/MS). The framework employs a graph neural network to predict bond cleavages and fragme...