Trending repositories for topic bioinformatics
Official git repository for Biopython (originally converted from CVS)
Foldseek enables fast and sensitive comparisons of large structure sets.
[ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
Aggregate results from bioinformatics analyses across many samples into a single report.
Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc.
A curated list of awesome Bioinformatics libraries and software.
Blazing-Fast Bioinformatic Operations on Python DataFrames
CodonTransformer: The ultimate tool for codon optimization, optimizing DNA sequences for heterologous protein expression across 164 species.
PhysiCell: Scientist end users should use latest release! Developers please fork the development branch and submit PRs to the dev branch. Thanks!
TOGA (Tool to infer Orthologs from Genome Alignments): implements a novel paradigm to infer orthologous genes. TOGA integrates gene annotation, inferring orthologs and classifying genes as intact or l...
Benchmarking programming languages/implementations for common tasks in Bioinformatics
Rob's manual for the computational genomics and bioinformatics class.
Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtools, ...
Scripts to download genomes from the NCBI FTP servers
Therapeutics Commons (TDC): Multimodal Foundation for Therapeutic Science
Burrows-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
A full spaCy pipeline and models for scientific/biomedical documents.
Explore a comprehensive collection of basic theories, applications, papers, and best practices about Large Language Models (LLMs) in genomes.
Local version of the virus identification and analysis web server (tool set)
📛 A Python package for using ontologies, terminologies, and biomedical nomenclatures
A Snakemake workflow and MrBiomics module for performing genomic region set and gene set enrichment analyses using LOLA, GREAT, GSEApy, pycisTarget and RcisTarget.
Rust crates for working with Workflow Description Language (WDL) documents.
Open-ST: profile and analyze tissue transcriptomes in 3D with high resolution in your lab
PDB2PQR - determining titration states, adding missing atoms, and assigning charges/radii to biomolecules.
A deep learning model (EVO2-500M) for predicting the host specificity of eukaryote-infecting viruses from cDNA sequences
Empower Large Language Models (LLM) using Knowledge Graph based Retrieval-Augmented Generation (KG-RAG) for knowledge intensive tasks
A comprehensive library for computational molecular biology
A bioinformatics workflow engine built on top of the Workflow Description Language (WDL).
Protein structure generation with sparse all-atom denoising models
A Rust-based, headless workflow execution framework supporting local, cloud, and HPC.
Scikit-learn compatible library for molecular fingerprints
What should perfect bioinformatic tools be like?
Quickly filter FASTQ files by matching sequences to a set of regex patterns
Affinity Protein-Protein Transformers—State of the art protein-protein binding affinity in seconds!
A list of awesome awesomeness related to bioinformatics and associated fields
Fast AlphaFold-Multimer based pipeline for Protein-Protein Interaction (PPI) screening
A Python library for multi-omics analysis, including bulk, single-cell, and spatial RNA-seq.
Circular visualization in Python (Circos Plot, Chord Diagram, Radar Chart)
multiPrime is a mismatch-tolerant minimal primer set design tool for large and diverse sequences (e.g. Virus). Here is a web-based version (test: http://multiPrime.cn)
Declarative creation of composable visualizations for Python (complex heatmap, UpSet plot, OncoPrint, and more)
MrBiomics: Modules & Recipes augment Bioinformatics for Multi-Omics Analyses
Multiple Protein Structure Alignment at Scale with FoldMason
FastOMA is a scalable software package to infer orthology relationships.
Python library for array programming on biological datasets. Documentation available at: https://bionumpy.github.io/bionumpy/