Trending repositories for topic bioinformatics
multiPrime is a mismatch-tolerant minimal primer set design tool for large and diverse sequences (e.g., viruses). A web-based version is available (test: http://multiPrime.cn).
A simple toolset for BED files (warning: CLI may change before bedtk becomes stable)
Official git repository for Biopython (originally converted from CVS)
Metabuli: specific and sensitive metagenomic classification via joint analysis of DNA and amino acid sequences.
GraffiTE is a pipeline that finds polymorphic transposable elements in genome assemblies and/or long reads, and genotypes the discovered polymorphisms in read sets using genome-graphs.
A Python library for multi-omics analysis, including bulk and single-cell RNA-seq.
GTDB-Tk: a toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes.
Awesome-Biomolecule-Language-Cross-Modeling: a curated list of resources for the paper "Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey"
P2Rank: Protein-ligand binding site prediction tool based on machine learning. Stand-alone command line program / Java library for predicting ligand binding pockets from protein structure.
Scripts to download genomes from the NCBI FTP servers
A cross-platform and ultrafast toolkit for FASTA/Q file manipulation
A simple Python package for phylogenetic tree visualization and analysis
Detecting methylation using signal-level features from Nanopore sequencing reads of plants
A Julia package to read, write and manipulate macromolecular structures
Clone with Python! Data structures for double-stranded DNA and simulation of homologous recombination, Gibson assembly, and cut-and-paste cloning.
A C/C++ library for fast interval overlap queries (with a "bedtools coverage" example)
Empower Large Language Models (LLM) using Knowledge Graph based Retrieval-Augmented Generation (KG-RAG) for knowledge intensive tasks
A curated list of awesome Bioinformatics libraries and software.
Foldseek enables fast and sensitive comparisons of large structure sets.
An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)
Circular visualization in Python (Circos Plot, Chord Diagram, Radar Chart)
Burrows-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)
A lightweight platform-accelerated library for biological motif scanning using position weight matrices.
Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc.
DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.
MMseqs2: ultra fast and sensitive search and clustering suite
RawHash is the first mechanism that can accurately and efficiently map raw nanopore signals to large reference genomes (e.g., a human reference genome) in real-time without using powerful computation...
Exon is an OLAP query engine specifically for biology and life science applications.
Open-ST: profile and analyze tissue transcriptomes in 3D with high resolution in your lab
A UNIX shell toolkit for processing and analyzing multiple sequence alignments and phylogenies
ClairS - a deep-learning method for long-read somatic small variant calling
DeepSomatic is an analysis pipeline that uses a deep neural network to call somatic variants from tumor-normal sequencing data.
BioT5: Enriching Cross-modal Integration in Biology with Chemical Knowledge and Natural Language Associations (EMNLP 2023)
ggvolc effortlessly translates differential expression datasets and RNAseq data into informative volcano plots. Highlight genes of interest with unprecedented ease. With just a single line of co...
A Quantum Computing and Machine Learning Model that accelerates the Drug Research and Development process
Searching for structural similarities across billions of molecules in milliseconds
ClairS-TO - a deep-learning method for tumor-only somatic variant calling
A full spaCy pipeline and models for scientific/biomedical documents.
Unix, R, and Python tools for genomics and data science
Cell2Sentence turns scRNA-seq data into text for LLM training.
Rapids_singlecell: A GPU-accelerated tool for scRNA analysis. Offers seamless scverse compatibility for efficient single-cell data processing and analysis.
What should perfect bioinformatic tools be like?
Python tool for alignment of spatial transcriptomics (ST) data using diffeomorphic metric mapping
Detection of remote homology by comparison of protein language model representations
🌿: GIS for philological, archaeological, and anthropological data.
A genome completeness evaluation tool based on miniprot
Read specialized NGS formats as data frames in R, Python, and more.