Trending repositories for topic high-performance-computing
The fastest and most memory-efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
Open Source Platform for developing, scaling and deploying serious ML, AI, and data science systems
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
Lightweight, general, scalable C++ library for finite element methods
Implementation of SYCL and C++ standard parallelism for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programming models. Lets applications ada...
A list of awesome compiler projects and papers for tensor computation and deep learning.
FLEXI: A high-order discontinuous Galerkin framework for hyperbolic–parabolic conservation laws
SPHinXsys provides C++ APIs for engineering simulation and optimization. It aims at complex systems driven by fluid, structure, multi-body dynamics and beyond. The multi-physics library is based on a ...
A series of GPU optimization topics that shows in detail how to optimize CUDA kernels. It introduces several basic kernel optimizations, including: elementwise, reduce, sg...
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
Training and serving large-scale neural networks with auto parallelization.
A General-purpose Task-parallel Programming System using Modern C++
High-performance TensorFlow library for quantitative finance.
DASH, the C++ Template Library for Distributed Data Structures with Support for Hierarchical Locality for HPC and Data-Driven Science
This is the official GitHub mirror repository of FrontISTR, an Open-Source Large-Scale Parallel FEM Program for Nonlinear Structural Analysis. Active development of FrontISTR is hosted on https://gitl...
A Taichi-powered high-performance numerical simulator for multiscale and multifield geophysical problems
Feel++: Finite Element Embedded Language and Library in C++
Mt-KaHyPar (Multi-Threaded Karlsruhe Hypergraph Partitioner) is a shared-memory multilevel graph and hypergraph partitioner equipped with parallel implementations of techniques used in the best sequen...
A fast Poisson image editing implementation that can use multi-core CPUs or GPUs to handle high-resolution image input.
Parallel algorithms and data structures for tree-based adaptive mesh refinement (AMR) with arbitrary element shapes.
SaunaFS is a free and open-source, distributed POSIX file system inspired by Google File System.
TwinGraph is a Python framework for distributed container orchestration using Kubernetes clusters, Docker Compose/Swarm or cloud resources on AWS (AWS Lambda, AWS Batch, Amazon EKS). Applications incl...
High performance algorithms in C#: SIMD/SSE, multi-core and faster
A small OpenCL benchmark program to measure peak GPU/CPU performance.
A code for fast, massively-parallel direct numerical simulations (DNS) of canonical flows
Open source digital rocks software platform for micro-CT, CT, thin sections and borehole image analysis. Includes tools for: annotation, AI, HPC, porous media flow simulation, porosity analysis, perme...
A modern, fast, lightweight thread pool library based on C++20
GAL-DAWN: A Novel High-Performance Computing Library of Graph Algorithms Based on DAWN, CUDA/C++
crew launcher plugins for traditional high-performance computing clusters
An ongoing, curated collection of awesome software best practices and techniques, libraries and frameworks, e-books and videos, websites, blog posts, links to GitHub repositories, technical guideline...
Supercomputing @ GT has compiled a list of organizations that offer internships and experiences in HPC and applications of HPC.
CP3d is a comprehensive Euler-Lagrange solver for the direct numerical simulations of particle-laden flows.
High-performance and differentiation-enabled nonlinear solvers (Newton methods), bracketed rootfinding (bisection, Falsi), with sparsity and Newton-Krylov support.
A curated list of awesome projects and papers for distributed training or inference