Trending repositories for topic high-performance-computing
Main repository for QMCPACK, an open-source production level many-body ab initio Quantum Monte Carlo code for computing the electronic structure of atoms, molecules, and solids with full performance p...
Lightweight, general, scalable C++ library for finite element methods
BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library
High-performance TensorFlow library for quantitative finance.
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, sg...
Training and serving large-scale neural networks with auto parallelization.
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
:ocean: Framework for studying fluid dynamics with numerical simulations using Python (publish-only mirror). The main repo is hosted on https://foss.heptapod.net (GitLab fork supporting Mercurial).
Compiler for multiple programming models (SYCL, C++ standard parallelism, HIP/CUDA) for CPUs and GPUs from all vendors: The independent, community-driven compiler for C++-based heterogeneous programmi...
Acceleration package for neural networks on multi-core CPUs
(WIP) A small but powerful, homemade PyTorch from scratch.
A General-purpose Task-parallel Programming System using Modern C++
A list of awesome compiler projects and papers for tensor computation and deep learning.
IPPL is a C++ library to develop performance portable code for fully Eulerian, Lagrangian or hybrid Eulerian-Lagrangian methods.
A fast Poisson image editing implementation that can use a multi-core CPU or GPU to handle high-resolution image input.
A scalable reinforcement learning framework for CFD on HPC systems
Supercomputing @ GT has compiled a list of organizations that offer internships and experiences in HPC and applications of HPC.
A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures.
Notes and tutorials on Density Functional Theory calculation using Quantum Espresso.
This repo is a curated library to help you achieve a deeper understanding of what drives success and continuous improvement. Dive in, and discover content that can expand your thinking, sharpen your e...
A curated list of awesome projects and papers for distributed training or inference
Large-scale Auto-Distributed Training/Inference Unified Framework | Memory-Compute-Control Decoupled Architecture | Multi-language SDK & Heterogeneous Hardware Support
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
High performance data processing employs high performance computing (HPC) to process data, which is then translated into information and knowledge. The advent of high-performance computing and data an...
CP3d is a comprehensive Euler-Lagrange solver for the direct numerical simulations of particle-laden flows.
A Taichi-powered high-performance numerical simulator for multiscale and multifield geophysical problems
Open source digital rocks software platform for micro-CT, CT, thin sections and borehole image analysis. Includes tools for: annotation, AI, HPC, porous media flow simulation, porosity analysis, perme...
A small OpenCL benchmark program to measure peak GPU/CPU performance.
ECS-based software for steady-state power flow calculation in power systems, written in Rust.
Parallel algorithms and data structures for tree-based adaptive mesh refinement (AMR) with arbitrary element shapes.
High-performance and differentiation-enabled nonlinear solvers (Newton methods), bracketed rootfinding (bisection, Falsi), with sparsity and Newton-Krylov support.
A fast, ergonomic and portable tensor library in Nim with a deep learning focus for CPU, GPU and embedded devices via OpenMP, CUDA and OpenCL backends
A code for fast, massively-parallel simulations of two-phase flows with heat transfer
GAL-DAWN: A novel high-performance computing library of graph algorithms based on DAWN, CUDA/C++
SaunaFS is a free and open-source distributed POSIX file system inspired by Google File System.
crew launcher plugins for traditional high-performance computing clusters