Trending repositories for topic parallel-computing
Lightweight, general, scalable C++ library for finite element methods
A bleeding-edge, lock-free, wait-free, continuation-stealing tasking library built on C++20's coroutines
Xiao's CUDA Optimization Guide [Actively Adding New Content]
Symbolic programming for the next generation of numerical software
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
PyTorch Implementation of Implicit Quantile Networks (IQN) for Distributional Reinforcement Learning with additional extensions like PER, Noisy layer, N-step bootstrapping, Dueling architecture and p...
Parallel algorithms and data structures for tree-based adaptive mesh refinement (AMR) with arbitrary element shapes.
🚀 R package: future: Unified Parallel and Distributed Processing in R for Everyone
A General-purpose Task-parallel Programming System using Modern C++
A tool for running Android and iOS Appium tests in parallel across devices. If you like it, star it!
https://wavespeed.ai/ Context parallel attention that accelerates DiT model inference with dynamic caching
Lightweight fast function pipeline (DAG) creation in pure Python for scientific workflows 🕸️🧪
Parallel, highly efficient code (CPU and GPU) for DEM and CFD-DEM simulations.
A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures.
Must-read research papers and links to tools and datasets related to using machine learning for compilers and systems optimisation
A package which efficiently applies any function to a pandas dataframe or series in the fastest available manner
Fluid-particle coupling for multiphase flow based on PhasicFlow and OpenFOAM
Standard, high-dimensional, parallel, constrained, and multiobjective Bayesian optimization algorithms
The Gas Dynamics Toolkit (GDTk) is a set of software tools for simulating high-speed fluid flow, maintained at The University of Queensland and the University of Southern Queensland, Australia.
Fierro is a C++ code designed to aid the research and development of numerical methods, testing of user-specified models, and creating multi-scale models related to quasi-static solid mechanics and co...
Real-time, cycle-exact emulation of the C64 using multiple microcontrollers in parallel.
Transcode source media directly from DaVinci Resolve using multiple machines for encoding. Great for creating proxies quickly.
Build applications, scripts, and automations powered by high-performance multicore computing using Node.js
miniRT is the final C project of the 42 Common Core: our very first ray-tracer. Our miniRT focused on optimising CPU-rendered graphics, to achieve a real-time renderer with movement controls and extra...
University of Toronto / ECE1782 - Programming Massively Parallel Multiprocessors and Heterogeneous Systems / Project: an optimized CUDA Implementation of AES 128-bit Encryption, support any file types...
An easy-to-use and fast library for task-based parallelism, utilizing coroutines.
Evolutionary algorithm toolbox and framework with high performance for Python
Digital Image Correlation & Digital Volume Correlation Library
Create and control multiple Julia processes remotely for distributed computing. Ships as a Julia stdlib.
Slides, exercises and resources for the 2023-2024 course "High Performance Computing" under the "Scientific and Data-Intensive Computing" Master Program at University of Trieste
Light and self-contained implementation of C++17 parallel algorithms.
Reactive Network of Operators In Rust. Framework for parallel and distributed computation inspired by the dataflow model