This project implements a simplified Smoothed Particle Hydrodynamics (SPH) simulation using CUDA for parallel computation and raylib for real-time visualization. It uses a uniform grid to speed up neighbor searches, allowing the simulation to scale to thousands of particles.
The simulation showcases fluid-like behavior approximating water. It contains 15,000 particles and runs at a consistent 60 FPS (with a theoretical maximum of 90 FPS) on an RTX 3080. It also features mouse interaction: the user can push or swirl the fluid by clicking and dragging. The colors encode particle velocity, with white being the fastest and blue the slowest.
This repository is separated into two parts:
- The implementation of the SPH simulation using CUDA and C++ for high performance.
- A Jupyter notebook that explains the SPH algorithm and its implementation in detail.
SPH-CUDA
├── docs
├── include
│ ├── config.h
│ ├── grid.h
│ ├── kernels.h
│ ├── particle.h
│ └── renderer.h
├── python
│ ├── particle-based_fluid.ipynb
│ └── req.txt
├── src
│ ├── kernels.cu
│ ├── main.cu
│ └── renderer.cu
├── .gitattributes
├── .gitignore
├── CMakeLists.txt
└── README.md
- docs: Contains images and other documentation files.
- include: Holds header files for various parts of the simulation.
- python: Contains a Jupyter notebook that explains the SPH algorithm and its implementation in detail.
- src: Contains the CUDA and C++ source files.
- .gitignore: Specifies which files and folders Git should ignore.
- .gitattributes: Configures Git attributes for the repository.
- CMakeLists.txt: Build configuration for CMake.
- README.md: This documentation.
This section explains how to set up the CUDA simulation as well as the jupyter notebook.
The project contains a CMakeLists.txt file and is designed to be built with CMake, which will handle the build configuration. The setup targets Windows, but it should work on Linux and macOS with minor modifications. You will need:
- A CUDA-capable GPU with the NVIDIA driver installed
- NVIDIA CUDA Toolkit (tested with 12.8)
- CMake (3.30 or higher)
- Build tools (e.g., Visual Studio, GCC, etc.)
To build and run the simulation:

- Clone the repository and navigate into it.
- Create a build directory and run CMake:

  ```
  mkdir build && cd build
  cmake ..
  ```

- Build the project:

  ```
  cmake --build .
  ```

- Run the generated executable:

  ```
  .\Debug\SPH_CUDA.exe
  ```
For the Jupyter notebook:

- Go into the `python` directory:

  ```
  cd python
  ```

- Create a virtual environment (optional but recommended):

  ```
  python -m venv venv
  ```

- Activate the virtual environment:
  - On Windows:

    ```
    venv\Scripts\activate
    ```

  - On Linux/macOS:

    ```
    source venv/bin/activate
    ```

- Install the required packages:

  ```
  pip install -r req.txt
  ```

- Launch Jupyter Notebook:

  ```
  jupyter notebook
  ```

- Open the `particle-based_fluid.ipynb` notebook to explore the SPH algorithm and its implementation.
This section describes the main components of the CUDA simulation and how they interact.
`config.h` defines all the simulation parameters, such as the domain size, number of particles, and time step. Keeping these constants in a single file makes them easy to manage.
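As a rough illustration only, such a header might look like the sketch below; apart from the particle count and the `K`/`RHO0` parameters mentioned later in this README, the names and values are assumptions, not the repository's actual constants.

```cpp
// Illustrative config.h-style sketch -- names and values are assumptions.
#pragma once

constexpr int   N      = 15000;    // number of particles
constexpr float WIDTH  = 1280.0f;  // domain / window width (assumed)
constexpr float HEIGHT = 720.0f;   // domain / window height (assumed)
constexpr float H      = 16.0f;    // smoothing radius, also the grid cell size (assumed)
constexpr float DT     = 0.0016f;  // time step (assumed)
constexpr float K      = 250.0f;   // pressure stiffness (value assumed)
constexpr float RHO0   = 1.0f;     // rest density (value assumed)
```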
`grid.h` contains inline helper functions for converting particle positions to grid coordinates, retrieving neighboring cell indices, and more. These functions are used by the CUDA kernels to organize particles in a grid for faster neighbor searches.
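For example, a position-to-cell helper could look roughly like this; the function name and the flat, row-major cell layout are assumptions, not the repository's exact API.

```cpp
// Illustrative grid.h-style helper: map a particle position to a flat cell index,
// assuming square cells of size cellSize and a grid of gridW x gridH cells.
__host__ __device__ inline int cellIndexFromPosition(float2 p, float cellSize,
                                                     int gridW, int gridH) {
    int cx = (int)(p.x / cellSize);
    int cy = (int)(p.y / cellSize);
    // Clamp so particles on the domain boundary still map to a valid cell.
    if (cx < 0) cx = 0; else if (cx >= gridW) cx = gridW - 1;
    if (cy < 0) cy = 0; else if (cy >= gridH) cy = gridH - 1;
    return cy * gridW + cx;  // row-major flat index
}
```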
`kernels.h` declares all CUDA kernels for:
- Updating grid structures
- Computing particle density, pressure, and forces
- Integrating particle positions
- Handling mouse interactions and boundaries
`particle.h` defines the Particle struct (a sketch follows this list). Each particle stores:
- Position
- Old position (for stable integration)
- Velocity
- Force
- Mass
- Density
- Pressure
Also contains inline vector math utilities (lengthF2, subtractF2) used by both host code and CUDA kernels.
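A sketch of what that header might contain, assuming a 2D `float2` layout; the exact declarations in the repository may differ.

```cpp
// Illustrative particle.h-style sketch
#pragma once
#include <cuda_runtime.h>  // float2, make_float2
#include <cmath>           // sqrtf on the host

struct Particle {
    float2 position;
    float2 oldPosition;  // saved for stable integration / rollback
    float2 velocity;
    float2 force;
    float  mass;
    float  density;
    float  pressure;
};

// Vector helpers usable from both host code and CUDA kernels.
__host__ __device__ inline float2 subtractF2(float2 a, float2 b) {
    return make_float2(a.x - b.x, a.y - b.y);
}

__host__ __device__ inline float lengthF2(float2 v) {
    return sqrtf(v.x * v.x + v.y * v.y);
}
```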
`renderer.h` declares functions for rendering particles using raylib. This includes:
- Creating textures (e.g., circles for particles)
- Generating a color based on particle velocity (sketched below)
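For illustration, a speed-to-color mapping along those lines might look like this; the function name and the `maxSpeed` parameter are assumptions.

```cpp
// Illustrative renderer helper: slow particles render blue, fast ones white.
#include "raylib.h"

Color colorFromSpeed(float speed, float maxSpeed) {
    float t = speed / maxSpeed;
    if (t > 1.0f) t = 1.0f;                         // clamp to [0, 1]
    unsigned char c = (unsigned char)(t * 255.0f);  // blend from blue toward white
    Color col = { c, c, 255, 255 };                 // {r, g, b, a}
    return col;
}
```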
`main.cu` is the entry point of the simulation. It:
- Initializes the window (via raylib)
- Allocates memory for particles and grid data
- Randomizes initial particle positions
- Runs the main loop:
- Updates physics by launching CUDA kernels
- Renders the particles with raylib
- Cleans up and closes the window on exit
`kernels.cu` implements the CUDA kernels declared in `kernels.h`. These kernels run in parallel on the GPU to handle:
- Grid updates and neighbor searches
- Density and pressure calculations
- Force accumulation (pressure, viscosity, mouse interaction, gravity)
- Integration of particle motion
- Boundary checks
`renderer.cu` implements the rendering functions declared in `renderer.h`. These functions create textures and color gradients for particles and are called from the main loop to draw particles each frame.
- Initialization:
  - `main.cu` sets up the window and allocates memory.
  - Particles are randomly distributed in the simulation space.
- Physics:
  - `main.cu` launches kernels from `kernels.cu` to update the grid, compute density/pressure, and integrate positions.
  - `grid.h` helper functions are used in the kernels to find neighboring cells efficiently.
- Rendering:
  - After the physics update, `renderer.cu` functions are called to draw each particle with a color based on its velocity.
- Interaction:
  - Mouse input is processed by specific kernels in `kernels.cu` (push or swirl forces).
  - Boundary conditions prevent particles from leaving the domain.
CUDA (Compute Unified Device Architecture) is NVIDIA’s parallel computing platform and programming model for GPUs. Below is a brief overview of CUDA concepts relevant to this project.
In CUDA, the CPU and its memory space are referred to as the host, while the GPU and its memory space are called the device. The host manages memory and launches kernels on the device. In this repository, unified memory is allocated once and used by both the host and device without explicit copying.
```cpp
// main.cu
Particle *particles = nullptr;
cudaMallocManaged(&particles, N * sizeof(Particle));
```

This single call makes `particles` accessible from both the host and device, simplifying memory management.
CUDA kernels are functions that run on the GPU. Each kernel is executed by multiple threads in parallel. Threads are organized into blocks, and blocks are organized into a grid. This hierarchy allows for efficient parallel computation. Kernels are defined with the __global__ keyword and can be launched from the host using the <<<blocks, threads>>> syntax.
Defining a kernel looks like this:
```cpp
// kernels.cu
__global__ void computePressure(Particle *particles, int N, float K, float RHO0) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) {
        particles[i].pressure = K * (particles[i].density - RHO0);
    }
}
```

Calling this kernel from the host looks like this:
```cpp
// main.cu
int threadsPerBlock = 256;
int blocks = (N + threadsPerBlock - 1) / threadsPerBlock;
// ...
computePressure<<<blocks, threadsPerBlock>>>(particles, N, K, RHO0);
```

CUDA offers Unified Memory, so the runtime migrates data automatically between host and device memory. In this project, after launching a kernel, we synchronize to ensure all threads have completed before proceeding. This is done using cudaDeviceSynchronize().
```cpp
// main.cu
computePressure<<<blocks, threadsPerBlock>>>(particles, N, K, RHO0);
cudaDeviceSynchronize();
```

If not using unified memory, explicit memory management is required. This involves allocating device memory with cudaMalloc(), copying data from host to device with cudaMemcpy(), and freeing device memory with cudaFree(). cudaFree() is called at the end of the program to clean up device memory.
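For reference, the explicit pattern would look roughly like this; the `h_particles`/`d_particles` names are illustrative, and this project does not actually take this path because it uses unified memory.

```cpp
// Illustrative explicit-memory alternative to cudaMallocManaged()
Particle *h_particles = (Particle *)malloc(N * sizeof(Particle));  // host buffer
Particle *d_particles = nullptr;
cudaMalloc(&d_particles, N * sizeof(Particle));                    // device buffer

// Upload, compute on the GPU, then download the results.
cudaMemcpy(d_particles, h_particles, N * sizeof(Particle), cudaMemcpyHostToDevice);
computePressure<<<blocks, threadsPerBlock>>>(d_particles, N, K, RHO0);
cudaMemcpy(h_particles, d_particles, N * sizeof(Particle), cudaMemcpyDeviceToHost);

cudaFree(d_particles);  // release device memory when done
free(h_particles);
```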
To compile device code separately and link automatically, we enable separable compilation in the CMakeLists.txt file. This allows each .cu file to be compiled into object code and linked into the final executable. This is done by adding the following lines to the CMakeLists.txt file:
```cmake
# CMakeLists.txt
set_target_properties(SPH_CUDA PROPERTIES
    CUDA_SEPARABLE_COMPILATION ON
)
```

Below is an overview of each CUDA kernel in the SPH simulation, along with its role in the overall workflow. The kernels are all defined in kernels.cu and are launched from the main loop in main.cu.
Assigns particles to cells in the uniform grid. For each particle, its cell index is computed from its position, the cell’s counter is atomically incremented, and the particle’s index is stored if the cell is not full.
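A hedged sketch of that pattern is shown below; the fixed-capacity cell layout and all names are assumptions about how such a kernel could be written, not the exact code in `kernels.cu`.

```cpp
// Illustrative grid-assignment kernel: one thread per particle.
// cellCounts[c] holds the number of particles in cell c; cellParticles stores up to
// maxPerCell particle indices per cell in a flat array.
__global__ void assignToGrid(const Particle *particles, int N,
                             int *cellCounts, int *cellParticles,
                             float cellSize, int gridW, int gridH, int maxPerCell) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) {
        int cx = (int)(particles[i].position.x / cellSize);
        int cy = (int)(particles[i].position.y / cellSize);
        if (cx < 0) cx = 0; else if (cx >= gridW) cx = gridW - 1;
        if (cy < 0) cy = 0; else if (cy >= gridH) cy = gridH - 1;
        int cell = cy * gridW + cx;

        // Atomically reserve a slot; skip the particle if the cell is already full.
        int slot = atomicAdd(&cellCounts[cell], 1);
        if (slot < maxPerCell) {
            cellParticles[cell * maxPerCell + slot] = i;
        }
    }
}
```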
Calculates each particle’s density by examining neighbors in the 3×3 block of adjacent cells. The Poly6 smoothing kernel is used to accumulate the density contributions of particles within the smoothing radius.
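The density estimate follows the standard SPH form from Müller et al. (shown here with that paper's 3D normalization; a 2D implementation uses a different prefactor, $\frac{4}{\pi h^8}$):

$$\rho_i = \sum_j m_j \, W_{\text{poly6}}(\lVert \mathbf{r}_i - \mathbf{r}_j \rVert, h), \qquad W_{\text{poly6}}(r, h) = \frac{315}{64 \pi h^9} \, (h^2 - r^2)^3 \quad \text{for } 0 \le r \le h$$

and zero outside the smoothing radius $h$.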
Converts the computed densities into pressures using the equation $p = K \cdot (\rho - \rho_0)$, where $K$ is the pressure stiffness and $\rho_0$ the rest density (the K and RHO0 parameters passed to the kernel).
Determines pressure forces by again searching the 3×3 neighborhood. The Spiky-gradient kernel gives both magnitude and direction of pressure forces between neighboring particles.
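In the symmetrized form from Müller et al. (again with the paper's 3D normalization; the constant differs in 2D), the pressure force on particle $i$ is:

$$\mathbf{f}_i^{\text{pressure}} = -\sum_j m_j \, \frac{p_i + p_j}{2 \rho_j} \, \nabla W_{\text{spiky}}(\mathbf{r}_i - \mathbf{r}_j, h), \qquad \nabla W_{\text{spiky}}(\mathbf{r}, h) = -\frac{45}{\pi h^6} \, (h - \lVert \mathbf{r} \rVert)^2 \, \hat{\mathbf{r}}$$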
Computes viscosity forces using the Laplacian viscosity kernel. Velocity differences between neighbors are converted into viscous force contributions.
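The corresponding viscosity force from the same paper (3D normalization) is:

$$\mathbf{f}_i^{\text{viscosity}} = \mu \sum_j m_j \, \frac{\mathbf{v}_j - \mathbf{v}_i}{\rho_j} \, \nabla^2 W_{\text{visc}}(\lVert \mathbf{r}_i - \mathbf{r}_j \rVert, h), \qquad \nabla^2 W_{\text{visc}}(r, h) = \frac{45}{\pi h^6} \, (h - r)$$

where $\mu$ is the viscosity coefficient.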
Adds a gravitational force to each particle in the positive y-direction, proportional to its density.
Stores the current position into oldPosition, enabling more stable integration by allowing a fallback if corrections are needed.
Advances particle positions by a small look-ahead factor of their current velocity, improving integration stability.
Performs Euler integration: updates velocity and position from accumulated forces and density, then resets forces to zero.
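A sketch of what such a step could look like, reusing the illustrative `Particle` layout from earlier; the kernel name and details are assumptions.

```cpp
// Illustrative Euler integration kernel: acceleration = force / density,
// then update velocity and position, then clear the force accumulator.
__global__ void integrate(Particle *particles, int N, float dt) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) {
        Particle &p = particles[i];
        float ax = p.force.x / p.density;
        float ay = p.force.y / p.density;
        p.velocity.x += ax * dt;
        p.velocity.y += ay * dt;
        p.position.x += p.velocity.x * dt;
        p.position.y += p.velocity.y * dt;
        p.force = make_float2(0.0f, 0.0f);  // reset for the next frame
    }
}
```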
Reverts positions back to oldPosition if a stability correction is required.
Reflects particles at domain edges and applies damping to their velocities to prevent them from leaving the simulation area.
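One way such a check could be written (the kernel name, domain parameters, and damping factor are illustrative assumptions):

```cpp
// Illustrative boundary kernel: clamp the position to the domain and reflect
// the corresponding velocity component, scaled by a damping factor < 1.
__global__ void enforceBounds(Particle *particles, int N,
                              float width, float height, float damping) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) {
        Particle &p = particles[i];
        if (p.position.x < 0.0f)   { p.position.x = 0.0f;   p.velocity.x *= -damping; }
        if (p.position.x > width)  { p.position.x = width;  p.velocity.x *= -damping; }
        if (p.position.y < 0.0f)   { p.position.y = 0.0f;   p.velocity.y *= -damping; }
        if (p.position.y > height) { p.position.y = height; p.velocity.y *= -damping; }
    }
}
```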
Applies a repulsive or attractive “push” based on left-click or right-click, affecting particles within a specified radius around the mouse position.
Imposes a swirling (rotational) force around the mouse, computed as a perpendicular vector to the direction from particle to mouse, creating vortex-like motion.
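A minimal sketch of that idea, using the `subtractF2`/`lengthF2` helpers mentioned above; the kernel name and parameters are assumptions.

```cpp
// Illustrative swirl kernel: apply a force perpendicular to the particle->mouse
// direction so that particles inside the radius orbit the cursor.
__global__ void applySwirl(Particle *particles, int N,
                           float2 mouse, float radius, float strength) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < N) {
        float2 toMouse = subtractF2(mouse, particles[i].position);
        float dist = lengthF2(toMouse);
        if (dist < radius && dist > 1e-6f) {
            // Rotate the unit direction by 90 degrees to get a tangential force.
            float2 tangent = make_float2(-toMouse.y / dist, toMouse.x / dist);
            particles[i].force.x += tangent.x * strength;
            particles[i].force.y += tangent.y * strength;
        }
    }
}
```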
This work is mainly based on the following resources:
- *Particle-Based Fluid Simulation for Interactive Applications* by Matthias Müller, David Charypar, and Markus Gross.
- *Smoothed Particle Hydrodynamics Techniques for the Physics Based Simulation of Fluids and Solids* by Dan Koschier, Jan Bender, Barbara Solenthaler, and Matthias Teschner.
- *Particle Simulation using CUDA* by Simon Green.
