SAGESim

SAGESim - Scalable Agent-based GPU-Enabled Simulator

SAGESim is the first scalable, pure-Python, general-purpose agent-based modeling framework that supports both distributed computing and GPU acceleration. Designed for high-performance computing (HPC) environments, SAGESim enables simulations with millions of agents by combining MPI-level parallelism across multiple GPUs with GPU-level parallelism using thousands of threads per device.

Key Features

  • Dual-Level Parallelism: MPI distribution across multiple GPUs + GPU thread parallelism for individual agents
  • Pure Python: Write agent behaviors in Python using CuPy's JIT-compiled GPU kernels
  • Scalable: From laptop GPUs to HPC clusters with thousands of GPUs
  • Network-Based Models: Built-in support for agent networks with automatic neighbor data synchronization
  • Double Buffering: Race condition prevention for concurrent agent interactions
  • Graph Partitioning: Load pre-computed partitions to minimize cross-worker communication
  • Flexible Properties: Support for scalar and nested list properties with automatic padding
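
For instance, a breed registers each property with a default value. A minimal sketch of the scalar and nested-list forms, assuming register_property accepts a list default for nested list properties (only the scalar form appears in the Quick Start below; SketchBreed and contact_history are illustrative names):

from sagesim.breed import Breed

class SketchBreed(Breed):
    def __init__(self):
        super().__init__("SketchBreed")
        # Scalar property: one value per agent
        self.register_property("health", 100)
        # Nested list property: a ragged per-agent list that SAGESim pads
        # automatically; the list-default form here is an assumption
        self.register_property("contact_history", [])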

Requirements

  • Python 3.11+
  • NVIDIA GPU with CUDA drivers or AMD GPU with ROCm 5.7.1+
  • MPI implementation (OpenMPI, MPICH, etc.)

Installation

Depending on your hardware, mpi4py and/or cupy may require platform-specific installation steps. If so, follow your system's recommended instructions to install these dependencies first.

# Install SAGESim
pip install sagesim

# Or install from source
git clone https://github.com/ORNL/sagesim.git
cd sagesim
pip install -e .

Dependencies

  • cupy - GPU array computing
  • mpi4py - MPI bindings for Python
  • networkx - Graph/network handling
  • numpy - CPU array operations
  • awkward - Ragged array support

Quick Start

1. Define a Breed (Agent Type)

from cupyx import jit
from sagesim.breed import Breed

@jit.rawkernel(device="cuda")
def my_step_func(tick, agent_index, globals, agent_ids, breeds, locations, health):
    """Agent behavior: heal by 1 each tick"""
    health[agent_index] = health[agent_index] + 1

class MyBreed(Breed):
    def __init__(self):
        super().__init__("MyBreed")
        self.register_property("health", 100)  # Initial value
        self.register_step_func(my_step_func, __file__, priority=0)
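
Each registered property is passed to the step function as an extra array argument after the six fixed arguments (tick, agent_index, globals, agent_ids, breeds, locations). A minimal sketch with two properties, assuming the extra arguments follow registration order (AgingBreed and aging_step_func are illustrative names):

from cupyx import jit
from sagesim.breed import Breed

@jit.rawkernel(device="cuda")
def aging_step_func(tick, agent_index, globals, agent_ids, breeds, locations, health, age):
    """Heal by 1 each tick until age 80, then decline"""
    age[agent_index] = age[agent_index] + 1
    if age[agent_index] > 80:
        health[agent_index] = health[agent_index] - 1
    else:
        health[agent_index] = health[agent_index] + 1

class AgingBreed(Breed):
    def __init__(self):
        super().__init__("AgingBreed")
        self.register_property("health", 100)  # passed as the 7th kernel argument
        self.register_property("age", 0)       # passed as the 8th kernel argument
        self.register_step_func(aging_step_func, __file__, priority=0)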

2. Define a Model

from sagesim.model import Model
from sagesim.space import NetworkSpace

class MyModel(Model):
    def __init__(self):
        super().__init__(NetworkSpace())
        self._breed = MyBreed()
        self.register_breed(self._breed)

    def create_agent(self, health):
        return self.create_agent_of_breed(self._breed, health=health)

    def connect_agents(self, agent_a, agent_b):
        self.get_space().connect_agents(agent_a, agent_b)

3. Run the Simulation

# Create model and agents
model = MyModel()
for i in range(1000):
    model.create_agent(health=100)

# Connect agents in a network
for i in range(999):
    model.connect_agents(i, i + 1)

# Setup and run
model.setup(use_gpu=True)
model.simulate(ticks=100, sync_workers_every_n_ticks=1)
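
The sync_workers_every_n_ticks argument controls how often workers exchange data over MPI (see Selective Sync in the documentation below). Raising it reduces communication overhead at the cost of agents reading staler remote-neighbor state between syncs; the value of 10 here is illustrative:

# Sync across workers every 10 ticks instead of every tick to cut MPI
# overhead; agents may see slightly stale neighbor data between syncs.
model.simulate(ticks=100, sync_workers_every_n_ticks=10)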

4. Run with MPI (Multiple GPUs)

mpirun -n 4 python my_simulation.py

Run Example: SIR Epidemic Model

git clone https://github.com/ORNL/sagesim.git
cd sagesim/examples/sir
mpirun -n 4 python run.py --num_agents 10000 --percent_init_connections 0.1 --num_nodes 1

Documentation

Comprehensive documentation is available in the docs/ directory:

  • Architecture Overview: System design, MPI distribution, GPU threading
  • Getting Started: Step-by-step guide to building models
  • Double Buffering: Race condition prevention mechanisms
  • Network Partitioning: Loading pre-computed partitions for load balancing
  • Runtime Optimizations: Performance tuning techniques
  • Selective Sync: Reducing MPI overhead
  • Property History: Tracking property changes over time
  • Ordered Neighbors: Ordered neighbor storage for agent networks
  • GPU-CPU Data Flow: Data flow between CPU and GPU

HPC Deployment

SAGESim is designed for HPC clusters. Example SLURM script for ORNL Frontier:

#!/bin/bash
#SBATCH -N 10
#SBATCH -t 00:30:00

num_nodes=10
num_mpi_ranks=$((8 * num_nodes))  # 8 GPUs per node

srun -N${num_nodes} -n${num_mpi_ranks} -c7 \
     --ntasks-per-gpu=1 --gpu-bind=closest \
     python3 -u ./run.py

CuPy JIT Kernel Limitations

When writing step functions, be aware of these cupyx.jit.rawkernel constraints:

  • NaN checks: Use x != x (inequality to self)
  • No dicts/objects: Only primitive types and arrays
  • No *args/**kwargs: Fixed argument lists only
  • No nested functions: Define helpers at module level
  • Use CuPy, not NumPy: Use cupy data types and routines in kernels
  • for loops: Must use range() iterator only
  • No return: Side effects via array writes only
  • No break/continue: Use boolean flags instead
  • No variable reassignment in scopes: Declare at top level
  • No -1 indexing: Use len(array) - 1 instead

See CuPy documentation for supported operations.
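
The sketch below combines several of these rules in one step function: the self-inequality NaN check, top-level boolean flags in place of break, range()-only loops, writes via array side effects, and explicit last-index arithmetic. The behavior itself is artificial and the function name is illustrative; the signature follows the Quick Start example, and health is assumed to be a floating-point array (NaN only arises for floats):

from cupyx import jit

@jit.rawkernel(device="cuda")
def constrained_step_func(tick, agent_index, globals, agent_ids, breeds, locations, health):
    # Declare flags once at the top level, not inside nested scopes
    is_nan = False
    seen_positive = False
    # NaN check: a NaN value never compares equal to itself
    if health[agent_index] != health[agent_index]:
        is_nan = True
    if is_nan:
        health[agent_index] = 0.0
    # No break/continue: guard the loop body with a flag; loop via range() only
    for i in range(len(health)):
        if not seen_positive:
            if health[i] > 0.0:
                seen_positive = True
    # No -1 indexing: compute the last index explicitly
    last_index = len(health) - 1
    if agent_index == last_index:
        if seen_positive:
            # No return value: results are written back into the arrays
            health[agent_index] = health[agent_index] + 1.0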

Project Structure

sagesim/
├── sagesim/           # Core library
│   ├── model.py       # Model class, simulation loop, GPU kernel generation
│   ├── agent.py       # Agent factory, MPI data synchronization
│   ├── breed.py       # Breed definition, property registration
│   ├── space.py       # NetworkSpace for agent topology
│   └── internal_utils.py  # Array conversion utilities
├── examples/          # Example models (SIR epidemic model)
├── docs/              # Comprehensive documentation
└── tests/             # Test suite

Contributing

Contributions are welcome! Please see the GitHub repository for issues and pull requests.

License

MIT License - Oak Ridge National Laboratory
