Map Binning Tool

A Python package for spatial resampling and binning of geospatial data, specifically designed for oceanographic datasets. This tool enables efficient downsampling of high-resolution gridded data onto coarser grids while preserving spatial accuracy through intelligent neighborhood averaging.

Citation

If you use this tool in your research, please cite:

@software{map_binning_2025,
  author = {Chia-Wei Hsu},
  title = {Map Binning Tool: Spatial Resampling for Oceanographic Data},
  url = {https://github.com/chiaweh2/map_binning},
  doi = {10.5281/zenodo.17095448},
  year = {2025}
}

Overview

The Map Binning Tool provides a robust solution for spatial data aggregation, particularly useful for:

Downsampling high-resolution oceanographic data (e.g., sea level anomaly, ocean currents)
Creating consistent multi-resolution datasets
Reducing computational load while maintaining spatial representativeness
Processing time-series of gridded data efficiently

The package uses k-d tree algorithms for fast spatial queries and supports both in-memory processing and persistent caching of spatial indices for repeated operations.

Key Features

Efficient Spatial Binning: Uses scipy's cKDTree for fast nearest-neighbor searches
Flexible Grid Support: Works with any xarray-compatible gridded dataset
Automatic Radius Calculation: Intelligently determines search radius based on target grid spacing
Persistent Caching: Save and reuse spatial indices using pickle serialization
Time Series Support: Handles datasets with temporal dimensions
Memory Efficient: Processes large datasets without excessive memory usage
Oceanographic Focus: Optimized for CMEMS and similar oceanographic data formats

Installation

From conda-forge (Recommended)

conda install -c conda-forge map-binning

From PyPI

pip install map-binning

With optional dependencies for development

pip install map-binning[dev]

Developer Installation

From source

git clone <repository-url>
cd map_binning
pip install -e .

Quick Start

Basic Usage

import xarray as xr
from map_binning import Binning

# Load your datasets
ds_high = xr.open_dataset('high_resolution_data.nc')
ds_low = xr.open_dataset('low_resolution_grid.nc')

# Initialize the binning tool
binning = Binning(
    ds_high=ds_high,
    ds_low=ds_low,
    var_name='sla',  # variable in the dataset to bin (e.g., sea level anomaly)
    xdim_name='longitude',  # longitude dimension name
    ydim_name='latitude',   # latitude dimension name
    search_radius=0.1  # optional: search radius in degrees
)

# Perform binning
result = binning.mean_binning()

Advanced Usage with Caching

# Create binning index and save it for reuse
result = binning.mean_binning(
    precomputed_binning_index=False,
    pickle_filename="my_binning_index.pkl",
    pickle_location="./cache"
)

# Reuse the saved index for subsequent operations
result = binning.mean_binning(
    precomputed_binning_index=True,
    pickle_filename="my_binning_index.pkl",
    pickle_location="./cache"
)

Time Series Processing

The tool automatically handles time dimensions:

# Works seamlessly with time-varying datasets
# Input: (time, lat, lon) -> Output: (time, lat_low, lon_low)
result = binning.mean_binning()

Configuration for CMEMS data download

Environment Variables

Copy .env.template to .env and configure:

# Copernicus Marine Service credentials (if using CMEMS data)
COPERNICUSMARINE_SERVICE_USERNAME=<your_username>
COPERNICUSMARINE_SERVICE_PASSWORD=<your_password>

API Reference

Binning Class

Constructor Parameters

ds_high (xr.Dataset): High-resolution source dataset
ds_low (xr.Dataset): Low-resolution target grid dataset
var_name (str): Name of the variable to bin
xdim_name (str, optional): Longitude dimension name (default: 'lon')
ydim_name (str, optional): Latitude dimension name (default: 'lat')
search_radius (float, optional): Search radius in degrees (auto-calculated if None)

Methods

create_binning_index() Creates a spatial mapping between high and low resolution grids.

mean_binning(precomputed_binning_index=False, pickle_filename=None, pickle_location=None) Performs spatial binning using mean aggregation.

Parameters:

precomputed_binning_index (bool): Use pre-saved spatial index
pickle_filename (str): Filename for saving/loading spatial index
pickle_location (str): Directory path for pickle files

Returns: xr.DataArray with binned data on the target grid

Project Structure

map_binning/
├── map_binning/           # Main package directory
│   ├── __init__.py        # Package initialization
│   ├── binning.py         # Core binning algorithms
│   ├── index_store.py     # Pickle serialization utilities
│   └── main.py            # Command-line interface
├── notebooks/             # Jupyter notebooks for examples
│   └── cmems_nrt_coastal_bin.ipynb
├── tests/                 # Unit tests
│   ├── __init__.py
│   └── ...
├── pyproject.toml         # Project configuration
├── environment.yml        # Conda environment specification
├── .env.template          # Environment variables template
└── README.md              # This file

Performance Considerations

Memory Usage: The tool processes data in chunks and uses efficient numpy operations
Spatial Index Caching: Save computed spatial indices to avoid recalculation
Grid Resolution: Performance scales with the product of grid sizes
Search Radius: Smaller radii improve performance but may miss relevant data points

Contributing

We welcome contributions! Please follow these steps:

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Make your changes and add tests
Run the test suite (pytest)
Format your code (black map_binning/)
Submit a pull request

Development Setup

# Clone and setup development environment
git clone <repository-url>
cd map-binning-project
conda env create -f environment.yml
conda activate map-binning
pip install -e .[dev]

# Run tests
pytest

# Format code
black map_binning/

# Type checking
mypy map_binning/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

Issues: Please report bugs and feature requests via GitHub Issues
Documentation: Additional examples available in the notebooks/ directory
Contact: Chia-Wei Hsu ([email protected])

Acknowledgments

Built with support for Copernicus Marine Environment Monitoring Service (CMEMS) data
Utilizes scipy's efficient spatial algorithms
Designed for the oceanographic research community

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Map Binning Tool

Citation

Overview

Key Features

Installation

From conda-forge (Recommended)

From PyPI

With optional dependencies for development

Developer Installation

From source

Quick Start

Basic Usage

Advanced Usage with Caching

Time Series Processing

Configuration for CMEMS data download

Environment Variables

API Reference

Binning Class

Constructor Parameters

Methods

Project Structure

Performance Considerations

Contributing

Development Setup

License

Support

Acknowledgments

About

Uh oh!

Releases 7

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 78 Commits
.github/workflows		.github/workflows
doc		doc
map_binning		map_binning
notebooks		notebooks
tests		tests
.env.template		.env.template
.flake8		.flake8
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
environment.yml		environment.yml
pyproject.toml		pyproject.toml

License

chiaweh2/map_binning

Folders and files

Latest commit

History

Repository files navigation

Map Binning Tool

Citation

Overview

Key Features

Installation

From conda-forge (Recommended)

From PyPI

With optional dependencies for development

Developer Installation

From source

Quick Start

Basic Usage

Advanced Usage with Caching

Time Series Processing

Configuration for CMEMS data download

Environment Variables

API Reference

Binning Class

Constructor Parameters

Methods

Project Structure

Performance Considerations

Contributing

Development Setup

License

Support

Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 7

Languages