From 27c9271853ce629b54dc13beadf1141b7f6ab599 Mon Sep 17 00:00:00 2001
From: Rafi Al Attrach
Date: Sat, 25 Oct 2025 15:26:12 +0200
Subject: [PATCH 1/2] Improve README with cleaner Quick Start and add llms-install.md

- Add clean side-by-side Quick Start for SQLite/BigQuery options
- Add llms-install.md for AI agents (Cline, etc.)
- Add m3-mcp entry point for simpler uvx usage
- Add client links (Claude Desktop, Cursor, Goose)
- Add video tutorial reference
- Move CLI help tip to pip install section (where it's relevant)
- Reduce total lines while adding more useful info
- Bump version to 0.3.0
---
 README.md          | 391 +++++++++++++++++++++------------------------
 llms-install.md    |  95 +++++++++++
 pyproject.toml     |   3 +-
 src/m3/__init__.py |   2 +-
 4 files changed, 283 insertions(+), 208 deletions(-)
 create mode 100644 llms-install.md

diff --git a/README.md b/README.md
index 694f801..cda733e 100644
--- a/README.md
+++ b/README.md
@@ -14,189 +14,233 @@
 Transform medical data analysis with AI! Ask questions about MIMIC-IV data in plain English and get instant insights. Choose between local demo data (free) or full cloud dataset (BigQuery).
-## ✨ Features
+## Features
-- 🔍 **Natural Language Queries**: Ask questions about MIMIC-IV data in plain English
-- 🏠 **Local SQLite**: Fast queries on demo database (free, no setup)
-- ☁️ **BigQuery Support**: Access full MIMIC-IV dataset on Google Cloud
-- 🔒 **Enterprise Security**: OAuth2 authentication with JWT tokens and rate limiting
-- 🛡️ **SQL Injection Protection**: Read-only queries with comprehensive validation
+- **Natural Language Queries**: Ask questions about MIMIC-IV data in plain English
+- **Local SQLite**: Fast queries on demo database (free, no setup)
+- **BigQuery Support**: Access full MIMIC-IV dataset on Google Cloud
+- **Enterprise Security**: OAuth2 authentication with JWT tokens and rate limiting
+- **SQL Injection Protection**: Read-only queries with comprehensive validation
 ## 🚀 Quick Start
-> 💡 **Need more options?** Run `m3 --help` to see all available commands and options.
+> 📺 **Prefer video tutorials?** Check out [step-by-step video guides](https://rafiattrach.github.io/m3/) covering setup, PhysioNet configuration, and more.
-### 📦 Installation
+### Install uv (required for `uvx`)
-Choose your preferred installation method:
+We use `uvx` to run the MCP server. Install `uv` from the official installer, then verify with `uv --version`.
-#### Option A: Install from PyPI (Recommended)
-
-**Step 1: Create Virtual Environment**
+**macOS and Linux:**
 ```bash
-# Create virtual environment (recommended)
-python -m venv .venv
-source .venv/bin/activate # Windows: .venv\Scripts\activate
+curl -LsSf https://astral.sh/uv/install.sh | sh
+```
+
+**Windows (PowerShell):**
+```powershell
+powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
 ```
-**Step 2: Install M3**
+Verify installation:
 ```bash
-# Install M3
-pip install m3-mcp
+uv --version
 ```
-#### Option B: Docker
+### BigQuery Setup (Optional - Full Dataset)
-```bash
-# Clone repo first
-git clone https://github.com/rafiattrach/m3.git && cd m3
+**Skip this if using SQLite demo database.**
-# SQLite (demo DB)
-docker build -t m3:lite --target lite .
-docker run -d --name m3-server m3:lite tail -f /dev/null
+1. **Install Google Cloud SDK:**
+   - macOS: `brew install google-cloud-sdk`
+   - Windows/Linux: https://cloud.google.com/sdk/docs/install
-# BigQuery (full dataset - requires GCP credentials)
-docker build -t m3:bigquery --target bigquery .
-docker run -d --name m3-server \
-  -e M3_BACKEND=bigquery \
-  -e M3_PROJECT_ID=YOUR_PROJECT_ID \
-  -v $HOME/.config/gcloud:/root/.config/gcloud:ro \
-  m3:bigquery tail -f /dev/null
-```
+2. **Authenticate:**
+   ```bash
+   gcloud auth application-default login
+   ```
+   *Opens your browser - choose the Google account with BigQuery access to MIMIC-IV.*
+
+### MCP Client Configuration
+
+Paste one of the following into your MCP client config, then restart your client.
+
+**Supported clients:** [Claude Desktop](https://www.claude.com/download), [Cursor](https://cursor.com/download), [Goose](https://block.github.io/goose/), and [more](https://github.com/punkpeye/awesome-mcp-clients).
+
+
+
+
+
+
+
+**SQLite (Demo Database)**
+
+Free, local, no setup required.
-**MCP client config** (Claude Desktop, LM Studio, etc.):
 ```json
 {
   "mcpServers": {
     "m3": {
-      "command": "docker",
-      "args": ["exec", "-i", "m3-server", "python", "-m", "m3.mcp_server"]
+      "command": "uvx",
+      "args": ["m3-mcp"],
+      "env": {
+        "M3_BACKEND": "sqlite"
+      }
     }
   }
 }
 ```
-Stop container: `docker stop m3-server && docker rm m3-server`
+*Demo database (136MB, 100 patients, 275 admissions) downloads automatically on first query.*
-#### Option C: Install from Source
+
-#### Using standard `pip`
-**Step 1: Clone and Navigate**
-```bash
-# Clone the repository
-git clone https://github.com/rafiattrach/m3.git
-cd m3
-```
+**BigQuery (Full Dataset)**
-**Step 2: Create Virtual Environment**
-```bash
-# Create virtual environment
-python -m venv .venv
-source .venv/bin/activate # Windows: .venv\Scripts\activate
-```
+Requires GCP credentials and PhysioNet access.
-**Step 3: Install M3**
-```bash
-# Install M3
-pip install .
+```json
+{
+  "mcpServers": {
+    "m3": {
+      "command": "uvx",
+      "args": ["m3-mcp"],
+      "env": {
+        "M3_BACKEND": "bigquery",
+        "M3_PROJECT_ID": "your-project-id"
+      }
+    }
+  }
+}
 ```
-#### Using `UV` (Recommended)
-Assuming you have [UV](https://docs.astral.sh/uv/getting-started/installation/) installed.
+*Replace `your-project-id` with your Google Cloud project ID.*
-**Step 1: Clone and Navigate**
-```bash
-# Clone the repository
-git clone https://github.com/rafiattrach/m3.git
-cd m3
-```
+
+
-**Step 2: Create `UV` Virtual Environment**
-```bash
-# Create virtual environment
-uv venv
-```
+**That's it!** Restart your MCP client and ask:
+- "What tools do you have for MIMIC-IV data?"
+- "Show me patient demographics from the ICU"
+- "What is the race distribution in admissions?"
-**Step 3: Install M3**
-```bash
-uv sync
-# Do not forget to use `uv run` to any subsequent commands to ensure you're using the `uv` virtual environment
-```
+---
-### 🗄️ Database Configuration
+## Backend Comparison
-After installation, choose your data source:
+| Feature | SQLite (Demo) | BigQuery (Full) |
+|---------|---------------|-----------------|
+| **Cost** | Free | BigQuery usage fees |
+| **Setup** | Zero config | GCP credentials required |
+| **Data Size** | 100 patients, 275 admissions | 365k patients, 546k admissions |
+| **Speed** | Fast (local) | Network latency |
+| **Use Case** | Learning, development | Research, production |
-#### Option A: Local Demo Database (Recommended for Beginners)
+---
-**Perfect for learning and development - completely free!**
+## Alternative Installation Methods
-1. **Download demo database**:
-   ```bash
-   m3 init mimic-iv-demo
-   ```
+> Already have Docker or prefer pip? Here are other ways to run m3:
-2. **Setup MCP Client**:
-   ```bash
-   m3 config
-   ```
+### 🐳 Docker (No Python Required)
-   *Alternative: For Claude Desktop specifically:*
-   ```bash
-   m3 config claude
-   ```
+
+
+
+
+
+
+
+**SQLite:**
+```bash
+git clone https://github.com/rafiattrach/m3.git && cd m3
+docker build -t m3:lite --target lite .
+docker run -d --name m3-server m3:lite tail -f /dev/null
+```
-3. **Restart your MCP client** and ask:
+
-   - "What tools do you have for MIMIC-IV data?"
-   - "Show me patient demographics from the ICU"
+**BigQuery:**
+```bash
+git clone https://github.com/rafiattrach/m3.git && cd m3
+docker build -t m3:bigquery --target bigquery .
+docker run -d --name m3-server \
+  -e M3_BACKEND=bigquery \
+  -e M3_PROJECT_ID=your-project-id \
+  -v $HOME/.config/gcloud:/root/.config/gcloud:ro \
+  m3:bigquery tail -f /dev/null
+```
-#### Option B: BigQuery (Full Dataset)
+
+
-**For researchers needing complete MIMIC-IV data**
+**MCP config (same for both):**
+```json
+{
+  "mcpServers": {
+    "m3": {
+      "command": "docker",
+      "args": ["exec", "-i", "m3-server", "python", "-m", "m3.mcp_server"]
+    }
+  }
+}
+```
-##### Prerequisites
-- Google Cloud account and project with billing enabled
-- Access to MIMIC-IV on BigQuery (requires PhysioNet credentialing)
+Stop: `docker stop m3-server && docker rm m3-server`
-##### Setup Steps
+### pip Install + CLI Tools
-1. **Install Google Cloud CLI**:
+```bash
+pip install m3-mcp
+```
-   **macOS (with Homebrew):**
-   ```bash
-   brew install google-cloud-sdk
-   ```
+> 💡 **CLI commands:** Run `m3 --help` to see all available options.
-   **Windows:** Download from https://cloud.google.com/sdk/docs/install
+**Useful CLI commands:**
+- `m3 init mimic-iv-demo` - Download demo database
+- `m3 config` - Generate MCP configuration interactively
+- `m3 config claude --backend bigquery --project-id YOUR_PROJECT_ID` - Quick BigQuery setup
-   **Linux:**
-   ```bash
-   curl https://sdk.cloud.google.com | bash
-   ```
+**Example MCP config:**
+```json
+{
+  "mcpServers": {
+    "m3": {
+      "command": "m3-mcp-server",
+      "env": {
+        "M3_BACKEND": "sqlite"
+      }
+    }
+  }
+}
+```
-2. **Authenticate**:
-   ```bash
-   gcloud auth application-default login
-   ```
-   *This will open your browser - choose the Google account that has access to your BigQuery project with MIMIC-IV data.*
+### Local Development
-3. **Setup MCP Client for BigQuery**:
-   ```bash
-   m3 config
-   ```
+For contributors:
-   *Alternative: For Claude Desktop specifically:*
-   ```bash
-   m3 config claude --backend bigquery --project-id YOUR_PROJECT_ID
-   ```
+```bash
+git clone https://github.com/rafiattrach/m3.git && cd m3
+python -m venv .venv
+source .venv/bin/activate # Windows: .venv\Scripts\activate
+pip install -e ".[dev]"
+pre-commit install
+```
-4. **Test BigQuery Access** - Restart your MCP client and ask:
-   ```
-   Use the get_race_distribution function to show me the top 5 races in MIMIC-IV admissions.
-   ```
+**MCP config:**
+```json
+{
+  "mcpServers": {
+    "m3": {
+      "command": "/path/to/m3/.venv/bin/python",
+      "args": ["-m", "m3.mcp_server"],
+      "cwd": "/path/to/m3",
+      "env": {
+        "M3_BACKEND": "sqlite"
+      }
+    }
+  }
+}
+```
-## 🔧 Advanced Configuration
+## Advanced Configuration
 Need to configure other MCP clients or customize settings? Use these commands:
@@ -218,7 +262,7 @@ m3 config --quick --backend sqlite --db-path /path/to/database.db
 m3 config --output my_config.json
 ```
-### 🔐 OAuth2 Authentication (Optional)
+### OAuth2 Authentication (Optional)
 For production deployments requiring secure access to medical data:
@@ -245,21 +289,9 @@ m3 config # Choose OAuth2 option during setup
 > 📖 **Complete OAuth2 Setup Guide**: See [`docs/OAUTH2_AUTHENTICATION.md`](docs/OAUTH2_AUTHENTICATION.md) for detailed configuration, troubleshooting, and production deployment guidelines.
-### Backend Comparison
-
-**SQLite Backend (Default)**
-- ✅ **Free**: No cloud costs
-- ✅ **Fast**: Local queries
-- ✅ **Easy**: No authentication needed
-- ❌ **Limited**: Demo dataset only (~1k records)
+---
-**BigQuery Backend**
-- ✅ **Complete**: Full MIMIC-IV dataset (~500k admissions)
-- ✅ **Scalable**: Google Cloud infrastructure
-- ✅ **Current**: Latest MIMIC-IV version (3.1)
-- ❌ **Costs**: BigQuery usage fees apply
-
-## 🛠️ Available MCP Tools
+## Available MCP Tools
 When your MCP client processes questions, it uses these tools automatically:
@@ -270,7 +302,7 @@ When your MCP client processes questions, it uses these tools automatically:
 - **get_lab_results**: Laboratory test results
 - **get_race_distribution**: Patient race distribution
-## 🧪 Example Prompts
+## Example Prompts
 Try asking your MCP client these questions:
@@ -291,12 +323,7 @@ Try asking your MCP client these questions:
 - `Prompt:` *What tables are available in the database?*
 - `Prompt:` *What tools do you have for MIMIC-IV data?*
-## 🎩 Pro Tips
-
-- Do you want to pre-approve the usage of all tools in Claude Desktop? Use the prompt below and then select **Always Allow**
-  - `Prompt:` *Can you please call all your tools in a logical sequence?*
-
-## 🔍 Troubleshooting
+## Troubleshooting
 ### Common Issues
@@ -350,59 +377,11 @@ gcloud auth application-default login
 gcloud auth list
 ```
-## 👩‍💻 For Developers
-
-### Development Setup
-
-#### Option A: Standard `pip` Development Setup
-**Step 1: Clone and Navigate**
-```bash
-# Clone the repository
-git clone https://github.com/rafiattrach/m3.git
-cd m3
-```
-
-**Step 2: Create and Activate Virtual Environment**
-```bash
-# Create virtual environment
-python -m venv .venv
-source .venv/bin/activate # Windows: .venv\Scripts\activate
-```
-
-**Step 3: Install Development Dependencies**
-```bash
-# Install in development mode with dev dependencies
-pip install -e ".[dev]"
-# Install pre-commit hooks
-pre-commit install
-```
-
-#### Option B: Development Setup with `UV` (Recommended)
-**Step 1: Clone and Navigate**
-```bash
-# Clone the repository
-git clone https://github.com/rafiattrach/m3.git
-cd m3
-```
-
-**Step 2: Create and Activate `UV` Virtual Environment**
-```bash
-# Create virtual environment
-uv venv
-```
+## For Developers
-**Step 3: Install Development Dependencies**
-```bash
-# Install in development mode with dev dependencies (by default, UV runs in editable mode)
-uv sync
-
-# Install pre-commit hooks
-uv run pre-commit install
-
-# Do not forget to use `uv run` to any subsequent commands to ensure you're using the `uv` virtual environment
-```
+> See "Local Development" section above for setup instructions.
-### Testing
+### Running Tests
 ```bash
 pytest # All tests (includes OAuth2 and BigQuery mocks)
@@ -428,15 +407,15 @@ export M3_OAUTH2_TOKEN="Bearer your-test-token"
 m3-mcp-server
 ```
-## 🔮 Roadmap
+## Roadmap
-- 🏠 **Local Full Dataset**: Complete MIMIC-IV locally (no cloud costs)
-- 🔧 **Advanced Tools**: More specialized medical data functions
-- 📊 **Visualization**: Built-in plotting and charting tools
-- 🔐 **Enhanced Security**: Role-based access control, audit logging
-- 🌐 **Multi-tenant Support**: Organization-level data isolation
+- **Local Full Dataset**: Complete MIMIC-IV locally (no cloud costs)
+- **Advanced Tools**: More specialized medical data functions
+- **Visualization**: Built-in plotting and charting tools
+- **Enhanced Security**: Role-based access control, audit logging
+- **Multi-tenant Support**: Organization-level data isolation
-## 🤝 Contributing
+## Contributing
 We welcome contributions! Please:
diff --git a/llms-install.md b/llms-install.md
new file mode 100644
index 0000000..d5480ea
--- /dev/null
+++ b/llms-install.md
@@ -0,0 +1,95 @@
+# M3 MCP Server - Installation Guide for AI Agents
+
+This guide helps AI agents like Cline install and configure the M3 MCP server.
+
+## Installation Method
+
+Use `uvx` for zero-installation setup:
+
+```bash
+uvx m3-mcp
+```
+
+## Backend Configuration
+
+M3 supports two backends. Choose one:
+
+### Option 1: SQLite (Demo Database - Recommended for Testing)
+
+**MCP Configuration:**
+```json
+{
+  "mcpServers": {
+    "m3": {
+      "command": "uvx",
+      "args": ["m3-mcp"],
+      "env": {
+        "M3_BACKEND": "sqlite"
+      }
+    }
+  }
+}
+```
+
+**Features:**
+- No setup required
+- Demo database (100 patients, 275 admissions) downloads automatically
+- Perfect for testing and development
+
+### Option 2: BigQuery (Full MIMIC-IV Dataset)
+
+**Prerequisites:**
+1. User must have Google Cloud credentials configured
+2. User must have access to MIMIC-IV on BigQuery (requires PhysioNet credentialing)
+
+**MCP Configuration:**
+```json
+{
+  "mcpServers": {
+    "m3": {
+      "command": "uvx",
+      "args": ["m3-mcp"],
+      "env": {
+        "M3_BACKEND": "bigquery",
+        "M3_PROJECT_ID": "user-project-id"
+      }
+    }
+  }
+}
+```
+
+**Setup Steps:**
+1. Install Google Cloud SDK: `brew install google-cloud-sdk` (macOS)
+2. Authenticate: `gcloud auth application-default login`
+3. Replace `user-project-id` with the user's actual GCP project ID
+
+## Verification
+
+After configuration, test by asking:
+- "What tools do you have for MIMIC-IV data?"
+- "Show me patient demographics from the ICU"
+
+## Troubleshooting
+
+**If SQLite backend fails:**
+- The demo database downloads automatically on first query
+- No manual `m3 init` needed
+
+**If BigQuery backend fails:**
+- Verify GCP authentication: `gcloud auth list`
+- Confirm PhysioNet access to MIMIC-IV dataset
+- Check project ID is correct
+
+## Available Tools
+
+- `get_database_schema` - List available tables
+- `get_table_info` - Get column info and sample data
+- `execute_mimic_query` - Execute SQL queries
+- `get_icu_stays` - ICU stay information
+- `get_lab_results` - Laboratory test results
+- `get_race_distribution` - Patient demographics
+
+## Additional Resources
+
+- Full documentation: https://github.com/rafiattrach/m3
+- Video tutorials: https://rafiattrach.github.io/m3/
diff --git a/pyproject.toml b/pyproject.toml
index 923035b..66269dc 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -64,7 +64,8 @@ dev = [
 [project.scripts]
 m3 = "m3.cli:app"
-m3-mcp-server = "m3.mcp_server:main"
+m3-mcp = "m3.mcp_server:main" # Primary entry point for simplicity (enables `uvx m3-mcp`)
+m3-mcp-server = "m3.mcp_server:main" # Kept for backwards compatibility
 [project.urls]
 Homepage = "https://github.com/rafiattrach/m3"
diff --git a/src/m3/__init__.py b/src/m3/__init__.py
index 393f13d..c2c0b88 100644
--- a/src/m3/__init__.py
+++ b/src/m3/__init__.py
@@ -2,4 +2,4 @@
 MIMIC-IV + MCP + Models (M3): Local MIMIC-IV querying with LLMs via Model Context Protocol
 """
-__version__ = "0.2.0"
+__version__ = "0.3.0"

From f220135442cfa830aa47db34487709c69ff4832f Mon Sep 17 00:00:00 2001
From: Rafi Al Attrach
Date: Tue, 4 Nov 2025 20:13:24 +0100
Subject: [PATCH 2/2] Add Citation and Related Projects sections

- Add BibTeX citation for the paper
- Reference GitHub's citation button
- Add Related Projects section for community adaptations
---
 README.md | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/README.md b/README.md
index cda733e..0fb37c7 100644
--- a/README.md
+++ b/README.md
@@ -424,6 +424,28 @@ We welcome contributions! Please:
 3. Add tests for new functionality
 4. Submit a pull request
+## Citation
+
+If you use M3 in your research, please cite:
+
+```bibtex
+@article{attrach2025conversational,
+  title={Conversational LLMs Simplify Secure Clinical Data Access, Understanding, and Analysis},
+  author={Attrach, Rafi Al and Moreira, Pedro and Fani, Rajna and Umeton, Renato and Celi, Leo Anthony},
+  journal={arXiv preprint arXiv:2507.01053},
+  year={2025}
+}
+```
+
+You can also use the "Cite this repository" button at the top of the GitHub page for other formats.
+
+## Related Projects
+
+M3 has been forked and adapted by the community:
+- [MCPStack-MIMIC](https://github.com/MCP-Pipeline/mcpstack-mimic) - Integrates M3 with other MCP servers (Jupyter, sklearn, etc.)
+
+---
+
 *Built with ❤️ for the medical AI community*
 **Need help?** Open an issue on GitHub or check our troubleshooting guide above.
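
Review note: the two `uvx` configurations documented in this series differ only in their `env` block, so a client config could be generated instead of pasted by hand. The following is a minimal sketch under stated assumptions, not part of M3: the helper name `build_m3_config` is hypothetical, while the `uvx` command, the `m3-mcp` argument, and the `M3_BACKEND` / `M3_PROJECT_ID` variables are taken from the README text in the patch.

```python
import json
from typing import Optional


def build_m3_config(backend: str = "sqlite", project_id: Optional[str] = None) -> dict:
    """Build an MCP client config for M3 (hypothetical helper, not in the repo)."""
    if backend not in ("sqlite", "bigquery"):
        raise ValueError(f"unsupported backend: {backend!r}")

    # Both backends share the same command; only the environment differs.
    env = {"M3_BACKEND": backend}
    if backend == "bigquery":
        if not project_id:
            raise ValueError("the bigquery backend needs a project_id")
        env["M3_PROJECT_ID"] = project_id

    return {
        "mcpServers": {
            "m3": {"command": "uvx", "args": ["m3-mcp"], "env": env}
        }
    }


if __name__ == "__main__":
    # Print a config in the same JSON shape the README shows for BigQuery.
    print(json.dumps(build_m3_config("bigquery", "your-project-id"), indent=2))
```

Calling `build_m3_config()` with no arguments yields the SQLite variant; the output would still need to be merged into the client's own config file by hand.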