Commit 6084393

feat(dynamodb-mcp-server): Adding direct connection support to include RDS / self-hosted MySQL DB in the Source DB Analyzer tool
1 parent d6344f8 commit 6084393

8 files changed: +2491 -1640 lines changed

.secrets.baseline

Lines changed: 4 additions & 5 deletions
@@ -215,18 +215,17 @@
         "filename": "src/dynamodb-mcp-server/README.md",
         "hashed_secret": "37b5ecd16fe6c599c85077c7992427df62b2ab71",
         "is_verified": false,
-        "line_number": 200,
+        "line_number": 223,
         "is_secret": false
       }
     ],
     "src/dynamodb-mcp-server/awslabs/dynamodb_mcp_server/database_analyzers.py": [
       {
         "type": "Secret Keyword",
         "filename": "src/dynamodb-mcp-server/awslabs/dynamodb_mcp_server/database_analyzers.py",
-        "hashed_secret": "38a2bae6275b4d868c488758d213827833cd8570",
+        "hashed_secret": "90181b7e929dd123c6fb2bfb6917e005e9ed6992",
         "is_verified": false,
-        "line_number": 107,
-        "is_secret": false
+        "line_number": 155
       }
     ],
     "src/dynamodb-mcp-server/tests/test_dynamodb_server.py": [
@@ -906,5 +905,5 @@
       }
     ]
   },
-  "generated_at": "2025-12-01T14:26:01Z"
+  "generated_at": "2025-12-05T13:59:20Z"
 }

src/dynamodb-mcp-server/README.md

Lines changed: 27 additions & 4 deletions
@@ -14,7 +14,7 @@ The DynamoDB MCP server provides four tools for data modeling and validation:

 **Example invocation:** "Validate my DynamoDB data model"

-- `source_db_analyzer` - Analyzes existing MySQL/Aurora databases to extract schema structure, access patterns from Performance Schema, and generates timestamped analysis files for use with dynamodb_data_modeling. Requires AWS RDS Data API and credentials in Secrets Manager.
+- `source_db_analyzer` - Analyzes existing MySQL/Aurora databases to extract schema structure, access patterns from Performance Schema, and generates timestamped analysis files for use with dynamodb_data_modeling. Supports both AWS RDS Data API (Aurora MySQL) and direct connections (RDS/self-hosted MySQL).

 **Example invocation:** "Analyze my MySQL database and help me design a DynamoDB data model"

@@ -159,11 +159,24 @@ The tool automates the traditional manual validation process:

 The `source_db_analyzer` tool analyzes existing MySQL/Aurora databases to extract schema and access patterns for DynamoDB modeling. This is useful when migrating from relational databases.

+The tool supports two connection methods:
+- **AWS RDS Data API** (Aurora MySQL): Serverless connection using cluster ARN
+- **Direct Connection** (RDS/self-hosted MySQL): Traditional connection using hostname/port
+
 #### Prerequisites for MySQL Integration

+**For AWS RDS Data API (Aurora MySQL):**
 1. Aurora MySQL Cluster with credentials stored in AWS Secrets Manager
 2. Enable RDS Data API for your Aurora MySQL Cluster
-3. Enable Performance Schema for access pattern analysis (optional but recommended):
+3. AWS credentials with permissions to access RDS Data API and AWS Secrets Manager
+
+**For Direct Connection (RDS/self-hosted MySQL):**
+1. MySQL server (RDS, Aurora, or self-hosted) accessible from your environment
+2. Database credentials stored in AWS Secrets Manager
+3. AWS credentials with permissions to access AWS Secrets Manager
+
+**For both connection methods:**
+4. Enable Performance Schema for access pattern analysis (optional but recommended):
    - Set `performance_schema` parameter to 1 in your DB parameter group
    - Reboot the DB instance after changes
    - Verify with: `SHOW GLOBAL VARIABLES LIKE '%performance_schema'`
@@ -172,18 +185,28 @@ The `source_db_analyzer` tool analyzes existing MySQL/Aurora databases to extrac
    - `performance_schema_max_digest_length` - Maximum byte length per statement digest (default: 1024)
    - Without Performance Schema, analysis is based on information schema only

-4. AWS credentials with permissions to access RDS Data API and AWS Secrets Manager
-
 #### MySQL Environment Variables

 Add these environment variables to enable MySQL integration:

+**For AWS RDS Data API (Aurora MySQL):**
 - `MYSQL_CLUSTER_ARN`: Aurora MySQL cluster Resource ARN
 - `MYSQL_SECRET_ARN`: ARN of secret containing database credentials
 - `MYSQL_DATABASE`: Database name to analyze
 - `AWS_REGION`: AWS region of the Aurora MySQL cluster
+
+**For Direct Connection (RDS/self-hosted MySQL):**
+- `MYSQL_HOSTNAME`: MySQL server hostname or endpoint
+- `MYSQL_PORT`: MySQL server port (optional, default: 3306)
+- `MYSQL_SECRET_ARN`: ARN of secret containing database credentials
+- `MYSQL_DATABASE`: Database name to analyze
+- `AWS_REGION`: AWS region where Secrets Manager is located
+
+**Common options:**
 - `MYSQL_MAX_QUERY_RESULTS`: Maximum rows in analysis output files (optional, default: 500)

+**Note:** Explicit tool parameters take precedence over environment variables. Only one connection method (cluster ARN or hostname) should be specified.
+
 #### MCP Configuration with MySQL

 ```json
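The two environment-variable sets documented in the README hunks above can be assembled in code. A minimal sketch, assuming a hypothetical helper `build_mysql_env` (it is not part of the server) that returns the `env` mapping for one connection method and rejects a request that names both or neither:

```python
from typing import Dict, Optional


def build_mysql_env(
    secret_arn: str,
    database: str,
    region: str,
    cluster_arn: Optional[str] = None,
    hostname: Optional[str] = None,
    port: int = 3306,
) -> Dict[str, str]:
    """Hypothetical helper: build the env block for exactly one connection method."""
    # The README asks for one method only, so reject "both" and "neither".
    if bool(cluster_arn) == bool(hostname):
        raise ValueError('Specify exactly one of cluster_arn or hostname')

    # Variables shared by both connection methods.
    env = {
        'MYSQL_SECRET_ARN': secret_arn,
        'MYSQL_DATABASE': database,
        'AWS_REGION': region,
    }
    if cluster_arn:
        # RDS Data API (Aurora MySQL)
        env['MYSQL_CLUSTER_ARN'] = cluster_arn
    else:
        # Direct connection (RDS/self-hosted MySQL)
        env['MYSQL_HOSTNAME'] = str(hostname)
        env['MYSQL_PORT'] = str(port)
    return env
```

The resulting mapping holds the same keys the README lists, so it could be serialized into the `env` section of an MCP configuration.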

src/dynamodb-mcp-server/awslabs/dynamodb_mcp_server/database_analyzers.py

Lines changed: 93 additions & 25 deletions
@@ -21,13 +21,17 @@
     get_schema_queries,
 )
 from awslabs.dynamodb_mcp_server.markdown_formatter import MarkdownFormatter
-from awslabs.mysql_mcp_server.server import DBConnection, DummyCtx
+from awslabs.mysql_mcp_server.connection.asyncmy_pool_connection import AsyncmyPoolConnection
+from awslabs.mysql_mcp_server.connection.rds_data_api_connection import RDSDataAPIConnection
+from awslabs.mysql_mcp_server.server import DummyCtx
 from awslabs.mysql_mcp_server.server import run_query as mysql_query
 from datetime import datetime
 from loguru import logger
 from typing import Any, Dict, List, Tuple


+DEFAULT_MYSQL_PORT = 3306
+DEFAULT_READONLY = True
 DEFAULT_ANALYSIS_DAYS = 30
 DEFAULT_MAX_QUERY_RESULTS = 500
 SECONDS_PER_DAY = 86400
@@ -63,11 +67,42 @@ def build_connection_params(source_db_type: str, **kwargs) -> Dict[str, Any]:
             )
         output_dir = user_provided_dir

+        # Validate port parameter
+        port_value = kwargs.get('port') or os.getenv('MYSQL_PORT', str(DEFAULT_MYSQL_PORT))
+        port = int(port_value) if str(port_value).isdigit() else DEFAULT_MYSQL_PORT
+
+        # Determine connection method - explicit parameters override environment variables
+        explicit_cluster_arn = kwargs.get('aws_cluster_arn')
+        explicit_hostname = kwargs.get('hostname')
+        if explicit_cluster_arn:
+            cluster_arn = explicit_cluster_arn
+            hostname = None
+        elif explicit_hostname:
+            cluster_arn = None
+            hostname = explicit_hostname
+        else:
+            # Fall back to environment variables
+            env_cluster_arn = os.getenv('MYSQL_CLUSTER_ARN')
+            env_hostname = os.getenv('MYSQL_HOSTNAME')
+
+            # Apply same exclusion to env vars as above
+            if env_cluster_arn:
+                cluster_arn = env_cluster_arn
+                hostname = None
+            elif env_hostname:
+                cluster_arn = None
+                hostname = env_hostname
+            else:
+                cluster_arn = None
+                hostname = None
+
         return {
-            'cluster_arn': kwargs.get('aws_cluster_arn') or os.getenv('MYSQL_CLUSTER_ARN'),
+            'cluster_arn': cluster_arn,
             'secret_arn': kwargs.get('aws_secret_arn') or os.getenv('MYSQL_SECRET_ARN'),
             'database': kwargs.get('database_name') or os.getenv('MYSQL_DATABASE'),
             'region': kwargs.get('aws_region') or os.getenv('AWS_REGION'),
+            'hostname': hostname,
+            'port': port,
             'max_results': kwargs.get('max_query_results')
             or int(os.getenv('MYSQL_MAX_QUERY_RESULTS', str(DEFAULT_MAX_QUERY_RESULTS))),
             'pattern_analysis_days': kwargs.get(
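The precedence rules in this hunk are easier to check in isolation. A minimal sketch, assuming two hypothetical pure functions (`resolve_connection` and `resolve_port`, neither of which exists in the commit) that take the environment as a plain mapping instead of reading `os.environ`:

```python
from typing import Mapping, Optional, Tuple

DEFAULT_MYSQL_PORT = 3306


def resolve_connection(
    explicit_cluster_arn: Optional[str],
    explicit_hostname: Optional[str],
    env: Mapping[str, str],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (cluster_arn, hostname); at most one of the two is set."""
    # Explicit parameters win, cluster ARN before hostname.
    if explicit_cluster_arn:
        return explicit_cluster_arn, None
    if explicit_hostname:
        return None, explicit_hostname
    # No explicit value: fall back to the environment in the same order.
    if env.get('MYSQL_CLUSTER_ARN'):
        return env['MYSQL_CLUSTER_ARN'], None
    if env.get('MYSQL_HOSTNAME'):
        return None, env['MYSQL_HOSTNAME']
    return None, None


def resolve_port(explicit_port: Optional[int], env: Mapping[str, str]) -> int:
    """Mirror the hunk's port handling: non-numeric input falls back to 3306."""
    value = explicit_port or env.get('MYSQL_PORT', str(DEFAULT_MYSQL_PORT))
    return int(value) if str(value).isdigit() else DEFAULT_MYSQL_PORT
```

Under these rules an explicit hostname beats a `MYSQL_CLUSTER_ARN` set in the environment, and a malformed `MYSQL_PORT` is replaced by the default rather than raising an error.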
@@ -91,23 +126,37 @@ def validate_connection_params(
             Tuple of (missing_params, param_descriptions)
         """
         if source_db_type == 'mysql':
-            required_params = ['cluster_arn', 'secret_arn', 'database', 'region']
-            missing_params = [
-                param
-                for param in required_params
-                if not connection_params.get(param)
-                or (
-                    isinstance(connection_params[param], str)
-                    and connection_params[param].strip() == ''
+            missing_params = []
+            param_descriptions = {}
+            cluster_arn = connection_params.get('cluster_arn')
+            hostname = connection_params.get('hostname')
+
+            # Check for either RDS Data API or direct connection parameters
+            has_rds_data_api = bool(isinstance(cluster_arn, str) and cluster_arn.strip())
+            has_direct_connection = bool(isinstance(hostname, str) and hostname.strip())
+
+            # Check that we have a connection method
+            if not has_rds_data_api and not has_direct_connection:
+                missing_params.append('cluster_arn OR hostname')
+                param_descriptions['cluster_arn OR hostname'] = (
+                    'Required: Either aws_cluster_arn (Aurora MySQL cluster ARN for RDS Data API) OR hostname (MySQL server hostname for direct connection)'
                 )
-            ]

-            param_descriptions = {
-                'cluster_arn': 'AWS cluster ARN',
-                'secret_arn': 'AWS secret ARN',
-                'database': 'Database name',
-                'region': 'AWS region',
-            }
+            # Check common required parameters
+            common_required_params = ['secret_arn', 'database', 'region']
+            for param in common_required_params:
+                if not connection_params.get(param) or (
+                    isinstance(connection_params[param], str)
+                    and connection_params[param].strip() == ''
+                ):
+                    missing_params.append(param)
+            param_descriptions.update(
+                {
+                    'secret_arn': 'Secrets Manager secret ARN containing DB credentials',
+                    'database': 'Database name to analyze',
+                    'region': 'AWS region where your database instance and Secrets Manager are located',
+                }
+            )
             return missing_params, param_descriptions
         return [], {}
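The validation rule in this hunk is "one connection method plus three common parameters". A minimal sketch, assuming a hypothetical `find_missing` function that simplifies the committed code by treating every parameter as a string (the real method also accepts truthy non-string values for the common parameters):

```python
from typing import Any, Dict, List


def _has_text(value: Any) -> bool:
    """True only for strings that contain something other than whitespace."""
    return isinstance(value, str) and bool(value.strip())


def find_missing(params: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: list the parameter names that still need a value."""
    missing = []
    # A connection method is present if either identifier is usable.
    if not _has_text(params.get('cluster_arn')) and not _has_text(params.get('hostname')):
        missing.append('cluster_arn OR hostname')
    # These three are required for both connection methods.
    for name in ('secret_arn', 'database', 'region'):
        if not _has_text(params.get(name)):
            missing.append(name)
    return missing
```

A whitespace-only value counts as missing, which matches the `strip()` checks in the hunk.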

@@ -232,15 +281,17 @@ class MySQLAnalyzer(DatabaseAnalyzer):
     def is_performance_schema_enabled(result):
         """Check if MySQL performance schema is enabled from query result."""
         if result and len(result) > 0:
-            performance_schema_value = str(
-                result[0].get('', '0')
-            )  # Key is empty string by mysql package design, so checking only value here
+            # MySQL MCP server uses col['label'] for column names, creating {"@@performance_schema": "1"}
+            # Reference: https://github.com/awslabs/mcp/pull/1361
+            performance_schema_value = str(result[0].get('@@performance_schema', '0'))
             return performance_schema_value == '1'
         return False

     def __init__(self, connection_params):
         """Initialize MySQL analyzer with connection parameters."""
-        self.cluster_arn = connection_params['cluster_arn']
+        self.cluster_arn = connection_params.get('cluster_arn')
+        self.hostname = connection_params.get('hostname')
+        self.port = connection_params.get('port', 3306)
         self.secret_arn = connection_params['secret_arn']
         self.database = connection_params['database']
         self.region = connection_params['region']
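The effect of the key change in `is_performance_schema_enabled` can be seen with sample rows. A minimal sketch, assuming the row shape described in the hunk's comment (a list of dicts keyed by column label); the standalone function below copies the committed logic for illustration only:

```python
from typing import Any, Dict, List, Optional


def is_performance_schema_enabled(result: Optional[List[Dict[str, Any]]]) -> bool:
    """Read the flag from the first row, keyed by '@@performance_schema'."""
    if result:
        # str() makes the check work for both 1 and '1'.
        return str(result[0].get('@@performance_schema', '0')) == '1'
    return False


new_style_row = [{'@@performance_schema': 1}]  # keyed by column label
old_style_row = [{'': '1'}]                    # empty-string key read by the removed code
```

With these inputs the function returns `True` for `new_style_row` and `False` for `old_style_row`, so a row in the older shape would now be reported as "Performance Schema disabled".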
@@ -250,10 +301,27 @@ def __init__(self, connection_params):
     async def _run_query(self, sql, query_parameters=None):
         """Internal method to run SQL queries against MySQL database."""
         try:
-            # Create a new connection with current parameters
-            db_connection = DBConnection(
-                self.cluster_arn, self.secret_arn, self.database, self.region, True
-            )
+            # Create appropriate connection type based on available parameters
+            if self.cluster_arn:
+                # Use RDS Data API connection
+                db_connection = RDSDataAPIConnection(
+                    cluster_arn=self.cluster_arn,
+                    secret_arn=self.secret_arn,
+                    database=self.database,
+                    region=self.region,
+                    readonly=DEFAULT_READONLY,
+                )
+            else:
+                # Use direct asyncmy connection
+                db_connection = AsyncmyPoolConnection(
+                    hostname=self.hostname,
+                    port=self.port,
+                    database=self.database,
+                    readonly=DEFAULT_READONLY,
+                    secret_arn=self.secret_arn,
+                    region=self.region,
+                )
+
             # Pass connection parameter directly to mysql_query
             result = await mysql_query(sql, DummyCtx(), db_connection, query_parameters)
             return result
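The branch in `_run_query` picks a connection type from whichever identifier is set. A minimal sketch of that dispatch, assuming stand-in dataclasses (`DataApiSettings` and `DirectSettings` are hypothetical and are not the `awslabs.mysql_mcp_server` connection classes):

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class DataApiSettings:
    """Stand-in for the RDS Data API connection arguments."""
    cluster_arn: str
    secret_arn: str
    database: str
    region: str
    readonly: bool = True


@dataclass
class DirectSettings:
    """Stand-in for the direct (hostname/port) connection arguments."""
    hostname: Optional[str]
    port: int
    secret_arn: str
    database: str
    region: str
    readonly: bool = True


def choose_settings(
    cluster_arn: Optional[str],
    hostname: Optional[str],
    port: int,
    secret_arn: str,
    database: str,
    region: str,
) -> Union[DataApiSettings, DirectSettings]:
    # Same shape as the hunk: a cluster ARN selects the Data API,
    # anything else falls through to the direct connection.
    if cluster_arn:
        return DataApiSettings(cluster_arn, secret_arn, database, region)
    return DirectSettings(hostname, port, secret_arn, database, region)
```

Because the `else` branch runs whenever the cluster ARN is empty, a usable hostname depends on the earlier parameter validation having passed.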

src/dynamodb-mcp-server/awslabs/dynamodb_mcp_server/server.py

Lines changed: 26 additions & 8 deletions
@@ -124,7 +124,16 @@ async def source_db_analyzer(
         ge=1,
     ),
     aws_cluster_arn: Optional[str] = Field(
-        default=None, description='AWS cluster ARN (overrides MYSQL_CLUSTER_ARN env var)'
+        default=None,
+        description='AWS cluster ARN for RDS Data API connection (overrides MYSQL_CLUSTER_ARN env var)',
+    ),
+    hostname: Optional[str] = Field(
+        default=None,
+        description='Database hostname for direct connection (overrides MYSQL_HOSTNAME env var)',
+    ),
+    port: Optional[int] = Field(
+        default=None,
+        description='Database port for direct connection (overrides MYSQL_PORT env var, default: 3306)',
     ),
     aws_secret_arn: Optional[str] = Field(
         default=None, description='AWS secret ARN (overrides MYSQL_SECRET_ARN env var)'
@@ -150,18 +159,26 @@ async def source_db_analyzer(
     - Use these analysis files with the dynamodb_data_modeling tool to design your DynamoDB schema

     Connection Requirements (MySQL/Aurora):
-    - AWS RDS Data API enabled on your Aurora MySQL cluster
+    Two connection methods are supported:
+    1. RDS Data API (Aurora MySQL): Requires aws_cluster_arn
+    2. Direct Connection (Aurora/RDS/self-hosted MySQL): Requires hostname and port
+
+    Note: Do not provide both CLUSTER_ARN and HOSTNAME- the tool will automatically use the first available option
+
+    Both methods require:
     - Database credentials stored in AWS Secrets Manager
-    - Appropriate IAM permissions to access RDS Data API and Secrets Manager
+    - Appropriate IAM permissions to access Secrets Manager
     - For complete analysis: MySQL Performance Schema must be enabled (set performance_schema=ON)
     - Without Performance Schema: Schema-only analysis is performed (no query pattern data)

     Environment Variables (Optional):
     You can set these instead of passing parameters:
     - MYSQL_DATABASE: Database name to analyze
-    - MYSQL_CLUSTER_ARN: Aurora cluster ARN
+    - MYSQL_CLUSTER_ARN: Aurora cluster ARN (for RDS Data API)
+    - MYSQL_HOSTNAME: Database hostname (for direct connection)
+    - MYSQL_PORT: Database port (for direct connection, default: 3306)
     - MYSQL_SECRET_ARN: Secrets Manager secret ARN containing DB credentials
-    - AWS_REGION: AWS region where your database is located
+    - AWS_REGION: AWS region where your database instance and Secrets Manager are located
     - MYSQL_MAX_QUERY_RESULTS: Maximum rows per query (default: 500)

     Typical Usage:
@@ -185,6 +202,8 @@ async def source_db_analyzer(
             pattern_analysis_days=pattern_analysis_days,
             max_query_results=max_query_results,
             aws_cluster_arn=aws_cluster_arn,
+            hostname=hostname,
+            port=port,
             aws_secret_arn=aws_secret_arn,
             aws_region=aws_region,
             output_dir=output_dir,
@@ -195,10 +214,9 @@ async def source_db_analyzer(
             source_db_type, connection_params
         )
         if missing_params:
+            # Handle missing required parameters
             missing_descriptions = [param_descriptions[param] for param in missing_params]
-            return (
-                f'To analyze your {source_db_type} database, I need: {", ".join(missing_descriptions)}'
-            )
+            return f'Missing required parameters: {", ".join(missing_descriptions)}'

         logger.info(
             f'Starting database analysis for {source_db_type} database: {connection_params.get("database")}'
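The new error string is built from parameter descriptions, not parameter names. A minimal sketch, assuming a hypothetical `missing_message` function and the description texts from `validate_connection_params`:

```python
from typing import Dict, List

# Description texts copied from the validate_connection_params hunk.
DESCRIPTIONS: Dict[str, str] = {
    'secret_arn': 'Secrets Manager secret ARN containing DB credentials',
    'database': 'Database name to analyze',
    'region': 'AWS region where your database instance and Secrets Manager are located',
}


def missing_message(missing: List[str], descriptions: Dict[str, str]) -> str:
    """Hypothetical helper: join the descriptions of the missing parameters."""
    return 'Missing required parameters: ' + ', '.join(descriptions[p] for p in missing)
```

For a request that lacks only the database name, `missing_message(['database'], DESCRIPTIONS)` yields `Missing required parameters: Database name to analyze`, which is the string the updated `test_source_db_analyzer_missing_parameters` test asserts on.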

src/dynamodb-mcp-server/pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ dependencies = [
     "typing-extensions==4.14.1",
     "strands-agents>=1.5.0",
     "dspy-ai>=2.6.27",
-    "awslabs.mysql-mcp-server==1.0.5",
+    "awslabs.mysql-mcp-server==1.0.9",
     "awslabs-aws-api-mcp-server==1.0.2",
 ]
 license = {text = "Apache-2.0"}

src/dynamodb-mcp-server/tests/test_dynamodb_server.py

Lines changed: 28 additions & 3 deletions
@@ -73,7 +73,7 @@ async def test_source_db_analyzer_missing_parameters(tmp_path):
         output_dir=str(tmp_path),
     )

-    assert 'To analyze your mysql database, I need:' in result
+    assert 'Missing required parameters: Database name to analyze' in result


 @pytest.mark.asyncio
@@ -90,7 +90,10 @@ async def test_source_db_analyzer_empty_parameters(tmp_path):
         output_dir=str(tmp_path),
     )

-    assert 'To analyze your mysql database, I need:' in result
+    assert (
+        'Missing required parameters: Required: Either aws_cluster_arn (Aurora MySQL cluster ARN for RDS Data API) OR hostname (MySQL server hostname for direct connection)'
+        in result
+    )


 @pytest.mark.asyncio
@@ -112,7 +115,29 @@ async def test_source_db_analyzer_env_fallback(monkeypatch, tmp_path):
     )

     # Should still fail due to missing cluster_arn, but covers env fallback lines
-    assert 'To analyze your mysql database, I need:' in result
+    assert 'Missing required parameters:' in result
+
+
+@pytest.mark.asyncio
+async def test_source_db_analyzer_connection_method_precedence(mysql_env_setup, tmp_path):
+    """Test that explicit connection parameters take precedence over environment variables."""
+    # mysql_env_setup fixture sets MYSQL_CLUSTER_ARN, MYSQL_SECRET_ARN, AWS_REGION
+    # Pass explicit hostname parameter - this should take precedence over env cluster_arn
+    result = await source_db_analyzer(
+        source_db_type='mysql',
+        database_name='test',
+        pattern_analysis_days=30,
+        max_query_results=None,
+        aws_cluster_arn=None,  # No explicit cluster_arn
+        hostname='explicit-hostname',  # Explicit hostname should pass
+        aws_secret_arn=None,  # Will use env var
+        aws_region=None,  # Will use env var
+        output_dir=str(tmp_path),
+    )
+
+    # The test validates the precedence works: it used Asyncmy Direct connection (hostname)
+    # instead of RDS Data API (cluster_arn), even though env had MYSQL_CLUSTER_ARN
+    assert 'Database Analysis Failed' in result


 @pytest.mark.asyncio
