Commit a5d1130

chore(sagemaker-unified-studio-spark-troubleshooting-mcp-server): Address Comments
1 parent c5a338c commit a5d1130

1 file changed (+4, -5) under src/sagemaker-unified-studio-spark-troubleshooting-mcp-server

src/sagemaker-unified-studio-spark-troubleshooting-mcp-server/README.md

Lines changed: 4 additions & 5 deletions
@@ -8,15 +8,14 @@ A fully managed remote MCP server that provides specialized tools for troublesho
 
 - **Intelligent Failure Analysis**: Automatically analyzes Spark event logs, error messages, and resource usage to pinpoint exact issues including memory problems, configuration errors, and code bugs
 - **Multi-Platform Support**: Troubleshoot PySpark and Scala applications across Amazon EMR on EC2, EMR Serverless, AWS Glue, and Amazon SageMaker Notebooks
-- **Automated Feature Extraction**: Connects to platform-specific UIs (EMR Persistent UI, Glue Studio Spark UI, EMR-Serverless Spark History Server) to extract comprehensive context
+- **Automated Feature Extraction**: Connects to platform-specific spark history server (EMR, Glue, EMR-Serverless) to extract comprehensive context
 - **GenAI Root Cause Analysis**: Leverages AI models and Spark knowledge base to correlate features and identify root causes of performance issues or failures
 - **Code Recommendation Engine**: Provides actionable code modifications, configuration adjustments, and architectural improvements with concrete examples
 - **Natural Language Interface**: Use conversational prompts to request troubleshooting analysis and code recommendations
-- **Cross-Region Processing**: Uses advanced inference capabilities to process natural language requests and generate intelligent responses
 
 ## Architecture
 
-The troubleshooting agent has three main components: any MCP-compatible AI Assistant in your development environment for interaction, the [MCP Proxy for AWS](https://github.com/aws/mcp-proxy-for-aws) that handles secure communication between your client and the MCP server, and the Amazon SageMaker Unified Studio Managed MCP Server that provides specialized Spark troubleshooting tools for Amazon EMR, AWS Glue, and Amazon SageMaker Notebooks.
+The troubleshooting agent has three main components: an MCP-compatible AI Assistant in your development environment for interaction, the [MCP Proxy for AWS](https://github.com/aws/mcp-proxy-for-aws) that handles secure communication and authentication between your client and AWS services, and the Amazon SageMaker Unified Studio Remote MCP Server (preview) that provides specialized Spark troubleshooting tools for Amazon EMR, AWS Glue and Amazon SageMaker Notebooks. This diagram illustrates how you interact with the Amazon SageMaker Unified Studio Remote MCP Server through your AI Assistant.
 
 ![img](https://docs.aws.amazon.com/images/emr/latest/ReleaseGuide/images/spark-troubleshooting-agent-architecture.png)
 

@@ -220,13 +219,13 @@ This server processes your Spark application logs and configuration files to pro
 The agent supports both PySpark and Scala Spark applications running on Amazon EMR on EC2, EMR Serverless, AWS Glue, and Amazon SageMaker Notebooks.
 
 ### 2. What happens if my Spark job is still running?
-The troubleshooting tools only support analysis of failed Spark workloads. You'll need to wait for the job to complete (and fail) before analysis can be performed.
+The troubleshooting tools only support analysis of failed Spark workloads.
 
 ### 3. Can I get code recommendations for successful jobs?
 Code recommendations are primarily focused on fixing issues in failed workloads, but you can request code-level suggestions for optimization even without a full failure analysis.
 
 ### 4. How does the agent access my Spark logs?
-The agent connects to platform-specific interfaces: EMR Persistent UI for EMR-EC2, Glue Studio Spark UI for AWS Glue, and Spark History Server for EMR Serverless to extract necessary telemetry data.
+The agent connects to platform-specific interfaces: EMR Persistent UI for EMR-EC2, Glue Studio Spark UI for AWS Glue, Spark History Server for EMR Serverless And S3/Cloudwatch logs to extract necessary telemetry data.
 
 ### 5. Is my data secure during the troubleshooting process?
 Yes, all processing follows AWS data protection standards. The agent analyzes logs and configurations temporarily to provide recommendations without permanently storing sensitive data.
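
Reviewer note: the platform-to-interface mapping that FAQ 4 describes after this change can be summarized as a small lookup table. The sketch below is illustrative only and is not the server's implementation. The dictionary keys, the `SHARED_LOG_SOURCES` name, and the `telemetry_sources` function are all hypothetical. It also assumes that the S3 and CloudWatch logs added in this commit apply regardless of platform, which the README wording does not state explicitly.

```python
# Hypothetical lookup table built from the wording of FAQ 4 in this diff.
# Keys and names are illustrative; they are not part of the MCP server's API.
PLATFORM_INTERFACES = {
    "emr-ec2": "EMR Persistent UI",
    "glue": "Glue Studio Spark UI",
    "emr-serverless": "Spark History Server",
}

# Assumption: the README lists these without tying them to one platform,
# so this sketch treats them as shared across all platforms.
SHARED_LOG_SOURCES = ("S3 logs", "CloudWatch logs")


def telemetry_sources(platform: str) -> list[str]:
    """Return the interface FAQ 4 names for a platform, plus the shared log sources."""
    key = platform.strip().lower()
    if key not in PLATFORM_INTERFACES:
        # FAQ 4 names no interface for other platforms (e.g. SageMaker Notebooks).
        raise ValueError(f"FAQ 4 names no interface for platform: {platform!r}")
    return [PLATFORM_INTERFACES[key], *SHARED_LOG_SOURCES]


if __name__ == "__main__":
    for name in PLATFORM_INTERFACES:
        print(name, "->", telemetry_sources(name))
```

Raising an error for unlisted platforms keeps the sketch from implying an interface the README does not document.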

0 commit comments
