A sophisticated, enterprise-grade tool for automatically capturing and categorizing backend API endpoints from web applications. Built with TypeScript and following SOLID principles for maximum maintainability and extensibility.
- Automated API Discovery: Automatically navigate through web applications and capture all backend API calls
- Intelligent Categorization: Organize captured endpoints based on application structure and modules
- Structured Output: Generate hierarchical JSON files mirroring the application's URL structure
- Enterprise Ready: Robust error handling, configuration management, and extensible architecture
- 🚀 Playwright-powered browser automation
- 🏗️ SOLID principles architecture
- 📁 Hierarchical file output matching URL structure
- 🔐 Authentication support for secured applications
- 🎯 Smart endpoint categorization with fallback inference
- ⚡ Configurable timeouts and capture parameters
- 🛡️ Comprehensive error handling and logging
- Node.js 16.0 or higher
- npm or yarn package manager
- Clone and Install Dependencies
git clone <repository-url>
cd api-capture-tool
npm install
- Install Playwright Browsers
npx playwright install
- Environment Configuration
Create a .env file (optional):
API_CAPTURE_USERNAME=your_username
API_CAPTURE_PASSWORD=your_password
- Input File Setup
Create the input structure:
mkdir -p Input
Place your microtec_erp_urls.json in the Input directory.
# Development mode
npm run dev
# Production build and run
npm run build
npm start
The tool uses a hierarchical configuration system:
- Environment Variables (Highest priority)
- Configuration Service defaults
- Input JSON structure for URLs
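The priority order above can be sketched as a small resolver in which environment values override service defaults. This is an illustration only: `resolveConfig`, `readNumber`, and the `CaptureConfig` shape are hypothetical, and it assumes that `CAPTURE_TIMEOUT`, `NAVIGATION_TIMEOUT`, and `TARGET_API_PREFIX` can be supplied as environment variables, which this README does not confirm.

```typescript
// Hypothetical config shape; the project's ConfigurationService may differ.
interface CaptureConfig {
  captureTimeoutMs: number;
  navigationTimeoutMs: number;
  targetApiPrefix: string;
}

type Env = Record<string, string | undefined>;

// Defaults mirror the timeout values documented for ConfigurationService.
const DEFAULTS: CaptureConfig = {
  captureTimeoutMs: 15000,
  navigationTimeoutMs: 60000,
  targetApiPrefix: "https://your-app.com/api", // placeholder value
};

// Reads a positive number from the environment, falling back when the
// variable is missing, empty, or not a valid number.
function readNumber(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Environment variables take priority; defaults fill the gaps.
function resolveConfig(env: Env, defaults: CaptureConfig = DEFAULTS): CaptureConfig {
  return {
    captureTimeoutMs: readNumber(env, "CAPTURE_TIMEOUT", defaults.captureTimeoutMs),
    navigationTimeoutMs: readNumber(env, "NAVIGATION_TIMEOUT", defaults.navigationTimeoutMs),
    targetApiPrefix: env["TARGET_API_PREFIX"] ?? defaults.targetApiPrefix,
  };
}
```

In a Node.js process the resolver would typically be called as `resolveConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.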
{
  "base_url": "https://your-app.com",
  "modules": {
    "Module_Name": {
      "Section_Name": ["/url/path/1", "/url/path/2"],
      "Nested_Section": {
        "SubSection": ["/nested/path"]
      }
    }
  }
}
src/
├── core/                   # Domain Layer (SOLID Principles)
│   ├── interfaces/         # Abstraction contracts
│   ├── entities/           # Business objects
│   └── exceptions/         # Custom error types
├── infrastructure/         # Technical Implementation
│   ├── browser/            # Playwright wrappers
│   ├── file-system/        # File operations
│   └── config/             # Configuration management
├── application/            # Use Cases & Services
│   ├── services/           # Business logic
│   ├── use-cases/          # Application workflows
│   └── dtos/               # Data transfer objects
└── main/                   # Composition & Entry Point
    └── composition-root.ts
Input JSON    → Composition Root → Use Case → Services   → Output
    ↓                  ↓              ↓          ↓           ↓
URL Structure   Dependency         Business   Browser      JSON Files
                Injection          Logic      Automation
- Initialization Phase
main.ts → CompositionRoot → BrowserFactory → ConfigurationService
- URL Loading Phase
Use Case → UrlRepository → FileSystemService → JSON Parser → UrlStructure Entities
- Authentication Phase
Use Case → AuthenticationService → Browser Page → Login Flow
- API Capture Phase
Use Case → ApiCaptureService → Browser Events → ApiEndpoint Entities
- Categorization Phase
Use Case → UrlCategorizationService → UrlStructure Matching → CategorizedEndpoint Entities
- Output Phase
Use Case → ApiEndpointRepository → FileSystemService → OrganizedEndpoints → JSON Files
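The phases above can be sketched as a use case that calls each collaborator in order through injected interfaces. This is a minimal illustration, not the project's actual code: the entity fields, the `Dependencies` bag, and the `authenticate()` and `save()` method names are assumptions, while `loadUrls`, `captureApisFromUrls`, and `categorizeEndpoint` are method names that appear elsewhere in this README.

```typescript
// Minimal stand-ins for the entities named above; the real fields may differ.
interface UrlStructure { url: string; module: string; section: string; }
interface ApiEndpoint { method: string; endpoint: string; sourcePage: string; }
interface CategorizedEndpoint extends ApiEndpoint { module: string; section: string; }

// Hypothetical ports; authenticate() and save() are assumed names.
interface Dependencies {
  urls: { loadUrls(): Promise<UrlStructure[]> };
  auth: { authenticate(): Promise<void> };
  capture: { captureApisFromUrls(urls: UrlStructure[]): Promise<ApiEndpoint[]> };
  categorizer: {
    categorizeEndpoint(endpoint: ApiEndpoint, urls: UrlStructure[]): CategorizedEndpoint;
  };
  output: { save(endpoints: CategorizedEndpoint[]): Promise<void> };
}

class CaptureApisUseCase {
  private readonly deps: Dependencies;

  constructor(deps: Dependencies) {
    this.deps = deps;
  }

  // Runs the URL loading, authentication, capture, categorization,
  // and output phases in sequence.
  async execute(): Promise<void> {
    const urls = await this.deps.urls.loadUrls();
    await this.deps.auth.authenticate();
    const endpoints = await this.deps.capture.captureApisFromUrls(urls);
    const categorized = endpoints.map((endpoint) =>
      this.deps.categorizer.categorizeEndpoint(endpoint, urls),
    );
    await this.deps.output.save(categorized);
  }
}
```

Because the use case depends only on interfaces, a composition root can wire in the Playwright-backed implementations while tests pass in-memory fakes.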
Purpose: Orchestrates the entire API capture workflow
async execute(): Promise<void> {
  // 1. Load URLs from repository
  // 2. Authenticate with application
  // 3. Capture APIs from all URLs
  // 4. Categorize endpoints by module/section
  // 5. Save organized results to file system
}
Purpose: Navigates through URLs and captures API requests
async captureApisFromUrls(urls: UrlStructure[]): Promise<ApiEndpoint[]> {
  for (const url of urls) {
    // - Navigate to URL using Playwright
    // - Wait for API calls with timeout
    // - Capture unique endpoints via request listeners
    // - Store in memory map to avoid duplicates
  }
  return Array.from(capturedEndpoints.values());
}
Purpose: Intelligently categorizes endpoints based on source URL
categorizeEndpoint(endpoint: ApiEndpoint, urls: UrlStructure[]): CategorizedEndpoint {
  // 1. Find exact URL match in navigation structure
  // 2. If no match, infer from URL path segments
  // 3. Apply normalization rules (masterdata → Master_data)
  // 4. Handle special API patterns (SideMenu, CurrentUserInfo)
  // 5. Return categorized endpoint with module/section/subsection
}
Purpose: Initializes Playwright browser instance with proper configuration
async createBrowser(): Promise<IBrowserService> {
  // - Launch Chromium in non-headless mode with devtools
  // - Create new browser context
  // - Return wrapped browser service for abstraction
}
Purpose: Parses input JSON and creates structured URL hierarchy
async loadUrls(): Promise<UrlStructure[]> {
  // - Read and validate JSON file
  // - Recursively parse module structure
  // - Create UrlStructure entities
  // - Sort by URL length for specificity matching
}
Purpose: Transforms internal data structure to serializable JSON
toJSON(): any {
  // - Convert Map-based structure to plain objects
  // - Transform entities to DTOs
  // - Maintain hierarchical module/section/subsection structure
  // - Ensure proper JSON serialization
}
The tool generates a hierarchical file structure:
08new_api_endpoints_output/
├── all_endpoints.json                      # Complete endpoint catalog
├── General_Settings/                       # Module directory
│   ├── General_Settings_endpoints.json     # Module-level endpoints
│   ├── Dashboard/                          # Section directory
│   │   ├── Dashboard_endpoints.json        # Section-level endpoints
│   │   └── SideMenu/                       # Subsection directory
│   └── Master_data/
├── Accounting/
└── ... (other modules)
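The naming pattern in the tree above can be sketched as a helper that maps a category to its output file. `outputPathFor` is a hypothetical function, not part of the project. It also assumes subsection directories follow the same `<Name>_endpoints.json` pattern as modules and sections, which the tree does not show explicitly.

```typescript
// Hypothetical helper: derives the output file path for a category.
// With no category, the path points at the complete endpoint catalog.
function outputPathFor(
  outputDir: string,
  module?: string,
  section?: string,
  subsection?: string,
): string {
  // Keep only the levels that were provided, in hierarchy order.
  const parts = [module, section, subsection].filter((p): p is string => !!p);
  if (parts.length === 0) return `${outputDir}/all_endpoints.json`;

  // The file is named after the deepest level, e.g. Dashboard_endpoints.json.
  const leaf = parts[parts.length - 1];
  return [outputDir, ...parts, `${leaf}_endpoints.json`].join("/");
}
```

For example, `outputPathFor("08new_api_endpoints_output", "General_Settings", "Dashboard")` yields `08new_api_endpoints_output/General_Settings/Dashboard/Dashboard_endpoints.json`. The sketch joins with `/` to stay dependency-free; real file-system code would more likely use Node's `path.join`.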
{
  "General_Settings": {
    "Dashboard": {
      "SideMenu": [
        {
          "method": "GET",
          "endpoint": "/api/menu",
          "sourcePage": "/erp/dashboard",
          "timestamp": "2024-01-15T10:30:00.000Z"
        }
      ]
    }
  }
}
// In ConfigurationService
CAPTURE_TIMEOUT: 15000, // Wait for APIs per page
NAVIGATION_TIMEOUT: 60000, // Page load timeout
// Only capture requests matching:
- URL starts with TARGET_API_PREFIX
- Resource type is "xhr" or "fetch"
- Unique method-URL combinations
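The three capture rules above can be sketched as a small collector that filters requests and deduplicates them in a `Map` keyed by method and URL. `EndpointCollector` and `CapturedRequest` are hypothetical names for illustration, and the prefix passed to the constructor stands in for `TARGET_API_PREFIX`.

```typescript
// Minimal view of a request. With Playwright these values would come from
// request.url(), request.method(), and request.resourceType().
interface CapturedRequest {
  url: string;
  method: string;
  resourceType: string;
}

class EndpointCollector {
  private readonly seen = new Map<string, CapturedRequest>();
  private readonly apiPrefix: string;

  constructor(apiPrefix: string) {
    this.apiPrefix = apiPrefix;
  }

  // Returns true only when the request passes the filter and is new.
  offer(req: CapturedRequest): boolean {
    const isApiCall = req.url.startsWith(this.apiPrefix);
    const isXhrOrFetch = req.resourceType === "xhr" || req.resourceType === "fetch";
    if (!isApiCall || !isXhrOrFetch) return false;

    // One entry per unique method-URL combination.
    const key = `${req.method.toUpperCase()} ${req.url}`;
    if (this.seen.has(key)) return false;
    this.seen.set(key, req);
    return true;
  }

  endpoints(): CapturedRequest[] {
    return Array.from(this.seen.values());
  }
}
```

With Playwright, a collector like this could be fed from a listener such as `page.on("request", (r) => collector.offer({ url: r.url(), method: r.method(), resourceType: r.resourceType() }))`.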
- Login Failures
  - Verify credentials in configuration
  - Check network connectivity to target application
  - Update CSS selectors if login form changes
- No APIs Captured
  - Verify TARGET_API_PREFIX matches backend domain
  - Check if APIs are triggered on page load
  - Increase CAPTURE_TIMEOUT for slower applications
- File System Errors
  - Ensure write permissions in output directory
  - Verify input JSON file exists and is valid
Enable verbose logging by setting environment variable:
DEBUG_API_CAPTURE=true npm run dev
- Uses Map for O(1) endpoint lookups
- Automatic browser resource cleanup
- Streamed file writing for large datasets
- Parallelizable URL processing
- Smart deduplication of endpoints
- Configurable timeouts per environment
The architecture supports easy extensions:
- New Browser Support: Implement IBrowserFactory
- Additional Output Formats: Implement IApiEndpointRepository
- Custom Categorization: Extend IUrlCategorizationService
- Alternative Authentication: Implement IAuthenticationService
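As an illustration of the output-format extension point, the sketch below implements a CSV writer behind a repository interface. The `IApiEndpointRepository` shape shown here (a single `save` method) is an assumption, as are `CsvEndpointRepository`, `csvEscape`, and the entity fields; the project's actual interface may differ.

```typescript
// Assumed entity and interface shapes, for illustration only.
interface CategorizedEndpoint {
  module: string;
  section: string;
  method: string;
  endpoint: string;
}

interface IApiEndpointRepository {
  save(endpoints: CategorizedEndpoint[]): Promise<void>;
}

// Quotes a value when it contains characters that are special in CSV.
function csvEscape(value: string): string {
  return /[",\n\r]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Writes endpoints as CSV through an injected writer function, so the
// class stays independent of any particular file-system API.
class CsvEndpointRepository implements IApiEndpointRepository {
  private readonly write: (content: string) => Promise<void>;

  constructor(write: (content: string) => Promise<void>) {
    this.write = write;
  }

  async save(endpoints: CategorizedEndpoint[]): Promise<void> {
    const header = "module,section,method,endpoint";
    const rows = endpoints.map((e) =>
      [e.module, e.section, e.method, e.endpoint].map(csvEscape).join(","),
    );
    await this.write([header, ...rows].join("\n") + "\n");
  }
}
```

The writer passed to the constructor could wrap the existing file-system service. Swapping this class in at the composition root would change the output format without touching the use case.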
This project is designed for educational and legitimate testing purposes. Ensure you have proper authorization before using against any applications.
Built with ❤️ following SOLID principles for enterprise-grade reliability and maintainability.