Try Demo 🤗HuggingFace | 🚀ModelScope
Join us on 🎮Discord | 💬WeChat
If you like WebQA Agent, please give us a ⭐ on GitHub!
🤖 WebQA Agent is an autonomous web browser agent that audits performance, functionality & UX for engineers and vibe-coding creators. ✨
- 🤖 AI-Powered Testing: Performs autonomous website testing with intelligent planning and reflection. It explores pages, plans actions, and executes end-to-end flows without manual scripting, using a 2-stage architecture (lightweight filtering + comprehensive planning) and dynamic test generation for newly appeared UI elements.
- 📊 Multi-Dimensional Observation: Covers functionality, performance, user experience, and basic security; evaluates load speed, design details, and links to surface issues. Uses multi-modal analysis (screenshots + DOM structure + text content) and DOM diff detection to discover new test opportunities.
- 🎯 Actionable Recommendations: Runs in real browsers with smart element prioritization and automatic viewport management. Provides concrete suggestions for improvement with adaptive recovery mechanisms for robust test execution.
- 📈 Visual Reports: Generates detailed HTML test reports with clear, multi-dimensional views for analysis and tracking.
- 🤖 Conversational UI: Autonomously plans goals and interacts across a dynamic chat interface.
- 🎨 Creative Page: Explores page structure and identifies elements.
Before starting, ensure Docker is installed. If not, please refer to the official installation guide: Docker Installation Guide.
Recommended versions: Docker >= 24.0, Docker Compose >= 2.32.
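As a quick sanity check against the recommended versions, the following minimal sketch compares the version strings printed by `docker --version` and `docker compose version` with the minimums listed above. The helper names (`parse_version`, `meets_minimum`, `command_output`) are illustrative and not part of WebQA Agent.

```python
import re
import subprocess

# Minimum versions recommended in this README.
MIN_DOCKER = (24, 0)
MIN_COMPOSE = (2, 32)


def parse_version(text):
    """Return the first dotted version number in text as a tuple of ints, or None."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(text, minimum):
    """True if the version found in text is at least `minimum` (compared numerically)."""
    version = parse_version(text)
    return version is not None and version[: len(minimum)] >= minimum


def command_output(args):
    """Run a command and return its stdout, or an empty string if it is not installed."""
    try:
        result = subprocess.run(args, capture_output=True, text=True, check=False)
    except FileNotFoundError:
        return ""
    return result.stdout
```

For example, `meets_minimum(command_output(["docker", "--version"]), MIN_DOCKER)` and `meets_minimum(command_output(["docker", "compose", "version"]), MIN_COMPOSE)` should both be true on a suitable machine. Comparing integer tuples rather than strings avoids treating `2.4` as newer than `2.32`.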
# 1. Download configuration template
mkdir -p config && curl -fsSL https://raw.githubusercontent.com/MigoXLab/webqa-agent/main/config/config.yaml.example -o config/config.yaml
# 2. Edit configuration file
# Set target.url, llm_config.api_key and other parameters
# 3. One-click start
curl -fsSL https://raw.githubusercontent.com/MigoXLab/webqa-agent/main/start.sh | bash

git clone https://github.com/MigoXLab/webqa-agent.git
cd webqa-agent

Install Python >= 3.10 and run the following commands:
pip install -r requirements.txt
playwright install

Performance Analysis - Lighthouse (Optional)
# Requires Node.js >= 18.0.0
npm install

Security Scanning - Nuclei (Optional)
Download from: Nuclei Releases
# macOS
brew install nuclei
# For other systems, download the appropriate version from the link above
# Update templates and verify installation
nuclei -ut -v # Update Nuclei templates
nuclei -version # Verify successful installation

After configuring config/config.yaml (refer to "Usage > Test Configuration"), run:
python webqa-agent.py

webqa-agent uses YAML configuration for test parameters:
target:
  url: https://example.com/ # Website URL to test
  description: example description
  # max_concurrent_tests: 2 # Optional, default parallel 2

test_config: # Test configuration
  function_test: # Functional testing
    enabled: True
    type: ai # default or ai
    business_objectives: example business objectives # Recommended to include test scope, e.g., test search functionality
    dynamic_step_generation: # Optional, configuration for dynamic steps generation
      enabled: True # Optional, default False, recommended to set True to enable dynamic step generation
      max_dynamic_steps: 10 # Optional, default 5, this example uses 10
      min_elements_threshold: 1 # Optional, default 2, this example uses 1 for higher sensitivity
  ux_test: # User experience testing
    enabled: True
  performance_test: # Performance analysis
    enabled: False
  security_test: # Security scanning
    enabled: False

llm_config: # Vision model configuration, currently supports OpenAI SDK compatible format only
  model: gpt-4.1-2025-04-14 # Primary model for Stage 2 test planning (Recommended)
  filter_model: gpt-4o-mini # Lightweight model for Stage 1 element filtering (cost-effective)
  api_key: your_api_key
  base_url: https://api.example.com/v1
  temperature: 0.1 # Optional, default 0.1
  # top_p: 0.9 # Optional, if not set, this parameter will not be passed
  # max_tokens: 8192 # Optional, maximum output tokens (supports generating more test cases)

browser_config:
  viewport: {"width": 1280, "height": 720}
  headless: False # Automatically overridden to True in Docker environment
  language: zh-CN
  cookies: []
  save_screenshots: False # Whether to save screenshots to local disk (default: False)

report:
  language: en-US # zh-CN, en-US

log:
  level: info

Please note the following important considerations when configuring and running tests:
- AI Mode: Uses a 2-stage planning architecture in which Stage 1 (filter_model) prioritizes elements for efficient analysis and Stage 2 (the primary model) generates comprehensive test cases. The system may reflect and re-plan based on actual page conditions and test coverage, so the number of test cases finally executed may differ from the initial configuration. When dynamic_step_generation is enabled, the system automatically generates additional test steps for newly appeared UI elements (e.g., dropdowns, modals) detected through DOM diff analysis.
- Default Mode: The default mode focuses on whether UI interactions (e.g., clicks and navigations) complete successfully.
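To make the dynamic step generation idea concrete, here is a minimal sketch of a DOM diff trigger. It is not WebQA Agent's implementation. The function names are hypothetical, elements are represented by plain string ids, and the semantics of `min_elements_threshold` (minimum number of new elements needed to trigger generation) and `max_dynamic_steps` (cap on generated steps) are assumptions inferred from the configuration comments.

```python
def new_element_ids(before, after):
    """Return ids present in the `after` snapshot but not in `before`, in page order."""
    seen = set(before)
    return [element_id for element_id in after if element_id not in seen]


def plan_dynamic_steps(before, after, max_dynamic_steps=5, min_elements_threshold=2):
    """Pick newly appeared elements to test, or [] if too few elements appeared.

    Defaults mirror the documented defaults (5 and 2); their exact meaning
    inside WebQA Agent is assumed here, not confirmed.
    """
    appeared = new_element_ids(before, after)
    if len(appeared) < min_elements_threshold:
        return []  # change too small to justify extra steps
    return appeared[:max_dynamic_steps]  # cap the number of generated steps


# Snapshots taken before and after clicking a (hypothetical) search box.
before = ["nav", "search-box"]
after = ["nav", "search-box", "dropdown", "option-1", "option-2"]
steps = plan_dynamic_steps(before, after, max_dynamic_steps=10, min_elements_threshold=1)
```

Under these assumptions, lowering `min_elements_threshold` to 1 lets a single new element, such as a modal, trigger extra steps, which matches the "higher sensitivity" note in the example configuration.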
UX (User Experience) testing focuses on usability and user-friendliness. It uses multi-modal analysis, combining screenshots, DOM structure, and text content, to evaluate visual quality, detect typos and grammar issues, and validate layout rendering. The model output in the results provides best-practice suggestions to guide optimization.
Based on our testing, these models work well with WebQA Agent:
| Model | Key Strengths | Notes |
|---|---|---|
| gpt-4.1-2025-04-14 ⭐ | High accuracy & reliability | Best choice |
| gpt-4.1-mini-2025-04-14 | Cost-effective | Economical and practical |
| qwen3-vl-235b-a22b-instruct | Open-source, GPT-4.1 level | Best for on-premise |
| doubao-seed-1-6-vision-250815 | Vision capabilities | Excellent web understanding |
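Whichever model is chosen, the llm_config comments describe a few rules: `temperature` defaults to 0.1, `top_p` is not passed unless set, and `filter_model` serves Stage 1 while `model` serves Stage 2. The sketch below shows one way such a mapping could be turned into request parameters for an OpenAI-compatible chat completion call. The function name is hypothetical, and two behaviors are assumptions rather than documented facts: omitting `max_tokens` when unset, and falling back to `model` when `filter_model` is missing.

```python
def build_request_params(llm_config, stage=2):
    """Build chat-completion keyword arguments from an llm_config mapping.

    stage=1 selects the lightweight filter model, stage=2 the primary model.
    """
    if stage == 1:
        # Assumption: fall back to the primary model if no filter model is set.
        model = llm_config.get("filter_model") or llm_config["model"]
    else:
        model = llm_config["model"]

    params = {
        "model": model,
        "temperature": llm_config.get("temperature", 0.1),  # documented default
    }
    # Optional keys are only forwarded when explicitly configured.
    for optional_key in ("top_p", "max_tokens"):
        if llm_config.get(optional_key) is not None:
            params[optional_key] = llm_config[optional_key]
    return params


example_config = {
    "model": "gpt-4.1-2025-04-14",
    "filter_model": "gpt-4o-mini",
    "api_key": "your_api_key",
    "base_url": "https://api.example.com/v1",
}
stage_two_params = build_request_params(example_config, stage=2)
```

Connection settings such as `api_key` and `base_url` are deliberately left out of the returned parameters, since they usually belong to the client object rather than to each request.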
Test results will be generated in the reports directory. Open the HTML report within the generated folder to view results.
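If many runs accumulate, a small helper can locate the newest report. This is a minimal sketch that assumes only what is stated above (HTML files somewhere under a `reports` directory). The exact folder and file names produced by WebQA Agent are not assumed, and `latest_report` is an illustrative name.

```python
from pathlib import Path


def latest_report(reports_dir="reports"):
    """Return the most recently modified .html file under reports_dir, or None."""
    root = Path(reports_dir)
    if not root.is_dir():
        return None
    candidates = list(root.rglob("*.html"))  # search all generated subfolders
    if not candidates:
        return None
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

The returned path can then be opened in a browser, for instance with the standard library's `webbrowser.open(report.resolve().as_uri())`.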
- Continuous optimization of AI functional testing: Improve coverage and accuracy
- Functional traversal and page validation: Verify business logic correctness and data integrity
- Interaction and visualization: Visualize test items and display the real-time reasoning process when running as a local service
- Capability expansion: Multi-model integration and more evaluation dimensions
- natbot: Drive a browser with GPT-3
- Midscene.js: AI Operator for Web, Android, Automation & Testing
- browser-use: AI Agent for Browser control
This project is licensed under the Apache 2.0 License.