feat: Migrate to native findings format with enhanced CLI and HTML export #36
Open
tduhamel42 wants to merge 16 commits into dev from feature/native-findings-format-rebased
Priority 1 implementation:
- Created native FuzzForge findings format schema with full support for:
  - 5-level severity (critical/high/medium/low/info)
  - Confidence levels
  - CWE and OWASP categorization
  - found_by attribution (module, tool, type)
  - LLM context tracking (model, prompt, temperature)
- Updated ModuleFinding model with new fields:
  - Added rule_id for pattern identification
  - Added found_by for detection attribution
  - Added llm_context for LLM-detected findings
  - Added confidence, cwe, owasp, references
  - Added column_start/end for precise location
- Updated create_finding() helper with new required fields
- Enhanced _generate_summary() with confidence and source tracking
- Fixed critical ID bug in CLI:
  - Changed 'ff finding show' to use --id (unique) instead of --rule
  - Added new show_findings_by_rule() function to show ALL findings matching a rule
  - Updated display_finding_detail() to support both native and SARIF formats
  - Now properly handles multiple findings with same rule_id
Breaking changes:
- create_finding() now requires rule_id and found_by parameters
- show_finding() now uses --id instead of --rule flag
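To make the shape of the new format concrete, here is a minimal sketch of a finding record with the fields this commit lists. It is an illustration only: it uses stdlib dataclasses rather than the project's own models, and every name other than rule_id, found_by, llm_context, confidence, cwe, owasp, and create_finding is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional
import uuid


class Severity(str, Enum):
    """The 5-level severity scale named in the commit message."""
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"
    INFO = "info"


@dataclass
class FoundBy:
    module: str
    tool: str
    type: str  # e.g. "llm", "tool", "fuzzer", "manual"


@dataclass
class LLMContext:
    model: str
    prompt: str
    temperature: float


@dataclass
class Finding:
    rule_id: str            # identifies the pattern; NOT unique per finding
    found_by: FoundBy
    title: str              # hypothetical field name
    severity: Severity
    confidence: str = "medium"
    cwe: Optional[str] = None
    owasp: Optional[str] = None
    llm_context: Optional[LLMContext] = None
    # Unique per finding, which is what a lookup by --id relies on.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))


def create_finding(rule_id: str, found_by: FoundBy, title: str,
                   severity: str, **extra) -> Finding:
    """Sketch of a helper where rule_id and found_by are required."""
    return Finding(rule_id=rule_id, found_by=found_by, title=title,
                   severity=Severity(severity), **extra)
```

Two findings produced by the same rule share a rule_id but get distinct id values, which is why looking up a single finding by rule was ambiguous.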
- Renamed FindingRecord.sarif_data to findings_data
- Updated database schema: sarif_data column -> findings_data column
- Updated all database methods to work with native format:
  - save_findings()
  - get_findings()
  - list_findings()
  - get_all_findings()
  - get_aggregated_stats()
- Updated SQL queries to use native format JSON paths:
  - Changed from SARIF paths ($.runs[0].results) to native paths ($.findings)
  - Updated severity filtering from SARIF levels (error/warning/note) to native (critical/high/medium/low/info)
- Updated CLI commands to support both formats during transition:
  - get_findings command now extracts summary from both native and SARIF formats
  - show_finding and show_findings_by_rule updated to use findings_data field
  - Format detection to handle data from API (still SARIF) and database (native)
Breaking changes:
- Database schema changed - existing databases will need recreation
- FindingRecord.sarif_data renamed to findings_data
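The JSON path change can be sketched with SQLite's JSON functions. This is a rough illustration, not the project's query: the table name findings, the run_id column, and both function names are assumptions; only the findings_data column and the $.findings path come from the commit message. It also assumes the SQLite build bundled with Python includes the JSON functions.

```python
import json
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory database holding one native-format report."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE findings (run_id TEXT PRIMARY KEY, findings_data TEXT)"
    )
    report = {
        "findings": [
            {"id": "a1", "severity": "critical"},
            {"id": "b2", "severity": "low"},
            {"id": "c3", "severity": "low"},
        ]
    }
    conn.execute(
        "INSERT INTO findings VALUES (?, ?)", ("run-1", json.dumps(report))
    )
    return conn


def count_by_severity(conn: sqlite3.Connection, run_id: str) -> dict:
    """Count findings per native severity by walking $.findings."""
    rows = conn.execute(
        """
        SELECT json_extract(f.value, '$.severity') AS severity, COUNT(*) AS n
        FROM findings AS r, json_each(r.findings_data, '$.findings') AS f
        WHERE r.run_id = ?
        GROUP BY severity
        """,
        (run_id,),
    ).fetchall()
    return {severity: n for severity, n in rows}
```

With SARIF data the equivalent walk would start at $.runs[0].results and group on the level field instead.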
- Renamed sarif_reporter.py to native_reporter.py to reflect new functionality
- Updated WorkflowFindings model to use native format
- Field name 'sarif' kept for API compatibility but now contains native format
- Updated docstring to reflect native format usage
- Converted SARIFReporter to Native Reporter:
- Module name changed from sarif_reporter to native_reporter (v2.0.0)
- Updated metadata and input/output schemas
- Removed SARIF-specific config (tool_name, include_code_flows)
- Added native format config (workflow_name, run_id)
- Implemented native report generation:
- Added _generate_native_report() method
- Generates native FuzzForge format with full field support:
- Unique finding IDs
- found_by attribution (module, tool, type)
- LLM context when applicable
- Full severity scale (critical/high/medium/low/info)
- Confidence levels
- CWE and OWASP mappings
- Enhanced location info (columns, snippets)
- References and metadata
- Added _create_native_summary() for aggregated stats
- Summary includes counts by severity, confidence, category, source, and type
- Tracks affected files count
- Kept old SARIF generation methods for reference
- Will be moved to separate SARIF exporter module
Breaking changes:
- Reporter now outputs native format instead of SARIF
- Existing workflows using sarif_reporter will need updates
- Config parameters changed (tool_name -> workflow_name, etc.)
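The aggregated summary described above can be sketched as a handful of counters. The function and key names below (create_native_summary, by_severity, file_path, category, and so on) are assumptions for illustration; the commit only states which dimensions are counted.

```python
from collections import Counter


def create_native_summary(findings: list) -> dict:
    """Aggregate counts by severity, confidence, category, source, and type."""

    def tally(key_fn) -> dict:
        return dict(Counter(key_fn(f) for f in findings))

    return {
        "total_findings": len(findings),
        "by_severity": tally(lambda f: f.get("severity", "info")),
        "by_confidence": tally(lambda f: f.get("confidence", "medium")),
        "by_category": tally(lambda f: f.get("category", "uncategorized")),
        "by_source": tally(lambda f: f.get("found_by", {}).get("module", "unknown")),
        "by_type": tally(lambda f: f.get("found_by", {}).get("type", "unknown")),
        # Distinct files only; findings without a path are ignored.
        "affected_files": len({f["file_path"] for f in findings if f.get("file_path")}),
    }
```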
Priority 3 implementation:
**Improved Table Display:**
- Removed hardcoded 50-result limit
- Added pagination with --limit and --offset parameters
- New table columns: Finding ID (8 chars), Confidence (H/M/L), Found By
- Supports both native and SARIF formats with auto-detection
- Proper severity ordering (critical > high > medium > low > info)
- Pagination footer showing "Showing X-Y of Z results"
**Syntax Highlighting:**
- Added syntax-highlighted code snippets using Rich's Syntax
- Auto-detects language from file extension (20+ languages supported)
- Line numbers with correct start line from finding location
- Monokai theme for better readability
**Enhanced Detail View:**
- Confidence indicators with emoji (🟢 High, 🟡 Medium, 🔴 Low)
- Type-specific badges (🤖 LLM, 🔧 Tool, 🎯 Fuzzer, 👤 Manual)
- LLM context display with model name and prompt preview
- Better formatted found_by info with module and type
- Added suggestion to view all findings with same rule
- Cleaner recommendation display with 💡 icon
**New Commands:**
- Added 'ff findings by-rule' command to show all findings matching a rule
- Registered as @app.command("by-rule")
**Updated Related Commands:**
- all_findings: Updated to use 5-level severity (critical/high/medium/low/info)
- Table columns changed from Error/Warning/Note to Critical/High/Medium/Low
- Summary panel updated with proper severity mapping
- Support for both native and SARIF format findings_data
**Breaking Changes:**
- Severity display changed from 3-level (error/warning/note) to 5-level
- Table structure modified with new columns
- Old SARIF-only views deprecated in favor of format-agnostic displays
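The severity ordering and pagination footer described in this commit could look roughly like the following. It is a sketch under assumptions: the function name paginate and the dict-based finding shape are hypothetical, and only the --limit/--offset semantics and the footer wording come from the commit message.

```python
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3, "info": 4}


def paginate(findings: list, limit=None, offset: int = 0):
    """Sort by severity (critical first), then slice out one page."""
    ordered = sorted(
        findings,
        # Unknown severities sort after "info".
        key=lambda f: SEVERITY_ORDER.get(f.get("severity"), len(SEVERITY_ORDER)),
    )
    total = len(ordered)
    end = total if limit is None else min(offset + limit, total)
    page = ordered[offset:end]
    if page:
        footer = f"Showing {offset + 1}-{offset + len(page)} of {total} results"
    else:
        footer = f"Showing 0 of {total} results"
    return page, footer
```

Passing limit=None returns everything from offset onward, which matches the removal of the hardcoded 50-result cap.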
Fixed broken import after renaming sarif_reporter.py to native_reporter.py
Updated 10 modules to use the new create_finding() signature with required rule_id and found_by parameters:
- llm_analyzer.py: Added FoundBy and LLMContext for AI-detected findings
- bandit_analyzer.py: Added tool attribution and moved CWE/confidence to proper fields
- security_analyzer.py: Updated all three finding types (secrets, SQL injection, dangerous functions)
- mypy_analyzer.py: Added tool attribution and moved column info to column_start
- mobsf_scanner.py: Updated all 6 finding types (permissions, manifest, code analysis, behavior) with proper line number handling
- opengrep_android.py: Added tool attribution, proper CWE/OWASP formatting, and confidence mapping
- dependency_scanner.py: Added pip-audit attribution for CVE findings
- file_scanner.py: Updated both sensitive file and enumeration findings
- cargo_fuzzer.py: Added fuzzer type attribution for crash findings
- atheris_fuzzer.py: Added fuzzer type attribution for Python crash findings
All modules now properly track:
- Finding source (module, tool name, version, type)
- Confidence levels (high/medium/low)
- CWE and OWASP mappings where applicable
- LLM context for AI-detected issues
Aligns main.py with the updated findings.py command that changed from --rule to --id for finding lookups by unique UUID.
Removes the Confidence column from the findings table display to eliminate confusion with the Severity column (both used High/Medium/Low terminology).
Changes:
- Removed 'Conf' column from table structure
- Removed confidence extraction logic for both native and SARIF formats
- Removed confidence badge creation and styling
- Table now shows: ID | Severity | Rule | Message | Found By | Location
Confidence data is still available in detailed finding view (ff finding show).
Removes the Rule column from findings table to simplify the view and reduce redundancy with the Message column.
Rule ID is still available in:
- Detailed finding view (ff finding show <run-id> --id <finding-id>)
- By-rule grouping command (ff findings by-rule <run-id> --rule <rule-id>)
Changes:
- Removed 'Rule' column from table structure
- Removed rule_text extraction and styling logic
- Expanded Message column from 35 to 50 chars (more space available)
- Expanded Location column from 18 to 20 chars
- Table now shows: ID | Severity | Message | Found By | Location
Benefits:
- Cleaner, more scannable table
- Message column has more room to show details
- Less visual clutter while maintaining all functionality
- Rewrite export_to_html with Bootstrap 5 styling
- Add Chart.js visualizations (severity, type, category, source)
- Add executive summary dashboard with stat cards
- Add interactive filtering by severity, type, and search
- Add sortable table columns
- Add expandable row details with full finding information
- Add Prism.js syntax highlighting for code snippets
- Display LLM context, confidence, CWE/OWASP, recommendations
- Make responsive with print-friendly CSS
- Update extract_simplified_findings to handle native format
- Update export_to_csv to handle native format with more fields
- Fix export functions to use findings_data instead of sarif_data
- Add safe_escape helper to handle None values
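Two of the export helpers named above lend themselves to a short sketch. The bodies below are guesses at the intent, not the PR's code: the output keys and the native field names (rule_id, severity, description) are assumptions, while the SARIF fields (runs, results, ruleId, level, message.text) follow the SARIF 2.1.0 layout. SARIF levels are passed through unmapped because the commit does not state a mapping.

```python
import html


def safe_escape(value) -> str:
    """HTML-escape a value, treating None as an empty string."""
    return "" if value is None else html.escape(str(value))


def extract_simplified_findings(findings_data: dict) -> list:
    """Flatten either native or SARIF data into simple row dicts."""
    if "findings" in findings_data:  # native format
        return [
            {
                "rule_id": f.get("rule_id"),
                "severity": f.get("severity", "info"),
                "message": f.get("description"),
            }
            for f in findings_data["findings"]
        ]
    rows = []
    for run in findings_data.get("runs") or []:  # SARIF format
        for result in run.get("results", []):
            rows.append(
                {
                    "rule_id": result.get("ruleId"),
                    "severity": result.get("level"),  # SARIF level, unmapped
                    "message": (result.get("message") or {}).get("text"),
                }
            )
    return rows
```

Routing every template value through safe_escape keeps a missing optional field such as cwe from raising when the HTML is assembled.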
- Fix OptionInfo bug causing 'ff finding <run_id>' to crash
  - Add explicit limit=None, offset=0 parameters in main.py calls
  - Prevents OptionInfo objects from being used in arithmetic operations
- Fix command suggestions after workflow completion
  - Change 'fuzzforge findings' to 'ff finding' (correct syntax)
  - Add missing 'View findings' suggestion after submission
- Fix --fail-on help text
  - Change from 'severity' to 'SARIF level' (error,warning,note,info)
  - Matches actual implementation
- Update CLI documentation
  - Fix 'ff finding show' parameter from --rule to --id
  - Mark unimplemented AI commands as 'Coming Soon'
  - Correct 'ff ingest' documentation to match actual implementation
  - Remove fake subcommands, document actual options
Fixed multiple critical bugs identified during comprehensive code audit:
**Critical Fixes:**
- Fix file handle leaks in SDK client upload methods (sync and async)
  - Use context managers to ensure file handles are properly closed
  - Affects: sdk/src/fuzzforge_sdk/client.py lines 397, 484
**High Priority Fixes:**
- Fix IndexError in OSS-Fuzz stats parsing when accessing array elements
  - Add bounds checking before accessing parts[i+1]
  - Affects: workers/ossfuzz/activities.py lines 372-376
- Fix IndexError in exception handling URL parsing
  - Add empty string validation before splitting URL segments
  - Prevents crash when parsing malformed URLs
  - Affects: sdk/src/fuzzforge_sdk/exceptions.py lines 419-426
**Medium Priority Fixes:**
- Fix IndexError in Android workflow SARIF report parsing
  - Check if runs list is empty before accessing first element
  - Affects: backend/toolbox/workflows/android_static_analysis/workflow.py line 270
All fixes follow defensive programming practices with proper bounds checking and resource management using context managers.
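The three defensive patterns named in this commit can be sketched in a few lines. None of this is the actual code from the affected files: the function names are hypothetical, and the "key: value" stats line format is an assumption made only so the bounds check has something to guard.

```python
def read_upload_payload(path: str) -> bytes:
    """Read a file inside a context manager so the handle always closes."""
    with open(path, "rb") as fh:
        return fh.read()


def parse_stats(line: str) -> dict:
    """Parse 'key: value' pairs, skipping a trailing key with no value."""
    parts = line.split()
    stats = {}
    for i, token in enumerate(parts):
        # Bounds check before touching parts[i + 1].
        if token.endswith(":") and i + 1 < len(parts):
            stats[token.rstrip(":")] = parts[i + 1]
    return stats


def first_run_results(sarif: dict) -> list:
    """Return results of the first SARIF run, or [] when there are no runs."""
    runs = sarif.get("runs") or []
    if not runs:
        return []
    return runs[0].get("results", [])
```

In each case the unguarded version works on well-formed input and fails only on truncated or empty data, which is why such bugs tend to surface in an audit rather than in routine testing.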
Fixed failing unit tests that were using the old create_finding() signature. The native findings format refactoring added two new required parameters:
- rule_id: Identifier for the rule/pattern that detected the finding
- found_by: FoundBy object with module, tool, and detection type info
Updated tests:
- test_cargo_fuzzer.py::test_create_finding_from_crash
- test_atheris_fuzzer.py::test_create_crash_finding
Both tests now properly instantiate FoundBy objects with appropriate fuzzer metadata (module name, tool name, version, and type="fuzzer").
Fixed Pydantic validation error by importing FoundBy from modules.base instead of models.finding_schema. The ModuleFinding class in base.py uses its own FoundBy definition, and Pydantic requires the exact same class instance for validation to pass.
This resolves the validation errors:
- test_cargo_fuzzer.py::test_create_finding_from_crash
- test_atheris_fuzzer.py::test_create_crash_finding
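The underlying issue, two classes with the same name and fields that are nonetheless different types, can be reproduced without Pydantic. The sketch below uses plain dataclasses and isinstance as a stand-in for the validation step; SchemaFoundBy, BaseFoundBy, and accepts are hypothetical names, and Pydantic's real validation logic is more involved than this check.

```python
from dataclasses import dataclass


def make_found_by_class():
    """Define a fresh FoundBy class, as two separate modules would."""

    @dataclass
    class FoundBy:
        module: str
        tool: str
        type: str

    return FoundBy


# Stand-ins for models.finding_schema.FoundBy and modules.base.FoundBy.
SchemaFoundBy = make_found_by_class()
BaseFoundBy = make_found_by_class()


def accepts(found_by, expected_cls=BaseFoundBy) -> bool:
    """Mimic a type check that only passes for the expected class."""
    return isinstance(found_by, expected_cls)
```

An instance built from the wrong module's class carries identical data but fails the check, so the fix is to import the class the model was declared with.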
Overview
This PR migrates FuzzForge from SARIF-based findings to a native findings format, with significant enhancements to CLI display and HTML export functionality.
Key Changes
🏗️ Core Architecture Changes
Native Findings Format - Introduced new native finding schema replacing SARIF dependency
FindingSchema data model with required/optional fields
Database Refactoring - Updated SQLite schema for native format
Reporter Module Rewrite - Converted sarif_reporter.py → native_reporter.py
create_finding signature
🎨 CLI Enhancements
Enhanced Findings Display
Modernized HTML Export
🐛 Critical Bug Fixes
File Handle Leaks (sdk/client.py:397, 484)
IndexError Fixes
CLI Command Issues
--fail-on help text descriptions
Affected Components
Testing
✅ All bug fixes tested and verified
✅ CLI commands working with new format
✅ HTML export renders correctly with new styling
✅ Finding creation and retrieval working across all modules
Breaking Changes
.fuzzforge/ directory
create_finding() calls now use new signature
Migration Notes
Existing users should:
.fuzzforge/findings.db if needed
ff init --force to reinitialize with new schema
Related Issues
Addresses multiple issues identified in comprehensive code audit including critical resource leaks and error handling bugs.
Note: This PR supersedes the original feature/native-findings-format branch, which had unrelated git history after commit rewriting. All changes have been cleanly cherry-picked onto current dev.