Commit f089914
Restructure gallery: separate dataset issues and update all metrics
Major improvements to gallery organization and accuracy:
1. Separated Dataset Issues into Dedicated Section
• Created new expandable "Dataset Issues (18)" section
• Cyan theme (#06b6d4) to distinguish from errors
• Removed from errors grid filter buttons
• Independent expand/collapse functionality
• Clear description about exclusion from accuracy
2. Updated Errors Section
• Now shows only the 45 real errors (down from 63 cards, which included the 18 dataset issues)
• 5 filter buttons: All, Incorrect Parse, Missed Parse,
Not ADE Focus, Prompt/LLM Misses
• Updated "All" button: 63 → 45
• Simplified descriptions
3. Updated All Hardcoded Metrics (99.082% → 99.156%)
• Title: "99.156% Accuracy"
• Header subtitle updated
• Error count: 49 → 45
• Dataset issues: 14 → 18
• Correct answers: 5,286/5,335 → 5,286/5,331
• Improvement: +3.72pp → +3.80pp
• Footer accuracy updated
• All JavaScript console logs updated
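The updated figures can be re-derived from the counts listed above. The following is a minimal sketch for checking the arithmetic, not code from this commit. The helper name accuracy_pct is hypothetical, and it assumes the 4 reclassified cards were dropped from the denominator, as the change from 5,335 to 5,331 suggests.

```python
# Sanity check for the hardcoded metrics. The helper name is hypothetical;
# the counts are copied from the commit message.

def accuracy_pct(correct: int, total: int) -> float:
    """Return accuracy as a percentage rounded to three decimals."""
    return round(100 * correct / total, 3)

CORRECT = 5286

# Before: 49 errors counted against 5,335 evaluated items.
before = accuracy_pct(CORRECT, 5335)
# After: 4 cards reclassified as dataset issues and excluded (assumption),
# leaving 45 errors against 5,331 items.
after = accuracy_pct(CORRECT, 5331)

# Per-category error counts should add up to the new error total.
categories = {
    "Incorrect Parse": 13,
    "Missed Parse": 5,
    "Not ADE Focus": 9,
    "Prompt/LLM Misses": 18,
}
real_errors = sum(categories.values())

print(before, after, real_errors)
```

Under these assumptions the two percentages match the old and new titles, and the category counts sum to the 45 errors shown under the "All" button.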
4. Technical Improvements
• Used BeautifulSoup for proper HTML parsing
• Filter JavaScript now targets only #errors-container
• Prevents filter interference with dataset section
• Proper grid layout maintained (3 columns)
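One way the dataset-issue cards could be moved out of the errors grid with BeautifulSoup is sketched below. This is an illustration under assumptions, not the script used in this commit: only the #errors-container id comes from the commit message, while the card class, the data-category attribute, and the dataset-issues-container id are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the #errors-container id appears in the commit
# message; the class and attribute names are assumptions for illustration.
HTML = """
<div id="errors-container">
  <div class="card" data-category="incorrect-parse">A</div>
  <div class="card" data-category="dataset-issue">B</div>
  <div class="card" data-category="missed-parse">C</div>
</div>
"""

def split_dataset_issues(html: str) -> BeautifulSoup:
    """Move dataset-issue cards into their own container after the errors grid."""
    soup = BeautifulSoup(html, "html.parser")
    errors = soup.find(id="errors-container")

    # New sibling container for the dedicated section (hypothetical id).
    section = soup.new_tag("div", attrs={"id": "dataset-issues-container"})
    errors.insert_after(section)

    # select() returns a list, so the tree can be modified while iterating.
    for card in errors.select('.card[data-category="dataset-issue"]'):
        section.append(card.extract())
    return soup

soup = split_dataset_issues(HTML)
error_cards = soup.select("#errors-container .card")
dataset_cards = soup.select("#dataset-issues-container .card")
print(len(error_cards), len(dataset_cards))
```

Because the moved cards no longer sit inside #errors-container, a filter that queries only that container cannot hide or show them, which is the interference the bullets above describe.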
Result: Clean separation between system errors and annotation
quality issues. All metrics now accurately reflect the 4 cards
reclassified as dataset issues.
Category breakdown:
• Real errors: 45 (Incorrect Parse: 13, Missed Parse: 5,
Not ADE Focus: 9, Prompt/LLM Misses: 18)
• Dataset issues: 18 (shown separately)
• Final accuracy: 99.156% (5,286/5,331)
1 parent e9e03c1, commit f089914
1 file changed, +6093 -6738 lines changed