A Python-based static analysis tool that scans files and directories for suspicious patterns commonly found in malware. The scanner detects more than 40 indicators, including encoded PowerShell commands, memory injection techniques, persistence mechanisms, and network activity. It also extracts ZIP and RAR archives and checks them for archive-specific threats.
- 🔍 Comprehensive Pattern Detection - Scans for 15+ categories of malicious indicators
- 📊 Real-time Progress Bar - Shows actual scanning progress as files are analyzed
- 🎯 Threat Scoring System - Calculates a 0-100 threat score with severity levels
- 🧹 False Positive Filtering - Intelligently filters out binary gibberish and noise
- 📁 Recursive Directory Scanning - Scan entire folders and subfolders
- 📄 Detailed Reports - Line numbers, file locations, and context for each finding
- ⚡ Multi-format Support - Handles executables, scripts, text files, and archives
- 📦 Archive Support - Automatically extracts and scans ZIP and RAR files
- 🚨 Archive Threat Detection - Identifies archive-specific malware delivery techniques (new in v2.2)
- `.exe` - Windows executables
- `.dll` - Dynamic-link libraries
- `.sys` - System files
- `.bat` - Batch scripts
- `.ps1` - PowerShell scripts
- `.cmd` - Command scripts
- `.vbs` - VBScript files
- `.js` - JavaScript files
- `.txt` - Text files
- `.log` - Log files
- `.zip` - ZIP archives
- `.rar` - RAR archives
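As an illustration of how recursive discovery over these file types could work, here is a minimal sketch. The names `SUPPORTED_EXTENSIONS` and `find_supported_files` are hypothetical and are not taken from `malware_scanner.py`; only the extension list comes from this README.

```python
from pathlib import Path

# Extensions copied from the supported file types listed above.
SUPPORTED_EXTENSIONS = {
    ".exe", ".dll", ".sys",
    ".bat", ".ps1", ".cmd", ".vbs", ".js", ".txt", ".log",
    ".zip", ".rar",
}


def find_supported_files(root):
    """Return supported files under root (or root itself if it is a file)."""
    root = Path(root)
    if root.is_file():
        return [root] if root.suffix.lower() in SUPPORTED_EXTENSIONS else []
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Matching on the lowercased suffix means `PAYLOAD.EXE` and `payload.exe` are treated the same way.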
Archive-specific threats detected in ZIP and RAR containers:
- Path Traversal Attack (Severity 9) - Files with `../` or a leading `/` designed to extract outside the archive directory
- Double Extension Trick (Severity 8) - Deceptive files like `document.pdf.exe` masquerading as documents
- Suspicious Archive Comment (Severity 7) - Archive metadata containing malicious keywords
- Suspicious Archive Structure (Severity 7) - Multiple executables bundled together in unusual patterns
- Archive Encryption (Severity 6) - Password-protected archives hiding contents
- Possible SFX Archive (Severity 6) - Self-extracting archives with potential auto-execute capabilities
- Potential Archive Bomb (Severity 7) - Extreme compression ratios (>1000:1) indicating DoS attacks
- Deep Archive Nesting (Severity 5) - Deeply nested directories (>5 levels) used to hide malware
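Two of the checks above, path traversal and double extensions, only need the entry names, so they can run before anything is extracted. The sketch below shows one way to do that with the standard `zipfile` module. The function names and the two extension sets are assumptions made for this example, not the scanner's actual tables.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Hypothetical extension sets chosen for this sketch.
EXECUTABLE_EXTS = {".exe", ".scr", ".bat", ".cmd", ".ps1", ".vbs", ".js"}
DECOY_EXTS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".txt", ".jpg", ".png"}


def is_path_traversal(name):
    """True if an entry name is absolute or contains a '..' component."""
    normalized = name.replace("\\", "/")
    return normalized.startswith("/") or ".." in PurePosixPath(normalized).parts


def is_double_extension(name):
    """True for names like document.pdf.exe (decoy extension, then executable)."""
    path = PurePosixPath(name.replace("\\", "/"))
    suffixes = [suffix.lower() for suffix in path.suffixes]
    return (
        len(suffixes) >= 2
        and suffixes[-1] in EXECUTABLE_EXTS
        and suffixes[-2] in DECOY_EXTS
    )


def archive_name_flags(zip_source):
    """Inspect entry names only; nothing is written to disk."""
    flags = {"Path Traversal Attack": [], "Double Extension Trick": []}
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if is_path_traversal(name):
                flags["Path Traversal Attack"].append(name)
            if is_double_extension(name):
                flags["Double Extension Trick"].append(name)
    return flags
```

Requiring a known decoy extension before the executable one keeps versioned names such as `setup-1.2.exe` from being flagged.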
- Encoded PowerShell - Hidden/encoded PowerShell commands
- Certificate Utility Abuse - `certutil` download techniques
- BITS Transfer - Background Intelligent Transfer Service abuse
- Registry Persistence - Auto-run registry modifications
- Memory Injection APIs - `VirtualAlloc`, `WriteProcessMemory`, `CreateRemoteThread`
- PowerShell Download/Execute - `Invoke-Expression`, `DownloadString`
- DLL Execution - `rundll32`, `regsvr32` abuse
- Scheduled Tasks - Task creation for persistence
- Service Creation - Windows service installation
- WinAPI Indicators - Suspicious API usage
- Network APIs - HTTP request functions
- Command Execution - `powershell`, `cmd`, `rundll32`
- Network Indicators - HTTP/HTTPS URLs
- Temp Path Markers - Temporary directory usage
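To make the categories above concrete, here is a small sketch of regex-based detection that reports the category and line number of each hit. The pattern table is an illustrative subset written for this example, and the severity values in it are assumed. The real patterns live in `malware_scanner.py` and may differ.

```python
import re

# Illustrative subset only; regexes and severities here are assumptions.
EXAMPLE_PATTERNS = {
    "Memory Injection APIs": {
        "patterns": [r"VirtualAlloc", r"WriteProcessMemory", r"CreateRemoteThread"],
        "severity": 9,
    },
    "PowerShell Download/Execute": {
        "patterns": [r"Invoke-Expression", r"DownloadString"],
        "severity": 8,
    },
    "Certificate Utility Abuse": {
        "patterns": [r"certutil(\.exe)?\s+.*-urlcache"],
        "severity": 8,
    },
}


def scan_text(text, patterns=EXAMPLE_PATTERNS):
    """Return (category, line_number, matched_text) for every hit."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for category, spec in patterns.items():
            for pattern in spec["patterns"]:
                match = re.search(pattern, line, re.IGNORECASE)
                if match:
                    findings.append((category, line_number, match.group(0)))
    return findings
```

Matching is case-insensitive here because Windows command names and PowerShell cmdlets are not case-sensitive.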
- Python 3.6 or higher
- For ZIP support: included with Python standard library
- For RAR support: optional `rarfile` module
- Download the `malware_scanner.py` file
- Make it executable (optional):
chmod +x malware_scanner.py
To scan RAR archives, install the optional rarfile module:
pip install rarfile
If rarfile is not installed, the scanner continues to work with all other file types and displays a helpful message when it encounters RAR files.
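An optional dependency like this is usually handled with a guarded import. The sketch below shows that pattern. `RAR_AVAILABLE` and `rar_support_message` are hypothetical names, and the message text is invented for the example rather than copied from the scanner.

```python
try:
    import rarfile  # optional third-party dependency
    RAR_AVAILABLE = True
except ImportError:
    rarfile = None
    RAR_AVAILABLE = False


def rar_support_message(path):
    """Return None when RAR files can be scanned, else a hint for the user."""
    if RAR_AVAILABLE:
        return None
    return "Skipping {}: RAR support requires 'pip install rarfile'".format(path)
```

Note that `rarfile` generally also relies on an external extraction tool such as `unrar` being installed on the system.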
python malware_scanner.py suspicious.exe
# Scan ZIP archive
python malware_scanner.py suspicious.zip
# Scan RAR archive
python malware_scanner.py suspicious.rar
# Windows
python malware_scanner.py C:\Downloads\
# Linux/Mac
python malware_scanner.py /home/user/downloads/
================================================================================
SCAN RESULTS
================================================================================
✓ NO MALWARE INDICATORS FOUND
The scanned file(s) appear to be clean.
No suspicious patterns were detected.
================================================================================
SCAN RESULTS
================================================================================
⚠ THREAT DETECTED
Threat Level: HIGH
Threat Score: 72.5/100
Indicators Found: 8
[████████████████████████████░░░░░░░░░░░░░░] 72.5%
--------------------------------------------------------------------------------
ARCHIVE RED FLAGS
--------------------------------------------------------------------------------
[Double Extension Trick] - Severity: HIGH
Occurrences: 2
🔹 File: suspicious.zip
Location: C:\Users\Downloads
Count: 2
• Files: document.pdf.exe, invoice.xlsx.exe
[Path Traversal Attack] - Severity: HIGH
Occurrences: 1
🔹 File: suspicious.zip
Location: C:\Users\Downloads
Count: 1
• Paths: ../../Windows/System32/malware.exe
[Suspicious Archive Structure] - Severity: HIGH
Occurrences: 1
🔹 File: suspicious.zip
Location: C:\Users\Downloads
Count: 1
• Executable count: 4
--------------------------------------------------------------------------------
MALWARE PATTERN DETECTIONS
--------------------------------------------------------------------------------
[Memory Injection APIs] - Severity: HIGH
Occurrences: 3
🔹 File: payload.exe
Location: C:\Users\Downloads\suspicious_archive_contents
Count: 3
• Line 1234: VirtualAlloc called with RWX permissions
• Line 2345: WriteProcessMemory detected
• Line 3456: CreateRemoteThread found
================================================================================
RECOMMENDATIONS
================================================================================
⚠ CRITICAL: This file shows multiple high-risk indicators.
• Do NOT execute this file
• Submit to VirusTotal or a sandbox for analysis
• Consider this file potentially malicious
Scan completed in 3.52 seconds
| Score | Level | Description |
|---|---|---|
| 0 | CLEAN | No malware indicators found |
| 1-19 | LOW | Minimal suspicious patterns, likely false positives |
| 20-39 | MODERATE | Some suspicious activity, verify source |
| 40-59 | ELEVATED | Multiple indicators, exercise caution |
| 60-79 | HIGH | Significant malicious patterns detected |
| 80-100 | CRITICAL | Severe threat, do not execute |
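The table maps directly to a small lookup function. This is a sketch, not the scanner's own code. The table lists whole-number ranges while the sample report shows a score of 72.5, so the function assumes a fractional score belongs to the band whose lower bound it has reached.

```python
def threat_level(score):
    """Map a 0-100 threat score to the level names used in the table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score == 0:
        return "CLEAN"
    # Assumption: fractional scores (e.g. 19.5) stay in the lower band.
    for upper_bound, level in (
        (20, "LOW"),
        (40, "MODERATE"),
        (60, "ELEVATED"),
        (80, "HIGH"),
    ):
        if score < upper_bound:
            return level
    return "CRITICAL"
```

For example, `threat_level(72.5)` returns `"HIGH"`, matching the sample report above.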
The scanner generates detailed reports with separate sections for easy interpretation:
When scanning archives, this section displays archive-specific threats including:
- Password Protection - Encrypted archives hiding contents
- Path Traversal Attacks - Files designed to extract outside the archive directory
- Double Extension Tricks - Files like `document.pdf.exe` to deceive users
- Archive Bombs - Extreme compression ratios indicating potential DoS attacks
- Suspicious Archive Structure - Multiple executables or unusual organization
- Deep Nesting - Deeply nested directories hiding malware
- SFX Indicators - Possible self-extracting archive with auto-execute capabilities
- Suspicious Archive Comments - Metadata containing malicious commands
Archive red flags are displayed in a dedicated section before file content analysis, with severity levels (5-9) to indicate the threat level of each archive characteristic.
This section shows traditional malware indicators found in file contents:
- Encoded PowerShell commands
- Memory injection APIs
- Registry persistence mechanisms
- Network communication attempts
- And all other 15+ detection categories
Both sections include file locations, occurrence counts, and specific details for investigation.
- File Discovery - Recursively finds all supported files in the target directory
- Archive Handling - Automatically extracts ZIP and RAR files to temporary directories
- Archive Analysis - Scans archive metadata and structure for red flags before extraction
- Pattern Matching - Uses regular expressions to search for malicious indicators
- Context Extraction - Captures surrounding code for each match
- False Positive Filtering - Removes gibberish matches from binary files
- Cleanup - Safely removes temporary files after archive scanning
- Severity Weighting - Calculates threat score based on pattern severity
- Report Generation - Produces human-readable analysis with separate sections for archive threats and content patterns
- Archive Red Flags Section - Dedicated display for archive-specific threats detected in ZIP/RAR containers
- Malware Pattern Detections Section - Traditional pattern matches found in file contents
- Both sections are displayed in the same report for comprehensive analysis
- Archive threats are evaluated and displayed before content-based detections for priority assessment
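The context extraction and false positive filtering steps can be pictured with the following sketch. The scanner's actual heuristic is not documented here, so the printable-character ratio, its 0.85 threshold, and both function names are assumptions chosen for illustration.

```python
import string

PRINTABLE = set(string.printable)


def extract_context(text, start, end, radius=40):
    """Return the match plus up to `radius` characters on either side."""
    return text[max(0, start - radius):min(len(text), end + radius)]


def looks_like_gibberish(snippet, min_printable_ratio=0.85):
    """Treat a snippet as binary noise if too few characters are printable."""
    if not snippet:
        return True
    printable = sum(1 for ch in snippet if ch in PRINTABLE)
    return printable / len(snippet) < min_printable_ratio
```

A match whose surrounding context fails this check would be dropped as noise from a binary file, while a match inside readable script text would be kept.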
Archive red flags are categorized by severity level:
- High Severity (7-9) - Path traversal, double extensions, archive bombs with extreme compression ratios, malicious comments, suspicious structures
- Medium-High Severity (6) - Password protection, SFX indicators
- Medium Severity (5) - Deep nesting
- Archive findings are grouped separately but included in overall threat scoring
- Each threat type shows occurrence count, file location, and specific details
- Archive comment scanning highlights suspicious keywords in metadata
- Compression ratio analysis helps identify potential archive bomb attacks
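Compression ratio analysis can be done from the ZIP headers alone, without decompressing anything. Below is a minimal sketch using the standard `zipfile` module. The function names are hypothetical, and the 1000:1 threshold is the one quoted in this README.

```python
import io
import zipfile

BOMB_RATIO = 1000  # threshold quoted in this README (>1000:1)


def compression_ratio(zip_source):
    """Uncompressed size divided by compressed size, read from ZIP headers."""
    with zipfile.ZipFile(zip_source) as archive:
        infos = archive.infolist()
        uncompressed = sum(info.file_size for info in infos)
        compressed = sum(info.compress_size for info in infos)
    if compressed == 0:
        return 0.0  # empty archive or only empty entries
    return uncompressed / compressed


def is_potential_bomb(zip_source, threshold=BOMB_RATIO):
    """True when the declared sizes imply a ratio above the threshold."""
    return compression_ratio(zip_source) > threshold
```

The sizes in the headers are declared values, so a crafted archive can understate them. A ratio check like this is a cheap first filter, not a guarantee.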
- ZIP Scanning: Automatically extracts and scans ZIP file contents using Python's built-in `zipfile` module
- RAR Scanning: Extracts and scans RAR files using the optional `rarfile` module
- Temporary File Handling: Archives are extracted to temporary directories and securely cleaned up after scanning
- Nested Archives: Only one level is extracted; archives found inside an archive are not unpacked, while all other contained files are scanned using the same pattern detection
- Password-Protected Archives: Flags encrypted archives that may be hiding malicious content
- Path Traversal Attacks: Detects `../` and a leading `/` in file paths designed to extract outside directories
- Double Extension Tricks: Identifies deceptive files like `document.pdf.exe` bundled in archives
- Archive Bomb Detection: Calculates compression ratios and flags extreme compression (>1000:1)
- Suspicious Archive Structure: Alerts when multiple executables are bundled together
- Deep Directory Nesting: Flags archives with deeply nested directories (>5 levels) used to hide malware
- SFX Indicators: Detects possible self-extracting archives with auto-execute capabilities
- Archive Comment Scanning: Analyzes archive metadata for suspicious keywords like 'powershell', 'execute', 'iex'
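Several of these checks read only archive metadata. The sketch below covers three of them for ZIP files: comment keywords, directory depth, and encryption. The keyword tuple contains just the three examples quoted above, and the function names are hypothetical.

```python
import io
import re
import zipfile

# Only the keywords quoted above; a real keyword list is likely longer.
SUSPICIOUS_COMMENT_KEYWORDS = ("powershell", "execute", "iex")
MAX_DEPTH = 5  # nesting deeper than this is flagged


def suspicious_comment_keywords(archive):
    """Return the keywords that appear as whole words in the archive comment."""
    comment = archive.comment.decode("utf-8", errors="replace").lower()
    return [
        keyword
        for keyword in SUSPICIOUS_COMMENT_KEYWORDS
        if re.search(r"\b" + re.escape(keyword) + r"\b", comment)
    ]


def max_nesting_depth(archive):
    """Largest number of path separators in any entry name."""
    depths = [name.strip("/").count("/") for name in archive.namelist()]
    return max(depths, default=0)


def has_encrypted_entries(archive):
    """Bit 0 of the general-purpose flags marks an encrypted ZIP entry."""
    return any(info.flag_bits & 0x1 for info in archive.infolist())
```

Each function takes an open `zipfile.ZipFile`, so one pass over the archive can feed all three checks.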
- Archives are extracted to isolated temporary directories for safe analysis
- All extracted files are scanned with the same suspicious pattern detection as standalone files
- Archive metadata is analyzed before extraction for structural threats
- Temporary files are automatically cleaned up after scanning completes
- Progress tracking works seamlessly with archive contents
- If the `rarfile` module is not installed, a helpful message is displayed
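The extract, scan, and clean-up cycle described above could look roughly like this for ZIP files. It is a sketch under assumptions: `scan_zip_contents` is a hypothetical name, and `scan_file` stands in for whatever per-file scanning function the tool uses.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def scan_zip_contents(zip_source, scan_file):
    """Extract a ZIP to a temporary directory, scan each file, then clean up."""
    results = {}
    # The directory and everything in it is removed when the block exits,
    # even if scan_file raises.
    with tempfile.TemporaryDirectory(prefix="scanner_") as tmp_dir:
        with zipfile.ZipFile(zip_source) as archive:
            # zipfile strips absolute paths and '..' components on extraction.
            archive.extractall(tmp_dir)
        for path in sorted(Path(tmp_dir).rglob("*")):
            if path.is_file():
                relative = path.relative_to(tmp_dir).as_posix()
                results[relative] = scan_file(path)
    return results
```

Using `tempfile.TemporaryDirectory` as a context manager ties cleanup to the end of the scan, so no extracted files are left behind on an error.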
The scanner does not:
- Execute or sandbox files
- Perform dynamic analysis
- Detect polymorphic or encrypted malware
- Replace professional antivirus software
- Guarantee 100% detection or accuracy
- Recursively extract nested archives (only one level is extracted)
False Positives: Legitimate software (especially installers, system utilities, and developer tools) may trigger alerts. Always verify the source and context.
False Negatives: Sophisticated malware using obfuscation, encryption, or novel techniques may not be detected.
Archive Limitations: Very large archive files may take time to extract and scan. Nested or multi-level archives are not recursively extracted.
- Run in a Safe Environment - Scan suspicious files in a VM or sandboxed environment
- Don't Execute Flagged Files - If something scores HIGH or CRITICAL, do not run it
- Verify Sources - Only download software from trusted sources
- Use Multiple Tools - Combine with VirusTotal, antivirus, and sandbox analysis
- Keep Updated - Malware evolves; update detection patterns regularly
- Archive Caution - Exercise extra care with unknown archives, especially RAR files
- 🔬 Malware Analysis - Initial triage of suspicious files and archives
- 🛡️ Security Auditing - Scan downloads or email attachments including compressed files
- 📚 Education - Learn about malware indicators and techniques
- 🢂 Incident Response - Quick assessment of potentially compromised systems
- 🧪 Development - Verify your software doesn't trigger false positives
To add new detection patterns:
- Open `malware_scanner.py`
- Locate the `self.patterns` dictionary in `__init__`
- Add your pattern following this format:
'Category Name': {
    'patterns': [
        r'regex_pattern_here',
        r'another_pattern',
    ],
    'severity': 7  # 1-10 scale
},
To add support for additional archive formats:
- Import the appropriate library at the top of the file
- Create a new `scan_[format]_file()` method following the existing archive patterns
- Add the extension to `self.archive_extensions`
- Update the file type handling in `scan_single_file()` and `scan_directory()`
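Before adding a new entry to the pattern dictionary, it can help to check that every regex compiles and that the severity is on the 1-10 scale. The helper below is a hypothetical addition for that purpose and is not part of `malware_scanner.py`. It assumes only the entry format shown above.

```python
import re


def validate_pattern_entry(name, entry):
    """Raise ValueError if a pattern dictionary entry is malformed."""
    if not name.strip():
        raise ValueError("category name must not be empty")
    severity = entry.get("severity")
    if not isinstance(severity, int) or not 1 <= severity <= 10:
        raise ValueError("{}: severity must be an integer from 1 to 10".format(name))
    patterns = entry.get("patterns")
    if not patterns:
        raise ValueError("{}: at least one pattern is required".format(name))
    for pattern in patterns:
        try:
            re.compile(pattern)
        except re.error as exc:
            raise ValueError("{}: invalid regex {!r}: {}".format(name, pattern, exc))
    return True
```

For example, `validate_pattern_entry('Scheduled Tasks', {'patterns': [r'schtasks\s+/create'], 'severity': 7})` returns `True`, while an unbalanced regex such as `r'(unclosed'` raises `ValueError`.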
This tool is provided for educational and security research purposes. Use responsibly and ethically.
This software is provided "as is" without warranty of any kind. The authors are not responsible for any damage or legal issues arising from its use. Always comply with applicable laws and regulations when analyzing files.
Author: Tatiana Mathis
Version: 2.2
Last Updated: 2025
Changelog:
- v2.2: Added separate Archive Red Flags report section and improved archive threat classification
- v2.1: Added ZIP and RAR archive scanning support with archive security detection
- v2.0: Initial release with comprehensive pattern detection
For issues or suggestions, please create an issue or contribute to the project.