mindfultatiana/malwarescanner


Malware Indicator Scanner v2.2

A Python-based static analysis tool that scans files and directories for suspicious patterns commonly found in malware. The scanner detects more than 40 indicators, including encoded PowerShell commands, memory injection techniques, persistence mechanisms, and network activity. It now supports archive extraction for ZIP and RAR files, plus dedicated archive threat detection.

Features

  • 🔍 Comprehensive Pattern Detection - Scans for 15+ categories of malicious indicators
  • 📊 Real-time Progress Bar - Shows actual scanning progress as files are analyzed
  • 🎯 Threat Scoring System - Calculates a 0-100 threat score with severity levels
  • 🧹 False Positive Filtering - Intelligently filters out binary gibberish and noise
  • 📁 Recursive Directory Scanning - Scan entire folders and subfolders
  • 📄 Detailed Reports - Line numbers, file locations, and context for each finding
  • Multi-format Support - Handles executables, scripts, text files, and archives
  • 📦 Archive Support - Automatically extracts and scans ZIP and RAR files
  • 🚨 Archive Threat Detection - Identifies archive-specific malware delivery techniques (new in v2.2)

Supported File Types

Binary Files

  • .exe - Windows executables
  • .dll - Dynamic-link libraries
  • .sys - System files

Script Files

  • .bat - Batch scripts
  • .ps1 - PowerShell scripts
  • .cmd - Command scripts
  • .vbs - VBScript files
  • .js - JavaScript files
  • .txt - Text files
  • .log - Log files

Archive Files

  • .zip - ZIP archives
  • .rar - RAR archives

Detection Categories

Archive Red Flags (New in v2.2)

Archive-specific threats detected in ZIP and RAR containers:

High Severity (7-10)

  • Path Traversal Attack (Severity 9) - Files with ../ or leading / designed to extract outside the archive directory
  • Double Extension Trick (Severity 8) - Deceptive files like document.pdf.exe masquerading as documents
  • Suspicious Archive Comment (Severity 7) - Archive metadata containing malicious keywords
  • Suspicious Archive Structure (Severity 7) - Multiple executables bundled together in unusual patterns

Medium-High Severity (6)

  • Archive Encryption (Severity 6) - Password-protected archives hiding contents
  • Possible SFX Archive (Severity 6) - Self-extracting archives with potential auto-execute capabilities

Medium Severity (5)

  • Potential Archive Bomb (Severity 5) - Extreme compression ratios (>1000:1) indicating DoS attacks
  • Deep Archive Nesting (Severity 5) - Deeply nested directories (>5 levels) used to hide malware
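
As a rough illustration of how several of these red flags could be derived from ZIP metadata alone, here is a minimal sketch using the standard zipfile module. The function name find_archive_red_flags and the two extension sets are hypothetical and chosen for this example; the scanner's own checks and lists may differ.

```python
import zipfile
from pathlib import PurePosixPath

# Hypothetical extension sets for illustration; the real scanner's lists may differ.
EXECUTABLE_EXTS = {".exe", ".dll", ".sys", ".bat", ".cmd", ".ps1", ".vbs", ".js"}
DOCUMENT_EXTS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".txt", ".jpg", ".png"}


def find_archive_red_flags(zf: zipfile.ZipFile) -> dict:
    """Inspect ZIP metadata only (nothing is extracted) and return red flags."""
    flags = {}
    total_compressed = 0
    total_uncompressed = 0

    for info in zf.infolist():
        name = info.filename.replace("\\", "/")
        path = PurePosixPath(name)

        # Path traversal: ".." components or a leading "/".
        if name.startswith("/") or ".." in path.parts:
            flags.setdefault("Path Traversal Attack", []).append(name)

        # Double extension: a document-like suffix followed by an executable one.
        suffixes = [s.lower() for s in path.suffixes]
        if (len(suffixes) >= 2 and suffixes[-1] in EXECUTABLE_EXTS
                and suffixes[-2] in DOCUMENT_EXTS):
            flags.setdefault("Double Extension Trick", []).append(name)

        # Bit 0 of the general-purpose flags marks an encrypted entry.
        if info.flag_bits & 0x1:
            flags.setdefault("Archive Encryption", []).append(name)

        # Deep nesting: more than 5 directory levels above the file.
        if not info.is_dir() and len(path.parts) - 1 > 5:
            flags.setdefault("Deep Archive Nesting", []).append(name)

        total_compressed += info.compress_size
        total_uncompressed += info.file_size

    # Archive bomb: overall compression ratio above 1000:1.
    if total_compressed > 0 and total_uncompressed / total_compressed > 1000:
        ratio = total_uncompressed / total_compressed
        flags["Potential Archive Bomb"] = [f"{ratio:.0f}:1"]

    return flags
```

Calling find_archive_red_flags(zipfile.ZipFile("suspicious.zip")) returns a dictionary keyed by red flag name. Because only the central directory is read, the check can run before any file is written to disk.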

Malware Pattern Categories

High Severity (7-10)

  • Encoded PowerShell - Hidden/encoded PowerShell commands
  • Certificate Utility Abuse - certutil download techniques
  • BITS Transfer - Background Intelligent Transfer Service abuse
  • Registry Persistence - Auto-run registry modifications
  • Memory Injection APIs - VirtualAlloc, WriteProcessMemory, CreateRemoteThread
  • PowerShell Download/Execute - Invoke-Expression, DownloadString

Medium Severity (5-6)

  • DLL Execution - rundll32, regsvr32 abuse
  • Scheduled Tasks - Task creation for persistence
  • Service Creation - Windows service installation
  • WinAPI Indicators - Suspicious API usage
  • Network APIs - HTTP request functions

Low Severity (2-4)

  • Command Execution - powershell, cmd, rundll32
  • Network Indicators - HTTP/HTTPS URLs
  • Temp Path Markers - Temporary directory usage
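
The categories above map naturally onto a dictionary of regular expressions with a severity per category. The sketch below shows one way such matching could work; the regexes, the severity values, and the scan_text helper are illustrative assumptions, not the scanner's actual definitions.

```python
import re

# Illustrative entries only; the scanner's actual regexes and severities may differ.
PATTERNS = {
    "Encoded PowerShell": {
        "patterns": [r"powershell(?:\.exe)?\s+.*-(?:enc|encodedcommand)\b"],
        "severity": 9,
    },
    "Memory Injection APIs": {
        "patterns": [r"\b(?:VirtualAlloc|WriteProcessMemory|CreateRemoteThread)\b"],
        "severity": 8,
    },
    "Network Indicators": {
        "patterns": [r"https?://[^\s\"']+"],
        "severity": 3,
    },
}


def scan_text(text: str) -> list:
    """Return one finding per matching category per line, with line numbers."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, entry in PATTERNS.items():
            for pattern in entry["patterns"]:
                if re.search(pattern, line, re.IGNORECASE):
                    findings.append({
                        "category": category,
                        "severity": entry["severity"],
                        "line": lineno,
                        "context": line.strip()[:120],
                    })
                    break  # one finding per category per line is enough
    return findings
```

Each finding carries the line number and a trimmed copy of the matching line, which is the kind of detail the report sections show.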

Installation

Prerequisites

  • Python 3.6 or higher
  • For ZIP support: included with Python standard library
  • For RAR support: optional rarfile module

Setup

  1. Download the malware_scanner.py file
  2. Make it executable (optional):
    chmod +x malware_scanner.py

Optional: RAR Support

To scan RAR archives, install the optional rarfile module:

pip install rarfile

If rarfile is not installed, the scanner will continue to work with all other file types and display a helpful message when encountering RAR files.
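
An optional dependency like this is usually handled with a guarded import. The following is a minimal sketch of that approach; the RAR_SUPPORT flag and the can_scan helper are hypothetical names, and the message text is only an example.

```python
try:
    import rarfile  # optional third-party module: pip install rarfile
    RAR_SUPPORT = True
except ImportError:
    rarfile = None
    RAR_SUPPORT = False


def can_scan(path: str) -> bool:
    """Return False (with a hint) for RAR files when rarfile is unavailable."""
    if path.lower().endswith(".rar") and not RAR_SUPPORT:
        print(f"Skipping {path}: RAR support requires 'pip install rarfile'")
        return False
    return True
```

Note that rarfile itself relies on an external extraction tool such as unrar being available on the system for compressed archives.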

Usage

Scan a Single File

python malware_scanner.py suspicious.exe

Scan an Archive File

# Scan ZIP archive
python malware_scanner.py suspicious.zip

# Scan RAR archive
python malware_scanner.py suspicious.rar

Scan a Directory (Recursive)

# Windows
python malware_scanner.py C:\Downloads\

# Linux/Mac
python malware_scanner.py /home/user/downloads/

Example Output

Clean File

================================================================================
SCAN RESULTS
================================================================================

✓ NO MALWARE INDICATORS FOUND

The scanned file(s) appear to be clean.
No suspicious patterns were detected.

Suspicious Archive

================================================================================
SCAN RESULTS
================================================================================

⚠ THREAT DETECTED

Threat Level: HIGH
Threat Score: 72.5/100
Indicators Found: 8

[████████████████████████████░░░░░░░░░░░░░░] 72.5%

--------------------------------------------------------------------------------
ARCHIVE RED FLAGS
--------------------------------------------------------------------------------

[Double Extension Trick] - Severity: HIGH
  Occurrences: 2

  🔹 File: suspicious.zip
     Location: C:\Users\Downloads
     Count: 2
     • Files: document.pdf.exe, invoice.xlsx.exe

[Path Traversal Attack] - Severity: HIGH
  Occurrences: 1

  🔹 File: suspicious.zip
     Location: C:\Users\Downloads
     Count: 1
     • Paths: ../../Windows/System32/malware.exe

[Suspicious Archive Structure] - Severity: HIGH
  Occurrences: 1

  🔹 File: suspicious.zip
     Location: C:\Users\Downloads
     Count: 1
     • Executable count: 4

--------------------------------------------------------------------------------
MALWARE PATTERN DETECTIONS
--------------------------------------------------------------------------------

[Memory Injection APIs] - Severity: HIGH
  Occurrences: 3

  🔹 File: payload.exe
     Location: C:\Users\Downloads\suspicious_archive_contents
     Count: 3
     • Line 1234: VirtualAlloc called with RWX permissions
     • Line 2345: WriteProcessMemory detected
     • Line 3456: CreateRemoteThread found

================================================================================
RECOMMENDATIONS
================================================================================

⚠ CRITICAL: This file shows multiple high-risk indicators.
  • Do NOT execute this file
  • Submit to VirusTotal or a sandbox for analysis
  • Consider this file potentially malicious

Scan completed in 3.52 seconds

Threat Levels

Score     Level      Description
0         CLEAN      No malware indicators found
1-19      LOW        Minimal suspicious patterns, likely false positives
20-39     MODERATE   Some suspicious activity, verify source
40-59     ELEVATED   Multiple indicators, exercise caution
60-79     HIGH       Significant malicious patterns detected
80-100    CRITICAL   Severe threat, do not execute
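
The score bands translate directly into a small lookup function. This sketch assumes that fractional scores (such as 72.5) fall into the band whose lower bound they have reached, and that any score above 0 but below 1 counts as LOW; the function name threat_level is hypothetical.

```python
def threat_level(score: float) -> str:
    """Map a 0-100 threat score to the level names used in the table."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score == 0:
        return "CLEAN"
    if score < 20:
        return "LOW"
    if score < 40:
        return "MODERATE"
    if score < 60:
        return "ELEVATED"
    if score < 80:
        return "HIGH"
    return "CRITICAL"
```

For example, the score of 72.5 from the sample report maps to HIGH.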

Report Sections

The scanner generates detailed reports with separate sections for easy interpretation:

Archive Red Flags (New in v2.2)

When scanning archives, this section displays archive-specific threats including:

  • Password Protection - Encrypted archives hiding contents
  • Path Traversal Attacks - Files designed to extract outside the archive directory
  • Double Extension Tricks - Files like document.pdf.exe to deceive users
  • Archive Bombs - Extreme compression ratios indicating potential DoS attacks
  • Suspicious Archive Structure - Multiple executables or unusual organization
  • Deep Nesting - Deeply nested directories hiding malware
  • SFX Indicators - Possible self-extracting archive with auto-execute capabilities
  • Suspicious Archive Comments - Metadata containing malicious commands

Archive red flags are displayed in a dedicated section before file content analysis, with severity levels (5-9) to indicate the threat level of each archive characteristic.

Malware Pattern Detections

This section shows traditional malware indicators found in file contents:

  • Encoded PowerShell commands
  • Memory injection APIs
  • Registry persistence mechanisms
  • Network communication attempts
  • And all other 15+ detection categories

Both sections include file locations, occurrence counts, and specific details for investigation.

How It Works

  1. File Discovery - Recursively finds all supported files in the target directory
  2. Archive Analysis - Scans archive metadata and structure for red flags before extraction
  3. Archive Handling - Automatically extracts ZIP and RAR files to temporary directories
  4. Pattern Matching - Uses regular expressions to search for malicious indicators
  5. Context Extraction - Captures surrounding code for each match
  6. False Positive Filtering - Removes gibberish matches from binary files
  7. Cleanup - Safely removes temporary files after archive scanning
  8. Severity Weighting - Calculates threat score based on pattern severity
  9. Report Generation - Produces human-readable analysis with separate sections for archive threats and content patterns
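
The file discovery step could look roughly like the sketch below, which walks a directory tree and keeps files whose extension appears under Supported File Types. The names SUPPORTED_EXTENSIONS and discover_files are hypothetical, and returning a single file path as-is is an assumption about how the scanner treats a file argument.

```python
import os

# Extensions taken from the Supported File Types section.
SUPPORTED_EXTENSIONS = {
    ".exe", ".dll", ".sys",                                  # binaries
    ".bat", ".ps1", ".cmd", ".vbs", ".js", ".txt", ".log",   # scripts and text
    ".zip", ".rar",                                          # archives
}


def discover_files(target: str) -> list:
    """Return the files to scan for a file or directory target."""
    if os.path.isfile(target):
        return [target]  # assumption: an explicit file is always scanned

    found = []
    for root, _dirs, files in os.walk(target):
        for name in files:
            # Compare extensions case-insensitively (e.g. ".EXE").
            if os.path.splitext(name)[1].lower() in SUPPORTED_EXTENSIONS:
                found.append(os.path.join(root, name))
    return sorted(found)
```

Collecting the full list up front also gives a total file count, which is what a progress bar needs in order to show real progress.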

What's New in v2.2

Separated Report Sections

  • Archive Red Flags Section - Dedicated display for archive-specific threats detected in ZIP/RAR containers
  • Malware Pattern Detections Section - Traditional pattern matches found in file contents
  • Both sections are displayed in the same report for comprehensive analysis
  • Archive threats are evaluated and displayed before content-based detections for priority assessment

Archive Threat Severity Classification

Archive red flags are now fully categorized by severity level:

  • High Severity (7-10) - Path traversal, double extensions, malicious comments, suspicious structures
  • Medium-High Severity (6) - Password protection, SFX indicators
  • Medium Severity (5) - Archive bombs with extreme compression ratios and deep nesting

Enhanced Report Clarity

  • Archive findings are grouped separately but included in overall threat scoring
  • Each threat type shows occurrence count, file location, and specific details
  • Archive comment scanning highlights suspicious keywords in metadata
  • Compression ratio analysis helps identify potential archive bomb attacks

What's New in v2.1

Archive Support

  • ZIP Scanning: Automatically extracts and scans ZIP file contents using Python's built-in zipfile module
  • RAR Scanning: Extracts and scans RAR files using the optional rarfile module
  • Temporary File Handling: Archives are extracted to temporary directories and securely cleaned up after scanning
  • Archive Contents: Archives are extracted one level deep and all contained files are scanned using the same pattern detection; archives inside archives are not extracted recursively

Archive Security Detection

  • Password-Protected Archives: Flags encrypted archives that may be hiding malicious content
  • Path Traversal Attacks: Detects ../ and leading / in file paths designed to extract outside directories
  • Double Extension Tricks: Identifies deceptive files like document.pdf.exe bundled in archives
  • Archive Bomb Detection: Calculates compression ratios and flags extreme compression (>1000:1)
  • Suspicious Archive Structure: Alerts when multiple executables are bundled together
  • Deep Directory Nesting: Flags archives with deeply nested directories (>5 levels) used to hide malware
  • SFX Indicators: Detects possible self-extracting archives with auto-execute capabilities
  • Archive Comment Scanning: Analyzes archive metadata for suspicious keywords like 'powershell', 'execute', 'iex'

Implementation Details

  • Archives are extracted to isolated temporary directories for safe analysis
  • All extracted files are scanned with the same suspicious pattern detection as standalone files
  • Archive metadata is analyzed before extraction for structural threats
  • Temporary files are automatically cleaned up after scanning completes
  • Progress tracking works seamlessly with archive contents
  • If RAR module is not installed, a helpful message is displayed
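
A common way to get isolated extraction with automatic cleanup is tempfile.TemporaryDirectory. The sketch below shows the idea for ZIP files; scan_zip_contents and its scan_file callback are hypothetical names, and the scanner's actual implementation may be organized differently.

```python
import os
import tempfile
import zipfile


def scan_zip_contents(zip_path: str, scan_file) -> list:
    """Extract a ZIP to a temporary directory, scan each file, then clean up.

    scan_file is any callable that takes a file path and returns a result.
    """
    results = []
    # TemporaryDirectory deletes the directory and its contents on exit,
    # even if scanning raises an exception.
    with tempfile.TemporaryDirectory(prefix="scan_") as tmp_dir:
        with zipfile.ZipFile(zip_path) as zf:
            # zipfile attempts to sanitize member names (dropping ".."
            # components and leading slashes) so files stay inside tmp_dir.
            zf.extractall(tmp_dir)
        for root, _dirs, files in os.walk(tmp_dir):
            for name in files:
                path = os.path.join(root, name)
                relative = os.path.relpath(path, tmp_dir)
                results.append((relative, scan_file(path)))
    return results
```

Results are keyed by the path relative to the extraction directory, because the temporary paths no longer exist once the function returns.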

Limitations

⚠️ This is a static analysis tool - It only searches for patterns in file contents and does not:

  • Execute or sandbox files
  • Perform dynamic analysis
  • Detect polymorphic or encrypted malware
  • Replace professional antivirus software
  • Guarantee 100% detection or accuracy
  • Extract nested archives (only the top-level archive is extracted)

False Positives: Legitimate software (especially installers, system utilities, and developer tools) may trigger alerts. Always verify the source and context.

False Negatives: Sophisticated malware using obfuscation, encryption, or novel techniques may not be detected.

Archive Limitations: Very large archive files may take time to extract and scan. Nested or multi-level archives are not recursively extracted.

Security Best Practices

  1. Run in a Safe Environment - Scan suspicious files in a VM or sandboxed environment
  2. Don't Execute Flagged Files - If something scores HIGH or CRITICAL, do not run it
  3. Verify Sources - Only download software from trusted sources
  4. Use Multiple Tools - Combine with VirusTotal, antivirus, and sandbox analysis
  5. Keep Updated - Malware evolves; update detection patterns regularly
  6. Archive Caution - Exercise extra care with unknown archives, especially RAR files

Use Cases

  • 🔬 Malware Analysis - Initial triage of suspicious files and archives
  • 🛡️ Security Auditing - Scan downloads or email attachments including compressed files
  • 📚 Education - Learn about malware indicators and techniques
  • 🢂 Incident Response - Quick assessment of potentially compromised systems
  • 🧪 Development - Verify your software doesn't trigger false positives

Contributing

To add new detection patterns:

  1. Open malware_scanner.py
  2. Locate the self.patterns dictionary in __init__
  3. Add your pattern following this format:
'Category Name': {
    'patterns': [
        r'regex_pattern_here',
        r'another_pattern',
    ],
    'severity': 7  # 1-10 scale
},
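
Before adding an entry, it can help to check that its regexes compile and its severity is in range. The helper below is a hypothetical sketch for that purpose and is not part of malware_scanner.py.

```python
import re


def validate_entry(name: str, entry: dict) -> None:
    """Raise ValueError if a pattern entry is malformed."""
    severity = entry.get("severity")
    if not isinstance(severity, int) or not 1 <= severity <= 10:
        raise ValueError(f"{name}: severity must be an integer from 1 to 10")

    patterns = entry.get("patterns")
    if not patterns:
        raise ValueError(f"{name}: at least one pattern is required")

    for pattern in patterns:
        try:
            re.compile(pattern)  # catches syntax errors such as an unclosed group
        except re.error as exc:
            raise ValueError(f"{name}: invalid regex {pattern!r}: {exc}") from exc
```

For example, validate_entry('Category Name', {'patterns': [r'regex_pattern_here'], 'severity': 7}) returns without error, while an entry with severity 11 or an unbalanced parenthesis raises ValueError.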

To add support for additional archive formats:

  1. Import the appropriate library at the top of the file
  2. Create a new scan_[format]_file() method following the existing archive patterns
  3. Add the extension to self.archive_extensions
  4. Update the file type handling in scan_single_file() and scan_directory()

License

This tool is provided for educational and security research purposes. Use responsibly and ethically.

Disclaimer

This software is provided "as is" without warranty of any kind. The authors are not responsible for any damage or legal issues arising from its use. Always comply with applicable laws and regulations when analyzing files.


Author: Tatiana Mathis
Version: 2.2
Last Updated: 2025

Changelog:

  • v2.2: Added separate Archive Red Flags report section and improved archive threat classification
  • v2.1: Added ZIP and RAR archive scanning support with archive security detection
  • v2.0: Initial release with comprehensive pattern detection

For issues or suggestions, please create an issue or contribute to the project.
