Federated LogTracer is a cutting-edge framework designed to enhance the efficiency, privacy, and automation of log analysis in cybersecurity. By leveraging Decentralized Federated Learning (DFL) and advanced contextual data extraction techniques, this system addresses challenges such as centralized log analysis vulnerabilities, lack of contextual insights, and high manual workloads.
Overview Federated LogTracer introduces a decentralized approach to automated log analysis, focusing on: • Preserving log owner privacy. • Addressing risks of centralized architectures. • Enhancing contextual log data representation. • Reducing the manual workload of security analysts. The framework achieves a 99.4% detection accuracy on the DARPA Transparent Computing dataset and significantly reduces computational complexity while safeguarding privacy.
Features • Decentralized Federated Learning: A robust peer-to-peer communication model for model aggregation, eliminating Single Points of Failure (SPOFs). • Contextual Data Representation: Extracts and represents meaningful data from raw logs using advanced parse graph techniques. • Automated Malicious Path Detection: Efficiently identifies malicious activities within log files. • Privacy-Preserving Architecture: Keeps sensitive log data localized while enabling robust analysis.
Folder Structure The repository is organized as follows:
- Malicious path detection Contains scripts and models for detecting malicious log paths using the DFL approach.
- LogParser Houses the log parsing algorithms, including the specialized parse graph generation process that structures raw logs into a graph format.
- GroundTruth Includes ground truth files and rule definitions used for benchmarking and evaluation against datasets like the DARPA Transparent Computing dataset.
- Model Contains the Federated Learning (FL) and Decentralized Federated Learning (DFL) implementations, including the deep neural network (DNN) architecture and training scripts.
Installation To use this repository, follow these steps:
- Clone the repository: bash Copy code git clone https://github.com/yourusername/federated-logtracer.git cd federated-logtracer
- Ensure the necessary dataset (e.g., DARPA TRACE) is downloaded and placed in the appropriate folder.
Usage
- Log Parsing: Use the LogParser module to preprocess raw logs and generate parse graphs: bash Copy code python log_parser.py --input <log_file> --output <graph_file>
- Training the Model: Run the DFL model training script: bash Copy code python train_dfl.py --config <config_file>
- Malicious Path Detection: Execute the malicious path detection module: bash Copy code python detect_path.py --input <parsed_logs> --rules <ground_truth>
Datasets The framework has been validated using the DARPA Transparent Computing dataset. Please ensure compliance with the dataset's licensing and usage terms.