EigenFlow Profiler

An unsupervised, image-inspired PCA approach for distinguishing benign traffic from multiple attack types in NetFlow data.

Project Overview

Traditional signature-based detection often misses novel or subtle threats. Inspired by the “eigenfaces” technique from facial recognition, we encode each 77-feature NetFlow record as a 300×300 grayscale array and train group-specific PCA models (“eigenprofiles”) for benign traffic and four attack families: credential abuse, denial-of-service, exploit/malware, and application-layer abuse. By measuring the reconstruction error (L2 norm) of a flow against each profile, we can both flag anomalies and infer the likely attack type, with no labels needed at inference time.
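
The idea above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the notebook's actual code: the helper names (`fit_profile`, `reconstruction_error`, `attribute`) are hypothetical, the number of components is arbitrary, and small 8-dimensional synthetic vectors stand in for the flattened 300×300 arrays.

```python
import numpy as np

def fit_profile(X, n_components):
    """Fit one PCA 'eigenprofile': the group mean and its top principal axes."""
    mean = X.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def reconstruction_error(X, profile):
    """L2 norm of the residual after projecting each row onto the profile."""
    mean, components = profile
    centered = X - mean
    reconstructed = centered @ components.T @ components
    return np.linalg.norm(centered - reconstructed, axis=1)

def attribute(X, profiles):
    """Assign each row to the profile that reconstructs it with the lowest error."""
    names = list(profiles)
    errors = np.column_stack([reconstruction_error(X, profiles[n]) for n in names])
    return [names[i] for i in errors.argmin(axis=1)], errors

# Synthetic stand-in data (an assumption for illustration): each group varies
# mostly along its own pair of axes, plus a little noise.
rng = np.random.default_rng(0)

def synth(axes, n, dim=8):
    X = 0.01 * rng.normal(size=(n, dim))
    X[:, axes] += 5.0 * rng.normal(size=(n, len(axes)))
    return X

profiles = {
    "benign": fit_profile(synth([0, 1], 200), n_components=2),
    "dos": fit_profile(synth([2, 3], 200), n_components=2),
}
labels, errors = attribute(synth([2, 3], 5), profiles)
print(labels)
```

Each row of `errors` holds one flow's reconstruction error against every profile, so the same matrix can support both attack-type attribution (lowest error wins) and anomaly flagging (high error against the benign profile).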

Key Highlights

Eigenprofile Modeling: Converts flow features into image-like inputs for PCA, enabling interpretable basis vectors.
Unsupervised Detection: Differentiates benign from attack traffic solely via reconstruction error; no labels are needed at inference time.
Attack-Type Profiling: Four group-specific PCA models each best reconstruct their own attack class.
Clear Separation: Reconstruction error distributions show strong separation between benign and every attack family.
Scalable & Lightweight: For a fixed feature dimension and number of components, PCA fitting and inference scale linearly with the number of flows, making the approach suitable for high-volume SOC pipelines.
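
To turn reconstruction error into a benign/attack decision, some cutoff is needed. The sketch below shows one common option, a percentile threshold calibrated on errors from held-out benign flows. The 99th-percentile choice and the function names (`fit_threshold`, `flag_anomalies`) are assumptions for illustration; the project may choose its cutoff differently.

```python
import numpy as np

def fit_threshold(benign_errors, percentile=99.0):
    """Pick a cutoff so that roughly (100 - percentile)% of benign flows exceed it."""
    return float(np.percentile(benign_errors, percentile))

def flag_anomalies(errors, threshold):
    """Return a boolean mask: True where the error exceeds the benign cutoff."""
    return np.asarray(errors) > threshold

# Placeholder calibration errors (not real results): the values 1.0 through 100.0.
calibration_errors = np.arange(1.0, 101.0)
threshold = fit_threshold(calibration_errors)
flags = flag_anomalies([50.0, 99.0, 99.5, 200.0], threshold)
print(threshold, flags)
```

A higher percentile lowers the false-positive rate on benign traffic at the cost of missing attacks whose error sits close to the benign range, so the value would normally be tuned against the error distributions for each group.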

Pipeline Overview

Example Visualizations

Select outputs from the analysis:

Application-Layer Attack Samples

Original vs. PCA Reconstruction

Reconstruction Error by Group

Notebook

Complete Jupyter notebook demonstrating data prep, PCA modeling, and error-based classification.

View Notebook →

Full Report

Complete writeup including problem statement, detailed methodology and results.

Read the full report here →

Key Takeaways

PCA-based eigenprofiling separates benign traffic from multiple attack classes using reconstruction-error metrics.
Image-style encoding of flow data unlocks proven computer-vision tools for network security.
The unsupervised approach reduces reliance on labeled signatures and can improve detection of novel threats.