Anomaly Detection in Security Logs

A modular machine learning suite for detecting anomalies across varied security log types.

Project Summary

This project explores techniques for identifying anomalous behavior in system, network, and endpoint security logs. The suite consists of multiple Jupyter notebooks, each focusing on a different detection method (e.g., statistical outlier models, clustering, autoencoders), log type (EDR, DNS, proxy, authentication), or feature engineering pipeline.

Highlights

10+ modular notebooks, each dedicated to a different log source or detection strategy
Custom feature engineering pipelines for proxy, DNS, EDR, and authentication logs
Demonstrates unsupervised, semi-supervised, and supervised learning methods
Includes visualizations, EDA, and detailed metric evaluation (ROC, F1, clustering scores)
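
As an illustration of what a per-log-type feature pipeline can look like, here is a minimal sketch that aggregates raw proxy events into one feature row per source host. The column names (src_host, dest_domain, bytes_out), the function name proxy_host_features, and the sample data are assumptions made for this example and are not taken from the notebooks.

```python
import pandas as pd


def proxy_host_features(logs: pd.DataFrame) -> pd.DataFrame:
    """Aggregate raw proxy events into one feature row per source host.

    Assumes hypothetical columns: src_host, dest_domain, bytes_out.
    """
    grouped = logs.groupby("src_host")
    return pd.DataFrame({
        "request_count": grouped.size(),                     # events per host
        "unique_domains": grouped["dest_domain"].nunique(),  # destination diversity
        "total_bytes_out": grouped["bytes_out"].sum(),       # outbound volume
        "mean_bytes_out": grouped["bytes_out"].mean(),       # typical request size
    })


# Made-up sample events, for illustration only
logs = pd.DataFrame({
    "src_host": ["ws-01", "ws-01", "ws-02", "ws-02", "ws-02"],
    "dest_domain": ["a.example", "b.example", "a.example", "a.example", "c.example"],
    "bytes_out": [1200, 800, 50_000, 75_000, 300],
})
features = proxy_host_features(logs)
print(features)
```

Per-host aggregates like these are a common input to the outlier and clustering models mentioned above; real pipelines would typically add time-windowed and categorical features as well.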

Notebooks

Notebook 1: Anomaly Detection in Proxy Logs

Applies DBSCAN and isolation forests to detect suspicious outbound connections.

View Notebook →
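
For readers unfamiliar with the two methods named above, the following is a minimal sketch of how DBSCAN and an isolation forest can each flag outlying connections using scikit-learn. The synthetic features (bytes sent, connection duration) and all parameter values (eps, min_samples, contamination) are assumptions chosen for this toy data, not the settings used in the notebook.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic per-connection features: [bytes_out, duration_seconds]
normal = rng.normal(loc=[2000.0, 1.0], scale=[200.0, 0.1], size=(200, 2))
suspicious = np.array([[90000.0, 45.0], [120000.0, 60.0]])  # injected outliers
X = StandardScaler().fit_transform(np.vstack([normal, suspicious]))

# DBSCAN: points that belong to no dense cluster receive the label -1 (noise)
db_labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(X)
db_outliers = db_labels == -1

# Isolation forest: points that are easy to isolate are predicted as -1
iso = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
iso_outliers = iso.fit_predict(X) == -1

print("DBSCAN flagged rows:", np.flatnonzero(db_outliers))
print("Isolation forest flagged rows:", np.flatnonzero(iso_outliers))
```

The two methods disagree in useful ways: DBSCAN flags anything outside a dense region, while the isolation forest always flags roughly the fraction of points given by contamination, so comparing their outputs can help prioritize alerts.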

Notebook 2: DNS Behavior Modeling

Uses time-series embeddings and window-based anomaly scoring on DNS resolution patterns.

View Notebook →
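
To make the idea of window-based anomaly scoring concrete, here is a minimal sketch that scores each time bucket by its z-score against the preceding window of buckets. It swaps in a simple rolling statistic and does not cover the time-series embeddings used in the notebook; the function name window_anomaly_scores, the bucket size, the window length, the threshold, and the sample counts are all assumptions for illustration.

```python
import numpy as np


def window_anomaly_scores(counts, window=12):
    """Score each bucket by its z-score relative to the preceding `window` buckets.

    The first `window` buckets have no full history and keep a score of 0.
    """
    counts = np.asarray(counts, dtype=float)
    scores = np.zeros(len(counts))
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        std = history.std()
        # Fall back to 1.0 for a flat history to avoid division by zero
        scores[i] = (counts[i] - history.mean()) / (std if std > 0 else 1.0)
    return scores


# Hypothetical DNS queries per 5-minute bucket for one host, with a burst at index 20
counts = [30, 32, 29, 31, 30, 33, 28, 30, 31, 29, 32, 30,
          31, 30, 29, 32, 30, 31, 29, 30, 400, 31, 30, 29]
scores = window_anomaly_scores(counts, window=12)
flagged = np.flatnonzero(np.abs(scores) > 4.0)
print("Flagged buckets:", flagged)
```

In this toy series only the burst bucket at index 20 crosses the threshold. Note that the burst then inflates the history for the buckets that follow it, which is one reason robust statistics (median and MAD) are often preferred over mean and standard deviation in practice.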

Example Visualizations

Below are selected results from the notebooks:

Key Takeaways

Different log types require different detection techniques; no single approach fits every source.
Effective detection pipelines depend on strong preprocessing and domain-aligned feature engineering.
Visualizations are critical for interpreting model results in an operational setting.