Anomaly Detection in Time-Series Data

Conformal LSTM for Anomaly Detection in Julia

Project Overview

This project implements a high-performance LSTM (Long Short-Term Memory) network in Julia to detect anomalies in streaming time-series data. Using the Numenta Anomaly Benchmark (NAB), I developed a system that not only predicts values but also places a statistically grounded “safety margin” around each prediction using Conformal Prediction.

Methodology

1. High-Performance Modeling (Julia & Flux)

I chose the Julia programming language for its near-C execution speed and implemented the LSTM architecture with the Flux.jl library. This allows for rapid training and inference on the AWS CloudWatch and traffic datasets in NAB.

2. Conformal Prediction Framework

Most forecasting models return only a point prediction, with no indication of how far off it might be. This project uses Conformal Prediction to derive a dynamic threshold from the distribution of historical prediction errors. If a new data point falls outside the resulting interval, it is flagged as an anomaly at a chosen confidence level (e.g., 95%).
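
As an illustration of the thresholding idea, here is a minimal sketch of a split-conformal threshold in plain Julia. It is not the project's actual code: the function names `conformal_threshold` and `is_anomaly`, the calibration residuals, and the choice of absolute error as the score are all assumptions made for this example.

```julia
# Conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest calibration score.
# `alpha` is the tolerated miscoverage rate, so alpha = 0.05 corresponds to 95%.
function conformal_threshold(scores::AbstractVector{<:Real}, alpha::Real)
    0 < alpha < 1 || throw(ArgumentError("alpha must lie strictly between 0 and 1"))
    n = length(scores)
    k = ceil(Int, (n + 1) * (1 - alpha))
    k > n && return Inf          # too few calibration points for this alpha
    return sort(scores)[k]
end

# A point is flagged when its absolute forecast error exceeds the threshold.
is_anomaly(y, yhat, q) = abs(y - yhat) > q

# Hypothetical absolute forecast errors from a calibration period
# that is assumed to contain no anomalies.
calibration_errors = [0.3, 0.1, 0.4, 0.1, 0.5, 0.9, 0.2, 0.6, 0.5, 0.3,
                      0.5, 0.8, 0.9, 0.7, 0.9, 0.3, 0.2, 0.3, 0.8, 0.4]

q = conformal_threshold(calibration_errors, 0.05)

println("threshold = ", q)
println("observed 12.0, forecast 10.5 -> anomaly? ", is_anomaly(12.0, 10.5, q))
println("observed 10.8, forecast 10.5 -> anomaly? ", is_anomaly(10.8, 10.5, q))
```

With 20 calibration errors and alpha = 0.05, the rank is ceil(21 × 0.95) = 20, so the threshold is the largest calibration error. To keep the threshold dynamic on a stream, one option would be to recompute it over a sliding window of recent errors.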

3. Supervised vs. Unsupervised Pipelines

  • Supervised: Trained on known anomaly patterns to recognize specific signatures.
  • Unsupervised: A forecasting-based approach that identifies “surprises” in data it has never seen before.
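
The forecasting-based (unsupervised) idea can be sketched as follows: slide a window over the series, forecast the next value, and score each point by how far the observation lands from the forecast. This is a hedged illustration only. The helper names are hypothetical, the toy series is made up, and a simple window mean stands in for the LSTM forecaster so that the example runs without any packages.

```julia
# Split a series into (window, next value) pairs for one-step-ahead forecasting.
function sliding_windows(series::AbstractVector{<:Real}, width::Integer)
    last_start = length(series) - width
    inputs  = [series[i:i+width-1] for i in 1:last_start]
    targets = [series[i+width] for i in 1:last_start]
    return inputs, targets
end

# Stand-in forecaster (assumption): the mean of the window.
# In the project, a trained LSTM would play this role.
window_mean_forecast(window) = sum(window) / length(window)

# Surprise score: absolute one-step-ahead forecast error for each target point.
function surprise_scores(series, width; forecast = window_mean_forecast)
    inputs, targets = sliding_windows(series, width)
    return [abs(y - forecast(w)) for (w, y) in zip(inputs, targets)]
end

# Toy series with a single spike at index 7.
series = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 25.0, 10.0, 10.0]
width = 3

scores = surprise_scores(series, width)
spike_index = argmax(scores) + width   # map the score position back to the series

println("surprise scores: ", scores)
println("largest surprise at series index ", spike_index)
```

On this toy series the largest score belongs to index 7, the spike. The two points after it also receive nonzero scores because the spike sits inside their input windows, which is one reason a calibrated threshold is preferable to flagging every nonzero error.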

Evaluation Results

The model was evaluated against the Numenta Anomaly Benchmark (NAB). Below are the top-performing categories for both the supervised and unsupervised pipelines.

Full Experimental Results

The sections below contain the complete results.


Complete Analysis Notebook

To view the full Julia implementation, mathematical derivations for the conformal thresholds, and performance logs, please visit the technical notebook page.

Project Documentation

The complete methodology, data analysis, and model performance metrics are detailed in the full technical report.
