Module 2 Lab: Anomaly detection foundations

Build a simple anomaly detector.

Lab Context

This lab uses synthetic security telemetry (login velocity, data transfer volume, process rarity, and threat labels) as a safe proxy for the course setting. It is not a substitute for institutional data, but it lets you practice the reasoning, metrics, and documentation pattern before working with real records.
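To make "a simple anomaly detector" concrete, here is a minimal sketch that scores telemetry like the kind described above with robust z-scores and checks the flags against the threat labels. Everything in it is an assumption for illustration: the distributions, the 5% threat rate, the cutoff of 3.0, and the names `login_velocity`, `transfer_mb`, `process_rarity`, and `is_threat` are hypothetical and do not come from the lab's baseline cell.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for repeatability
n = 500

# Hypothetical synthetic telemetry. Distributions and threat rate are assumptions.
is_threat = rng.random(n) < 0.05
login_velocity = rng.gamma(2.0, 1.5, size=n) + 4.0 * is_threat
transfer_mb = rng.lognormal(3.0, 0.6, size=n) * np.where(is_threat, 3.0, 1.0)
process_rarity = np.clip(rng.beta(2, 8, size=n) + 0.3 * is_threat, 0.0, 1.0)

# Log-transform the heavy-tailed transfer volume before scoring.
features = np.column_stack([login_velocity, np.log(transfer_mb), process_rarity])

# Robust z-score per feature: centre on the median, scale by the MAD.
# The factor 1.4826 makes the MAD comparable to a standard deviation
# for normally distributed data.
median = np.median(features, axis=0)
mad = np.median(np.abs(features - median), axis=0)
z = (features - median) / (1.4826 * mad)

# One-sided score: a record is as anomalous as its most extreme high feature.
anomaly_score = z.max(axis=1)
flagged = anomaly_score >= 3.0  # assumed cutoff, tune it as part of the lab

# Compare flags with the (synthetic) threat labels.
tp = int(np.sum(flagged & is_threat))
fp = int(np.sum(flagged & ~is_threat))
fn = int(np.sum(~flagged & is_threat))
precision = tp / max(tp + fp, 1)
recall = tp / max(tp + fn, 1)

metrics = {
    "flagged": int(flagged.sum()),
    "precision": precision,
    "recall": recall,
}
print(metrics)
```

Precision and recall are only meaningful here because the labels are synthetic and complete. With real records, labels are usually partial, so treat these numbers as practice in reading the metrics rather than as evidence of detector quality.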

Lab Tasks

  1. Run the baseline analysis.

  2. Identify the decision the metric supports.

  3. Change one threshold, score weight, or input assumption.

  4. Compare the result before and after your change.

  5. Record one deployment risk that the synthetic data cannot reveal.

import numpy as np
import matplotlib.pyplot as plt

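# Fixed seed so every run produces the same synthetic cases.
rng = np.random.default_rng(2)
n = 96

# Five synthetic factors, each drawn from a beta distribution on [0, 1].
exposure = rng.beta(2, 4, size=n)
severity = rng.beta(2.5, 2.5, size=n)
control_gap = rng.beta(3, 5, size=n)
activity = rng.beta(2, 6, size=n)
business_impact = rng.beta(2, 3, size=n)

# Weighted sum of the factors; the weights add up to 1.0.
risk_score = 0.25*exposure + 0.25*severity + 0.20*control_gap + 0.15*activity + 0.15*business_impact

# Flag the top 20% of cases (score at or above the 80th percentile) as priority.
threshold = float(np.quantile(risk_score, 0.80))
priority = risk_score >= threshold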

plt.figure(figsize=(6, 3))
plt.scatter(severity, risk_score, c=priority, cmap="coolwarm", s=24)
plt.xlabel("severity")
plt.ylabel("risk/detection priority")
plt.title("Module 2 Lab: Anomaly detection foundations")
plt.tight_layout()

summary = {
    "priority_count": int(priority.sum()),
    "threshold": threshold,
    "top_indices": np.argsort(risk_score)[-5:][::-1].tolist(),
    "review_note": "Inspect high-score cases for false positives and missing context before action.",
}
summary
{'priority_count': 20,
 'threshold': 0.4501860567503254,
 'top_indices': [70, 55, 56, 14, 16],
 'review_note': 'Inspect high-score cases for false positives and missing context before action.'}
[Figure: scatter plot of severity (x-axis) against risk/detection priority (y-axis), with points colored by priority flag.]
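Tasks 3 and 4 ask for one change and a before/after comparison. The sketch below shows one way to do that. It redraws the baseline factors with the same seed and draw order, then compares the priority set under a stricter threshold and under a reweighted score. The 0.90 quantile, the alternative weights, and the helper names `priority_mask` and `jaccard` are hypothetical choices for illustration, not part of the lab.

```python
import numpy as np

def priority_mask(score, quantile):
    """Flag cases whose score is at or above the given quantile of all scores."""
    return score >= np.quantile(score, quantile)

def jaccard(a, b):
    """Overlap of two boolean masks: |a AND b| / |a OR b|."""
    return float((a & b).sum() / (a | b).sum())

# Redraw the baseline factors: same seed, same order of draws.
rng = np.random.default_rng(2)
n = 96
factors = np.column_stack([
    rng.beta(2, 4, size=n),      # exposure
    rng.beta(2.5, 2.5, size=n),  # severity
    rng.beta(3, 5, size=n),      # control_gap
    rng.beta(2, 6, size=n),      # activity
    rng.beta(2, 3, size=n),      # business_impact
])

baseline_weights = np.array([0.25, 0.25, 0.20, 0.15, 0.15])
baseline_score = factors @ baseline_weights
before = priority_mask(baseline_score, 0.80)

# Change A (assumed): raise the cutoff from the 80th to the 90th percentile.
after_threshold = priority_mask(baseline_score, 0.90)

# Change B (assumed): move 0.10 of weight from exposure to activity.
alt_weights = np.array([0.15, 0.25, 0.20, 0.25, 0.15])
after_weights = priority_mask(factors @ alt_weights, 0.80)

comparison = {
    "count_before": int(before.sum()),
    "count_after_threshold": int(after_threshold.sum()),
    "count_after_weights": int(after_weights.sum()),
    "overlap_threshold_change": jaccard(before, after_threshold),
    "overlap_weight_change": jaccard(before, after_weights),
}
print(comparison)
```

A stricter quantile only shrinks the priority set, so every case it keeps was already flagged. A weight change keeps the count the same but can swap which cases are flagged, which is why the overlap is worth recording alongside the counts in the reflection below.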
reflection = {
    "what_changed": "",
    "metric_before": "",
    "metric_after": "",
    "interpretation": "",
    "synthetic_data_limit": "",
    "next_real_world_evidence_needed": "",
}
reflection
{'what_changed': '',
 'metric_before': '',
 'metric_after': '',
 'interpretation': '',
 'synthetic_data_limit': '',
 'next_real_world_evidence_needed': ''}