
MSc Cybersecurity — Robert Gordon University

Machine Learning Model for Spam & Phishing Detection

December 2024
Cybersecurity · Machine Learning · AI Security · Phishing · Data Science · Python

Designed and trained a supervised ML classification model on a real-world dataset for spam and phishing detection. Applied feature engineering, model selection, and evaluation metrics — and analysed the security and ethical implications of AI-based threat detection including false positive rates and deployment risks.

Overview

This MSc project sat at the intersection of cybersecurity and machine learning — applying supervised classification to a real-world spam and phishing detection problem. The project went beyond model training to critically analyse the security and ethical implications of deploying ML-based threat detection in production environments.

Problem

Spam and phishing remain among the most prevalent initial attack vectors in enterprise environments. Rule-based filters are brittle — attackers adapt quickly to signature changes. Machine learning approaches can generalise across unseen patterns, but they introduce their own failure modes that security practitioners need to understand.

Dataset & Feature Engineering

  • Large real-world labelled dataset (spam/ham/phishing classification)
  • Feature extraction: email header analysis, URL pattern features, text-based features (term frequency, n-grams), structural features (HTML ratio, link density)
  • Feature selection: assessed feature importance to identify the most discriminating signals and reduce dimensionality
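The structural features listed above (URL count, link density, HTML ratio) can be sketched with standard-library regular expressions. This is an illustrative sketch, not the project's actual extraction pipeline; the feature names and the IP-in-URL heuristic are assumptions for demonstration.

```python
import re

def extract_structural_features(raw_email: str) -> dict:
    """Illustrative structural features: URL count, link density, HTML ratio."""
    urls = re.findall(r"https?://\S+", raw_email)
    tags = re.findall(r"<[^>]+>", raw_email)
    tokens = raw_email.split()
    n_tokens = max(len(tokens), 1)
    return {
        "url_count": len(urls),
        # URLs per token -- a crude link-density proxy
        "link_density": len(urls) / n_tokens,
        # fraction of the raw message occupied by HTML markup
        "html_ratio": sum(len(t) for t in tags) / max(len(raw_email), 1),
        # raw IP addresses in URLs are a classic phishing signal (hypothetical feature)
        "has_ip_url": int(bool(re.search(r"https?://\d{1,3}(\.\d{1,3}){3}", raw_email))),
    }

sample = '<html><a href="http://192.168.0.1/login">Verify your account</a></html>'
features = extract_structural_features(sample)
```

In practice these handcrafted features would be concatenated with the TF-IDF / n-gram text features before model training.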

Model Development

Algorithms Evaluated

  • Logistic Regression (baseline)
  • Random Forest
  • Gradient Boosting (XGBoost)
  • Support Vector Machine (SVM)
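A comparison of the four algorithm families above might look like the following sketch, assuming scikit-learn and a synthetic stand-in for the labelled feature matrix. Note that sklearn's `GradientBoostingClassifier` is substituted here for XGBoost to keep the example dependency-light; F1 is used as the cross-validation score in line with the evaluation approach described in the next section.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic, imbalanced stand-in for the real email feature matrix.
X, y = make_classification(n_samples=500, n_features=20,
                           weights=[0.7, 0.3], random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "SVM": SVC(kernel="rbf"),
}

# 5-fold cross-validated F1 per model, not raw accuracy.
scores = {name: cross_val_score(m, X, y, cv=5, scoring="f1").mean()
          for name, m in models.items()}
```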

Evaluation Metrics

Standard accuracy was deliberately deprioritised in favour of security-relevant metrics:

  • Precision — proportion of flagged items that are actually malicious
  • Recall — proportion of actual malicious items that are flagged
  • F1 Score — harmonic mean balancing precision and recall
  • False Positive Rate — proportion of legitimate emails incorrectly flagged (directly impacts user productivity and trust)
  • False Negative Rate — proportion of malicious emails missed (directly impacts security risk)

The trade-off between false positives and false negatives is a security policy decision, not a pure optimisation problem.
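All five metrics fall out of the raw confusion-matrix counts, as this small sketch shows. The counts below are hypothetical, chosen only to illustrate the calculation.

```python
def security_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Security-relevant metrics from raw confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0  # legitimate mail wrongly flagged
    fnr = fn / (fn + tp) if fn + tp else 0.0  # malicious mail missed
    return {"precision": precision, "recall": recall,
            "f1": f1, "fpr": fpr, "fnr": fnr}

# Hypothetical day: 90 phishing caught, 10 missed, 20 legit flagged, 880 legit passed.
m = security_metrics(tp=90, fp=20, tn=880, fn=10)
```

Moving the classification threshold shifts weight between `fpr` and `fnr`; which side to favour is the policy decision described above.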

Security & Ethical Analysis

False Positive Impact

A false positive in a spam filter means a legitimate email goes to junk. In a phishing detection system, it could mean a legitimate security notification is suppressed. The organisational cost of false positives is not just user frustration — it erodes trust in the system, leading users to disable or ignore alerts.

Adversarial Robustness

ML-based spam filters face adversarial attacks: spammers actively probe and adapt to detection boundaries. Models trained on historical data degrade as attackers learn the feature space. Continuous retraining and anomaly detection for model drift are production requirements, not nice-to-haves.
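One common way to operationalise drift monitoring is the Population Stability Index (PSI) over the model's score distribution. The sketch below is illustrative only; the 0.25 retraining threshold is a widely cited rule of thumb, not a value from this project.

```python
import math

def population_stability_index(baseline, current, bins=10):
    """PSI between two score samples in [0, 1); > 0.25 often triggers retraining."""
    def proportions(scores):
        counts = [0] * bins
        for s in scores:
            counts[min(int(s * bins), bins - 1)] += 1
        total = len(scores)
        # floor at a small epsilon so empty bins don't produce log(0)
        return [max(c / total, 1e-6) for c in counts]
    p = proportions(baseline)
    q = proportions(current)
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

baseline_scores = [i / 100 for i in range(100)]            # scores at training time
drifted_scores = [min(s + 0.3, 0.999) for s in baseline_scores]  # attacker adaptation
psi = population_stability_index(baseline_scores, drifted_scores)
```

Identical distributions yield a PSI of zero; the shifted scores above push it well past the retraining threshold.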

Ethical Considerations

  • Training data bias — if the training dataset over-represents certain languages, domains, or communication styles as spam, the model will discriminate against legitimate communications from those sources
  • Privacy implications of email content analysis
  • Transparency requirements — users should understand that automated classification is occurring and have a recourse process for false positives

Results

The best-performing model achieved high recall on phishing samples with an acceptable false positive rate on legitimate communications. Full evaluation metrics were reported across all models with confidence intervals.

Key Learnings

Machine learning in security is not a silver bullet. A model that achieves 99% accuracy on a balanced test set may still produce thousands of false positives per day in production at enterprise scale. Understanding the security implications of the full error distribution — not just aggregate accuracy — is what separates security-aware ML engineering from standard data science. The ethical and adversarial robustness analyses in this project are as professionally important as the model performance results.
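The enterprise-scale point can be made concrete with back-of-envelope arithmetic. The traffic volume and rates below are illustrative assumptions, not figures from the project.

```python
daily_volume = 1_000_000    # illustrative enterprise email volume per day
legit_fraction = 0.95       # assume most traffic is legitimate
false_positive_rate = 0.01  # a "99%-accurate"-sounding filter can still have 1% FPR

legit_emails = daily_volume * legit_fraction
false_positives_per_day = legit_emails * false_positive_rate
# 950,000 legitimate emails x 1% -> roughly 9,500 wrongly quarantined messages a day
```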