Credit Card Fraud Detection

Domain: Finance & Cybersecurity | Tools: Python, XGBoost, Scikit-learn, Matplotlib, Seaborn

Project Overview

This machine learning project detects fraudulent credit card transactions in a real-world-style dataset. The goal was to develop and compare classification models and to identify the transaction patterns that distinguish fraud from legitimate activity.

Objectives

Train ML models to classify transactions as fraudulent or legitimate
Conduct statistical analysis of fraud patterns
Visualize trends by transaction type, location, and merchant
Deploy a trained model for future use

Dataset Summary

98,000+ records with ~1% fraudulent transactions
Features include transaction type, location, amount, merchant ID, and timestamps
Date and time features engineered for modeling
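
The date and time engineering mentioned above could look roughly like the sketch below. The column name `timestamp`, the helper `add_time_features`, and the specific derived features (hour, day of week, weekend flag, month) are assumptions for illustration; the project's actual feature set is not listed here.

```python
import pandas as pd


def add_time_features(df: pd.DataFrame, ts_col: str = "timestamp") -> pd.DataFrame:
    """Return a copy of df with calendar features derived from ts_col.

    The column name and the chosen features are hypothetical.
    """
    out = df.copy()
    ts = pd.to_datetime(out[ts_col])
    out["hour"] = ts.dt.hour
    out["day_of_week"] = ts.dt.dayofweek  # Monday=0 ... Sunday=6
    out["is_weekend"] = (ts.dt.dayofweek >= 5).astype(int)
    out["month"] = ts.dt.month
    return out


# Two made-up transactions, only to exercise the helper.
sample = pd.DataFrame(
    {
        "timestamp": ["2024-01-06 23:15:00", "2024-01-08 09:30:00"],
        "amount": [120.50, 15.00],
    }
)
features = add_time_features(sample)
```

Splitting the timestamp into numeric parts lets tree-based models pick up on time-of-day or weekday effects without having to parse a raw datetime.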

Modeling

Three models were compared:

Model	Accuracy	ROC-AUC	F1 Score
Logistic Regression	64%	0.66	0.64
Random Forest	97%	0.997	0.97
Tuned XGBoost	92%	0.976	0.92
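
A comparison like the one in the table can be sketched with scikit-learn as follows. This is a minimal illustration on synthetic, imbalanced data from `make_classification`, not the project's dataset or tuning, so it will not reproduce the figures above. The class weights, split ratio, and hyperparameters are assumptions. An `XGBClassifier` from the `xgboost` package exposes the same `fit` / `predict_proba` interface and could be added to the `models` dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the transaction data (assumption: 5% positives
# so the small sample still contains enough fraud cases to score).
X, y = make_classification(
    n_samples=5000,
    n_features=10,
    n_informative=5,
    weights=[0.95, 0.05],
    random_state=42,
)

# Stratify so train and test keep the same fraud rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000, class_weight="balanced"),
    "Random Forest": RandomForestClassifier(
        n_estimators=100, class_weight="balanced", random_state=42
    ),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]  # probability of the fraud class
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "roc_auc": roc_auc_score(y_test, proba),
        "f1": f1_score(y_test, pred),
    }

for name, scores in results.items():
    print(name, {metric: round(value, 3) for metric, value in scores.items()})
```

Reporting ROC-AUC and F1 alongside accuracy matters on data this skewed: with about 1% fraud, a model that labels every transaction as legitimate would still reach roughly 99% accuracy.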

Feature Importance

Top predictors included transaction type, amount, merchant ID, and time-of-day features.
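
One common way to obtain such a ranking is the impurity-based `feature_importances_` attribute of a fitted tree ensemble. The sketch below assumes that approach and uses synthetic data with hypothetical column names; the project may have used a different importance measure.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature names, attached to synthetic columns for illustration.
feature_names = ["transaction_type", "amount", "merchant_id", "hour", "day_of_week"]

X, y = make_classification(
    n_samples=2000,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    random_state=0,
)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1 across all features.
importances = pd.Series(model.feature_importances_, index=feature_names)
importances = importances.sort_values(ascending=False)
print(importances)
```

Impurity-based importances can favor high-cardinality features such as an ID column, so `sklearn.inspection.permutation_importance` is a reasonable cross-check.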

Deployment

The Random Forest and XGBoost models were serialized with Joblib so they can be served from a Flask API. Planned enhancements include real-time fraud scoring and user feedback loops.
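
The serialization step can be sketched as a Joblib round trip plus a small scoring helper. The file name `fraud_model.joblib`, the helper `score_transaction`, and the synthetic training data are assumptions for illustration, not the project's actual artifacts.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Train a small stand-in model on synthetic data.
X, y = make_classification(n_samples=500, n_features=6, random_state=1)
model = RandomForestClassifier(n_estimators=50, random_state=1).fit(X, y)

# Persist the fitted model to disk (hypothetical file name).
model_path = os.path.join(tempfile.mkdtemp(), "fraud_model.joblib")
joblib.dump(model, model_path)

# In a serving process, load the model once at startup.
loaded = joblib.load(model_path)


def score_transaction(features) -> float:
    """Return the predicted fraud probability for one feature vector."""
    row = np.asarray(features, dtype=float).reshape(1, -1)
    return float(loaded.predict_proba(row)[0, 1])


fraud_probability = score_transaction(X[0])
print(round(fraud_probability, 3))
```

A Flask route could parse the request's JSON body into a feature vector, call `score_transaction`, and return the probability. Joblib files should only be loaded from trusted sources, and the scikit-learn version used for loading should match the one used for saving.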

Resources

Explore Project Notebook
Download Report (PDF)
Live App (coming soon)