Insurance Payment Anomaly Detection
A PyTorch neural network, trained on Azure ML, for detecting anomalies in companies' DSN insurance declarations.
- Role: Data science intern
- When: Jul 2023 to Dec 2023
- Status: Professional research project
A research-oriented ML project at Klesia to detect anomalies in company DSN declarations (Déclaration Sociale Nominative, the French payroll declaration) and to improve on the existing non-ML anomaly detection algorithm.
From raw DSN to red flags
Company payroll declarations (DSN) go in; anomaly probabilities come out. The project came together in the same order: data first, layers next, alerts last. (Any illustration here is schematic; the real data is confidential.)
The pipeline
- 01 Extract: Query Klesia's warehouses with SAS Enterprise Guide.
- 02 Clean: Turn messy DSN declarations into modelling datasets (SAS + Python).
- 03 Train: PyTorch neural networks on Azure ML experiments.
- 04 Compare: Benchmark against the existing rule-based detector to find where ML wins.
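To make the Train step concrete, here is a minimal sketch of the kind of model it describes: a small PyTorch classifier that maps declaration features to an anomaly probability. Everything specific is an assumption made for illustration, not the architecture used at Klesia: the class name `AnomalyNet`, the layer sizes, the single embedded categorical feature, and the synthetic data (the real data is confidential).

```python
import torch
from torch import nn


class AnomalyNet(nn.Module):
    """Hypothetical classifier: numeric features plus one embedded categorical code."""

    def __init__(self, n_numeric: int, n_categories: int, emb_dim: int = 4, hidden: int = 16):
        super().__init__()
        self.embedding = nn.Embedding(n_categories, emb_dim)
        self.layers = nn.Sequential(
            nn.Linear(n_numeric + emb_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, numeric: torch.Tensor, category: torch.Tensor) -> torch.Tensor:
        # Concatenate numeric features with the learned category embedding.
        x = torch.cat([numeric, self.embedding(category)], dim=1)
        return self.layers(x).squeeze(1)  # raw logits, one per declaration


def train(model, numeric, category, labels, epochs: int = 50, lr: float = 0.05):
    """Full-batch training loop; returns the loss recorded at each epoch."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()  # applies the sigmoid internally
    losses = []
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(numeric, category), labels)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
    return losses


torch.manual_seed(0)

# Synthetic stand-in data: 256 fake declarations, 3 numeric features, 5 category codes.
n = 256
numeric = torch.randn(n, 3)
category = torch.randint(0, 5, (n,))
labels = (numeric[:, 0] > 1.0).float()  # made-up labelling rule, for the demo only

model = AnomalyNet(n_numeric=3, n_categories=5)
losses = train(model, numeric, category, labels)

# Inference: convert logits to anomaly probabilities in [0, 1].
model.eval()
with torch.no_grad():
    probabilities = torch.sigmoid(model(numeric, category))
```

Training on logits with `BCEWithLogitsLoss` and applying the sigmoid only at inference is a common choice for numerical stability. On Azure ML, a script like this would typically run as an experiment job; that wiring is omitted here.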
Features
- DSN anomaly detection
- Company insurance payment anomaly prediction
- SAS/Python preprocessing pipeline
- PyTorch neural network
- Azure ML experimentation
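As an illustration of the Python half of the preprocessing pipeline, the sketch below cleans a handful of messy rows into a modelling-ready list. The field names (`company_id`, `period`, `amount`), the validity checks, and the sample rows are all hypothetical; they stand in for the confidential DSN fields and are not the actual cleaning rules used in the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CleanRow:
    company_id: str
    period: str   # expected as "YYYY-MM"
    amount: float


def parse_amount(raw: str) -> Optional[float]:
    """Parse amounts such as '1 234,50' or '1234.5'; return None if unusable."""
    text = raw.strip().replace("\u00a0", "").replace(" ", "").replace(",", ".")
    try:
        return float(text)
    except ValueError:
        return None


def clean(rows):
    """Drop invalid rows and duplicates, keeping the first row per (company, period)."""
    seen = set()
    cleaned = []
    for row in rows:
        company_id = row.get("company_id", "").strip()
        period = row.get("period", "").strip()
        amount = parse_amount(row.get("amount", ""))
        if not company_id or len(period) != 7 or amount is None:
            continue  # missing identifier, malformed period, or unparseable amount
        key = (company_id, period)
        if key in seen:
            continue  # duplicate declaration for the same company and period
        seen.add(key)
        cleaned.append(CleanRow(company_id, period, amount))
    return cleaned


# Made-up sample rows; none of this is real DSN data.
raw_rows = [
    {"company_id": " A001 ", "period": "2023-07", "amount": "1 234,50"},
    {"company_id": "A001", "period": "2023-07", "amount": "1234.50"},  # duplicate
    {"company_id": "B002", "period": "2023-07", "amount": "n/a"},      # bad amount
    {"company_id": "", "period": "2023-08", "amount": "10"},           # missing id
    {"company_id": "C003", "period": "2023-08", "amount": "980"},
]
dataset = clean(raw_rows)
```

Here `dataset` keeps two rows: the first `A001` declaration and the `C003` one. Dropping bad rows silently is the simplest policy for a sketch; a production pipeline would more likely log or quarantine them.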
What I did
- Queried large Klesia datasets using SAS Enterprise Guide.
- Created clean modelling datasets from raw DSN data.
- Combined SAS Enterprise Guide and Python preprocessing.
- Built and trained PyTorch neural network models in Azure ML.
- Compared ML anomaly detection with existing business algorithms.
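The comparison in the last item can be sketched as a precision/recall benchmark over the same labelled declarations. The labels, rule-based flags, model probabilities, and the 0.5 threshold below are invented toy values chosen to show the mechanics; they are not results from the internship, and the metrics actually used there are not stated on this page.

```python
def precision_recall(flags, labels):
    """Return (precision, recall) for binary flags against binary ground-truth labels."""
    tp = sum(1 for f, y in zip(flags, labels) if f and y)
    fp = sum(1 for f, y in zip(flags, labels) if f and not y)
    fn = sum(1 for f, y in zip(flags, labels) if not f and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy values only: 1 marks a true anomaly or a raised flag.
labels = [1, 0, 0, 1, 1, 0, 0, 1]
rule_flags = [1, 1, 0, 0, 1, 0, 1, 0]  # output of a hypothetical rule-based detector
ml_probabilities = [0.91, 0.40, 0.08, 0.77, 0.65, 0.12, 0.55, 0.30]

THRESHOLD = 0.5  # assumed decision threshold for turning probabilities into flags
ml_flags = [p >= THRESHOLD for p in ml_probabilities]

rule_scores = precision_recall(rule_flags, labels)
ml_scores = precision_recall(ml_flags, labels)

# Declarations where the model is right and the rule-based detector is wrong.
ml_wins = [
    i for i, (r, m, y) in enumerate(zip(rule_flags, ml_flags, labels))
    if bool(m) == bool(y) and bool(r) != bool(y)
]
```

Listing the individual cases in `ml_wins`, rather than only the aggregate scores, is what lets a reviewer inspect where the two detectors disagree.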
Project timeline
Data science internship timeline from SAS data extraction to dataset construction and PyTorch anomaly-detection experiments on Azure ML.
- Jul 2023, Planning / Research
  Research internship started
  Started the Klesia data science internship focused on company insurance payment anomaly detection.
- Jul 2023, Planning / Research
  SAS and database discovery
  Learned SAS Enterprise Guide and queried Klesia databases to understand available DSN data.
- Aug 2023, Development
  Dataset created from raw data
  Created machine-learning datasets from uncleaned enterprise data using SAS and Python processing.
- Sep 2023, Development
  Embedding and feature experiments
  Explored feature engineering and embedding techniques for DSN anomaly detection.
- Oct 2023, Development
  PyTorch model on Azure ML
  Developed neural-network experiments with PyTorch and Azure ML to predict or detect payment anomalies.
- Dec 2023, Release / Delivery
  Final internship results
  Delivered the internship research work comparing ML-based anomaly detection against the existing non-ML algorithmic approach.
Built with
- SAS Enterprise Guide
- Python
- PyTorch
- Azure ML
- Machine Learning
- Data cleaning
- Feature engineering