Data Mining For Detecting Fraud in Health Care Claims

The objective of this project was to explore an alternative approach to determining whether particular health care claims are fraudulent. The usual approach to finding fraudulent entries in a database is to compare current data against patterns of fraud proven in historical data. Instead, we looked at proven regularities that exist in numerical data (i.e., Benford's Law) and checked whether new data follows them. The algorithm was implemented as a Java program so that it could be used online.
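To make the idea concrete, here is a minimal Java sketch of a Benford's Law check. Benford's Law states that the leading digit d (1 to 9) appears with probability log10(1 + 1/d). The class name BenfordCheck, its method names, and the choice of a chi-square statistic as the measure of deviation are assumptions made for illustration; they are not taken from the prototype.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: names and the chi-square measure are assumptions.
final class BenfordCheck {

    // Expected probability of leading digit d (1..9) under Benford's Law.
    static double expected(int d) {
        return Math.log10(1.0 + 1.0 / d);
    }

    // Leading non-zero digit of a numeric cell, or 0 if the cell is
    // zero or not a number.
    static int leadingDigit(String cell) {
        try {
            BigDecimal value = new BigDecimal(cell.trim());
            if (value.signum() == 0) {
                return 0;
            }
            // The unscaled value holds the significant digits without
            // leading zeros, so its first character is the leading digit.
            return value.unscaledValue().abs().toString().charAt(0) - '0';
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    // Chi-square distance between observed leading-digit counts and the
    // counts Benford's Law predicts. Larger values mean a poorer fit.
    static double chiSquare(List<String> cells) {
        int[] counts = new int[10];
        int n = 0;
        for (String cell : cells) {
            int d = leadingDigit(cell);
            if (d > 0) {
                counts[d]++;
                n++;
            }
        }
        if (n == 0) {
            return 0.0;
        }
        double chi = 0.0;
        for (int d = 1; d <= 9; d++) {
            double expectedCount = n * expected(d);
            double diff = counts[d] - expectedCount;
            chi += diff * diff / expectedCount;
        }
        return chi;
    }

    public static void main(String[] args) {
        // Made-up sample amounts, not real claims data.
        List<String> amounts = Arrays.asList("120.50", "1875.00", "23.10", "310.00", "9.99");
        System.out.println("chi-square = " + chiSquare(amounts));
    }
}
```

A small chi-square value suggests the column is consistent with Benford's Law; a large one only flags the column for closer inspection and is not proof of fraud.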

Screenshot of Prototype (Data Censored for Confidentiality)

Process Overview:

In general terms, the prototype takes one or two database files (Access, Excel, or tab-delimited) and analyzes the data in comparison to Benford's Law. If two files are opened, it looks for relationships between attributes in order to link entries across the two databases. Each cell is rated using mathematical analysis and probabilities and assigned a reward value. The program then systematically walks through the data using two different methods, accumulating rewards along the steps it takes. These reward values can be exported to a tab-delimited file, which a human can analyze further to see whether the entries along the path taken could be fraudulent.
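The scoring and export steps might look roughly like the Java sketch below. The description above does not say how rewards are computed or what the two walking methods are, so everything here is an assumption: the class RewardExport, the reward formula (the negative log10 of the Benford probability of a cell's leading digit, so rarer digits score higher), and a simple row-by-row walk in file order standing in for the prototype's traversal.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative sketch only: the reward formula and walk order are assumptions.
final class RewardExport {

    // Hypothetical per-cell reward: -log10 of the Benford probability of the
    // cell's leading digit. Non-numeric and zero cells contribute nothing.
    static double cellReward(String cell) {
        try {
            BigDecimal value = new BigDecimal(cell.trim());
            if (value.signum() == 0) {
                return 0.0;
            }
            int d = value.unscaledValue().abs().toString().charAt(0) - '0';
            return -Math.log10(Math.log10(1.0 + 1.0 / d));
        } catch (NumberFormatException e) {
            return 0.0;
        }
    }

    // Walks the rows in file order, accumulates the rewards of each row's
    // cells, and returns tab-delimited lines ready to be written to a file.
    static List<String> toTabDelimited(List<String[]> rows) {
        List<String> lines = new ArrayList<>();
        lines.add("row\treward");
        for (int i = 0; i < rows.size(); i++) {
            double total = 0.0;
            for (String cell : rows.get(i)) {
                total += cellReward(cell);
            }
            lines.add(i + "\t" + String.format(Locale.ROOT, "%.4f", total));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up rows, not real claims data.
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"A-001", "120.50", "3"});
        rows.add(new String[] {"A-002", "9875.00", "8"});
        for (String line : toTabDelimited(rows)) {
            System.out.println(line);
        }
    }
}
```

Rows with higher accumulated rewards would be the ones an auditor looks at first; the score ranks entries for review rather than deciding whether they are fraudulent.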


The prototype has been completed, but it has not yet been tested and analyzed by a human auditor to confirm that it succeeds in finding fraudulent entries.


All content is copyright © 2015 Sing Chen, unless otherwise specified. All Rights Reserved.