Mystery of Enron Fraud: Solved with this ML project

The mysterious bankruptcy of the Enron Corporation in December 2001 led to the development of this project. The project investigates the case using the large data set that the fraud left behind. The data set mainly comprises the millions of e-mails sent to and from the company's executives between 2000 and 2002. The e-mails were reported to be suspicious, but there were far too many of them for anyone to judge their nature by hand.

Deciding the nature of the e-mails from patterns in the data is what calls for a machine learning project. The financial information also contains a huge number of numeric values, which are tedious for anyone to classify manually. A machine learning application can classify the data itself and produce the desired output.

Project Implementation

The first step is to explore the data, which has around 21 variables and 146 observations. The outlier investigation checks for odd patterns in the data, such as employees recorded as earning an unusually large salary. Next, we create new features that describe the e-mails each person received from and sent to persons of interest (POIs). Finally, we select the features that matter most for the observations, such as stock options, shared receipts, loan advances, long-term incentive and salary.
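The outlier check and the POI e-mail features can be sketched in a few lines of plain Python. This is only an illustration: the record names, the field names (`salary`, `to_messages`, `from_poi_to_this_person`, `from_messages`, `from_this_person_to_poi`), the toy values and the "five times the median" threshold are all assumptions, not details taken from the project.

```python
import statistics

# Toy records standing in for the real data set. All names and numbers are
# made up for illustration; None marks a missing value.
records = {
    "EMPLOYEE A": {"salary": 200_000, "to_messages": 1000, "from_poi_to_this_person": 50,
                   "from_messages": 200, "from_this_person_to_poi": 20},
    "EMPLOYEE B": {"salary": 250_000, "to_messages": 400, "from_poi_to_this_person": 0,
                   "from_messages": 100, "from_this_person_to_poi": 0},
    "EMPLOYEE C": {"salary": 300_000, "to_messages": None, "from_poi_to_this_person": None,
                   "from_messages": None, "from_this_person_to_poi": None},
    "SPREADSHEET TOTAL": {"salary": 5_000_000, "to_messages": None, "from_poi_to_this_person": None,
                          "from_messages": None, "from_this_person_to_poi": None},
}


def find_salary_outliers(data, factor=5.0):
    """Return the names whose salary exceeds `factor` times the median salary."""
    salaries = [r["salary"] for r in data.values() if r["salary"] is not None]
    median = statistics.median(salaries)
    return [name for name, r in data.items()
            if r["salary"] is not None and r["salary"] > factor * median]


def fraction(part, total):
    """Share of `part` in `total`; 0.0 when either value is missing or total is 0."""
    if part is None or total is None or total == 0:
        return 0.0
    return part / total


def add_poi_email_features(data):
    """Add the fraction of each person's e-mail traffic that involves a POI."""
    for r in data.values():
        r["fraction_from_poi"] = fraction(r["from_poi_to_this_person"], r["to_messages"])
        r["fraction_to_poi"] = fraction(r["from_this_person_to_poi"], r["from_messages"])
    return data


outliers = find_salary_outliers(records)
add_poi_email_features(records)
print(outliers)
print(records["EMPLOYEE A"]["fraction_from_poi"], records["EMPLOYEE A"]["fraction_to_poi"])
```

Ratios are used instead of raw counts so that a person who simply sends a lot of e-mail is not automatically scored as close to the POIs. A flagged record still needs a manual look before it is dropped, since a very high salary may be genuine.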

The algorithms found suitable for studying this data are Gaussian Naive Bayes, Support Vector Machine and Decision Tree Classifier. The most crucial part of machine learning is tuning the chosen algorithm. The GridSearchCV tool provided in scikit-learn is used for this tuning. To get the most information out of the limited data, a validation strategy such as nested stratified shuffle cross-validation is used.
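As a rough sketch of this tuning step, the three classifiers can each be passed to GridSearchCV with a stratified shuffle split as the cross-validation strategy. The data here is synthetic (generated with `make_classification` to mimic 146 imbalanced observations), and the parameter grids, the F1 scoring choice and the number of splits are assumptions made for the example rather than the project's actual settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the selected features: 146 rows, few positive labels.
X, y = make_classification(n_samples=146, n_features=6, n_informative=3,
                           n_redundant=0, weights=[0.85, 0.15], random_state=0)

# Each candidate pairs an estimator with an illustrative parameter grid.
candidates = {
    "naive_bayes": (GaussianNB(),
                    {"var_smoothing": [1e-9, 1e-7, 1e-5]}),
    # The SVM is scaled first; make_pipeline names the SVC step "svc".
    "svm": (make_pipeline(StandardScaler(), SVC()),
            {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}),
    "decision_tree": (DecisionTreeClassifier(random_state=0),
                      {"max_depth": [2, 4, None], "min_samples_split": [2, 5, 10]}),
}

# Stratified shuffle splits keep the class ratio the same in every split.
cv = StratifiedShuffleSplit(n_splits=20, test_size=0.3, random_state=42)

results = {}
for name, (estimator, grid) in candidates.items():
    search = GridSearchCV(estimator, grid, scoring="f1", cv=cv)
    search.fit(X, y)
    results[name] = (search.best_score_, search.best_params_)

for name, (score, params) in results.items():
    print(f"{name}: mean F1 = {score:.3f}, best parameters = {params}")
```

F1 is used here instead of accuracy because, with so few positive cases, a model that never predicts the positive class would still score a high accuracy.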

This method helps extract the essential information from the heap of data. Hyperparameter optimization is the process of improving a model's performance by tuning its parameters. Cross-validation checks whether the patterns the model has learned hold on data it was not trained on, which makes the reported results more trustworthy. The decision tree classifier is evaluated with the cross-validation method defined in the tester.py script.
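One common way to realise a nested validation scheme in scikit-learn is to wrap a GridSearchCV (the inner loop, which tunes the hyperparameters) inside `cross_validate` (the outer loop, which scores the tuned model on data it has never seen). The sketch below does this for a decision tree on synthetic data; it is not a reproduction of tester.py, and the grid, split counts and metrics are assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedShuffleSplit, cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real features and labels.
X, y = make_classification(n_samples=146, n_features=6, n_informative=3,
                           n_redundant=0, weights=[0.85, 0.15], random_state=0)

# Inner loop: chooses hyperparameters using only the outer training portion.
inner_cv = StratifiedShuffleSplit(n_splits=5, test_size=0.3, random_state=1)
tuned_tree = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    {"max_depth": [2, 4, None], "min_samples_split": [2, 10]},
    scoring="f1",
    cv=inner_cv,
)

# Outer loop: measures how the tuned model performs on held-out data.
outer_cv = StratifiedShuffleSplit(n_splits=10, test_size=0.3, random_state=2)
scores = cross_validate(tuned_tree, X, y, cv=outer_cv,
                        scoring=["precision", "recall"])

mean_precision = scores["test_precision"].mean()
mean_recall = scores["test_recall"].mean()
print(f"precision = {mean_precision:.3f}, recall = {mean_recall:.3f}")
```

Because the outer test splits are never seen during tuning, the averaged precision and recall are a less optimistic estimate than the best score reported by the grid search itself.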

Results and Conclusion

The application will hence be able to classify that huge data set of almost 1.67 million e-mails. The data is processed through the algorithms and methods described above to flag unusual records. These odd records can be treated as possible indicators of fraud, and they can play an important role in the investigation of Enron.

Skyfi Labs • Published: 2019-10-30 • Last Updated: 2021-06-14
