ISSS608
Visual Analytics and Applications
Group Project - Loan Default Prediction Challenge
(Group 5)
Loan default prediction through data exploration, analysis and prediction using advanced R technologies
Lending offers substantial profits but also carries the risk that customers fail to repay loans on time. This is known as loan default risk, which is a type of credit risk. To manage this uncertainty, lending institutions have established lending standards and built predictive models that assess customers' credit risk and estimate the likelihood of loan repayment. Over the past thirty years, loan default risk assessment has progressed from traditional credit scoring systems to data modeling based on data analytics and machine learning techniques.
This project explores R technologies to develop reusable solutions, such as an R Shiny application, that help users perform data exploration, analysis, and prediction on a set of customer variables and select the most suitable variables for customer loan default prediction.
Two types of loans, 1) new and 2) repeat loans, are used together with customer demographics and loan history to analyze customers' ability to pay and willingness to pay. Predictive models are provided so that the selected factors can be evaluated and compared for customer loan default prediction.
To predict customer loan default effectively, the following approaches to exploring, analyzing, and mining factors are important:
1. Loan default factor exploration: visualization of factors to understand their trends and correlations.
2. Loan default factor mining: deep-dive analysis of the importance of loan default factors for prediction.
3. Loan default prediction: interfaces for selecting factors, data sampling methods, and algorithms to run predictive analysis of customer loan default.
The data we use comes from Data Science Nigeria Challenge #1: Loan Default Prediction. There are two types of risk models in general:
1. New business risk model, which assesses the risk of the application associated with the first loan that a customer applies for.
2. Repeat or behavior risk model, which applies when the customer is already a client and applies for a repeat loan.
Overall, we can split the data into two main subsets, single loans and repeat loans, based on the number of loans per customer in the data.
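The split by number of loans per customer can be sketched as follows. This is a minimal illustration in Python rather than the project's R implementation, and it makes several assumptions: the field names `customerid` and `loannumber`, the helper `split_by_loan_count`, and the sample records are all hypothetical and are not taken from the challenge data.

```python
from collections import Counter


def split_by_loan_count(loans, id_key="customerid"):
    """Split loan records into single-loan and repeat-loan subsets.

    A customer with exactly one record goes to the single-loan subset;
    a customer with two or more records goes to the repeat-loan subset.
    The id_key default is an assumed column name, not confirmed by the source.
    """
    # Count how many loan records each customer has.
    counts = Counter(row[id_key] for row in loans)
    single = [row for row in loans if counts[row[id_key]] == 1]
    repeat = [row for row in loans if counts[row[id_key]] > 1]
    return single, repeat


# Made-up records for illustration only.
loans = [
    {"customerid": "A", "loannumber": 1},
    {"customerid": "B", "loannumber": 1},
    {"customerid": "B", "loannumber": 2},
    {"customerid": "C", "loannumber": 1},
    {"customerid": "B", "loannumber": 3},
]

single_loans, repeat_loans = split_by_loan_count(loans)
```

With these sample records, customers A and C fall into the single-loan subset and all three records for customer B fall into the repeat-loan subset. Each subset could then feed the corresponding risk model.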
Our application analyzes customer demographics and performs loan default predictive analysis.