
Guide: Credit Risk Assessment (ver.1.05)


1. Introduction

This is a step-by-step guide that walks you through the process of developing an artificial neural network (ANN) at microCortex.com. Here we show you a very simple way to use the application in an area of renewed interest: Credit Risk Assessment.

Credit risk assessment is about finding accurate predictors of individual risk in credit portfolios. You are invited to repeat these steps using the example, a fictitious data set of mortgage loans to individuals, to get a feeling for the procedure and for the applicability of the microCortex computing environment.

You will understand how to:
  • transfer the data from your spreadsheet to the data input page,
  • interpret the quality of the ANN obtained,
  • submit a new case for the ANN to answer (use the ANN to predict).

Just follow the steps and... enjoy it!

2. The data

Put yourself in the place of a loan officer at a bank: one of the most important things you would like to know in advance, when lending money, is whether your client will pay back the loan or not. In other words, you will want to predict your client's likelihood of repayment.

The data set used here refers to mortgage loans to individuals. The likelihood of repayment is measured as a simple: "Yes, the client will pay back the loan" or "No, the client will not pay back the loan".

ANNs work by learning from past examples in order to predict the future. In other words, they learn a generalizable association in the data, which is what training the ANN means. You therefore have to keep a record of your clients' past behavior, so that the ANN can learn which client profiles tend to fail to repay and which don't. This record must have two main categories of data:

The input data (Fig.1) - the set of values for each criterion you think can influence the loan repayment.

Example: customer's age, income, number of children. Note that you will also be able to access the sensitivity analysis (how strongly each variable determines the repayment).

The output data (Fig.2) - the set of records with the "answer" for each time you lent money: "Yes, the client paid back the loan" or "No, the client didn't pay back the loan".

If you gather that data in a spreadsheet you get something like what is shown in the figures:

Fig. 1 - Input Data in a spreadsheet

Fig. 2 - Output Data in a spreadsheet

Note: The sample data shown in the figures is available through the links at the bottom of this page.

Please remember that this sample data is entirely fictitious. It is not our purpose to give you an accurate view of risk management in banking. Real-world applications of ANNs to Credit Risk Assessment may involve other criteria, and relationships in the data different from the ones shown here.

An example: "being divorced" is presented here as having a strong influence on the likelihood of mortgage repayment. Although this may be true, don't take it as a scientific or even empirical basis for saying that this is what happens in the real world.

One last reminder about two important points regarding your data:

  • Use a minimum of 150 observations (cases) to train the ANN. As ANNs are trained with past observed data, the more observations you use to train your ANN, the more accurate and reliable it will be for predictions and sensitivity analysis (the more you practice riding a bicycle, the better rider you become!);

  • The data you use to train the ANN should be as general (varied) as possible - try not to use always the same kind of client profile (Example: it is better to train the ANN with clients randomly aged 20 through 90 than to have 50% of them in their 30s - the more different situations you practice with your bicycle, the more expert you become).
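
The two points above can be checked before you submit anything. The following is only a minimal sketch in Python: the function name, the 10-year age bands and the 50% share limit are illustrative assumptions taken from the example in the text, not rules enforced by microCortex.

```python
from collections import Counter

MIN_CASES = 150  # minimum number of observations recommended in this guide


def check_training_set(ages, band_width=10, max_share=0.5):
    """Return a list of warnings about the size and age spread of a data set.

    band_width and max_share are illustrative assumptions, not microCortex rules.
    """
    warnings = []
    if len(ages) < MIN_CASES:
        warnings.append(f"only {len(ages)} cases; at least {MIN_CASES} recommended")
    # Count how many clients fall into each age band (20-29, 30-39, ...).
    bands = Counter((age // band_width) * band_width for age in ages)
    for band, count in sorted(bands.items()):
        if count / len(ages) >= max_share:
            warnings.append(
                f"{count} of {len(ages)} clients are aged {band}-{band + band_width - 1}"
            )
    return warnings
```

An empty list means that neither point raised a warning. The same idea can be applied to any other input variable, such as income.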

3. Submitting the data

If you are an authorised user, you can submit data to train your ANN through the Data Input link in the navigation bar after you log on.

Data can now be pasted into the appropriate boxes.

First, the Inputs:

Fig. 3 - Pasting input data into browser

... then the Outputs:

Fig. 4 - Pasting output data into browser

Note: Data is presented in separate spreadsheets for a better understanding of the process, but it can of course be kept in a single spreadsheet, as it usually is.

While entering data, pay attention to the following Submitting Rules (check Figs. 3 and 4):

  • Input and output names are placed in rows - each name (variable) in its own row.
  • Input and output data are placed in columns - one column for each name, one row for each credit repayment case.
  • The input and output data values in each row must be separated by at least one tab or space character. In this example we made a straightforward copy/paste from our spreadsheet, which automatically places data in the right place - one tab between each column of values. You can do it this way, pasting from any tab-delimited text file or from an Excel worksheet, or you can enter data manually, as long as you separate the values in each row with at least one tab or space.
  • The number of rows for the input and the output data must match, meaning in this example that the number of cases observed for "payed" must be exactly the same as the number used for "age", "children" and "income".

By pressing the "Submit" button, you get a text listing of the submitted data in the Data Confirmation page. If everything looks OK and the data was entered correctly, according to the rules described, you will be allowed to proceed by pushing the "Next" button, and invited to enter a name and a short description for the ANN to be trained.
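
If you prepare your data outside a spreadsheet, the Submitting Rules can be checked before pasting. The sketch below is a hypothetical local helper written in Python, not part of the microCortex site; the function names are assumptions, and the checks are only the ones stated in the rules (values separated by tabs or spaces, one value per name in every row, and the same number of input and output rows).

```python
def parse_block(text):
    """Split pasted text into rows of values separated by tabs or spaces."""
    return [line.split() for line in text.splitlines() if line.strip()]


def check_submission(input_names, input_text, output_names, output_text):
    """Return a list of rule violations; an empty list means the data looks OK."""
    problems = []
    inputs = parse_block(input_text)
    outputs = parse_block(output_text)
    for label, names, rows in (("input", input_names, inputs),
                               ("output", output_names, outputs)):
        for number, row in enumerate(rows, start=1):
            # Every row needs exactly one value per variable name.
            if len(row) != len(names):
                problems.append(
                    f"{label} row {number}: expected {len(names)} values, got {len(row)}"
                )
    # The number of cases must be the same for inputs and outputs.
    if len(inputs) != len(outputs):
        problems.append(f"{len(inputs)} input rows but {len(outputs)} output rows")
    return problems
```

For example, `check_submission(["age", "children", "income"], input_text, ["payed"], output_text)` returns an empty list when both pasted blocks follow the rules. Note that, with this kind of splitting, a single value must not contain spaces.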

After this final operation you get a new screen (white box below) confirming that the data was submitted and showing the ID number assigned to it, so that you can retrieve it and use it later:

Job submitted.

Please wait until you receive an email indicating that your job is finished, and then go to the ANN List page.
If you prefer you can leave this browser open and the click in this link

TestUser_977142474_7069_bee3133fe707c783f61bddcbe6c42612

An e-mail was sent to TestUser@microcortex.com
ID NUMBER: TestUser_977142474_7069_bee3133fe707c783f61bddcbe6c42612

You also get an email with a similar message.

4. Retrieving the ANN

After a waiting (training) period that depends on the complexity of the ANN being developed and the amount of data submitted, you get an email reporting that your ANN is ready: "Your Job has finished successfully".

From now on the ANN can be retrieved, at any time, by going to the ANN List link after you log on.

By clicking on the ANN ID, you will move to the ANN Analysis page, where you can have a glance at the quality of the trained ANN:

Fig. 5 - Statistics of the trained ANN: Quality

The quality of the predictions can be inspected by looking at the predicted versus observed values.

In this example, observed values can only be "0" or "1" - the values represented by circles in the plot.
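
If you copy the observed and predicted values out of the page, the same comparison can be summarised in numbers. The following Python sketch is only an illustration: the function name is hypothetical, and the 0.5 cut-off for turning a continuous prediction into "0" or "1" is an assumption, not a value given by microCortex.

```python
def classification_summary(observed, predicted, threshold=0.5):
    """Compare observed 0/1 outcomes with continuous ANN predictions."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equally long")
    # Turn each continuous prediction into a 0/1 decision (assumed cut-off).
    decisions = [1 if p >= threshold else 0 for p in predicted]
    correct = sum(1 for o, d in zip(observed, decisions) if o == d)
    # Mean absolute distance between each prediction and its observed value.
    mean_abs_error = sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)
    return {"accuracy": correct / len(observed), "mean_abs_error": mean_abs_error}
```

An accuracy close to 1 and a small mean absolute error correspond to a plot in which the predictions lie close to the circles.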

Fig. 6 - Statistics of the trained ANN: Sensitivity

Here you can evaluate which variables most influence your output by considering the average sensitivity of the output to each of the inputs.

In this example, we can see that "income", "number of children" and "divorced" have a big influence on "payed", while "relincharge" (relatives in charge) and "pets" have little or no influence.
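
To give a feeling for what an "average sensitivity" can mean, here is a generic Python sketch that estimates it with finite differences. This is an assumption made for illustration only: this guide does not state how microCortex computes its sensitivities, and `toy_model` is a hypothetical stand-in for a trained ANN.

```python
import math


def average_sensitivity(model, cases, step=1e-4):
    """Average absolute change in the model output per unit change of each input."""
    if not cases:
        raise ValueError("at least one case is needed")
    n_inputs = len(cases[0])
    totals = [0.0] * n_inputs
    for case in cases:
        for i in range(n_inputs):
            up = list(case)
            down = list(case)
            up[i] += step
            down[i] -= step
            # A central finite difference approximates the partial derivative.
            totals[i] += abs(model(up) - model(down)) / (2 * step)
    return [total / len(cases) for total in totals]


def toy_model(x):
    """Hypothetical stand-in for a trained ANN: the output depends on x[0] only."""
    return 1.0 / (1.0 + math.exp(-x[0]))
```

With `toy_model`, the second input gets a sensitivity of zero, in the same way that "pets" has little or no influence in Fig. 6. Because inputs such as "income" and "age" are on very different scales, raw values like these are usually compared only after the inputs have been scaled.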

For those more familiar with statistical analysis, the ANN Analysis page gives a good impression of the statistical quality of the trained ANN. If that is your case, check the details on statistical results.

5. Making predictions

Fig. 7 - Predictions button

After pressing the "Predictions" button you are asked for "input questions".

As an example, the following 4 input sets can be submitted to request ANN predictions for the output - "Will they pay or not?":

Fig. 8 - Predictions Input page

The input questions are lines of data values, one column for each variable, in the same format used in the beginning for the input data. Once again you can simply copy and paste values from a tab-delimited text file or any spreadsheet. Apart from having as many input variables (columns) as the input data set used to train the ANN there is no other restriction.
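
If your new cases are not in a spreadsheet, the input questions can be built as tab-delimited lines with a few lines of Python. This is a minimal sketch; the function name and the idea of holding each case as a dictionary are assumptions made for illustration.

```python
def format_questions(variable_names, cases):
    """Turn a list of dicts into tab-delimited lines, one line per question."""
    lines = []
    for number, case in enumerate(cases, start=1):
        missing = [name for name in variable_names if name not in case]
        if missing:
            raise ValueError(f"question {number} is missing: {', '.join(missing)}")
        # Keep the column order that was used when the ANN was trained.
        lines.append("\t".join(str(case[name]) for name in variable_names))
    return "\n".join(lines)
```

The returned text has one column per input variable, in the training order, and can be pasted directly into the box for input questions.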

The corresponding 4 output predictions are generated below the submission box:

Here you get the predictions for your inputs.
Values close to 1 can be taken as 1, which means "yes".
Check the novelty of your question. Novelty values clearly higher than 1 indicate that your question contains values very different from anything used to train the ANN. Accordingly, the table highlights the 4th input line as having a certain degree of novelty - no case with someone aged 90 (or near) was used to train this ANN.

Fig. 9 - Statistical analysis of the prediction
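
The reading of the results described above can be written down as a small Python sketch. It assumes that each result line gives you one prediction value and one novelty value; the function name, the 0.5 cut-off for "yes" and the novelty limit of exactly 1 are illustrative assumptions, so adjust them to what you consider "close to 1" and "clearly higher than 1".

```python
def interpret_predictions(rows, yes_threshold=0.5, novelty_limit=1.0):
    """Label each (prediction, novelty) pair with an answer and a novelty flag."""
    report = []
    for prediction, novelty in rows:
        # Values close to 1 are read as "yes", values close to 0 as "no".
        answer = "yes" if prediction >= yes_threshold else "no"
        # A high novelty means the question is unlike the training data.
        report.append({"answer": answer, "novel": novelty > novelty_limit})
    return report
```

A line flagged as novel should be treated with caution, whatever its answer.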

If you feel comfortable with ANN terms, you may check the section for advanced users in the "Walk Through Guide" for more details on analysing the trained ANN.

6. Download Data

Data in one file (txt)
Data in one file (html)