| | column_name | type | max_length |
|---|---|---|---|
| 0 | Credit Score | Numeric | 4 |
| 1 | First Payment Date | Date | 6 |
| 2 | First Time Homebuyer Flag | Alpha | 1 |
| 3 | Maturity Date | Date | 6 |
| 4 | Metropolitan Statistical Area (MSA) Or Metropo... | Numeric | 5 |
| 5 | Mortgage Insurance Percentage (MI %) | Numeric | 3 |
| 6 | Number of Units | Numeric | 2 |
| 7 | Occupancy Status | Alpha | 1 |
| 8 | Original Combined Loan-to-Value (CLTV) | Numeric | 3 |
| 9 | Original Debt-to-Income (DTI) Ratio | Numeric | 3 |
| 10 | Original UPB | Numeric | 12 |
| 11 | Original Loan-to-Value (LTV) | Numeric | 3 |
| 12 | Original Interest Rate | Numeric - 6,3 | 6 |
| 13 | Channel | Alpha | 1 |
| 14 | Prepayment Penalty Mortgage (PPM) Flag | Alpha | 1 |
| 15 | Amortization Type (Formerly Product Type) | Alpha | 5 |
| 16 | Property State | Alpha | 2 |
| 17 | Property Type | Alpha | 2 |
| 18 | Postal Code | Numeric | 5 |
| 19 | Loan Sequence Number | Alpha Numeric - PYYQnXXXXXXX | 12 |
| 20 | Loan Purpose | Alpha | 1 |
| 21 | Original Loan Term | Numeric | 3 |
| 22 | Number of Borrowers | Numeric | 2 |
| 23 | Seller Name | Alpha Numeric | 60 |
| 24 | Servicer Name | Alpha Numeric | 60 |
| 25 | Super Conforming Flag | Alpha | 1 |
| 26 | Pre-HARP Loan Sequence Number | Alpha Numeric - PYYQnXXXXXXX | 12 |
| 27 | Program Indicator | Alpha Numeric | 1 |
| 28 | HARP Indicator | Alpha | 1 |
| 29 | Property Valuation Method | Numeric | 1 |
| 30 | Interest Only (I/O) Indicator | Alpha | 1 |
| 31 | Mortgage Insurance Cancellation Indicator | Alpha | 1 |
About Freddie Mac
The data originates from Freddie Mac, an organization created by Congress to make mortgages more accessible to Americans. This accessibility comes from Freddie Mac’s mortgage purchases: lenders sell their mortgage loans to Freddie Mac, which in turn packages the loans into securities that are sold to investors. A security is a financial contract that gives its holder the right to something else. In this case, mortgages held as securities give investors a monthly stream of income for decades.
As a major purchaser of mortgages, Freddie Mac holds a large amount of mortgage data from across the entire US. These data points are ideal for the Home Loan Adviser model.
About the Raw Data
The data is provided by Freddie Mac in zip files by year. The download is available here. A guide is also provided here.
Each year of the SFLL data is divided into quarters, which are also provided as zip files. Each quarter contains two files: one holds the loan origination data and the other holds the monthly performance of each loan. The two files share a “key” column, named loan_sequence_number, that can be used to join them if needed.
#| eval: false
# Extract the yearly archives automatically (run on macOS)
cd ~/Downloads
unzip "historical_data_*.zip"
# Remove the yearly zip files from the folder, then re-run to extract the quarterly archives
unzip "historical_data_*.zip"
Once the nested zip files were extracted, there were 40 files in total. The raw data is in .txt format, with fields separated by vertical bars (|), or “pipes”, instead of commas or tabs. The files have no header row, so all columns were named on import.
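As a minimal sketch of that import step, the snippet below reads a headerless, pipe-delimited file with pandas and attaches column names. The helper `read_pipe_file`, the snake_case names in `ORIG_COLUMNS`, and the two sample rows are illustrative assumptions, not the project's actual code or real records. For a real file the name list needs one entry per column, in the order given by the data dictionary.

```python
import io

import pandas as pd

# Hypothetical snake_case names for the first three origination columns.
# A real file needs one name per column, in data dictionary order.
ORIG_COLUMNS = ["credit_score", "first_payment_date", "first_time_homebuyer_flag"]


def read_pipe_file(source, column_names):
    """Read a headerless, pipe-delimited file and attach column names."""
    return pd.read_csv(
        source,
        sep="|",       # fields are separated by vertical bars
        header=None,   # the raw files carry no header row
        names=column_names,
        dtype=str,     # keep raw text; convert types after inspection
    )


# Made-up sample rows standing in for an extracted .txt file.
sample = io.StringIO("700|200502|N\n650|200503|Y\n")
orig = read_pipe_file(sample, ORIG_COLUMNS)
print(orig)
```

Reading every field as text first is a cautious choice here: it avoids silently dropping leading zeros in codes such as postal codes, at the cost of an explicit type conversion later.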

Data Dictionary
| | column_name | type | max_length |
|---|---|---|---|
| 0 | Loan Sequence Number | Alpha Numeric - PYYQnXXXXXXX | 12 |
| 1 | Monthly Reporting Period | Date | 6 |
| 2 | Current Actual UPB | Numeric - 12,2 | 12 |
| 3 | Current Loan Delinquency Status | Alpha Numeric | 3 |
| 4 | Loan Age | Numeric | 3 |
| 5 | Remaining Months to Legal Maturity | Numeric | 3 |
| 6 | Defect Settlement Date | Date | 6 |
| 7 | Modification Flag | Alpha | 1 |
| 8 | Zero Balance Code | Numeric | 2 |
| 9 | Zero Balance Effective Date | Date | 6 |
| 10 | Current Interest Rate | Numeric - 8,3 | 8 |
| 11 | Current Deferred UPB | Numeric | 12 |
| 12 | Due Date of Last Paid Installment (DDLPI) | Date | 6 |
| 13 | MI Recoveries | Numeric - 12,2 | 12 |
| 14 | Net Sales Proceeds | Alpha-Numeric | 14 |
| 15 | Non MI Recoveries | Numeric - 12,2 | 12 |
| 16 | Expenses | Numeric - 12,2 | 12 |
| 17 | Legal Costs | Numeric - 12,2 | 12 |
| 18 | Maintenance and Preservation Costs | Numeric - 12,2 | 12 |
| 19 | Taxes and Insurance | Numeric - 12,2 | 12 |
| 20 | Miscellaneous Expenses | Numeric - 12,2 | 12 |
| 21 | Actual Loss Calculation | Numeric - 12,2 | 12 |
| 22 | Modification Cost | Numeric - 12,2 | 12 |
| 23 | Step Modification Flag | Alpha | 1 |
| 24 | Deferred Payment Plan | Alpha | 1 |
| 25 | Estimated Loan-to-Value (ELTV) | Numeric | 4 |
| 26 | Zero Balance Removal UPB | Numeric - 12,2 | 12 |
| 27 | Delinquent Accrued Interest | Numeric - 12,2 | 12 |
| 28 | Delinquency Due to Disaster | Alpha | 1 |
| 29 | Borrower Assistance Status Code | Alpha | 1 |
| 30 | Current Month Modification Cost | Numeric - 12,2 | 12 |
| 31 | Interest Bearing UPB | Numeric - 12,2 | 12 |
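Because the origination and performance files share the loan sequence number, they can be combined with an ordinary key join. The sketch below assumes both files have already been loaded into pandas DataFrames; apart from loan_sequence_number, the column names and the tiny sample frames are hypothetical stand-ins for the real data.

```python
import pandas as pd

# Made-up stand-in for the origination file: one row per loan.
orig = pd.DataFrame(
    {"loan_sequence_number": ["A1", "A2"], "credit_score": [700, 650]}
)

# Made-up stand-in for the performance file: one row per loan per month.
perf = pd.DataFrame(
    {
        "loan_sequence_number": ["A1", "A1", "A2"],
        "monthly_reporting_period": ["200501", "200502", "200501"],
        "current_loan_delinquency_status": ["0", "1", "0"],
    }
)

# Attach origination attributes to every monthly performance row.
# validate="many_to_one" raises an error if a loan appears more than
# once in the origination frame, which would duplicate rows silently.
joined = perf.merge(
    orig,
    on="loan_sequence_number",
    how="left",
    validate="many_to_one",
)
print(joined)
```

A left join from the performance side keeps every monthly record, including any whose loan is missing from the origination frame.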
Data Load & Clean
About Missing Values
(how we're handling missing values)
Categorical Variable
(“encoding the categorical values”)
Target Variable
(“ever 90+ days delinquent”)
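One possible way to derive such a flag from the performance data is sketched below. It assumes the Current Loan Delinquency Status field counts delinquency in 30-day buckets, so that a value of 3 or more means 90+ days; that reading should be checked against the Freddie Mac guide. The function name, column names, and sample rows are hypothetical.

```python
import pandas as pd


def ever_90_plus_delinquent(
    perf,
    id_col="loan_sequence_number",
    status_col="current_loan_delinquency_status",
    threshold=3,  # assumption: status 3 corresponds to 90+ days delinquent
):
    """Return one 0/1 flag per loan: 1 if any month reached the threshold."""
    # Non-numeric status codes become NaN and are never flagged in this
    # sketch; whether to count them is a separate modeling decision.
    status = pd.to_numeric(perf[status_col], errors="coerce")
    flagged = status >= threshold
    return flagged.groupby(perf[id_col]).any().astype(int).rename("ever_90_plus")


# Made-up monthly records for three loans.
perf = pd.DataFrame(
    {
        "loan_sequence_number": ["A1", "A1", "A2", "A2", "A3"],
        "current_loan_delinquency_status": ["0", "3", "0", "1", "RA"],
    }
)
target = ever_90_plus_delinquent(perf)
print(target)
```

The result is indexed by loan, so it can be joined back to the origination data on the loan sequence number.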
Sampling
(about imbalances, “weighted sampling or SMOTE”)
Data Exploration
(data exploration and descriptive stats)
Data Visualizations
(Are we showing distributions and such?)
