Fama and French make updated risk factor and research portfolio data available through their website, and you can use the pandas_datareader library to obtain the data. For this application, refer to the fama_macbeth.ipynb notebook for additional detail.
In particular, we will use the five Fama-French factors that result from sorting stocks first into two size groups and then into three groups for each of the remaining three firm-specific factors. Hence, the factors involve three sets of value-weighted portfolios formed as 2 x 3 sorts on size and book-to-market, size and operating profitability, and size and investment. The risk factor values are computed as the average returns of the portfolios (PF), as outlined in the following table:
Concept | Label | Name | Risk factor calculation |
Size | SMB | Small minus big | Nine small stock PF minus nine large stock PF |
Value | HML | High minus low | Two value PF minus two growth PF |
Profitability | RMW | Robust minus weak | Two robust OP PF minus two weak OP PF |
Investment | CMA | Conservative minus aggressive | Two conservative investment PF minus two aggressive investment PF |
Market | Rm-Rf | Excess return on the market | Value-weighted return of all firms incorporated in and listed on major US exchanges with good data minus the one-month Treasury bill rate |
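To make the long-short construction in the table concrete, the following is a minimal sketch using made-up numbers rather than actual Fama-French data. The helper `long_short_factor` and the column names are hypothetical and chosen only for illustration; the sketch mirrors the RMW row, where the factor is the average return of the two robust portfolios minus the average return of the two weak portfolios:

```python
import pandas as pd


def long_short_factor(returns, long_cols, short_cols):
    """Average return of the long portfolios minus that of the short portfolios."""
    return returns[long_cols].mean(axis=1) - returns[short_cols].mean(axis=1)


# Hypothetical monthly returns (in percent) for four portfolios from a
# size x operating profitability sort; the values are invented.
pf_returns = pd.DataFrame(
    {
        'small_robust': [1.0, 2.0],
        'big_robust': [3.0, 0.0],
        'small_weak': [0.0, 1.0],
        'big_weak': [2.0, -1.0],
    },
    index=pd.period_range('2010-01', periods=2, freq='M'),
)

# RMW-style factor: two robust portfolios minus two weak portfolios
rmw = long_short_factor(
    pf_returns,
    long_cols=['small_robust', 'big_robust'],
    short_cols=['small_weak', 'big_weak'],
)
print(rmw)
```

The same helper would apply to the other long-short rows by swapping in the relevant portfolio columns.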
We will use returns at a monthly frequency that we obtain for the period 2010 – 2017 as follows:
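import pandas_datareader.data as web

# Name of the five-factor (2 x 3 sorts) dataset in the Fama-French data library
ff_factor = 'F-F_Research_Data_5_Factors_2x3'
# DataReader returns several tables; the one under key 0 holds the monthly values
ff_factor_data = web.DataReader(ff_factor, 'famafrench', start='2010', end='2017-12')[0]
ff_factor_data.info()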
PeriodIndex: 96 entries, 2010-01 to 2017-12
Freq: M
Data columns (total 6 columns):
Mkt-RF 96 non-null float64
SMB 96 non-null float64
HML 96 non-null float64
RMW 96 non-null float64
CMA 96 non-null float64
RF 96 non-null float64
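One practical detail when working with these values: to our knowledge, the Fama-French data library reports returns in percent (so 1.5 means 1.5 percent), which is worth confirming against the dataset description before compounding. The following sketch uses a small invented frame as a stand-in for `ff_factor_data` (the names `factors_pct`, `factors_dec`, and `cumulative` are ours) to show the conversion to decimals and the resulting cumulative return:

```python
import pandas as pd

# Hypothetical stand-in for the factor data: three months of percent returns
idx = pd.period_range('2010-01', periods=3, freq='M')
factors_pct = pd.DataFrame(
    {'Mkt-RF': [1.0, -2.0, 3.0], 'SMB': [0.5, 0.0, -0.5]},
    index=idx,
)

# Convert percent to decimal returns (assumes the input is in percent)
factors_dec = factors_pct.div(100)

# Compound the monthly returns into a cumulative return per factor
cumulative = factors_dec.add(1).prod().sub(1)
print(cumulative)
```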
Fama and French also make available numerous portfolios that we can use to illustrate the estimation of the factor exposures, as well as the value of the risk premia available in the market for a given time period. We will use a panel of the 17 industry portfolios at a monthly frequency. We subtract the risk-free rate from the returns because the factor model works with excess returns:
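ff_portfolio = '17_Industry_Portfolios'
ff_portfolio_data = web.DataReader(ff_portfolio, 'famafrench', start='2010', end='2017-12')[0]
# Subtract the risk-free rate from every portfolio, month by month, to get excess returns
ff_portfolio_data = ff_portfolio_data.sub(ff_factor_data.RF, axis=0)
# RF has served its purpose and is not one of the five factors
ff_factor_data = ff_factor_data.drop('RF', axis=1)
ff_portfolio_data.info()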
PeriodIndex: 96 entries, 2010-01 to 2017-12
Freq: M
Data columns (total 17 columns):
Food 96 non-null float64
Mines 96 non-null float64
Oil 96 non-null float64
...
Rtail 96 non-null float64
Finan 96 non-null float64
Other 96 non-null float64
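The excess-return step relies on `DataFrame.sub` with `axis=0`, which aligns the risk-free series with the rows (months) of the portfolio panel so that each month's rate is subtracted from every portfolio in that month. The following sketch illustrates this on invented numbers; the frame `portfolios` and the series `rf` are hypothetical stand-ins for the downloaded data:

```python
import pandas as pd

idx = pd.period_range('2010-01', periods=3, freq='M')

# Hypothetical monthly portfolio returns and risk-free rate (same units)
portfolios = pd.DataFrame(
    {'Food': [1.0, 2.0, 3.0], 'Oil': [0.5, -1.0, 2.0]},
    index=idx,
)
rf = pd.Series([0.01, 0.02, 0.03], index=idx, name='RF')

# axis=0 matches the series index to the row index, so the same monthly
# rate is removed from each column
excess = portfolios.sub(rf, axis=0)
print(excess)
```

Because the alignment is by index label rather than position, both objects need to share the same monthly index, as is the case when they are downloaded for the same period.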
We will now build a linear factor model based on this panel data using a method that addresses the failure of some basic linear regression assumptions.