{ "metadata": { "name": "Example Fama-MacBeth regression" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": "Estimating the Risk Premia using Fama-MacBeth Regressions" }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": "IPython Notebook Setup" }, { "cell_type": "markdown", "metadata": {}, "source": "This command is used to set the location of the data directory.\n\n import os\n os.chdir(r'C:\\Users\\kevin.sheppard\\Dropbox\\Teaching\\Graduate\\2013-MFE\\Python\\Python_Introduction\\data')" }, { "cell_type": "code", "collapsed": false, "input": "import os\nos.chdir(r'C:\\Users\\kevin.sheppard\\Dropbox\\Teaching\\Graduate\\2013-MFE\\Python\\Python_Introduction\\data')", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": "This example highlights how to implement a Fama-MacBeth 2-stage regression to estimate factor risk premia, make inference on the risk premia, and test whether a linear factor model can explain a cross-section of portfolio returns. This example closely follows [Cochrane::2001] (See also [JagannathanSkoulakisWang::2010]). As in the previous example, the first segment contains the imports. " }, { "cell_type": "code", "collapsed": false, "input": "from __future__ import print_function, division\nfrom numpy import mat, cov, mean, hstack, multiply,sqrt,diag, genfromtxt, \\\n squeeze, ones, array, vstack, kron, zeros, eye, savez_compressed\nfrom numpy.linalg import lstsq, inv\nfrom scipy.stats import chi2\nfrom pandas import read_csv", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": "Next, the data are imported. I formatted the data downloaded from Ken French's website into an easy-to-import CSV which can be read by pandas.read_csv. The data is split using named columns for the small sets of variables and ix for the portfolios. The code uses pure NumPy arrays, and so values is used to retrieve the array from the DataFrame. The dimensions are determined using shape. Finally the risk free rate is forced to have 2 dimensions so that it will be broadcastable with the portfolio returns in the construction of the excess returns to the Size and Value-weighted portfolios. asmatrix is used to return matrix views of all of the arrays. This code is linear algebra-heavy and so matrices are easier to use than arrays." }, { "cell_type": "code", "collapsed": false, "input": "data = read_csv('FamaFrench.csv')\n\n# Split using both named colums and ix for larger blocks\ndates = data['date'].values\nfactors = data[['VWMe', 'SMB', 'HML']].values\nriskfree = data['RF'].values\nportfolios = data.ix[:, 5:].values\n\n# Use mat for easier linear algebra\nfactors = mat(factors)\nriskfree = mat(riskfree)\nportfolios = mat(portfolios)\n\n# Shape information\nT,K = factors.shape\nT,N = portfolios.shape\n# Reshape rf and compute excess returns\nriskfree.shape = T,1\nexcessReturns = portfolios - riskfree", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": "\nThe next block does 2 things:\n\n1. Compute the time-series $\\beta$s. This is done be regressing the full array of excess returns on the factors (augmented with a constant) using lstsq.\n2. Compute the risk premia using a cross-sectional regression of average excess returns on the estimates $\\beta$s. This is a standard regression where the step 1 $\\beta$ estimates are used as regressors, and the dependent variable is the average excess return." }, { "cell_type": "code", "collapsed": false, "input": "# Time series regressions\nX = hstack((ones((T, 1)), factors))\nout = lstsq(X, excessReturns)\nalpha = out[0][0]\nbeta = out[0][1:]\navgExcessReturns = mean(excessReturns, 0)\n# Cross-section regression\nout = lstsq(beta.T, avgExcessReturns.T)\nriskPremia = out[0]", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": "\nThe asymptotic variance requires computing the covariance of the demeaned returns and the weighted pricing errors. The problem is formulated using 2-step GMM where the moment conditions are \n\$$\ng_{t}\\left(\\theta\\right)=\\left[\\begin{array}{c}\n\\epsilon_{1t}\\\\\n\\epsilon_{1t}f_{t}\\\\\n\\epsilon_{2t}\\\\\n\\epsilon_{2t}f_{t}\\\\\n\\vdots\\\\\n\\epsilon_{Nt}\\\\\n\\epsilon_{Nt}f_{t}\\\\\n\\beta u_{t}\n\\end{array}\\right]\n\$$\n\nwhere $\\epsilon_{it}=r_{it}^{e}-\\alpha_{i}-\\beta_{i}^{\\prime}f_{t}$, $\\beta_{i}$ is a $K$ by 1 vector of factor loadings, $f_{t}$ is a $K$ by 1 set of factors, $\\beta=\\left[\\beta_{1}\\,\\beta_{2}\\ldots\\beta_{N}\\right]$ is a $K$ by $N$ matrix of all factor loadings, $u_{t}=r_{t}^{e}-\\beta'\\lambda$ are the $N$ by 1 vector of pricing errors and $\\lambda$ is a $K$ by 1 vector of risk premia. \nThe vector of parameters is then $\\theta= \\left[\\alpha_{1}\\:\\beta_{1}^{\\prime}\\:\\alpha_{2}\\:\\beta_{2}^{\\prime}\\:\\ldots\\:\\alpha_{N}\\,\\beta_{N}^{\\prime}\\:\\lambda'\\right]'$\n To make inference on this problem, the derivative of the moments with respect to the parameters, $\\partial g_{t}\\left(\\theta\\right)/\\partial\\theta^{\\prime}$ is needed. With some work, the estimator of this matrix can be seen to be \n \n\$$\n G=E\\left[\\frac{\\partial g_{t}\\left(\\theta\\right)}{\\partial\\theta^{\\prime}}\\right]=\\left[\\begin{array}{cc}\n-I_{n}\\otimes\\Sigma_{X} & 0\\\\\nG_{21} & -\\beta\\beta^{\\prime}\n\\end{array}\\right].\n\$$\n\nwhere $X_{t}=\\left[1\\: f_{t}^{\\prime}\\right]'$ and $\\Sigma_{X}=E\\left[X_{t}X_{t}^{\\prime}\\right]$. $G_{21}$ is a matrix with the structure \n\n\$$\nG_{21}=\\left[G_{21,1}\\, G_{21,2}\\,\\ldots G_{21,N}\\right]\n\$$\n\nwhere \n\n\$$\nG_{21,i}=\\left[\\begin{array}{cc} \n0_{K,1} & \\textrm{diag}\\left(E\\left[u_{i}\\right]-\\beta_{i}\\odot\\lambda\\right)\\end{array}\\right]\$$\n\nand where $E\\left[u_{i}\\right]$ is the expected pricing error. In estimation, all expectations are replaced with their sample analogues." }, { "cell_type": "code", "collapsed": false, "input": "# Moment conditions\nX = hstack((ones((T, 1)), factors))\np = vstack((alpha, beta))\nepsilon = excessReturns - X * p\nmoments1 = kron(epsilon, ones((1, K + 1)))\nmoments1 = multiply(moments1, kron(ones((1, N)), X))\nu = excessReturns - riskPremia.T * beta\nmoments2 = u * beta.T\n# Score covariance\nS = mat(cov(hstack((moments1, moments2)).T))\n# Jacobian\nG = mat(zeros((N * K + N + K, N * K + N + K)))\nSigmaX = X.T * X / T\nG[:N * K + N, :N * K + N] = kron(eye(N), SigmaX)\nG[N * K + N:, N * K + N:] = -beta * beta.T\nfor i in xrange(N):\n temp = zeros((K, K + 1))\n values = mean(u[:, i]) - multiply(beta[:, i], riskPremia)\n temp[:, 1:] = diag(values.A1)\n G[N * K + N:, i * (K + 1):(i + 1) * (K + 1)] = temp\n\nvcv = inv(G.T) * S * inv(G) / T", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 5 }, { "cell_type": "markdown", "metadata": {}, "source": "The $J$-test examines whether the average pricing errors, $\\hat{\\alpha}$, are zero. The $J$ statistic has an asymptotic $\\chi_{N}^{2}$ distribution, and the model is badly rejected." }, { "cell_type": "code", "collapsed": false, "input": "vcvAlpha = vcv[0:N * K + N:4, 0:N * K + N:4]\nJ = alpha * inv(vcvAlpha) * alpha.T\nJ = J[0, 0]\nJpval = 1 - chi2(25).cdf(J)", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": "The final block using formatted output to present all of the results in a readable manner." }, { "cell_type": "code", "collapsed": false, "input": "vcvRiskPremia = vcv[N * K + N:, N * K + N:]\nannualizedRP = 12 * riskPremia\narp = list(squeeze(annualizedRP.A))\narpSE = list(sqrt(12 * diag(vcvRiskPremia)))\nprint(' Annualized Risk Premia')\nprint(' Market SMB HML')\nprint('--------------------------------------')\nprint('Premia {0:0.4f} {1:0.4f} {2:0.4f}'.format(arp[0], arp[1], arp[2]))\nprint('Std. Err. {0:0.4f} {1:0.4f} {2:0.4f}'.format(arpSE[0], arpSE[1], arpSE[2]))\nprint('\\n\\n')\n\nprint('J-test: {:0.4f}'.format(J))\nprint('P-value: {:0.4f}'.format(Jpval))\n\ni = 0\nbetaSE = []\nfor j in xrange(5):\n for k in xrange(5):\n a = alpha[0, i]\n b = beta[:, i].A1\n variances = diag(vcv[(K + 1) * i:(K + 1) * (i + 1), (K + 1) * i:(K + 1) * (i + 1)])\n betaSE.append(sqrt(variances))\n s = sqrt(variances)\n c = hstack((a, b))\n t = c / s\n print('Size: {:}, Value:{:} Alpha Beta(VWM) Beta(SMB) Beta(HML)'.format(j + 1, k + 1))\n print('Coefficients: {:>10,.4f} {:>10,.4f} {:>10,.4f} {:>10,.4f}'.format(a, b[0], b[1], b[2]))\n print('Std Err. {:>10,.4f} {:>10,.4f} {:>10,.4f} {:>10,.4f}'.format(s[0], s[1], s[2], s[3]))\n print('T-stat {:>10,.4f} {:>10,.4f} {:>10,.4f} {:>10,.4f}'.format(t[0], t[1], t[2], t[3]))\n print('')\n i += 1", "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": " Annualized Risk Premia\n Market SMB HML\n--------------------------------------\nPremia 6.6642 2.8731 2.8080\nStd. Err. 0.5994 0.4010 0.4296\n\n\n\nJ-test: 95.2879\nP-value: 0.0000\nSize: 1, Value:1 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.8354 1.3099 1.2892 0.3943\nStd Err. 0.1820 0.1269 0.1671 0.2748\nT-stat -4.5904 10.3196 7.7127 1.4348\n\nSize: 1, Value:2 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.3911 1.0853 1.6100 0.3317\nStd Err. 0.1237 0.0637 0.1893 0.1444\nT-stat -3.1616 17.0351 8.5061 2.2971\n\nSize: 1, Value:3 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.1219 1.0747 1.1812 0.4648\nStd Err. 0.0997 0.0419 0.0938 0.0723\nT-stat -1.2225 25.6206 12.5952 6.4310\n\nSize: 1, Value:4 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0388 0.9630 1.2249 0.5854\nStd Err. 0.0692 0.0232 0.1003 0.0353\nT-stat 0.5614 41.5592 12.2108 16.5705\n\nSize: 1, Value:5 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0918 0.9850 1.3453 0.9052\nStd Err. 0.0676 0.0255 0.0818 0.0610\nT-stat 1.3580 38.5669 16.4489 14.8404\n\nSize: 2, Value:1 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.2397 1.0691 1.0520 -0.2647\nStd Err. 0.0725 0.0318 0.0609 0.0591\nT-stat -3.3052 33.6540 17.2706 -4.4768\n\nSize: 2, Value:2 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.0194 1.0416 0.9880 0.1877\nStd Err. 0.0615 0.0170 0.0776 0.0350\nT-stat -0.3162 61.1252 12.7393 5.3646\n\nSize: 2, Value:3 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0898 0.9590 0.8619 0.3553\nStd Err. 0.0517 0.0170 0.0733 0.0320\nT-stat 1.7359 56.4856 11.7528 11.0968\n\nSize: 2, Value:4 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0482 0.9788 0.8178 0.5562\nStd Err. 0.0495 0.0138 0.0454 0.0281\nT-stat 0.9733 70.7006 18.0210 19.8055\n\nSize: 2, Value:5 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.0109 1.0502 0.9373 0.8493\nStd Err. 0.0596 0.0182 0.0281 0.0263\nT-stat -0.1830 57.7092 33.3971 32.2980\n\nSize: 3, Value:1 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.1556 1.1416 0.7883 -0.1980\nStd Err. 0.0591 0.0190 0.0445 0.0411\nT-stat -2.6320 60.1173 17.6973 -4.8171\n\nSize: 3, Value:2 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0889 1.0133 0.5151 0.0720\nStd Err. 0.0553 0.0179 0.0340 0.0334\nT-stat 1.6068 56.6380 15.1651 2.1546\n\nSize: 3, Value:3 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.1118 1.0129 0.4130 0.3379\nStd Err. 0.0578 0.0267 0.0324 0.0321\nT-stat 1.9344 37.9790 12.7488 10.5399\n\nSize: 3, Value:4 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0818 0.9615 0.4646 0.5068\nStd Err. 0.0568 0.0141 0.0475 0.0301\nT-stat 1.4399 68.3360 9.7754 16.8580\n\nSize: 3, Value:5 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.0526 1.1447 0.4970 0.9143\nStd Err. 0.0687 0.0197 0.0509 0.0390\nT-stat -0.7655 58.0319 9.7690 23.4302\n\nSize: 4, Value:1 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0902 1.0661 0.2857 -0.3692\nStd Err. 0.0498 0.0151 0.0444 0.0323\nT-stat 1.8127 70.4710 6.4268 -11.4334\n\nSize: 4, Value:2 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.0104 1.0308 0.2430 0.1328\nStd Err. 0.0534 0.0217 0.0300 0.0294\nT-stat -0.1952 47.5567 8.0926 4.5183\n\nSize: 4, Value:3 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0392 1.0096 0.2214 0.2980\nStd Err. 0.0572 0.0209 0.0436 0.0486\nT-stat 0.6862 48.3271 5.0836 6.1333\n\nSize: 4, Value:4 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0148 1.0437 0.2016 0.5857\nStd Err. 0.0593 0.0224 0.0343 0.0484\nT-stat 0.2497 46.5053 5.8694 12.0922\n\nSize: 4, Value:5 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.1762 1.2284 0.2974 0.9834\nStd Err. 0.0803 0.0224 0.0490 0.0378\nT-stat -2.1927 54.8427 6.0726 26.0265\n\nSize: 5, Value:1 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0794 1.0310 -0.1507 -0.2508\nStd Err. 0.0372 0.0095 0.0247 0.0168\nT-stat 2.1369 108.0844 -6.1067 -14.9673\n\nSize: 5, Value:2 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: 0.0535 0.9576 -0.1893 -0.0107\nStd Err. 0.0457 0.0170 0.0243 0.0239\nT-stat 1.1690 56.3228 -7.7765 -0.4458\n\nSize: 5, Value:3 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.0236 0.9753 -0.2173 0.3127\nStd Err. 0.0559 0.0178 0.0309 0.0256\nT-stat -0.4225 54.6936 -7.0217 12.2061\n\nSize: 5, Value:4 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -0.1978 1.0546 -0.1732 0.7115\nStd Err. 0.0587 0.0230 0.0300 0.0316\nT-stat -3.3679 45.7933 -5.7749 22.5339\n\nSize: 5, Value:5 Alpha Beta(VWM) Beta(SMB) Beta(HML)\nCoefficients: -1.2737 1.1045 0.0076 0.8527\nStd Err. 0.3557 0.1143 0.1594 0.1490\nT-stat -3.5805 9.6657 0.0477 5.7232\n\n" } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": "The final block converts the standard errors of $\\beta$ to be an array and saves the results." }, { "cell_type": "code", "collapsed": false, "input": "betaSE = array(betaSE)\nsavez_compressed('Fama-MacBeth_results', alpha=alpha, \\\n beta=beta, betaSE=betaSE, arpSE=arpSE, arp=arp, J=J, Jpval=Jpval)", "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 } ], "metadata": {} } ] }