Example: GMM Estimation

Risk Premia Estimation using GMM

Start by importing the modules and functions needed

In [1]:
from numpy import hstack, ones, array, mat, tile, reshape, squeeze, eye
from numpy.linalg import inv
from pandas import read_csv, Series 
from scipy.linalg import kron
from scipy.optimize import fmin_bfgs
import numpy as np
import statsmodels.api as sm

Next, a callback function is defined to produce iteration-by-iteration output when using the non-linear optimizer.

In [2]:
iteration = 0
lastValue = 0
functionCount = 0

def iter_print(params):
    # params is required by the fmin_bfgs callback signature but is not used here
    global iteration, lastValue, functionCount
    iteration += 1
    print('Func value: {0:}, Iteration: {1:}, Function Count: {2:}'.format(lastValue, iteration, functionCount))

The GMM objective, which is minimized, is defined next.

In [3]:
def gmm_objective(params, pRets, fRets, Winv, out=False):
    global lastValue, functionCount
    T,N = pRets.shape
    T,K = fRets.shape
    beta = squeeze(array(params[:(N*K)]))
    lam = squeeze(array(params[(N*K):]))
    beta = reshape(beta,(N,K))
    lam = reshape(lam,(K,1))
    betalam = beta @ lam
    expectedRet = fRets @ beta.T
    e = pRets - expectedRet
    # Interact the model errors with the factors, which act as instruments
    instr = tile(fRets, N)
    moments1 = kron(e, ones((1, K))) * instr
    # Pricing errors implied by the estimated risk premia
    moments2 = pRets - betalam.T
    moments = hstack((moments1, moments2))

    avgMoment = moments.mean(axis=0)
    
    # Quadratic form J = T * gbar' Winv gbar; avgMoment is 1-d, so no transposes are needed
    J = T * (avgMoment @ Winv @ avgMoment)
    lastValue = J
    functionCount += 1
    if not out:
        return J
    else:
        return J, moments
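The same construction (average the moment conditions over time, then form the quadratic J = T ḡ'W⁻¹ḡ) can be sketched on a self-contained toy problem. The example below uses synthetic data and hypothetical names, not the risk-premia model: it estimates a mean and variance by GMM with an identity weighting matrix. Since the model is exactly identified, the minimized J is essentially zero and the estimates coincide with the sample moments.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = rng.normal(1.5, 2.0, 500)  # synthetic data, hypothetical values

def toy_gmm_objective(params, x, Winv):
    # Moment conditions: E[x - mu] = 0 and E[(x - mu)^2 - sigma2] = 0
    mu, sigma2 = params
    moments = np.column_stack((x - mu, (x - mu) ** 2 - sigma2))
    avg_moment = moments.mean(axis=0)
    T = x.shape[0]
    # Quadratic form J = T * gbar' Winv gbar
    return T * (avg_moment @ Winv @ avg_moment)

Winv = np.eye(2)  # identity weighting, as in the first step
res = minimize(toy_gmm_objective, np.array([0.0, 1.0]), args=(x, Winv), method='BFGS')
mu_hat, sigma2_hat = res.x
```

In the exactly identified case the choice of W is irrelevant: the optimizer drives every average moment to zero, so mu_hat matches the sample mean and sigma2_hat the sample variance.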

The G matrix, which is the derivative of the GMM moments with respect to the parameters, is defined.

In [4]:
def gmm_G(params, pRets, fRets):
    T,N = pRets.shape
    T,K = fRets.shape
    beta = squeeze(array(params[:(N*K)]))
    lam = squeeze(array(params[(N*K):]))
    beta = reshape(beta,(N,K))
    lam = reshape(lam,(K,1))
    G = np.zeros((N*K+K,N*K+N))
    ffp = (fRets.T @ fRets) / T
    G[:(N*K),:(N*K)]=kron(eye(N),ffp)
    G[:(N*K),(N*K):] = kron(eye(N),-lam)
    G[(N*K):,(N*K):] = -beta.T
    
    return G
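Analytic moment Jacobians are easy to get wrong, so comparing them against finite differences is a cheap sanity check. The sketch below uses hypothetical mean/variance moments rather than the risk-premia model above, and lays the Jacobian out with moments in rows and parameters in columns; gmm_G stores parameters in rows, so it would correspond to the transpose of a matrix laid out this way.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(1.0, 1.5, 400)  # synthetic data

def avg_moments(params):
    # Average moment conditions for a mean/variance model
    mu, sigma2 = params
    return np.array([(x - mu).mean(), ((x - mu) ** 2 - sigma2).mean()])

def analytic_G(params):
    # d avg_moments / d params', rows = moments, columns = parameters
    mu, sigma2 = params
    return np.array([[-1.0, 0.0],
                     [-2.0 * (x - mu).mean(), -1.0]])

def numerical_G(params, h=1e-6):
    # Central finite differences, one parameter at a time
    params = np.asarray(params, dtype=float)
    G = np.zeros((2, 2))
    for j in range(2):
        step = np.zeros(2)
        step[j] = h
        G[:, j] = (avg_moments(params + step) - avg_moments(params - step)) / (2 * h)
    return G

theta = np.array([0.8, 2.0])  # arbitrary evaluation point
```

Evaluating both at the same point and checking np.allclose(analytic_G(theta), numerical_G(theta)) catches sign and ordering mistakes before they propagate into the variance-covariance matrix.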

Next, the data is imported and a subset of the test portfolios is selected to make the estimation faster.

In [5]:
data = read_csv('FamaFrench.csv')

# Split using both named columns and iloc for larger blocks
dates = data['date'].values
factors = data[['VWMe','SMB','HML']].values
riskfree = data['RF'].values
portfolios = data.iloc[:,5:].values

T,N = portfolios.shape
portfolios = portfolios[:,np.arange(0,N,2)]
T,N = portfolios.shape
excessRet = portfolios - np.reshape(riskfree,(T,1))
K = np.size(factors,1)

Starting values for the factor loadings and risk premia are estimated using OLS and simple means.

In [6]:
betas = []
for i in range(N):
    res = sm.OLS(excessRet[:,i],sm.add_constant(factors)).fit()
    betas.append(res.params[1:])

avgReturn = excessRet.mean(axis=0)
avgReturn.shape = N,1
betas = array(betas)
res = sm.OLS(avgReturn, betas).fit()
riskPremia = res.params
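The two-pass logic above (time-series OLS for the betas, then a cross-sectional regression of average returns on those betas) can be illustrated on simulated data. Everything below is a hypothetical sketch using numpy.linalg.lstsq instead of statsmodels; the names and parameter values are made up.

```python
import numpy as np

rng = np.random.default_rng(2)
T, N, K = 5000, 6, 2
lam_true = np.array([0.5, 0.25])           # hypothetical monthly risk premia
beta_true = rng.uniform(0.5, 1.5, (N, K))  # hypothetical factor loadings

# Factor means equal the premia; returns follow the linear factor model plus noise
factors = lam_true + rng.normal(0.0, 1.0, (T, K))
excess = factors @ beta_true.T + rng.normal(0.0, 0.1, (T, N))

# Pass 1: time-series OLS of each portfolio on a constant and the factors
X = np.column_stack((np.ones(T), factors))
coef, *_ = np.linalg.lstsq(X, excess, rcond=None)
betas = coef[1:].T                         # (N, K) estimated loadings

# Pass 2: cross-sectional OLS of average returns on the estimated betas
avg_ret = excess.mean(axis=0)
lam_hat, *_ = np.linalg.lstsq(betas, avg_ret, rcond=None)
```

With a long sample and small noise, lam_hat is close to the true premia, which is what makes these estimates good starting values for the non-linear optimizer.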

With the starting values computed, the first-step estimates are found using the non-linear optimizer. The initial weighting matrix is just the identity matrix.

In [7]:
riskPremia.shape = 3
startingVals = np.concatenate((betas.flatten(),riskPremia))

Winv = np.eye(N*(K+1))
args = (excessRet, factors, Winv)
iteration = 0
functionCount = 0
step1opt = fmin_bfgs(gmm_objective, startingVals, args=args, callback=iter_print)
Func value: 1915.975414620774, Iteration: 1, Function Count: 132
Func value: 1817.0224254364093, Iteration: 2, Function Count: 220
Func value: 1814.9526088153193, Iteration: 3, Function Count: 308
Func value: 1814.8636328788023, Iteration: 4, Function Count: 396
Func value: 1814.7320075212833, Iteration: 5, Function Count: 440
Func value: 1814.4944170296885, Iteration: 6, Function Count: 484
Func value: 1814.4840096314288, Iteration: 7, Function Count: 572
Func value: 1814.4835355894866, Iteration: 8, Function Count: 660
Func value: 1814.4834334886873, Iteration: 9, Function Count: 748
Func value: 1814.4832402214106, Iteration: 10, Function Count: 792
Func value: 1814.483239345376, Iteration: 11, Function Count: 880
Func value: 1814.4832044513546, Iteration: 12, Function Count: 1012
Func value: 1814.3989963962504, Iteration: 13, Function Count: 1276
Func value: 1814.3642859418874, Iteration: 14, Function Count: 1320
Func value: 1814.301102018856, Iteration: 15, Function Count: 1364
Func value: 1814.301098499327, Iteration: 16, Function Count: 1452
Func value: 1814.3010704933515, Iteration: 17, Function Count: 1540
Func value: 1814.296612835476, Iteration: 18, Function Count: 1716
Func value: 1814.2538448019918, Iteration: 19, Function Count: 1804
Func value: 1814.253749266872, Iteration: 20, Function Count: 1892
Func value: 1814.2536217543443, Iteration: 21, Function Count: 1936
Func value: 1814.2341819587186, Iteration: 22, Function Count: 2112
Func value: 1814.2190046927274, Iteration: 23, Function Count: 2156
Func value: 1814.1901664290635, Iteration: 24, Function Count: 2200
Func value: 1814.1900618989262, Iteration: 25, Function Count: 2288
Func value: 1814.1899629209177, Iteration: 26, Function Count: 2332
Func value: 1814.1315029565549, Iteration: 27, Function Count: 2552
Func value: 1814.1207160483482, Iteration: 28, Function Count: 2640
Func value: 1814.120651593227, Iteration: 29, Function Count: 2728
Func value: 1814.1206404952559, Iteration: 30, Function Count: 2816
Func value: 1814.093987040505, Iteration: 31, Function Count: 3080
Func value: 1814.0931557560025, Iteration: 32, Function Count: 3168
Func value: 1814.0922310255569, Iteration: 33, Function Count: 3212
Func value: 1814.0921898261458, Iteration: 34, Function Count: 3300
Func value: 1814.092112795961, Iteration: 35, Function Count: 3344
Func value: 1814.080248929288, Iteration: 36, Function Count: 3520
Func value: 1814.0799729900195, Iteration: 37, Function Count: 3608
Func value: 1814.079961881844, Iteration: 38, Function Count: 3696
Func value: 1814.0799614793552, Iteration: 39, Function Count: 3784
Func value: 1814.0757650935916, Iteration: 40, Function Count: 4092
Func value: 1814.0755830705248, Iteration: 41, Function Count: 4180
Func value: 1814.0755746091875, Iteration: 42, Function Count: 4268
Func value: 1814.0702842972405, Iteration: 43, Function Count: 4488
Func value: 1814.0700243731067, Iteration: 44, Function Count: 4576
Func value: 1814.0700110352632, Iteration: 45, Function Count: 4664
Func value: 1814.0678482400072, Iteration: 46, Function Count: 4840
Func value: 1814.067660339931, Iteration: 47, Function Count: 4928
Func value: 1814.067656754271, Iteration: 48, Function Count: 5016
Func value: 1814.065377183398, Iteration: 49, Function Count: 5236
Func value: 1814.0652521531151, Iteration: 50, Function Count: 5324
Func value: 1814.0652503654978, Iteration: 51, Function Count: 5412
Func value: 1814.0640861808768, Iteration: 52, Function Count: 5632
Func value: 1814.0640022668042, Iteration: 53, Function Count: 5720
Func value: 1814.0611488164795, Iteration: 54, Function Count: 5852
Func value: 1814.059515832646, Iteration: 55, Function Count: 5896
Func value: 1814.0595158325361, Iteration: 56, Function Count: 5940
Func value: 1814.0595158325355, Iteration: 57, Function Count: 6072
Warning: Desired error not necessarily achieved due to precision loss.
         Current function value: 1814.059516
         Iterations: 57
         Function evaluations: 9031
         Gradient evaluations: 205

Here we look at the risk premia estimates from the first step (inefficient) estimates.

In [8]:
premia = step1opt[-3:]
premia = Series(premia,index=['VWMe', 'SMB', 'HML'])
print('Annualized Risk Premia (First step)')
print(12 * premia)
Annualized Risk Premia (First step)
VWMe    5.829995
SMB     4.068224
HML     1.680948
dtype: float64

Next, the first-step estimates are used to evaluate the moment conditions, which are in turn used to estimate the optimal weighting matrix for the moment conditions. This is then used as an input for the second-step estimates.
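A minimal sketch of the weighting-matrix construction, assuming serially uncorrelated moments (no HAC adjustment): given a T-by-q matrix of moment realizations, np.cov of its transpose is the demeaned cross-product estimator of S, and its inverse is the second-step weighting matrix. The moment matrix here is synthetic, purely for illustration.

```python
import numpy as np
from numpy.linalg import inv

rng = np.random.default_rng(3)
T, q = 1000, 4
# Synthetic, cross-correlated moment realizations (T observations of q moments)
moments = rng.normal(0.0, 1.0, (T, q)) @ rng.normal(0.0, 1.0, (q, q))

# np.cov treats rows as variables, hence the transpose
S = np.cov(moments.T)

# The same estimator written out: demeaned cross-products with a T-1 divisor
demeaned = moments - moments.mean(axis=0)
S_manual = demeaned.T @ demeaned / (T - 1)

Winv2 = inv(S)  # optimal (second-step) weighting matrix
```

The transpose matters: passing the T-by-q matrix directly would produce a T-by-T matrix of observation covariances rather than the q-by-q moment covariance.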

In [9]:
out = gmm_objective(step1opt, excessRet, factors, Winv, out=True)
S = np.cov(out[1].T)
Winv2 = inv(S)
args = (excessRet, factors, Winv2)

iteration = 0
functionCount = 0
step2opt = fmin_bfgs(gmm_objective, step1opt, args=args, callback=iter_print)
Func value: 70.69178252370772, Iteration: 1, Function Count: 132
Func value: 69.26303959975596, Iteration: 2, Function Count: 176
Func value: 67.07244129650894, Iteration: 3, Function Count: 220
Func value: 64.57443451479321, Iteration: 4, Function Count: 264
Func value: 62.64097306083999, Iteration: 5, Function Count: 308
Func value: 60.38315319123633, Iteration: 6, Function Count: 352
Func value: 59.77131346063476, Iteration: 7, Function Count: 396
Func value: 59.016700262647376, Iteration: 8, Function Count: 440
Func value: 58.11824688768306, Iteration: 9, Function Count: 484
Func value: 57.16139475771817, Iteration: 10, Function Count: 528
Func value: 56.54119670206884, Iteration: 11, Function Count: 572
Func value: 55.76261111890216, Iteration: 12, Function Count: 616
Func value: 54.70774239263665, Iteration: 13, Function Count: 660
Func value: 54.16273697904013, Iteration: 14, Function Count: 748
Func value: 53.68442984106602, Iteration: 15, Function Count: 792
Func value: 53.24912513313372, Iteration: 16, Function Count: 836
Func value: 52.95654923541569, Iteration: 17, Function Count: 880
Func value: 52.70763030807515, Iteration: 18, Function Count: 924
Func value: 52.40947922763522, Iteration: 19, Function Count: 968
Func value: 52.28025850343027, Iteration: 20, Function Count: 1012
Func value: 52.0945930956645, Iteration: 21, Function Count: 1056
Func value: 51.92591993694722, Iteration: 22, Function Count: 1100
Func value: 51.69127887764556, Iteration: 23, Function Count: 1144
Func value: 51.32800767550518, Iteration: 24, Function Count: 1188
Func value: 50.8832556502003, Iteration: 25, Function Count: 1232
Func value: 50.61502081122463, Iteration: 26, Function Count: 1276
Func value: 50.3253061036018, Iteration: 27, Function Count: 1320
Func value: 49.82326422689157, Iteration: 28, Function Count: 1364
Func value: 49.458055584312206, Iteration: 29, Function Count: 1408
Func value: 49.30507269503761, Iteration: 30, Function Count: 1452
Func value: 49.080604460391186, Iteration: 31, Function Count: 1496
Func value: 48.86937171160105, Iteration: 32, Function Count: 1540
Func value: 48.76039718096907, Iteration: 33, Function Count: 1628
Func value: 48.61522962324979, Iteration: 34, Function Count: 1672
Func value: 48.43822338181308, Iteration: 35, Function Count: 1716
Func value: 48.223264727455955, Iteration: 36, Function Count: 1760
Func value: 48.119297612182464, Iteration: 37, Function Count: 1804
Func value: 47.9966953205165, Iteration: 38, Function Count: 1848
Func value: 47.82071403838669, Iteration: 39, Function Count: 1892
Func value: 47.59984420196816, Iteration: 40, Function Count: 1936
Func value: 47.19190580374522, Iteration: 41, Function Count: 1980
Func value: 46.46434101774428, Iteration: 42, Function Count: 2024
Func value: 46.17952767128317, Iteration: 43, Function Count: 2112
Func value: 45.64869841620502, Iteration: 44, Function Count: 2156
Func value: 44.79178194363791, Iteration: 45, Function Count: 2200
Func value: 44.31246192707072, Iteration: 46, Function Count: 2244
Func value: 44.31220746711396, Iteration: 47, Function Count: 2288
Func value: 44.31216779746252, Iteration: 48, Function Count: 2332
Func value: 44.31216776026349, Iteration: 49, Function Count: 2376
Func value: 44.31216775979089, Iteration: 50, Function Count: 2420
Func value: 44.31216775977227, Iteration: 51, Function Count: 2464
Func value: 44.31216775977222, Iteration: 52, Function Count: 3564
Warning: Desired error not necessarily achieved due to precision loss.
         Current function value: 44.312168
         Iterations: 52
         Function evaluations: 6786
         Gradient evaluations: 154

Finally, the variance-covariance matrix (VCV) of the parameter estimates is computed.
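In notation, with G the Jacobian of the average moments with respect to the parameters (stored in the code with parameters in rows, hence the transpose on the right) and S the covariance of the moment conditions, the code implements the standard efficient-GMM sandwich estimator:

```latex
\sqrt{T}\left(\hat{\theta} - \theta_0\right) \xrightarrow{d}
N\!\left(0,\; \left(G S^{-1} G'\right)^{-1}\right),
\qquad
\widehat{\operatorname{avar}}\!\left(\hat{\theta}\right)
  = \frac{1}{T}\left(\hat{G}\,\hat{S}^{-1}\hat{G}'\right)^{-1}
```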

In [10]:
out = gmm_objective(step2opt, excessRet, factors, Winv2, out=True)
G = gmm_G(step2opt, excessRet, factors)
S = np.cov(out[1].T)
vcv = inv(G @ inv(S) @ G.T)/T

The annualized risk premia and their associated t-stats are reported below.

In [11]:
premia = step2opt[-3:]
premia = Series(premia,index=['VWMe', 'SMB', 'HML'])
premia_vcv = vcv[-3:,-3:]
print('Annualized Risk Premia')
print(12 * premia)

# The diagonal of the VCV holds variances; standard errors are their square roots
premia_stderr = np.sqrt(np.diag(premia_vcv))
premia_stderr = Series(premia_stderr,index=['VWMe', 'SMB', 'HML'])
print('T-stats')
print(premia / premia_stderr)
Annualized Risk Premia
VWMe    10.089708
SMB      3.457167
HML      7.620110
dtype: float64
T-stats
VWMe     4.876475
SMB      2.538802
HML      5.273338
dtype: float64