Python Pandas: Calculating exponentially weighted lagged squared returns (variance)


April 2019




I'm trying to implement the AQR investment strategy "Time Series Momentum":

I'm running into some confusion/trouble in part of the process. At first glance Pandas appears to have the functionality to calculate a key metric, "exponentially weighted lagged squared returns", as a measure of how volatile a financial instrument is. The formula is thus (with some background):

[Image: formula (1), exponentially weighted lagged squared returns]

I understand Pandas has some functionality to apply formula (1) above to a time series. For example, the daily returns for a futures contract could be:

[In]: returns


1984-01-03   -0.007299
1984-01-04    0.003614
1984-01-05   -0.007318
1984-01-06   -0.004134
1984-01-09    0.009487
1984-01-10   -0.000896

I then use pandas.DataFrame.ewm together with .std() to try to implement the required formula in a quick one-liner, setting com=60 to match the paper. This yields:

[In]: np.sqrt(261) * returns.ewm(com=60).std()


1984-01-03         NaN
1984-01-04    0.124664
1984-01-05    0.101879
1984-01-06    0.082925
1984-01-09    0.120588
1984-01-10    0.107411

This seems OK, but the formula in the paper uses the difference between the previous (lagged) return and the exponentially weighted average return at the current timestep in its calculation.

Would I be right in saying that the Pandas method I carried out above won't use the lagged return, but instead will use the return at the current timestep? As such, I will need to program up my own way of calculating this in Pandas? Perhaps by using some sort of shift?

Thanks in advance! I'm still getting to grips with the nuances of Pandas and your help is much appreciated.

1 answer


You can use the dataframe shift method.

df['shift'] = df['column to shift'].shift(-1)

This shifts the values of 'column to shift' one step backwards, so each row of 'shift' holds the value from the next row of 'column to shift' (row 0 of 'shift' equals row 1 of 'column to shift', and so on). The final row has no next row, so it is filled with NaN.

Like so.

   column to shift  shift
0                4    1.0
1                1    1.0
2                1    3.0
3                3    4.0
4                4    2.0
5                2    3.0
6                3    2.0
7                2    2.0
8                2    2.0
9                2    NaN
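
The table above can be reproduced with a short sketch. The extra 'lag' column is not part of the original answer; it is a hypothetical name added here only to contrast shift(-1), which takes the next row, with shift(1), which takes the previous row.

```python
import pandas as pd

# Same values as the 'column to shift' column in the table above.
df = pd.DataFrame({"column to shift": [4, 1, 1, 3, 4, 2, 3, 2, 2, 2]})

# shift(-1): each row receives the value of the NEXT row (a lead).
# The last row has no successor, so it becomes NaN.
df["shift"] = df["column to shift"].shift(-1)

# shift(1): each row receives the value of the PREVIOUS row (a lag).
# The first row has no predecessor, so it becomes NaN.
# 'lag' is a hypothetical column name used only for this illustration.
df["lag"] = df["column to shift"].shift(1)

print(df)
```

Both new columns come out as floats, because an integer column cannot hold NaN.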

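This should be enough to build the formula you want. Note that shift(-1) pulls in the next row's value; for a lagged (previous) value, use shift(1) instead, which leaves NaN in the first row.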
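
One possible way to apply this to the question is sketched below, using the sample returns from the question. It rests on assumptions that the thread does not confirm: that lagging the whole exponentially weighted estimate by one row is an acceptable reading of the paper's formula (so the weighted mean is also computed from returns up to t-1), and that the plain weighted variance (bias=True) is wanted rather than pandas' default bias-corrected one. The names ew_var and lagged_vol are made up for this example.

```python
import numpy as np
import pandas as pd

# Daily returns copied from the question.
returns = pd.Series(
    [-0.007299, 0.003614, -0.007318, -0.004134, 0.009487, -0.000896],
    index=pd.to_datetime(
        ["1984-01-03", "1984-01-04", "1984-01-05",
         "1984-01-06", "1984-01-09", "1984-01-10"]
    ),
)

# Exponentially weighted variance of the returns up to and including each date.
# com=60 corresponds to a decay factor of 60/61 per step.
# bias=True keeps the plain weighted average of squared deviations; this is an
# assumption about the intended formula (pandas defaults to bias=False).
ew_var = returns.ewm(com=60).var(bias=True)

# shift(1) moves every estimate one row forward, so the value stored at date t
# is built only from returns up to the previous date.
lagged_vol = np.sqrt(261 * ew_var.shift(1))

print(lagged_vol)
```

The first entry is NaN because there is no earlier return to lag from. If the bias-corrected estimate from the question is preferred, returns.ewm(com=60).std().shift(1) lags it in the same way.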