De-Cumulating Time Series Data in a Pandas Series

November 2018

I have several monthly, datetime-indexed cumulative Pandas series which I would like to de-cumulate so I can just get the values for the specific months themselves.

So, for each year, Jan is Jan, Feb is Jan + Feb, Mar is Jan + Feb + Mar and so on, until the next year that starts at Jan again.

To be awkward, some of these series start with Feb instead.

Here's an example series:

2016-02-29     112.3
2016-03-31     243.0
2016-04-30     360.1
2016-05-31     479.5
2016-06-30     643.0
2016-07-31     757.6
2016-08-31     874.5
2016-09-30    1051.8
2016-10-31    1203.4
2016-11-30    1358.3
2016-12-31    1573.5
2017-01-31      75.0
2017-02-28     140.5
2017-03-31     290.4
2017-04-30     416.6
2017-05-31     548.2
2017-06-30     746.6
2017-07-31     863.5
2017-08-31     985.4
2017-09-30    1160.1
2017-10-31    1302.5
2017-11-30    1465.7
2017-12-31    1694.1
2018-01-31      74.0
2018-02-28     146.3
2018-03-31     300.9
2018-04-30     421.9
2018-05-31     564.1
2018-06-30     771.4

I thought one way to do this would be to use df.diff() to get the differences for every month except Jan, replace the incorrect Jan values with NaN, then do a df.update(original df) to fill in the NaNs with the correct values.

I'm having trouble trying to replace the Jan data with NaNs. Would anyone be able to help with this or suggest another solution at all please?
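
For reference, the idea described above could be sketched roughly as follows. This is only a minimal illustration, assuming the data is a Series with a DatetimeIndex; the name `s` and the five-row slice of the example data are placeholders, and `fillna` is used in place of `update` because `update` would overwrite every value rather than only the NaNs.

```python
import pandas as pd

# Placeholder: a contiguous five-row slice of the example series above.
s = pd.Series(
    [1358.3, 1573.5, 75.0, 140.5, 290.4],
    index=pd.to_datetime(
        ["2016-11-30", "2016-12-31", "2017-01-31", "2017-02-28", "2017-03-31"]
    ),
)

# Month-on-month differences. The first row is NaN and each January is wrong
# at this point, because it is differenced against the previous December.
diffs = s.diff()

# Replace the January rows with NaN, then fill every NaN from the original.
monthly = diffs.mask(s.index.month == 1).fillna(s)

print(monthly.round(1))
```

Note that this only resets at rows whose month is January, so it relies on every year after the first actually containing a January row.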

1 answer

I would solve this with groupby + diff + fillna:

df.asfreq('M').groupby(pd.Grouper(freq='Y')).diff().fillna(df)

            Value
2016-02-29  112.3
2016-03-31  130.7
2016-04-30  117.1
2016-05-31  119.4
2016-06-30  163.5
2016-07-31  114.6
2016-08-31  116.9
2016-09-30  177.3
2016-10-31  151.6
2016-11-30  154.9
2016-12-31  215.2
2017-01-31   75.0
2017-02-28   65.5
2017-03-31  149.9
2017-04-30  126.2
2017-05-31  131.6
2017-06-30  198.4
2017-07-31  116.9
2017-08-31  121.9
2017-09-30  174.7
2017-10-31  142.4
2017-11-30  163.2
2017-12-31  228.4
2018-01-31   74.0
2018-02-28   72.3
2018-03-31  154.6
2018-04-30  121.0
2018-05-31  142.2
2018-06-30  207.3

This assumes that the index is a datetime index and that "Value" is a float column.
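
A self-contained variant of the same idea is sketched below so it can be run as-is. It is an illustration under a few assumptions rather than the exact one-liner: the frame is rebuilt from a contiguous five-row slice of the question's data, and the grouping key is `df.index.year` instead of `pd.Grouper(freq='Y')`, which sidesteps differences in frequency aliases between pandas versions. It also skips the `asfreq('M')` step, so it assumes no months are missing.

```python
import pandas as pd

# Placeholder frame: rows 2017-11 through 2018-03 from the question's series.
df = pd.DataFrame(
    {"Value": [1465.7, 1694.1, 74.0, 146.3, 300.9]},
    index=pd.to_datetime(
        ["2017-11-30", "2017-12-31", "2018-01-31", "2018-02-28", "2018-03-31"]
    ),
)

# Difference within each calendar year; the first row of every year becomes NaN.
per_year_diff = df.groupby(df.index.year).diff()

# The first row of each year is already that month's own value (or the earliest
# cumulative figure available), so take it from the original frame.
result = per_year_diff.fillna(df)

print(result.round(1))
```

Because the slice starts in November 2017, its first row keeps the cumulative figure instead of a monthly one; the remaining rows line up with the December 2017 to March 2018 values in the output above.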