.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/pandas/working_with_timeseries.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_pandas_working_with_timeseries.py:


=============================
5.3 working with time series
=============================

.. important::

    This lesson is still under development.

In this lesson you will learn about the following pandas concepts:

- DatetimeIndex
- Timestamp
- freq
- Timedelta
- offsets
- resampling
- Period

.. GENERATED FROM PYTHON SOURCE LINES 20-29

.. code-block:: default


    import numpy as np
    import pandas as pd

    pd.set_option('display.max_rows', 20)

    print(np.__version__)
    print(pd.__version__)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    1.26.4
    1.5.3

.. GENERATED FROM PYTHON SOURCE LINES 30-31

Let's define a dataframe and check its index.

.. GENERATED FROM PYTHON SOURCE LINES 31-36

.. code-block:: default


    df = pd.DataFrame(np.arange(31))

    print(df)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

         0
    0    0
    1    1
    2    2
    3    3
    4    4
    ..  ..
    26  26
    27  27
    28  28
    29  29
    30  30

    [31 rows x 1 columns]

.. GENERATED FROM PYTHON SOURCE LINES 37-40

A dataframe is essentially a set of numpy arrays with indexes, which means
that each row and column has a label (index). We can therefore interpret
dataframes as indexed numpy arrays. When we create a dataframe without
specifying an index, pandas automatically assigns a default index (row labels) to it.

.. GENERATED FROM PYTHON SOURCE LINES 40-43

.. code-block:: default


    print(df.index)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    RangeIndex(start=0, stop=31, step=1)

.. GENERATED FROM PYTHON SOURCE LINES 44-45

The index is a range from 0 to 31 (exclusive) with a step size of 1 and is of type ``RangeIndex``.
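As a small check of the statement above, the following sketch builds the same ``RangeIndex`` by hand and compares it with the automatically assigned one. The name ``manual_index`` is only illustrative and is not used elsewhere in this lesson.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(31))

# Build the default index explicitly; ``stop`` is exclusive, as in Python's range()
manual_index = pd.RangeIndex(start=0, stop=31, step=1)

# True when both indexes contain the same labels in the same order
print(df.index.equals(manual_index))

# Number of labels, first label and last label
print(len(df.index), df.index[0], df.index[-1])
```

Because ``stop`` is exclusive, the last label is 30 even though ``stop=31``.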
.. GENERATED FROM PYTHON SOURCE LINES 47-48

We can verify the type of the index.

.. GENERATED FROM PYTHON SOURCE LINES 48-51

.. code-block:: default


    print(type(df.index))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas.core.indexes.range.RangeIndex'>

.. GENERATED FROM PYTHON SOURCE LINES 52-55

DatetimeIndex
========================
Let's define a more useful index for the dataframe, i.e. dates with a daily time step.

.. GENERATED FROM PYTHON SOURCE LINES 55-60

.. code-block:: default


    index = [f"2011-01-{i}" for i in range(1, 32)]

    print(index)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ['2011-01-1', '2011-01-2', '2011-01-3', '2011-01-4', '2011-01-5', '2011-01-6', '2011-01-7', '2011-01-8', '2011-01-9', '2011-01-10', '2011-01-11', '2011-01-12', '2011-01-13', '2011-01-14', '2011-01-15', '2011-01-16', '2011-01-17', '2011-01-18', '2011-01-19', '2011-01-20', '2011-01-21', '2011-01-22', '2011-01-23', '2011-01-24', '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-28', '2011-01-29', '2011-01-30', '2011-01-31']

.. GENERATED FROM PYTHON SOURCE LINES 61-62

At this point ``index`` is a list of strings where each string indicates a day/date.

.. GENERATED FROM PYTHON SOURCE LINES 64-65

Now let's assign this index to our dataframe.

.. GENERATED FROM PYTHON SOURCE LINES 65-70

.. code-block:: default


    df.index = index

    print(df)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                 0
    2011-01-1    0
    2011-01-2    1
    2011-01-3    2
    2011-01-4    3
    2011-01-5    4
    ...         ..
    2011-01-27  26
    2011-01-28  27
    2011-01-29  28
    2011-01-30  29
    2011-01-31  30

    [31 rows x 1 columns]

.. GENERATED FROM PYTHON SOURCE LINES 71-73

We can see that the index of our dataframe now shows dates. But does pandas
recognize this new index as dates, or does it still consider it to be strings?

.. GENERATED FROM PYTHON SOURCE LINES 73-76

.. code-block:: default


    print(type(df.index))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas.core.indexes.base.Index'>

.. GENERATED FROM PYTHON SOURCE LINES 77-78

So it turns out that pandas does not recognize the index as date/time; it is a generic ``Index`` of strings.
.. GENERATED FROM PYTHON SOURCE LINES 80-81

Therefore, we explicitly tell pandas that the new index is a date and time index.

.. GENERATED FROM PYTHON SOURCE LINES 81-86

.. code-block:: default


    index = pd.to_datetime(index)

    print(index)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    DatetimeIndex(['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04',
                   '2011-01-05', '2011-01-06', '2011-01-07', '2011-01-08',
                   '2011-01-09', '2011-01-10', '2011-01-11', '2011-01-12',
                   '2011-01-13', '2011-01-14', '2011-01-15', '2011-01-16',
                   '2011-01-17', '2011-01-18', '2011-01-19', '2011-01-20',
                   '2011-01-21', '2011-01-22', '2011-01-23', '2011-01-24',
                   '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-28',
                   '2011-01-29', '2011-01-30', '2011-01-31'],
                  dtype='datetime64[ns]', freq=None)

.. GENERATED FROM PYTHON SOURCE LINES 87-90

The ``to_datetime`` function of pandas converts an array of dates into a
``DatetimeIndex`` object. It accepts dates in a wide range of formats.
We can verify the type of our new index.

.. GENERATED FROM PYTHON SOURCE LINES 90-93

.. code-block:: default


    print(type(index))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas.core.indexes.datetimes.DatetimeIndex'>

.. GENERATED FROM PYTHON SOURCE LINES 94-96

Now we have created an index whose type is ``DatetimeIndex``, i.e. pandas recognizes it as date/time.
Let's assign it as the index of the dataframe.

.. GENERATED FROM PYTHON SOURCE LINES 96-101

.. code-block:: default


    df.index = index

    print(df.index)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    DatetimeIndex(['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04',
                   '2011-01-05', '2011-01-06', '2011-01-07', '2011-01-08',
                   '2011-01-09', '2011-01-10', '2011-01-11', '2011-01-12',
                   '2011-01-13', '2011-01-14', '2011-01-15', '2011-01-16',
                   '2011-01-17', '2011-01-18', '2011-01-19', '2011-01-20',
                   '2011-01-21', '2011-01-22', '2011-01-23', '2011-01-24',
                   '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-28',
                   '2011-01-29', '2011-01-30', '2011-01-31'],
                  dtype='datetime64[ns]', freq=None)
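To illustrate the remark that ``to_datetime`` accepts dates in several formats, here is a small sketch. The three strings are made-up examples of the same day, and the explicit ``format`` argument is optional but removes any day/month ambiguity.

```python
import pandas as pd

# The same day written in three different styles (illustrative strings)
a = pd.to_datetime("2011-01-15")
b = pd.to_datetime("20110115")
c = pd.to_datetime("15/01/2011", format="%d/%m/%Y")  # explicit format: day first

print(a, b, c)

# All three strings are parsed to the same Timestamp
print(a == b == c)
```

When the input format is known in advance, passing ``format`` is usually the safer choice, since pandas then does not have to guess.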
.. GENERATED FROM PYTHON SOURCE LINES 102-105

So now the index of the dataframe is of date/time type.
We can therefore slice based upon time. For example, we can ask pandas
to return the rows which are after 15 January 2011, as below.

.. GENERATED FROM PYTHON SOURCE LINES 105-108

.. code-block:: default


    print(df[df.index>pd.Timestamp("20110115")])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                 0
    2011-01-16  15
    2011-01-17  16
    2011-01-18  17
    2011-01-19  18
    2011-01-20  19
    2011-01-21  20
    2011-01-22  21
    2011-01-23  22
    2011-01-24  23
    2011-01-25  24
    2011-01-26  25
    2011-01-27  26
    2011-01-28  27
    2011-01-29  28
    2011-01-30  29
    2011-01-31  30

.. GENERATED FROM PYTHON SOURCE LINES 109-110

Had we done this earlier (before converting our index to ``pd.DatetimeIndex``), we would have got an error.

.. GENERATED FROM PYTHON SOURCE LINES 112-116

creating datetime index
---------------------------
Above we converted a plain list of strings into a ``DatetimeIndex``
using the ``to_datetime`` function. We can also create a ``DatetimeIndex`` directly using
the ``date_range`` function.

.. GENERATED FROM PYTHON SOURCE LINES 116-121

.. code-block:: default


    index = pd.date_range(start="20110101", freq="D", periods=31)

    print(type(index))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas.core.indexes.datetimes.DatetimeIndex'>

.. GENERATED FROM PYTHON SOURCE LINES 122-125

.. code-block:: default


    print(index)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    DatetimeIndex(['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04',
                   '2011-01-05', '2011-01-06', '2011-01-07', '2011-01-08',
                   '2011-01-09', '2011-01-10', '2011-01-11', '2011-01-12',
                   '2011-01-13', '2011-01-14', '2011-01-15', '2011-01-16',
                   '2011-01-17', '2011-01-18', '2011-01-19', '2011-01-20',
                   '2011-01-21', '2011-01-22', '2011-01-23', '2011-01-24',
                   '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-28',
                   '2011-01-29', '2011-01-30', '2011-01-31'],
                  dtype='datetime64[ns]', freq='D')

.. GENERATED FROM PYTHON SOURCE LINES 126-127

Instead of ``periods``, we can specify an ``end`` date. In both cases ``freq`` defines the frequency or time-step of our ``DatetimeIndex``.
.. GENERATED FROM PYTHON SOURCE LINES 127-132

.. code-block:: default


    index = pd.date_range(start="20110101", end="20110131", freq="D")

    print(type(index))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas.core.indexes.datetimes.DatetimeIndex'>

.. GENERATED FROM PYTHON SOURCE LINES 133-136

.. code-block:: default


    print(index)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    DatetimeIndex(['2011-01-01', '2011-01-02', '2011-01-03', '2011-01-04',
                   '2011-01-05', '2011-01-06', '2011-01-07', '2011-01-08',
                   '2011-01-09', '2011-01-10', '2011-01-11', '2011-01-12',
                   '2011-01-13', '2011-01-14', '2011-01-15', '2011-01-16',
                   '2011-01-17', '2011-01-18', '2011-01-19', '2011-01-20',
                   '2011-01-21', '2011-01-22', '2011-01-23', '2011-01-24',
                   '2011-01-25', '2011-01-26', '2011-01-27', '2011-01-28',
                   '2011-01-29', '2011-01-30', '2011-01-31'],
                  dtype='datetime64[ns]', freq='D')

.. GENERATED FROM PYTHON SOURCE LINES 137-142

Timestamps
============

A ``DatetimeIndex`` is in fact an array of ``Timestamp`` objects, i.e.
each member of a ``DatetimeIndex`` is a ``Timestamp``.

.. GENERATED FROM PYTHON SOURCE LINES 142-145

.. code-block:: default


    print(df.index[0])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    2011-01-01 00:00:00

.. GENERATED FROM PYTHON SOURCE LINES 146-149

.. code-block:: default


    print(type(df.index[0]))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas._libs.tslibs.timestamps.Timestamp'>

.. GENERATED FROM PYTHON SOURCE LINES 150-151

We can check whether a ``Timestamp`` is in a ``DatetimeIndex`` or not.

.. GENERATED FROM PYTHON SOURCE LINES 151-153

.. code-block:: default

    print(index[1] in index)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    True

.. GENERATED FROM PYTHON SOURCE LINES 154-155

We can also compare two Timestamps.

.. GENERATED FROM PYTHON SOURCE LINES 155-157

.. code-block:: default

    print(index[0] > index[1])

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    False

.. GENERATED FROM PYTHON SOURCE LINES 158-162

freq
=========

The index (of a dataframe) has a special attribute called ``freq`` which defines the time-step of the index.
It is available only for datetime-like indexes such as ``DatetimeIndex``.

.. GENERATED FROM PYTHON SOURCE LINES 162-165

.. code-block:: default


    print(df.index.freq)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    None

.. GENERATED FROM PYTHON SOURCE LINES 166-171

There can be two reasons for ``freq`` to be None. Either the DatetimeIndex
does not have a constant time-step, in which case the freq (time-step) cannot be determined.
Or the index is a DatetimeIndex with a constant time-step
but its freq has simply not been set. This is what happened above.
In both cases we can ask pandas to infer the freq/time-step of the index.

.. GENERATED FROM PYTHON SOURCE LINES 171-174

.. code-block:: default


    print(pd.infer_freq(df.index))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    D

.. GENERATED FROM PYTHON SOURCE LINES 175-177

Now we can assign the frequency to the index of the dataframe (``df.index``, not the dataframe itself).
This is a way of reminding the dataframe what the time-step of its index is.

.. GENERATED FROM PYTHON SOURCE LINES 177-182

.. code-block:: default


    df.index.freq = pd.infer_freq(df.index)

    print(df.index.freq)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <Day>

.. GENERATED FROM PYTHON SOURCE LINES 183-184

We can see that, once 'reminded', pandas now tells us the frequency of the index.

.. GENERATED FROM PYTHON SOURCE LINES 184-187

.. code-block:: default


    print(type(df.index.freq))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas._libs.tslibs.offsets.Day'>

.. GENERATED FROM PYTHON SOURCE LINES 188-191

.. code-block:: default


    print(type(df.index.freqstr))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'str'>

.. GENERATED FROM PYTHON SOURCE LINES 192-194

forcing a frequency
---------------------

.. GENERATED FROM PYTHON SOURCE LINES 194-197

.. code-block:: default


    df = pd.DataFrame(np.arange(31),
                      index=pd.date_range("20110101", periods=31, freq="D"))

.. GENERATED FROM PYTHON SOURCE LINES 198-200

.. code-block:: default

    print(df)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                 0
    2011-01-01   0
    2011-01-02   1
    2011-01-03   2
    2011-01-04   3
    2011-01-05   4
    ...         ..
    2011-01-27  26
    2011-01-28  27
    2011-01-29  28
    2011-01-30  29
    2011-01-31  30

    [31 rows x 1 columns]

.. GENERATED FROM PYTHON SOURCE LINES 201-203

.. code-block:: default

    print(df.index.freq)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <Day>

.. GENERATED FROM PYTHON SOURCE LINES 204-208

.. code-block:: default

    df = df.drop(labels="2011-01-03")

    print(df)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                 0
    2011-01-01   0
    2011-01-02   1
    2011-01-04   3
    2011-01-05   4
    2011-01-06   5
    ...         ..
    2011-01-27  26
    2011-01-28  27
    2011-01-29  28
    2011-01-30  29
    2011-01-31  30

    [30 rows x 1 columns]

.. GENERATED FROM PYTHON SOURCE LINES 209-211

.. code-block:: default

    print(df.index.freq)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    None

.. GENERATED FROM PYTHON SOURCE LINES 212-214

.. code-block:: default

    pd.infer_freq(df.index)

.. GENERATED FROM PYTHON SOURCE LINES 215-216

If we forcefully try to assign a frequency, pandas will raise a ``ValueError``.

.. GENERATED FROM PYTHON SOURCE LINES 216-221

.. code-block:: default


    # Try by uncommenting the following line
    # df.index.freq = "D"  # -> ValueError

.. GENERATED FROM PYTHON SOURCE LINES 222-229

Resampling
============

Resampling means changing the frequency of a time series.
One major advantage of having a ``DatetimeIndex`` is that
we can easily change the frequency/time-step of the data (time series),
even when its ``freq`` attribute is None, as is the case here.

.. GENERATED FROM PYTHON SOURCE LINES 229-232

.. code-block:: default

    df.asfreq('D')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                   0
    2011-01-01   0.0
    2011-01-02   1.0
    2011-01-03   NaN
    2011-01-04   3.0
    2011-01-05   4.0
    ...          ...
    2011-01-27  26.0
    2011-01-28  27.0
    2011-01-29  28.0
    2011-01-30  29.0
    2011-01-31  30.0

    [31 rows x 1 columns]



.. GENERATED FROM PYTHON SOURCE LINES 233-235

When we changed the frequency to a daily time step with ``asfreq``,
the time steps for which we did not have any value were assigned NaN.

.. GENERATED FROM PYTHON SOURCE LINES 237-241

upsampling
-----------
This refers to changing the time step from larger to smaller, such
as from daily to hourly.

.. GENERATED FROM PYTHON SOURCE LINES 241-246

.. code-block:: default


    df = pd.DataFrame(np.random.randint(0, 5, 5),
                      index=pd.date_range("20110101", periods=5, freq="D"),
                      columns=['a'])
    print(df)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                a
    2011-01-01  2
    2011-01-02  4
    2011-01-03  2
    2011-01-04  4
    2011-01-05  1

.. GENERATED FROM PYTHON SOURCE LINES 247-250

.. code-block:: default

    df.resample("6H")

.. GENERATED FROM PYTHON SOURCE LINES 251-253

The call above only returns a ``DatetimeIndexResampler`` object. So far we have told pandas to resample
at a particular time step, but we have not told it which method to use. As an example, we can use ``mean`` to resample.

.. GENERATED FROM PYTHON SOURCE LINES 253-256

.. code-block:: default

    df.resample("6H").mean()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                           a
    2011-01-01 00:00:00  2.0
    2011-01-01 06:00:00  NaN
    2011-01-01 12:00:00  NaN
    2011-01-01 18:00:00  NaN
    2011-01-02 00:00:00  4.0
    2011-01-02 06:00:00  NaN
    2011-01-02 12:00:00  NaN
    2011-01-02 18:00:00  NaN
    2011-01-03 00:00:00  2.0
    2011-01-03 06:00:00  NaN
    2011-01-03 12:00:00  NaN
    2011-01-03 18:00:00  NaN
    2011-01-04 00:00:00  4.0
    2011-01-04 06:00:00  NaN
    2011-01-04 12:00:00  NaN
    2011-01-04 18:00:00  NaN
    2011-01-05 00:00:00  1.0


.. GENERATED FROM PYTHON SOURCE LINES 257-258

But this did not fill the NaNs in our new data. We can instead forward fill the new time steps.

.. GENERATED FROM PYTHON SOURCE LINES 258-261

.. code-block:: default

    df.resample("6H").ffill()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                         a
    2011-01-01 00:00:00  2
    2011-01-01 06:00:00  2
    2011-01-01 12:00:00  2
    2011-01-01 18:00:00  2
    2011-01-02 00:00:00  4
    2011-01-02 06:00:00  4
    2011-01-02 12:00:00  4
    2011-01-02 18:00:00  4
    2011-01-03 00:00:00  2
    2011-01-03 06:00:00  2
    2011-01-03 12:00:00  2
    2011-01-03 18:00:00  2
    2011-01-04 00:00:00  4
    2011-01-04 06:00:00  4
    2011-01-04 12:00:00  4
    2011-01-04 18:00:00  4
    2011-01-05 00:00:00  1


.. GENERATED FROM PYTHON SOURCE LINES 262-263

``.mean()`` returns a dataframe, so we can also call ``ffill`` on its result.

.. GENERATED FROM PYTHON SOURCE LINES 263-266

.. code-block:: default

    df.resample("6H").mean().ffill()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                           a
    2011-01-01 00:00:00  2.0
    2011-01-01 06:00:00  2.0
    2011-01-01 12:00:00  2.0
    2011-01-01 18:00:00  2.0
    2011-01-02 00:00:00  4.0
    2011-01-02 06:00:00  4.0
    2011-01-02 12:00:00  4.0
    2011-01-02 18:00:00  4.0
    2011-01-03 00:00:00  2.0
    2011-01-03 06:00:00  2.0
    2011-01-03 12:00:00  2.0
    2011-01-03 18:00:00  2.0
    2011-01-04 00:00:00  4.0
    2011-01-04 06:00:00  4.0
    2011-01-04 12:00:00  4.0
    2011-01-04 18:00:00  4.0
    2011-01-05 00:00:00  1.0


.. GENERATED FROM PYTHON SOURCE LINES 267-269

A better way to upsample is often to apply an interpolation method,
for example linear interpolation.

.. GENERATED FROM PYTHON SOURCE LINES 269-272

.. code-block:: default

    df.resample("6H").interpolate(method="linear")

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                            a
    2011-01-01 00:00:00  2.00
    2011-01-01 06:00:00  2.50
    2011-01-01 12:00:00  3.00
    2011-01-01 18:00:00  3.50
    2011-01-02 00:00:00  4.00
    2011-01-02 06:00:00  3.50
    2011-01-02 12:00:00  3.00
    2011-01-02 18:00:00  2.50
    2011-01-03 00:00:00  2.00
    2011-01-03 06:00:00  2.50
    2011-01-03 12:00:00  3.00
    2011-01-03 18:00:00  3.50
    2011-01-04 00:00:00  4.00
    2011-01-04 06:00:00  3.25
    2011-01-04 12:00:00  2.50
    2011-01-04 18:00:00  1.75
    2011-01-05 00:00:00  1.00
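The linear interpolation above can be checked with a small, reproducible sketch. It assumes the same five daily values as shown in the output (2, 4, 2, 4, 1) instead of random numbers, and the names ``daily`` and ``six_hourly`` are illustrative. The lowercase alias ``"6h"`` is used because newer pandas versions prefer it over ``"6H"``.

```python
import pandas as pd

# Fixed daily values (same as the random sample printed above)
daily = pd.DataFrame({"a": [2, 4, 2, 4, 1]},
                     index=pd.date_range("20110101", periods=5, freq="D"))

# Upsample to 6-hourly steps and fill the new steps by linear interpolation
six_hourly = daily.resample("6h").interpolate(method="linear")

# The values at the original daily timestamps are left unchanged
print(six_hourly.loc[daily.index, "a"].tolist())

# Noon on the first day lies halfway between 2 and 4
print(six_hourly.loc[pd.Timestamp("2011-01-01 12:00"), "a"])
```

Linear interpolation only adds values between existing observations, so the original daily values are preserved at their own timestamps.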


.. GENERATED FROM PYTHON SOURCE LINES 273-277

Sometimes we may wish to distribute a quantity equally during upsampling.
For example, if we have the total amount of rainfall for a day, then linearly interpolating
daily rainfall values to sub-daily values would be wrong. In such a case we want to distribute the
daily rainfall equally over the sub-daily steps of that day.

.. GENERATED FROM PYTHON SOURCE LINES 277-283

.. code-block:: default


    df1 = df.resample('6H').mean().ffill()
    # number of 6-hourly steps on each calendar day (group by date, not by value)
    steps_per_day = df1.groupby(df1.index.date)['a'].transform('size')
    # the last day has a single step, so it keeps its full value
    df1['a'] = df1['a'] / steps_per_day
    print(df1)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                           a
    2011-01-01 00:00:00  0.5
    2011-01-01 06:00:00  0.5
    2011-01-01 12:00:00  0.5
    2011-01-01 18:00:00  0.5
    2011-01-02 00:00:00  1.0
    2011-01-02 06:00:00  1.0
    2011-01-02 12:00:00  1.0
    2011-01-02 18:00:00  1.0
    2011-01-03 00:00:00  0.5
    2011-01-03 06:00:00  0.5
    2011-01-03 12:00:00  0.5
    2011-01-03 18:00:00  0.5
    2011-01-04 00:00:00  1.0
    2011-01-04 06:00:00  1.0
    2011-01-04 12:00:00  1.0
    2011-01-04 18:00:00  1.0
    2011-01-05 00:00:00  1.0

.. GENERATED FROM PYTHON SOURCE LINES 284-287

downsampling
-------------
This refers to resampling from a smaller time step to a larger time step, e.g. from hourly to daily.

.. GENERATED FROM PYTHON SOURCE LINES 287-293

.. code-block:: default


    df = pd.DataFrame(np.random.randint(0, 5, 24),
                      index=pd.date_range("20110101", periods=24, freq="H"),
                      columns=['a'])
    print(df)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                         a
    2011-01-01 00:00:00  0
    2011-01-01 01:00:00  2
    2011-01-01 02:00:00  4
    2011-01-01 03:00:00  0
    2011-01-01 04:00:00  1
    ...                 ..
    2011-01-01 19:00:00  1
    2011-01-01 20:00:00  0
    2011-01-01 21:00:00  1
    2011-01-01 22:00:00  0
    2011-01-01 23:00:00  1

    [24 rows x 1 columns]

.. GENERATED FROM PYTHON SOURCE LINES 294-297

.. code-block:: default

    df.resample("6H").mean()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                                a
    2011-01-01 00:00:00  1.666667
    2011-01-01 06:00:00  2.000000
    2011-01-01 12:00:00  2.333333
    2011-01-01 18:00:00  0.666667


.. GENERATED FROM PYTHON SOURCE LINES 298-301

.. code-block:: default

    df.resample("6H").sum()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                          a
    2011-01-01 00:00:00  10
    2011-01-01 06:00:00  12
    2011-01-01 12:00:00  14
    2011-01-01 18:00:00   4
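When downsampling, several statistics can be computed in a single pass. The sketch below is an illustration rather than part of the original lesson: it uses the deterministic values 0 to 23 instead of random numbers so that the results can be checked by hand, and the names ``hourly`` and ``summary`` are assumptions.

```python
import numpy as np
import pandas as pd

# 24 hourly values 0, 1, ..., 23 for a single day
hourly = pd.DataFrame({"a": np.arange(24)},
                      index=pd.date_range("20110101", periods=24, freq="h"))

# Aggregate each 6-hour block with three statistics at once
summary = hourly["a"].resample("6h").agg(["mean", "sum", "max"])

print(summary)
```

Each row of ``summary`` describes one 6-hour block; for the first block (values 0 to 5) the sum is 15 and the mean is 2.5. Which statistic is appropriate depends on the quantity: totals such as rainfall are usually summed, while states such as temperature are usually averaged.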


.. GENERATED FROM PYTHON SOURCE LINES 302-307

inconsistent time step
------------------------
Sometimes we have quantities which are not measured at exactly
the frequency we want. For example, the data below is measured with
inconsistent time steps.

.. GENERATED FROM PYTHON SOURCE LINES 307-318

.. code-block:: default


    df = pd.DataFrame([np.nan, 1100, 1400, np.nan, 14000],
                      index=pd.to_datetime(["2011-05-25 10:00:00",
                                            "2011-05-25 16:40:00",
                                            "2011-05-25 17:06:00",
                                            "2011-05-25 17:10:00",
                                            "2011-05-25 17:24:00"]),
                      columns=['a'])
    print(df)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                               a
    2011-05-25 10:00:00      NaN
    2011-05-25 16:40:00   1100.0
    2011-05-25 17:06:00   1400.0
    2011-05-25 17:10:00      NaN
    2011-05-25 17:24:00  14000.0

.. GENERATED FROM PYTHON SOURCE LINES 319-321

Our target is to convert this data to a regular 6-minute time step. A naive way is
to change the frequency without filling the new NaNs.

.. GENERATED FROM PYTHON SOURCE LINES 321-324

.. code-block:: default

    df.resample('6Min').first()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                               a
    2011-05-25 10:00:00      NaN
    2011-05-25 10:06:00      NaN
    2011-05-25 10:12:00      NaN
    2011-05-25 10:18:00      NaN
    2011-05-25 10:24:00      NaN
    ...                      ...
    2011-05-25 17:00:00      NaN
    2011-05-25 17:06:00   1400.0
    2011-05-25 17:12:00      NaN
    2011-05-25 17:18:00      NaN
    2011-05-25 17:24:00  14000.0

    [75 rows x 1 columns]



.. GENERATED FROM PYTHON SOURCE LINES 325-328

Notice that the number of rows changed from 5 to 75.
A better option is to backfill or forward fill the new time steps.

.. GENERATED FROM PYTHON SOURCE LINES 328-331

.. code-block:: default

    df.resample('6min').bfill(limit=1)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                               a
    2011-05-25 10:00:00      NaN
    2011-05-25 10:06:00      NaN
    2011-05-25 10:12:00      NaN
    2011-05-25 10:18:00      NaN
    2011-05-25 10:24:00      NaN
    ...                      ...
    2011-05-25 17:00:00   1400.0
    2011-05-25 17:06:00   1400.0
    2011-05-25 17:12:00      NaN
    2011-05-25 17:18:00  14000.0
    2011-05-25 17:24:00  14000.0

    [75 rows x 1 columns]



.. GENERATED FROM PYTHON SOURCE LINES 332-333

It is often even better to do a linear interpolation between the available values.

.. GENERATED FROM PYTHON SOURCE LINES 333-337

.. code-block:: default

    df.resample('6min').interpolate()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                               a
    2011-05-25 10:00:00      NaN
    2011-05-25 10:06:00      NaN
    2011-05-25 10:12:00      NaN
    2011-05-25 10:18:00      NaN
    2011-05-25 10:24:00      NaN
    ...                      ...
    2011-05-25 17:00:00      NaN
    2011-05-25 17:06:00   1400.0
    2011-05-25 17:12:00   5600.0
    2011-05-25 17:18:00   9800.0
    2011-05-25 17:24:00  14000.0

    [75 rows x 1 columns]

Note in the output that 17:00 is still NaN. Only the observations that fall exactly on
the 6-minute grid (here 17:06 and 17:24) take part in the interpolation, so the value
measured at 16:40 is not used.



.. GENERATED FROM PYTHON SOURCE LINES 338-340

Other interpolation methods, such as nearest-neighbour, can be used in the same way.

.. code-block:: default

    df.resample('6min').interpolate('nearest')

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                               a
    2011-05-25 10:00:00      NaN
    2011-05-25 10:06:00      NaN
    2011-05-25 10:12:00      NaN
    2011-05-25 10:18:00      NaN
    2011-05-25 10:24:00      NaN
    ...                      ...
    2011-05-25 17:00:00      NaN
    2011-05-25 17:06:00   1400.0
    2011-05-25 17:12:00   1400.0
    2011-05-25 17:18:00  14000.0
    2011-05-25 17:24:00  14000.0

    [75 rows x 1 columns]
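Observations that do not fall exactly on the 6-minute grid (such as the one at 16:40) can also be taken into account. One possible approach, which is an assumption of this sketch and not part of the original lesson, is to reindex onto the union of the original timestamps and the target grid, interpolate by time, and then keep only the grid points. The names ``grid``, ``combined`` and ``filled`` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"a": [np.nan, 1100, 1400, np.nan, 14000]},
    index=pd.to_datetime(["2011-05-25 10:00:00", "2011-05-25 16:40:00",
                          "2011-05-25 17:06:00", "2011-05-25 17:10:00",
                          "2011-05-25 17:24:00"]),
)

# Regular 6-minute grid spanning the observations
grid = pd.date_range(df.index.min(), df.index.max(), freq="6min")

# Keep the off-grid observations while interpolating ...
combined = df.reindex(df.index.union(grid))

# ... weight by the actual time gaps, then keep only the grid points
filled = combined.interpolate(method="time").loc[grid]

print(filled.tail())
```

With ``method="time"`` the gaps are weighted by their actual duration, so the value at 17:00 is now interpolated between the observations at 16:40 and 17:06, while rows before the first valid observation remain NaN.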



.. GENERATED FROM PYTHON SOURCE LINES 341-345

Period
=========
A Period is a span of time between two Timestamps. Therefore a Period
has ``start_time`` and ``end_time`` attributes.

.. GENERATED FROM PYTHON SOURCE LINES 345-350

.. code-block:: default


    p = pd.Period("1979-02-01")

    print(type(p))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas._libs.tslibs.period.Period'>

.. GENERATED FROM PYTHON SOURCE LINES 351-354

.. code-block:: default


    print(p.start_time)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    1979-02-01 00:00:00

.. GENERATED FROM PYTHON SOURCE LINES 355-358

.. code-block:: default


    print(p.end_time)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    1979-02-01 23:59:59.999999999

.. GENERATED FROM PYTHON SOURCE LINES 359-362

.. code-block:: default


    print(type(p.start_time))
    print(type(p.end_time))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas._libs.tslibs.timestamps.Timestamp'>
    <class 'pandas._libs.tslibs.timestamps.Timestamp'>

.. GENERATED FROM PYTHON SOURCE LINES 363-366

.. code-block:: default


    print(p.freq)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <Day>

.. GENERATED FROM PYTHON SOURCE LINES 367-371

PeriodIndex
-----------
The concept of a PeriodIndex is similar to that of a DatetimeIndex.
Just as a DatetimeIndex can be considered an array of Timestamps, a
PeriodIndex is an array of Periods.

.. GENERATED FROM PYTHON SOURCE LINES 371-376

.. code-block:: default


    pidx = pd.period_range("20110101", "20121231", freq="M")

    print(pidx)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    PeriodIndex(['2011-01', '2011-02', '2011-03', '2011-04', '2011-05', '2011-06',
                 '2011-07', '2011-08', '2011-09', '2011-10', '2011-11', '2011-12',
                 '2012-01', '2012-02', '2012-03', '2012-04', '2012-05', '2012-06',
                 '2012-07', '2012-08', '2012-09', '2012-10', '2012-11', '2012-12'],
                dtype='period[M]')

.. GENERATED FROM PYTHON SOURCE LINES 377-380

.. code-block:: default


    print(type(pidx))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas.core.indexes.period.PeriodIndex'>
.. GENERATED FROM PYTHON SOURCE LINES 381-382

Each member of the PeriodIndex array, i.e. **pidx**, is a Period.

.. GENERATED FROM PYTHON SOURCE LINES 382-385

.. code-block:: default


    print(type(pidx[0]))

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    <class 'pandas._libs.tslibs.period.Period'>

.. GENERATED FROM PYTHON SOURCE LINES 386-388

For an overview of the difference between Timestamp and PeriodIndex,
`see this `_
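To make the relation between Timestamps and Periods more concrete, the following sketch converts between the two representations. It is an illustration under assumed sample dates and is not part of the original lesson; ``to_period`` and ``to_timestamp`` are existing pandas methods.

```python
import pandas as pd

# Three daily timestamps in January 2011 (illustrative dates)
dates = pd.date_range("20110101", periods=3, freq="D")

# Map every timestamp to the monthly period that contains it
periods = dates.to_period("M")
print(periods)

# A single monthly period and its bounds
p = pd.Period("2011-01", freq="M")
print(p.start_time, p.end_time)

# A timestamp lies inside a period if it falls between the period's bounds
ts = pd.Timestamp("2011-01-15")
print(p.start_time <= ts <= p.end_time)

# Converting back gives the start of each period by default
back = periods.to_timestamp()
print(back)
```

Note that the round trip is lossy: all three days collapse to the same monthly period, and converting back yields the first day of the month for each of them.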