.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "auto_examples/pandas/io.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note Click :ref:`here ` to download the full example code or to run this example in your browser via Binder .. rst-class:: sphx-glr-example-title .. _sphx_glr_auto_examples_pandas_io.py: ==================== 5.4 reading/writing ==================== This file describes how to read data from files and write data into files using pandas. .. important:: This lesson is still under development. .. GENERATED FROM PYTHON SOURCE LINES 14-17 .. code-block:: default import pandas as pd .. GENERATED FROM PYTHON SOURCE LINES 18-23 .. code-block:: default df = pd.read_csv("https://raw.githubusercontent.com/AtrCheema/AI4Water/master/ai4water/datasets/arg_busan.csv") type(df) .. GENERATED FROM PYTHON SOURCE LINES 24-26 .. code-block:: default df .. raw:: html
index tide_cm wat_temp_c sal_psu air_temp_c ... aac_coppml Total_otus otu_5575 otu_273 otu_94
0 6/19/2018 0:00 36.407149 19.321232 33.956058 19.780000 ... NaN NaN NaN NaN NaN
1 6/19/2018 0:30 35.562515 19.320124 33.950508 19.093333 ... NaN NaN NaN NaN NaN
2 6/19/2018 1:00 34.808016 19.319666 33.942532 18.733333 ... NaN NaN NaN NaN NaN
3 6/19/2018 1:30 30.645216 19.320406 33.931263 18.760000 ... NaN NaN NaN NaN NaN
4 6/19/2018 2:00 26.608980 19.326729 33.917961 18.633333 ... NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ...
1441 9/7/2019 22:00 -3.989912 20.990612 33.776449 23.700000 ... NaN NaN NaN NaN NaN
1442 9/7/2019 22:30 -2.807042 21.012014 33.702310 23.620000 ... NaN NaN NaN NaN NaN
1443 9/7/2019 23:00 -3.471326 20.831739 33.726177 23.666667 ... NaN NaN NaN NaN NaN
1444 9/7/2019 23:30 0.707771 21.006086 33.716274 23.633333 ... NaN NaN NaN NaN NaN
1445 9/8/2019 0:00 1.011731 20.896149 33.729773 23.600000 ... NaN NaN NaN NaN NaN

1446 rows × 26 columns



.. GENERATED FROM PYTHON SOURCE LINES 27-29 .. code-block:: default df.to_csv("arg_busan.csv") .. GENERATED FROM PYTHON SOURCE LINES 30-32 The index of df was 0,1,2,... By default, to_csv function writes the index to csv .. GENERATED FROM PYTHON SOURCE LINES 32-34 .. code-block:: default print(df.index) .. rst-class:: sphx-glr-script-out .. code-block:: none RangeIndex(start=0, stop=1446, step=1) .. GENERATED FROM PYTHON SOURCE LINES 35-37 .. code-block:: default print(df.index.name) .. rst-class:: sphx-glr-script-out .. code-block:: none None .. GENERATED FROM PYTHON SOURCE LINES 38-39 we can avoid writing the index to csv file by setting ``index=False``. .. GENERATED FROM PYTHON SOURCE LINES 39-42 .. code-block:: default df.to_csv("arg_busan.csv", index=False) .. GENERATED FROM PYTHON SOURCE LINES 43-45 we can also explicitly tell pandas what label for index to use when writing the index to csv file. .. GENERATED FROM PYTHON SOURCE LINES 45-48 .. code-block:: default df.to_csv("arg_busan.csv", index_label="index") .. GENERATED FROM PYTHON SOURCE LINES 49-50 if we want to save a dataframe to Excel file we can do it as following .. GENERATED FROM PYTHON SOURCE LINES 50-52 .. code-block:: default df.to_excel("arg_busan.xlsx") # we must have ``openpyxl`` package for that .. GENERATED FROM PYTHON SOURCE LINES 53-54 we can define the sheet name and exlcude the index as following .. GENERATED FROM PYTHON SOURCE LINES 54-57 .. code-block:: default df.to_excel("arg_busan.xlsx", index=False, sheet_name="data") .. GENERATED FROM PYTHON SOURCE LINES 58-59 to read the excel file as dataframe we can make use of read_excel function .. GENERATED FROM PYTHON SOURCE LINES 59-63 .. code-block:: default df = pd.read_excel("arg_busan.xlsx") print(df) .. rst-class:: sphx-glr-script-out .. code-block:: none index tide_cm wat_temp_c sal_psu air_temp_c ... \ 0 6/19/2018 0:00 36.407149 19.321232 33.956058 19.780000 ... 1 6/19/2018 0:30 35.562515 19.320124 33.950508 19.093333 ... 2 6/19/2018 1:00 34.808016 19.319666 33.942532 18.733333 ... 3 6/19/2018 1:30 30.645216 19.320406 33.931263 18.760000 ... 4 6/19/2018 2:00 26.608980 19.326729 33.917961 18.633333 ... ... ... ... ... ... ... ... 1441 9/7/2019 22:00 -3.989912 20.990612 33.776449 23.700000 ... 1442 9/7/2019 22:30 -2.807042 21.012014 33.702310 23.620000 ... 1443 9/7/2019 23:00 -3.471326 20.831739 33.726177 23.666667 ... 1444 9/7/2019 23:30 0.707771 21.006086 33.716274 23.633333 ... 1445 9/8/2019 0:00 1.011731 20.896149 33.729773 23.600000 ... aac_coppml Total_otus otu_5575 otu_273 otu_94 0 NaN NaN NaN NaN NaN 1 NaN NaN NaN NaN NaN 2 NaN NaN NaN NaN NaN 3 NaN NaN NaN NaN NaN 4 NaN NaN NaN NaN NaN ... ... ... ... ... ... 1441 NaN NaN NaN NaN NaN 1442 NaN NaN NaN NaN NaN 1443 NaN NaN NaN NaN NaN 1444 NaN NaN NaN NaN NaN 1445 NaN NaN NaN NaN NaN [1446 rows x 26 columns] .. GENERATED FROM PYTHON SOURCE LINES 64-65 we can tell which column should be used as index for the dataframe .. GENERATED FROM PYTHON SOURCE LINES 65-69 .. code-block:: default df = pd.read_excel("arg_busan.xlsx", index_col="index") print(df) .. rst-class:: sphx-glr-script-out .. code-block:: none tide_cm wat_temp_c sal_psu air_temp_c pcp_mm ... \ index ... 6/19/2018 0:00 36.407149 19.321232 33.956058 19.780000 0.0 ... 6/19/2018 0:30 35.562515 19.320124 33.950508 19.093333 0.0 ... 6/19/2018 1:00 34.808016 19.319666 33.942532 18.733333 0.0 ... 6/19/2018 1:30 30.645216 19.320406 33.931263 18.760000 0.0 ... 6/19/2018 2:00 26.608980 19.326729 33.917961 18.633333 0.0 ... ... ... ... ... ... ... ... 9/7/2019 22:00 -3.989912 20.990612 33.776449 23.700000 0.0 ... 9/7/2019 22:30 -2.807042 21.012014 33.702310 23.620000 0.0 ... 9/7/2019 23:00 -3.471326 20.831739 33.726177 23.666667 0.0 ... 9/7/2019 23:30 0.707771 21.006086 33.716274 23.633333 0.0 ... 9/8/2019 0:00 1.011731 20.896149 33.729773 23.600000 0.0 ... aac_coppml Total_otus otu_5575 otu_273 otu_94 index 6/19/2018 0:00 NaN NaN NaN NaN NaN 6/19/2018 0:30 NaN NaN NaN NaN NaN 6/19/2018 1:00 NaN NaN NaN NaN NaN 6/19/2018 1:30 NaN NaN NaN NaN NaN 6/19/2018 2:00 NaN NaN NaN NaN NaN ... ... ... ... ... ... 9/7/2019 22:00 NaN NaN NaN NaN NaN 9/7/2019 22:30 NaN NaN NaN NaN NaN 9/7/2019 23:00 NaN NaN NaN NaN NaN 9/7/2019 23:30 NaN NaN NaN NaN NaN 9/8/2019 0:00 NaN NaN NaN NaN NaN [1446 rows x 25 columns] .. GENERATED FROM PYTHON SOURCE LINES 70-72 .. code-block:: default print(type(df.index)) .. rst-class:: sphx-glr-script-out .. code-block:: none .. GENERATED FROM PYTHON SOURCE LINES 73-75 Although we index of dataframe is date and time but pandas does not recognize it as data and time but it recognizes it just as numbers .. GENERATED FROM PYTHON SOURCE LINES 75-80 .. code-block:: default df = pd.read_excel("arg_busan.xlsx", index_col="index", parse_dates=True) print(df) .. rst-class:: sphx-glr-script-out .. code-block:: none tide_cm wat_temp_c sal_psu air_temp_c pcp_mm \ index 2018-06-19 00:00:00 36.407149 19.321232 33.956058 19.780000 0.0 2018-06-19 00:30:00 35.562515 19.320124 33.950508 19.093333 0.0 2018-06-19 01:00:00 34.808016 19.319666 33.942532 18.733333 0.0 2018-06-19 01:30:00 30.645216 19.320406 33.931263 18.760000 0.0 2018-06-19 02:00:00 26.608980 19.326729 33.917961 18.633333 0.0 ... ... ... ... ... ... 2019-09-07 22:00:00 -3.989912 20.990612 33.776449 23.700000 0.0 2019-09-07 22:30:00 -2.807042 21.012014 33.702310 23.620000 0.0 2019-09-07 23:00:00 -3.471326 20.831739 33.726177 23.666667 0.0 2019-09-07 23:30:00 0.707771 21.006086 33.716274 23.633333 0.0 2019-09-08 00:00:00 1.011731 20.896149 33.729773 23.600000 0.0 ... aac_coppml Total_otus otu_5575 otu_273 otu_94 index ... 2018-06-19 00:00:00 ... NaN NaN NaN NaN NaN 2018-06-19 00:30:00 ... NaN NaN NaN NaN NaN 2018-06-19 01:00:00 ... NaN NaN NaN NaN NaN 2018-06-19 01:30:00 ... NaN NaN NaN NaN NaN 2018-06-19 02:00:00 ... NaN NaN NaN NaN NaN ... ... ... ... ... ... ... 2019-09-07 22:00:00 ... NaN NaN NaN NaN NaN 2019-09-07 22:30:00 ... NaN NaN NaN NaN NaN 2019-09-07 23:00:00 ... NaN NaN NaN NaN NaN 2019-09-07 23:30:00 ... NaN NaN NaN NaN NaN 2019-09-08 00:00:00 ... NaN NaN NaN NaN NaN [1446 rows x 25 columns] .. GENERATED FROM PYTHON SOURCE LINES 81-82 Now the index of dataframe is read as ``DateTimeIndex`` .. GENERATED FROM PYTHON SOURCE LINES 82-84 .. code-block:: default print(type(df.index)) #%% .. rst-class:: sphx-glr-script-out .. code-block:: none .. rst-class:: sphx-glr-timing **Total running time of the script:** ( 0 minutes 3.405 seconds) .. _sphx_glr_download_auto_examples_pandas_io.py: .. only:: html .. container:: sphx-glr-footer sphx-glr-footer-example .. container:: binder-badge .. image:: images/binder_badge_logo.svg :target: https://mybinder.org/v2/gh/AtrCheema/python-seekho/master?urlpath=lab/tree/notebooks/auto_examples/pandas/io.ipynb :alt: Launch binder :width: 150 px .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: io.py ` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: io.ipynb ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_