Reading in time series data

In this section, we demonstrate several ways to read in time series data, starting with the simple read_csv function:

    In [7]: ibmData=pd.read_csv('ibm-common-stock-closing-prices-1959_1960.csv')
            ibmData.head()
    Out[7]:     TradeDate  closingPrice
            0  1959-06-29           445
            1  1959-06-30           448
            2  1959-07-01           450
            3  1959-07-02           447
            4  1959-07-06           451
            5 rows × 2 columns

The source of this information can be found at http://datamarket.com.
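As an alternative to converting the column afterwards, read_csv can parse dates while loading. The following is a minimal sketch, not the approach used in this section: it assumes the file has a header row with the column names TradeDate and closingPrice, and it uses an in-memory string holding the five rows shown above as a stand-in for the real file. The variable names csv_text and ibm are illustrative only.

```python
import io
import pandas as pd

# Stand-in for the CSV file: the five rows displayed by ibmData.head().
# The header names are an assumption about the file's layout.
csv_text = """TradeDate,closingPrice
1959-06-29,445
1959-06-30,448
1959-07-01,450
1959-07-02,447
1959-07-06,451
"""

# parse_dates converts the column to datetime values on load;
# index_col makes that column the index in the same step.
ibm = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=['TradeDate'],
    index_col='TradeDate',
)

print(type(ibm.index))    # the index is a DatetimeIndex
print(ibm.loc['1959-07'])  # partial-string selection of the July 1959 rows
```

With a DatetimeIndex in place, rows can be selected by a partial date string such as a year or a year and month, which is why parsing at load time is often convenient.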

We would like the TradeDate column to be a series of datetime values so that we can index it and create a time series:

  1. Let's first check the type of values in the TradeDate series:

    In [16]: type(ibmData['TradeDate'])
    Out[16]: pandas.core.series.Series
    In [12]: type(ibmData['TradeDate'][0])
    Out[12]: str

  2. Next, we convert these values to a Timestamp type:

    In [17]: ibmData['TradeDate']=pd.to_datetime(ibmData['TradeDate'])
             type(ibmData['TradeDate'][0])
    Out[17]: pandas.tslib.Timestamp

  3. We can now use the TradeDate column as an index:

    In [113]: #Convert DataFrame to TimeSeries
              #Resampling creates NaN rows for weekend dates, hence use dropna
              ibmTS=ibmData.set_index('TradeDate').resample('D')['closingPrice'].dropna()
              ibmTS
    Out[113]: TradeDate
              1959-06-29    445
              1959-06-30    448
              1959-07-01    450
              1959-07-02    447
              1959-07-06    451
              ...
              Name: closingPrice, Length: 255
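The three steps can be sketched end to end as follows. This is a hedged illustration rather than the exact session above: it builds a small DataFrame from the five rows shown earlier instead of reading the file, and it assumes a recent pandas release, where resample returns a Resampler object that needs an explicit aggregation (mean is used here) before dropna can be called. The names ibm, daily, and ibm_ts are illustrative.

```python
import pandas as pd

# Sample rows taken from the head() output shown earlier in this section.
ibm = pd.DataFrame({
    'TradeDate': ['1959-06-29', '1959-06-30', '1959-07-01',
                  '1959-07-02', '1959-07-06'],
    'closingPrice': [445, 448, 450, 447, 451],
})

# Step 1: the column initially holds plain strings.
print(type(ibm['TradeDate'][0]))

# Step 2: convert the strings to Timestamp values.
ibm['TradeDate'] = pd.to_datetime(ibm['TradeDate'])
print(type(ibm['TradeDate'][0]))

# Step 3: index by date and resample to daily frequency.
# Days without a trade become NaN after aggregation.
daily = ibm.set_index('TradeDate')['closingPrice'].resample('D').mean()

# Drop the NaN rows introduced for non-trading days.
ibm_ts = daily.dropna()
print(ibm_ts)
```

Because mean is applied to each daily bin, the resulting values are floating-point numbers even though the input prices are integers.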

In the next section, we will learn how to assign a date column as an index and then perform subsetting based on that index. There, we will use the Object Occupancy dataset, in which some room parameters were recorded every few minutes for several weeks along with the corresponding room occupancy. This dataset is split across three separate files.
