python - Pandas DatetimeIndex vs to_datetime discrepancies


I'm trying to convert a Pandas Series of epoch timestamps to human-readable times. There are at least two obvious ways to do this: pd.DatetimeIndex and pd.to_datetime(). They seem to work in quite different ways:

    In [1]: import pandas as pd

    In [3]: nanos = pd.Series([1462282258000000000, 1462282258100000000, 1462282258200000000])

    In [4]: pd.to_datetime(nanos)
    Out[4]:
    0   2016-05-03 13:30:58.000
    1   2016-05-03 13:30:58.100
    2   2016-05-03 13:30:58.200
    dtype: datetime64[ns]

    In [5]: pd.DatetimeIndex(nanos)
    Out[5]:
    DatetimeIndex([       '2016-05-03 13:30:58', '2016-05-03 13:30:58.100000',
                   '2016-05-03 13:30:58.200000'],
                  dtype='datetime64[ns]', freq=None)

With to_datetime(), the display resolution is milliseconds, and .000 is printed on whole seconds. With DatetimeIndex, the display resolution is microseconds (which I like), but the decimal part is omitted on whole seconds.

Then I try converting the time zone:

    In [12]: pd.DatetimeIndex(nanos).tz_localize('UTC')
    Out[12]:
    DatetimeIndex([       '2016-05-03 13:30:58+00:00',
                   '2016-05-03 13:30:58.100000+00:00',
                   '2016-05-03 13:30:58.200000+00:00'],
                  dtype='datetime64[ns, UTC]', freq=None)

    In [13]: pd.to_datetime(nanos).tz_localize('UTC')
    TypeError: index is not a valid DatetimeIndex or PeriodIndex

This is strange: the timezone functions don't work with a plain datetime Series, only with a DatetimeIndex. Why would that be? The tz_localize() method exists and is documented here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.series.tz_localize.html

I've tried Pandas 0.17.0 and 0.18.1 with the same results.

I'm not trying to make an actual index, so all else being equal I would have expected to use to_datetime(), but I can't get the time zone methods to work with it.

There is one way to convert things: pd.to_datetime(). Yes, you can directly construct a DatetimeIndex, but it is restrictive on purpose, while to_datetime is quite flexible.

So to_datetime will give you an object similar to what you input: if the input is array-like you get a DatetimeIndex, and if the input is a Series you get a Series.

    In [5]: nanos = [1462282258000000000, 1462282258100000000, 1462282258200000000]

By default these are converted with unit='ns':

    In [7]: pd.to_datetime(nanos)
    Out[7]: DatetimeIndex(['2016-05-03 13:30:58', '2016-05-03 13:30:58.100000', '2016-05-03 13:30:58.200000'], dtype='datetime64[ns]', freq=None)

So one thing you can do is make a Series out of this. The index is an integer index here, and the values are datetimes.

    In [10]: s = pd.Series(pd.to_datetime(nanos))

    In [11]: s
    Out[11]:
    0   2016-05-03 13:30:58.000
    1   2016-05-03 13:30:58.100
    2   2016-05-03 13:30:58.200
    dtype: datetime64[ns]
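As a quick check of the "similar object in, similar object out" behaviour, here is a minimal sketch. It assumes a reasonably recent pandas and passes unit="ns" explicitly rather than relying on the default; the variable names idx and ser are only for illustration.

```python
import pandas as pd

nanos = [1462282258000000000, 1462282258100000000, 1462282258200000000]

# A list (array-like) goes in, a DatetimeIndex comes out.
idx = pd.to_datetime(nanos, unit="ns")

# A Series goes in, a Series comes out: the values become datetimes,
# while the index stays the default integer index.
ser = pd.to_datetime(pd.Series(nanos), unit="ns")

print(type(idx).__name__)
print(type(ser).__name__)
print(ser.index)
```

Because the Series keeps its integer index, anything that acts on the index (rather than on the values) will not see datetimes there.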

You can use the .dt accessor to operate on the values. Series.tz_localize operates on the index.

    In [12]: s.dt.tz_localize('US/Eastern')
    Out[12]:
    0          2016-05-03 13:30:58-04:00
    1   2016-05-03 13:30:58.100000-04:00
    2   2016-05-03 13:30:58.200000-04:00
    dtype: datetime64[ns, US/Eastern]
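To make the index-versus-values distinction concrete, the following sketch puts the two calls side by side. It is an illustration under assumptions rather than part of the original answer: it assumes a recent pandas with time zone data available, and the names by_time, utc_values and eastern_values are made up for the example. It also uses .dt.tz_convert, which changes the zone of values that are already tz-aware, whereas tz_localize only attaches a zone to naive values.

```python
import pandas as pd

nanos = [1462282258000000000, 1462282258100000000, 1462282258200000000]
s = pd.Series(pd.to_datetime(nanos, unit="ns"))

# Series.tz_localize targets the index. The index here is a plain integer
# index, so this raises TypeError, as seen in the question.
try:
    s.tz_localize("UTC")
    index_localize_failed = False
except TypeError:
    index_localize_failed = True

# The .dt accessor targets the values instead.
utc_values = s.dt.tz_localize("UTC")

# Once the values are tz-aware, .dt.tz_convert moves them to another zone.
eastern_values = utc_values.dt.tz_convert("US/Eastern")

# If the datetimes ARE the index, Series.tz_localize works directly.
by_time = pd.Series([1, 2, 3], index=pd.to_datetime(nanos, unit="ns"))
by_time_utc = by_time.tz_localize("UTC")

print(index_localize_failed)
print(eastern_values)
print(by_time_utc.index)
```

Note that localizing straight to 'US/Eastern', as in the session above, treats the naive times as Eastern wall-clock times. If the epoch values are really UTC, localizing to UTC first and then converting is the variant that preserves the instant.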
