I am working on the Metro Interstate Traffic Volume data set (available here: http://archive.ics.uci.edu/ml/datasets/Metro+Interstate+Traffic+Volume) but I can't resample the dataset to show the average traffic volume per day instead of showing it per hour.
import pandas as pd

metro = pd.read_csv('Metro_Interstate_Traffic_Volume.csv')
metro['date_time'] = pd.to_datetime(metro['date_time'], format='%Y-%m-%d %H:%M:%S')
metro.set_index('date_time', inplace=True, drop=True)
metro.resample('1Y').mean()
This is what I get:
                     holiday    temp  ...     weather_description  traffic_volume
date_time                             ...
2012-10-02 09:00:00     None  288.28  ...        scattered clouds            5545
2012-10-02 10:00:00     None  289.36  ...           broken clouds            4516
2012-10-02 11:00:00     None  289.58  ...         overcast clouds            4767
2012-10-02 12:00:00     None  290.13  ...         overcast clouds            5026
2012-10-02 13:00:00     None  291.14  ...           broken clouds            4918
...                      ...     ...  ...                     ...             ...
2018-09-30 19:00:00     None  283.45  ...           broken clouds            3543
2018-09-30 20:00:00     None  282.76  ...         overcast clouds            2781
2018-09-30 21:00:00     None  282.73  ...  proximity thunderstorm            2159
2018-09-30 22:00:00     None  282.09  ...         overcast clouds            1450
2018-09-30 23:00:00     None  282.12  ...         overcast clouds             954

[48204 rows x 8 columns]
Do you have any ideas on how to solve this?
You can try creating a year column:
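A minimal sketch of the year-column idea, assuming a small made-up frame in place of the real CSV (the sample rows and the yearly_mean name are illustrative, not from the data set). It derives the year from date_time, then averages traffic_volume within each year:

```python
import pandas as pd

# Made-up hourly sample standing in for the real CSV (assumption for illustration).
metro = pd.DataFrame(
    {
        "date_time": pd.to_datetime(
            ["2012-12-31 22:00:00", "2012-12-31 23:00:00",
             "2013-01-01 00:00:00", "2013-01-01 01:00:00"]
        ),
        "weather_main": ["Clouds", "Clouds", "Clear", "Clear"],
        "traffic_volume": [1000, 2000, 3000, 5000],
    }
)

# Add a helper column holding the calendar year of each timestamp.
metro["year"] = metro["date_time"].dt.year

# Average only the numeric column of interest, one value per year.
yearly_mean = metro.groupby("year")["traffic_volume"].mean()
print(yearly_mean)
```

Selecting traffic_volume before calling mean() keeps text columns such as weather_main out of the aggregation, which recent pandas versions would otherwise reject.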
One year is not a fixed time period: some years have 365 days, some have 366. You can use groupby:
Output:
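As a hedged sketch of the groupby idea, the following groups on the calendar year of the datetime index. The sample rows and the names by_year and by_day are made up for illustration and are not the real data. A per-day average via resample with the 'D' alias is included because the question asks for daily values:

```python
import pandas as pd

# Made-up hourly sample indexed by timestamp (assumption; not the real data).
idx = pd.to_datetime(
    ["2016-02-28 10:00:00", "2016-02-28 11:00:00",
     "2016-02-29 10:00:00", "2017-03-01 10:00:00"]
)
metro = pd.DataFrame(
    {"holiday": ["None"] * 4, "traffic_volume": [4000, 5000, 3000, 6000]},
    index=idx,
)
metro.index.name = "date_time"

# Group on the calendar year of the index; pick the numeric column first so
# text columns such as holiday are not passed to mean().
by_year = metro["traffic_volume"].groupby(metro.index.year).mean()

# Per-day average: resample the same column into calendar-day bins.
by_day = metro["traffic_volume"].resample("D").mean()

print(by_year)
print(by_day.dropna())  # days with no readings come back as NaN
```

Both calls return a new object rather than modifying metro, so the result has to be assigned or printed to be seen.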