How do I select the subset of a pandas DatetimeIndex whose dates are not in a list?

Let's say I have idx, a pd.DatetimeIndex with one-minute frequency. I also have a list of bad dates (each a pd.Timestamp with no time information) that I want to remove from idx. How do I do that in pandas?

Comments
  • mex replied

    Use normalize to strip the time part from your index (every timestamp is set to midnight), then combine ~ with isin to keep only the rows whose date is not in the bad list. If you need to be extra safe, you can make sure the list itself carries no time part by normalizing it the same way: [x.normalize() for x in bad_dates].

    Sample data

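    import pandas as pd

    # Nine rows spaced 11 hours apart, from 2010-01-01 through 2010-01-04
    # (lowercase 'h' is the hourly alias; uppercase 'H' is deprecated in recent pandas)
    df = pd.DataFrame(range(9), index=pd.date_range('2010-01-01', freq='11h', periods=9))
    bad_dates = [pd.Timestamp('2010-01-02'), pd.Timestamp('2010-01-03')]

    # Keep rows whose normalized (midnight) index value is not in bad_dates
    df[~df.index.normalize().isin(bad_dates)]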
    
    #                     0
    #2010-01-01 00:00:00  0
    #2010-01-01 11:00:00  1
    #2010-01-01 22:00:00  2
    #2010-01-04 05:00:00  7
    #2010-01-04 16:00:00  8
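
The question asks about a bare one-minute DatetimeIndex rather than a DataFrame, and the same normalize plus isin mask should work there too. The following is a minimal sketch under that assumption; the date range, the single bad date, and the name clean_idx are made up for illustration and are not from the original post.

```python
import pandas as pd

# Hypothetical one-minute index covering three full days (illustrative data only).
idx = pd.date_range("2010-01-01", "2010-01-03 23:59", freq="min")

# Bad dates, normalized defensively in case an entry carries a time part.
bad_dates = [pd.Timestamp("2010-01-02 15:30").normalize()]

# Boolean mask: True where the timestamp's calendar date is NOT a bad date.
mask = ~idx.normalize().isin(bad_dates)

# Indexing a DatetimeIndex with a boolean array returns the filtered index.
clean_idx = idx[mask]

print(len(idx), len(clean_idx))  # 4320 2880
```

Because the mask is built from the index alone, the same array can also be used to filter a DataFrame or Series that shares this index, for example df[mask].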