Appending to a pandas df using a for loop

I am experimenting with a change point algorithm (the code requires pip install ruptures), and I am trying to build a pandas df that uses the change point algorithm to calculate, in hours per day, the rise and fall times in the data. The part I get tripped up on is how to iterate through each day of the dataset to build a "master dataframe" that I am calling master_hrs,

i.e.:

master_hrs = pd.DataFrame({
    'overnight_AM_hrs':[], 'moring_startup_hrs':[], 'moring_ramp_hrs':[], 'high_load_hrs':[], 'evening_shoulder_hrs':[], 'overnight_PM_hrs':[],
})

The change point part of my code works OK to calculate the hours as defined in the master_hrs df. What doesn't work is appending to the master df inside the for loop that starts with for idx, days in df.groupby(df.index.date):.

This is all of the code. I think my problem is in the for loop at the very end.

import ruptures as rpt
import calendar

import numpy as np
import pandas as pd
np.random.seed(11)

rows,cols = 50000,2
data = np.random.rand(rows,cols) 
tidx = pd.date_range('2019-01-01', periods=rows, freq='H') 
df = pd.DataFrame(data, columns=['Temperature','Value'], index=tidx)

def changPointDf(df):
    arr = np.array(df.Value)
    #Define Binary Segmentation search method
    model = "l2"  
    algo = rpt.Binseg(model=model).fit(arr)
    my_bkps = algo.predict(n_bkps=5)
    # getting the timestamps of the change points
    bkps_timestamps = df.iloc[[0] + my_bkps[:-1] +[-1]].index
    # computing the durations between change points
    durations = (bkps_timestamps[1:] - bkps_timestamps[:-1])
    #hours calc
    d = durations.seconds/60/60
    d_f = pd.DataFrame(d)
    df2 = d_f.T
    return df2


master_hrs = pd.DataFrame({
    'overnight_AM_hrs':[], 'moring_startup_hrs':[], 'moring_ramp_hrs':[], 'high_load_hrs':[], 'evening_shoulder_hrs':[], 'overnight_PM_hrs':[],
})


columns = master_hrs.columns
data = []

for idx, days in df.groupby(df.index.date):
    changPoint_df = changPointDf(days)
    values = changPoint_df.values.tolist()
    zipped = zip(columns, values)
    a_dictionary = dict(zipped)
    print(a_dictionary)
    data.append(a_dictionary)

When run, it prints one dictionary per day, with a whole list of values under a single column, i.e.:

{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
{'overnight_AM_hrs': [5.0, 5.0, 5.0, 5.0, 3.0]}
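A likely cause, shown as a minimal sketch: calling .values.tolist() on a one-row DataFrame returns a list containing one inner list, so zip pairs only the first column name with that whole inner list. The short column names and numbers below are made up for illustration.

```python
# Illustrative column names (not the real ones from master_hrs).
columns = ['a', 'b', 'c']

# Shape of DataFrame.values.tolist() for a one-row frame: a list of one list.
nested = [[5.0, 5.0, 3.0]]

# zip stops at the shorter input, so only 'a' is paired, with the whole inner list.
paired_nested = dict(zip(columns, nested))
print(paired_nested)

# Zipping against the inner list pairs each column with one number.
paired_flat = dict(zip(columns, nested[0]))
print(paired_flat)
```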

Hopefully this makes sense. I am trying to append to this master dataframe. Any tips are greatly appreciated.
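One possible way to build the master dataframe, as a sketch under stated assumptions: collect one flat row per day in a plain list and construct the DataFrame once after the loop, instead of appending to an empty frame. To keep the example runnable without ruptures, fake_change_point_df is a hypothetical stand-in for changPointDf that returns a fixed one-row frame; the small 72-hour df is also made up for illustration.

```python
import numpy as np
import pandas as pd

# Column names copied from master_hrs in the question (spelling kept as-is).
columns = ['overnight_AM_hrs', 'moring_startup_hrs', 'moring_ramp_hrs',
           'high_load_hrs', 'evening_shoulder_hrs', 'overnight_PM_hrs']

def fake_change_point_df(day_df):
    # Hypothetical stand-in for changPointDf: a one-row DataFrame of hours.
    return pd.DataFrame([[5.0, 5.0, 5.0, 5.0, 3.0]])

# Three days of hourly data, made up for this sketch.
tidx = pd.date_range('2019-01-01', periods=72, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({'Value': np.arange(72.0)}, index=tidx)

rows, dates = [], []
for day, day_df in df.groupby(df.index.date):
    one_row = fake_change_point_df(day_df)
    # Flatten the one-row frame so each column gets a single number.
    values = one_row.to_numpy().ravel().tolist()
    rows.append(dict(zip(columns, values)))
    dates.append(day)

# Build the frame once; columns with no value for a day are filled with NaN.
master_hrs = pd.DataFrame(rows, index=pd.Index(dates, name='date'), columns=columns)
print(master_hrs)
```

With only five durations per day, as in the printed output above, the sixth column stays NaN in this sketch. Whether that is the desired behavior depends on how many change points are expected per day.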
