Getting a frequency table for a column of lists

Suppose I have a DataFrame in which one column contains lists.

import pandas as pd

df = pd.DataFrame({'A': [['a', 'b', 'c'], ['b'], ['c'], ['a', 'b']]})

which prints as

Index  A
0      ['a', 'b', 'c']
1      ['b']
2      ['c']
3      ['a', 'b']

How can I get a frequency table showing how often each list occurs in that column?

The ideal output would look like

A               Count
['a', 'b', 'c'] 1
['b']           1
['c']           1
['a', 'b']      1

Trying something like this...

df.A.value_counts()

raises an error

TypeError: unhashable type: 'list'
Comments
  • ut_et replied

    Map the lists to tuples; as the error suggests, lists are not hashable:

    df.A.map(tuple).value_counts().rename_axis('A').reset_index(name='Count')
    
               A  Count
    0  (a, b, c)      1
    1     (a, b)      1
    2       (b,)      1
    3       (c,)      1
    
  • yut replied

    You can also use apply to convert the lists into tuples:

    In [422]: df.A.apply(tuple).value_counts()
    Out[422]: 
    (a, b)       1
    (c,)         1
    (a, b, c)    1
    (b,)         1
    Name: A, dtype: int64
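
Putting the two replies together, here is a minimal end-to-end sketch of the tuple approach. It assumes the same example df as in the question. The final step, which turns the tuples back into lists so the table resembles the requested output, is an addition for illustration and not part of either reply.

```python
import pandas as pd

df = pd.DataFrame({'A': [['a', 'b', 'c'], ['b'], ['c'], ['a', 'b']]})

# Lists are mutable and therefore unhashable, so value_counts() cannot
# group them directly. Tuples are hashable, so convert first.
counts = (
    df['A']
    .map(tuple)
    .value_counts()
    .rename_axis('A')
    .reset_index(name='Count')
)

# Optional (illustrative): convert the tuples back to lists for display.
counts['A'] = counts['A'].map(list)

print(counts)
```

Note that tuples compare by position, so ['a', 'b'] and ['b', 'a'] would be counted as different values. If element order should not matter for your data, one possible variation is to sort before converting, for example df['A'].map(lambda x: tuple(sorted(x))).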