Explode a pandas DataFrame where the elements are strings, not lists

I have:

import pandas as pd

a = pd.DataFrame([{'var1': 'a,b,c', 'var2': 1},
                  {'var1': 'd,e,f', 'var2': 2}])
a.explode('var1')

    var1    var2
0   a,b,c   1
1   d,e,f   2

I want:

    var1    var2
0   a   1
0   b   1
0   c   1
1   d   2
1   e   2
1   f   2

Using explode works when the var1 elements are lists:

b = pd.DataFrame([{'var1': ['a','b','c'], 'var2': 1},
               {'var1': ['d','e','f'], 'var2': 2}])
b.explode('var1')

I cannot get this to work where the elements in a['var1'] are strings.

type(a.var1[0]) --> str
type(b.var1[0]) --> list

Any suggestions?

Comments
  • 甜小妞 replied:

    If the values are comma-separated strings, use Series.str.split to turn them into lists, assign the result back to the same column with DataFrame.assign, and then call explode:

    a = pd.DataFrame([{'var1': 'a,b,c', 'var2': 1},
                      {'var1': 'd,e,f', 'var2': 2}])
    
    # Split each string into a list, explode it, then renumber the rows
    df = (a.assign(var1=a['var1'].str.split(','))
           .explode('var1')
           .reset_index(drop=True))
    print(df)
      var1  var2
    0    a     1
    1    b     1
    2    c     1
    3    d     2
    4    e     2
    5    f     2
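
As a small extension of the answer above, here is a sketch for the case where the comma-separated strings contain stray spaces around the items. That is an assumption, since the data in the question has none, and the sample values below are made up for illustration. It reuses the same split and explode steps and then strips whitespace from each piece with Series.str.strip:

```python
import pandas as pd

# Hypothetical input: same shape as the question's data, but with stray spaces
a = pd.DataFrame([{'var1': 'a, b,c', 'var2': 1},
                  {'var1': 'd,e , f', 'var2': 2}])

df = (
    a.assign(var1=a['var1'].str.split(','))  # strings -> lists
     .explode('var1')                        # one row per list item
     .reset_index(drop=True)                 # replace the repeated index 0,0,0,1,1,1
)

# Remove leading/trailing whitespace left over from the split
df['var1'] = df['var1'].str.strip()
print(df)
```

Without the strip step, values such as ' b' and 'e ' would keep their spaces and would not compare equal to 'b' and 'e' later on.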