I have a DataFrame with a column full of strings, like this:
1 Janus Henderson Research Fund Class N
2 Calvert Equity Fund Class A
3 Invesco Diversified Dividend Fund R5 Class
4 Prudential Day One 2035 Fund Class R3
5 TETON Convertible Securities Fund Class C
...
24991 BlackRock Asian Dragon Fund,Inc.Class R
24993 MFS Blended Research International Equity Fund...
24994 ClearBridge Small Cap Fund Class A
24995 Federated Equity Income Fund, Inc. Class A Shares
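I am looking to extract the class from each row. For example, row 1 would give class N, row 2 would give class A, and so on. Some rows do not contain the word "Class" at all, and for those I want NA. There are also some rows where the class label comes before the word "Class". How can I extract this? Any guidance is appreciated.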
You can write a parsing function that receives the string from one row, looks for the word "Class", and returns whatever comes right after it. This function could use
txt.split('Class')
for example. Once you have written this function, you can use apply()
(a DataFrame/Series method) to apply it to every row separately.
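Here is a minimal sketch of such a parsing function. The name extract_class is made up for illustration, and the column names in the final comment are assumptions, since the question does not say what the column is called. It uses re.split with a word boundary instead of a plain txt.split('Class'), so that a word such as "Classic" is not split. When nothing follows "Class", it falls back to the token just before it. That fallback is a heuristic to cover rows like "R5 Class" and may need adjusting for your data.

```python
import re

def extract_class(txt):
    """Return the share-class label next to the word 'Class', or None."""
    # Split once on the standalone word "Class" (case-sensitive by assumption).
    parts = re.split(r"\bClass\b", txt, maxsplit=1)
    if len(parts) == 1:
        # The word "Class" does not appear in this row.
        return None
    before, after = parts
    after_tokens = after.split()
    if after_tokens:
        # Usual case: the label follows "Class", e.g. "... Fund Class N".
        return after_tokens[0]
    # Fallback (heuristic): the label precedes "Class", e.g. "... Fund R5 Class".
    before_tokens = before.split()
    return before_tokens[-1] if before_tokens else None

# Hypothetical usage with pandas, assuming the strings live in df["name"]:
#     df["class"] = df["name"].apply(extract_class)
```

Rows without "Class" come back as None, which pandas treats as missing (isna() is True for them).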