Replacing values in a pandas Series

I have a pandas Series with the following values:

Bachelors Degree         639
Diploma                  291
O - Level                264
Masters Degree           149
Certificate              126
A - Level                 69
PGD                       40
Bachelors Degree          28
A-Level                   20
O-Level                   15
Masters                   10
Bachelors                  6
diploma                    5
certificate                5
Ph.D                       4
A- Level                   2
Post Graduate Diploma      1
Msc Environment            1
BBA                        1
O- Level                   1
Masters                    1
PhD                        1

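I got the data from Excel.

I want to do the data cleaning in pandas, for example by replacing every variant of a master's degree with the single value Master's Degree (I could do this in Excel, but I am learning pandas).

I tried: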

mapp={"Bachelor's Degree":["Bachelors Degree","Bachelors","BBA","Bachelors Degree"],
      "Ordinary Diploma":"diploma",
      "Ordinary Level":["O - Level","O-Level","O- Level"],
      "Master's Degree":["Masters Degree","Masters","Msc Environment","Masters"],
      "Certificate":"certificate",
      "Advanced Level":["A - Level","A-Level","- Level"],
      "Post Graduate Diploma":["Post Graduate Diploma","PGD"],
      "PHD":["Ph.D","PhD"]    
     }
df['EDUCATION_LEVEL']=df['EDUCATION_LEVEL'].map(mapp)

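This returns a result only for the Certificate key, which has a single value.

It seems I cannot use lists as the values in the dictionary.

Any suggestions on how to replace these values would be appreciated. Ronald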

Comments

    One idea is to convert the single-element values to one-element lists, e.g. "diploma" to ["diploma"]:

    mapp={"Bachelor's Degree":["Bachelors Degree","Bachelors","BBA","Bachelors Degree"],
          "Ordinary Diploma":["diploma"],
          "Ordinary Level":["O - Level","O-Level","O- Level"],
          "Master's Degree":["Masters Degree","Masters","Msc Environment","Masters"],
          "Certificate":["certificate"],
          "Advanced Level":["A - Level","A-Level","A- Level"],
          "Post Graduate Diploma":["Post Graduate Diploma","PGD"],
          "PHD":["Ph.D","PhD"]    
         }
    
    # invert the dict so that each raw spelling becomes a key pointing to its clean label
    # http://stackoverflow.com/a/31674731/2901002
    d = {k: oldk for oldk, oldv in mapp.items() for k in oldv}
    
    df['EDUCATION_LEVEL']=df['EDUCATION_LEVEL'].map(d)
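
To see why the inversion matters: Series.map looks up each value of the Series as a dictionary key, so the dictionary has to go from raw spelling to clean label, not the other way round. The following is a minimal sketch on a made-up four-row Series (the sample data and the shortened mapp are assumptions for illustration, not the real column):

```python
import pandas as pd

# Hypothetical sample standing in for df['EDUCATION_LEVEL'].
s = pd.Series(["Masters", "Msc Environment", "PhD", "diploma"])

# Original orientation: clean label -> list of raw spellings.
mapp = {
    "Master's Degree": ["Masters", "Msc Environment"],
    "PHD": ["Ph.D", "PhD"],
    "Ordinary Diploma": ["diploma"],
}

# None of the raw spellings is a key of mapp, so every lookup misses.
before = s.map(mapp)

# Inverted orientation: raw spelling -> clean label.
d = {raw: clean for clean, raws in mapp.items() for raw in raws}
after = s.map(d)

print(before.isna().all())  # all lookups missed
print(after.tolist())
```

With the inverted dictionary, every raw spelling in the sample resolves to its clean label.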
    

    If that is not possible, use:

    mapp={"Bachelor's Degree":["Bachelors Degree","Bachelors","BBA","Bachelors Degree"],
          "Ordinary Diploma":"diploma",
          "Ordinary Level":["O - Level","O-Level","O- Level"],
          "Master's Degree":["Masters Degree","Masters","Msc Environment","Masters"],
          "Certificate":"certificate",
          "Advanced Level":["A - Level","A-Level","A- Level"],
          "Post Graduate Diploma":["Post Graduate Diploma","PGD"],
          "PHD":["Ph.D","PhD"]    
         }
    
    d = {}
    for k, v in mapp.items():
        if isinstance(v, list):
            for x in v:
                d[x] = k
        else:
            d[v] = k
    
    
    df['EDUCATION_LEVEL']=df['EDUCATION_LEVEL'].map(d)
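
One caveat with map: any value that is not a key of d (for example "Diploma" or "Certificate" with a capital letter, which the mapping above does not list) comes back as NaN. A possible way around this, sketched below on made-up data, is to fall back to the original value with fillna. The str.strip() call is an assumption: spellings that appear twice in the counts may differ only by stray whitespace picked up from Excel, but that has not been verified here.

```python
import pandas as pd

# Hypothetical sample; the trailing space in "Masters " is an assumed
# cause of duplicate-looking entries, not something confirmed by the data.
df = pd.DataFrame(
    {"EDUCATION_LEVEL": ["Masters", "Masters ", "Diploma", "PGD", "diploma"]}
)

# Shortened raw spelling -> clean label dictionary (same shape as d above).
d = {
    "Masters": "Master's Degree",
    "PGD": "Post Graduate Diploma",
    "diploma": "Ordinary Diploma",
}

# Normalise whitespace first, then map; unmatched values keep their
# stripped original text instead of becoming NaN.
col = df["EDUCATION_LEVEL"].str.strip()
df["EDUCATION_LEVEL"] = col.map(d).fillna(col)

print(df["EDUCATION_LEVEL"].tolist())
```

Here "Diploma" passes through unchanged because it is not a key of d, so it is easy to spot which spellings still need an entry in the mapping.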