I have just started out with Pandas and I am trying to do a multilevel sorting of data by columns. I have four columns in my data: STNAME, CTYNAME, CENSUS2010POP, SUMLEV. I want to set the index of my data by columns: STNAME, CTYNAME and then sort the data by CENSUS2010POP. After I set the index the appears like in pic 1 (before sorting by CENSUS2010POP) and when I sort and the data appears like pic 2 (After sorting). You can see Indices are messy and no longer sorted serially.
I have read out a few posts including this one (Sorting a multi-index while respecting its index structure) which dates back to five years ago and does not work while I write them. I am yet to learn the group by function.
你能告诉我我可以做到这一点的方法吗?
ps:我来自会计/金融背景,对编码非常陌生。我刚刚完成了包括PY4E.com在内的两个Python课程
使用下面的代码来设置索引
census_dfq6 = census_dfq6.set_index(['STNAME','CTYNAME'])
并且,使用以下代码对数据进行排序:
census_dfq6 = census_dfq6.sort_values (by = ['CENSUS2010POP'], ascending = [False] )
我正在使用示例数据,我想共享csv文件,但是我看不到共享此文件的方法。
STNAME,CTYNAME,CENSUS2010POP,SUMLEV
Alabama,Autauga County,54571,50
Alabama,Baldwin County,182265,50
Alabama,Barbour County,27457,50
Alabama,Bibb County,22915,50
Alabama,Blount County,57322,50
Alaska,Aleutians East Borough,3141,50
Alaska,Aleutians West Census Area,5561,50
Alaska,Anchorage Municipality,291826,50
Alaska,Bethel Census Area,17013,50
Wyoming,Platte County,8667,50
Wyoming,Sheridan County,29116,50
Wyoming,Sublette County,10247,50
Wyoming,Sweetwater County,43806,50
Wyoming,Teton County,21294,50
Wyoming,Uinta County,21118,50
Wyoming,Washakie County,8533,50
Wyoming,Weston County,7208,50