Using awk, is there a simple way to group fields that contain spaces?

I have a file containing data like this:

New York  100 2 17 12
California 200 10 8 3
Montana   50 25  3 0

I want the state name to be treated as a single field, then compute field 3 as a percentage of field 2, ignoring the other fields.

So I want the output to be

New York  2%
California 5%
Montana   50%

I can obtain the state name thusly: awk -F '[0-9]' '{print $1}'

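But the remaining fields are then completely unusable.

If I leave the field separator alone, New and York get separate field numbers, and every other field number is off by one.

Can I do this in awk, or should I switch to Ruby, which I know a little?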

Answers
  • You can use the last field as a reference point. Discarding the last four fields by decrementing NF requires gawk or mawk:

    $ awk '{p=$(NF-2)*100/$(NF-3); NF-=4; print ($0"\t"p"%")}' file
    New York   2%
    California 5%
    Montana    50%
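
For awk implementations where decrementing NF may not rebuild the record, the same idea can be sketched without touching NF. This is only an illustration: the function name `state_pct` is hypothetical, and it assumes every valid line ends in exactly four numeric fields. Lines with fewer than five fields, or a zero in the divisor column, are skipped.

```shell
# Hypothetical helper: everything before the last four fields is the name.
state_pct() {
    awk 'NF >= 5 && $(NF-3) != 0 {
        name = $1
        for (i = 2; i <= NF - 4; i++)   # join the leading name words
            name = name " " $i
        # $(NF-3) is the total, $(NF-2) is the part we want as a percentage.
        printf "%s\t%s%%\n", name, $(NF-2) * 100 / $(NF-3)
    }' "$@"
}

printf 'New York 100 2 17 12\nCalifornia 200 10 8 3\nMontana 50 25 3 0\n' | state_pct
```

Counting from the end of the line keeps the numeric columns at fixed positions regardless of how many words the name has.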
    

  • You can do it fairly easily in awk. The trick is to find the first field that begins with a digit, so you can accommodate names like "New York". For example:

    awk '{
        n=0; name=""
        for(i=1;i<=NF;i++)
            if($i ~ /^[0-9]/) {
                n=i; break
            }
            else
                name=name?name" "$i:$i
        print name, $(n+1)/$n*100"%"
    }' file
    

    Here the variable n captures the field number of the first field that begins with a digit: the loop walks over the fields and tests each one against /^[0-9]/. If the test is true, n is set to i and the loop is broken; otherwise the field is part of the name and is appended to name. (This assumes at least two numeric fields follow the name.)

    Putting it together with your data, you get:

    $ awk '{
    >     n=0; name=""
    >     for(i=1;i<=NF;i++)
    >         if($i ~ /^[0-9]/) {
    >             n=i; break
    >         }
    >         else
    >             name=name?name" "$i:$i
    >     print name, $(n+1)/$n*100"%"
    > }' file
    New York 2%
    California 5%
    Montana 50%
    

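  • Assuming there is always a fixed number of fields at the end, you can use that information to adjust the fields on the fly, as in the following transcript: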

    pax> echo; printf 'New York 100 2 17 12\nCalifornia 200 10 8 3\nMontana 50 25 3 0\n' | awk '
    +++> {while(NF>5){$1=$1" "$2;for(i=2;i<NF;i++){$i=$(i+1)};$NF="";NF=NF-1};print $1","$2","$3","$4}'
    
    New York,100,2,17
    California,200,10,8
    Montana,50,25,3
    

    You can see by the , separators that field 1 has been combined from the two fields New and York. Examining that script in detail:

    while (NF > 5) {                 # Loop until entire name combined into field 1.
        $1 = $1" "$2                 # Join fields 1 and 2.
        for (i = 2; i < NF; i++) {   # For every field from 2 up to the second-to-last,
            $i = $(i+1)              #     copy the following field into this one.
        }
        $NF = ""                     # Blank the now-duplicated last field.
        NF = NF - 1                  # Reduce field count.
    }
    # At this point field1 is whole name and fields 2-5 are values.
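
To get from the merged fields to the percentages the question asks for, one possible completion is sketched below. The function name `merge_name_pct` is hypothetical, the layout (name followed by exactly four numbers) is taken from the sample data, and the zero check on field 2 is an added assumption to avoid a division error.

```shell
# Hypothetical wrapper: merge name words into $1, then print $3 as a percentage of $2.
merge_name_pct() {
    awk '{
        while (NF > 5) {                 # more than name + four numbers: name has spaces
            $1 = $1 " " $2               # fold the next name word into field 1
            for (i = 2; i < NF; i++)     # shift the remaining fields left by one
                $i = $(i + 1)
            $NF = ""                     # blank the duplicated last field
            NF = NF - 1                  # and drop it from the count
        }
        if ($2 != 0)                     # skip rows that would divide by zero
            print $1, $3 / $2 * 100 "%"
    }' "$@"
}

printf 'New York 100 2 17 12\nCalifornia 200 10 8 3\nMontana 50 25 3 0\n' | merge_name_pct
```

Because the loop repeats until only five fields remain, names of three or more words are folded into field 1 the same way.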