AP and mAP Computation in Object Detection Explained (Full Code Walkthrough)


Author: JimmyHua

Source: https://zhuanlan.zhihu.com/p/70667071

Reposted from: CVer

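Definitions

Accuracy

✔️ Accuracy = number of correctly predicted samples / total number of samples, i.e. the fraction of samples that are predicted correctly. This counts both correctly predicted positives and correctly predicted negatives. Object detection has no notion of a correctly predicted negative, so Accuracy is not used in detection.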

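Precision

✔️ Precision measures how accurate the predictions for a given class are.

✔️ Precision is always defined with respect to a particular class; without a class it is meaningless. (Some texts say just "Precision" without naming a class because, in binary classification, Precision conventionally refers to the positive class.)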

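Recall

✔️ Like Precision, Recall is meaningless without a class: it always refers to the Recall of a specific class. Recall is the fraction of all Ground Truth samples of a class that are predicted correctly.

✍️ The denominator of Recall is the number of Ground Truth samples of the class, whereas the denominator of Precision is the number of samples predicted as that class.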
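To make the two denominators concrete, here is a minimal sketch that computes both metrics for one class from raw counts. The function name `precision_recall` and the counts are hypothetical and chosen only for illustration.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) for a single class from raw counts."""
    predicted = tp + fp  # everything the model predicted as this class
    actual = tp + fn     # every Ground Truth sample of this class
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    return precision, recall


# Hypothetical counts for one class: 8 correct detections,
# 2 spurious detections, 4 missed Ground Truth objects.
p, r = precision_recall(tp=8, fp=2, fn=4)
print(p, r)
```

With these counts Precision is 8 / 10 and Recall is 8 / 12: the numerator is the same, only the denominator differs.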

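F1 Score: the balanced F-score

The F1 score is defined as the harmonic mean of precision and recall:

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

More generally, the $F_\beta$ score is defined as

$$F_\beta = \frac{(1 + \beta^2) \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\beta^2 \cdot \mathrm{Precision} + \mathrm{Recall}}$$

The $F_2$ and $F_{0.5}$ scores are both commonly used in statistics. $F_2$ weights recall more heavily than precision, while $F_{0.5}$ does the opposite.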
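The effect of the weight can be checked numerically. The sketch below assumes the standard F-beta formula; the helper name `f_beta` and the sample precision and recall values are illustrative, not taken from the article.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score; beta > 1 favours recall, beta < 1 favours precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0  # both metrics are zero, so the score is defined as zero
    return (1 + b2) * precision * recall / denom


# Hypothetical detector with low precision but perfect recall.
precision, recall = 0.5, 1.0
f1 = f_beta(precision, recall, beta=1.0)
f2 = f_beta(precision, recall, beta=2.0)
f05 = f_beta(precision, recall, beta=0.5)
print(f05, f1, f2)
```

Because recall is the stronger of the two numbers here, the score that emphasises recall (beta = 2) comes out highest and the one that emphasises precision (beta = 0.5) comes out lowest.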

AP: Average Precision

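Plotting Recall on the horizontal axis and Precision on the vertical axis gives the PR curve. AP is defined as the area under the PR curve:

$$AP = \int_0^1 p(r)\,dr$$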

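Because the integral is awkward to compute directly, interpolation is used instead, and AP is computed as:

$$AP = \sum_{k=1}^{N} \max_{\tilde{k} \ge k} P(\tilde{k})\,\Delta r(k)$$

where $P(\tilde{k})$ is the precision at the $\tilde{k}$-th point of the PR curve and $\Delta r(k)$ is the change in recall between points $k-1$ and $k$.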

Computing the area: [figure]

Principle: [figure]
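As a rough illustration of where the points on a PR curve come from, the sketch below walks down a list of detections ranked by confidence and accumulates true and false positives. The TP/FP flags and the Ground Truth count are made-up values, and `pr_points` is a hypothetical helper name.

```python
import numpy as np


def pr_points(is_tp, num_gt):
    """Precision/recall after each detection in a confidence-ranked list."""
    is_tp = np.asarray(is_tp, dtype=float)
    tp = np.cumsum(is_tp)        # true positives seen so far
    fp = np.cumsum(1.0 - is_tp)  # false positives seen so far
    recall = tp / float(num_gt)
    precision = tp / (tp + fp)
    return precision, recall


# Hypothetical ranking: 1 = detection matched a Ground Truth box, 0 = it did not.
# There are 5 Ground Truth boxes of this class in total.
precision, recall = pr_points([1, 1, 0, 1, 0, 0, 1], num_gt=5)
for p, r in zip(precision, recall):
    print("recall={:.2f} precision={:.4f}".format(r, p))
```

Recall never decreases as more detections are admitted, while precision zigzags; the interpolation step smooths that zigzag before the area is summed.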

Code walkthrough

computer_mAP.py

from voc_eval import voc_eval
import os

mAP = []
# Compute the AP of each class
for i in range(8):
    class_name = str(i)  # the class names here are 0, 1, 2, 3, 4, 5, 6, 7
    rec, prec, ap = voc_eval('path/{}.txt', 'path/Annotations/{}.xml', 'path/test.txt', class_name, './')
    print("{} :\t {} ".format(class_name, ap))
    mAP.append(ap)

mAP = tuple(mAP)

print("***************************")
# Print the overall mAP
print("mAP :\t {}".format(float(sum(mAP) / len(mAP))))

Computing AP

import numpy as np

def voc_ap(rec, prec, use_07_metric=False):
    """ ap = voc_ap(rec, prec, [use_07_metric])
    Compute VOC AP given precision and recall.
    If use_07_metric is true, uses the
    VOC 07 11 point method (default:False).
    """
    # VOC2007 computed AP from 11 sampled recall points; this method is no longer used
    if use_07_metric:
        # 11 point metric
        ap = 0.
        for t in np.arange(0., 1.1, 0.1):
            if np.sum(rec >= t) == 0:
                p = 0
            else:
                p = np.max(prec[rec >= t])
            ap = ap + p / 11.
    else:
        # correct AP calculation
        # first append sentinel values at the end
        mrec = np.concatenate(([0.], rec, [1.]))  # [0., 0.0666, 0.1333, 0.1333, 0.4, 0.4666, 1.]
        mpre = np.concatenate(([0.], prec, [0.])) # [0., 1., 0.6666, 0.6666, 0.4285, 0.3043, 0.]

        # compute the precision envelope
        # find the breakpoints (corner points) of precision, scanning from right to left
        for i in range(mpre.size - 1, 0, -1):
            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])  # ends as [1., 1., 0.6666, 0.6666, 0.4285, 0.3043, 0.]

        # to calculate area under PR curve, look for points
        # where X axis (recall) changes value
        i = np.where(mrec[1:] != mrec[:-1])[0]  # indices where recall differs from the previous value
        print(mrec[1:], mrec[:-1])
        print(i)  # [0 1 3 4 5]

        # AP= AP1 + AP2+ AP3+ AP4
        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap

rec = np.array([0.0666, 0.1333,0.1333, 0.4, 0.4666])
prec = np.array([1., 0.6666, 0.6666, 0.4285, 0.3043])
ap = voc_ap(rec, prec)

print(ap)  # output: approximately 0.2456
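The right-to-left loop that builds the precision envelope can also be written without an explicit loop. The following is a sketch of an equivalent all-point AP, assuming NumPy's `np.maximum.accumulate`; the name `voc_ap_vectorized` is not part of the original code.

```python
import numpy as np


def voc_ap_vectorized(rec, prec):
    """All-point interpolated AP, same result as the loop-based voc_ap."""
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))
    # Running maximum taken from the right end: the precision envelope.
    mpre = np.maximum.accumulate(mpre[::-1])[::-1]
    # Indices where recall changes value.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    # Sum of (delta recall) * (envelope precision).
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


rec = np.array([0.0666, 0.1333, 0.1333, 0.4, 0.4666])
prec = np.array([1.0, 0.6666, 0.6666, 0.4285, 0.3043])
ap = voc_ap_vectorized(rec, prec)
print(round(ap, 4))
```

On the sample arrays used above this gives the same value, about 0.2456, as the loop version.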

voc_eval in detail

1. Annotation

<annotation>
    <folder>VOC2007</folder>
    <filename>009961.jpg</filename>
    <source>
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        <flickrid>334575803</flickrid>
    </source>
    <owner>
        <flickrid>dictioncanary</flickrid>
        <name>Lucy</name>
    </owner>
    <size><!--image shape-->
        <width>500</width>
        <height>374</height>
        <depth>3</depth>
    </size>
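    <segmented>0</segmented><!--whether segmentation labels exist-->
    <object>
        <name>dog</name> <!--class-->
        <pose>Unspecified</pose><!--pose of the object-->
        <truncated>0</truncated><!--whether the object is only partly visible (>15%)-->
        <difficult>0</difficult><!--whether the object is hard to recognize, mainly objects whose class can only be determined from the surrounding context; such objects are annotated but usually ignored-->
        <bndbox><!--bounding box-->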
            <xmin>69</xmin>
            <ymin>4</ymin>
            <xmax>392</xmax>
            <ymax>345</ymax>
        </bndbox>
    </object>
</annotation>
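A quick way to see which fields the evaluation actually needs is to parse a trimmed copy of this annotation from a string. This is only a sketch using the standard library's `xml.etree.ElementTree`; the trimmed XML and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed version of the annotation above, keeping only the fields used for evaluation.
xml_text = """<annotation>
    <filename>009961.jpg</filename>
    <object>
        <name>dog</name>
        <difficult>0</difficult>
        <bndbox>
            <xmin>69</xmin>
            <ymin>4</ymin>
            <xmax>392</xmax>
            <ymax>345</ymax>
        </bndbox>
    </object>
</annotation>"""

root = ET.fromstring(xml_text)
objects = []
for obj in root.findall("object"):
    box = obj.find("bndbox")
    objects.append({
        "name": obj.find("name").text,
        "difficult": int(obj.find("difficult").text),
        # Box order is [xmin, ymin, xmax, ymax].
        "bbox": [int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")],
    })

print(objects)
```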

2. Prediction

<image id> <confidence> <left> <top> <right> <bottom>

Example

class_0.txt:
000004 0.702732 89 112 516 466
000006 0.870849 373 168 488 229
000006 0.852346 407 157 500 213
000006 0.914587 2 161 55 221
000008 0.532489 175 184 232 201
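The evaluation first reads these lines and ranks them by confidence. A minimal sketch of that step, using the example rows above as in-memory strings rather than a file (the helper name `parse_detections` is hypothetical):

```python
import numpy as np

lines = [
    "000004 0.702732 89 112 516 466",
    "000006 0.870849 373 168 488 229",
    "000006 0.852346 407 157 500 213",
    "000006 0.914587 2 161 55 221",
    "000008 0.532489 175 184 232 201",
]


def parse_detections(lines):
    """Split detection lines and sort them by descending confidence."""
    rows = [ln.split() for ln in lines if ln.strip()]
    image_ids = [r[0] for r in rows]
    confidence = np.array([float(r[1]) for r in rows])
    boxes = np.array([[float(v) for v in r[2:6]] for r in rows])
    order = np.argsort(-confidence)  # highest confidence first
    return [image_ids[i] for i in order], confidence[order], boxes[order]


ids, conf, boxes = parse_detections(lines)
for image_id, score, box in zip(ids, conf, boxes):
    print(image_id, score, box)
```

Note that one image can contribute several detections; they are ranked together with the detections from every other image.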

3. Eval

# --------------------------------------------------------
# Fast/er R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Bharath Hariharan
# --------------------------------------------------------

import xml.etree.ElementTree as ET
import os
import pickle
import numpy as np

# Read the label data from an annotation file
def parse_rec(filename):
    """ Parse a PASCAL VOC xml file """
    tree = ET.parse(filename)
    objects = []
    for obj in tree.findall('object'):
        obj_struct = {}
        obj_struct['name'] = obj.find('name').text
        obj_struct['pose'] = obj.find('pose').text
        obj_struct['truncated'] = int(obj.find('truncated').text)
        obj_struct['difficult'] = int(obj.find('difficult').text)
        bbox = obj.find('bndbox')
        obj_struct['bbox'] = [int(bbox.find('xmin').text),
                              int(bbox.find('ymin').text),
                              int(bbox.find('xmax').text),
                              int(bbox.find('ymax').text)]
        objects.append(obj_struct)

    return objects

# Compute AP; see the explanation earlier in this article
def voc_ap(rec, prec, use_07_metric=False):
    """ ap = voc_ap(rec, prec, [use_07_metric])
    Compute VOC AP given precision and recall.
    If use_07_metric is true, uses the
    VOC 07 11 point method (default:False).
    """
    if use_07_metric:
        # 11 point metric
        ap = 0.
        for t in np.arange(0., 1.1, 0.1):
            if np.sum(rec >= t) == 0:
                p = 0
            else:
                p = np.max(prec[rec >= t])
            ap = ap + p / 11.
    else:
        # correct AP calculation
        # first append sentinel values at the end
        mrec = np.concatenate(([0.], rec, [1.]))
        mpre = np.concatenate(([0.], prec, [0.]))

        # compute the precision envelope
        for i in range(mpre.size - 1, 0, -1):
            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])

        # to calculate area under PR curve, look for points
        # where X axis (recall) changes value
        i = np.where(mrec[1:] != mrec[:-1])[0]

        # and sum (Delta recall) * prec
        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap

# Main function: reads the predictions and the ground truth, then computes Recall, Precision and AP
def voc_eval(detpath,
             annopath,
             imagesetfile,
             classname,
             cachedir,
             ovthresh=0.5,
             use_07_metric=False):
    """rec, prec, ap = voc_eval(detpath,
                                annopath,
                                imagesetfile,
                                classname,
                                [ovthresh],
                                [use_07_metric])
    Top level function that does the PASCAL VOC evaluation.
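    detpath: Path to detections
        detpath.format(classname) should be the path of the txt file for the class being evaluated.
    annopath: Path to annotations
        annopath.format(imagename) should be the path of the label xml file.
    imagesetfile: Text file listing the test images, one image path per line.
    classname: The class to evaluate
    cachedir: Directory used to cache the annotations
    [ovthresh]: IoU overlap threshold (default = 0.5)
    [use_07_metric]: Whether to use VOC07's 11-point AP computation (default False)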
    """
    # assumes detections are in detpath.format(classname)
    # assumes annotations are in annopath.format(imagename)
    # assumes imagesetfile is a text file with each line an image name
    # cachedir caches the annotations in a pickle file

    # first load gt (ground truth)
    if not os.path.isdir(cachedir):
        os.mkdir(cachedir)
    cachefile = os.path.join(cachedir, 'annots.pkl')
    # read list of images
    with open(imagesetfile, 'r') as f:
        lines = f.readlines()
    # names of all images, without directory or the .jpg extension
    imagenames = [os.path.basename(x.strip()).split('.jpg')[0] for x in lines]

    # if the cache file does not exist, parse the annotations and write it
    if not os.path.isfile(cachefile):
        # load annots
        recs = {}
        for i, imagename in enumerate(imagenames):
            recs[imagename] = parse_rec(annopath.format(imagename))
            if i % 100 == 0:  # progress report
                print('Reading annotation for {:d}/{:d}'.format(
                    i + 1, len(imagenames)))
        # save
        print('Saving cached annotations to {:s}'.format(cachefile))
        with open(cachefile, 'wb') as f:
            # pickle a dict: key = image name, value = list of objects parsed from its xml file
            pickle.dump(recs, f)
    else:
        # load
        with open(cachefile, 'rb') as f:
            recs = pickle.load(f)

    # from each image's annotation, extract the boxes of the requested class
    class_recs = {}  # holds the Ground Truth data
    npos = 0
    for imagename in imagenames:
        # Ground Truth objects of this class in the current image
        R = [obj for obj in recs[imagename] if obj['name'] == classname]

        bbox = np.array([x['bbox'] for x in R])
        # the difficult flag is 0/False for most objects
        difficult = np.array([x['difficult'] for x in R]).astype(bool)
        det = [False] * len(R)  # one "already detected" flag per Ground Truth box
        npos = npos + sum(~difficult)  # count the non-difficult Ground Truth boxes

        # record the Ground Truth for this image
        class_recs[imagename] = {'bbox': bbox,
                                 'difficult': difficult,
                                 'det': det}

    # read dets: the predictions for this class
    detfile = detpath.format(classname)

    with open(detfile, 'r') as f:
        lines = f.readlines()

    splitlines = [x.strip().split(' ') for x in lines]
    image_ids = [x[0].split('.')[0] for x in splitlines]  # image IDs

    confidence = np.array([float(x[1]) for x in splitlines])  # confidence scores
    BB = np.array([[float(z) for z in x[2:]] for x in splitlines])  # bounding box coordinates

    # sort the detection indices by descending confidence
    sorted_ind = np.argsort(-confidence)
    sorted_scores = np.sort(-confidence)
    BB = BB[sorted_ind, :]  # reorder the boxes from highest to lowest confidence
    image_ids = [image_ids[x] for x in sorted_ind]  # reorder the image IDs the same way

    # go down dets and mark TPs and FPs
    nd = len(image_ids) 

    tp = np.zeros(nd)
    fp = np.zeros(nd)
    for d in range(nd):
        R = class_recs[image_ids[d]]  # ground truth for this detection's image

        # 1. If the predictions are already (x_min, y_min, x_max, y_max), the
        #    top, left, bottom, right conversion below is not needed.
        # 2. If the predictions are (x_center, y_center, w, h), the conversion is required.
        # 3. The overlap computation only works on [left, top, right, bottom],
        #    matching the label's [x_min, y_min, x_max, y_max].
        bb = BB[d, :].astype(float)

        # convert (x_center, y_center, w, h) to (x_min, y_min, x_max, y_max)
        top = int(bb[1]-bb[3]/2)
        left = int(bb[0]-bb[2]/2)
        bottom = int(bb[1]+bb[3]/2)
        right = int(bb[0]+bb[2]/2)
        bb = [left, top, right, bottom]

        ovmax = -np.inf  # negative infinity
        BBGT = R['bbox'].astype(float)

        if BBGT.size > 0:
            # compute overlaps
            # intersection
            ixmin = np.maximum(BBGT[:, 0], bb[0])
            iymin = np.maximum(BBGT[:, 1], bb[1])
            ixmax = np.minimum(BBGT[:, 2], bb[2])
            iymax = np.minimum(BBGT[:, 3], bb[3])
            iw = np.maximum(ixmax - ixmin + 1., 0.)
            ih = np.maximum(iymax - iymin + 1., 0.)
            inters = iw * ih

            # union
            uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) +
                   (BBGT[:, 2] - BBGT[:, 0] + 1.) *
                   (BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)

            overlaps = inters / uni
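            ovmax = np.max(overlaps)  # largest overlap
            jmax = np.argmax(overlaps)  # index of the gt box with the largest overlap
        # mark the detection as a TP or an FP
        if ovmax > ovthresh:
            if not R['difficult'][jmax]:
                # once a gt box is marked as detected, a later detection that also overlaps it
                # above the threshold does not count as another detected object
                if not R['det'][jmax]:
                    tp[d] = 1.
                    R['det'][jmax] = 1  # mark as detected
                else:
                    fp[d] = 1.
        else:
            fp[d] = 1.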
        print("**************")

    # compute precision recall
    fp = np.cumsum(fp)  # np.cumsum() computes the cumulative (running) sum
    tp = np.cumsum(tp)
    rec = tp / float(npos)

    # avoid divide by zero in case the first detection matches a difficult
    # ground truth
    # np.finfo(np.float64).eps is machine epsilon, a tiny positive number
    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
    ap = voc_ap(rec, prec, use_07_metric)

    return rec, prec, ap
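The overlap test at the heart of the loop can be tried in isolation. Below is a sketch of the same IoU arithmetic (including the `+ 1` pixel convention used above) wrapped in a hypothetical helper `iou_one_to_many`; the boxes are made-up values in [x_min, y_min, x_max, y_max] order.

```python
import numpy as np


def iou_one_to_many(box, gt_boxes):
    """IoU between one box and each row of gt_boxes, all as [xmin, ymin, xmax, ymax]."""
    gt_boxes = np.asarray(gt_boxes, dtype=float)
    # Intersection rectangle.
    ixmin = np.maximum(gt_boxes[:, 0], box[0])
    iymin = np.maximum(gt_boxes[:, 1], box[1])
    ixmax = np.minimum(gt_boxes[:, 2], box[2])
    iymax = np.minimum(gt_boxes[:, 3], box[3])
    iw = np.maximum(ixmax - ixmin + 1.0, 0.0)  # zero width when the boxes do not overlap
    ih = np.maximum(iymax - iymin + 1.0, 0.0)
    inter = iw * ih
    # Union = area(box) + area(gt) - intersection.
    area_box = (box[2] - box[0] + 1.0) * (box[3] - box[1] + 1.0)
    area_gt = (gt_boxes[:, 2] - gt_boxes[:, 0] + 1.0) * (gt_boxes[:, 3] - gt_boxes[:, 1] + 1.0)
    return inter / (area_box + area_gt - inter)


det = [0, 0, 9, 9]            # hypothetical 10 x 10 detection
gts = [[0, 0, 9, 9],          # identical box
       [5, 0, 14, 9],         # half overlapping
       [20, 20, 29, 29]]      # disjoint
overlaps = iou_one_to_many(det, gts)
print(overlaps)
```

With the default `ovthresh` of 0.5, only the first Ground Truth box would let this detection count as a true positive.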

References

- The evaluation function eval.py

https://www.cnblogs.com/JZ-Ser/articles/7846399.html

- voc_eval.py explained

https://blog.csdn.net/shawncheer/article/details/78317711
