AP and mAP Computation in Object Detection Explained (Full Code Walkthrough)


Author: JimmyHua
Source: https://zhuanlan.zhihu.com/p/70667071
Reposted from: CVer




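Definitions

Accuracy

✔️ Accuracy = number of correctly predicted samples / total number of samples, i.e. the fraction of samples predicted correctly (counting both correctly predicted positives and correctly predicted negatives). Object detection has no notion of a correctly predicted negative, so Accuracy is not used there.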

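Precision

✔️ Precision measures how accurate the predictions for a given class are.

✔️ Precision is defined per class; without a stated class it is meaningless. (Some texts say just "Precision" without naming a class because, in binary classification, Precision conventionally refers to the positive class.)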

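Recall

✔️ Like Precision, Recall is meaningless without a class: it always refers to the Recall of one particular class. Recall is the ratio of correctly predicted samples of a class to all Ground Truth samples of that class.

✍️ The denominator of Recall is the number of Ground Truth samples of the class, whereas the denominator of Precision is the number of samples predicted as that class.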
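The difference between the two denominators can be sketched in a few lines. This is a minimal illustration, not code from the original article: the function name `precision_recall` and the counts are assumptions chosen for the example.

```python
def precision_recall(num_tp, num_fp, num_gt):
    """Per-class precision and recall from raw counts (illustrative helper).

    num_tp: detections of the class matched to a ground-truth box
    num_fp: detections of the class with no matching ground-truth box
    num_gt: ground-truth boxes of the class
    """
    num_pred = num_tp + num_fp  # everything predicted as this class
    precision = num_tp / num_pred if num_pred > 0 else 0.0
    recall = num_tp / num_gt if num_gt > 0 else 0.0
    return precision, recall


# Hypothetical class: 8 detections (6 correct) against 10 ground-truth boxes
p, r = precision_recall(num_tp=6, num_fp=2, num_gt=10)
print(p, r)  # 0.75 0.6
```

Precision divides by the 8 predictions, Recall by the 10 ground-truth boxes, so the same 6 true positives give two different ratios.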

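F1 Score

The F1 score is defined as the harmonic mean of Precision and Recall:

$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

More generally, one defines the $F_\beta$ score:

$$F_\beta = \frac{(1 + \beta^2) \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\beta^2 \cdot \mathrm{Precision} + \mathrm{Recall}}$$

The $F_2$ and $F_{0.5}$ scores are commonly used in statistics. $F_2$ weights Recall more heavily than Precision, while $F_{0.5}$ does the opposite.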
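As a quick numeric check of how beta shifts the balance, the sketch below evaluates the F-beta score, (1 + beta^2) * P * R / (beta^2 * P + R), for one precision/recall pair. The function name `f_beta` and the input values are assumptions for illustration only.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        return 0.0  # both precision and recall are zero
    return (1 + b2) * precision * recall / denom


p, r = 0.75, 0.6  # hypothetical precision and recall
print(round(f_beta(p, r, 1.0), 4))  # 0.6667
print(round(f_beta(p, r, 2.0), 4))  # 0.625
print(round(f_beta(p, r, 0.5), 4))  # 0.7143
```

With beta = 2 the score moves toward the Recall value (0.6), and with beta = 0.5 it moves toward the Precision value (0.75).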

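AP: Average Precision

Plotting Recall on the horizontal axis and Precision on the vertical axis gives a PR curve. AP is defined as the area under the PR curve:

$$AP = \int_0^1 p(r)\,dr$$

[Figure: PR curve]

Because the integral is awkward to evaluate directly, interpolation is used instead, and AP is computed as:

$$AP = \sum_{k=1}^{N} \max_{\tilde{k} \ge k} P(\tilde{k}) \, \Delta r(k)$$

where $N$ is the number of points on the curve, $P(\tilde{k})$ is the precision at point $\tilde{k}$, and $\Delta r(k)$ is the change in recall between points $k-1$ and $k$.

[Figure: computing the area]

[Figure: principle of the interpolation]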
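A small worked example may make the interpolation concrete. The numbers are made up for illustration (they do not come from the original article): assume 4 ground-truth boxes and four ranked detections that turn out to be TP, FP, TP, TP. Each recall step is multiplied by the highest precision found at that recall or any higher recall.

```latex
% Illustrative numbers: ranked detections TP, FP, TP, TP with 4 ground-truth boxes
\begin{aligned}
(r_k,\ p_k) &: (0.25,\ 1),\ (0.25,\ 0.5),\ (0.5,\ 0.667),\ (0.75,\ 0.75) \\
\hat{p}(r) &= \max_{r' \ge r} p(r')
  \;\Rightarrow\; \hat{p}(0.25) = 1,\quad \hat{p}(0.5) = 0.75,\quad \hat{p}(0.75) = 0.75 \\
AP &= (0.25 - 0)\cdot 1 + (0.5 - 0.25)\cdot 0.75 + (0.75 - 0.5)\cdot 0.75 + (1 - 0.75)\cdot 0 \\
   &= 0.25 + 0.1875 + 0.1875 = 0.625
\end{aligned}
```

The raw precision at recall 0.5 is 0.667, but the interpolated value is 0.75 because a later point reaches higher precision at higher recall. The last term is zero because no detection reaches recall above 0.75.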

Code Walkthrough

computer_mAP.py

from voc_eval import voc_eval
import os

mAP = []
# Compute the AP of each class
for i in range(8):
    class_name = str(i)  # the class names here are 0, 1, 2, 3, 4, 5, 6, 7
    rec, prec, ap = voc_eval('path/{}.txt', 'path/Annotations/{}.xml', 'path/test.txt', class_name, './')
    print("{} : {} ".format(class_name, ap))
    mAP.append(ap)

mAP = tuple(mAP)

print("***************************")
# Print the overall mAP (the mean of the per-class APs)
print("mAP : {}".format(float(sum(mAP) / len(mAP))))

AP Computation

import numpy as np

def voc_ap(rec, prec, use_07_metric=False):
    """ ap = voc_ap(rec, prec, [use_07_metric])
    Compute VOC AP given precision and recall.
    If use_07_metric is true, uses the
    VOC 07 11 point method (default:False).
    """

    # VOC 2007 computed AP from 11 recall points; this is no longer the default
    if use_07_metric:
        # 11 point metric
        ap = 0.
        for t in np.arange(0., 1.1, 0.1):
            if np.sum(rec >= t) == 0:
                p = 0
            else:
                p = np.max(prec[rec >= t])
            ap = ap + p / 11.
    else:
        # correct AP calculation
        # first append sentinel values at the end
        mrec = np.concatenate(([0.], rec, [1.]))   # [0., 0.0666, 0.1333, 0.1333, 0.4, 0.4666, 1.]
        mpre = np.concatenate(([0.], prec, [0.]))  # [0., 1., 0.6666, 0.6666, 0.4285, 0.3043, 0.]

        # compute the precision envelope:
        # sweep right to left so precision never increases with recall
        for i in range(mpre.size - 1, 0, -1):
            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])
        # mpre is now [1., 1., 0.6666, 0.6666, 0.4285, 0.3043, 0.]

        # to calculate area under PR curve, look for points
        # where X axis (recall) changes value
        i = np.where(mrec[1:] != mrec[:-1])[0]  # indices where recall differs from the next value
        print(mrec[1:], mrec[:-1])
        print(i)  # [0 1 3 4 5]

        # AP = AP1 + AP2 + AP3 + AP4 (the fifth term, up to recall 1, has precision 0)
        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap

rec = np.array([0.0666, 0.1333, 0.1333, 0.4, 0.4666])
prec = np.array([1., 0.6666, 0.6666, 0.4285, 0.3043])
ap = voc_ap(rec, prec)

print(ap)  # prints approximately 0.2456
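The right-to-left loop that builds the precision envelope can also be written as a running maximum. The sketch below is an alternative formulation, not the original author's code; the name `voc_ap_vectorized` is an assumption, and it covers only the all-point (non-VOC07) branch. It uses NumPy's `np.maximum.accumulate` on the reversed array.

```python
import numpy as np


def voc_ap_vectorized(rec, prec):
    """All-point interpolated AP using a running maximum for the envelope."""
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))
    # Running maximum from right to left: each entry becomes the best
    # precision at this recall or any higher recall.
    mpre = np.maximum.accumulate(mpre[::-1])[::-1]
    # Indices where recall changes value
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    # Sum of (recall step) * (interpolated precision)
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


rec = np.array([0.0666, 0.1333, 0.1333, 0.4, 0.4666])
prec = np.array([1.0, 0.6666, 0.6666, 0.4285, 0.3043])
ap = voc_ap_vectorized(rec, prec)
print(round(ap, 4))  # 0.2456
```

Written out, the sum is 0.0666 * 1 + 0.0667 * 0.6666 + 0.2667 * 0.4285 + 0.0666 * 0.3043 + 0.5334 * 0, which is about 0.2456.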

voc_eval Explained

1. Annotation

<annotation>
    <folder>VOC2007</folder>
    <filename>009961.jpg</filename>
    <source>
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        <flickrid>334575803</flickrid>
    </source>
    <owner>
        <flickrid>dictioncanary</flickrid>
        <name>Lucy</name>
    </owner>
    <size><!--image shape-->
        <width>500</width>
        <height>374</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented><!--whether segmentation labels exist-->
    <object>
        <name>dog</name> <!--class-->
        <pose>Unspecified</pose><!--pose of the object-->
        <truncated>0</truncated><!--whether the object is only partly visible (>15%)-->
        <difficult>0</difficult><!--whether the object is hard to recognize, mainly objects whose class can only be judged from the background context. They are annotated but usually ignored-->
        <bndbox><!--bounding box-->
            <xmin>69</xmin>
            <ymin>4</ymin>
            <xmax>392</xmax>
            <ymax>345</ymax>
        </bndbox>
    </object>
</annotation>
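To show what the evaluation code extracts from such a file, here is a minimal sketch that parses a trimmed copy of the annotation from an in-memory string with the standard library's `xml.etree.ElementTree`. Reading from a string (instead of a file) and the reduced set of fields are simplifications made for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the annotation above, kept in a string for the example
xml_text = """<annotation>
  <object>
    <name>dog</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox><xmin>69</xmin><ymin>4</ymin><xmax>392</xmax><ymax>345</ymax></bndbox>
  </object>
</annotation>"""

root = ET.fromstring(xml_text)
objects = []
for obj in root.findall("object"):
    box = obj.find("bndbox")
    objects.append({
        "name": obj.find("name").text,
        "difficult": int(obj.find("difficult").text),
        # [xmin, ymin, xmax, ymax], the order the IoU computation expects
        "bbox": [int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")],
    })

print(objects)  # [{'name': 'dog', 'difficult': 0, 'bbox': [69, 4, 392, 345]}]
```

Each `<object>` element becomes one dictionary, so an image with several objects yields a list with several entries.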

2. Prediction

<image id> <confidence> <left> <top> <right> <bottom>
Example
class_0.txt:
000004 0.702732 89 112 516 466
000006 0.870849 373 168 488 229
000006 0.852346 407 157 500 213
000006 0.914587 2 161 55 221
000008 0.532489 175 184 232 201
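The per-class detection file is read line by line and then ranked by confidence, since the PR curve is built by walking down the detections from the most to the least confident. A minimal sketch of that step, using the example lines above held in a list rather than read from disk (an assumption made to keep the example self-contained):

```python
import numpy as np

lines = [
    "000004 0.702732 89 112 516 466",
    "000006 0.870849 373 168 488 229",
    "000006 0.852346 407 157 500 213",
    "000006 0.914587 2 161 55 221",
    "000008 0.532489 175 184 232 201",
]

fields = [line.split() for line in lines]
image_ids = [f[0] for f in fields]
confidence = np.array([float(f[1]) for f in fields])
boxes = np.array([[float(v) for v in f[2:]] for f in fields])

# Sort everything by descending confidence
order = np.argsort(-confidence)
image_ids = [image_ids[i] for i in order]
boxes = boxes[order]

print(image_ids)          # ['000006', '000006', '000006', '000004', '000008']
print(boxes[0].tolist())  # [2.0, 161.0, 55.0, 221.0]
```

The same image id can appear several times, once per detected box, and the ranking is global across all images of the test set.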

3. Eval

# --------------------------------------------------------
# Fast/er R-CNN
# Licensed under The MIT License [see LICENSE for details]
# Written by Bharath Hariharan
# --------------------------------------------------------

import xml.etree.ElementTree as ET
import os
import pickle
import numpy as np

# Read the label data from an annotation file
def parse_rec(filename):
    """ Parse a PASCAL VOC xml file """
    tree = ET.parse(filename)
    objects = []
    for obj in tree.findall('object'):
        obj_struct = {}
        obj_struct['name'] = obj.find('name').text
        obj_struct['pose'] = obj.find('pose').text
        obj_struct['truncated'] = int(obj.find('truncated').text)
        obj_struct['difficult'] = int(obj.find('difficult').text)
        bbox = obj.find('bndbox')
        obj_struct['bbox'] = [int(bbox.find('xmin').text),
                              int(bbox.find('ymin').text),
                              int(bbox.find('xmax').text),
                              int(bbox.find('ymax').text)]
        objects.append(obj_struct)

    return objects

# Compute AP; see the explanation earlier
def voc_ap(rec, prec, use_07_metric=False):
    """ ap = voc_ap(rec, prec, [use_07_metric])
    Compute VOC AP given precision and recall.
    If use_07_metric is true, uses the
    VOC 07 11 point method (default:False).
    """

    if use_07_metric:
        # 11 point metric
        ap = 0.
        for t in np.arange(0., 1.1, 0.1):
            if np.sum(rec >= t) == 0:
                p = 0
            else:
                p = np.max(prec[rec >= t])
            ap = ap + p / 11.
    else:
        # correct AP calculation
        # first append sentinel values at the end
        mrec = np.concatenate(([0.], rec, [1.]))
        mpre = np.concatenate(([0.], prec, [0.]))

        # compute the precision envelope
        for i in range(mpre.size - 1, 0, -1):
            mpre[i - 1] = np.maximum(mpre[i - 1], mpre[i])

        # to calculate area under PR curve, look for points
        # where X axis (recall) changes value
        i = np.where(mrec[1:] != mrec[:-1])[0]

        # and sum (Delta recall) * prec
        ap = np.sum((mrec[i + 1] - mrec[i]) * mpre[i + 1])
    return ap

# Main function: read the predictions and the ground truth, then compute Recall, Precision and AP
def voc_eval(detpath,
             annopath,
             imagesetfile,
             classname,
             cachedir,
             ovthresh=0.5,
             use_07_metric=False):
    """rec, prec, ap = voc_eval(detpath,
                                annopath,
                                imagesetfile,
                                classname,
                                [ovthresh],
                                [use_07_metric])
    Top level function that does the PASCAL VOC evaluation.
    detpath: Path to detections
        detpath.format(classname) is the path of the detection txt file for the class being evaluated.
    annopath: Path to annotations
        annopath.format(imagename) is the path of the label xml file.
    imagesetfile: text file listing the test images, one image path per line
    classname: the class to evaluate
    cachedir: directory in which the annotations are cached
    [ovthresh]: IoU overlap threshold (default = 0.5)
    [use_07_metric]: whether to use the VOC07 11-point AP computation (default False)
    """

    # assumes detections are in detpath.format(classname)
    # assumes annotations are in annopath.format(imagename)
    # assumes imagesetfile is a text file with each line an image name
    # cachedir caches the annotations in a pickle file

    # first load gt (ground truth)
    if not os.path.isdir(cachedir):
        os.mkdir(cachedir)
    cachefile = os.path.join(cachedir, 'annots.pkl')
    # read list of images
    with open(imagesetfile, 'r') as f:
        lines = f.readlines()
    # Names of all images, without directory or .jpg extension
    imagenames = [os.path.basename(x.strip()).split('.jpg')[0] for x in lines]

    # If the cache file does not exist, parse the annotations and write it
    if not os.path.isfile(cachefile):
        # load annots
        recs = {}
        for i, imagename in enumerate(imagenames):
            recs[imagename] = parse_rec(annopath.format(imagename))
            if i % 100 == 0:  # progress report
                print('Reading annotation for {:d}/{:d}'.format(
                    i + 1, len(imagenames)))
        # save
        print('Saving cached annotations to {:s}'.format(cachefile))
        with open(cachefile, 'wb') as f:
            # Write a pickle file holding a dict: the key is the image (xml file) name,
            # the value is the list of objects parsed from that file.
            pickle.dump(recs, f)
    else:
        # load
        with open(cachefile, 'rb') as f:
            recs = pickle.load(f)

    # For each image, extract the bboxes etc. of the requested class from its xml
    class_recs = {}  # holds the Ground Truth data
    npos = 0
    for imagename in imagenames:
        # Ground Truth objects of the requested class in this image
        R = [obj for obj in recs[imagename] if obj['name'] == classname]

        bbox = np.array([x['bbox'] for x in R])
        # difficult is almost always 0/False
        # (np.bool was removed in NumPy 1.24, so the builtin bool is used)
        difficult = np.array([x['difficult'] for x in R]).astype(bool)
        det = [False] * len(R)  # list of len(R) False values: nothing detected yet
        npos = npos + sum(~difficult)  # ~difficult inverts the mask; count the non-difficult boxes

        # Record the Ground Truth content
        class_recs[imagename] = {'bbox': bbox,
                                 'difficult': difficult,
                                 'det': det}

    # read dets: load the predictions for this class
    detfile = detpath.format(classname)

    with open(detfile, 'r') as f:
        lines = f.readlines()

    splitlines = [x.strip().split(' ') for x in lines]
    image_ids = [x[0].split('.')[0] for x in splitlines]  # image IDs

    confidence = np.array([float(x[1]) for x in splitlines])  # confidence scores
    BB = np.array([[float(z) for z in x[2:]] for x in splitlines])  # bounding box values

    # Sort the detection indices by confidence, in descending order
    sorted_ind = np.argsort(-confidence)
    sorted_scores = np.sort(-confidence)
    BB = BB[sorted_ind, :]  # reorder the bboxes from high to low confidence
    image_ids = [image_ids[x] for x in sorted_ind]  # reorder the image IDs the same way

    # go down dets and mark TPs and FPs
    nd = len(image_ids)

    tp = np.zeros(nd)
    fp = np.zeros(nd)
    for d in range(nd):
        R = class_recs[image_ids[d]]  # ground truth of the image this detection belongs to

        # 1. If the predictions are already (x_min, y_min, x_max, y_max), the
        #    top/left/bottom/right conversion below is not needed and should be skipped.
        # 2. If the predictions are (x_center, y_center, w, h), the conversion is required.
        # 3. The overlap computation only accepts [left, top, right, bottom], which
        #    corresponds to the label's [x_min, y_min, x_max, y_max].

        bb = BB[d, :].astype(float)

        # Convert to (x_min, y_min, x_max, y_max)
        top = int(bb[1] - bb[3] / 2)
        left = int(bb[0] - bb[2] / 2)
        bottom = int(bb[1] + bb[3] / 2)
        right = int(bb[0] + bb[2] / 2)
        bb = [left, top, right, bottom]

        ovmax = -np.inf  # start from negative infinity
        BBGT = R['bbox'].astype(float)

        if BBGT.size > 0:
            # compute overlaps
            # intersection
            ixmin = np.maximum(BBGT[:, 0], bb[0])
            iymin = np.maximum(BBGT[:, 1], bb[1])
            ixmax = np.minimum(BBGT[:, 2], bb[2])
            iymax = np.minimum(BBGT[:, 3], bb[3])
            iw = np.maximum(ixmax - ixmin + 1., 0.)
            ih = np.maximum(iymax - iymin + 1., 0.)
            inters = iw * ih

            # union
            uni = ((bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) +
                   (BBGT[:, 2] - BBGT[:, 0] + 1.) *
                   (BBGT[:, 3] - BBGT[:, 1] + 1.) - inters)

            overlaps = inters / uni
            ovmax = np.max(overlaps)  # largest overlap
            jmax = np.argmax(overlaps)  # index of the gt box with the largest overlap
        # mark the detection as tp or fp
        if ovmax > ovthresh:
            if not R['difficult'][jmax]:
                # Once a gt box is marked as detected, a later detection that also overlaps it
                # above the threshold must not count as another detected object.
                if not R['det'][jmax]:
                    tp[d] = 1.
                    R['det'][jmax] = 1  # mark as detected
                else:
                    fp[d] = 1.
        else:
            fp[d] = 1.

    # compute precision recall
    fp = np.cumsum(fp)  # np.cumsum() computes the running (cumulative) sum
    tp = np.cumsum(tp)
    rec = tp / float(npos)

    # avoid divide by zero in case the first detection matches a difficult
    # ground truth
    # np.finfo(np.float64).eps is the float64 machine epsilon, a tiny positive number
    prec = tp / np.maximum(tp + fp, np.finfo(np.float64).eps)
    ap = voc_ap(rec, prec, use_07_metric)

    return rec, prec, ap
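The matching logic at the heart of voc_eval can be tried on a toy case. The sketch below is a simplified illustration under stated assumptions, not a replacement for the function above: it handles a single image, ignores the difficult flag, skips all file I/O, and uses made-up boxes already in [xmin, ymin, xmax, ymax] form. The helper name `iou_one_to_many` is hypothetical.

```python
import numpy as np


def iou_one_to_many(box, gt_boxes):
    """IoU of one [xmin, ymin, xmax, ymax] box against an (N, 4) array,
    using the VOC convention of adding 1 to widths and heights."""
    ixmin = np.maximum(gt_boxes[:, 0], box[0])
    iymin = np.maximum(gt_boxes[:, 1], box[1])
    ixmax = np.minimum(gt_boxes[:, 2], box[2])
    iymax = np.minimum(gt_boxes[:, 3], box[3])
    iw = np.maximum(ixmax - ixmin + 1.0, 0.0)
    ih = np.maximum(iymax - iymin + 1.0, 0.0)
    inter = iw * ih
    area_box = (box[2] - box[0] + 1.0) * (box[3] - box[1] + 1.0)
    area_gt = ((gt_boxes[:, 2] - gt_boxes[:, 0] + 1.0) *
               (gt_boxes[:, 3] - gt_boxes[:, 1] + 1.0))
    return inter / (area_box + area_gt - inter)


# Toy data: 2 ground-truth boxes, 3 detections already sorted by confidence
gt = np.array([[0, 0, 9, 9], [20, 20, 29, 29]], dtype=float)
dets = np.array([[0, 0, 9, 9],       # exact match with gt 0
                 [1, 1, 9, 9],       # duplicate of gt 0 (IoU 0.81)
                 [50, 50, 59, 59]],  # overlaps nothing
                dtype=float)

matched = [False] * len(gt)
tp = np.zeros(len(dets))
fp = np.zeros(len(dets))
for d, box in enumerate(dets):
    overlaps = iou_one_to_many(box, gt)
    j = int(np.argmax(overlaps))
    if overlaps[j] > 0.5 and not matched[j]:
        tp[d] = 1.0
        matched[j] = True  # each ground-truth box can be claimed only once
    else:
        fp[d] = 1.0

tp_cum = np.cumsum(tp)
fp_cum = np.cumsum(fp)
rec = tp_cum / len(gt)
prec = tp_cum / np.maximum(tp_cum + fp_cum, np.finfo(np.float64).eps)
print(tp.tolist(), fp.tolist())  # [1.0, 0.0, 0.0] [0.0, 1.0, 1.0]
print(rec.tolist())              # [0.5, 0.5, 0.5]
```

The second detection overlaps a ground-truth box well above the threshold but still counts as a false positive, because that box was already claimed by a more confident detection.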

References

- Evaluation function eval.py

https://www.cnblogs.com/JZ-Ser/articles/7846399.html

- voc_eval.py explained

https://blog.csdn.net/shawncheer/article/details/78317711

