I was looking at this graph from "Learning Data Augmentation Strategies for Object Detection" and noticed that, for the third set of points, the value of mAP is lower than all of mAP_S, mAP_M, and mAP_L. I expected mAP to be a weighted average of those three, but evidently it is not. Can someone explain exactly how these metrics are calculated? In the previous section (4.4), the authors mention using PASCAL VOC, and therefore an IoU threshold of 0.5, but in this section (4.5) they use COCO, so I assume they are averaging over IoU thresholds of [.5:.95]. Is this correct?
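To make the notation concrete, here is a minimal sketch of how I understand the [.5:.95] convention: ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, with the reported value being the plain mean of AP over them. The per-threshold AP values below are made up purely for illustration and are not taken from the paper, and the variable names are my own.

```python
# IoU thresholds implied by the "[.5:.95]" notation: 0.50, 0.55, ..., 0.95.
iou_thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]

# Hypothetical AP at each threshold (invented numbers, NOT from the paper).
ap_per_threshold = [0.60, 0.57, 0.53, 0.48, 0.42, 0.35, 0.27, 0.18, 0.09, 0.02]

# COCO-style value: average AP over all ten IoU thresholds.
coco_style_map = sum(ap_per_threshold) / len(ap_per_threshold)

# PASCAL VOC-style value: AP at the single IoU threshold of 0.5.
voc_style_map = ap_per_threshold[0]

print(iou_thresholds)
print(f"COCO-style mAP@[.5:.95]: {coco_style_map:.3f}")
print(f"VOC-style mAP@0.5:       {voc_style_map:.3f}")
```

If this reading is right, the COCO-style number will generally be lower than the VOC-style number for the same detector, since it also averages in the stricter thresholds.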


