ROD

Abstract

Detection of rare objects (e.g., traffic cones, traffic barrels and traffic warning triangles) is an important perception task for improving the safety of autonomous driving. Training such models typically requires a large amount of annotated data, which is expensive and time-consuming to obtain. To address this problem, an emerging approach is to apply data augmentation to automatically generate cost-free training samples. In this work, we present a systematic study of simple Copy-Paste data augmentation for rare object detection in autonomous driving. Specifically, a local adaptive instance-level image transformation is introduced to generate realistic rare object masks from the source domain to the target domain. Moreover, traffic scene context is used to guide the placement of the rare object masks. In this way, our data augmentation generates high-quality, realistic training data by leveraging both local and global consistency. In addition, we build a new dataset named ROD, consisting of 10k training images, 4k validation images and the corresponding labels, covering a diverse range of autonomous driving scenarios. Experiments on the ROD dataset show that our method achieves promising results on rare object detection. We also present a thorough study illustrating the effectiveness of our Copy-Paste data augmentation based on local-adaptive and global constraints. For more details, please refer to our paper.

Reference

Traffic Context Aware Data Augmentation for Rare Object Detection in Autonomous Driving
Naifan Li, Fan Song, Ying Zhang, Pengpeng Liang, Erkang Cheng
ICRA, 2022

ROD is a new, diverse real-world dataset from Nullmax, consisting of 10k training images (640x384), 4k validation images (640x384) and the corresponding 2D bounding box annotations for 5 representative object categories (car, truck, bus, pedestrian, bicycle). The dataset covers a diverse range of scenarios, including different road types (e.g., highway, expressway, city street and country road), different weather conditions (e.g., sunny, cloudy and rainy) and different times of day (e.g., daytime, evening and night). In addition, we make 1k traffic cone masks, 100 traffic barrel masks and 50 traffic warning triangle masks available to the community.

The statistics of the ROD dataset are shown below:

Downloads

The training and testing data can be downloaded from Google Drive or Baidu Cloud using the links below.

For each training image, we manually annotate 2D bounding boxes for the 5 representative object categories. In addition, the global traffic scene context (e.g., freespace, common objects and traffic lanes) is obtained from a multi-task deep model with three heads: (1) a semantic segmentation head for freespace segmentation; (2) an instance segmentation head for traffic lane segmentation; and (3) a detection head for common road user detection.
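To illustrate how the scene context and the rare object masks might be combined, here is a minimal Copy-Paste sketch in Python with NumPy. It is not the method from the paper: it only samples a paste position inside a binary freespace mask and pastes a masked object crop there, and it omits the local adaptive transformation (scale and appearance adjustment) entirely. The function names `sample_paste_point` and `paste_object`, the array layouts, and the `[x_min, y_min, x_max, y_max]` box format are assumptions made for this example.

```python
import numpy as np


def sample_paste_point(freespace, rng):
    """Pick a random (row, col) pixel where the binary freespace mask is set."""
    rows, cols = np.nonzero(freespace)
    if rows.size == 0:
        raise ValueError("freespace mask is empty")
    i = rng.integers(rows.size)
    return int(rows[i]), int(cols[i])


def paste_object(image, obj_rgb, obj_mask, bottom_center):
    """Paste a masked object so that its bottom-center sits at bottom_center.

    image:    (H, W, 3) array, left unmodified
    obj_rgb:  (h, w, 3) object crop
    obj_mask: (h, w) binary mask of the object inside the crop
    Returns the augmented copy and the visible box
    [x_min, y_min, x_max, y_max] (max exclusive), or None if nothing is visible.
    """
    img_h, img_w = image.shape[:2]
    obj_h, obj_w = obj_mask.shape
    row, col = bottom_center
    top = row - obj_h + 1
    left = col - obj_w // 2

    # Clip the paste window to the image borders.
    y0, y1 = max(top, 0), min(top + obj_h, img_h)
    x0, x1 = max(left, 0), min(left + obj_w, img_w)
    out = image.copy()
    if y0 >= y1 or x0 >= x1:
        return out, None

    sub_mask = obj_mask[y0 - top:y1 - top, x0 - left:x1 - left].astype(bool)
    if not sub_mask.any():
        return out, None
    sub_rgb = obj_rgb[y0 - top:y1 - top, x0 - left:x1 - left]

    # region is a view into out, so this writes the object pixels in place.
    region = out[y0:y1, x0:x1]
    region[sub_mask] = sub_rgb[sub_mask]

    ys, xs = np.nonzero(sub_mask)
    bbox = [x0 + int(xs.min()), y0 + int(ys.min()),
            x0 + int(xs.max()) + 1, y0 + int(ys.max()) + 1]
    return out, bbox
```

A 640x384 ROD image would be a `(384, 640, 3)` array here. A typical call would be `point = sample_paste_point(freespace, np.random.default_rng(0))` followed by `paste_object(image, cone_rgb, cone_mask, point)`, with the returned box added to the image's labels as a new rare-object annotation.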

Training data

Data Type	Google Drive	Baidu Cloud
Image	Download	Download (Extraction Code: 1w2g)
2D label	Download	Download (Extraction Code: akmg)

Testing data

Data Type	Google Drive	Baidu Cloud
Image	Download	Download (Extraction Code: ivtv)
2D label	Download	Download (Extraction Code: e9sc)

Supplementary data

Data Type	Google Drive	Baidu Cloud
Traffic cone mask	Download	Download (Extraction Code: mjda)
Traffic barrel mask	Download	Download (Extraction Code: vs8x)
Traffic warning triangle mask	Download	Download (Extraction Code: 5wqf)

Devkit

Please refer to cocoapi for evaluation. In addition, we provide a view.py script to help you get familiar with the dataset.
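As a rough guide, a bounding-box evaluation with cocoapi (the `pycocotools` package) could look like the sketch below. It assumes the 2D labels are stored as, or have been converted to, a COCO-format JSON file, which this README does not confirm. The helper names `to_coco_results` and `evaluate_bbox`, the detection tuple layout, and the category id mapping are hypothetical and chosen only for illustration.

```python
def to_coco_results(detections, category_ids):
    """Convert detections to the COCO results format.

    detections:   iterable of (image_id, class_name, score,
                  [x_min, y_min, x_max, y_max])
    category_ids: dict mapping class name to COCO category id (assumed mapping)
    """
    results = []
    for image_id, name, score, (x0, y0, x1, y1) in detections:
        results.append({
            "image_id": image_id,
            "category_id": category_ids[name],
            # COCO boxes are [x, y, width, height].
            "bbox": [x0, y0, x1 - x0, y1 - y0],
            "score": float(score),
        })
    return results


def evaluate_bbox(gt_json_path, results):
    """Run the standard COCO bbox evaluation and return the summary stats."""
    # Imported lazily so the conversion helper works without pycocotools.
    from pycocotools.coco import COCO
    from pycocotools.cocoeval import COCOeval

    coco_gt = COCO(gt_json_path)          # ground truth in COCO JSON (assumed)
    coco_dt = coco_gt.loadRes(results)    # list of result dicts
    evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
    evaluator.evaluate()
    evaluator.accumulate()
    evaluator.summarize()                 # prints the AP / AR table
    return evaluator.stats
```

The image ids and category ids passed in must match those used in the ground-truth file, otherwise the evaluation will not match any detections to annotations.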

Contact

{chengerkang, linaifan, zhangying}@nullmax.ai

Citation

@inproceedings{li2022traffic,
 title={Traffic Context Aware Data Augmentation for Rare Object Detection in Autonomous Driving},
 author={Li, Naifan and Song, Fan and Zhang, Ying and Liang, Pengpeng and Cheng, Erkang},
 booktitle={2022 IEEE International Conference on Robotics and Automation (ICRA)},
 year={2022},
 organization={IEEE}
}