Abstract:
Pedestrian detection is an important research topic because of growing demand from surveillance-based applications. Thermal and color images are used to detect pedestrians under different illumination conditions. Recent work has used saliency maps to augment the input images as an attention mechanism. This work employs several saliency-based networks, evaluates their performance when used for augmentation, and determines which kinds of saliency network yield better results in combination with Faster R-CNN. It also proposes an enhanced version of the KAIST multispectral dataset, with a corrected and extended set of annotations provided separately for the color and thermal channels. Pixel-level annotations for training saliency networks are likewise proposed, separately for the thermal and color channels, using a subset of the KAIST dataset. A detailed analysis of saliency network performance is presented in terms of precision, recall, F-measure, and mean absolute error. A new metric, the "region-level F-measure", is introduced to study the efficacy of saliency networks when they are used for augmentation. This work also presents the best combinations of saliency network and Faster R-CNN detector for both the thermal and color channels, balancing detection performance against computation speed. The proposed detectors outperform existing detectors of a similar type.
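For readers unfamiliar with the pixel-level measures named above (precision, recall, F-measure, and mean absolute error), the following is a minimal sketch of how they are commonly computed for a predicted saliency map against a binary ground-truth mask. It is illustrative only and is not the evaluation code of this work: the function name `saliency_metrics`, the fixed `threshold` of 0.5, and the `beta_sq` weighting are assumptions. Saliency literature often weights precision with a beta-squared value of 0.3, while the default here is the plain F1 score. The region-level F-measure is not covered, since the abstract does not define it.

```python
def saliency_metrics(pred, gt, threshold=0.5, beta_sq=1.0):
    """Pixel-level metrics for one saliency map (hypothetical helper).

    pred: 2-D list of saliency scores in [0, 1].
    gt:   2-D list of ground-truth labels (0 or 1), same shape as pred.
    threshold and beta_sq are illustrative assumptions, not values
    taken from the paper.
    """
    # Flatten both maps so they can be compared pixel by pixel.
    p = [v for row in pred for v in row]
    g = [v for row in gt for v in row]
    if not p or len(p) != len(g):
        raise ValueError("pred and gt must be non-empty and the same size")

    # Mean absolute error uses the raw (unthresholded) scores.
    mae = sum(abs(a - b) for a, b in zip(p, g)) / len(p)

    # Precision and recall use the thresholded (binary) prediction.
    binary = [1 if v >= threshold else 0 for v in p]
    tp = sum(1 for b, t in zip(binary, g) if b == 1 and t == 1)
    fp = sum(1 for b, t in zip(binary, g) if b == 1 and t == 0)
    fn = sum(1 for b, t in zip(binary, g) if b == 0 and t == 1)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # Weighted F-measure; beta_sq = 1.0 gives the ordinary F1 score.
    denom = beta_sq * precision + recall
    f_measure = (1 + beta_sq) * precision * recall / denom if denom else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mae": mae,
    }


if __name__ == "__main__":
    pred = [[0.9, 0.2], [0.6, 0.1]]
    gt = [[1, 0], [0, 1]]
    print(saliency_metrics(pred, gt))
```

In practice these values would be averaged over all test images, and precision and recall are often reported across a sweep of thresholds rather than at a single fixed one.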