Hidden-layer Ensemble Fusion of MLP Neural Networks for Pedestrian Detection

Kyaw Kyaw Htike

Abstract


Detecting pedestrians is a crucial task for intelligent agents, especially in autonomous vehicles, robots navigating in cities, machine vision, automatic traffic control in smart cities, and public safety and security. Various sophisticated pedestrian detection systems have been presented in the literature, and most state-of-the-art systems have two main components: feature extraction and classification. Over the past decade, most of the attention has been paid to feature extraction. In this paper, we show that much can be gained from a high-performing classification algorithm: by changing only the classification component of the detection pipeline while holding the feature extraction mechanism fixed, we reduce pedestrian detection error (in terms of log-average miss rate) by over 40%. Specifically, we propose a novel algorithm for generating a compact and efficient ensemble of Multi-layer Perceptron (MLP) neural networks that is well suited to pedestrian detection in terms of both detection accuracy and speed. We demonstrate the efficacy of the proposed method by comparing it with several state-of-the-art pedestrian detection algorithms.
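For readers unfamiliar with the error measure named above, the following is a minimal sketch of how a log-average miss rate is commonly computed in pedestrian detection benchmarks: the miss rate is sampled at nine false-positives-per-image (FPPI) reference points spaced evenly in log space between 10^-2 and 10^0, and the samples are averaged geometrically. The abstract does not state the exact evaluation protocol used in this paper, so the function name, parameters, and sampling range here are assumptions for illustration only.

```python
import numpy as np


def log_average_miss_rate(fppi, miss_rate, num_points=9, lo=1e-2, hi=1e0):
    """Geometric mean of the miss rate sampled at log-spaced FPPI values.

    Hypothetical helper (not taken from the paper). `fppi` must be sorted in
    ascending order and `miss_rate` holds the miss rate at each FPPI value.
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)

    # Reference FPPI values, evenly spaced in log space over [lo, hi].
    refs = np.logspace(np.log10(lo), np.log10(hi), num_points)

    sampled = np.empty(num_points)
    for i, ref in enumerate(refs):
        below = np.where(fppi <= ref)[0]
        # Use the last operating point that does not exceed the reference.
        # If the curve never reaches such a low FPPI, count everything as missed.
        sampled[i] = miss_rate[below[-1]] if below.size else 1.0

    # Average in the log domain; clip to avoid log(0) for perfect detectors.
    return float(np.exp(np.mean(np.log(np.maximum(sampled, 1e-10)))))


if __name__ == "__main__":
    # Made-up curve purely for illustration, not data from the paper.
    example_fppi = [0.005, 0.05, 0.5]
    example_miss = [0.8, 0.4, 0.2]
    print(log_average_miss_rate(example_fppi, example_miss))
```

Averaging in the log domain gives equal weight to each decade of FPPI, so a detector is not rewarded only for its behaviour at high false-positive rates.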





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.