Despite the large number of existing methods, pedestrian detection remains an open challenge. In recent years, deep learning classification methods combined with multi-modality images within different fusion schemes have achieved the best performance. The late-fusion scheme has been shown to outperform both direct and intermediate integration of modalities for pedestrian recognition. Hence, in this paper, we focus on improving the late-fusion scheme for pedestrian classification on the Daimler stereo vision data set. Each image modality (Intensity, Depth and Flow) is classified by an independent Convolutional Neural Network (CNN), and the CNN outputs are then fused by a Multi-layer Perceptron (MLP) before the recognition decision. We propose two methods based on Cross-Modality deep learning of CNNs: (1) a correlated model, where a unique CNN is trained with Intensity, Depth and Flow images for each frame; and (2) an incremental model, where a CNN is trained on the image frames of the first modality, then a second CNN, initialized by transfer learning from the first one, is trained on the image frames of the second modality, and finally a third CNN, initialized from the second one, is trained on the image frames of the last modality. The experiments show that incremental cross-modality deep learning of CNNs improves classification performance not only for each independent modality classifier but also for the multi-modality classifier based on late fusion. Different learning algorithms are also investigated.
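The incremental scheme and the late fusion described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: small logistic classifiers stand in for the CNNs and for the fusion MLP, the features are synthetic rather than Daimler images, and every name (`train_logistic`, `make_modality`, the noise levels, the learning rate) is a hypothetical choice made only for illustration. What it does show is the training order: each modality classifier is warm-started from the weights of the previous one, and the per-modality scores are then fed to a separate fusion classifier.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    # w holds one weight per feature plus a trailing bias term.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + w[-1])


def train_logistic(samples, labels, init=None, epochs=200, lr=0.1):
    """Stochastic gradient descent; `init` warm-starts from another model."""
    n = len(samples[0])
    w = list(init) if init is not None else [0.0] * (n + 1)
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            g = predict(w, x) - y
            for i in range(n):
                w[i] -= lr * g * x[i]
            w[-1] -= lr * g
    return w


def make_modality(labels, rng, noise):
    # Toy stand-in for one image modality: class signal plus Gaussian noise.
    return [[y + rng.gauss(0, noise), -y + rng.gauss(0, noise)] for y in labels]


def accuracy(w, xs, ys):
    return sum((predict(w, x) >= 0.5) == bool(y) for x, y in zip(xs, ys)) / len(ys)


rng = random.Random(0)
labels = [rng.randint(0, 1) for _ in range(200)]
modalities = {
    name: make_modality(labels, rng, noise)
    for name, noise in [("intensity", 0.4), ("depth", 0.6), ("flow", 0.8)]
}

# Incremental cross-modality training: each model starts from the previous one.
models = {}
previous = None
for name in ["intensity", "depth", "flow"]:
    previous = train_logistic(modalities[name], labels, init=previous)
    models[name] = previous

# Late fusion: per-modality scores become the inputs of a fusion classifier.
fused_inputs = [
    [predict(models[m], modalities[m][i]) for m in models]
    for i in range(len(labels))
]
fusion = train_logistic(fused_inputs, labels)

for name in models:
    print(name, accuracy(models[name], modalities[name], labels))
print("late fusion", accuracy(fusion, fused_inputs, labels))
```

The sketch evaluates on its own training data, so the printed accuracies only indicate that the pipeline runs end to end; they say nothing about the results reported in the paper.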
Incremental Cross-Modality deep learning for pedestrian recognition
2017-06-01
461507 bytes
Article (Conference)
Electronic resource
English
Deep learning-based real-time fine-grained pedestrian recognition using stream processing
Wiley | 2018
Deep learning-based real-time fine-grained pedestrian recognition using stream processing
IET | 2018