With the growth of autonomous vehicles and collision‐avoidance systems, many approaches based on deep learning and convolutional neural networks (CNNs) aim to improve obstacle‐detection accuracy. The authors introduce a three‐stage architecture that adds side channels as low‐level features to the input of existing CNNs. In a case study, the architecture extracts depth from stereo cameras and composes RGBD inputs to state‐of‐the‐art CNNs to improve their vehicle and pedestrian detection accuracy. This requires only simple modifications to the first layers of any existing CNN with RGB inputs. To validate the architecture, two state‐of‐the‐art specialist depth‐extraction algorithms, matching cost‐CNN and cascade residual learning, are combined with the state‐of‐the‐art Faster region‐based CNN, MS‐CNN, and Subcategory‐aware Convolutional Neural Network (SubCNN) detectors, and the resulting models are tested on the KITTI dataset benchmark. In many cases, the accuracy (in terms of average precision) of the proposal exceeds the original scores across scenarios of detection difficulty, with improvements of up to +3.96% on the training and +1.50% on the testing KITTI datasets. The proposal also introduces efficient methods to initialise the weights of the depth convolutional filters during transfer learning using net surgery.
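The first-layer modification and depth-filter initialisation mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `expand_first_conv_to_rgbd` and `compose_rgbd`, the `(out_channels, in_channels, kh, kw)` weight layout, and the two initialisation choices (mean of the RGB filters, or zeros) are assumptions made for illustration. The abstract does not state which initialisation methods the paper uses.

```python
import numpy as np


def expand_first_conv_to_rgbd(rgb_weights, init="mean"):
    """Extend first-layer conv weights from 3 (RGB) to 4 (RGBD) input channels.

    rgb_weights: array of shape (out_channels, 3, kh, kw), assumed layout.
    init: "mean" copies the per-filter mean of the RGB weights into the new
          depth channel; "zeros" starts the depth channel at zero.
          Both options are illustrative assumptions, not the paper's method.
    """
    if rgb_weights.ndim != 4 or rgb_weights.shape[1] != 3:
        raise ValueError("expected weights of shape (out_channels, 3, kh, kw)")
    if init == "mean":
        depth = rgb_weights.mean(axis=1, keepdims=True)
    elif init == "zeros":
        depth = np.zeros_like(rgb_weights[:, :1])
    else:
        raise ValueError("init must be 'mean' or 'zeros'")
    # Pretrained RGB filters are kept unchanged; only the depth slice is new.
    return np.concatenate([rgb_weights, depth], axis=1)


def compose_rgbd(rgb_image, depth_map):
    """Stack an HxWx3 RGB image and an HxW depth map into an HxWx4 input."""
    if rgb_image.shape[:2] != depth_map.shape:
        raise ValueError("RGB image and depth map must share height and width")
    return np.concatenate([rgb_image, depth_map[..., None]], axis=-1)
```

With the zero initialisation, the expanded layer produces the same responses as the original RGB network at the start of fine-tuning, since the depth channel contributes nothing until its weights are updated. The mean initialisation instead gives the depth filters a starting point shaped like the existing colour filters.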
Three‐stage RGBD architecture for vehicle and pedestrian detection using convolutional neural networks and stereo vision
IET Intelligent Transport Systems ; 14 , 10 ; 1319-1327
2020-10-01
9 pages
Article (Journal)
Electronic Resource
English
KITTI dataset benchmark , cameras , transfer learning , low‐level features , RGBD inputs , neural nets , object detection , detection difficulty , collision avoidance , stereo image processing , image matching , autonomous vehicles , pedestrians , stage RGBD architecture , state‐of‐the‐art matching cost‐CNN , state‐of‐the‐art CNNs , depth convolutional filters , residual learning , stereo vision , collision‐avoidance systems , RGB inputs , three‐stage architecture , existing CNN , image colour analysis , depth information , convolutional neural networks , stereo cameras , testing KITTI datasets , feature extraction , deep learning , learning (artificial intelligence) , state‐of‐the‐art Faster‐region‐based CNN , pedestrian detection accuracy , object tracking , obstacle detection
Pedestrian Detection Using Stereo Night Vision
British Library Conference Proceedings | 2003
Pedestrian detection using stereo night vision
IEEE | 2003
Pedestrian Detection with Convolutional Neural Networks
British Library Conference Proceedings | 2005
Pedestrian Detection using Stereo-vision and GraphKernels
British Library Conference Proceedings | 2005