
A Follow-the-Leader Strategy Using Hierarchical Deep Neural Networks with Grouped Convolutions

A Correction to this article was published on 13 April 2021



The follow-the-leader task is implemented using a hierarchical deep neural network (DNN) end-to-end driving model that matches the direction and speed of a target pedestrian. A classifier DNN determines whether the pedestrian is within the field of view of the camera sensor. If the pedestrian is present, the camera's image stream is fed to a regression DNN, which simultaneously adjusts the autonomous vehicle's steering and throttle to keep cadence with the pedestrian. If the pedestrian is not visible, the vehicle uses a straightforward exploratory search strategy to reacquire the tracking objective. Both the classifier and regression DNNs incorporate grouped convolutions, which boost model performance while significantly reducing parameter count and compute latency. The models are trained on an intelligence processing unit (IPU) to leverage its fine-grain compute capabilities and minimize time-to-train. The results show very robust tracking behavior by the autonomous vehicle in terms of its steering and throttle profiles, while requiring minimal data collection to produce. Using the IPU together with grouped convolutions boosts training-sample throughput by a factor of \(\sim 3.5\) for the classifier and \(\sim 7\) for the regression network. A recording of the vehicle tracking a pedestrian is available on the web.
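The two ideas above — a classifier DNN gating between the regression DNN and an exploratory search, and grouped convolutions cutting the weight count — can be sketched as follows. This is a minimal illustrative sketch: the function names, channel sizes, and group count are assumptions for exposition, not values taken from the paper.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with the given channel grouping.

    A grouped convolution splits the c_in input channels into `groups`
    groups, each convolved with c_out/groups filters, so each output
    channel only sees c_in/groups inputs -- a factor-of-`groups` saving.
    """
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * k * k * c_out

# Illustrative layer sizes (not from the paper):
standard = conv_params(64, 64, 3)            # dense 3x3 convolution
grouped  = conv_params(64, 64, 3, groups=8)  # 8-way grouped convolution
print(standard, grouped, standard // grouped)  # grouped uses 8x fewer weights


def drive_step(frame, classifier, regressor, search_policy):
    """One control step of the hierarchical model (hypothetical interface)."""
    if classifier(frame):
        # Pedestrian in view: regression DNN jointly outputs steering and throttle.
        steering, throttle = regressor(frame)
    else:
        # Pedestrian lost: fall back to the exploratory search strategy.
        steering, throttle = search_policy()
    return steering, throttle
```

The gating structure is what makes the model hierarchical: the regression network is only ever consulted on frames the classifier has accepted, so it can specialize on in-view tracking.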


Figs. 1–7 (figure images not reproduced here).

Change history


  1. Video link:





The authors would like to thank Venkatapathi Nallapa of Greenfield Labs (Ford Motor Company) for his valuable support and encouragement during this research.

Author information



Corresponding author

Correspondence to José Enrique Solomon.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: the term "mean squared-error (MSE)" was used incorrectly in three instances in the "Results" section, sub-section "RN: Regression Performance". These have been corrected to "root-mean-square error (RMSE)".


About this article


Cite this article

Solomon, J.E., Charette, F. A Follow-the-Leader Strategy Using Hierarchical Deep Neural Networks with Grouped Convolutions. SN COMPUT. SCI. 2, 147 (2021).


Keywords

  • Deep learning
  • Computer vision
  • Grouped convolutions
  • Autonomous systems
  • Robotics