MARS: An Instance-Aware, Modular and Realistic Simulator for Autonomous Driving

Wu, Zirui; Liu, Tianyu; Luo, Liyi; Zhong, Zhide; Chen, Jianteng; Xiao, Hongmin; Hou, Chao; Lou, Haozhe; Chen, Yuantao; Yang, Runyi; Huang, Yuxin; Ye, Xiaoyu; Yan, Zike; Shi, Yongliang; Liao, Yiyi; Zhao, Hao

doi:10.1007/978-981-99-8850-1_1


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14473)

Included in the following conference series:

CAAI International Conference on Artificial Intelligence


Abstract

Autonomous cars can now drive smoothly in ordinary cases, and it is widely recognized that realistic sensor simulation will play a critical role in solving the remaining corner cases by simulating them. To this end, we propose an autonomous driving simulator based on neural radiance fields (NeRFs). Compared with existing works, ours has three notable features: (1) Instance-aware. Our simulator models the foreground instances and background environments separately with independent networks, so that the static (e.g., size and appearance) and dynamic (e.g., trajectory) properties of instances can be controlled separately. (2) Modular. Our simulator allows flexible switching between different modern NeRF-related backbones, sampling strategies, input modalities, etc. We expect this modular design to boost academic progress and industrial deployment of NeRF-based autonomous driving simulation. (3) Realistic. Our simulator sets new state-of-the-art photo-realism results given the best module selection. Our simulator will be open-sourced, while most of our counterparts are not. Project page: https://open-air-sun.github.io/mars/.
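To make the instance-aware idea concrete, here is a minimal sketch of how samples from independent background and foreground fields might be composited along a single camera ray with standard NeRF volume rendering. It is an illustration under stated assumptions, not the authors' implementation: the function name `composite_ray`, the hand-written densities and colors, and the merge-by-depth strategy are all hypothetical, and a real system would query trained networks instead.

```python
import numpy as np

def composite_ray(depths_list, sigmas_list, colors_list):
    """Volume-render one ray from samples produced by several independent fields.

    Each list holds one entry per field (e.g. background, one car instance):
    depths (N,), densities (N,), and RGB colors (N, 3) along the same ray.
    All names here are hypothetical and chosen for illustration only.
    """
    # Merge the per-field samples and sort them by depth along the ray.
    t = np.concatenate(depths_list)
    sigma = np.concatenate(sigmas_list)
    rgb = np.concatenate(colors_list, axis=0)
    order = np.argsort(t)
    t, sigma, rgb = t[order], sigma[order], rgb[order]

    # Standard NeRF quadrature: alpha from density and interval length.
    deltas = np.append(np.diff(t), 1e10)  # last interval treated as very long
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: probability the ray reaches each sample unoccluded.
    trans = np.cumprod(np.append(1.0, 1.0 - alpha[:-1]))
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(axis=0), weights

# Hypothetical background: empty space, then a gray wall at depth 5.
bg = (np.array([1.0, 5.0]), np.array([0.0, 50.0]), np.full((2, 3), 0.5))
# Hypothetical foreground instance: an opaque red car at depth 2.
red = np.tile([1.0, 0.0, 0.0], (2, 1))
car = (np.array([2.0, 2.5]), np.array([100.0, 100.0]), red)

color_near, _ = composite_ray([bg[0], car[0]], [bg[1], car[1]], [bg[2], car[2]])
# Move only the instance (its trajectory) behind the wall; background untouched.
color_far, _ = composite_ray([bg[0], car[0] + 4.0], [bg[1], car[1]], [bg[2], car[2]])
print(color_near, color_far)
```

Because the instance's samples are kept separate from the background's until the final merge, changing the instance's position or appearance requires no change to the background field, which is the kind of separate control the abstract describes.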

H. Zhao—Sponsored by Tsinghua-Toyota Joint Research Fund (20223930097).



Author information

Authors and Affiliations

AIR, Tsinghua University, Beijing, China
Zirui Wu, Tianyu Liu, Liyi Luo, Zhide Zhong, Jianteng Chen, Hongmin Xiao, Chao Hou, Haozhe Lou, Yuantao Chen, Runyi Yang, Yuxin Huang, Xiaoyu Ye, Zike Yan, Yongliang Shi & Hao Zhao
System Hub, HKUST(GZ), Guangzhou, China
Zirui Wu
HKUST, Hong Kong SAR, China
Tianyu Liu
McGill University, Montreal, Canada
Liyi Luo
Beijing Institute of Technology, Beijing, China
Zhide Zhong, Jianteng Chen, Yuxin Huang & Xiaoyu Ye
National University of Singapore, Singapore, Singapore
Hongmin Xiao
HKU, Pokfulam, Hong Kong
Chao Hou
University of Wisconsin Madison, Madison, USA
Haozhe Lou
Xi’an University of Architecture and Technology, Xi’an, China
Yuantao Chen
Imperial College London, London, UK
Runyi Yang
Zhejiang University, Hangzhou, China
Yiyi Liao

Corresponding author

Correspondence to Hao Zhao.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Lu Fang
Duke University, Durham, NC, USA
Jian Pei
Shanghai Jiao Tong University, Shanghai, China
Guangtao Zhai
Chinese Academy of Sciences, Beijing, China
Ruiping Wang


About this paper

Cite this paper

Wu, Z. et al. (2024). MARS: An Instance-Aware, Modular and Realistic Simulator for Autonomous Driving. In: Fang, L., Pei, J., Zhai, G., Wang, R. (eds) Artificial Intelligence. CICAI 2023. Lecture Notes in Computer Science, vol 14473. Springer, Singapore. https://doi.org/10.1007/978-981-99-8850-1_1


DOI: https://doi.org/10.1007/978-981-99-8850-1_1
Published: 04 February 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8849-5
Online ISBN: 978-981-99-8850-1
eBook Packages: Computer Science, Computer Science (R0)
