An Automated Tool to Classify and Transform Unstructured MRI Data into BIDS Datasets

  • Research
Neuroinformatics

Abstract

The increasing use of neuroimaging in clinical research has driven the creation of many large imaging datasets. However, these datasets often rely on inconsistent naming conventions in image file headers to describe acquisitions, so time-consuming manual curation is necessary. We therefore sought to automate the classification and organization of magnetic resonance imaging (MRI) data according to acquisition types common in clinical routine, and to automate the transformation of raw, unstructured images into Brain Imaging Data Structure (BIDS) datasets. To do this, we trained an XGBoost model to classify MRI acquisition types using relatively few acquisition parameters that the MRI scanner automatically stores in image file metadata. The predicted types are then mapped to the naming conventions prescribed by BIDS to transform the input images into the BIDS structure. The model recognizes MRI types with 99.475% accuracy, as well as a micro/macro-averaged precision of 0.9995/0.994, a micro/macro-averaged recall of 0.9995/0.989, and a micro/macro-averaged F1 of 0.9995/0.991. Our approach accurately and quickly classifies MRI types and transforms unstructured data into standardized structures with little-to-no user intervention, lowering the barrier to entry for clinical scientists and increasing the accessibility of existing neuroimaging data.
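
The mapping step described above, from a predicted acquisition type to a BIDS-style path, might look roughly like the following sketch. It is illustrative only and is not the authors' implementation: the label set in `BIDS_MAP`, the function name `bids_path`, and its parameters are assumptions introduced here, and the classifier itself is omitted (the predicted type is simply passed in as a string).

```python
import re
from pathlib import PurePosixPath

# Assumed label set for illustration; the abstract does not list the model's classes.
# Each predicted type maps to a (BIDS datatype directory, BIDS filename suffix) pair.
BIDS_MAP = {
    "T1w": ("anat", "T1w"),
    "T2w": ("anat", "T2w"),
    "FLAIR": ("anat", "FLAIR"),
    "DWI": ("dwi", "dwi"),
}

# BIDS entity labels are restricted to letters and digits.
_LABEL = re.compile(r"[A-Za-z0-9]+")


def bids_path(subject, predicted_type, session=None, run=None):
    """Build a relative BIDS-style path for one image (hypothetical helper)."""
    if predicted_type not in BIDS_MAP:
        raise ValueError(f"no BIDS mapping for type {predicted_type!r}")
    for label in (subject, session):
        if label is not None and not _LABEL.fullmatch(label):
            raise ValueError(f"invalid BIDS label {label!r}")

    datatype, suffix = BIDS_MAP[predicted_type]

    # Entities appear in the filename in a fixed order: sub, ses, run, then suffix.
    entities = [f"sub-{subject}"]
    folders = [f"sub-{subject}"]
    if session is not None:
        entities.append(f"ses-{session}")
        folders.append(f"ses-{session}")
    if run is not None:
        entities.append(f"run-{int(run)}")

    filename = "_".join(entities + [suffix]) + ".nii.gz"
    return PurePosixPath(*folders, datatype, filename)


if __name__ == "__main__":
    print(bids_path("01", "FLAIR", session="02"))
    print(bids_path("01", "DWI", run=1))
```

Keeping the mapping in a plain lookup table separates the learned part (predicting the acquisition type) from the deterministic part (naming), so new acquisition types only require a new table entry in a sketch like this one.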

Author information

Contributions

A.B. wrote the main manuscript text, including tables and figures, and conceptualized the approach. A.B., M.S., N.B., and M.D. identified model features. A.B. and C.S. trained and validated the model. A.B. and S.S. implemented the model in a tool to transform MRI data to BIDS. All authors reviewed the manuscript.

Corresponding author

Correspondence to Michael G. Dwyer.

Ethics declarations

Competing Interests

The authors declare no competing interests.

About this article

Cite this article

Bartnik, A., Singh, S., Sum, C. et al. An Automated Tool to Classify and Transform Unstructured MRI Data into BIDS Datasets. Neuroinform (2024). https://doi.org/10.1007/s12021-024-09659-5

  • DOI: https://doi.org/10.1007/s12021-024-09659-5
