Artificial intelligence is defined in ISO/IEC TR 24028:2020 as the “capability of an engineered system to acquire, process and apply knowledge and skills”. Within this scope, we focus on image analysis products that use techniques such as machine learning and deep learning. To establish an overview of AI software products, we reviewed exhibitor lists from RSNA and ECR and marketplace offerings, and we monitored news sources for the appearance of new vendors, products, or certifications. In Europe, the CE mark is a prerequisite for medical devices to be allowed on the market; therefore, CE marking of the product by April 2020 was a requirement for inclusion. In addition, the product had to be vendor neutral and aid the radiologist in image interpretation in clinical practice. We excluded software used for dictation or image reconstruction at the source. Some vendors offer “suites” incorporating different software components performing different tasks, while other vendors market these components as separate products. To perform a balanced evaluation, we considered suite components as separate products. As the market is fast moving, we maintain an overview of products on www.aiforradiology.com. Discrepancies between the products included in this article and the Web resource may be caused by updates of the website, refusal of companies to appear on the website, and stricter inclusion criteria for this review.
All vendors were contacted to verify the collected information and supplement the product specifications. We retrieved information about the organ-based subspeciality, modality, and main task of the product. The date to market, method of deployment, and pricing model were also gathered. The CE status was verified by collecting the CE certificates or Declarations of Conformity from the vendors, because a public database does not exist yet (EUDAMED is planned for 2021). The American FDA (Food and Drug Administration) approval status was also gathered and confirmed against the public FDA database. The CE and FDA status reported in this study reflects the status in September 2020. For the most recent information, visit www.aiforradiology.com.
Scientific evidence for the efficacy of the AI products was gathered in two ways. First, PubMed was systematically searched by vendor and product name for peer-reviewed articles published between Jan 1, 2015, and May 18, 2020. Queries are provided in the supplementary materials, Table S1. Second, a manual search was performed by inspecting the vendors’ websites for listings of papers and by asking vendors to provide peer-reviewed papers. No date restriction was applied for the manual search.
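To illustrate the shape of such a search, the sketch below builds a PubMed query string that combines names with OR and restricts the publication date with the `[dp]` field tag. It is a minimal sketch only: the function `build_pubmed_query` and the names `ExampleVendor` and `ExampleProduct` are hypothetical, and the queries actually used are those listed in Table S1, which may be structured differently.

```python
from datetime import date


def build_pubmed_query(names, start, end):
    """Build a PubMed query matching any of the names within a date range.

    Hypothetical helper for illustration; not the queries from Table S1.
    """
    if not names:
        raise ValueError("at least one vendor or product name is required")
    # Quote each name so multi-word names are searched as phrases.
    name_clause = " OR ".join(f'"{name}"[All Fields]' for name in names)
    # PubMed date ranges use YYYY/MM/DD:YYYY/MM/DD with the [dp] tag.
    date_clause = f"{start:%Y/%m/%d}:{end:%Y/%m/%d}[dp]"
    return f"({name_clause}) AND ({date_clause})"


# Placeholder names; the date window is the one stated in the text.
query = build_pubmed_query(
    ["ExampleVendor", "ExampleProduct"],
    start=date(2015, 1, 1),
    end=date(2020, 5, 18),
)
print(query)
```

Known former product names, which the inclusion criteria also accept, could be passed as additional entries in the list of names.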
Included articles were original, peer-reviewed, and in English, and aimed to demonstrate the efficacy of the AI software. Papers were included when the product name (including known former names) and/or company name were mentioned, the tool was applied to in vivo human data, and efficacy of the product was reported on an independent dataset (data on which the algorithm was not trained). Letters, commentaries, reviews, study protocols, white papers, and case reports were excluded.
Each paper was assessed by two of the authors, who independently screened the title, abstract, and full text against the inclusion criteria. Disagreements were resolved by the reviewers in a consensus meeting.
Hierarchical model of efficacy
We propose an adapted hierarchical model of efficacy to categorize the papers with respect to the type of validation addressed. This model was originally developed by Fryback and Thornbury in 1991 as a structure to assess the contribution of diagnostic imaging to patient management. It comprises six levels that assess an innovation from technical efficacy at level 1 (does it do what it is supposed to do) up to societal efficacy at level 6 (how do the costs and benefits compare). We have adapted the definitions and split level 1 into two subtypes to better accommodate the appraisal of scientific evidence regarding the contribution of AI software to the diagnostic imaging process. The adapted definition of each level is given in Table 1.
Products were categorized according to the targeted subspeciality, modality, main task, CE marking, FDA clearance, deployment method, and pricing model. We calculated the mean time between the founding of a company and the market introduction of its first AI product, excluding companies that were founded before 2005.
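The time-to-market calculation can be sketched as follows. This is an illustrative sketch under stated assumptions: dates are reduced to calendar years, the `Company` record and the three example companies are hypothetical, and the only rule taken from the text is the exclusion of companies founded before 2005.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Company:
    name: str
    founded: int        # year the company was founded
    first_product: int  # year its first AI product came to market


def mean_time_to_market(companies, earliest_founding=2005):
    """Mean years from founding to first AI product.

    Companies founded before `earliest_founding` are excluded.
    Returns None when no company is eligible.
    """
    eligible = [c for c in companies if c.founded >= earliest_founding]
    if not eligible:
        return None
    return mean(c.first_product - c.founded for c in eligible)


# Hypothetical data, not taken from the study.
companies = [
    Company("A", founded=1998, first_product=2018),  # excluded: founded before 2005
    Company("B", founded=2014, first_product=2018),  # 4 years
    Company("C", founded=2016, first_product=2019),  # 3 years
]
result = mean_time_to_market(companies)
print(result)  # 3.5
```

Excluding the older companies keeps long-established firms, whose first AI product may arrive decades after founding, from dominating the mean.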
We report the available scientific evidence and the level of efficacy the papers addressed. Multiple levels could be addressed by a single paper. For each article, we reviewed the author list, funding source, and disclosures to categorize the publication as vendor independent or not. The data used in the included papers were categorized by the number of centers, countries, and acquisition machine manufacturers they originated from. We aggregated this information per product to give insight into the total number of centers, countries, and manufacturers addressed in the total evidence for that product.
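One way to carry out the per-product aggregation is sketched below. It assumes that centers, countries, and manufacturers can be identified by name, so that an entry appearing in several papers is counted once; if only counts per paper were available, summing them could overcount. The function `aggregate_per_product`, the dictionary layout, and all example values are hypothetical.

```python
from collections import defaultdict

KEYS = ("centers", "countries", "manufacturers")


def aggregate_per_product(papers):
    """Count distinct centers, countries, and manufacturers per product."""
    totals = defaultdict(lambda: {key: set() for key in KEYS})
    for paper in papers:
        entry = totals[paper["product"]]
        for key in KEYS:
            # Set union merges entries that recur across papers.
            entry[key] |= set(paper[key])
    return {
        product: {key: len(values) for key, values in sets.items()}
        for product, sets in totals.items()
    }


# Hypothetical papers, not taken from the study.
papers = [
    {"product": "ProductX", "centers": ["c1", "c2"],
     "countries": ["NL"], "manufacturers": ["m1"]},
    {"product": "ProductX", "centers": ["c2", "c3"],
     "countries": ["NL", "DE"], "manufacturers": ["m1", "m2"]},
    {"product": "ProductY", "centers": ["c4"],
     "countries": ["US"], "manufacturers": ["m3"]},
]
summary = aggregate_per_product(papers)
print(summary["ProductX"])  # {'centers': 3, 'countries': 2, 'manufacturers': 2}
```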