1 Introduction

The evolution of the development and use of computers to solve problems intelligently predates Alan Turing's 1950 proposal of the Turing test. Such systems aim to interface with human beings while exhibiting a level of intelligence comparable to that of humans. This line of work was motivated earlier by 1940s systems such as ENIAC, which aimed to emulate human learning and thinking. One outcome was computer game applications that compete with human players. This in turn motivated the design of the perceptron, which culminated in the broader design of machine learning methods used for classification. Further research and applications in statistics have advanced machine learning, so that the intersection of statistics and computer science has driven studies on artificial intelligence (AI). In this section, we organize the discussion to provide background knowledge on AI, ML and deep learning (DL). We summarize the multi-disciplinary approach to ML research to show recent methods and the major application areas of ML in addressing real-world problems. We conclude this section by motivating the bibliometric analysis and highlighting the study's contribution.

1.1 A Brief Background of ML and Its Evolution from AI-ML-DL

The drive to replace human capability with machine intelligence led to the evolution of various methods of AI, which is now defined as the science and engineering of achieving machine intelligence, often exhibited in the form of computer programs and often involving controlling and receiving signals from hardware [1]. An upsurge of research in AI has resulted in machines that now perform complex tasks intelligently. Several AI paradigms have evolved, including natural language processing (NLP), constraint satisfaction, machine learning, distributed AI, machine reasoning, data mining, expert systems, case-based reasoning (CBR), knowledge representation, programming, robotics, belief revision, neural networks, theorem proving, theory of computation, logic, and genetic algorithms. This evolution follows a historical trend, shown in Fig. 1, of continuous improvement of methods and algorithms to increase the accuracy of machine intelligence. The timeline shows that research in AI advanced through challenging exploits until around the 1970s, when ML conceptualization began to yield interesting results and performances. With these advances came the challenge of addressing ethical issues, so that AI-driven systems neither infringe on human rights nor have their moral status compromised [2]. That notwithstanding, the evolution has progressed from Turing's basic concept to the current Industry 4.0 by connecting multi-disciplinary approaches drawn not only from computer science but also from psychology, philosophy, neuroscience, biology, mathematics, sociology, linguistics, and other areas [3].

Fig. 1

Evolution of AI from dream to reality

The field of machine learning (ML) branched out of AI and focuses on developing computational methods and algorithms for learning, building learning machines that leverage the natural patterns present in observed features. ML is reputed to have advanced AI dramatically because of its problem-solving approach of recognizing patterns in domain-specific datasets to gather artificial experience from the observed data. This follows a pipeline of data extraction, model training, and prediction on new data. This learning, shown in Fig. 2, has evolved into different perspectives, including the popular supervised, unsupervised, semi-supervised, and reinforcement learning paradigms. Over the years, algorithms have been designed and refined within each learning paradigm. These algorithms address real-life classification and regression problems using supervised learning methods, clustering and association using unsupervised learning methods, and the problem of understanding and manoeuvring an environment using reinforcement learning. The learning process in ML uses both symbolic and numeric methods, as incorporated into popular algorithms such as linear regression, nearest neighbor, Gaussian naive Bayes, decision trees, support vector machines (SVM), random forests, K-Means, density-based spatial clustering of applications with noise (DBSCAN), balanced iterative reducing and clustering using hierarchies (BIRCH), temporal difference (TD) learning, Q-Learning, and deep adversarial networks. The design of these algorithms draws on a broad domain of statistics, genetic algorithms, computational learning theory, neural networks, stochastic modeling, and pattern recognition. The resulting algorithms have demonstrated state-of-the-art performance in email filtering, NLP, pattern recognition, computer vision and autonomous vehicle design.
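To make the distinction between these learning paradigms concrete, the following minimal Python sketch contrasts a supervised regression model with an unsupervised clustering model on synthetic data; it is purely illustrative and is not drawn from any of the studies cited in this paper.

```python
# Minimal sketch contrasting two learning paradigms described above:
# supervised learning (regression) and unsupervised learning (clustering).
# The data are synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))

# Supervised: labels (here a noisy linear target) guide the learning.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)
reg = LinearRegression().fit(X, y)
print("Learned coefficients:", reg.coef_)   # approximately [3, -2]

# Unsupervised: no labels; K-Means discovers structure from the features alone.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("Cluster sizes:", np.bincount(clusters))
```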

Fig. 2

Evolution of machine learning from supervised learning to reinforcement learning

Deep learning (DL) belongs to the broader family of ML and can analyse data intelligently through transformations, graph technologies and representation patterns. Derived from simulating the human brain with basic artificial neural networks (ANNs), deep architectures such as convolutional neural networks are designed in a manner that often outperforms traditional ML algorithms. The approach leverages increasingly available training data from sensors, the Internet of Things (IoT), surveillance systems, intrusion detection systems, cybersecurity, mobile, business, social media, health, and other sources. These data, often in unstructured formats, are analyzed with automated feature identification leading to either classification or regression analysis [4]. DL has been widely adopted to address application problems in audio and speech processing, visual data, and NLP. DL architectures include the convolutional neural network (CNN), the most popular and widely used DL network, as well as recursive neural networks (RvNNs), recurrent neural networks (RNNs), Boltzmann machines (BMs), and auto-encoders (AEs). While the RNN is often applied to text or signal processing, the RvNN, which uses a hierarchical structure, can classify outputs using compositional vectors [5]. Results from different studies show that DL has achieved outstanding performance across a variety of applications [6]. This has motivated its integration into reinforcement learning to achieve deep reinforcement learning (DRL). Considering this evolution and the performance of ML and DL, the next sub-section presents brief research and applications of methods in this field among African researchers.
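As an illustration of the kind of convolutional architecture described above, the following minimal TensorFlow/Keras sketch builds a small CNN; the input shape and number of output classes are illustrative assumptions rather than details taken from any cited study.

```python
# Minimal sketch of a small convolutional neural network (CNN); the input
# shape and class count are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(64, 64, 3)),          # e.g. small RGB images (assumed)
    layers.Conv2D(16, 3, activation="relu"),  # convolutions learn local features
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),   # 10 hypothetical output classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```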

1.2 A Brief Background of Multi-disciplinary ML Research Contribution from Different Scientists Across Major African Universities

There is widespread research using ML to address contextual problems across African countries. In many cases, this is promoted by a local conference called Indaba, which promotes the application of DL and ML and helps ensure that knowledge, capacity, and excellence in ML research and application are harnessed to develop the continent. In this section, studies on ML and DL in Africa are reviewed to demonstrate the level of involvement of researchers in ML research. AI-based research is reported to be improving communities across the Sub-Saharan Africa (SSA) regions. In Kenya, it is being applied to aid health worker–patient interaction in detecting blinding eye disorders, and in Egypt, to aid automated decision-making systems for healthcare support. In South Africa, it is aiding drug prescription and, with a multinomial logistic classifier-based application, human resource planning. In Nigeria, ML-trained models are deployed primarily in medicine, for example in the diagnosis of birth asphyxia and the identification of fake drugs. Other cases are the use of ML to diagnose diabetic retinopathy in Zambia and pulmonary tuberculosis in Tanzania [7].

Studies in ML application from Morocco cut across medicine, solar power and climate. In particular, deep learning models, specifically CNNs, have been proposed for detecting and classifying breast cancer cases using histopathology samples [8]. An RNN variant of a DL model has been adapted to simulate daily streamflow over the Ait Ouchene watershed (AIO); the study used the Long Short-Term Memory (LSTM) network, a type of RNN, to achieve this simulation [9]. Research applying ML methods in remote sensing, using a popular algorithm such as support vector machines (SVM) for lithological mapping of Souk Arbaa Sahel, has been reported by Bachri et al. [10]. In the financial sector, researchers have investigated the use of ML to revolutionize the banking ecosystem for precise credit scoring, regulation and operational approaches [11]. The country's location also motivates research on using ML to harness solar power for grid management at power plants; both ML algorithms and DL have been employed for predicting solar radiation using models such as ANN, multi-layer perceptron (MLP), back propagation neural network (BPNN), deep neural network (DNN), and LSTM [12].

In Egypt, ML research has been applied to education, face recognition, visual surveillance, and optical character recognition (OCR). In one study, the school-dropout rate was investigated and predicted using ML algorithms, specifically a logistic classifier; the model can identify students at risk of dropping out of school and isolate the causes of this challenge [13]. A novel hybrid DL model capable of extracting features supportive of face recognition has been proposed, with the trained model applied to build a face clustering system based on density-based spatial clustering of applications with noise (DBSCAN) [14]. Similarly, generative adversarial networks (GANs), a composition of DL models adversarially positioned for generative purposes, have been investigated for kinship face synthesis [15]. Identification systems have also been built using CNNs that extract input from video files for visual surveillance [16]. The contextualization of OCR systems to solve local problems has been researched using CNN, DNN and SVM classifiers to recognise different classes accurately [17]. Another interesting application of DL is Automatic License Plate Detection and Recognition (ALPR) for Egyptian license plates (ELP) [18].

In Nigeria, ML and DL models have been applied primarily to medicine, security and climate issues. For instance, the use of CNNs to classify breast cancer in digital mammograms has been reported [19]. In related work, performance enhancement techniques such as data augmentation for improving DL models have been researched by Oyelade & Ezugwu [20], who used a CNN model to detect architectural distortion in breast images. Concerning the challenge of deploying ML methods to address COVID-19, studies have used DL architectures to detect and classify the disease in chest X-ray samples [21]. Similarly, harnessing Internet of Things (IoT) deployments with ML algorithms to curb the spread of COVID-19 has been advocated [22]. On the issue of security, an investigative study has assessed the level of deployment of AI and its associated ML methods in curbing terrorism and insurgency in Nigeria [23]. Artificial neural network (ANN) and logistic regression (LR) models have also been used to predict floods in susceptible areas of Nigeria [24]. Regarding finance and the digital economy, AI-based methods have been recommended for innovation and policy-making [25].

Researchers in Uganda have also employed AI in healthcare management by observing the performance of an AI algorithm called Skin Image Search applied to dermatological tasks. The algorithm was trained on a local dataset from The Medical Concierge Group (TMCG) to analyze images and extract gender, age and dermatological diagnosis [26]. A researcher from Kenya confirmed that an investment of US$74.5 million is being made to support the use of ML models in healthcare [27]. In the same country, a DL architecture, namely the LSTM network, has been investigated for drought management by forecasting vegetation health [28].

Research on the application of ML is widespread in South Africa, with particular attention to language processing, medical image analysis, and astronomy. In addition to ML algorithms, DL and NLP have been well-researched to aid development [29]. The generative model GAN has been applied to automatic speech recognition (ASR), improving the features of mismatched data prior to decoding [30]. In related work, ASR systems have been researched by combining multi-style training (MTR) with a deep neural network hidden Markov model (DNN-HMM) [31]. The use of CNNs to explore classification accuracy on SNR data has been reported by Andrew et al. [32]. One study investigated the role of loss functions in the optimization behavior of deep neural networks [33]. Feedforward neural networks have been used to study the space physics problem of storm forecasting [34]. Optimizing hyperparameters in embedding algorithms has been considered for improving the training of word embeddings with speech-recognized data [35].

These indications show a strong increase in ML research, including its associated sub-fields of DL and NLP, at African universities, with most applications aimed at healthcare, climate, and security. In the following sub-section, we summarize the major application areas of ML on the continent. This is necessary to give perspective to the current state of ML research in the domain and to serve as motivation for future research on ML.

1.3 A Brief Highlight on the Significance of ML Application Within the Continent

Findings from the review process detailed in this study show that the fields of medicine and healthcare delivery management, agricultural studies, security and surveillance, natural language modelling and processing, and many others have benefited immensely from the application of ML on the continent. These ML applications include research on DL for intrusion detection in cyber security and, likewise, for DDoS detection in cloud computing. Disease detection in plants and crops has also been investigated using ML algorithms, with tomato disease detection as an example. Sugarcane leaf nitrogen concentration estimation and the mapping of irrigated areas using Google Earth Engine have also been reported. Several sub-fields of medicine have received research attention in promoting healthcare delivery and improving disease detection and management; examples of ML methods in this area are automatic sleep stage classification, face mask detection in the era of the COVID-19 pandemic, protein sequence classification, and the analysis of temporal gene expression data. Several studies have also addressed the design of optimization and clustering methods to solve difficult optimization problems in engineering, medicine and science. NLP methods have received wide consideration for mainstreaming the use of local languages across the continent, including translating the Yoruba language to French, automation regarding the use of Swahili, and automatic Arabic diacritization. Other interesting areas generating applications of ML algorithms on the continent are optical communications and networking [36], the deployment of AI to software engineering problems [37], and advancing medical research and appropriating clinical artificial intelligence in check-listing research [38].

1.4 Motivation and Need for the Current Bibliometric Analysis Study

This study is motivated by the availability of large research databases providing a considerable number of publications and research outputs suitable for the search required here. This data availability guided the decision to use bibliometric techniques to draw important findings from the data collected from the scientific databases. Bibliometrics facilitates the examination of large bodies of knowledge within and across disciplines. The use of bibliometric techniques supports the aim of identifying hidden but useful patterns capable of illustrating the research trend in ML and DL among researchers at African universities. The study also intends to leverage the presentational nature of bibliometric analysis so that policymakers can easily discover interesting ML research on the continent to aid their decision-making.

We also expect the proposed method to allow the discovery of leading contributors to ML research and to uncover new directions and themes for future research in ML. As observed in subsequent sections, bibliometric analysis enabled us to evaluate the impact of publications by region, research institution and author and to obtain relevant scientific information on the topic. The quantitative, scalable and transparent approach of bibliometric analysis aligns it closely with informetrics and scientometrics. The approach to applying bibliometric techniques to achieve the aim of the study is outlined in the methodology section.

The following highlights are the major contributions of this study:

  • We apply an analysis of research publications to uncover developments in ML at African universities.

  • The study identifies core research in ML and DL, the authors involved, and their relationships, covering publications from African researchers.

  • We analyze the research status and frontier directions and predict the future of ML research in Africa.

  • Entities such as authors, institutions and countries associated with African universities are analyzed and compared in terms of their research outputs.

The remaining part of the paper is organized as follows: Sect. 2 describes the data collection process and the methodology used in this paper. Extensive bibliometric analysis is performed in Sect. 3, and this section covers the presentation of significant narratives and a detailed discussion of findings from the conducted study analysis. We provide a detailed literature review of the last few years in Sect. 4. Section 5 concludes the paper by summarizing the study’s findings of 30 years of ML-dedicated research efforts in several universities across the African continent.

2 Methodology

To perform the bibliometric analyses, data were extracted from the online database of the Science Citation Index Expanded (SCI-EXPANDED) on 10 October 2022. Quotation marks (“”) and the Boolean operator “or” were used to ensure the appearance of at least one search keyword in the TOPIC fields (title, abstract, author keywords, and Keywords Plus) from 1991 to 2021 [73]. The search keywords “machine learning”, “machining learning”, “machine learnable”, “machine learn”, “machine learns”, “machine learners”, “machine learner”, “machine learnings”, “machines learning”, “machine learnt”, “machine learned”, and “machines learn” found in SCI-EXPANDED were considered. To improve the accuracy of the results, terms with missing spaces were also included, namely “machine learningmethods”, “machine learningmetrics”, “machine learningbased”, “machine learningalgorithm”, and “machine learningclassifiers”. Furthermore, misspelt variants such as “machine learnig”, “machine learnin”, “maching learning”, and “machin learning” were used as search keywords. African countries, including “Algeria”, “Angola”, “Benin”, “Botswana”, “Burkina Faso”, “Burundi”, “Cameroon”, “Cape Verde”, “Cent Afr Republ”, “Chad”, “Comoros”, “Dem Rep Congo”, “Rep Congo”, “Cote Ivoire”, “Djibouti”, “Egypt”, “Equat Guinea”, “Eritrea”, “Eswatini”, “Ethiopia”, “Gabon”, “Gambia”, “Ghana”, “Guinea”, “Guinea Bissau”, “Kenya”, “Lesotho”, “Liberia”, “Libya”, “Madagascar”, “Malawi”, “Mali”, “Mauritania”, “Mauritius”, “Morocco”, “Mozambique”, “Namibia”, “Niger”, “Nigeria”, “Rwanda”, “Sao Tome & Prin”, “Senegal”, “Seychelles”, “Sierra Leone”, “Somalia”, “South Africa”, “South Sudan”, “Sudan”, “Tanzania”, “Togo”, “Tunisia”, “Uganda”, “Zambia”, and “Zimbabwe”, were searched in terms of the country field (CU). A total of 2770 documents, including 2477 articles, were found in SCI-EXPANDED from 1993 to 2021. The PRISMA flow diagram in Fig. 3 summarizes the review process of finding published data on the topic and the authors’ decisions on whether to include it in the review. This study selected only articles from SCI-EXPANDED using the keywords explained above.
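For illustration, the assembly of such a search string can be sketched as follows. This is a simplified reconstruction of the search logic using Web of Science-style TS (topic) and CU (country) field tags, not the exact query executed for this study, and the keyword and country lists below are abbreviated.

```python
# Illustrative assembly of the topic-and-country search described above;
# a sketch of the search logic, not the exact string executed for this study.
ml_variants = [
    "machine learning", "machining learning", "machine learnable", "machine learn",
    "machine learns", "machine learners", "machine learner", "machine learnings",
    "machines learning", "machine learnt", "machine learned", "machines learn",
    # terms with missing spaces and common misspellings, as listed above (abbreviated)
    "machine learningmethods", "machine learningbased", "machine learnig", "machin learning",
]
african_countries = ["Algeria", "Angola", "Benin", "Egypt", "Nigeria", "South Africa"]  # abbreviated

topic_query = " OR ".join(f'"{kw}"' for kw in ml_variants)        # TS = TOPIC field
country_query = " OR ".join(f'"{c}"' for c in african_countries)  # CU = country field
query = f"TS=({topic_query}) AND CU=({country_query})"
print(query)
```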

Fig. 3

A summary of the data extraction and screening process from SCI-EXPANDED

Keywords Plus provides additional search terms extracted from the titles of articles cited by authors in their bibliographies and footnotes in the Institute for Scientific Information (ISI, now Clarivate Analytics) database. It substantially augments title-word and author-keyword indexing [39]. It has been noticed that documents retrieved only through Keywords Plus can be irrelevant to the search topic [40]. Ho’s group first proposed the “front page” as a filter to reduce this bias by using the data from SCI-EXPANDED directly, including the article title, abstract, and author keywords [41]. A significant difference has been found when using the ‘front page’ as a filter in bibliometric research across a wide range of journals classified in SCI-EXPANDED, for example, Frontiers in Pharmacology [42], Chinese Medical Journal [43], Environmental Science and Pollution Research [44], Water [45], Science of the Total Environment [46], and Journal of Foot and Ankle Surgery [47]. The ‘front page’ filter avoids introducing unrelated publications into the bibliometric analysis.

The entire record and the annual number of citations for each document were checked and placed into Microsoft 365 Excel, and additional coding was executed manually. Excel functions such as Concatenate, Counta, Freeze Panes, Len, Match, Proper, Rank, Replace, Sort, Sum, and Vlookup were applied. The journal impact factors (IF2021) were based on the Journal Citation Reports (JCR) issued in 2021.

In the SCI-EXPANDED database, the corresponding author is designated as the “reprint author”; in this study, “corresponding author” is used as the primary term [48]. In single-author articles where authorship is not specified, the single author is considered both the first and the corresponding author [49]. Likewise, in single-institution articles, the institution is classified as both the first-author and the corresponding-author institution [50]. In articles with multiple corresponding authors, all corresponding authors, institutions, and countries were considered. For more accurate analysis results, affiliations were checked and reclassified. Author affiliations in England, Scotland, Northern Ireland, and Wales were regrouped under the heading of the United Kingdom (UK) [51]. Furthermore, some corresponding-author records in SCI-EXPANDED contain only an address without the affiliation name; in such cases, the address was mapped to the name of the affiliation.

Six publication indicators are used to assess the publication performance of countries and institutions [52, 53]: TP: total number of articles; IP: number of single-country articles (IPC) or single-institution articles (IPI); CP: number of internationally collaborative articles (CPC) or inter-institutionally collaborative articles (CPI); FP: number of first-author articles; RP: number of corresponding-author articles; and SP: number of single-author articles. Moreover, publications were assessed using the following citation indicators: Cyear: the number of citations from the Web of Science Core Collection in a given year (e.g. C2021 describes the citation count in 2021) [48]; and TCyear: the total citations from the Web of Science Core Collection received from the publication year until the end of the most recent year (2021 in this study, TC2021) [53, 54].

Six citation indicators (CPP2021) related to the six publication indicators were also applied to evaluate the publication impact of countries and institutions [55]: TP-CPP2021: the total TC2021 of all articles per the total number of articles (TP); IP-CPP2021: the total TC2021 of all single-country articles per the number of single-country articles (IPC-CPP2021), or of all single-institution articles per the number of single-institution articles (IPI-CPP2021); CP-CPP2021: the total TC2021 of all internationally collaborative articles per the number of internationally collaborative articles (CPC-CPP2021), or of all inter-institutionally collaborative articles per the number of inter-institutionally collaborative articles (CPI-CPP2021); FP-CPP2021: the total TC2021 of all first-author articles per the number of first-author articles (FP); RP-CPP2021: the total TC2021 of all corresponding-author articles per the number of corresponding-author articles (RP); and SP-CPP2021: the total TC2021 of all single-author articles per the number of single-author articles (SP).
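As a worked illustration of these indicators, the following sketch computes TP, IPC, CPC and their corresponding CPP2021 values for a single country from a small set of hypothetical records; the field names and values are invented for the example.

```python
# Worked toy example of the publication and citation-per-publication indicators
# defined above; the records and field names are hypothetical.
records = [
    # each record: the article's country list and its TC2021
    {"countries": ["Egypt"], "tc2021": 40},
    {"countries": ["Egypt", "Saudi Arabia"], "tc2021": 10},
    {"countries": ["South Africa", "Germany"], "tc2021": 30},
]

def indicators(country, recs):
    mine = [r for r in recs if country in r["countries"]]
    single = [r for r in mine if len(r["countries"]) == 1]   # IPC: single-country articles
    collab = [r for r in mine if len(r["countries"]) > 1]    # CPC: internationally collaborative
    cpp = lambda rs: sum(r["tc2021"] for r in rs) / len(rs) if rs else 0.0
    return {
        "TP": len(mine), "IPC": len(single), "CPC": len(collab),
        "TP-CPP2021": cpp(mine), "IPC-CPP2021": cpp(single), "CPC-CPP2021": cpp(collab),
    }

print(indicators("Egypt", records))
# {'TP': 2, 'IPC': 1, 'CPC': 1, 'TP-CPP2021': 25.0, 'IPC-CPP2021': 40.0, 'CPC-CPP2021': 10.0}
```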

3 Results and Discussion

3.1 Document Type and Language of Publication

Characterizing document types by their CPPyear and the average number of authors per publication (APP) has been proposed as basic document-type information for a research topic [56]. Recently, the median number of authors has also been applied to research topics that include documents with large numbers of authors [57]. Using the citation indicators TCyear and CPPyear has advantages over citation counts taken directly from the Web of Science Core Collection because of their invariance, which ensures reproducibility [58]. A total of 2761 machine learning-related documents by authors affiliated with institutions in Africa and published in SCI-EXPANDED from 1993 to 2021 were found across 11 document types, which are detailed in Table 1. The majority were articles (89% of the 2761 documents), with an APP of 15 and a median of 4.0.

Table 1 Citations and authors based on the document types

The article with the largest number of authors is “Machine learning risk prediction of mortality for patients undergoing surgery with perioperative SARS-CoV-2: the COVIDSurg mortality score” [59], published by 4,819 authors from 784 institutions in 71 countries, including the African countries Egypt, Ethiopia, Gabon, Libya, Morocco, Nigeria, South Africa, Sudan, and Zimbabwe. Reviews, with 235 documents, had the greatest CPP2021 value of 18, which was 1.6 times that of articles. Five of the top 12 most frequently cited documents were reviews, by Carleo et al. [60] (TC2021 = 426; rank 3rd), Merow et al. [61] (TC2021 = 268; rank 6th), Nathan et al. [62] (TC2021 = 231; rank 9th), Oussous et al. [63] (TC2021 = 206; rank 11th), and Ben Taieb et al. [64] (TC2021 = 204; rank 12th).

Articles, as a Web of Science document type, were further analyzed because they include the full research hypotheses, methods and results. Only three non-English articles were found; these were published in French in Traitement du Signal [65, 66] and Annales des Télécommunications [67].

3.2 Characteristics of Publication Outputs

The relationship between the annual number of articles (TP) and their CPPyear over the years in a research field has been applied as a unique indicator [68]. Machine learning research received little attention in Africa before 2010, with fewer than 10 articles per year. In Africa, Elgamal, Rafeh, and Eissa from Cairo University in Egypt first mentioned “machine learning” as an author keyword in Case-based reasoning algorithms applied in a medical acquisition tool [69]. The number of articles increased gradually from 14 in 2010 to 98 in 2017 (Fig. 4). After that, a sharply rising trend reached 1035 articles in 2021. The highest CPP2021 was 54 in 2013, which can be attributed to the article entitled Multiobjective intelligent energy management for a microgrid [70], with a TC2021 of 402 (rank 3rd).

Fig. 4

Number of articles and the average number of citations per publication by year

3.3 Web of Science Categories and Journals

African machine learning-related articles were published in 903 journals classified in 159 of the 178 Web of Science categories in SCI-EXPANDED. Recently, characterizing Web of Science categories by TP, APP, CPP2021, and the number of journals in each category was proposed [71]. Table 2 shows the top 12 productive Web of Science categories with over 100 articles. A total of 906 articles (37% of 2468 articles) were published in the top four productive categories: electrical and electronic engineering, containing 278 journals (385 articles; 20% of 2468 articles); information systems computer science, containing 164 journals (439 articles; 18%); artificial intelligence computer science, containing 145 journals (334 articles; 14%); and telecommunications, containing 94 journals (308 articles; 12%). Among the top 12 productive categories, articles published in the ‘interdisciplinary applications computer science’ and ‘remote sensing’ categories had the greatest CPP2021 of 15, while articles in the ‘information systems computer science’ category had a lower CPP2021 of 8.9. Articles in the ‘environmental sciences’ category had the greatest APP of 6.6, while articles in the ‘artificial intelligence computer science’ category had an APP of 3.5. The interaction of publication development among Web of Science categories is discussed using plots of the number of publications versus the year of publication [72]. Figure 5 shows the development trends of the top four Web of Science categories with more than 300 articles. The first articles were published in 1993 and 1997 in the ‘information systems computer science’ and ‘electrical and electronic engineering’ categories, respectively. However, more articles have been published in the ‘electrical and electronic engineering’ category since 2014. The first article in the ‘telecommunications’ category appeared in 2015; this category increased sharply from 2018 and reached 139 articles in 2021, much higher than the 93 articles in the ‘artificial intelligence computer science’ category.

Table 2 Top 12 most productive Web of Science categories with TP > 100
Fig. 5

Development of the top four productive Web of Science categories, TP > 300

Recently, characterizing journals by their CPPyear and APP was proposed as basic journal information for a research topic [73, 74]. Table 3 shows the top 12 most productive journals with their journal impact factors, CPP2021, and APP. IEEE Access (IF2021 = 3.476) published the most articles (192, or 7.8% of 2468). Among the top 12 productive journals, articles published in Expert Systems with Applications (IF2021 = 8.665) had the greatest CPP2021 of 30, whereas articles in CMC-Computers Materials & Continua (IF2021 = 3.860) had a CPP2021 of only 2.2. The APP ranged from 16 in the Monthly Notices of the Royal Astronomical Society to 2.8 in the Journal of Big Data. According to IF2021, the five journals with an IF2021 above 60 were World Psychiatry (IF2021 = 79.683) with two articles, Nature (IF2021 = 69.504) with one article, Nature Energy (IF2021 = 67.439) with one article, Nature Reviews Disease Primers (IF2021 = 65.038) with one article, and Science (IF2021 = 63.714) with one article.

Table 3 Top 12 most productive journals with TP > 20

3.4 Publication Performances: Countries

Altogether, 649 articles (26% of 2468 articles) were single-country articles from 16 African countries with an IPC-CPP2021 of 10, and 1819 (74%) were internationally collaborative articles from 146 countries, including 43 African countries and 103 non-African countries, with a CPC-CPP2021 of 12. The results show that international collaboration slightly increased citations. Six publication indicators and six related citation indicators (CPP2021) [55] were applied to compare the 44 African countries (Table 4). Egypt dominated in all six publication indicators, with a TP of 777 articles (31% of 2468 articles), an IPC of 186 articles (29% of 649 single-country articles), a CPC of 591 articles (32% of 1819 internationally collaborative articles), an FP of 345 articles (14% of 2468 first-author articles), an RP of 449 articles (18% of 2467 corresponding-author articles), and an SP of 21 articles (32% of 66 single-author articles). Among the top 17 productive countries with 20 articles or more, Sudan had a TP of 33 articles, an IP of 3 articles, a CP of 30 articles, an FP of 6 articles, and an SP of 3 articles, with the greatest TP-CPP2021 of 20, IPC-CPP2021 of 23, CPC-CPP2021 of 20, FP-CPP2021 of 14, and SP-CPP2021 of 23. Libya had an FP of 3 articles and an RP of 4 articles, with the greatest FP-CPP2021 of 14 and RP-CPP2021 of 23. Ten of the 54 African countries, namely Angola, Cape Verde, Central African Republic (Cent Afr Republ), Comoros, Djibouti, Equatorial Guinea (Equat Guinea), Eritrea, Sao Tome and Principe (Sao Tome & Prin), Seychelles, and South Sudan, had no machine learning-related articles in SCI-EXPANDED. Among the 44 African countries that published machine learning-related articles, 28 countries (64%) had no single-country articles, while only Niger had no internationally collaborative articles. Similarly, 14 (32%), 10 (23%), and 35 (80%) countries had no first-author, corresponding-author, and single-author articles, respectively.

Table 4 African countries published machine learning articles

Development trends in the publications of the top six productive countries with more than 100 articles are presented in Fig. 6. The first machine learning-related article in Africa, from Egypt, dates back to 1993. In 1995, 1998, 2001, 2004, and 2009, the first articles were published by South Africa, Tunisia, Morocco, Algeria, and Nigeria, respectively. Egypt and South Africa had similar development trends; however, Egypt's output increased sharply in the last three years, reaching 324 articles in 2021. Algeria and Tunisia also had similar development trends.

Fig. 6

Development of the top six productive countries with TP > 100

Ten of the 103 non-African countries had 100 internationally collaborative articles or more with Africa, as shown in Fig. 7. The USA had a CPC of 431 articles with CPC-CPP2021 of 15, followed by Saudi Arabia (CPC of 338 articles; CPC-CPP2021 of 9.3), the UK (295 articles; 14), China (252; 14), France (211; 10), India (174; 11), Germany (156; 20), Canada (154; 14), Australia (146; 13), and Spain (124; 14).

Fig. 7

Development of the top five most collaborative countries with Africa, TP > 200

3.5 Publication Performances: Institutions

Concerning institutions, 382 African articles (15% of 2468 articles) originated from single institutions with an IPI-CPP2021 of 9.9, while 2086 articles (85%) were inter-institutional collaborations with a CPI-CPP2021 of 12. Institutional collaboration slightly increased citations. The top 20 productive African institutions and their characteristics are presented in Table 5. Cairo University in Egypt ranked top with a TP of 142 articles (5.8% of 2468 articles) and a CPI of 127 articles (6.1% of 2086 inter-institutionally collaborative articles). However, the University of KwaZulu-Natal in South Africa ranked top in three of the six publication indicators, with an IP of 19 articles (5.0% of 382 single-institution articles), an FP of 48 articles (1.9% of 2468 first-author articles), and an RP of 64 articles (2.6% of 2467 corresponding-author articles). In addition, the University of Johannesburg and the Council for Scientific and Industrial Research (CSIR), both in South Africa, each ranked top with an SP of four articles (6.1% of 66 single-author articles). Among the top 20 African institutions, the University of KwaZulu-Natal had a TP of 104 articles, a CPI of 85 articles, an FP of 48 articles, and an RP of 64 articles, with the greatest TP-CPP2021 of 24, CPI-CPP2021 of 27, FP-CPP2021 of 27, and RP-CPP2021 of 34. The University of Pretoria in South Africa had an IPI of 15 articles with the greatest IPI-CPP2021 of 20, while Mansoura University in Egypt had an SP of two articles with the greatest SP-CPP2021 of 29.

Table 5 Top 20 most productive African institutions

Five non-African institutions had 30 or more inter-institutionally collaborative articles with Africa. King Saud University in Saudi Arabia had a CPI of 62 articles with a CPI-CPP2021 of 10, followed by Taif University in Saudi Arabia (CPI of 39 articles; CPI-CPP2021 of 2.3), the University of Oxford in the UK (38 articles; 15), King Abdulaziz University in Saudi Arabia (34; 4.1), and Prince Sattam Bin Abdulaziz University in Saudi Arabia (31; 7.6).

3.6 Citation Histories of the Ten Most Frequently Cited Articles

The total citations in the Web of Science Core Collection are updated from time to time. To improve on bibliometric studies that use data directly from the database, total citations from the Web of Science Core Collection from the year of publication to the end of the most recent year, 2021 (TC2021), were applied [74]. The citation history of the most frequently cited articles, assessed by TCyear, is presented to help understand the impact history of the articles [48, 53, 74]. Highly cited articles may not always significantly impact a research field [49, 50, 53]. Table 6 shows the top ten most frequently cited machine learning-related articles in Africa. Five of the top ten articles were published by Egypt, followed by South Africa with two articles and one each by Nigeria, Kenya, and Morocco.

Table 6 The top ten most frequently cited articles by African countries

The most cited article, entitled Ranger: a fast implementation of random forests for high dimensional data in C++ and R [74], by Wright and Ziegler from the University of Lübeck in Germany and the University of KwaZulu-Natal in South Africa, had a TC2021 of 683 (rank 1st) and a C2021 of 330 (rank 2nd). The article entitled Peeking inside the black-box: A survey on explainable artificial intelligence (XAI) [75], by Adadi and Berrada from Sidi Mohammed Ben Abdellah University in Morocco, had the greatest impact in the most recent year, 2021, with a C2021 of 435 (rank 1st) and a TC2021 of 675 (rank 2nd). Citations of both articles continue to increase.

3.7 Research Foci

In the last decade, Ho’s research group proposed using the distributions of words in article titles and abstracts, author keywords, and Keywords Plus across different periods to determine research foci and trends [83, 84]. Among the 2468 articles, 2464 (99.8%) had abstract records, 2103 (85.2%) had author keywords, and 2069 (83.8%) had Keywords Plus. The 20 most frequent keywords are listed in Table 7. The keyword "classification" ranked in the top 20 in article titles and abstracts, author keywords, and Keywords Plus. The development of the top four topics in machine learning in Africa, namely deep learning, classification, feature extraction, and random forest, is shown in Fig. 8.

Table 7 The 20 most frequently used keywords
Fig. 8

Development trends of the four most popular topics in Africa
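The word-distribution analysis described above can be sketched as a simple frequency count. The sample author-keyword records below are hypothetical, and the actual analysis also covers article titles, abstracts, and Keywords Plus.

```python
# Minimal sketch of the keyword-frequency analysis described above; the sample
# records are hypothetical, and the real analysis also covers titles, abstracts,
# and Keywords Plus.
from collections import Counter

author_keywords = [
    ["deep learning", "classification", "feature extraction"],
    ["random forest", "classification"],
    ["deep learning", "random forest"],
]

counts = Counter(kw.lower() for record in author_keywords for kw in record)
for keyword, freq in counts.most_common(4):
    print(f"{keyword}: {freq}")
```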

3.7.1 Classification

Articles containing supporting words such as classification, classifications, and misclassification in their title, abstract, or author keywords were classified as classification-related articles. In 1996, Gouws and Aldrich from the University of Stellenbosch in South Africa reported that using machine learning techniques and classification rules in a supervisory expert system shell or decision support system for plant operators could make a significant impact on the operation of such plants [85]. Highly cited articles with a TC2021 of 100 or more [50], such as Deep learning for tomato diseases: Classification and symptoms visualization [86] and Learning machines and sleeping brains: Automatic sleep stage classification using decision-tree multi-class support vector machines [87], were published by African authors from Algeria and Tunisia, respectively. An article entitled A predictive machine learning application in agriculture: Cassava disease detection and classification with imbalanced dataset using convolutional neural networks [88] was published in the most recent year, 2021, by Sambasivam and Opiyo from the International Business, Science And Technology University (ISBAT) in Uganda.
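The supporting-word tagging used in this and the following subsections can be sketched as a keyword match over the 'front page' fields; the sample article below is hypothetical and the function name is ours.

```python
# Sketch of the supporting-word tagging described above: an article is tagged
# as classification-related if any supporting word appears in its title,
# abstract, or author keywords. The sample article is hypothetical.
import re

SUPPORTING_WORDS = {"classification", "classifications", "misclassification"}

def is_classification_related(title: str, abstract: str, author_keywords: list) -> bool:
    text = " ".join([title, abstract, " ".join(author_keywords)]).lower()
    words = set(re.findall(r"[a-z]+", text))
    return bool(words & SUPPORTING_WORDS)

article = {
    "title": "A convolutional model for crop disease classification",
    "abstract": "We evaluate several classifiers on field images.",
    "author_keywords": ["deep learning", "agriculture"],
}
print(is_classification_related(article["title"], article["abstract"], article["author_keywords"]))  # True
```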

3.7.2 Deep Learning

Supporting words for deep learning were deep learning, deep neural network, deep neural networks, deep transfer learning, deep reinforcement learning, deep convolutional neural network, and deep convolutional neural networks. Deep learning was first mentioned in an article on a Deep learning framework with confused sub-set resolution architecture for automatic Arabic Diacritization [89] by authors from Egypt and Kuwait. Highly cited machine learning articles published by African authors include Deep learning for tomato diseases: Classification and symptoms visualization [86] by authors from Algeria and Deep learning for cyber security intrusion detection: Approaches, datasets, and comparative study [90] by authors from Algeria and the UK. The most impactful deep learning article in 2021 was A hybrid deep transfer learning model with machine learning methods for face mask detection in the era of the COVID-19 pandemic [91] by authors from Egypt, the USA, and Taiwan.

3.7.3 Feature Extraction

Supporting words for feature extraction were feature extraction, feature selection, and feature evaluation. Saidi et al. from France and Tunisia published Africa's first feature extraction-related article, entitled Protein sequences classification by means of feature extraction with substitution matrices [92]. Highly cited articles about feature extraction were Ensemble-based multi-filter feature selection methods for DDoS detection in cloud computing [93] by authors from South Africa, Australia, China, and the UK, and Minimum redundancy maximum relevance feature selection approach for temporal gene expression data [94] by authors from the USA, Serbia, and Egypt. In 2021, Metaheuristic algorithms on feature selection: A survey of one decade of research (2009–2019) [95] was published by authors from India, Saudi Arabia, and Egypt.

3.7.4 Random Forest

Supporting words for random forest were random forest, random forests, and random decision forest. In 2010, Auret and Aldrich [96] from the University of Stellenbosch in South Africa published the first African article about random forests in machine learning. Highly cited random forest-related articles published in the last decade in Africa include Ranger: A fast implementation of random forests for high dimensional data in C++ and R [74] by Wright and Ziegler from Germany and South Africa, and Random forest regression and spectral band selection for estimating sugarcane leaf nitrogen concentration using EO-1 Hyperion hyperspectral data [97] by authors from South Africa and Sudan. In 2021, The application of the random forest classifier to map irrigated areas using Google Earth Engine [98] was presented by authors from South Africa.

The yearly development trends of the four most popular topics in Africa, shown in Fig. 8, illustrate that classification (TP = 841 articles) was the most prominent machine learning topic in Africa. Research on deep learning was more popular than feature extraction; however, the two have shown similar development trends in recent years.

4 Machine Learning Research

4.1 Preliminary Overview

Machine learning (ML) is a subfield of artificial intelligence. The central idea is that the machine learns by interacting with the input data and develops a model capable of classifying a new input or predicting an outcome from new inputs. The input data are usually divided into two parts: the training data used to teach the machine and the test data used to assess the accuracy of the trained model. Different ML algorithms have been used to solve problems such as early disease detection and classification in medicine and agriculture, plant or crop disease detection, data mining, clustering, quantum computing technology, engineering optimization, earth observation, food security, climate change, pollution, and many more.
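A minimal sketch of this training-and-testing workflow, using scikit-learn on a synthetic dataset, is shown below; it is illustrative only and does not reproduce any particular study discussed in this section.

```python
# Minimal sketch of the training/testing workflow described above, using
# scikit-learn and a synthetic dataset; illustrative only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=600, n_features=8, random_state=42)

# Split the input data into training data (to teach the model)
# and test data (to estimate accuracy on unseen inputs).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```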

4.2 Research Trends in Africa

ML algorithms have found significant application in bioinformatics, especially in the analysis of the microscopic spots stored in DNA microarrays, and in genomics and proteomics. The medical and biological fields have also received significant attention from ML researchers, particularly in medical engineering, epidemiology, and the study and early detection of genetic diseases and disorders such as Alzheimer's disease, diabetes, cancer, arthritis, high blood pressure, hemochromatosis, cystic fibrosis, Huntington's disease, sickle cell anemia, and Marfan syndrome [99,100,101]. Focusing on diseases prevalent in Africa, machine learning has been used to improve genetic resistance to malaria, support early detection and eradication of diabetes, classify sickle cell anemia, improve genetic resistance to HIV/AIDS, and detect uterine fibroids in women [102].

ML has also found tremendous application in the economy and is argued to be the bedrock of the fourth industrial revolution [103]. Developed countries have embraced this to avoid missing out on the revolution [104, 105], and actors in government and the private sector have developed strategies aligned with it. Africa is lagging in this regard, with little or no effort towards actualizing the fourth industrial revolution. Some agencies from the West have tried to assist developing countries [106, 107], and countries like Rwanda have taken the initiative of developing AI-driven plans to achieve economic sustainability [108].

AI techniques such as ML, together with the Internet of Things (IoT), drive the energy sector, and Africa is not left behind in this aspect. ML is used in pay-as-you-go energy products to predict demand, score users' activities, and develop models that make products available, affordable and adaptable [109]. For example, an energy company can use ML-based predictive analysis to make energy services or products available to areas without access to them [110, 111].

The agricultural sector offers fertile ground for ML to improve productivity and efficiency along the value chain. It provides solutions for subsistence and mechanized farmers to improve yield and increase profits through models for the detection and precision treatment of pests and diseases, optimal fertilizer application, soil monitoring, and many more. Solutions like Gro Intelligence in Kenya deploy AI techniques such as ML to achieve food security [112]. Monitoring of climatic conditions for precision agriculture has been achieved through drone technology capable of identifying the interventions needed for optimal yield [113].

ML has also been used to develop systems that can identify in real time the appropriate agronomic interventions, using sensor data such as pH level, soil moisture level, temperature, and more. In Kenya and Mozambique, projects like Third Eye drive this process for better yield [114, 115]. Western technologies like FarmBeats have been applied in Africa, using low-cost, sparsely distributed sensors and aerial imagery to generate precision maps; the system attaches a smartphone to helium balloons to form a low-cost drone system [116, 117]. Intelligent drones with strong ML capabilities have been deployed to survey elephants in Burkina Faso, combat rhino poaching in South Africa, and analyze flood risks in Tanzania [118,119,120].

In entrepreneurship, ML has been leveraged to deliver innovative research and products. Hepta Analytics developed a product called Najua, which uses ML to present web content in local languages [121]. A start-up company in Nigeria developed a mobile app called Ubenwa, which detects early signs of birth asphyxia in newborn babies by analyzing acoustic signatures [122].

4.3 Major Application Areas

ML is a major driver of the Fourth Industrial Revolution (4IR). It has improved outcomes in various application areas through its learning and prediction abilities. This section summarizes and discusses the major application areas of machine learning. Figure 9 presents the main branches of machine learning and the offshoot disciplines of each, and depicts how different researchers have used the major ML algorithms to solve problems in the respective domains. The application areas of ML are vast, as seen in Fig. 9; this study therefore groups them into eleven broad areas, which are discussed below.

Fig. 9

The main branches of machine learning and the offshoot disciplines of each

4.3.1 Predictive Analytics and Decision-Making

Most ML research has been carried out in this domain, where ML drives intelligent decision-making through data-driven predictive analytics, for instance in suspect identification, fraud detection [123], and many more. ML is also helpful in identifying customer preferences and behavior, production line management, scheduling optimization, and inventory management. As seen in Table 7, "prediction" and "detection" are the third and fourth most frequently used keywords in ML research. Nwaila et al. [124] designed a machine learning algorithm for point-wise grade prediction and automatic facies identification based on gold assay and sedimentological data for the South African Witwatersrand gold ores.

4.3.2 Cybersecurity and Threat Intelligence

Cybersecurity is a cardinal area of intervention in Industry 4.0, typically protecting networks, systems, hardware, and data from digital attacks. Machine learning techniques have been used to detect security breaches by analyzing data to identify patterns and detect malware or threats. A common ML technique for identifying cyber breaches is clustering. Deep learning has also been used to design security models for large-scale security datasets [125]. Mbona and Eloff [126] designed a semi-supervised machine learning approach to detect zero-day (new, unknown) intrusion attacks based on the law of anomalous numbers, identifying significant network features that effectively reveal anomalous behaviour. Similarly, Benlamine et al. [127] used a machine learning model to evaluate emotional reactions in virtual reality environments, where the face is hidden in a virtual reality headset and facial expression detection using a webcam is impossible. Several machine learning techniques have also been used to identify and classify spam e-mails [128].

4.3.3 Internet of Things (IoT) and Smart Cities

The Internet of Things (IoT) is another vital area of the fourth industrial revolution. The goal is to make objects smart by allowing them to transmit data and automate tasks without human interaction. IoT is therefore a frontier for enhancing human activities such as smart homes, cities, agriculture, governance, healthcare, and more. Adenugba et al. [129] proposed a machine learning-based Internet of Everything for a smart irrigation system for environmental sustainability in Africa. Their solar-powered smart irrigation system uses a radial basis function network to predict the environmental conditions that control the irrigation system.

4.3.4 Traffic Prediction

The economy of a city or country thrives when an efficient transport system exists. A community's economic growth comes with challenges such as high traffic volume, accidents, emergencies, high pollution, and more. ML-driven smart city models can therefore help predict traffic anomalies [130], and ML techniques can analyze travel history data to predict possible hitches or recommend alternative routes to commuters [131].

4.3.5 Healthcare

Machine learning techniques have been applied in healthcare for disease diagnosis and prognosis, omics data analysis, patient management, and more [132]. The Coronavirus disease (COVID-19) outbreak elicited the use of machine-learning techniques to help combat the pandemic [133]. Deep learning also provides exciting solutions to medical image processing problems and is a crucial technique for potential applications, particularly during the COVID-19 pandemic [134]. Machine learning techniques have also been used in malaria incidence prediction to address the serious challenge malaria poses to socio-economic development in Africa [135]. Heart failure phenotypes were clustered based on multiple clinical parameters using unsupervised machine learning techniques by Mpanya et al. [136] to assist in the diagnosis, management, risk stratification and prognosis of heart failure. Machine learning and regression models have been deployed to predict the present status or future course of a disease [137]. Patients can be classified based on disease risk or disease probability estimation through machine learning approaches [138]. Brain MRIs can be classified to detect brain tumors using a machine learning-based deep neural network classifier [139]. Other medical diagnoses that use machine learning include electrocardiograms [140] and cancer disease diagnosis [141].

4.3.6 E-commerce

ML techniques have been used to build systems that help businesses understand customers' preferences by analyzing their purchasing histories. These systems can recommend products to potential customers, and companies use them to decide where to position product adverts or offers. Many online retailers can better manage inventory and optimize logistics, such as warehousing, using predictive modeling based on machine learning techniques [142]. Furthermore, machine learning techniques enable companies to maximize profits by creating packages and content tailored to their customers' needs, allowing them to retain existing customers while attracting new ones. Customers' creditworthiness can be determined through credit scoring based on machine learning classification methods [143]. In retail market operations, a machine learning tool has been designed to assist retailers in increasing access to essential products by improving their distribution in uncertain times marked by panic buying [144].

4.3.7 Natural Language Processing (NLP)

NLP and sentiment analysis involve processes that enable computers to read, understand, and process spoken or written language [145]. Examples of NLP-related tasks include virtual personal assistants, chatbots, speech recognition, document description, and language or machine translation. Sentiment analysis, or opinion mining, uses the results of NLP to mine information or trends that can be translated into moods, views, and opinions from the huge volumes of data collected from different social media platforms [146]. For instance, politicians can use sentiment analysis to ascertain the electorate's perceived views of their candidate.

4.3.8 Image, Speech, and Pattern Recognition

Machine learning has significant applications in this domain, where different ML techniques have been used to identify or classify real-world digital images [147]. A typical example of image recognition is labeling digital images from an X-ray as cancerous. Like image recognition, speech recognition deals with sound and linguistic models [148]. Finally, pattern recognition aims to identify patterns and expressions in data [149]. Several machine-learning techniques, such as classification, feature selection, clustering, and sequence labeling, have been used in this area.

4.3.9 Sustainable Agriculture

Sustainable agricultural practices help improve agricultural productivity while reducing negative environmental impacts [150, 151]. Sustainable agriculture is knowledge-intensive and information-driven, with farmers making decisions based on available information and technology such as the Internet of Things (IoT), mobile technologies, and devices. Machine learning techniques are applied to crop yield prediction, soil property estimation, irrigation planning, weather forecasting, disease and weed detection, soil nutrient management, livestock management, demand estimation, production planning, inventory management, consumer analysis, and more. Machine learning techniques have been used to predict the level of insect infestation and the associated damage in maize farms [152]. In Hengl et al. [153], spatial predictions of soil micro- and macro-nutrients were carried out using machine learning techniques to support agricultural development and the monitoring and intensification of soil resources. Identifying and mapping ecosystems is important for supporting food security and other environmental indicators of biotic diversity. Tchuenté et al. [154] developed two machine learning approaches for continental-scale ecosystem mapping in Africa, classifying African ecosystems based on the Normalized Difference Vegetation Index (NDVI) dataset. Andraud et al. [155] applied machine learning to benthic habitat mapping, characterising the seafloor substrate at Table Bay, southwestern South Africa, from geophysical data. Computer vision and machine learning techniques have also been used to evaluate food quality and grade crops. Semary et al. [156] designed a machine learning approach using feature fusion and support vector machines to classify tomato fruits as infected or uninfected based on their external surface.
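
For illustration, a classifier in the spirit of the tomato-grading study, though not the authors' actual feature-fusion pipeline, could be sketched as follows; the surface features and values are hypothetical and assumed to have been extracted from fruit images beforehand.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical per-fruit surface features: [mean_hue, texture_contrast, lesion_area_ratio]
X = np.array([
    [0.12, 0.40, 0.00],   # healthy-looking surface
    [0.15, 0.38, 0.01],   # healthy-looking surface
    [0.55, 0.80, 0.30],   # visibly infected surface
    [0.60, 0.75, 0.25],   # visibly infected surface
])
y = np.array([0, 0, 1, 1])  # 0 = uninfected, 1 = infected

# Standardise features, then fit an RBF-kernel support vector machine
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)

print(clf.predict([[0.58, 0.78, 0.28]]))  # expected: [1] (infected)
```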

4.3.10 Pollution Control

Air pollution is regarded as one of the world's greatest public and environmental health challenges, with adverse effects on the ecosystem, human health, and climate. Gaps in air quality data in low- and middle-income countries limit the development of air pollution control policies, leaving the negative health impacts of exposure to ambient air pollution unaddressed. Long-term exposure to ambient air pollution is associated with increased mortality rates in these countries, so accurate and reliable air pollution estimates are needed for approaches such as land use regression. Coker et al. [157] proposed a land use regression model based on low-cost particulate matter sensors and machine learning to estimate exposure to air pollution in eastern and central Uganda, a sub-Saharan African country. Their goal was to use low-cost air quality sensors within land use regression modelling to predict, on a monthly basis, fine particulate matter air pollution in urban areas. Amegah [158] also used machine learning techniques with low-cost air quality sensors for air pollution assessment and prediction in urban Ghana. Zhang et al. [159] developed a random forest model for estimating daily fine particulate matter concentrations in the industrialized Gauteng province of South Africa based on socioeconomic data, satellite aerosol optical depth, meteorology and land use data.
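
A simplified sketch of this style of particulate-matter regression (illustrative only, not the model of Zhang et al. [159]) is shown below, assuming daily predictors such as aerosol optical depth (AOD), temperature, wind speed and an industrial land use fraction; the data are synthetic.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic daily predictors: [AOD, temperature_C, wind_speed_ms, industrial_land_fraction]
X = rng.uniform([0.1, 5.0, 0.0, 0.0], [1.5, 35.0, 10.0, 1.0], size=(500, 4))
# Synthetic PM2.5 response used only to make the example runnable
y = 40 * X[:, 0] - 1.5 * X[:, 2] + 20 * X[:, 3] + rng.normal(0, 3, 500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

print("R^2 on held-out days:", round(model.score(X_test, y_test), 3))
```

In a real deployment the predictors would come from satellite retrievals, meteorological reanalysis and land use maps, and the model would be validated against reference-grade monitors rather than a synthetic response.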

4.3.11 Climate System

Machine learning has been deployed to merge energy flux measurements with meteorological and remote sensing data in order to estimate global gridded net radiation and sensible and latent heat fluxes, together with their uncertainties [160]. The negative impact of climate change on human life has motivated its study and prediction. Machine learning models have been employed to study the relationship between greenhouse gas emissions and the rhythm of change in climate variables. Ibrahim, Ziedan & Ahmed [161] explored the application of ML techniques to climate data to build models for predicting the long- and short-term states of climate variables in North-East Africa. Such models support climate mitigation and adaptation, as well as the determination of acceptable greenhouse gas levels and concentrations for avoiding climate crises and extreme events. Sobol, Scott & Finkelstein [162] applied supervised machine learning to modern pollen assemblages in Southern Africa to understand biome responses to global climate change and to determine how specific biomes or bioregions are represented. Probabilistic classifications of fossil assemblages were generated to reconstruct past vegetation.

The continual negative effect of climate change and human-induced ecological degradation worsens the environmental pressures on human livelihoods in many regions, resulting in an increased risk of violent conflict. For the African continent, Hoch et al. [163] projected sub-national armed conflict risk along three representative concentration pathways and three shared socioeconomic pathways using machine learning methods, and assessed the role of hydro-climatic indicators in driving armed conflict. According to their report, climate change increases the projected armed conflict risk in Northern Africa and substantial parts of Eastern Africa. The role of ML in armed conflict risk projection is to support policy-making on climate security. To combat the adverse effects of deforestation and climate change on the accuracy of weather information, Nyetanyane & Masinde [164] proposed a machine learning model that uses climate data, vegetation indices and indigenous knowledge to predict the onset of favourable weather seasons for crop cultivation and to monitor and predict crop health.

4.3.12 Soil Analysis

The need for detailed soil information, both to assist agricultural productivity modelling and to aid global estimation of soil organic carbon, has grown over time. Moreover, in areas affected by climate change, spatial information about soil water parameters is increasingly required. According to Folberth et al. [165], accurate soil information may be important for predicting the effect of climate change on food production. Hengl et al. [166] presented an improved version of the SoilGrids system for global prediction of standard numeric soil properties, including organic carbon, cation exchange capacity, bulk density, soil texture fractions, coarse fragments and pH, as well as for predicting the distribution of soil classes and depth to bedrock based on the USDA and World Reference Base classification systems.

In the following paragraphs, we critically discuss one of the research niche areas in which Africa ranks just after the United States, Canada and China, namely quantum computing research in machine learning. The South Africa Quantum Technology Initiative (SA QuTI) was established in 2021 as a national undertaking that seeks to create the conditions for a globally competitive research environment in quantum computing technologies. Moreover, the University of KwaZulu-Natal has led in producing significant research output in the quantum machine learning domain, championed by Professor Petruccione. A more detailed discussion of this quantum computing research is presented next.

4.4 Quantum-Based Machine Learning Research

Another prominent machine learning research area that has been actively pursued in Africa is the deployment of quantum computing to improve classical machine learning algorithms. Quantum computing manipulates quantum systems for information processing, offering substantial computational speed-ups. In quantum computing, the two classical states 0 and 1 of conventional computing are replaced by a qubit (quantum bit) that can exist in a superposition of the states ∣0⟩ and ∣1⟩, which allows many different computation paths to be explored simultaneously. Quantum machine learning involves the development of quantum algorithms for solving typical machine learning problems in order to harness the efficiency of quantum computing; classical machine learning algorithms are adapted to run on a quantum computer. In the current era of explosive information growth, quantum machine learning has become an active and promising research area for improving machine learning.
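
For reference, the standard notation for a qubit state (generic quantum computing background, not specific to any cited work) is

\[
|\psi\rangle = \alpha\,|0\rangle + \beta\,|1\rangle,
\qquad \alpha, \beta \in \mathbb{C},\quad |\alpha|^2 + |\beta|^2 = 1,
\]

and an n-qubit register occupies a 2^n-dimensional state space,

\[
|\psi_n\rangle = \sum_{x \in \{0,1\}^n} c_x\,|x\rangle,
\qquad \sum_{x} |c_x|^2 = 1,
\]

which is the source of the quantum parallelism referred to above.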

Schuld, Sinayskiy, & Petruccione [78] presented a systematic overview of the emerging field of quantum machine learning, describing its approaches, technical details, and the outlook for a quantum learning theory. The overview discussed the various approaches for relating seven standard classical machine learning methods, namely the support vector machine, k-nearest neighbour, neural networks, k-means clustering, hidden Markov models, decision trees and Bayesian theory, to quantum physics. The discussion focused mainly on quantum machine learning approaches for pattern classification and clustering.

Pattern classification is one of the major tasks in supervised machine learning, and most quantum machine learning algorithms are built to extend or improve classical approaches to it. Schuld, Sinayskiy and Petruccione [167] used pattern classification examples to briefly introduce quantum machine learning; their work presented an algorithm for quantum pattern classification that uses Trugenberger's proposal for measuring the Hamming distance on a quantum computer. Schuld, Fingerhuth and Petruccione [168] implemented a distance-based classifier using a quantum interference circuit. Their approach offered a new perspective in which the distance measure of a distance-based classifier is evaluated through quantum interference, in quantum parallel, instead of the quantum machine merely mimicking classical machine learning methods. The approach was demonstrated on a simplified supervised pattern recognition task based on binary pattern classification.
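
A purely classical analogue of such a Hamming-distance classifier (a sketch to fix ideas; the quantum algorithm evaluates the distances in superposition rather than one pattern at a time) is shown below with hypothetical binary patterns.

```python
import numpy as np

def hamming_distance(a, b):
    """Number of positions at which two binary patterns differ."""
    return int(np.sum(a != b))

def classify(test_pattern, training_patterns, labels):
    """Assign the label of the training pattern nearest in Hamming distance."""
    distances = [hamming_distance(test_pattern, p) for p in training_patterns]
    return labels[int(np.argmin(distances))]

# Hypothetical binary training patterns and labels
train = np.array([[0, 0, 1, 1],
                  [1, 1, 0, 0],
                  [0, 1, 1, 1]])
labels = ["class_A", "class_B", "class_A"]

print(classify(np.array([0, 0, 1, 0]), train, labels))  # -> class_A
```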

The kernel-based machine learning method is another area where quantum computing has been applied for data analysis. The ability of quantum computers to manipulate exponentially large state spaces allows kernel functions to be evaluated more efficiently than on classical computers. Blank et al. [169] presented a compact quantum circuit for constructing a kernel-based binary classifier. Their model incorporated compact amplitude encoding of real-valued data, which reduced the number of qubits by two and linearly reduced the number of training steps. Another kernel-based quantum binary classifier was presented by Blank et al. [170]. Their distance-based quantum classifier has a kernel built from the quantum state fidelity between the training and test data, so that the kernel can be systematically tailored with a quantum circuit; the training data can be assigned arbitrary weights, and the kernel can be raised to an arbitrary power.
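
In the commonly used fidelity-type construction (the generic form; the classifier in [170] additionally allows the weighting and powers noted above), each data point x is encoded into a quantum state |φ(x)⟩ and the kernel is

\[
K(x, x') = \left|\langle \phi(x) \,|\, \phi(x') \rangle\right|^{2},
\]

which a quantum circuit can estimate directly through interference between the two encoded states.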

The development of the quantum kernel method and of quantum similarity-based binary classifiers exploiting quantum feature Hilbert spaces and quantum interference created a significant opportunity for enhancing classical machine learning through quantum computing. In Park, Blank and Petruccione's [171] work, the general theory of the quantum kernel-based classifier was extended to lay the foundation for advancing quantum-enhanced machine learning. The authors focused on the squared overlap between quantum states as the similarity measure and examined the minimal and essential ingredients for quantum binary classification. Their work also considered extensions relating to measurement, ensemble learning and data types.

Schuld, Sinayskiy and Petruccione [172] designed an algorithm for pattern classification with linear regression on a quantum computer. Their approach treats linear regression from the machine learning perspective, where new inputs are predicted from the training dataset. The algorithm produces the same result as the least-squares optimisation used in classical linear regression, in time logarithmic in the feature-vector dimension N and independent of the training dataset size, provided the data is presented as quantum information.
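
Classically, the prediction such a routine reproduces is the ordinary least-squares solution (standard form, with the inverse replaced by a pseudoinverse when X^T X is singular):

\[
\mathbf{w}^{*} = (X^{\top} X)^{-1} X^{\top} \mathbf{y},
\qquad
\hat{y}(\mathbf{x}) = \mathbf{x}^{\top} \mathbf{w}^{*},
\]

where X stacks the training feature vectors row-wise and y collects the corresponding targets.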

In Schuld and Petruccione [173], the authors introduced quantum ensembles of quantum classifiers, in which the classifiers are executed in parallel and the combined decision is accessed through a single-qubit measurement. An exponentially large ensemble improves on the predictive power of the individual classifiers and can bypass the need for a training phase. The ensemble was designed as a state preparation scheme that evaluates each classifier's weight. The proposed framework permits exponentially many individual classifiers, requiring no training, to be combined in a manner similar to classical Bayesian learning, yielding an optimization-free form of quantum learning.

Most kernel-based quantum binary classifiers require an expensive, repetitive procedure of quantum data encoding to estimate an expectation value reliably, resulting in high computational cost. Park, Blank and Petruccione [174] proposed a robust quantum classifier that explicitly calculates the number of repetitions necessary to estimate the classification score to a fixed precision, thereby minimizing the resource overhead of the program.
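
The overhead being budgeted follows a standard sampling argument (a generic bound, not the specific count derived in [174]): estimating an expectation value to additive precision ε with constant failure probability requires on the order of

\[
M = O\!\left(\frac{1}{\varepsilon^{2}}\right)
\]

repeated state preparations and measurements, so tightening the precision rapidly inflates the number of circuit runs unless the repetition count is fixed in advance.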

4.5 Renewable Energy

In renewable energy and bioprocess modelling, Kana et al. [175] reported on the modelling and optimization of biogas production on mixed substrates of sawdust, cow dung, banana stem, rice bran and paper waste using a hybrid learning model that combines an artificial neural network (ANN) with a genetic algorithm. In another study, Whiteman and Kana [176] investigated the relevance of ANNs in modelling the relationships between several process inputs for fermentative biohydrogen production and concluded that the ANN model is more reliable for navigating the optimization space across the parameters at play in the biohydrogen production system. Sewsynker et al. [177] also reported the use of ensembles of ANNs to model biohydrogen yield in microbial electrolysis cells; the study showed that the ensemble could accurately model the non-linear relationship between the physicochemical parameters of microbial electrolysis cells and hydrogen yield, owing to the ANNs' ability to navigate the optimization window in microbial electrolysis cell scale-up processes. ML has also been used for multi-objective intelligent energy management to improve the efficiency of microgrid operation [178]. A hybrid ML technique has been used to predict solar radiation from meteorological data [80], with an analysis of the influence of weather conditions in different regions of Nigeria, and Chaibi et al. [179] designed a machine learning model for predicting daily global solar radiation in Morocco.
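
As a rough illustration of this style of bioprocess modelling (a sketch under assumed inputs, not the published models), an ANN regressor mapping process parameters to yield could be set up as follows; the feature names and data are synthetic.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic process inputs: [substrate_ratio, pH, temperature_C, retention_days]
X = rng.uniform([0.1, 5.5, 25.0, 5.0], [0.9, 7.5, 40.0, 30.0], size=(300, 4))
# Synthetic yield response used only to make the example runnable
y = 2.0 * X[:, 0] + 0.5 * (X[:, 1] - 6.5) ** 2 + 0.1 * X[:, 2] + rng.normal(0, 0.1, 300)

# Small feed-forward network; a genetic algorithm could later search this
# trained surrogate for input settings that maximise the predicted yield
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=1),
)
model.fit(X, y)

print("Predicted yield at a candidate operating point:",
      model.predict([[0.5, 6.8, 35.0, 20.0]]))
```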

4.6 Prospects, Challenges, and Recommendations

The prospects of ML research in Africa are enormous, but there are also challenges. For example, bioinformatics research in Africa is limited by the scarcity of the diverse, high-volume biomedical data needed for accurate analysis [101]. As data is central to ML, the Human Heredity & Health in Africa (H3Africa) consortium is championing efforts to generate and publicly publish large genomics datasets of Africans [180]. Another obstacle is the lack of a computing backbone, including internet connectivity and cloud computing, which leads to data being outsourced to the developed world [181].

Similarly, the prospects of ML will remain unrealized if appropriate investments are not made. The teaching of AI techniques, including ML, must also be improved and sustained, and an adequate legal framework must be in place to ensure ethical research and innovative development [182]. A framework for support and collaboration with foreign agencies must be encouraged; for instance, the strategic partnership between the Smart Africa alliance and the German Ministry for Economic Cooperation and Development aims to support Africa's development through digital innovations [106, 183].

Much of the research effort driving the diverse applicability of AI systems has come from ML. The increasing use of ML algorithms and their offshoot methods, such as DL, has further demonstrated the computational power of CNN, RNN, LSTM and hybrid models. These models have achieved outstanding performance in pattern recognition, classification, feature extraction, segmentation and other learning tasks. Interestingly, while current state-of-the-art studies focus on hybridizing sequence models such as RNNs with pixel models such as CNNs for multimodal computation, little is said about machine reasoning. Machine reasoning, which descends from knowledge representation and reasoning, may not be directly associated with machine learning; still, the successful integration of these two branches of AI holds the possibility of achieving high-performing systems in the near future. Machine learning, on the one hand, allows models and their parameters to be fine-tuned so that the machine behaves in a way that simulates human behaviour.

On the other hand, machine reasoning provides the means for formalising the existing body of knowledge siloed away in legacy systems in order to achieve reasoning and inference. Combining these two aspects of machine automation will promote what are termed neuro-symbolic systems, which allow neural networks and rules over formalized knowledge to interface in ways that drive new state-of-the-art AI applications. We advocate a redirection of study in AI, ML, and DL among African researchers towards this combination of learning and reasoning.

Another prospective integration of AI branches, which promises to advance the discovery of highly intelligent systems, is the application of clustering and optimization methods to DL and deep reinforcement learning (DRL) models. Research on the design of DRL models is now yielding controllers for self-driving cars, fully automated systems, robotics and other autonomous systems. Although DRL draws on the concepts of DL, we consider that identifying key features of DL models (e.g. CNN, RNN, LSTM, GRU and their hybrids) and effectively integrating them with DRL will uncover outstanding high-level performance with regard to machine intelligence. Researchers in Africa are well placed to produce interesting outcomes here, given their progress in using these models in their current, isolated form. Moreover, clustering and metaheuristic methods promise to provide powerful optimization solutions for improving the integration of the hybrids mentioned above. Metaheuristic methods have already been used in several DL models, and clustering methods are increasingly being adopted. This study therefore motivates an in-depth look into interfacing DRL, DL and clustering methods, supported by optimization techniques, to bolster performance and reduce computational cost.

The applicability of the intelligent systems arising from the current and future state-of-the-art in AI, ML and DL is still in its infancy in Africa. The COVID-19 pandemic demonstrated that Africa still lags behind in adopting the research outcomes of its own researchers. Although the effect of the pandemic on the continent is considered less destabilizing than on other continents, the lesson is that Africa must prepare for future pandemics by leveraging the research outcomes coming from its research centres. This holds prospects and challenges that can open up interesting new research areas. Consider, for instance, applying ML methods to building smart cities across Africa, drawing on AI methods and systems that have been successfully designed and developed to make the infrastructure and facilities of such cities smart. Consider also applying computer vision research to the challenge of improving Africa's transport and communication (T&C) systems: the pedestrian system could be automated and integrated with the T&C system to form an effective AI-driven computing network. We advocate state-sponsored research in this direction, as it holds the prospect of improving road connectivity and trade across the continent. Another interesting application of AI to Africa's peculiarities is crime monitoring and surveillance. For surveillance, progress in computer vision combined with the Internet of Things (IoT) has already enabled the deployment of facilities that support state surveillance systems and law enforcement agencies. Crime detection and monitoring, in turn, will benefit from recent deep learning-driven natural language processing (NLP) methods that analyze the pool of data on social media platforms and other text-driven systems. Motivated by the increasing hosting of Deep Learning Indaba conferences in Nigeria, Tunisia and South Africa, many of which promote DL-based NLP, there is now a greater prospect of applying these methods to crime detection and monitoring. In addition, DL-based NLP and machine translation could help Africa's rich multilingual communities interact more effectively and develop information-sharing mechanisms; languages such as Hausa, Swahili, Yoruba, Arabic, and isiZulu are spoken across several countries, and adopting machine translation would help build on this linguistic diversity and close communication gaps. Lastly, with the plethora of research outcomes in medical image analysis and AI-driven computer-aided diagnosis (CAD) systems, healthcare delivery and medical sciences will receive a boost in health centres across Africa.

A current challenge that must be addressed to promote ML research in Africa is the lack of intensive and intentional investment in computational infrastructure. ML and DL experiments demand high computational power, large memory, graphical processing units (GPUs), and reliable power grids. Stakeholders and governments must pool their thinking and resources to build a cohesive and robust computational infrastructure that supports researchers' efforts during experimentation and deployment. This is necessary to allow rigorous testing and experimentation of new models capable of becoming the new global state-of-the-art. Moreover, the startup hubs seen in Morocco, Nigeria, Ghana, Kenya and South Africa need to be sustained and promoted to serve as test beds for AI solutions developed by African youths.

5 Conclusions

Machine learning evolved as a branch of AI that focuses on designing computational methods and learning algorithms which model humans' natural learning patterns to address real-life problems where human capability is limited or restricted. This paper presents a background study of ML and its evolution from AI through ML to DL, elaborating on the various categories of learning techniques (supervised, unsupervised, semi-supervised and reinforcement learning) that have evolved over the years. It also presents the contributions of ML researchers across major African universities in niche and multi-disciplinary domains.

Moreover, a bibliometric study of machine learning research in Africa is presented. In total, 2761 machine learning-related documents, of which 89% were articles with at least 482 citations, were published in 903 journals in the Science Citation Index EXPANDED from 54 African countries between 1993 and 2021. There are 12 most frequently cited documents, of which five were review articles. Significant interest in machine learning research in Africa began in 2010, with the number of articles rising gradually from 14 to 98 by 2017 and then leaping to 1035 articles by 2021. The highest article citation count was recorded in 2013. The top four productive Web of Science categories, each with more than 100 published articles, are “electrical and electronic engineering”, “information systems computer science”, “artificial intelligence computer science”, and “telecommunication”, accounting for 20%, 18%, 14% and 12% of the total number of articles, respectively. The most productive journal is IEEE Access, with 192 articles (7.8%).

The top five journals with an IF2021 above 60 published six of the articles: World Psychiatry (2), Nature (1), Nature Energy (1), Nature Reviews (1) and Science (1). Internationally collaborative articles accounted for the largest share, 74%, involving 43 African and 103 non-African countries, while the remaining single-country articles came from 16 African countries. Egypt dominated with 31% of the total publications, contributing 29% of the single-country articles and 32% of the internationally collaborative ones. Ten African countries had no machine learning-related publications, while 64% of the remaining countries had no single-country articles. Egypt and South Africa showed similar development trends, but Egypt recorded a noticeably sharp increase in the last three years. Cairo University in Egypt ranked top among the most productive African institutions, with the University of KwaZulu-Natal in South Africa ranking top in three of the six publication indicators. King Saud University in Saudi Arabia tops the list of the five non-African institutions with 30 or more inter-institutionally collaborative articles with Africa.

Among the top ten most frequently cited machine learning-related articles in Africa, five were published by authors from Egypt, followed by authors from South Africa with two articles. The most cited article was published in 2017 by Wright and Ziegler of the University of Lübeck in Germany and the University of KwaZulu-Natal in South Africa, while the article with the greatest impact in the most recent year, 2021, was published in 2018 by Adadi and Berrada of Sidi Mohammed Ben Abdellah University in Morocco. The four top keywords used by authors in African machine learning-related articles are classification, deep learning, feature extraction and random forest.

Furthermore, a review of machine learning techniques and their recent applications in Africa was presented, identifying the main branches of ML and their offshoot disciplines. The nine most significant machine learning application areas in Africa were identified and discussed. Research on quantum implementations of machine learning algorithms in Africa, aimed at improving the performance of classical machine learning techniques, was also reviewed. Quantum machine learning is an area of ML research that has raised the profile of African research scholars at the University of KwaZulu-Natal and attracted global attention from quantum computing enthusiasts. Finally, the prospects and challenges of ML research in Africa, together with recommendations, were discussed in detail.