Introduction

Suicide is a global concern recognized by the World Health Organization (WHO): a life is lost to suicide every 40 seconds, making suicide prevention a pressing priority worldwide [1]. This form of violent death not only brings personal tragedy but also poses a significant threat to communities’ socio-psychological well-being and stability [2]. Suicide is a complex phenomenon influenced by multiple factors; behavioral, lifestyle, and clinical factors can all contribute significantly to an elevated risk of suicide [3]. For example, substance use is a significant behavioral risk factor for suicide [4]. Job and financial problems serve as important examples of lifestyle-related suicide risk [5]. Additionally, mental disorders are crucial clinical factors associated with suicide risk [6]. Early identification of risk factors is crucial in predicting suicide [7, 8]. The prevalence of suicide is exceptionally high among adolescents and young adults, specifically those aged 15 to 44, although this pattern is not universal [9]. Research indicates that, in some countries, the lower risk of suicide among older individuals may be due to their enhanced resilience and greater capacity to cope with adversity, potentially reducing the likelihood of suicidal behavior [10, 11]. Gender is another commonly cited factor. Some studies have found that men are more likely to die by suicide than women; however, this remains controversial because each gender is influenced by many other biological and environmental factors [12]. Suicide also imposes a financial burden on healthcare systems. For example, in Canada, New Zealand, and Ireland, the estimated direct and indirect costs of each suicide are approximately 443,000, 1.1 million, and 1.4 million pounds, respectively [13,14,15]. A comprehensive review of these works leads to the conclusion that suicide is a global issue, and it is therefore imperative for countries worldwide to collaborate in addressing it [1]. To address this issue, there is growing interest in applying machine learning (ML) techniques to predict suicide risk. ML refers to statistical and computational models that can learn from data and improve through experience [16]. It is categorized into two main types: supervised and unsupervised. In supervised learning, the model is trained on labeled data, whereas in unsupervised learning, the model relies on unlabeled data [17]. Both supervised and unsupervised algorithms can be utilized for suicide prediction, depending on the type of data and the nature of the prediction.

Research by Walsh, Ribeiro, and Franklin (2017) demonstrated the superior performance of ML over conventional methods in accurately identifying suicide attempts [9]. ML methods have gained prominence due to their ability to extract valuable insights from diverse datasets and organize data efficiently [10, 11]. While ML shows promise in predicting suicide events, it is vital to consider the varied outcomes produced by different ML algorithms. Studies by various researchers suggest that, despite notable scientific advances in leveraging digital technologies such as ML algorithms to prevent suicide attempts and identify at-risk individuals, there are still limitations in terms of training, knowledge, and the integration of databases [18,19,20]. Current suicide risk assessment methods rely heavily on subjective questioning, limiting their accuracy and predictive value [21]. As such, this study aims to systematically review previous research that has applied ML methods to predict suicide attempts and identify patients at high risk of suicide. The primary objectives are to evaluate the performance of various ML algorithms and to summarize their effectiveness in predicting suicide. Additionally, the study aims to identify the variables that serve as the most effective suicide risk factors.

Materials and methods

Search strategy and study selection

We adhered to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines to systematically identify, select, and assess relevant studies for inclusion in our review. Our search strategy focused on PubMed, Scopus, Web of Science and SID databases, and there were no limitations on the publication date, ensuring comprehensive coverage of the literature. The project was initiated on June 1, 2022, and concluded on August 8, 2023, with a focus on two domains: machine learning (ML) and suicide.

To capture relevant studies, our search strategy incorporated keywords such as “self-harm”, “self-injury”, “self-destruction”, “self-poisoning”, “self-mutilation”, “self-cutting”, “self-burning”, and “suicid*”. These terms were combined with keywords describing artificial intelligence and ML techniques for predicting suicide attempts using the “AND” and “OR” Boolean operators. References were managed in EndNote X7.
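As an illustration of this structure, the sketch below assembles a Boolean query of the kind used in our searches. The exact term set, field tags, and syntax varied by database, so this is only a hedged example, not the literal string submitted to PubMed, Scopus, Web of Science, or SID.

```python
# Illustrative only: real database syntax (field tags, truncation rules)
# differs between PubMed, Scopus, Web of Science, and SID.
suicide_terms = [
    '"self-harm"', '"self-injury"', '"self-destruction"', '"self-poisoning"',
    '"self-mutilation"', '"self-cutting"', '"self-burning"', 'suicid*',
]
# The ML-related block below is an assumed simplification of the full strategy.
ml_terms = ['"machine learning"', '"artificial intelligence"', '"deep learning"']

# Synonyms are joined with OR; the two concept blocks are joined with AND.
query = f'({" OR ".join(suicide_terms)}) AND ({" OR ".join(ml_terms)})'
print(query)
```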

The study encompassed two primary outcomes: first, identifying the most effective ML algorithms based on their outcome measures, and second, identifying influential risk factors for predicting suicide. These outcomes were instrumental in achieving a comprehensive understanding of the field and informing our research objectives.

Inclusion & exclusion criteria

Inclusion criteria were applied to identify relevant studies for our review. The following criteria were considered:

  1. Population: Studies that included participants from various age groups, including pediatrics, geriatrics, and all-age populations, were included.

  2. Language: Only studies published in the English language were included.

  3. Methods: Studies employing ML methods to predict suicide were included.

  4. Publication format: Studies published as journal articles, theses, and dissertations were included.

  5. Study design: Various study designs, including prospective, retrospective, retrospective cohort, case-cohort, case-control, cohort, diagnostic/prognostic, longitudinal, longitudinal cohort, longitudinal prospective, prognostic, prospective cohort, and randomized controlled trial studies, were considered for inclusion.

Exclusion criteria were applied to select relevant studies for our analysis. Studies were excluded if they met the following criteria:

  1. Population: Studies focusing specifically on military personnel and veterans were excluded. Including military personnel and veterans in our analysis could introduce unique variables and considerations related to their distinct healthcare needs, access to services, and experiences. For example, military personnel and veterans often have specific healthcare requirements stemming from their service-related experiences. These may encompass a range of issues, including physical injuries sustained during deployment, exposure to hazardous environments leading to unique health challenges, and complex medical histories shaped by their military service. Moreover, their access to healthcare services can differ significantly from that of the general population. To maintain the homogeneity of our study population and to ensure the relevance and applicability of our findings to the specific context of hospitals, we opted to exclude this subgroup.

  2. Social media-based studies: Studies aiming to predict suicide attempts using ML among adults who posted or searched content related to suicide on internet platforms such as Twitter, Instagram, and Google were excluded.

  3. Natural language processing (NLP) and image processing methods: Studies utilizing NLP and image processing techniques for predicting suicide attempts were excluded.

  4. Publication type: Conference papers, reviews, letters to editors, book chapters, and commentary papers were excluded from the analysis.

By applying these inclusion and exclusion criteria, we aimed to select studies that align with the objectives and focus of our research.

Data collection process

Data extraction was conducted using a Microsoft Excel 2016 spreadsheet. The following information was extracted from each included study:

  1. Study title: The title of the research article.

  2. Authors: The names of the authors involved in the study.

  3. Year of publication: The year in which the study was published.

  4. Country of study: The geographical location where the study was conducted.

  5. Population: The target population or participants involved in the study.

  6. Type of study: The research design employed in the study.

  7. Sample size: The number of participants included in the study.

  8. Study objective: The main objective or aim of the study.

  9. Suicide risk factors: Factors or variables considered in predicting suicide risk.

  10. ML models: The specific ML models used in the study.

  11. Outcome measures: The performance metrics used to evaluate the models, including area under the curve (AUC), sensitivity, specificity, accuracy, false negative rate, false positive rate, true positive rate, negative predictive value, positive predictive value, precision, and recall (a brief computational sketch of these measures follows this list).
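All of these measures can be derived from a binary confusion matrix together with predicted risk scores. The following minimal sketch (using illustrative toy labels and scores, with scikit-learn assumed to be available) shows how they relate; it is an explanatory aid rather than the evaluation code of any included study.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

# Toy ground-truth labels and model risk scores, for illustration only.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.55, 0.70])
y_pred = (y_score >= 0.5).astype(int)          # threshold the risk score

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)                   # true positive rate / recall
specificity = tn / (tn + fp)
precision = tp / (tp + fp)                     # positive predictive value
npv = tn / (tn + fn)                           # negative predictive value
accuracy = (tp + tn) / (tp + tn + fp + fn)
fpr = fp / (fp + tn)                           # false positive rate
fnr = fn / (fn + tp)                           # false negative rate
auc = roc_auc_score(y_true, y_score)           # area under the ROC curve
```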

Quality assessment

The quality of the included articles was assessed using the Mixed Methods Appraisal Tool (MMAT 2018) following the search process. We adopted MMAT’s five sets of criteria to evaluate the quality of each type of study included in our analysis, namely qualitative, randomized controlled, nonrandomized, quantitative descriptive, and mixed methods studies [8]. This rigorous assessment process allowed us to evaluate the methodological quality of the included studies and to ensure the reliability and validity of our findings.

Data analysis methods

During the quantitative phase, the extracted data were analyzed using STATA 14.1 statistical software to conduct the meta-analytic procedures. We applied the Freeman-Tukey double arcsine transformation to stabilize variances and to estimate the pooled prevalence of the study outcomes with their corresponding 95% confidence intervals (CI). A random-effects model based on the DerSimonian and Laird method was used to pool the estimates and account for between-study variability; this approach extends inverse-variance (fixed-effect) weighting by incorporating the estimated between-study variance [22, 23].
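To make the pooling procedure concrete, the sketch below implements a simplified version of it in Python, assuming per-study event counts and sample sizes. A basic sin² back-transform is used instead of the exact harmonic-mean inversion, and STATA’s implementation may differ in detail, so this is a sketch rather than the analysis code used in the study.

```python
import numpy as np

def pool_prevalence(events, n, z=1.96):
    """Simplified DerSimonian-Laird random-effects pooling of
    Freeman-Tukey double-arcsine transformed proportions (sketch)."""
    x, n = np.asarray(events, float), np.asarray(n, float)
    # Freeman-Tukey double arcsine transform and its approximate variance
    y = np.arcsin(np.sqrt(x / (n + 1))) + np.arcsin(np.sqrt((x + 1) / (n + 1)))
    v = 1.0 / (n + 0.5)
    # Inverse-variance (fixed-effect) weights give Cochran's Q
    w = 1.0 / v
    y_fixed = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - y_fixed) ** 2)
    df = len(y) - 1
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    i2 = 100 * max(0.0, (q - df) / q) if q > 0 else 0.0
    # Random-effects weights incorporate tau^2
    w_re = 1.0 / (v + tau2)
    y_re = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    lo, hi = y_re - z * se, y_re + z * se
    # Simplified back-transform to the proportion scale
    back = lambda t: float(np.sin(np.clip(t, 0, np.pi) / 2) ** 2)
    return back(y_re), (back(lo), back(hi)), i2

# Example call with hypothetical counts: pooled prevalence, 95% CI, I^2 (%)
print(pool_prevalence(events=[12, 30, 7], n=[100, 250, 60]))
```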

In the qualitative phase, the extracted data were imported into MAXQDA 20 software to facilitate the meta-synthesis procedures. This critical stage involved coding the suicide risk factors from the final studies into themes or categories: demographic (e.g., age, gender, marital status), clinical and behavioral (mental health diagnoses and conditions, and behaviors such as impulsivity, self-harm, or aggression), lifestyle (aspects of an individual’s daily life, including habits and routines), laboratory and biomarker (e.g., genetic markers and hormonal imbalances), and questionnaire-based factors (standardized scales and questionnaires used to quantify psychological factors associated with suicide risk). Through this process, we aggregated the coded data to identify common suicide risk factors across all the studies, allowing for a comprehensive understanding of the topic.

Publication bias

To address study, citation, and database biases, we searched across multiple languages and databases. This enhanced search strategy identified 7,529 publications. Because study results are frequently published in similar or consecutive articles, multiple-citation publications were common within our dataset; we therefore used EndNote software to identify duplicates and mitigate the risk of multiple-publication bias.

Results

Figure 1 presents the PRISMA flow chart, which provides a concise overview of the review process. The initial search yielded 7,529 published studies. After removing 569 duplicate records, we screened the titles and abstracts of the remaining 6,624 papers. Based on this screening, 5,624 papers were excluded because they did not meet the inclusion criteria. The full texts of the remaining 369 studies were then assessed to determine their eligibility for inclusion in the analysis. Among these, 328 studies were deemed ineligible as they did not meet the predetermined criteria. Ultimately, 41 studies that met the quality assessment criteria were selected for the meta-analysis and meta-synthesis. Overall, the selected studies demonstrated satisfactory quality.

Fig. 1 PRISMA flow diagram for the selection of studies on ML algorithms used for the purpose of suicide prediction

The included studies had sample sizes ranging from 159 to 13,980,570 [24, 25]. The mean sample size (549,944.51) represents the average number of participants across the included studies and bears on the generalizability of the findings: larger samples contribute to more robust and reliable results, allowing broader applicability of our conclusions.

The standard deviation (SD = 2,242,858.719) reflects the variability in sample sizes across the individual studies: some studies had substantially larger or smaller samples than the mean, resulting in a wide dispersion of values. This variability influences the heterogeneity of the overall findings and underscores the need to consider the diversity in sample sizes when interpreting the results. The median sample size, representing the central value, was 13,420. Most studies were conducted in the United States and South Korea, with cohort and case-control designs being the most commonly employed. The participants in these studies predominantly represented the general population. The outcome measurement criteria of the data collection process and the corresponding results are presented below.

Pooled prevalence of ML outcomes

Additional details of the included studies can be found in Table 1 (after the reference section). Note that the statistical analysis revealed no significant difference for the negative predictive value and the false positive rate (p-value greater than 0.05). To assess the influence of any single study on the overall meta-analysis, a sensitivity analysis was performed using a random-effects model; it showed no evidence that any individual study unduly affected the pooled results.

Table 1 Details of the studies included in this mixed-methods systematic review, identifying the most common suicide risk factors and ML outcomes

Accuracy

Accuracy refers to the ability of ML models to correctly differentiate between healthy and patient groups [56]. Out of the 41 final studies, 13 reported information on accuracy. The reported accuracy rates varied across the studies, as indicated in Panel A of Fig. 2, with the lowest being 0.70 for a neural network (NN) in [30] and the highest being 0.94 for a random forest in [32]. The overall pooled prevalence of accuracy was 0.78 (\({I}^{2}=\) 56.32%; 95% CI: 0.73, 0.84), Table 2.

Fig. 2 Panel A. Accuracy of the machine learning models; N studies = 13

Table 2 The list of different ML outcomes, along with the pooled estimates for those outcomes that have sufficient records

AUC

The area under the curve (AUC) is a metric used in this study to compare the performance of multiple classifiers [26]. In our analysis, a total of thirty-two studies reported AUC values, as indicated in Fig. 3, Panel B. Balbuena et al.’s (2022) study reported the lowest AUC of 0.54, based on the Cox and random forest models. On the other hand, Choi et al. (2021) reported the highest AUC of 0.97, using the XGBoost classifier. The pooled prevalence of AUC across the studies was estimated to be 0.77 (\({I}^{2}=\) 95.86%; 95% CI: 0.74, 0.80), Table 2.

Fig. 3 Panel B. AUC of the machine learning models; N studies = 32

Precision

Precision is the number of true positives divided by the sum of true positives and false positives [27]. In our analysis, three studies reported precision values, as depicted in Fig. 4, Panel C. Two studies reported the highest precision rate of 0.93: the first, conducted by Choi et al., utilized the XGBoost classifier, and the second, by Kim et al. (2021), employed a random forest model. The lowest precision rate of 0.86 was documented in the Delgado-Gomez et al. (2016) study, which used a decision tree model. The pooled prevalence of precision was estimated to be 0.91 (\({I}^{2}=\) 0.001%; 95% CI: 0.85, 0.98), Table 2.

Fig. 4 Panel C. Precision of the machine learning models; N studies = 3

Positive predictive value

Positive predictive value (PPV) represents the proportion of true positive cases among all positive predictions [27]. Among the studies included in our analysis, six studies reported PPV values as depicted in Fig. 5, Panel D. The PPV varied across the studies, ranging from 0.01 in Cho et al.’s study conducted in 2020 and 2021, which utilized a random forest model, to 0.62 in Navarro’s (2021) study, also employing a random forest model. The pooled prevalence of PPV was estimated to be 0.10 (\({I}^{2}=\) 97.02%; 95% CI: 0.03, 0.21), Table 2.

Fig. 5 Panel D. Positive predictive value of the machine learning models; N studies = 6

True positive rate

The true positive rate (TPR), also known as sensitivity, represents the proportion of actual positive cases correctly identified by the model [27]. In our analysis, only one study, conducted by Ballester et al. (2021) using a gradient tree boosting model, reported the TPR, as depicted in Fig. 6, Panel E. The pooled prevalence of TPR in this study was estimated to be 0.77 (95% CI: 0.40, 1.34), Table 2.

Fig. 6 Panel E. True positive rate of the machine learning models; N studies = 1

Sensitivity

Sensitivity, also known as the true positive rate, measures the proportion of actual positive cases (i.e., patients) correctly identified by the model [28]. In our analysis, fifteen studies provided data on sensitivity, as illustrated in Fig. 7, Panel F. The sensitivity ranged from 0.43 in Navarro’s (2021) random forest study to 0.87 in Delgado-Gomez et al.’s (2016) decision tree study. The pooled prevalence of sensitivity was estimated to be 0.69 (\({I}^{2}=\) 95.94%; 95% CI: 0.60, 0.78), Table 2.

Fig. 7 Panel F. Sensitivity of the machine learning models; N studies = 15

Specificity

Specificity measures the proportion of actual negative cases correctly identified by the model [28]. In our analysis, fifteen studies reported specificity rates, as illustrated in Fig. 8, Panel G. The specificity ranged from 0.63 in Melhem et al.’s (2019) study using logistic regression to 0.90 in Barak-Corren et al.’s (2017) study using a Naive Bayes classifier. The pooled prevalence of specificity was estimated to be 0.81 (\({I}^{2}=\) 80.31%; 95% CI: 0.77, 0.86), Table 2.

Fig. 8 Panel G. Specificity of the machine learning models; N studies = 15

Recall

Recall measures the proportion of actual positive cases correctly identified by the model [27]. In our analysis, three studies reported recall rates, as depicted in Fig. 9, Panel H, ranging from 0.11 in McKernan et al.’s (2019) study using bootstrapped L-1 penalized regression to 0.95 in Kim et al.’s (2021) study using random forest. The pooled prevalence of recall was estimated to be 0.58 (\({I}^{2}=\) 98.43%; 95% CI: 0.15, 1.29), Table 2.

Fig. 9 Panel H. Recall of the machine learning models; N studies = 3

False negative rate

The false negative rate represents the proportion of actual positive cases that the model incorrectly classifies as negative [29]. Two studies provided data on false negative rates, with rates that were similar to each other, as shown in Fig. 10, Panel I. These studies utilized random forest and binary logistic regression models. The pooled prevalence of the false negative rate was estimated to be 0.26 (\({I}^{2}=\) 0.001%; 95% CI: 0.24, 0.28), Table 2.

Fig. 10 Panel I. False negative rate of the machine learning models; N studies = 2

Suicide risk factors

In our meta-synthesis, we examined the 41 included studies and identified 261 suicide risk factors. We implemented a rigorous extraction process to identify the most significant risk factors. While some studies presented vast datasets with over 2,500 entries of potential risk factors, the focus was on extracting those factors consistently cited as common and important indicators of suicide risk across multiple studies [30, 31]. To ensure robustness, we excluded risk factors reported fewer than three times, resulting in a compilation of 55 frequently occurring risk factors. We aimed to focus on the more prevalent risk factors in the database to enhance the generalizability of the findings to the broader population; factors with lower frequencies can introduce noise into the analysis, making it more challenging to identify true patterns, and the minimum threshold helped us filter out less relevant factors. This decision was based on a focus group session that included two psychiatrists and one emergency physician, who selected the most common variables, those reported at least three times, based on their scientific knowledge and experience. These factors were categorized into five distinct categories in our study, as outlined in Table 3.
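For illustration, the frequency-based filtering step can be sketched as follows; the factor labels and counts below are placeholders rather than the coded data from the included studies.

```python
from collections import Counter

# Hypothetical coded risk factors aggregated across studies (placeholders).
coded_factors = ["depression", "age", "depression", "sex", "depression",
                 "age", "LDL", "age", "sex", "sex", "income"]

counts = Counter(coded_factors)
# Drop factors reported fewer than three times, mirroring our threshold.
frequent = {factor: k for factor, k in counts.items() if k >= 3}
print(frequent)  # e.g. {'depression': 3, 'age': 3, 'sex': 3}
```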

Table 3 Frequently occurring suicide categories and risk factors

Discussion

This study employed a systematic review, meta-analysis, and meta-synthesis approach to examine the pooled prevalence of ML outcomes for predicting suicide and to provide a comprehensive list of suicide risk factors. The intricate nature of suicide as a behavior is underscored by a diverse array of risk factors, spanning clinical variables to lifestyle influences [32]. Our study adopted a comprehensive approach, employing both qualitative and quantitative methods. Additionally, given the large number of studies reaching the final stage, and to ensure methodological rigor, the review was limited to studies with prospective, retrospective, retrospective cohort, case-cohort, case-control, cohort, diagnostic/prognostic, longitudinal, longitudinal cohort, longitudinal prospective, prognostic, prospective cohort, and randomized controlled trial designs. Ultimately, 41 studies that met the quality assessment criteria were selected for the meta-analysis and meta-synthesis. Results revealed that the neural network (NN) algorithm had the lowest accuracy, at 0.70, whereas random forest exhibited the highest accuracy, at 0.94. Furthermore, the XGBoost classifier demonstrated the highest area under the curve (AUC) value, reaching 0.97. These findings not only contribute to our understanding of suicide risk factors but also highlight the significance of methodological considerations and algorithmic performance in predictive models.

The findings of this study are consistent with previous research conducted by [33, 34], which suggested that ML algorithms and the identification of innovative indicators play a valuable role in predicting suicide and detecting mental health issues. However, these findings contradict the results of [35], which indicated insufficient evidence to support the superior performance of ML over logistic regression in clinical prediction models. The included studies that used ML techniques to predict suicide attempts demonstrated good overall performance, particularly for the most commonly used algorithm, XGBoost. For example, the AUC values reported in these studies were consistently high, ranging approximately between 0.65 and 0.97. An AUC value of 0.5 indicates a random prediction, while a value of 1 represents a perfect prediction. AUC values as high as 0.97 for the XGBoost model suggest that the ML models had a high degree of accuracy in classifying individuals with respect to their risk of suicide attempts. These findings are consistent with previous research [36], which confirmed the acceptable performance of the XGBoost algorithm in assessing cognition in patients with major depressive disorder. This result may reflect the fact that XGBoost is an ensemble model that builds successive models, each reducing the classification errors of the previous iteration. According to [37], certain ML algorithms, such as support vector machines (SVM) and decision trees (DT), are preferred over others due to their superior performance in predicting suicide risk. Furthermore, [38] confirmed that the application of ML techniques to analyze large databases holds great potential for facilitating the prediction of suicide, offering promising avenues for future research. The results of this study also align with the findings of [39], which highlighted the ability of ML to enhance suicide prediction models by incorporating a larger number of suicide risk factors. The applicability of these methods in specific patient groups is invaluable; for example, [40] indicated that predicting whether a person has a mental illness is itself a significant challenge, so any new avenue of hope that machine learning can offer clinicians is commendable. However, [41] found that although these models have demonstrated accuracy in overall classification, their ability to predict future events remains limited in the context of suicide prediction.

It is important to note that the performance of ML algorithms can vary depending on several factors, including the quality and size of the dataset, the specific features used as input, the preprocessing steps applied, and the hyperparameters selected for the algorithms. Nevertheless, the overall performance of these algorithms in predicting suicide showed strong discriminatory power in distinguishing between individuals who are at risk of suicide attempts and those who are not. Future research should continue exploring and refining ML approaches for suicide prediction, taking these factors into account to improve the accuracy and reliability of predictions.

The findings of our study revealed that various factors, such as age, sex, substance abuse, depression, anxiety, alcohol consumption, marital status, income, education, low-density lipoprotein (LDL), and occupation, were identified as the most prevalent risk factors based on the analysis of the included studies. Age plays a complex role in suicide, with several studies indicating a higher incidence among middle-aged and older adults; however, age is not the sole factor contributing to suicidal behavior [42, 43]. The prevalence of suicide is also exceptionally high among young people, specifically those aged 15 to 19, for whom it is the fourth leading cause of death worldwide [44]. Sex is a significant risk factor for suicide. In general, men are more likely to die by suicide than women, but women attempt suicide more often than men; this may be because men are more likely to use lethal methods [42, 45, 46].

According to the meta-synthesis results, there appears to be a significant correlation between substance abuse and depression and suicide. This correlation may arise because substance abuse can impair judgment and increase impulsivity, while a person who is depressed may experience feelings of hopelessness, helplessness, and despair, which can lead to suicidal thoughts or behaviors. These findings align with the studies conducted by [47, 48]. Anxiety, as a mental health condition, can lead to various negative outcomes, including an increased risk of suicide [49]. Alcohol use can increase impulsivity and decrease inhibitions, leading to risky behaviors such as self-harm or suicide attempts [50]. The authors of [51] found that the consumption of alcohol while feeling sad or depressed could indicate suicidal behavior in adolescents who had not previously reported having thoughts of suicide before attempting it.

Marital status is a common suicide risk factor. Researchers have found that married individuals have lower suicide rates than their unmarried counterparts, a trend observed in both men and women across different age groups and cultures [52]. Low income has been associated with an increased risk of suicide; the reasons for this link are complex and multifactorial, but possible explanations include limited access to healthcare and mental health services, financial strain, and social isolation [53]. Lower education levels are also associated with higher suicide rates, possibly because individuals with less education have fewer job opportunities and may experience more financial stress [53]. In addition to the clinical and demographic factors discussed, it is crucial to recognize the significant role that certain biomarkers and laboratory factors play in vulnerability to suicide. One notable example is the impact of low serum cholesterol levels, which have been found to significantly heighten the risk of suicide [54]. Some studies have shown that the LDL level is an important factor in the incidence of suicide [55]. Moreover, some studies have indicated that individuals who died by suicide had higher levels of LDL compared with non-attempters [56].

Machine learning (ML) techniques are suitable for predicting suicide risk and can overcome the constraints of traditional methods. However, ML requires sufficient and relevant data to train and validate models for early identification of risk factors and suicide prediction. We acknowledge the importance of anticipating and addressing immediate concerns related to suicide in a clinical setting; for this reason, some studies have focused on utilizing particular scales in psychiatric outpatients [57]. However, reliance solely on these scales may instill an unwarranted sense of assurance among healthcare providers. Hence, it is crucial to factor in data availability and the computational demands of handling extensive datasets and intricate models. Our evaluation underscores the proficiency of ML algorithms in uncovering concealed relationships and delivering precise predictions of suicide risk, contingent upon the judicious selection and meticulous evaluation of algorithms. This underscores the indispensable role of ML algorithms in exhaustively analyzing data and pinpointing crucial risk factors, thereby advocating for further exploration in the field. This methodological breadth mirrors the multifaceted nature of suicide risk prediction, enhancing the generalizability of our findings. However, our study may be susceptible to limitations arising from the included studies and the meta-analysis methodology. Additionally, reliance on published literature may introduce publication bias, favoring studies with statistically significant results and potentially skewing the overall findings. Furthermore, future studies should report τ² and the Q-statistic so that heterogeneity can be assessed. Despite these challenges, our study offers valuable insights into the role of machine learning algorithms in predicting suicide risk and sheds light on important risk factors associated with suicidal behavior. Future research endeavors should continue to tackle these methodological hurdles, striving for enhanced standardization and transparency in study reporting to fortify the reliability and reproducibility of findings in this crucial domain of inquiry.

Ethical considerations in the use of ML for suicide prediction

Using machine learning (ML) for suicide prediction requires careful attention to ethical considerations, as the well-being and rights of individuals and the privacy and confidentiality of their data are crucial. Participants should be fully informed about the study’s purpose, potential risks, and benefits, and have the right to withdraw their consent at any time. Understanding and interpreting the factors and variables that contribute to the predictions is important; this transparency is required to gain the trust of both individuals at risk and healthcare professionals. It is also important to ensure that ML algorithms do not replace human intervention and clinical judgment: human oversight is critical in interpreting and acting upon the predictions made by the algorithms, and healthcare professionals should make informed decisions based on ML predictions while considering the individual’s unique circumstances and context [58].

Conclusion

Suicide is a complex and multifaceted public health issue with significant implications for individuals and communities. Our study examined the application of ML techniques for predicting suicide risk. Our research findings highlight the diverse performance of ML algorithms in predicting suicide, indicating the need for further investigation and refinement.

Our analysis identified several general risk factors contributing to an individual’s heightened risk of suicide. These factors include age, sex, substance abuse, depression, anxiety, alcohol consumption, marital status, income, education, and occupation. It is important to recognize that these risk factors interact in complex ways and that their presence does not guarantee suicidal behavior. Nonetheless, understanding and addressing these risk factors can aid in developing targeted prevention and intervention strategies.

While ML algorithms have shown promise in predicting suicide risk, their performance can vary depending on the specific dataset and risk factors being considered. Further studies are warranted to explore using ML algorithms across diverse databases encompassing various risk factors. This would allow for a more comprehensive understanding of the predictive capabilities of ML in different contexts and populations.

Moreover, future research should focus on enhancing the interpretability and explainability of ML models in suicide prediction. Understanding the underlying mechanisms and variables contributing to predictions is essential for effective intervention and decision-making. Additionally, rigorous validation and evaluation of ML algorithms should be conducted to assess their accuracy, generalizability, and potential biases.

To advance the field of suicide prediction using ML, collaboration between researchers, clinicians, and policymakers is crucial. This interdisciplinary approach can foster the development of comprehensive and ethical frameworks for implementing ML algorithms in suicide prevention efforts. Ensuring that ML techniques are used responsibly, prioritizing patient well-being, privacy, and equitable outcomes is imperative.

In conclusion, our study sheds light on the potential of ML algorithms in predicting suicide risk. However, further research is needed to refine and validate these algorithms across different datasets and risk factors. By understanding the complexities of suicide and leveraging the power of ML, we can work towards more effective strategies for suicide prevention and intervention.