1 Introduction

Artificial Intelligence (AI) systems came into being to reduce the errors to which human beings are vulnerable in all types of decision making, and the interaction between mind and machine has come a long way, from a mere duality of man and machine to sophisticated AI systems. In the past, fiction writers dominated accounts of how mind and machines might hypothetically interact (Malik 2001); today, science and technology dominate this relationship (Desouza 2002; Pylyshyn 1986). This human and computer duality came into full force after the arrival of computing technology, was transformed after the arrival of the internet, and was hyped after the arrival of AI. Today, we have entered the era of CGPT (ChatGPT), staring at our intellectual windows with 6640 entries in Google Scholar on March 24th, 2023. Appendix A shows the general diffusion of AI systems in various fields, and Appendix B shows CGPT in various fields through Google search. In research, it has been touted as a research assistant (Patel and Lam 2023), a matter of research priorities (van Dis et al. 2023), fun but not an author (Thorp 2023), an ethical challenge (Liebrenz et al. 2023), and deficient in trust (Tsigaris and Teixeira da Silva 2023). Thus, the diffusion of CGPT has created proponents and opponents.

Proponents of CGPT bundle its blessings for error reduction, and opponents decry its curse of error expansion. It is a blessing because it complements scholarly effort, saves time, and reduces errors. It is a curse because it expands the scope of errors of commission and omission: a higher number of errors, and the increased time required to correct them, is a curse. While the public media has hyped its value and entrepreneurs have begun to produce vlogs about making money through this blessing, scholars are in a state of ambiguity, and critics claim that errors of commission and omission have increased rather than decreased (Kolt N: Algorithmic Black Swans, forthcoming). Despite the disagreement on the types and level of errors, both sides agree that AI systems like CGPT serve as research assistants for scholars and practitioners in R&D settings. While assumptions abound that CGPT holds vast information, has the capacity to assemble it, and offers the efficiency to save time on its retrieval (e.g., of literature), it is still too early to conclude, because little evidence exists in response to the natural question: does CGPT reduce errors, and if so, when does it succeed or fail? This article conducts an experiment in a case study of published literature to create a cycle of error detection:

  • Citation to CGPT summary,

  • CGPT summary to the actual abstract, and the proximity between the two,

  • CGPT summary to the citation from which the summary was generated,

  • Citation to the actual abstract.

This experiment is relevant and important for understanding the published literature, in which CGPT is glorified by some and disdained by others as a research assistant or tool. Just as robots have substituted for human labour in factories, AI systems have come to threaten professions such as accounting, law, and now research assistance. It is claimed that CGPT can provide research assistance to scholars by replacing long hours of human skill, but no one has explored whether such AI tools are effective, accurate, and meaningful. The experiment is based on a single author, who was able and willing to take part in the study and provide timely feedback. The participating author provided a list of publications, from which the current study selected 34 articles published in various journals. The experiment revealed insights about errors of omission and errors of commission, stretching from human biases to AI biases.

2 Theory and framework

Error of commission and error of omission are two concepts used in decision-making and management that refer to different types of mistakes. An error of commission occurs when an action is taken but the action is the wrong one, or not the appropriate one for the situation. It is an error of action or decision-making in which an incorrect or inappropriate action is taken: doing something that should not have been done, or doing it in the wrong way. An error of omission occurs when an action or decision that should have been taken is not taken, or is delayed. It is an error of inaction: not doing something that should have been done.

As an example of the error of commission, a doctor prescribes the wrong medication to a patient, causing adverse side effects that worsen the patient's condition; the doctor erred by taking action and prescribing the wrong medication. In contrast, an emergency medical technician arrives at an accident scene but fails to provide immediate first aid because the technician did not follow the standard protocols in the training manual; the technician made an error of omission by not taking the necessary action. While both errors can occur intentionally or unintentionally, they differ in their assumptions, processes, and implications for the decision. Theorists and scholars have tackled these two types of errors for centuries, and recent decades have shown distinctive progress in different fields. Five theoretical perspectives on the errors of commission and omission in decision-making have been used widely. Here, they are discussed in the temporal order of their related publications on the two types of errors.

Simon (1959) used the concept of bounded rationality to explain the cognitive limitations of decision-makers, described the cognitive biases and heuristics that shape decision-making, and implied errors of commission and omission arising from limited information. Reason (1990), a British psychologist, introduced the "Swiss Cheese Model" of accidents, which illustrates that accidents can occur when multiple factors or errors align like holes in a series of slices of Swiss cheese. Taleb (2010) suggests that errors of omission can arise from failing to anticipate or prepare for rare events, while errors of commission can arise from over-preparing or over-reacting to perceived risks. Kahneman (2011) has written extensively on the cognitive biases and heuristics that affect decision-making, including the biases that can lead to errors of omission or commission. His work suggests that errors of omission and commission can arise from different cognitive biases and heuristics, such as the availability heuristic, confirmation bias, and anchoring bias; these biases can lead decision-makers to overlook or omit important information, or to commit to incomplete or inaccurate information. Klein (2017) emphasizes the role of intuition and pattern recognition in the decision-making process in high-pressure environments, leading to errors of commission that arise from relying too heavily on intuition and failing to consider alternative courses of action, or errors of omission that arise from failing to recognize important patterns or information.

These theorists and their perspectives suggest that errors of omission and commission can arise from a range of factors, including cognitive limitations, biases and heuristics, failure to recognize potential weaknesses in a system, reliance on intuition, and failure to prepare for rare and unpredictable events. The common thread between these authors is twofold: cognitive limitations, and a duality of logics, one of cognitive information and the other of heuristics based on experience. The tilt of the balance between the two can lead to commission or omission. Overall, while these theorists share an interest in understanding errors of omission versus commission, they approach the topic from different angles and emphasize different factors in their analyses.

They also differ from each other in subtle ways. Simon's (1959) bounded rationality theory and Kahneman's (2011) cognitive biases and heuristics focus more on the cognitive limitations and biases of decision-makers. Reason's (1990) Swiss Cheese Model emphasizes the importance of understanding the different layers of defence in a system for error detection and the mechanisms that lead to accidents. Likewise, Klein and Taleb subtly differ: Klein (2017) emphasizes the role of intuition and pattern recognition in decision-making, while Taleb (2010) emphasizes the importance of preparing for rare and unpredictable events. The following sections illustrate the error of commission versus omission.

2.1 AI and errors

Although AI systems are made to reduce these errors, the reasoning of inclusion or exclusion (commission or omission) relates to systems through the scope of information, the selection process, and the training order. The scope of information implies the range of information from proximal to distal; the selection process implies that the output depends on the input; and the training order implies whether rewards or risks are used as primers. Naturally, the biases from the size, quality, and arrangement of information are human-dependent activities, and humans can introduce those biases into systems. Biased retrieved output in artificial intelligence systems can be contextualized in terms of errors of omission and commission. An error of omission occurs when an AI system fails to include relevant information in its output, while an error of commission occurs when the AI system includes irrelevant or incorrect information in its output. Errors of omission in AI systems can occur due to biases in the data used to train the system. For example, if a training dataset is not diverse enough, the AI system may not be able to recognize certain patterns or make accurate predictions for certain groups of people. This can result in the system omitting important information that could be relevant to certain individuals or communities.

On the other hand, errors of commission in AI systems can occur due to the inclusion of biased or incorrect information in the training dataset or in the algorithms used by the system. For example, if a dataset contains stereotypes or discriminatory information, the AI system may incorporate these biases into its output. This can lead to the perpetuation of harmful stereotypes and discriminatory practices. In theory, errors of omission and commission can be contextualized within the broader framework of algorithmic bias. Algorithmic bias refers to the systematic and unfair treatment of certain individuals or groups of people by algorithms. Errors of omission and commission are both types of algorithmic bias that can result in harm to individuals and communities.
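In information-retrieval terms, these two errors map onto the familiar notions of false negatives and false positives. The minimal sketch below uses invented sets purely for illustration; it is not data from this study.

```python
# Illustrative mapping of omission/commission onto retrieval errors.
# The sets below are invented examples, not data from this study.
relevant = {"paper_A", "paper_B", "paper_C"}   # items that should be retrieved
retrieved = {"paper_A", "paper_D"}             # items the system returned

omissions = relevant - retrieved    # false negatives: relevant items left out
commissions = retrieved - relevant  # false positives: wrong items included

recall = len(retrieved & relevant) / len(relevant)      # 1 minus omission rate
precision = len(retrieved & relevant) / len(retrieved)  # 1 minus commission rate
print(omissions, commissions, recall, precision)
```

On this framing, a biased or undiverse training set depresses recall for the affected groups (omission), while biased or incorrect training content depresses precision (commission).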

2.2 Hegemonic errors

Hegemonic errors refer to the notion of the Matthew Effect, explained elsewhere in this article, which alludes to a person's position in the social hierarchy of academic fields rather than to intellectual capabilities or contributions to the field. While cognitive bias and error refer to the mental capacity for information processing (Tversky and Kahneman 1974), the hegemonic error occurs because of social status in different domains or at different levels. For example, in the domain-related hegemonic error, the English language has gained hegemony, and errors in the English language or in English-language outlets receive a different social status and response from decision makers or the public (Choi 2010). In the level-related hegemonic error, the ethical principles and practices within American business organizations are influenced by the changing dynamics of power and hegemony (Marens 2010). This social status bias suggests two types of errors: people in positions of power or privilege may be more likely to make errors of commission because they feel pressure to take action, even if that action may not be warranted or may have negative consequences.

The Chinese Ministry of Foreign Affairs (CMFA 2023) refers to US hegemony and its two types of errors arising from political, military, economic, technological, and cultural hegemony. Mearsheimer (2018) calls it 'a great delusion', which amounts to a potential for errors. Hegemonic power gives entities a sense of superiority: they are more likely to overlook important information or perspectives that challenge their assumptions, leading to errors of omission. Similarly, people who are marginalized or underprivileged may be more likely to make errors of omission because they lack the resources or social capital to take action or to ensure their voices are heard. They may also be more likely to underestimate the potential impact of their actions, leading to errors of commission. In both cases, social status biases can reinforce existing power structures and perpetuate inequality. For example, if decision-makers consistently make errors of commission because they feel pressure to take action, they may be reinforcing the idea that action is always necessary, even if it has negative consequences for marginalized groups. Similarly, if decision-makers consistently make errors of omission because they overlook marginalized perspectives, they may be perpetuating the marginalization of those groups.

At the second level, the social status of others influences hegemonic errors, and the Matthew effect is one way to understand this phenomenon. The Matthew effect is the idea that those who have more resources, opportunities, or advantages are more likely to accumulate even more resources, opportunities, or advantages over time. This can occur because people who are already successful are more likely to be recognized, rewarded, and given additional opportunities to succeed. Conversely, people who are not as successful may be overlooked or undervalued, which can make it more difficult for them to achieve success in the future.

In the context of decision-making, the Matthew effect can contribute to errors of commission or omission in several ways. For example, decision-makers who are biased towards those with higher social status may be more likely to take action or provide resources to those individuals, even if they do not necessarily need them or are not the most deserving. This can lead to errors of commission, where resources are wasted or misdirected, and may perpetuate existing power structures. Conversely, decision-makers who overlook or undervalue those with lower social status may be more likely to make errors of omission, where important perspectives or needs are overlooked. This can perpetuate existing inequalities and limit opportunities for those who are already marginalized.

3 Methods

In this experiment, the main participant is one researcher who was able to commit time and effort during the data gathering and analysis in this case study. It is a case study because the literature produced by a single scholar is used in the analysis. Four other authors who were approached were unavailable or unwilling to participate in this project. The other two participants are Artificial Intelligence (AI) systems: Google Scholar and CGPT. The former was used to gather basic statistics on AI in general and CGPT in particular across disciplines for comparative analysis with research assistance. Appendices A and B show the summary of the search entries and retrieval frequencies. The latter was the focus of this analysis; hence, CGPT is the AI system participant in the case. Together, the participating author and CGPT contributed to the progress of this study to answer the research question with accurate information.

The unit of analysis within the case study was each of the 34 publications in peer-reviewed journals. These articles were diversified in theories, such as institutional theory, cultural theories, psychological theories, transaction cost theory, and governance theories. The fields included education, biotechnology, SMEs (small and medium enterprises), information and communication technology, and military technology. The locations included city, national, and international regions. The levels of analysis included persons, organisations, alliances, and nations. Regarding methodology, these 34 studies include quantitative, qualitative, and mixed methods, with deductive, inductive, and abductive reasoning to support the arguments proposed. Thus, the reviewed articles in the case study of this single author (who is either first or corresponding author) are diversified for the purpose set out in this study.

In the interaction between the case and CGPT, the process followed three distinctive steps. First, the articles were ordered chronologically and then given identification numbers. Second, the full citation of each publication was entered into CGPT, for example: "Write a summary: Article Ref X". For consistency, all instructions were identical throughout this experiment. Third, at the end of the procedure, the summaries generated by CGPT were given back to CGPT one by one, with the instruction: "Find the original citation of this summary of an article: Text of the Summary." While not all articles were assessed at one time in one day, the steps were consistently identical in all interventions.
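As a rough sketch of this procedure, the loop below assumes a hypothetical helper, ask_cgpt, standing in for whatever chat interface is used; the prompt templates mirror the instructions quoted above, but the function names and record layout are illustrative, not the study's actual tooling.

```python
# Minimal sketch of the two-prompt experimental cycle (illustrative only).

def ask_cgpt(prompt: str) -> str:
    """Hypothetical stand-in for a single chat-model call; in a real run,
    this would forward the prompt to the chat interface and return its reply."""
    raise NotImplementedError("connect this stub to the chat interface")

def run_experiment(citations: list[str]) -> list[dict]:
    """Run the summary / back-citation cycle over chronologically ordered citations."""
    results = []
    for article_id, citation in enumerate(citations, start=1):  # assign IDs
        # Step 2: request a summary from the bare citation.
        summary = ask_cgpt(f"Write a summary: {citation}")
        # Step 3: feed the generated summary back and ask for its source.
        recovered = ask_cgpt(
            "Find the original citation of this summary of an article: " + summary
        )
        results.append({
            "id": article_id,
            "citation": citation,
            "summary": summary,
            "recovered_citation": recovered,
        })
    return results
```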

3.1 Data analysis

Three simple data analysis patterns were used. First, the accuracy of the link between the citation and the summary was assessed to understand whether CGPT is efficient and effective in finding a piece of literature based on a reference provided to it. Second, the CGPT-generated summary and the author's published abstract were matched for compatibility, to gauge their proximity. Third, the link between CGPT's generated summary and CGPT's recommended citation was matched for accuracy versus errors. Table 1 shows the literature in Column 1, the citation ID in Column 2, CGPT's summary in Column 3, and CGPT's rated proximity between the two texts, that is, the actual abstract of the publication vis-à-vis the summary generated by CGPT based on just a simple reference given to it.
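Continuing the sketch above, and again assuming the hypothetical ask_cgpt stub from the previous listing, the proximity-rating step might be automated roughly as follows; the prompt wording and record layout are assumptions, not the study's exact procedure.

```python
# Sketch of the proximity-rating step (illustrative; reuses ask_cgpt above).

def rate_proximity(summary: str, abstract: str) -> int:
    """Ask the model to rate content proximity between two passages on a 10-point scale."""
    reply = ask_cgpt(
        "Rate the proximity between the following two passages on a 10-point "
        "scale, answering with a single number.\n"
        f"Passage 1: {summary}\nPassage 2: {abstract}"
    )
    return int(reply.strip())  # assumes the model complies with the format

def average_proximity(records: list[dict]) -> float:
    """Average the ratings; each record pairs a CGPT summary with the published abstract."""
    scores = [rate_proximity(r["summary"], r["abstract"]) for r in records]
    return sum(scores) / len(scores)  # the study reports an average near 5/10
```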

Table 1 Data, analysis and results

4 Results

4.1 Citation to summary

Table 1 shows the summaries generated by CGPT in correspondence with the citations. After the summaries, CGPT was asked to rate the proximity between each summary and the actual abstract in the publication. Based on content rather than context, CGPT's proximity score on the 10-point scale appears in the last column of the table. While the range varies, the average is about 5 (50%).

4.2 Summary to citation

The next test asked whether CGPT can find the reference for a summary of a citation it produced earlier. It either failed to find a reference or found wrong references. All 34 attempts returned wrong citations; two examples of the wrong references from the table are shown here: #03 and #30.

#03: CGPT:

  • Gulati, R., & Gargiulo, M. (1999). Where do interorganizational networks come from? American Journal of Sociology

  • Kale, P., & Singh, H. (2007). Building firm capabilities through learning: The role of the alliance learning process in alliance capability and firm-level alliance success. Strategic Management Journal.

  • Provan, K. G., & Milward, H. B. (1995). A preliminary theory of interorganizational network effectiveness: A comparative study of four community mental health systems. Administrative Science Quarterly

  • Ring, P. S., & Van de Ven, A. H. (1994). Developmental processes of cooperative interorganizational relationships. Academy of Management Review

#30: CGPT:

  • Hoge, C. W., Grossman, S. H., Auchterlonie, J. L., Riviere, L. A., & Milliken, C. S. (2014). Mental health problems, use of mental health services, and attrition from military service after returning from deployment to Iraq or Afghanistan. JAMA,

  • Pietrzak, R. H., Johnson, D. C., Goldstein, M. B., Malley, J. C., & Southwick, S. M. (2009). Perceived stigma and barriers to mental health care utilization among OEF-OIF veterans. Psychiatric Services

  • Savitz, G. L., & Brown, R. W. (2018). Military-civilian partnerships for health system transformation.

  • Simpson, G. K., & Tate, R. L. (2007). Preventing depression after traumatic brain injury: A randomised controlled trial of a psychoeducational intervention. British Journal of Clinical Psychology

Actual abstract to citation

In the last stage, the analysis tested whether the actual abstract of a research article was correctly linked to the respective authors; that is, whether CGPT accurately produces the citation for the actual abstract of published literature. The observers in the experiment estimated that it would produce more than 90% accurate links between the published abstract and the citation. Instead, CGPT produced wrong citations for the published abstracts of all 34 studies in the sample. Only two of them are shown below for demonstration.

#22: Abstract

  • Kim, S., Lee, H., & Park, Y. (2019). The impact of cultural dimensions on anxiety management projects: The moderating role of uncertainty avoidance and long-term orientation. Journal of Business Research, 98, 172–181. https://doi.org/10.1016/j.jbusres.2019.01.043

#08: Abstract

  • Park, S. H., & Ungson, G. R. (2001). The effect of national culture, organizational complementarity, and economic motivation on joint venture dissolution. Academy of Management Journal, 44(2), 233–240.

4.2.1 Q1: Does CGPT retrieve relevant and specific summaries of the publication based on citation-based research?

Overall, CGPT correctly identifies and retrieves summaries of the literature based on a citation. It produces an introductory sentence that is often a repetition of the citation, followed by the method and argument, with findings where relevant. It concludes with a sentence that captures the overall abstraction of the main point. These patterns remained consistent during the 34 searches of the published literature based on citations. However, CGPT often ignores the higher-level theoretical importance in the summary and relies on lower-level theoretical importance, in which the predictors are treated as some kind of theory.

4.2.2 Q2: How proximal are the CGPT-produced summary and the author’s abstract?

This question of the proximity between the summary and the abstract has two findings. First, the proximity between the CGPT-produced summary and the abstract was rated by CGPT on a 10-point scale. A consistent instruction was given to CGPT to rate the proximity between the two passages, and the process was repeated for each passage. The CGPT-rated proximity varies from 2 (low proximity) to 10 (high proximity). On average, CGPT produced a proximity level of about 5 on the 10-point scale, which is in the middle. Second, the author was asked to rate the proximity between the abstract and the summary for each of the 34 publications. The author-rated proximity ranged from 5 to 9, with the average falling near 7 on the 10-point scale. In both cases, the proximity was rated based on content rather than context.

4.2.3 Q3: Does CGPT correctly identify the citations for the summary it has produced on the citation it was given earlier?

Unfortunately, CGPT failed entirely to match the summary it produced with the correct citation. Three points make this answer clearer. First, CGPT linked summaries to their citations 100% incorrectly, which is an error of commission and of omission: an error of commission because it includes authors who are not part of the research, and an error of omission because it excludes the author whose literature was used to retrieve the summary. Second, CGPT produced a worse outcome of errors of omission/commission by precisely linking the wrong citation to the abstract. Third, the errors of commission/omission revealed strong biases towards authors and journals of higher ranking. This reveals that CGPT carries more socio-cultural, value-laden patterns than the neutrality intuitively attributed to it; this, too, is an error of omission.

4.2.4 Q4: Does CGPT correctly identify the citation for the actual abstract and vice versa?

Unfortunately, CGPT failed to produce the correct link from the citation to the actual published abstract and the author. As with the error of commission through wrong references, the experiment revealed a 100% failure of CGPT to correctly identify the abstract or author. Instead, it wrongly associated authors unrelated to the abstract, which means it committed an error of commission by including wrong citations.

5 Discussion

With the advent of Artificial Intelligence (AI), there has been increasing interest in using large language models, such as the CGPT series, for automated text summarization. However, there are concerns about the effectiveness and reliability of these models, particularly in academic research, where accuracy is critical. One of the main challenges in academic research is to effectively summarize a large body of literature. This task is often time-consuming and can be error-prone, especially when the researcher is dealing with a large number of sources. The purpose of this study is to evaluate the effectiveness and reliability of CGPT in summarizing academic publications, attributing them correctly, and diffusing their implications. The study aims to provide insights into the strengths and limitations of CGPT and its potential implications for academic research, and into whether it can be a meaningful research assistant given the level and scope of its errors of commission and omission. Therefore, the question of whether CGPT can accurately and efficiently summarize academic publications inspired the following questions for this article:

  • Does CGPT retrieve relevant and specific summaries of the publication based on citation-based research?

  • How proximal are the CGPT-produced summary and the author’s abstract?

  • Does CGPT correctly identify the citations for the summary it has produced on the citation it was given earlier?

  • Does CGPT correctly identify the citation for the actual abstract and vice versa?

The experiment found that CGPT is able to retrieve relevant and specific summaries of publications based on citations, but often ignores the higher-level theoretical importance in the summary. The proximity between the CGPT-produced summary and the author's abstract was rated in the middle of a 10-point scale by CGPT, while the author's rating fell near 7. However, CGPT failed entirely to match the summary it produced with the correct citation, and it showed a strong bias towards authors and journals of higher ranking. Additionally, CGPT failed to correctly identify the citation for the actual abstract and author. Hence, CGPT has committed errors of commission and errors of omission, which raise the natural question: why is CGPT prone to commission and omission fallacies?

Two explanations come to the fore: technical and social. The technical explanation refers to the scope and scale of the information used to train the AI system. The social explanation refers to biases of context, structure, and people, based on a variety of caste systems in academic fields (Burris 2004). The former (technical) error builds on the idea of 'garbage in, garbage out'. CGPT makes errors of commission and errors of omission because it relies on statistical patterns in the data it was trained on, and it may not have encountered certain patterns or information during its training. As a language model, CGPT generates text based on the input it receives and tries to predict the most likely continuation of the text. However, if the input it receives is incomplete or ambiguous, it may generate text that includes irrelevant or incorrect information, leading to errors of commission. On the other hand, if the input it receives does not contain certain information that is necessary for generating an accurate response, it may omit that information, leading to errors of omission. Additionally, the accuracy of CGPT may also be affected by biases present in the data it was trained on, which can influence the patterns it learns and the responses it generates.

The latter (social) error builds on the notion of the 'Matthew Effect'. The Matthew Effect is a phenomenon influenced by the prestige effect associated with countries, organizations, networks, and caste systems (institutions). The term "Matthew Effect" was first coined by sociologist Robert K. Merton (Merton 1968), who drew a comparison between this phenomenon and a quote from the Bible's Gospel of Matthew, which states: "For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken even that which he hath." In the sciences, the Matthew Effect suggests that researchers who have already established themselves as authorities in their field tend to receive more citations and recognition than others. Furthermore, the Matthew Effect is driven by the prestige effect associated with the institutions to which researchers belong: the more prestigious the institution, the more likely it is for its members to receive recognition and visibility in their respective fields. This can create a feedback loop, where the most cited researchers continue to receive more citations and attention, while lesser-known researchers struggle to gain recognition for their work. This phenomenon can result in errors of omission and commission, where certain researchers or publications are overrepresented or underrepresented in the literature and citations due to the prestige effect associated with their respective institutions.

Furthermore, the Matthew Effect can also impact the recognition and visibility of certain publications. High-profile journals and conferences are more likely to attract the work of established researchers, while new or lesser-known publications may struggle to attract attention or citations. This again can lead to errors of omission and commission in the citation and recognition of research.

In conclusion, the errors discovered above reflect the technical and social context. Technically, the AI system developer, the software engineer, the database for the literature, the author's knowledge of ICT (information and communication technology), the fashion of concepts, and the interaction between systems explain these errors. Socially, the social status of the publication, the author, the publisher, the university, the country, the authors' names, the concepts used, and other biased contexts explain the Matthew effect.