The relevance of title, abstract, and keywords for scientific paper quality and potential impact

Authors, editors, and reviewers need to have a good perception regarding the quality of a manuscript in order to improve their skills, save effort, and prevent errors that can affect the submission procedure. In this paper, we compared the author’s perception of a manuscript’s quality with the manuscript’s actual impact. In addition, we analyzed the uncertainty of the author’s perception of the manuscript’s quality. From there, we defined ‘partition’ as the author’s ability to perceive the actual quality. We did this by launching a website for the use of the scientific community. This webpage provided a tool to help improve an investigator’s skill in understanding and recognizing the quality of a manuscript so as to help researchers improve and maximize their works’ potential impact. We carried out the experiment with 106 experienced users who tested our webpage. We found that the Abstract, the Title, and the Keywords were enough to perform a reasonably accurate evaluation of a manuscript. Most of the researchers were able to determine the quality of a paper in less than a minute from this small amount of information.


Introduction
In the academic field, authors need to publish their results in order to make them available to the public, and manuscripts are one of the most important ways to achieve those needs. We can simplify the manuscript publishing process into the following steps: 1) Authors write a manuscript and send it to a journal, 2) The journal reviews the manuscript to check if their quality standards are met, and 3) After the review process, the manuscript is either published or rejected. We can consider manuscripts as potential papers once they pass a review process and various corrections or modifications are made (if required). Usually, there are plenty of candidate journals where a manuscript could suitably fit according to the topic(s) and the manuscript's field of knowledge. This situation forces authors to select a journal where they would prefer to have their work published. Typically, every candidate journal has a different level of impact. The impact is defined by [1] as one of academia's strongest currencies. For a journal, impact is an important asset which ensures that the scientific community, libraries, and academic researchers all continue to pay attention to its publications. Low impact journals face the risk of being removed from scientific indexes and losing interest from the community. For authors, the impact directly affects their visibility and prestige, which can, in turn, directly affect their research career.
https://doi.org/10.1007/s11042-023-14451-9 * Jorge Chamorro-Padial jorgechp@correo.ugr.es Rosa Rodríguez-Sánchez rosa@decsai.ugr.es
Articles published in high-impact journals will presumably have a more significant impact than articles published in low-impact journals. Quality is another critical asset along with impact. We consider a manuscript's quality in terms of originality, importance, soundness of theory, and verified conclusions. From the point of view of a journal, high-quality articles have a greater probability of attracting the scientific community's interest. Thus, journals tend to define measures in order to select articles of the highest possible quality [7]. When a journal receives an article, the manuscript is often initially checked to determine if a minimum quality is met. If not, the article is promptly rejected through a desk decision. If the article meets the journal's minimum quality standards, it is usually sent to the next step, peer review [11].
In this paper, we want to offer support to the peer-review process by providing a tool that can be useful for training and improving the skills of authors, reviewers, and editors while imparting valuable knowledge:
For authors: We propose that our tool can help authors recognize their ability to distinguish the quality of a manuscript. This knowledge can be useful when choosing a journal for their manuscripts. Also, authors tend to suffer from confirmation bias, which leads to overconfidence as authors believe their manuscripts are of a higher quality regardless of what the actual, objective quality is [18]. The proposed tool can help authors correct this bias by comparing their choices about the manuscript's quality (Low, Medium, High quality) with its actual quality.
For reviewers: We want to give them feedback about their ability to identify an article's quality. For example, a reviewer who incorrectly matches the quality of a manuscript and the quality standard of a journal would incur a cost to the journal [6].
For editors: Knowing their skill level when identifying the quality of a manuscript can also be useful. For example, a desk decision can save effort and time for reviewers and authors if the editor believes that the manuscript's quality does not match the journal's quality standards [8,17].
In order to do all of this, we built a web training system where users can review information about different real papers and decide what their quality is while receiving direct feedback about their decisions. On our website, authors are presented with only the key elements of an article (abstract, title, and keywords). During the training process, their time per response is measured. From this we wanted to answer the following questions:
1. What is the level of uncertainty that authors have about the quality of a manuscript with respect to its actual quality?
2. Do the Title, the Abstract, and the Keywords contain enough information to determine the quality of a manuscript?
3. What is the average time that an author spends reviewing the key information of a manuscript?
We assumed that an article's actual quality matched the impact category (Low, Medium, or High) of the journal where the manuscript was published.
Our work is structured in the following manner: The last section of our work presents the critical information from our paper.

State of the art and related works
Peer review is a standard quality control procedure that is part of a consensus-seeking scientific discussion on quality assurance [13]. During the peer review stage, an editor selects reviewers who are expected to read the candidate manuscript and give a critical assessment regarding the work's quality. Peer review acts as an editor's source of knowledge to help them decide if a manuscript should be accepted, rejected, or returned to the author(s) with corrections [3]. Manuscripts relevant to the journal's scope, which are innovative and well written, have a higher probability of being accepted [6].
In this context, authors will want to send their manuscript to as high an impact journal as possible, while journals will want to publish articles of the highest quality possible. From the perspective of an author, if he or she sends a high-quality manuscript to a low-impact journal there will be a substantial cost in terms of lack of visibility. It is important to mention that journal impact factor is a strong predictor of the number of citations [2]. Nevertheless, sending a low-quality manuscript to a high-impact journal increases the risk of rejection, which would affect the time and effort spent trying to publish, as well as the author's motivation. Writing a manuscript for a high-impact peer-reviewed journal can be a challenging and frustrating experience. For example, [22] concludes that "the authors do few manuscript submissions prior to journal acceptance, most commonly by lower impact factor journals".
In [18] the authors analyzed the evolutionary game derived from journal quality controls. An author produces low or high-quality manuscripts which are then submitted to journals who accept manuscripts of different qualities with a certain probability. The authors also identified different strategies and their survival chances according to evolutionary games. These strategies are based on the concept of authors' and editors' quality profiles. An author's profile is based on the probability of an author submitting articles of a certain quality (low or high). In contrast, an editor's profile is based on the frequency with which an editor accepts articles with a specific quality (low or high).
A vast majority of authors still feel the need to enhance their skills in popular science writing [16]. Nowadays, the author can use different tools that can help in the process of writing a manuscript. Among these tools, we can distinguish Jasper and Hemingway Editor. Jasper uses AI to help write different parts of the manuscript. For its part, Hemingway helps the author highlight problems with their writing. Its goal is to make complex sentences easier. However, while those tools help to write a manuscript, the author must have the ability to recognize the quality of the manuscript.
Different works focus on the author's perception of the manuscript's quality. This topic is important to analyze since, according to [15], authors considered the absolute impact factor of the journal, the match between the perceived "quality" of their study, and the journal impact factor to be the three most important factors when they have to submit a manuscript.
In [19], concrete suggestions for improving the perception of a paper in the readers' minds are presented. Also, [23] proposed a pilot study to evaluate a method of teaching neurology residents the basic concepts of biostatistics, research methodology, and review of scholarly literature by employing a program of peer-reviewed scientific manuscripts.
Selecting a journal is not always without problems, as authors can suffer from having a flawed perception about their article's quality. Additionally, reviewers can have imperfect knowledge or bias when determining the quality of a reviewed work. If authors can distinguish the actual quality or impact of a manuscript, they have a fine partition. Conversely, if they cannot distinguish the actual quality of an article, they have a coarse partition. Here, a partition is defined as a map between the author's perception of the manuscript's quality and the actual impact of the manuscript. If the author's perception coincides with reality, we can say that they have a fine partition. The actual impact of a manuscript can be measured by the impact of the journal where it was published. We also used the author's profile as one of the possible indicators of the quality of a manuscript given the author's partition. For example, suppose an author cannot distinguish between a low, medium, and high impact manuscript (they have a coarse partition) when the author has to evaluate a manuscript's impact. In that case, he or she would have three profiles: low, medium, or high impact. On the contrary, if an author has a fine partition, they could have 27 possible profiles. The perception of the quality of a manuscript depends on the author's partition and the distribution of articles over three different categories (High impact, Medium impact, and Low impact).
The same concept is applied to reviewers [5]. The quasi-species model inspires our work to determine the evolution of authors' profiles after the peer-review process. This model was intended to represent the Darwinian evolution of self-replicating entities when a high mutation rate occurs [12,20]. According to this model, a quasi-species is a big group, or cloud, of genotypes in an environment where their descendants will have a high probability of mutation. The evolutionary success of a quasi-species strongly depends on the replication rates of clouds. In [5] the authors adapted the quasi-species model from biology to the author-editor game's evolutionary environment. Self-replicating entities are submission profiles under a given partition of manuscript categories. Errors produce profile mutations, and only submission profiles with high replication rates survive.
Peer review is not exempt from criticism and deficiencies [10,11], but nowadays it is one of the scientific community's essential tools to validate and improve the quality of science. Every year, about 13.7 million reviews are done in the academic ecosystem for a total of 3 million scientific articles [9,21]. Additionally, peer-review is an indicator of prestige and confidence for journals and authors [2,14]. Ultimately, we can say that peer-review is a crucial element of the science of today, and it is necessary to continue to improve it by raising the skill of all actors involved in the process.

Model
As described in the introduction, an author submits a manuscript that can have different levels of quality. In this paper, we define three different manuscript categories, $S = \{s_1, s_2, s_3\}$, with $s_1$ being a low-quality manuscript, $s_2$ a medium-quality manuscript, and $s_3$ a high-quality manuscript. Likewise, we define three different journal impacts, $I = \{i_1, i_2, i_3\}$, with $i_1$ being a low-impact journal, $i_2$ a medium-impact journal, and $i_3$ a high-impact journal. The action of sending an article to a journal can be seen as optimal or non-optimal. For example, if an author sends a low-quality article to a high-impact journal, it is very likely to get a rejection. In that case, the author has lost time and effort, so it is considered a non-optimal action. Additionally, if an author sends a high-quality article to a low-impact journal, the author is paying the price in terms of visibility, prestige, and impact, which is also a non-optimal action. For $s_j \in S$ we can define an optimal action as follows:
$$i^*(s_j) = i_j, \quad j \in \{1, 2, 3\}$$
With $i^*(s_1)$ being the optimal action of sending a low-quality article to a low-impact journal.
With $i^*(s_2)$ being the optimal action of sending a medium-quality article to a medium-impact journal.
With $i^*(s_3)$ being the optimal action of sending a high-quality article to a high-impact journal.
Every action gives a score to the author. In our model, the result for non-optimal actions is 0, while the optimal action score is 1. We define the reward function, $\pi_i(s_j)$ for $i \in I$ and $s_j \in S$, as follows:
$$\pi_i(s_j) = \begin{cases} 1 & \text{if } i = i^*(s_j) \\ 0 & \text{otherwise} \end{cases} \tag{1}$$
Every author has a different ability to identify the quality of an article. We formally represent the distinctive capabilities of authors as partitions. Every author uses a particular partition. If an author can distinguish between low, medium, and high-quality articles, then the author has a fine partition, $K_F = \{\{s_1\}, \{s_2\}, \{s_3\}\}$. If an author does not distinguish between any type of quality, then the author uses a coarse partition, $K_C = \{\{s_1, s_2, s_3\}\}$. Between these two extremes, we can also identify other partitions:
$K_{F1} = \{\{s_1\}, \{s_2, s_3\}\}$: The author can identify low-quality articles but cannot distinguish between medium and high-quality articles.
$K_{F2} = \{\{s_2\}, \{s_1, s_3\}\}$: The author can identify medium-quality articles but cannot distinguish between low and high-quality articles.
$K_{F3} = \{\{s_3\}, \{s_1, s_2\}\}$: The author can identify high-quality articles but cannot distinguish between low and medium-quality articles.
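The reward function and the five partitions can be sketched in a few lines of Python. This is only an illustrative sketch, not the code behind the website: the names `Level`, `optimal_action`, `reward`, `PARTITIONS`, and `is_valid_partition` are hypothetical, and the labels `K_F1`, `K_F2`, and `K_F3` are assumed to follow the order in which the intermediate partitions are listed (low, medium, high).

```python
from enum import IntEnum


class Level(IntEnum):
    """Shared scale for manuscript quality (s_1..s_3) and journal impact (i_1..i_3)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def optimal_action(quality: Level) -> Level:
    """Optimal action: send a manuscript to the journal impact matching its quality."""
    return quality


def reward(impact: Level, quality: Level) -> int:
    """Reward: 1 for the optimal action, 0 for any other action."""
    return 1 if impact == optimal_action(quality) else 0


# Each partition is a list of blocks; categories in the same block
# cannot be told apart by the author. Labels are assumed, see lead-in.
PARTITIONS = {
    "K_F": [{Level.LOW}, {Level.MEDIUM}, {Level.HIGH}],
    "K_F1": [{Level.LOW}, {Level.MEDIUM, Level.HIGH}],
    "K_F2": [{Level.MEDIUM}, {Level.LOW, Level.HIGH}],
    "K_F3": [{Level.HIGH}, {Level.LOW, Level.MEDIUM}],
    "K_C": [{Level.LOW, Level.MEDIUM, Level.HIGH}],
}


def is_valid_partition(blocks) -> bool:
    """Check that the blocks are disjoint and together cover all three categories."""
    seen = [level for block in blocks for level in block]
    return len(seen) == len(set(seen)) and set(seen) == set(Level)


if __name__ == "__main__":
    print(reward(Level.HIGH, Level.HIGH))  # optimal action
    print(reward(Level.HIGH, Level.LOW))   # non-optimal action
```

Representing each partition as a list of sets keeps the "cannot distinguish" relation explicit: two categories are indistinguishable exactly when they share a block.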
Using a partition is the basic knowledge that an author has to decide what the potential (impact) of a manuscript would be. Knowledge is also gained from good and bad experiences when submitting manuscripts to different journals. This additional knowledge allows authors to have informed opinions about where to submit an article with a certain level of quality. This extra information makes up part of the submission profile of an author. For every category in an author partition, there is a corresponding submission pattern. For example, an author who uses a fine partition has a submission profile consisting of three different submission patterns. An example of a submission profile for a fine partition is (L − I, L − I, H − I), where an author can identify low, medium, and high-quality articles but sends low and medium-quality manuscripts to low-impact journals, while high-quality manuscripts are sent to high-impact journals. Table 1 summarizes the number of submission profiles per partition, while Table 2 and Table 3 describe the submission profiles for partitions $K_F = \{\{s_1\}, \{s_2\}, \{s_3\}\}$ and $K_C = \{\{s_1, s_2, s_3\}\}$, respectively.
This paper would also like to apply certain concepts inspired by the quasi-species model [5,12].
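Since a submission profile assigns one journal impact level to each block of the partition, the number of profiles for a partition with n blocks is 3^n. The following sketch enumerates them with `itertools.product`. The function name `submission_profiles` and the `BLOCK_COUNTS` table are hypothetical, and the block counts for the intermediate partitions assume that each of them has exactly two blocks.

```python
from itertools import product

# Journal impact levels an author can choose for each block of a partition.
IMPACTS = ("L-I", "M-I", "H-I")


def submission_profiles(n_blocks: int):
    """Enumerate all submission profiles: one impact level per partition block."""
    return list(product(IMPACTS, repeat=n_blocks))


# Number of blocks per partition (assumed: fine = 3, intermediate = 2, coarse = 1).
BLOCK_COUNTS = {"K_F": 3, "K_F1": 2, "K_F2": 2, "K_F3": 2, "K_C": 1}

if __name__ == "__main__":
    for name, n_blocks in BLOCK_COUNTS.items():
        print(name, len(submission_profiles(n_blocks)))
```

For the fine partition this gives the 27 profiles mentioned in the text, and for the coarse partition it gives 3.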
For each author, we compute the probability of each partition as follows: Where a, b, and c are elements of the set {LOW, MEDIUM, HIGH}, s ∈ S and i ∈ I. Remember that I is the set of categories for the actual impact of a manuscript, and S is the set of categories for an author's perception of the manuscript's quality.
Finally, the most probable partition is assigned to the author. Once the partition is established, we can compute each submission profile score for the selected partition by considering the frequency with which a set of authors produce manuscripts of each category. For example, for the $K_F$ partition, the best submission profile will always be (L-I, M-I, H-I), giving the best possible score for an author. For $K_{F1} = \{\{s_1\}, \{s_2, s_3\}\}$, it is necessary to decide between the submission profiles (L-I, M-I) or (L-I, H-I) according to the occurrence frequency for medium and high impact manuscripts.
Let $\pi^{(i)}(K)$ be the reward of submission profile $(i)$ under partition $K$, given as:
$$\pi^{(i)}(K) = \sum_{s \in S} f_s \, \pi_i(s)$$
with $S = \{s_1, s_2, s_3\}$ being the set of manuscript categories; $f_s$ being the frequency of manuscript category $s$; and $\pi_i(s)$ being the reward function under submission profile $(i)$ for manuscript category $s$, as defined in Eq. 1. Then, we denote
$$\left(\pi^{(1)}(K), \ldots, \pi^{(|P|)}(K)\right)$$
as the reward vector of submission profiles under partition $K$. Among all the author's profiles, we define the best profile as the one with the highest score:
$$\arg\max_{(i) \in P} \pi^{(i)}(K)$$
with $P$ being the set of possible profiles for the partition $K$.
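As a rough illustration of the profile score and the choice of the best profile, the sketch below sums the category frequencies over the categories that a profile sends to their matching journal impact. The encoding of categories and impacts as the integers 1 to 3, and the names `profile_reward` and `best_profile`, are assumptions made for this example. The frequencies are the ones quoted in the worked example of the Procedure section.

```python
from itertools import product

# Categories and impacts share the scale 1 = low, 2 = medium, 3 = high.
IMPACTS = (1, 2, 3)


def profile_reward(partition, profile, freq):
    """Score of a profile: total frequency of categories sent to their optimal impact.

    partition: list of blocks, each block a list of category numbers
    profile:   one impact level per block, in the same order as the blocks
    freq:      mapping from category number to its frequency f_s
    """
    total = 0.0
    for block, impact in zip(partition, profile):
        for category in block:
            if impact == category:  # optimal action, reward 1; otherwise reward 0
                total += freq[category]
    return total


def best_profile(partition, freq):
    """Return the submission profile with the highest score for the partition."""
    candidates = product(IMPACTS, repeat=len(partition))
    return max(candidates, key=lambda p: profile_reward(partition, p, freq))


if __name__ == "__main__":
    freq = {1: 0.158, 2: 0.684, 3: 0.158}
    k_c = [[1, 2, 3]]       # coarse partition
    k_f = [[1], [2], [3]]   # fine partition
    print(best_profile(k_c, freq), profile_reward(k_c, best_profile(k_c, freq), freq))
    print(best_profile(k_f, freq), profile_reward(k_f, best_profile(k_f, freq), freq))
```

With these frequencies the coarse partition selects the medium-impact profile and the fine partition selects the (low, medium, high) profile, which agrees with the examples given in the text.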

Experimental setup
To apply our model, we deployed our website (https://blackcat.ugr.es/quasispecies/) with the aim of improving authors' skills when identifying the quality of manuscripts and letting them know their most probable partition as well as their recommended submission profile, according to the responses provided. The website was built using a combination of TypeScript, HTML, and CSS with the Angular framework. Our website is connected to a server written in Python through a RESTful API. The webpage is responsive, so participants can use it on either a computer or a smartphone. Figures 1 and 2 show screenshots from the website. The supplemental material of this work contains screenshots of each section on the website, together with an explanation for each one.

Participants
To test our proposed model, we asked 106 participants to register on our website and classify a minimum of 15 random articles. We needed our participants to have experience in reading and working with scientific literature, so we asked them to have, at least, a bachelor's degree. In addition, our dataset consisted of computer science articles, so working in or having experience in an IT related area was another requirement. Individuals in our experiment came from two different sources:
86 participants worked in IT jobs.
20 participants were authors from Computer Sciences journals.

Materials
We created a dataset of articles published in 2019 in JCR-indexed journals, specifically journals from the Computer Science category, sub-area Artificial Intelligence. The dataset contained 21,799 articles. For each article, the dataset included information regarding the title, keywords, abstract, and publishing journal. Concerning the journals, the dataset contained information about the journal's title, impact factor, and tertile level. The dataset is published on Kaggle [4].

Design
In our experiment, we wanted to identify the partition and the submission profile of an author. For that purpose, we used the reward obtained by an author when sending a manuscript to a journal, $\pi_i(s_j)$. This reward is inferred from the user responses on our website. The user must decide by reading only limited information about the article (title, abstract, and keywords). When users are in the training section, the articles they have to review are selected randomly. Users have to infer the article's quality and then decide which impact level journal they would submit the manuscript to. In order to establish a better correlation between qualities (low, medium, and high quality) and impact, we have assigned three different impacts to journals in our dataset (high, medium, and low impact). According to the JCR index, this impact is in line with the journal's impact factor during the year 2019. In this sense, journals in the first tertile are considered high impact journals, journals in the second tertile are considered medium-impact journals and journals in the third tertile are defined as low impact ones.
Fig. 2 The quasi-species peer review website. An example of a stats session. The author can check their results in the form of a confusion matrix. Additionally, they can see their partition and the recommended submission profile
Concerning the quality of articles from the dataset, articles from high impact journals are considered high-quality manuscripts. Articles from a medium impact journal are considered medium-quality manuscripts, and, finally, articles from a low impact journal are considered low-quality manuscripts.
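The labelling rule described above can be sketched as follows. The dataset already stores each journal's tertile, so `assign_tertiles` is only an assumed illustration of how tertiles could be derived by ranking journals by impact factor and splitting the ranking into thirds; the exact rule used by JCR may differ. The journal names and impact factors are hypothetical.

```python
# Tertile 1 is the top third of journals by impact factor.
TERTILE_TO_LEVEL = {1: "high", 2: "medium", 3: "low"}


def assign_tertiles(impact_factors):
    """Rank journals by descending impact factor and split the ranking into thirds.

    impact_factors: mapping from journal name to impact factor (hypothetical data).
    Returns a mapping from journal name to tertile (1, 2, or 3).
    """
    ranked = sorted(impact_factors, key=impact_factors.get, reverse=True)
    n = len(ranked)
    return {journal: position * 3 // n + 1 for position, journal in enumerate(ranked)}


def article_quality(journal_tertile):
    """Quality label assumed for every article published in a journal of that tertile."""
    return TERTILE_TO_LEVEL[journal_tertile]


if __name__ == "__main__":
    # Hypothetical journals and impact factors, for illustration only.
    journals = {"A": 9.1, "B": 6.3, "C": 4.0, "D": 3.2, "E": 1.5, "F": 0.8}
    tertiles = assign_tertiles(journals)
    for name, tertile in tertiles.items():
        print(name, tertile, article_quality(tertile))
```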

Procedure
The user experience on the website is as follows:
1. The user signs up to the system.
2. After the signup process, users enter the Training section, where the papers are displayed, and a submitting decision must be made (see Fig. 1). The user can skip the manuscript if they are unable to make a decision.
3. After submitting a minimum of 15 articles, the user can access the Stats section (see Fig. 2). In this section, they can see their partition type and recommended submission profile.
4. The user can go back to the Training section and keep training if they wish.
There is detailed information about the website's interface in the supplemental document.
From the user responses, we computed two confusion matrices for each user, MA and MR. These matrices contain the same information about the users' responses but contain, respectively, absolute and relative results. MA is only used to provide additional information to the user in the Stats section, while MR is used to compute the partition type and the submission profile following the model described in the Model section. Tables 4 and 5 are examples of MA and MR.
The first step is to calculate the probabilities of each type of partition. With this information, the partition with the highest score in this example is $K_C$, which determines the author's partition type. The second step is to compute the frequencies for each category of manuscripts, which are needed to compute the score by considering the author's behavior. The third step is to compute the score for each possible submission profile. For $K_C$, the available submission profiles are described in Table 3. Finally, we selected the best profile, which is (M − I). This profile is the one that the author should follow in order to increase their score.
While the experiment was taking place, the webpage measured the response time for each article.
Table 6 illustrates an example of a user with a Fine Partition, with their partition probabilities as follows: $P(K_{F1}) = 0.267$, $P(K_{F2}) = 0.366$, $P(K_{F3}) = 0.366$, $P(K_C) = 0.105$, and $P(K_F) = 0.890$. To compute the submission profile, we used the same category frequencies $f_s$ as the example in Table 4: $f_1 = 0.158$, $f_2 = 0.684$, $f_3 = 0.158$. For $K_F$, we have 27 different submission profiles, with (L-I, M-I, H-I) being the best one, with a score of 1.0.
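The two confusion matrices can be illustrated with the sketch below. The layout is an assumption, since the text does not specify it: rows are taken to be the actual category, columns the user's choice, and MR is obtained by dividing each row of MA by its row total. The function names, the derivation of category frequencies from row totals, and the response counts are hypothetical; the counts were chosen only so that the resulting frequencies round to the values quoted in the example.

```python
LEVELS = ("low", "medium", "high")


def confusion_matrices(responses):
    """Build absolute (MA) and relative (MR) confusion matrices.

    responses: iterable of (actual_level, chosen_level) pairs.
    Assumed layout: rows = actual category, columns = user's choice.
    """
    index = {level: k for k, level in enumerate(LEVELS)}
    ma = [[0] * len(LEVELS) for _ in LEVELS]
    for actual, chosen in responses:
        ma[index[actual]][index[chosen]] += 1
    mr = []
    for row in ma:
        total = sum(row)
        # Row-normalise; an empty row stays at zero to avoid dividing by zero.
        mr.append([count / total if total else 0.0 for count in row])
    return ma, mr


def category_frequencies(ma):
    """Share of reviewed articles that belong to each actual category."""
    n = sum(sum(row) for row in ma)
    return [sum(row) / n for row in ma]


if __name__ == "__main__":
    # Hypothetical session: 3 low, 13 medium, and 3 high articles.
    responses = (
        [("low", "low")] * 2 + [("low", "medium")]
        + [("medium", "medium")] * 10 + [("medium", "high")] * 3
        + [("high", "high")] * 2 + [("high", "medium")]
    )
    ma, mr = confusion_matrices(responses)
    print(ma)
    print([round(f, 3) for f in category_frequencies(ma)])
```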

Results
As stated in the Participants section, 106 participants used our webpage and simulated the submission of at least 15 articles according to the articles' perceived quality.
Once all participants had finished their task, we extracted the different partitions and submission profiles obtained from them. The results are shown in Table 7. In order to check the significance of these results, we performed different analyses. Firstly, a one-way ANOVA was carried out. We grouped participant responses by partition and extracted the average number of correct answers. ANOVA results are described in Table 8. From these results, we can determine that it is very probable that the mean of at least one group differs significantly from the others.
The second step in our analysis was to evaluate the relationships between different groups by performing a Tukey HSD test in order to determine whether the means from each group were significantly different. Table 9 illustrates the Tukey HSD p values obtained. Most comparisons have a p value lower than 0.01 and may be considered significant. We can see that the $K_F$ and $K_C$ groups are different from the rest of the groups. Significant differences were not found between $K_{F1}$, $K_{F2}$, and $K_{F3}$. Similar to the analysis performed with answers from participants, we studied the significant differences between times per response. A one-way ANOVA test and a Tukey HSD test were performed. Results from these tests show that there were significant differences between individuals with a $K_C$ partition and the rest of the groups, but no differences were found between the other groups. Results are described in Tables 10 and 11.
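To make the first test concrete, the sketch below computes the one-way ANOVA F statistic with the Python standard library only. It is not the analysis code used for the paper, and the group values are hypothetical numbers chosen so the arithmetic is easy to follow. In practice, a statistics library would normally be used to obtain the p value and the pairwise comparisons; SciPy, for instance, offers `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df) for a one-way ANOVA.

    groups: list of lists, one list of observations per group
            (for example, correct answers per participant, grouped by partition).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical scores for three groups of participants.
    groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
    print(one_way_anova_f(groups))
```

A large F statistic only indicates that at least one group mean differs; a post hoc test such as Tukey HSD is then needed to find which pairs of groups differ.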

Discussion
In this paper, we tried to answer three questions. With respect to the first one, we can say that most participants in our experiment (76.4%) had the ability to distinguish between low, medium, and high-quality manuscripts (making minor mistakes) and only a minority of individuals were unable to distinguish the quality of a manuscript.
The Fine Partition, $K_F$, was the most common in our experiment (27.4%), followed by $K_{F3}$ (23.6%). $K_C$, the coarse partition, was assigned to only 9.4%. Having a fine partition means that the participants can distinguish between low, medium, and high-quality articles. $K_{F1}$ was also a frequent partition (22.6%).
Sometimes, it can be difficult to distinguish between low and medium or high and medium articles. However, this type of error is less critical than confusing a high-quality article with a low-quality one. We can say that most participants (about 73.6%) had $K_F$, $K_{F3}$, or $K_{F1}$ partitions, which means that they were able to distinguish between different types of manuscripts according to their quality while, at the same time, authors in these partitions could differentiate between low- and high-quality documents.
With respect to the second question posed in the paper, results from the experiment also mean that, for experienced users, the amount of information used in our research (Title, Abstract, and Keywords) was enough for them to achieve a quality perception that was quite close to the actual quality of an article. Participants were required to have a bachelor's degree and working experience in IT while some of them were also authors for computer science journals. So, it is likely that most of them had some experience in reading and understanding scientific documents. It would be worthwhile to research more users that have a variety of backgrounds to check their abilities as well. In addition, for future research, we would like to introduce additional datasets to our website in order to be more helpful to researchers from different fields of knowledge.
Regarding the third question raised in our paper, we can say that an experienced author spends about 29 seconds reviewing the Title, Abstract, and Keywords from an article and deciding the quality of a manuscript. The median time to do so is about 20 seconds. Nevertheless, a high standard deviation was observed. Identifying the quality and the potential impact of an article in less than one minute can save a significant amount of time and effort for authors, who do not always have access to the full document in order to decide whether the manuscript would fit their needs or not.

Conclusions
In our paper, we proposed a tool to give authors accurate information about their skills when recognizing the potential of a manuscript and their recommended submission profile. We also wanted to know whether a minimal amount of information, consisting of only the Title, the Abstract, and the Keywords, would be enough for a researcher to determine the article's quality and to know how much time would be required to score the article.
For the purpose of our research, we designed a website where researchers could test their abilities by evaluating article information and sending it to a journal from one of three impact types. After designing and launching our website, we ran an experiment where 106 experienced users classified at least 15 articles. The experiment results indicate that most of them were able to determine the quality of classified articles accurately. Evaluating an article took an average of 29 seconds. In the light of the results achieved, we can say that the Title, Abstract, and Keywords provide, in most cases (90.6% according to results from the experiment), enough information to identify, at least, one of the three quality categories defined in this work (low, medium, or high quality).
Future research must test the website with non-experienced users and compare their results with the experienced users' group. Furthermore, we would like to add articles from different fields of knowledge to analyze researchers' behavior according to their varying backgrounds. Finally, the minimum amount of information necessary to accurately score the impact of a manuscript is also an open issue requiring further investigation.