1 Introduction

In academia, authors need to publish their results in order to make them available to the public, and manuscripts are one of the most important vehicles for doing so. The manuscript publishing process can be simplified into the following steps: 1) authors write a manuscript and send it to a journal, 2) the journal reviews the manuscript to check whether its quality standards are met, and 3) after the review process, the manuscript is either published or rejected. A manuscript becomes a paper once it passes the review process and any required corrections or modifications are made.

Usually, there are plenty of candidate journals where a manuscript could suitably fit according to its topic(s) and field of knowledge. Authors must therefore select the journal where they would prefer to have their work published. Typically, every candidate journal has a different level of impact. Impact is described in [1] as one of academia's strongest currencies. For a journal, impact is an important asset which ensures that the scientific community, libraries, and academic researchers all continue to pay attention to its publications. Low-impact journals face the risk of being removed from scientific indexes and losing the community's interest. For authors, impact directly affects their visibility and prestige, which can, in turn, directly affect their research career.

Articles published in high-impact journals will presumably have a more significant impact than articles published in low-impact journals. Quality is another critical asset alongside impact. We consider a manuscript's quality in terms of originality, importance, soundness of theory, and verified conclusions. From the point of view of a journal, high-quality articles have a greater probability of attracting the scientific community's interest. Thus, journals tend to define measures in order to select articles of the highest possible quality [7]. When a journal receives an article, the manuscript is often initially checked to determine whether it meets a minimum quality; if not, the article is promptly rejected via a desk decision. If the article meets the journal's minimum quality standards, it is usually sent to the next step, peer review [11].

In this paper, we want to support the peer-review process by providing a tool that can be used for training and improving the skills of authors, reviewers, and editors while imparting valuable knowledge:

  • For authors: We propose that our tool can help authors recognize their ability to distinguish the quality of a manuscript. This knowledge can be useful when choosing a journal for their manuscripts. Authors also tend to suffer from confirmation bias, which leads to overconfidence: authors believe their manuscripts are of a higher quality regardless of the actual, objective quality [18]. The proposed tool can help authors correct this bias by comparing their judgments about a manuscript's quality (low, medium, or high) with its actual quality.

  • For reviewers: We want to give them feedback about their ability to identify an article's quality. For example, a reviewer who incorrectly matches the quality of a manuscript to the quality standard of a journal imposes a cost on the journal [6].

  • For editors: Knowing their skill level at identifying the quality of a manuscript can also be useful for editors. For example, a desk decision can save effort and time for reviewers and authors when the editor determines that the manuscript's quality does not match the journal's quality standards [8, 17].

In order to do all of this, we built a web training system (Footnote 1) where users can review information about different real papers and decide what their quality is while receiving direct feedback about their decisions. On our website, authors are presented with only the key elements of an article (title, abstract, and keywords). During the training process, their time per response is measured. From this, we wanted to answer the following questions:

  1. What is the level of uncertainty that authors have about the quality of a manuscript with respect to its actual quality?

  2. Do the Title, the Abstract, and the Keywords contain enough information to determine the quality of a manuscript?

  3. What is the average time that an author spends reviewing the key information of a manuscript?

We assumed that an article’s actual quality matched the impact category (Low, Medium, or High) of the journal where the manuscript was published.

Our work is structured in the following manner:

  • State of the art and related works: Reviews the current state of the art and related research in the scientific literature.

  • Model: Describes our model.

  • Methodology: Explains the methodology of our experiment, the website, and the dataset used.

  • Results: Reports the results obtained from users who used our website.

  • Discussion: In this section, we perform a more detailed analysis of the results achieved.

  • Conclusions: The last section of our work summarizes the key findings of our paper.

2 State of the art and related works

Peer review is a standard quality control procedure that is part of a consensus-seeking scientific discussion on quality assurance [13]. During the peer review stage, an editor selects reviewers who are expected to read the candidate manuscript and give a critical assessment of the work's quality. Peer review acts as an editor's source of knowledge to help them decide whether a manuscript should be accepted, rejected, or returned to the author(s) for corrections [3]. Manuscripts that are relevant to the journal's scope, innovative, and well written have a higher probability of being accepted [6].

In this context, authors want to send their manuscripts to the highest-impact journal possible, while journals want to publish articles of the highest possible quality. From the author's perspective, sending a high-quality manuscript to a low-impact journal carries a substantial cost in terms of visibility; notably, journal impact factor is a strong predictor of the number of citations [2]. Conversely, sending a low-quality manuscript to a high-impact journal increases the risk of rejection, which costs the author time, effort, and motivation. Writing a manuscript for a high-impact peer-reviewed journal can be a challenging and frustrating experience. For example, [22] concludes that “the authors do few manuscript submissions prior to journal acceptance, most commonly by lower impact factor journals”.

In [18], the authors analyzed the evolutionary game derived from journal quality controls. An author produces low- or high-quality manuscripts, which are then submitted to journals that accept manuscripts of different qualities with a certain probability. The authors also identified different strategies and their survival chances according to evolutionary game theory. These strategies are based on the concept of authors' and editors' quality profiles. An author's profile is based on the probability of the author submitting articles of a certain quality (low or high), while an editor's profile is based on the frequency with which the editor accepts articles of a specific quality (low or high).

A vast majority of authors still feel the need to enhance their skills in popular science writing [16]. Nowadays, authors can use different tools that help in the process of writing a manuscript. Among these tools, we can highlight Jasper (Footnote 2) and Hemingway Editor (Footnote 3). Jasper uses AI to help write different parts of the manuscript, while Hemingway highlights problems with the author's writing, with the goal of making complex sentences easier to read. However, while these tools help to write a manuscript, the author must still be able to recognize the quality of the manuscript.

Several works focus on the author's perception of the manuscript's quality. This topic is important to analyze since, according to [15], the absolute impact factor of the journal, the match between the perceived “quality” of the study, and the journal impact factor were considered the three most important factors for authors when submitting a manuscript.

In [19], concrete suggestions for improving the perception of a paper in readers' minds are presented. Also, [23] proposed a pilot study to evaluate a method of teaching neurology residents the basic concepts of biostatistics, research methodology, and review of scholarly literature by employing a program of peer-reviewed scientific manuscripts.

Selecting a journal is not always straightforward, as authors can have a flawed perception of their article's quality. Additionally, reviewers can have imperfect knowledge or bias when determining the quality of a reviewed work. Here, a partition is defined as a map between the author's perception of a manuscript's quality and the manuscript's actual impact, where the actual impact is measured by the impact of the journal in which it was published. If authors can distinguish the actual quality or impact of a manuscript, they have a fine partition; if they cannot, they have a coarse partition. We also used the author's profile as one of the possible indicators of the quality of a manuscript given the author's partition. For example, suppose an author cannot distinguish between low-, medium-, and high-impact manuscripts (a coarse partition). In that case, when evaluating a manuscript's impact, he or she has only three possible profiles: low, medium, or high impact. On the contrary, an author with a fine partition has 27 possible profiles. The perception of the quality of a manuscript depends on the author's partition and the distribution of articles over the three categories (High impact, Medium impact, and Low impact).

The same concept applies to reviewers [5]. Our work is inspired by the quasi-species model, which we use to determine the evolution of authors' profiles after the peer-review process. This model was intended to represent the Darwinian evolution of self-replicating entities under a high mutation rate [12, 20]. According to this model, a quasi-species is a large group, or cloud, of genotypes in an environment where descendants have a high probability of mutation. The evolutionary success of a quasi-species strongly depends on the replication rates of these clouds. In [5], the authors adapted the quasi-species model from biology to the author-editor game's evolutionary environment: self-replicating entities are submission profiles under a given partition of manuscript categories, errors produce profile mutations, and only submission profiles with high replication rates survive.

Peer review is not exempt from criticism and deficiencies [10, 11], but nowadays it is one of the scientific community’s essential tools to validate and improve the quality of science. Every year, about 13.7 million reviews are done in the academic ecosystem for a total of 3 million scientific articles [9, 21]. Additionally, peer-review is an indicator of prestige and confidence for journals and authors [2, 14]. Ultimately, we can say that peer-review is a crucial element of the science of today, and it is necessary to continue to improve it by raising the skill of all actors involved in the process.

3 Model

As described in the introduction, an author submits a manuscript that can have different levels of quality. In this paper, we define three manuscript categories, S = {s1, s2, s3}, with s1 being a low-quality manuscript, s2 a medium-quality manuscript, and s3 a high-quality manuscript. Likewise, we define three journal impacts, I = {Low-impact, Medium-impact, High-impact}. The action of sending an article to a journal can be seen as optimal or non-optimal. For example, if an author sends a low-quality article to a high-impact journal, it is very likely to be rejected; the author has lost time and effort, so this is a non-optimal action. Likewise, if an author sends a high-quality article to a low-impact journal, the author pays a price in terms of visibility, prestige, and impact, which is also a non-optimal action. For sj ∈ S, we define the optimal action as follows:

$${i}^{*}\left({s}_1\right)=\text{Low-impact}$$

With i*(s1) being the optimal action of sending a low-quality article to a low-impact journal.

$${i}^{*}\left({s}_2\right)=\text{Medium-impact}$$

With i*(s2) being the optimal action of sending a medium-quality article to a medium-impact journal.

$${i}^{*}\left({s}_3\right)=\text{High-impact}$$

With i*(s3) being the optimal action of sending a high-quality article to a high-impact journal.

Every action gives a score to the author. In our model, the score for a non-optimal action is 0, while the score for the optimal action is 1. We define the reward function πi(sj), for i ∈ I and sj ∈ S, as follows:

$${\pi}_i\left({s}_j\right)=\begin{cases}1 & i={i}^{*}\left({s}_j\right)\\ 0 & i\ne {i}^{*}\left({s}_j\right)\end{cases}$$
(1)
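For concreteness, the optimal action and the reward of Eq. 1 can be expressed in a few lines of Python. This is a minimal sketch with our own naming conventions, not the system's actual code:

```python
# Minimal sketch of the optimal action i*(s_j) and the reward of Eq. 1.
QUALITIES = ["low", "medium", "high"]              # s1, s2, s3
IMPACTS = ["Low-impact", "Medium-impact", "High-impact"]

# i*(s_j): submit each manuscript to the journal tier matching its quality.
OPTIMAL = dict(zip(QUALITIES, IMPACTS))

def reward(impact: str, quality: str) -> int:
    """pi_i(s_j): 1 if the chosen impact is the optimal action for the quality, else 0."""
    return 1 if impact == OPTIMAL[quality] else 0

assert reward("High-impact", "high") == 1          # optimal action
assert reward("High-impact", "low") == 0           # non-optimal action
```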

Every author has a different ability to identify the quality of an article. We formally represent these distinctive capabilities as partitions; every author uses a particular partition. If an author can distinguish between low-, medium-, and high-quality articles, then the author has the fine partition, KF = {{s1}, {s2}, {s3}}. If an author cannot distinguish between any of the quality levels, then the author has the coarse partition, KC = {{s1, s2, s3}}. Between these two extremes, we can also identify other partitions:

  • \({K}_{F_1}=\left\{\left\{{s}_1\right\},\left\{{s}_2,{s}_3\right\}\right\}\): The author can identify low-quality articles but cannot distinguish between medium- and high-quality articles.

  • \({K}_{F_2}=\left\{\left\{{s}_2\right\},\left\{{s}_1,{s}_3\right\}\right\}\): The author can identify medium-quality articles but cannot distinguish between low- and high-quality articles.

  • \({K}_{F_3}=\left\{\left\{{s}_3\right\},\left\{{s}_1,{s}_2\right\}\right\}\): The author can identify high-quality articles but cannot distinguish between low- and medium-quality articles.

A partition is the basic knowledge that an author uses to decide what the potential (impact) of a manuscript would be. Knowledge is also gained from good and bad experiences when submitting manuscripts to different journals; this additional knowledge allows authors to form informed opinions about where to submit an article of a certain quality, and it makes up the author's submission profile. For every category in an author's partition, there is a corresponding submission pattern. For example, an author who uses a fine partition has a submission profile consisting of three submission patterns. An example of a submission profile for a fine partition is (L-I, L-I, H-I): the author can identify low-, medium-, and high-quality articles, but submits both low- and medium-quality manuscripts to low-impact journals, while high-quality manuscripts are sent to high-impact journals. Table 1 summarizes the number of submission profiles per partition; Tables 2 and 3 describe the submission profiles for partitions KF = {{s1}, {s2}, {s3}} and \({K}_{F_1}=\left\{\left\{{s}_1\right\},\left\{{s}_2,{s}_3\right\}\right\}\), respectively.

Table 1 Number of submission profiles per partition
Table 2 Submission profiles for the fine partition, KF = {{s1}, {s2}, {s3}}
Table 3 Submission profiles for \({K}_{F_1}=\left\{\left\{{s}_1\right\},\left\{{s}_2,{s}_3\right\}\right\}\), \({K}_{F_2}=\left\{\left\{{s}_2\right\},\left\{{s}_1,{s}_3\right\}\right\}\) and \({K}_{F_3}=\left\{\left\{{s}_3\right\},\left\{{s}_1,{s}_2\right\}\right\}\)
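The profile counts summarized in Table 1 follow directly from the definitions: one impact choice per block of the partition yields 27 profiles for the fine partition, 9 for each two-block partition, and 3 for the coarse one. A minimal Python sketch, with our own data representation rather than the system's code:

```python
from itertools import product

IMPACTS = ["L-I", "M-I", "H-I"]

# Partitions as tuples of blocks; the fine partition has three singleton blocks.
K_F  = (("s1",), ("s2",), ("s3",))
K_F1 = (("s1",), ("s2", "s3"))
K_C  = (("s1", "s2", "s3"),)

def submission_profiles(partition):
    """One impact choice per block of the partition, as in Tables 1-3."""
    return list(product(IMPACTS, repeat=len(partition)))

for name, K in [("K_F", K_F), ("K_F1", K_F1), ("K_C", K_C)]:
    print(name, len(submission_profiles(K)))   # prints 27, 9, 3
```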

In this paper, we also apply concepts inspired by the quasi-species model [5, 12].

For each author, we compute the probability of each partition as follows:

$$P\left({K}_F\right)=P\left({s}_3|{i}_3\right)\cdotp P\left({i}_3\right)+P\left({s}_2|{i}_2\right)\cdotp P\left({i}_2\right)+P\left({s}_1|{i}_1\right)\cdotp P\left({i}_1\right)$$
(2)
$$P\left({K}_C\right)=\left(P\left({s}_3|{i}_1\right)+P\left({s}_2|{i}_1\right)\right)\cdotp P\left({i}_1\right)+\left(P\left({s}_1|{i}_2\right)+P\left({s}_3|{i}_2\right)\right)\cdotp P\left({i}_2\right)+\left(P\left({s}_1|{i}_3\right)+P\left({s}_2|{i}_3\right)\right)\cdotp P\left({i}_3\right)$$
(3)
$$P\left(\left\{\left\{{s}_a\right\},\left\{{s}_b,{s}_c\right\}\right\}\right)=P\left({s}_a|{i}_a\right)\cdotp P\left({i}_a\right)+P\left({s}_c|{i}_b\right)\cdotp P\left({i}_b\right)+P\left({s}_b|{i}_c\right)\cdotp P\left({i}_c\right)$$
(4)

where a, b, and c are distinct elements of the set {LOW, MEDIUM, HIGH}, s ∈ S, and i ∈ I. Recall that I is the set of categories for the actual impact of a manuscript, and S is the set of categories for an author's perception of the manuscript's quality.
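As a sketch of how Eqs. 2-4 can be evaluated, assume the relative confusion matrix MR stores, for each pair (i, s), the joint relative frequency P(s|i)·P(i), with rows indexed by actual impact and columns by perceived quality. This indexing convention is ours, and the matrix values below are illustrative, not taken from the experiment:

```python
import numpy as np

# MR: rows = actual impact (low, medium, high), columns = perceived quality;
# entry (i, s) is P(s|i) * P(i). Values are illustrative only.
MR = np.array([[0.10, 0.03, 0.02],
               [0.05, 0.40, 0.15],
               [0.02, 0.13, 0.10]])

def partition_probabilities(mr):
    """Eqs. 2-4: score each candidate partition from the relative confusion matrix."""
    diag = np.trace(mr)                          # correctly identified categories (Eq. 2)
    probs = {"K_F": diag, "K_C": mr.sum() - diag}  # all confusions (Eq. 3)
    # K_{F_a}: category a is identified, the other two (b, c) are confused (Eq. 4).
    for a, name in enumerate(["K_F1", "K_F2", "K_F3"]):
        b, c = [k for k in range(3) if k != a]
        probs[name] = mr[a, a] + mr[b, c] + mr[c, b]
    return probs

print(partition_probabilities(MR))
```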

Finally, the most probable partition is assigned to the author. Once the partition is established, we can compute the score of each submission profile for the selected partition by considering the frequency with which the author produces manuscripts of each category. For example, for the KF partition, the best submission profile is always (L-I, M-I, H-I), which gives the best possible score for an author. For \({K}_{F_1}=\left\{\left\{{s}_1\right\},\left\{{s}_2,{s}_3\right\}\right\}\), it is necessary to decide between the submission profiles (L-I, M-I) and (L-I, H-I) according to the occurrence frequencies of medium- and high-impact manuscripts.

Let π(i)(K) be the reward of submission profile (i) under partition K given as:

$${\pi}_{(i)}(K)={\sum}_{s\in S}{f}_s{\pi}_i(s)$$
(5)

with S = {s1, s2, s3} being the set of manuscript categories, fs the frequency of manuscript category s, and πi(s) the reward function under submission profile (i) for manuscript category s, as defined in Eq. 1. We then denote π(K) as the reward vector of submission profiles under partition K (Eq. 6). Among all the author's profiles, we define the best profile as the one with the highest score (Eq. 7):

$$\pi (K)=\left({\pi}_{(1)}(K),{\pi}_{(2)}(K),\dots, {\pi}_{(n)}(K)\right)$$
(6)
$$\mathrm{bestprofile}(K)=\underset{i\in P}{\mathrm{argmax}}\left\{{\pi}_{(i)}(K)\right\}$$
(7)

with P being the set of possible profiles for the partition K.
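A compact Python sketch of Eqs. 5-7, under our own representation of partitions as tuples of category-index blocks. The frequencies in the usage line are those of the worked example in the Methodology section:

```python
from itertools import product

IMPACTS = ["L-I", "M-I", "H-I"]

def best_profile(partition_blocks, freqs):
    """Eqs. 5-7: score every submission profile of a partition and return the best.

    partition_blocks: tuple of blocks, each a tuple of category indices
                      (0 = low, 1 = medium, 2 = high).
    freqs: frequency f_s of each manuscript category.
    """
    best, best_score = None, -1.0
    for profile in product(range(3), repeat=len(partition_blocks)):  # one impact per block
        # pi_(i)(K) = sum_s f_s * pi_i(s); the reward is 1 when the impact chosen for
        # the block containing s matches s (impact index == category index), else 0.
        score = sum(freqs[s] * (profile[b] == s)
                    for b, block in enumerate(partition_blocks) for s in block)
        if score > best_score:
            best, best_score = tuple(IMPACTS[k] for k in profile), score
    return best, best_score

# Coarse partition with f = (0.158, 0.684, 0.158): the best profile is ('M-I',), score 0.684.
print(best_profile(((0, 1, 2),), (0.158, 0.684, 0.158)))
```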

4 Methodology

4.1 Experimental setup

To apply our model, we deployed our website (Footnote 4) with the aim of improving authors' skills at identifying the quality of manuscripts and letting them know their most probable partition, as well as their recommended submission profile, according to the responses provided.

The website was built using a combination of TypeScript, HTML, and CSS with the Angular framework. It is connected to a server written in Python through a RESTful API. The webpage is responsive, so participants can use it on either a computer or a smartphone. Figures 1 and 2 show screenshots from the website. The supplemental material of this work contains screenshots of each section of the website, together with an explanation of each one.
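As an illustration of this architecture only (the paper does not publish its server code), a training round could be served by a minimal Python endpoint along these lines; the route name and JSON fields are our assumptions:

```python
# Hypothetical sketch of a RESTful training endpoint; not the actual implementation.
import random
from flask import Flask, jsonify

app = Flask(__name__)

ARTICLES = [  # in the real system, these would come from the 21,799-article dataset
    {"id": 1, "title": "…", "abstract": "…", "keywords": ["…"]},
]

@app.route("/api/training/article")
def random_article():
    """Return the key elements of a randomly selected article for a training round."""
    article = random.choice(ARTICLES)
    # Only the title, abstract, and keywords are sent; the journal is withheld.
    return jsonify({k: article[k] for k in ("id", "title", "abstract", "keywords")})
```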

Fig. 1 The quasi-species peer review website. An example of a training session where the author has to decide which type of journal best fits the displayed article

Fig. 2 The quasi-species peer review website. An example of a stats session. The author can check their results in the form of a confusion matrix. Additionally, they can see their partition and the recommended submission profile

4.2 Participants

To test our proposed model, we asked 106 participants to register on our website and classify a minimum of 15 random articles.

We needed our participants to have experience in reading and working with scientific literature, so we required them to have at least a bachelor's degree. In addition, our dataset consisted of computer science articles, so working in or having experience in an IT-related area was another requirement. The individuals in our experiment came from two different sources:

  • 86 participants worked in IT jobs.

  • 20 participants were authors in Computer Science journals.

4.3 Materials

We created a dataset of articles published in 2019 in JCR-indexed journals from the Computer Science category, sub-area Artificial Intelligence. The dataset contains 21,799 articles. For each article, the dataset includes the title, keywords, abstract, and publishing journal. For each journal, the dataset contains the journal's title, impact factor, and tertile level.

The dataset is published on Kaggle [4].

4.4 Design

In our experiment, we wanted to identify the partition and the submission profile of an author. For that purpose, we used the reward obtained by an author when sending a manuscript to a journal, πi(sj). This reward is inferred from the user's responses on our website, where the user must decide by reading only limited information about the article (title, abstract, and keywords).

When users are in the training section, the articles they review are selected randomly. Users have to infer each article's quality and then decide to which impact level of journal they would submit the manuscript. In order to establish a better correlation between quality (low, medium, and high) and impact, we assigned one of three impact levels to each journal in our dataset (high, medium, and low impact), in line with the journal's 2019 impact factor according to the JCR index. Journals in the first tertile are considered high-impact journals, journals in the second tertile are considered medium-impact journals, and journals in the third tertile are considered low-impact journals.

Concerning the quality of the articles in the dataset, articles from high-impact journals are considered high-quality manuscripts, articles from medium-impact journals are considered medium-quality manuscripts, and articles from low-impact journals are considered low-quality manuscripts.
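This labeling rule is small enough to state as code. A sketch, assuming tertiles are encoded as the integers 1-3:

```python
# Sketch of the labeling rule above; tertiles come from the 2019 JCR impact factor.
TERTILE_TO_IMPACT = {1: "High-impact", 2: "Medium-impact", 3: "Low-impact"}
IMPACT_TO_QUALITY = {"High-impact": "high", "Medium-impact": "medium", "Low-impact": "low"}

def label_article(journal_tertile: int) -> tuple[str, str]:
    """Assign (journal impact, assumed manuscript quality) from the journal's tertile."""
    impact = TERTILE_TO_IMPACT[journal_tertile]
    return impact, IMPACT_TO_QUALITY[impact]

assert label_article(1) == ("High-impact", "high")
```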

4.5 Procedure

The user experience on the website is as follows:

  1. The user signs up to the system.

  2. After the signup process, users enter the Training section, where papers are displayed and a submission decision must be made (see Fig. 1). The user can skip a manuscript if they are unable to make a decision.

  3. After classifying a minimum of 15 articles, the user can access the Stats section (see Fig. 2). In this section, they can see their partition type and recommended submission profile.

  4. The user can go back to the Training section and keep training if they wish.

There is detailed information about the website’s interface in the supplemental document.

From the user's responses, we computed two confusion matrices for each user, MA and MR. Both matrices summarize the user's responses, in absolute and relative terms, respectively. MA is only used to provide additional information to the user in the Stats section, while MR is used to compute the partition type and the submission profile following the model described in the Model section.
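Both matrices can be derived from the same response log. A sketch with our own representation, where category indices 0-2 stand for low, medium, and high:

```python
import numpy as np

def confusion_matrices(responses, n_categories=3):
    """Build MA (absolute counts) and MR (relative frequencies) from response pairs.

    responses: iterable of (actual_impact_idx, perceived_quality_idx) pairs.
    """
    ma = np.zeros((n_categories, n_categories), dtype=int)
    for actual, perceived in responses:
        ma[actual, perceived] += 1
    mr = ma / ma.sum()          # joint relative frequencies, as consumed by Eqs. 2-4
    return ma, mr

ma, mr = confusion_matrices([(0, 0), (1, 1), (1, 2), (2, 2)])  # toy example
```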

Tables 4 and 5 show examples of MA and MR. We use MR to determine the author's partition. The first step is to calculate the probability of each type of partition:

Table 4 Example of a MA matrix. This matrix is used to give users additional information on the website
Table 5 Example of an MR matrix. This matrix is used to compute the author’s partition and their corresponding submission profiles. MR provides information about real journal impact and perceived quality in relative terms while MA contains absolute information
$$P\left({K}_{F_1}\right)=0.421,\quad P\left({K}_{F_2}\right)=0.316,\quad P\left({K}_{F_3}\right)=0.263,\quad P\left({K}_C\right)=0.632,\quad P\left({K}_F\right)=0.368$$

With this information, the partition with the highest probability is KC, which determines the author's partition type. The second step is to compute the frequencies fs for each category of manuscripts: f1 = 0.158, f2 = 0.684, f3 = 0.158. These frequencies weigh the maximum possible score according to the author's behavior. The third step is to compute the score for each possible submission profile. For KC, there are three available submission profiles:

$$\left(L\text{-}I\right)=0.158,\quad \left(M\text{-}I\right)=0.684,\quad \left(H\text{-}I\right)=0.158$$
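Making the arithmetic explicit via Eq. 5: under the coarse partition, a single-impact profile earns a reward of 1 only on the category it submits optimally, so its score reduces to that category's frequency. For example:

$${\pi}_{\left(M\text{-}I\right)}\left({K}_C\right)={f}_1\cdotp 0+{f}_2\cdotp 1+{f}_3\cdotp 0=0.684$$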

Finally, we select the best profile, which is (M-I). This is the profile that the author should follow in order to increase their score. While the experiment was taking place, the webpage measured the response time for each article. Table 6 illustrates an example of a user with a fine partition, whose partition probabilities are as follows: \(P\left({K}_{F_1}\right)=0.267\) \(P\left({K}_{F_2}\right)=0.366\) \(P\left({K}_{F_3}\right)=0.366\) P(KC) = 0.105 P(KF) = 0.890.

Table 6 Example of an MR matrix of a user with a Fine Partition

To compute the submission profile, we used the same category frequencies as in the example of Table 4, fs: f1 = 0.158, f2 = 0.684, f3 = 0.158. For KF, there are 27 different submission profiles, with (L-I, M-I, H-I) being the highest-scoring one, with a score of 1.0.

5 Results

As stated in the Participants section, 106 participants used our webpage and simulated the submission of at least 15 articles according to the articles’ perceived quality.

Once all participants had finished their task, we extracted their partitions and submission profiles. The results are shown in Table 7. Each participant received a recommended submission profile according to their partition. The distribution of submission profiles, grouped by partition, was as follows:

  • {{S1, S2}, {S3}}
      • (L-I, H-I): 11
      • (M-I, H-I): 12
      • (H-I, H-I): 2

  • {{S1, S3}, {S2}}
      • (L-I, M-I): 5
      • (H-I, M-I): 6
      • (H-I, L-I): 6
      • (H-I, L-I): 1

  • {{S2, S3}, {S1}}
      • (M-I, L-I): 16
      • (H-I, L-I): 6
      • (H-I, M-I): 2

  • {{S1}, {S2}, {S3}}
      • (L-I, M-I, H-I): 24
      • (L-I, L-I, H-I): 5

  • {{S1, S2, S3}}
      • (L-I): 3
      • (M-I): 5
      • (H-I): 2

Table 7 Partitions obtained from participants’ responses

Finally, concerning response time, the average time spent per article was 29.47 seconds, with a median time of 19.99 seconds, and a standard deviation of 66.82 seconds.

In order to check the significance of these results, we performed different analyses. First, a one-way ANOVA was carried out: we grouped participant responses by partition and extracted the average number of correct answers per group. The ANOVA results are described in Table 8. From these results, we can determine that at least one of the group means very probably differs significantly from the others.

Table 8 One Way ANOVA test. Participants’ responses

The second step in our analysis was to evaluate the relationships between the different groups by performing a Tukey HSD test in order to determine whether the means of each group were significantly different. Table 9 shows the Tukey HSD p values obtained. Most of the comparisons have a p value lower than 0.01 and may be considered significant. We can see that the KF and KC groups differ from the rest of the groups, while no significant differences were found between \({K}_{F_1}\), \({K}_{F_2}\), and \({K}_{F_3}\).

Table 9 Tukey HSD p values. Participants' responses. Bold cells show comparisons whose p value is lower than 0.01

Similar to the analysis performed on participants' answers, we studied the significance of differences in time per response. A one-way ANOVA test and a Tukey HSD test were performed. The results show that there were significant differences between individuals with a KC partition and the rest of the groups, but no differences were found between the other groups. The results are described in Tables 10 and 11.

Table 10 One Way ANOVA test. Participants’ time per response
Table 11 Tukey HSD p-values. Participants' time per response. Bold cells show comparisons whose p value is lower than 0.01. The italic cell indicates a p value lower than 0.05
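Both analyses (for correct answers and for time per response) can be reproduced with standard statistical tooling. The following sketch uses placeholder group values, not the experiment's data:

```python
# Sketch of the significance analysis: responses grouped by assigned partition.
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

groups = {                      # placeholder values, e.g. correct answers per participant
    "K_F":  [12, 13, 11, 14],
    "K_F1": [9, 10, 8, 9],
    "K_C":  [5, 6, 4, 5],
}

print(f_oneway(*groups.values()))            # one-way ANOVA across partition groups

values = np.concatenate(list(groups.values()))
labels = np.repeat(list(groups.keys()), [len(v) for v in groups.values()])
print(pairwise_tukeyhsd(values, labels))     # pairwise Tukey HSD comparisons
```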

6 Discussion

In this paper, we tried to answer three questions. With respect to the first one, we can say that most of the participants in our experiment (76.4%) had the ability to distinguish between low-, medium-, and high-quality manuscripts (making only minor mistakes), and only a minority of individuals were unable to distinguish the quality of a manuscript.

The fine partition, KF, was the most common in our experiment (27.4%), followed by \({K}_{F_3}\) (23.6%). KC, the coarse partition, was assigned to only 9.4% of participants. Having a fine partition means that the participant can distinguish between low-, medium-, and high-quality articles. \({K}_{F_1}\) was also a frequent partition (22.6%).

Sometimes, it can be difficult to distinguish between low- and medium-quality or between high- and medium-quality articles. However, this type of error is less critical than confusing a high-quality article with a low-quality one. Almost three quarters of the participants (73.6%) had the KF, \({K}_{F_3}\), or \({K}_{F_1}\) partition, which means that they were able to distinguish between different types of manuscripts according to their quality and, in particular, could differentiate between low- and high-quality documents.

With respect to the second question posed in the paper, the results of the experiment indicate that, for experienced users, the amount of information used in our research (Title, Abstract, and Keywords) was enough to achieve a quality perception quite close to the actual quality of an article.

Participants were required to have a bachelor's degree and working experience in IT, and some of them were also authors in computer science journals. So, it is likely that most of them had some experience in reading and understanding scientific documents. It would be worthwhile to study users with a wider variety of backgrounds in order to check their abilities as well. In addition, for future research, we would like to add additional datasets to our website in order to be more helpful to researchers from different fields of knowledge.

Regarding the third question raised in our paper, we can say that an experienced author spends about 29 seconds on average reviewing the Title, Abstract, and Keywords of an article and deciding on the quality of the manuscript, with a median time of about 20 seconds; nevertheless, a high standard deviation was observed. Identifying the quality and potential impact of an article in less than one minute can save a significant amount of time and effort for authors, who do not always have access to the full document when deciding whether a manuscript fits their needs.

7 Conclusions

In our paper, we proposed a tool that gives authors accurate information about their skill at recognizing the potential of a manuscript, along with their recommended submission profile. We also wanted to know whether a minimal amount of information, consisting of only the Title, the Abstract, and the Keywords, would be enough for a researcher to determine an article's quality, and how much time would be required to score the article.

For the purpose of our research, we designed a website where researchers could test their abilities by evaluating an article's information and sending the article to a journal of one of three impact types. After designing and launching the website, we ran an experiment in which 106 experienced users classified at least 15 articles each. The results indicate that most of them were able to determine the quality of the classified articles accurately, and that evaluating an article required an average of 29 seconds. In light of these results, we can say that the Title, Abstract, and Keywords provide, in most cases (90.6% according to the experiment), enough information to identify at least one of the three quality categories defined in this work (low, medium, or high quality).

Future research should test the website with non-experienced users and compare their results with those of the experienced group. Furthermore, we would like to add articles from different fields of knowledge to analyze researchers' behavior according to their varying backgrounds. Finally, the minimum amount of information necessary to accurately score the impact of a manuscript remains an open issue requiring further investigation.