Documentary quality versus veracity of information of the websites on syphilis and gonorrhea

Objective: To analyze the possible relationship between documentary quality (DQ) and veracity of information (VI) on web pages about syphilis and gonorrhea. Methods: Descriptive cross-sectional study of websites containing information about syphilis and gonorrhea, with the population accessed through a Google search. Quality was studied using 8 DQ variables and 7 VI variables. Results: A total of 440 active websites, mainly belonging to mass media and private entities, were assessed. Compliance with the DQ gave the following results: mean 3.46 ± 0.07, median 4 and range from 0 to 7. The VI results were: mean 4.07 ± 0.09, median 4 and range from 0 to 7. By search term (syphilis versus gonorrhea), the comparison of the means of the two indicators was: 3.55 vs 3.37, p = 0.181 (DQ) and 4.14 vs 4.00, p = 0.442 (VI). No correlation was found between DQ and VI (R = 0.04; p = 0.368). Similarly, no significance was observed when segregating data by disease: for syphilis R = -0.03; p = 0.625 and for gonorrhea R = 0.12; p = 0.064. Conclusions: The DQ and VI yielded low scores, which implies poor quality of syphilis and gonorrhea websites. By infection (syphilis or gonorrhea), there were no significant differences between the mean values of the two indicators. Knowing the authorship and affiliation of a website, and that it is tied to a prestigious website, may be a factor to consider when predicting the VI of a website. The correlation between the two indicators did not demonstrate an association; thus, knowing the DQ does not guarantee an adequate VI.


Introduction
Ecological studies based on the analysis of information searches in Web 2.0 (also known as the Social Web: websites that emphasize user-generated content, ease of use, participatory culture and interoperability for end users) have become a complement to public health surveillance. In addition, they contribute interesting data that, on occasion, provide a "self-portrait" of the informational needs of a population.
Library and information science and related fields in the sociology of science and science and technology studies have developed a range of theories and methodologies, now including webometrics, concerning quantitative aspects of how different types of information are generated, organized, disseminated and used by different users in different contexts (Björneborn & Ingwersen, 2004).
In this regard, the term "infodemiology" was coined to label the emerging set of public health informatics methods used to analyze search behavior, communication and publication on the Internet (Eysenbach, 2009).
For the same reason, Mavragani et al. (2018) pointed to the benefit of observing and analyzing behaviour on Web 2.0 in order to understand human conduct, with the purpose of predicting, assessing and even preventing health-related problems that are common in daily life.
On the other hand, sexually transmitted infections (STIs) are an important public health problem, on account of both the burden of disease that they cause and the complications and sequelae that they generate if not diagnosed or treated early.
The highest incidence occurs in people aged between 14 and 35, and it is higher in young, single females living in urban areas (World Health Organization, 2016). For these young people, traditional media are no longer among the main spaces where ideas about STIs are reported and shared. They prefer face-to-face relationships, the Internet and social networks because these allow interactivity, conversation and even a desired anonymity.
Considering the growing importance of Web 2.0 studies on STIs, several kinds of research have been conducted to learn about the "information need" regarding these infections. These include: research on Web 2.0 tools used to prevent STIs (Sanz-Lorente et al., 2018), the study of search trends using Google Trends (Chiu et al., 2017; Johnson & Mehta, 2014), information on STIs disseminated through YouTube (Ortiz-Martinez et al., 2017; Sanz-Lorente, Chorro-Vicedo, et al., 2019), the validity of the references that underpin the information about STIs on Wikipedia (Sanz-Lorente, Ruiz-Belda, et al., 2019), and access to adult content websites as a risk marker for STIs (Yom-Tov et al., 2015).
That said, this large amount of readily available health information should be of proven quality.
The ease and freedom with which health content can be published on Web 2.0 call for a set of criteria that enable users to screen electronic content and to discern the veracity, credibility, reliability and, in short, the quality of the data that this medium provides. This set of quality criteria should be based on broad agreement among health specialists, health authorities and user representatives.
Most existing assessment systems are based on a set of criteria (indicators). One aspect that these assessment methods should consider, however, is enabling users to draw their own conclusions. Yet website quality assessment based on the opinion provided by users (user experience) is a highly complex task that has not been sufficiently validated (Sanz-Lorente & Guardiola-Wanden-Berghe, 2017).
Guardiola-Wanden-Berghe et al. (2011), drawing on the different existing proposals, recast the Web quality variables provided by the primary institutions into 22 items and showed a positive correlation between compliance with the quality variables and the credibility indicator (8 items). This correspondence gave the general user the opportunity to assess the quality of a given website using only 8 easily understood variables (Sanz-Lorente & Guardiola-Wanden-Berghe, 2017).
Likewise, studies of website quality concluded that identifying authorship and affiliation was a relevant element for predicting information quality. The joint presence of authorship and affiliation might be a prognostic factor of the quality of a website (Guardiola-Wanden-Berghe et al., 2011; Sanz-Lorente & Guardiola-Wanden-Berghe, 2017).
Certainly, having simple tools may facilitate decision-making regarding the information quality of a particular website. Nonetheless, relying only on documentary indicators may mislead those who lack proper knowledge of the subject being searched. It would be interesting to know whether there is a relationship between the quantitative indicators of documentary quality (fulfillment of the stipulated requirements that ensure the correct structure of a document) and the veracity of the information (information integrity afforded by the source of information) provided by a website. At present, however, the quality of online health information remains questionable; therefore, there is a compelling need to understand how consumers assess this information. Sun et al. (2019) showed that the evaluation of health information by consumers is usually highly subjective and occasionally misinformed.
Consequently, given current search habits and the need for the population seeking information to receive truthful, quality information, this paper aims to analyze the possible relationship between documentary quality (DQ) and veracity of information (VI) on web pages about syphilis and gonorrhea.

Methods
Design: Cross-sectional descriptive study in which the study population was the web pages accessed from the references obtained through a search on Google Spain [http://www.google.es/].
Source of data: The data were obtained by direct consultation and access from the Internet. The search terms were syphilis and gonorrhea.
These terms were entered one after the other in the search bar. The searches for both terms were conducted in September 2019. They were carried out by M.S-L. and N.M-C. and verified by J.S-V.
PageRank and the Google Sample Fallacy were taken into account (the Google search engine never returns more than one thousand references. It "estimates" the number of websites that exist on servers as its web-crawling spider, "googlebot", crawls the World Wide Web, and it calculates its estimate of the number of relevant websites from how long it took to find the first thousand) (Guardiola-Wanden-Berghe et al., 2011).
Inclusion criteria: Websites written in English or Spanish whose main topic was syphilis or gonorrhea. These search terms were chosen because they correspond to the names of the diseases.
Exclusion criteria: Pages that required prior payment for consultation, since they are not consulted by the general population. Inactive pages (broken or non-existent hyperlink) could not be analyzed either.
Variables to study: The documentary quality (DQ) variables were taken from the proposals of Guardiola-Wanden-Berghe et al. (2011), while the variables to determine the veracity of information (VI) were taken from the Centers for Disease Control and Prevention (CDC) information on the prevention of syphilis (Center for Disease Control and Prevention (CDC), 2017b) and gonorrhea (Center for Disease Control and Prevention (CDC), 2017a); see Annex I.
Authorship and Affiliation (A-A): Compliance with the indicators was compared between websites that did and did not present both of these variables, to verify whether the joint presence of authorship and affiliation might be a prognostic factor of the quality of a website.
Data analysis: Variables were coded dichotomously ("no" = 0; "yes" = 1), so that the sum of the affirmative answers gave the compliance with each indicator and enabled the calculation of compliance frequencies.
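The dichotomous coding described above can be illustrated with a minimal sketch. The function names and the example answers are hypothetical and invented for illustration only; the actual items are those listed in Annex I, and the study itself used IBM-SPSS rather than Python.

```python
from typing import List


def code_answer(answer: str) -> int:
    """Dichotomous coding described in the text: "no" = 0, "yes" = 1."""
    normalized = answer.strip().lower()
    if normalized not in ("yes", "no"):
        raise ValueError(f"unexpected answer: {answer!r}")
    return 1 if normalized == "yes" else 0


def indicator_score(answers: List[str]) -> int:
    """Sum of affirmative answers for one website on one indicator."""
    return sum(code_answer(a) for a in answers)


def item_compliance(sites: List[List[str]]) -> List[float]:
    """Percentage of websites answering "yes" to each item."""
    n_sites = len(sites)
    n_items = len(sites[0])
    return [
        100.0 * sum(code_answer(site[i]) for site in sites) / n_sites
        for i in range(n_items)
    ]


# Invented answers for three hypothetical websites on an 8-item indicator.
example_sites = [
    ["yes", "yes", "no", "no", "yes", "no", "no", "yes"],
    ["no", "no", "no", "yes", "no", "no", "no", "no"],
    ["yes", "yes", "yes", "yes", "no", "yes", "no", "yes"],
]

scores = [indicator_score(site) for site in example_sites]
compliance = item_compliance(example_sites)
```

Under this coding, each website receives an integer score between 0 and the number of items of the indicator, which is the quantity summarized by the mean and median.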
Continuous variables were described with their mean and standard deviation, while ordinal variables were described with their absolute value and percentage. The mean and median were used as measures of central tendency. The most relevant variables were presented in tables or figures. Student's t-test for independent samples was used to test the significance of the difference between means. Analysis of variance (ANOVA), with the Bonferroni method as a post-hoc test, was performed to compare the means of a quantitative variable between more than two groups. The significance level used in all hypothesis testing was α ≤ 0.05.
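As an illustration of the comparison of means described above, the following sketch computes the pooled-variance Student's t statistic for two independent samples. It is a sketch under stated assumptions, not the procedure run in IBM-SPSS: the function name `student_t` and the example scores are hypothetical, and the p-value is deliberately left out because it requires the cumulative t distribution, which a statistics package (for example `scipy.stats.ttest_ind`) would supply.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def student_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for two independent samples.

    Uses the classical pooled-variance form, which assumes equal variances.
    """
    n_a, n_b = len(a), len(b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    df = n_a + n_b - 2
    # Pooled estimate of the common variance.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    # Standard error of the difference between the two means.
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean(a) - mean(b)) / std_err, df


# Invented indicator scores for two hypothetical groups of websites.
group_a = [4.0, 5.0, 3.0, 4.0, 4.0]
group_b = [2.0, 3.0, 2.0, 3.0, 2.0]

t_stat, deg_freedom = student_t(group_a, group_b)
```

The resulting statistic is compared against the t distribution with the returned degrees of freedom to obtain the p-value that is then judged against the significance level.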
Information process: To avoid changes in the results and to preserve the PageRank obtained, the option "showing 100 results per page" was selected in Google advanced search, and the references obtained were stored in portable document format (pdf) with a hyperlink in each reference, so that the web page could be accessed again at any time. Each of the results and the validity of its link were verified, so there was complete certainty about the number of websites eligible for the study, but at no time did Google provide any indication of the "approximate" number of results considered.
To verify compliance with the studied variables, the web pages included in each website were navigated when necessary (e.g., the "about us" link).
The evaluation of each website was carried out by the authors M.S-L. and N.M-C. Possible discrepancies were resolved through consultations with the author C. W-B. and subsequent consensus among all authors.
The Statistical Package for the Social Sciences (IBM-SPSS), version 22 for Windows, was used for data storage and statistical calculations. Data were always entered in duplicate tables, which were then compared for consistency to avoid transcription errors.

Results
Documentary quality (DQ). The descriptive analysis of compliance with these variables gave the following results: mean of 3.46 ± 0.07; minimum of 0 and maximum of 7; median = 4. No website met all eight criteria of this indicator. No significant differences were found between the means of this indicator for syphilis and gonorrhea (3.55 versus 3.37; p = 0.181).
Regarding content quality (VI), the following results were obtained: mean of 4.07 ± 0.09; minimum of 0 and maximum of 7; median = 4; 48 (10.91%) websites met all the criteria. There were no differences between the means of this indicator for syphilis and gonorrhea (4.14 versus 4.00; p = 0.442).
Compliance with each of the items of both indicators across all the studied websites can be consulted in Table 2.

Indicators according to the type of institutional affiliation
The results for compliance with the two indicators were segregated by type of institution. For the DQ, there were no differences between the websites of scientific societies, scientific publishers, public institutions and media, whereas significant differences were found in all cases between these and private entities or personal websites (p < 0.001). For the VI, no differences were found between the websites of the different institutions; see Table 3. For both indicators, the relationship of these data with the median value of all the analyzed websites can be seen in Fig. 2.

Authorship and affiliation as a quality forecast factor
Of the total websites studied, 244 (55.45%) presented both authorship and affiliation: 125 (28.41%) on syphilis and 119 (27.05%) on gonorrhea. Significant differences were observed in the mean DQ between websites that presented both authorship and affiliation (A-A) and those that did not (4.23 versus 2.53; p < 0.001). Likewise, segregating the data by disease gave significant differences in the means of this indicator: A-A for syphilis 4.30 versus 2.53; p < 0.001 and A-A for gonorrhea 4.14 versus 2.46; p < 0.001.

Fig. 2 Distribution of the indicator values of documentary quality (DQ) and veracity of information (VI) of syphilis and gonorrhea websites, and their correspondence with the median value, by type of institution
Regarding the VI, no differences were found between the means (4.15 versus 3.97; p = 0.353). Nor were significant differences found when separating the data by disease: A-A for syphilis 4.10 versus 4.21; p = 0.699 and A-A for gonorrhea 4.20 versus 3.77; p = 0.082.
The relationship of these data with the median value, for both indicators, among the analyzed websites that presented both authorship and affiliation can be seen in Fig. 3.

Relationship between the indicators
No correlation was found between the DQ and the VI (R = 0.04; p = 0.368). Likewise, no significance was observed when segregating the data by disease: for syphilis R = -0.03; p = 0.625 and for gonorrhea R = 0.12; p = 0.064.
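The relationship between the two indicators can be sketched as a correlation between paired scores. The text does not state which coefficient R denotes, so the sketch below assumes Pearson's product-moment coefficient; the function names and the example scores are hypothetical and invented for illustration. The helper `r_to_t` converts a coefficient into the t statistic (with n - 2 degrees of freedom) commonly used to test whether a correlation differs from zero; obtaining the p-value itself would require a statistics package.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between paired observations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and have at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return s_xy / math.sqrt(s_xx * s_yy)


def r_to_t(r: float, n: int) -> float:
    """t statistic for testing r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Invented DQ and VI scores for five hypothetical websites.
dq_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
vi_scores = [5.0, 3.0, 4.0, 2.0, 4.0]

r_example = pearson_r(dq_scores, vi_scores)
```

A coefficient close to zero, as reported above, yields a small t statistic even with several hundred websites, which is consistent with the absence of a significant association.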

Discussion
This paper analyzed and correlated the concepts of DQ and VI. In other words, it considered those cases in which users look for information on websites to which no inclusion or exclusion criteria have been applied and, as a result, can "stumble" on information of variable quality, so that the end user must always be the final judge of its quality and relevance.
The number of non-active hyperlinks (link not working or error 404) was lower than that described in a 2011 study (Guardiola-Wanden-Berghe et al., 2011), although it was closer to that of a more recent article published in 2017 (Sanz-Lorente & Guardiola-Wanden-Berghe, 2017). These data suggest that access to health websites has not changed in recent years.
The results for the DQ showed poor compliance with the 8 criteria; notably, the median value indicates that only fifty per cent of the studied websites met at least half of the items of this indicator. Even more notable is that no website met all the criteria. The VI data were also not as good as they should have been. The only positive point is that a few websites met all the VI items; however, the mean number of correct items was similar to that of the DQ.
Thus, the data from both indicators raise doubts about how to ensure and judge the quality of information on the Web, and create uncertainty about the possibility of assessing the veracity of the information using only these indicators.
In any case, and despite the low compliance observed, a recent study (Sanz-Lorente et al., 2018) found that Web 2.0 tools have shown a positive effect in promoting prevention strategies for STIs and may help link young people to sexual health campaigns. These tools could be combined with other interventions and show their full potential to become essential tools for public health. Kuder et al. (2015) showed that a good website design captured more participant information with no decrease in requests, kit return, or treatment adherence among people with STDs.
That said, it is important to consider that assessing the quality of websites is an arduous task. On the one hand, quality is a construct and cannot be measured directly; on the other hand, quality is defined according to the expectations of users, which implies an essential subjective component (Feo Acevedo & Feo Istúriz, 2013).
The poor results obtained are a health problem considering that adolescents (the group in which most STI transmission occurs) seek information mainly through Google, with no knowledge of systems that accredit content quality, yet consider it useful and reliable and change their behaviour patterns according to the information found. All this carries a risk for an age group with very sensitive characteristics (Blázquez Barba et al., 2018). Furthermore, for this age group, Web 2.0 is not only the most frequent information source but is also used in preference to medical consultation (Sanz-Lorente et al., 2018). Moreover, it has been found that most websites do not respond properly to public inquiries about treatment options (Walsh et al., 2019). According to the study by Resneck et al. (2016), internet STI testing sites were difficult to contact and showed unwillingness to answer consumer-specific questions. This situation could create a problem of disaffection.
Perhaps the low results obtained should make us reflect: today's health care environment encourages health care consumers to take an active role in managing their health. As digital natives, young educated adults do much of their health information management through the Internet and consider it a valid source of health advice. However, the quality of information on health websites is highly variable and dynamic. Little is known about the understandings and perceptions that young educated adults have of the quality of information on health websites used for health care-related purposes (Tao et al., 2017). What is known is that men who accessed internet-based screening had known risk factors for STIs and a high prevalence of infection. Therefore, internet-based screening was acceptable and could reach high-risk men who might not otherwise be reached through traditional means (Chai et al., 2010). On account of social and health-care barriers, the LGBTQ (lesbian, gay, bisexual, transgender and queer) community is another important risk group seeking health information, including on STIs. Use of hookup sites was nearly ubiquitous among MSM (men who have sex with men) undergoing STD screening, and specific hookup sites were significantly associated with STD diagnoses among MSM (Chan et al., 2018). Kreines et al. (2018) conducted a web-based study on the DQ for LGBTQ people and concluded that there was a dearth of reliable information and a need to improve the quality of health care towards better inclusion and consideration of them.
It is noteworthy that the media were the most represented type of institution, given the impact that they have on public and individual health; they are also essential for shaping opinion. In the healthcare sector, the media play a crucial role, since media content creates and strengthens attitudes, beliefs, and values. In other words, the media are instrumental in bringing about changes in knowledge, beliefs, and attitudes about health and healthy behaviours.
Generally speaking, messages and campaigns carried out in the media with good design and adequate targeting of certain population groups are usually effective (Stead et al., 2019). Thus, it is particularly important that the information disseminated on media websites is reliable, verified, and straightforward. Although web-based campaigns have shown good acceptability and low cost, an educational activity alone may be insufficient to change behavior (Ross et al., 2016).
Nevertheless, it is necessary to foster reflection on these "new" models of communication, which are more democratic and participatory and which, with proper information, will have a better impact on the life and health of the population (Feo Acevedo & Feo Istúriz, 2013; Saraf & Balamurugan, 2018).
Although scientific societies presented the best results, the low compliance with both indicators calls for an important and urgent review and improvement of these "virtual headquarters", as they are social reference points for the population searching for health information. The low quality of health information on the websites of scientific societies has already been reported in previous work (Sanz-Lorente & Guardiola-Wanden-Berghe, 2017).
The participation of experts in updating and disseminating knowledge for the benefit of society, which is already particularly appreciated, should be channelled through the websites of their scientific societies (López Marcos & Sanz-Valero, 2013).
Many scientists have a negative attitude towards participation in Web 2.0 because the vast majority believe they lack the necessary skills. It is also seen as an activity of relative irrelevance. The low contribution of health professionals to editing Web 2.0 content, which is not part of their consultation work, has already been studied in previous work (Oller-Arlandis & Oller-Arlandis, 2017).
The association of A-A with higher compliance with the DQ might be an important factor in predicting information quality. As has been studied for other Web 2.0 tools, knowing the source of a given document is the first criterion to be considered. The Web is filled with individual opinions that are, in many instances, hidden behind nonexistent or false persons or concealed in the anonymity of the Internet and, on occasion, personal opinions are presented as though they were scientific facts (Gabarrón & Fernández-Luque, 2012; Oller-Arlandis & Oller-Arlandis, 2017). Guardiola-Wanden-Berghe et al. (2011) commented that the presence of an author linked to a reference institution could be the first quality criterion to be considered. The reputation of an author, or of the institution he or she represents, is relevant in deciding the quality of the sources that have been cited. According to Whiteley et al. (2012), Web 2.0 (specifically the social networks) paved the way for sharing a large amount of personal health information without appropriate quality (i.e., with deficiencies in usability, authority, and interactivity), which might be a risk for the prevention of STIs.
Websites prove the most challenging, as they often give little information about their authors or creators. Websites should provide, preferably in an "about me/us" link, complete information on their managers and their respective institutional affiliations. This simple detail would lend more credibility to their information.
The absence of correlation between the two studied indicators (DQ vs VI) shows that proposing a few criteria is insufficient to ensure the quality of information obtained on the Web. The use of quantitative indicators is valid when they are applied by healthcare professionals. Nonetheless, doubts exist about the benefit of quantitative indicators when they are used by the general population. In many cases, quantitative indicators are impractical for a non-expert, although the Brief DISCERN has been considered a reliable and valid instrument capable of discriminating between websites with good and poor content quality (Khazaal et al., 2009).
Perhaps it would be sufficient to establish some "screening criteria" and allow users to assess the information obtained. The same happens with other, "classic" information resources (e.g., newspapers, radio or television), in which messages and their content may be erroneous, incomplete or biased, yet they continue to be used, with the user forming his or her own judgment.
Conceivably, the existence of codes of conduct and ethics, for example HONcode (Health On the Net Foundation (HON Foundation), 2019), might be a trustworthy way of signalling credible, quality information, given that a significant number of people have low health literacy and have difficulties accessing this information, assessing its quality and applying it to their circumstances (Charow et al., 2019).
Possible limitations of this study: This work reviewed the entire population of websites resulting from the searches conducted with the terms syphilis and gonorrhea in order to determine their DQ and VI. However, it might not accurately represent what is colloquially understood by "navigating", that is, a search that is neither supervised nor carried out by an expert in the subject.
In addition, there may be a bias owing to the fact that the search was performed only on Google, so searches carried out in other search engines, or directly on social networks or aggregator services, were not considered.
In conclusion: The DQ and VI yielded low scores, which implies poor quality of syphilis and gonorrhea websites. By infection (syphilis or gonorrhea), there were no significant differences between the mean values of the two indicators. Knowing the A-A of a website, and that it is tied to a prestigious website, may be a factor to consider when predicting the VI of a website. The correlation between the two indicators did not demonstrate an association; thus, knowing the DQ does not guarantee an adequate VI.
Acknowledgements To Habiba Chbab, master's degree in English and Spanish for Specific Purposes and doctoral student in Professional and Audiovisual Translation (Research branch: medical translation), for her inestimable collaboration in the translation of this document.
Author contributions All authors contributed to the study conception and design. Material preparation and data collection were performed by María Sana-Lorente and Natalia Noles-Caballero. The analysis and interpretation of the data were performed by Carmina Wanden-Berghe and Javier Sanz-Valero. Carmina Wanden-Berghe and Javier Sanz-Valero wrote the first draft of the manuscript and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.