1 Introduction

The internet plays an increasingly important role for many older adults. In the USA, for instance, the share of internet users among the age group 65+ has grown from 46% in 2011 to 75% in 2021 [1]. The growth coincides with higher frequency of use, longer duration, and greater breadth of use. These changes highlight that a larger share of older adults integrate the internet into their daily lives, which helps them cope with age-related limitations in current or future situations [2,3,4]. Although many barriers toward internet use have diminished, digital inequalities among older adults still exist [5]. These inequalities produce a continuum of internet use with respect to frequency, duration, and breadth [6,7,8]. Therefore, understanding the factors explaining older adults’ internet use is an important issue in human–computer interaction, and the insights obtained can help in devising digital services and policies that are better tailored to the needs of older adults.

To investigate the factors explaining internet use, multivariable regression models are widely used; they model the relationship between two or more independent variables and internet use as the dependent variable. Researchers apply regression analysis to observational data collected from a sample of older adults to estimate a regression model in order to test causal explanations of internet use (explanatory modeling) [9]. Linear regression models are applied to continuous dependent variables, as is the case for frequency, duration, and breadth of use. The estimated regression coefficients indicate how a one-unit change in the independent variable affects the dependent variable. Logistic regression models are applied to categorical dependent variables. If the dependent variable has two categories, e.g., use and non-use, binomial logistic regression is used. Dependent variables with three or more categories can be studied using multinomial logistic regression. In either form of logistic regression, the estimated coefficients represent the change in the natural logarithm of the odds of belonging to the user category (log-odds) for a one-unit change in the independent variable. The coefficients are usually exponentiated and reported as odds ratios (OR) [10].
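The relationship between a coefficient on the log-odds scale and the reported odds ratio can be illustrated with a minimal Python sketch. The function names and the coefficient value are hypothetical and chosen only for illustration; they are not taken from any of the reviewed articles.

```python
import math


def odds_ratio(coefficient: float) -> float:
    """Exponentiate a logistic regression coefficient (log-odds) into an odds ratio."""
    return math.exp(coefficient)


def odds_ratio_for_change(coefficient: float, units: float) -> float:
    """Odds ratio implied by a change of `units` in the independent variable."""
    return math.exp(coefficient * units)


# Hypothetical coefficient for "years of education" (illustrative value only).
beta_education = 0.25

print(round(odds_ratio(beta_education), 3))                # 1.284
print(round(odds_ratio_for_change(beta_education, 4), 3))  # 2.718
```

Under these assumptions, each additional year of education multiplies the odds of being an internet user by about 1.28, and four additional years multiply the odds by about 2.72. The multiplication applies to the odds, not to the probability of use.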

Categorical variables allow measuring various facets of internet use, such as whether an older adult (1) uses the internet in general, (2) uses the internet for a specific purpose, (3) consumes a specific online service, and (4) belongs to a certain type of user. Given this variety, logistic regression is a useful statistical method for the study of older adults’ internet use and is frequently used in the literature, notwithstanding the limitations of categorical measures. However, logistic regression is a powerful but sophisticated method that demands extensive care in the computation of regression models and in the reporting of results. It is important that the results from published studies are based on the correct application of procedures, data that meet the assumptions of logistic regression, and accurate reporting of the analyses. Although the peculiarities of logistic regression were acknowledged long ago and have been demonstrated for different fields of research [11,12,13], the state of its adoption for investigating older adults’ internet use has not yet been examined.

It is not known whether the multi-disciplinary literature fulfills commonly recommended quality criteria for logistic regression analysis. These criteria were first proposed in the medical literature [14, 15] but are not specific to that field of research, as they reproduce guidelines for the execution and reporting of logistic regression [16,17,18]. Deficits in their fulfillment would undermine the ability to interpret, compare, and integrate the evidence obtained from single studies, and thus compromise the understanding of factors explaining older adults’ internet use. Our research addresses this important gap in the literature by adopting quality criteria and conducting a systematic review. Thus, the objective of our research is to assess the extent to which empirical articles on internet use among older adults meet quality criteria for logistic regression analysis.

2 Method

2.1 Information sources and search strategy

We identified peer-reviewed articles published in journals and conference proceedings between January 1, 2010, and December 31, 2020 through a systematic search of the literature. The search was conducted on August 25, 2021, and October 29, 2022, using the electronic databases Scopus and PubMed. We chose Scopus for its greater coverage of peer-reviewed literature compared to Web of Science [19, 20]. In view of the multi-disciplinary characteristics of the field, we additionally searched PubMed as the most comprehensive database of biomedical and life sciences literature, including gerontology, health informatics, and public health [21]. We performed the bibliographic search on each article’s title, abstract, and keywords by combining search terms for older adults, internet use, and logistic regression. Thus, the search query had three components: (1) older adults were represented as ("older adult*" OR "older people" OR "old age" OR elder* OR senior*); (2) internet use as the dependent variable was included as ("internet use" OR "internet usage" OR "digital technology use" OR "digital technology usage" OR "use of digital technolog*"); and (3) logistic regression was coded as (logistic OR odds OR survey). The latter component included the generic term ‘survey’ to prevent oversight of articles that provide no specification of the statistical method used in their title, abstract, or list of keywords. Articles known to the authors but not found in the search were added to the initial list.
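The three components of the search query can be assembled programmatically, as in the following minimal Python sketch. The search terms are those listed above. Joining the components with AND is our reading of "combining" and should be treated as an assumption, and the helper names are hypothetical. Database-specific field restrictions (title, abstract, keywords) are deliberately left out.

```python
# Search terms as listed in the text, one list per component.
older_adults = ['"older adult*"', '"older people"', '"old age"', "elder*", "senior*"]
internet_use = [
    '"internet use"',
    '"internet usage"',
    '"digital technology use"',
    '"digital technology usage"',
    '"use of digital technolog*"',
]
method = ["logistic", "odds", "survey"]


def or_group(terms):
    """Join the terms of one component with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(*components):
    """Join the components with AND (assumed combination operator)."""
    return " AND ".join(or_group(terms) for terms in components)


query = build_query(older_adults, internet_use, method)
print(query)
```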

2.2 Eligibility criteria and article selection

We included articles that reported on the application of logistic regression for explaining older adults’ internet use. The dependent variable could represent (1) internet use in general, (2) use for a specific purpose, (3) consumption of a specific online service, or (4) belonging to a certain type of user. Therefore, we included binomial logistic regression (dependent variable with two categories, i.e., use and non-use) and multinomial logistic regression (dependent variable with more than two categories, e.g., representing different types of users). With respect to the target population, we set the minimum age at 55 years, which allowed us to take regional, societal, and cultural differences in the definition of older adults into account [22]. We defined further inclusion criteria as follows: peer-reviewed article published in a journal or conference proceedings between 2010 and 2020, written in English, original contribution, and full text available. By limiting the search to eleven years (2010–2020), this review enabled assessment of articles that reflect the major changes in older adults’ internet use during that time period and thus have more practical relevance in the current digital era, which is characterized by the greater importance of social media and digital health services for older adults.

We ensured the validity of the article selection through a procedure that included independent coding, training of coders, measuring agreement, and resolving disagreements [23]. First, we developed a codebook defining the eligibility and exclusion criteria as well as illustrating the exclusion criteria by examples. We pilot-tested the codebook with two randomly selected articles. The screening phase began with two authors independently coding the first five articles (in reverse chronological order) based on the title, abstract, and keywords. Then, the codes were compared and conflicts were resolved by discussion between the coders. This procedure was repeated for the next five articles. Because the inter-coder agreement was substantial, the number of articles per round was increased to about 50 articles. The agreement between coders in the screening phase was high (87.2%), and the most frequent conflicting codes were related to internet use as the dependent variable (5.0%) and older adults as the target population (3.9%), respectively.

For the articles that went through the screening, the full-texts were downloaded and then independently assessed by the same coders. This assessment used the same procedure as for the screening (pilot testing of the codebook with an example article, two rounds of coding of five articles, coding of the remaining articles, and resolving conflicts after each round). Again, the initial agreement was high (92.0%), and the reasons for conflicting codes were related to the dependent variable (3.2%) and the target population (4.8%).
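The agreement figures above are simple percentages of identically coded articles. A minimal Python sketch of that calculation follows; the function name and the example codes are hypothetical and do not reproduce the actual coding data.

```python
def percent_agreement(codes_a, codes_b):
    """Share of items (in percent) to which two coders assigned the same code."""
    if len(codes_a) != len(codes_b):
        raise ValueError("Both coders must have coded the same number of items.")
    if not codes_a:
        raise ValueError("At least one coded item is required.")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)


# Hypothetical screening decisions of two coders for five articles.
coder_1 = ["include", "exclude", "exclude", "include", "exclude"]
coder_2 = ["include", "exclude", "include", "include", "exclude"]

print(percent_agreement(coder_1, coder_2))  # 80.0

# Indices of the articles whose codes need to be resolved by discussion.
conflicts = [i for i, (a, b) in enumerate(zip(coder_1, coder_2)) if a != b]
print(conflicts)  # [2]
```

Percent agreement does not correct for agreement expected by chance; chance-corrected coefficients exist but are not part of the procedure described here.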

2.3 Data collection process

For the articles that met all eligibility criteria, the two coders independently collected data for the evaluation criteria defined in Sect. 2.4. Additionally, the coders extracted data on study characteristics, including country, dependent variables, and sample size. If an article reported two or more regression models, we identified the size of the smallest sample and then applied all subsequent analyses to the respective model. Due to the high number of data items, we organized the coding into four steps, in which different extraction forms were used (one form each for study characteristics, analytic criteria 1 through 6, analytic criterion 7, and documentation criteria). For each step, we provided a codebook defining the data items and then pilot-tested the codebook and extraction form for the first article. The coders met on a daily basis to compare their coding and agree upon the final data points (the initial agreement was 90.1%). This iterative process lasted for about two weeks.

In a subsequent analysis, we determined the subject area of each article and categorized its theory component as follows. Subject area was defined as the journal’s primary subject area listed in Scopus (e.g., “Medicine: Health Informatics”). Theory component was categorized as strong if the authors justified their regression model by referring to a theory, developing hypotheses, or describing a mechanism that underlies the empirical phenomenon (causal reasoning). If no such information was present, we categorized the theory component as weak. These articles only referred to previous empirical studies, or stated that they examined factors associated with internet use, but did not refer to any theory or underlying mechanism. Although we provide an indicator for the presence of theory in an article, this categorization does not allow inferences about the validity and usefulness of the theory component.

2.4 Evaluation criteria

We evaluated the included articles for nine criteria, which we adopted from previous reviews examining the application of logistic regression in medical research [14, 15, 24]. The criteria summarize widely accepted guidelines for improving the execution, reporting, and interpretation of logistic regression [16,17,18]. Following the proposal of Bagley et al. [14], the criteria are divided into six analytic criteria related to the computation of the regression model and three documentation criteria associated with the reporting of the model development. Because the literature under investigation aims at explaining internet use and thus tests causal hypotheses (explanatory modeling) but does not aim at predicting whether a newly observed older adult belongs to the user group (predictive modeling), validation of the estimated regression model on unknown data is not part of the quality criteria [9].

2.4.1 Sufficient events per independent variable

If the logistic regression model is computed from too few observations of the categories of the dependent variable (“events”), the estimates of coefficients will become unreliable [25]. This problem is particularly relevant to studies in which events are sparse but the regression model includes many independent variables; hence, the estimates will rely on few observations and might lack precision. Simulation studies suggest that this problem can be mitigated if the ratio of the number of observations in the less common category to the number of independent variables exceeds some threshold, such as 10:1 or 20:1 [26, 27]. We calculated this ratio for each study.
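The ratio described above can be computed as in the following minimal Python sketch. The function name and the sample counts are hypothetical; the thresholds are those mentioned in the text.

```python
def events_per_variable(n_users, n_non_users, n_independent_variables):
    """Ratio of the less common outcome category to the number of independent variables."""
    if n_independent_variables <= 0:
        raise ValueError("The model must include at least one independent variable.")
    less_common = min(n_users, n_non_users)
    return less_common / n_independent_variables


# Hypothetical study: 420 users, 180 non-users, 12 independent variables.
epv = events_per_variable(n_users=420, n_non_users=180, n_independent_variables=12)
print(epv)        # 15.0
print(epv >= 10)  # True: meets the 10:1 threshold
print(epv >= 20)  # False: does not meet the stricter 20:1 threshold
```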

2.4.2 Conformity with linear gradient

Logistic regression analysis assumes that every change in a given independent variable has approximately the same effect on the log-odds of the dependent variable [18]. The assumption should be verified for continuous independent variables, if included in the model. We determined whether articles indicated the verification of that assumption or included no continuous variables.
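One informal way to inspect this assumption is to group a continuous variable into ordered bins and compare the empirical log-odds across bins; a roughly constant step between neighboring bins is compatible with a linear gradient. The following Python sketch shows this approach under stated assumptions: the function name, the number of bins, and the data are hypothetical, and the 0.5 added to each count is a common continuity correction to avoid taking the logarithm of zero. It is not the procedure of any reviewed article.

```python
import math


def empirical_logits(x, y, n_bins=4):
    """Return (mean of x, empirical log-odds of y = 1) for equal-sized ordered bins of x."""
    pairs = sorted(zip(x, y))
    if len(pairs) < n_bins:
        raise ValueError("Need at least one observation per bin.")
    size = len(pairs) // n_bins
    result = []
    for i in range(n_bins):
        start = i * size
        # The last bin takes any remaining observations.
        end = (i + 1) * size if i < n_bins - 1 else len(pairs)
        chunk = pairs[start:end]
        events = sum(label for _, label in chunk)
        non_events = len(chunk) - events
        logit = math.log((events + 0.5) / (non_events + 0.5))
        mean_x = sum(value for value, _ in chunk) / len(chunk)
        result.append((mean_x, logit))
    return result


# Hypothetical data: age in years and internet use (1 = user, 0 = non-user).
age = [65, 66, 67, 68, 70, 71, 73, 74, 76, 78, 80, 83]
use = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]

for mean_age, logit in empirical_logits(age, use, n_bins=3):
    print(round(mean_age, 1), round(logit, 2))
```

If the log-odds change by markedly different amounts between neighboring bins, a transformation or a categorical coding of the variable may be worth considering.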

2.4.3 Interactions

The coefficient obtained for a given independent variable can only be interpreted if the variable does not interact with another independent variable [28]. If an article provided no justification for the tested regression model, that is, if the theory component was categorized as weak, we assessed whether any testing for interactions was reported, either through including interaction terms in the regression model or stating anywhere in the text that interactions were checked. However, if an article had a strong theory component, we did not check for testing but rated the criterion as fulfilled (because of the theoretical underpinning of the tested model).
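What an interaction means on the odds-ratio scale can be illustrated with two binary independent variables. In a model that contains both variables and their product term, the exponentiated interaction coefficient equals the ratio of the stratum-specific odds ratios. The following Python sketch uses hypothetical counts and variable names for illustration only.

```python
def odds_ratio_2x2(exposed_users, exposed_non_users, unexposed_users, unexposed_non_users):
    """Odds ratio of internet use for exposed versus unexposed respondents."""
    if exposed_non_users == 0 or unexposed_users == 0:
        raise ValueError("Odds ratio is undefined when one of these cells is zero.")
    return (exposed_users * unexposed_non_users) / (exposed_non_users * unexposed_users)


# Hypothetical counts: higher education (exposure) and internet use, by gender.
or_women = odds_ratio_2x2(60, 40, 30, 70)
or_men = odds_ratio_2x2(50, 50, 40, 60)

print(or_women)  # 3.5
print(or_men)    # 1.5

# A ratio far from 1 suggests that the association differs between subgroups.
ratio_of_odds_ratios = or_women / or_men
print(round(ratio_of_odds_ratios, 2))  # 2.33
```

Whether such a difference is statistically distinguishable from no interaction would still need to be tested in the regression model itself.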

2.4.4 Collinearity

Collinearity describes a situation in which two independent variables are linearly associated so that their inclusion in the regression model may lead to unstable estimates of coefficients and reduce their statistical significance. In other words, the variables cannot independently explain internet use because they are either conceptually similar, affected by a confounding variable, or dependent on each other. The collinearity problem can be detected by inspecting correlations between independent variables and calculating the variance inflation factor (VIF) for each independent variable [29]. We coded whether the collinearity problem was discussed and which measures were reported (if any).
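For the special case of two independent variables, the variance inflation factor follows directly from their correlation, VIF = 1 / (1 − r²). With more variables, each variable is instead regressed on all others, which is not shown here. The following Python sketch covers the two-variable case only; the function names and the scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(r):
    """Variance inflation factor when the model contains exactly two predictors."""
    if abs(r) >= 1.0:
        raise ValueError("Perfectly correlated predictors: VIF is undefined.")
    return 1.0 / (1.0 - r ** 2)


# Hypothetical scores of four respondents on two scales (e.g., loneliness, depression).
loneliness = [1, 2, 3, 4]
depression = [1, 3, 2, 4]

r = pearson_r(loneliness, depression)
print(round(r, 2))                      # 0.8
print(round(vif_two_predictors(r), 2))  # 2.78
```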

2.4.5 Main estimates

A logistic regression model includes estimates of the direction and strength of the relationship between independent variables and the log-odds. We rated this criterion as fulfilled if the article provided both point estimates (e.g., odds ratio, coefficient) and interval estimates (e.g., confidence interval, standard error). As supplementary information, we coded the reporting of exact p-values for the independent variables included in the model.
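Point and interval estimates are closely linked: given a coefficient and its standard error, a Wald-type confidence interval for the odds ratio is obtained by exponentiating the interval limits on the log-odds scale. The following Python sketch assumes a normal approximation with z = 1.96 for a 95% interval; the function name and the input values are hypothetical.

```python
import math


def odds_ratio_with_ci(coefficient, standard_error, z=1.96):
    """Return (odds ratio, lower limit, upper limit) of a Wald-type interval."""
    lower = math.exp(coefficient - z * standard_error)
    upper = math.exp(coefficient + z * standard_error)
    return math.exp(coefficient), lower, upper


# Hypothetical estimate: coefficient 0.5 with standard error 0.2.
estimate, lower, upper = odds_ratio_with_ci(0.5, 0.2)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

Because the interval is symmetric on the log-odds scale, it is asymmetric around the odds ratio, which is one reason why reporting both limits is more informative than reporting a standard error alone.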

2.4.6 Goodness of fit

Goodness of fit (GOF) is defined as the extent to which the regression model describes the data from which the model was estimated. A great variety of measures are available for logistic regression models, including statistical tests, regression diagnostics, pseudo R-squared measures, discrimination statistics, and percentage correct [30, 31]. We coded the adoption of each measure.
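Two of the measures named above, percentage correct and the concordance statistic, can be computed from observed outcomes and predicted probabilities alone. The following Python sketch is a minimal illustration: the function names, the classification threshold of 0.5, and the data are assumptions, and the pairwise loop is chosen for readability rather than efficiency.

```python
def percentage_correct(y_true, p_hat, threshold=0.5):
    """Percentage of observations whose predicted class matches the observed class."""
    predicted = [1 if p >= threshold else 0 for p in p_hat]
    hits = sum(int(a == b) for a, b in zip(y_true, predicted))
    return 100.0 * hits / len(y_true)


def c_statistic(y_true, p_hat):
    """Share of user/non-user pairs in which the user has the higher predicted probability."""
    users = [p for y, p in zip(y_true, p_hat) if y == 1]
    non_users = [p for y, p in zip(y_true, p_hat) if y == 0]
    if not users or not non_users:
        raise ValueError("Both outcome categories must be present.")
    concordant = 0.0
    for p_user in users:
        for p_non_user in non_users:
            if p_user > p_non_user:
                concordant += 1.0
            elif p_user == p_non_user:
                concordant += 0.5  # ties count as half
    return concordant / (len(users) * len(non_users))


# Hypothetical outcomes (1 = user) and predicted probabilities of use.
y = [1, 1, 0, 0]
p = [0.9, 0.4, 0.6, 0.2]

print(percentage_correct(y, p))  # 50.0
print(c_statistic(y, p))         # 0.75
```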

2.4.7 Selection of independent variables

The application of logistic regression to test causal explanations follows the hypothetico-deductive method of scientific inquiry. In this method, the selection of independent variables must be explained based on extant theory, earlier research, or preliminary evidence of the plausibility of a relationship with the dependent variable. We assessed whether such an explanation was given, and additionally noted if the selection was based on statistical significance in a prior analysis (e.g., bivariate chi-square test for categorical variables).

2.4.8 Coding of independent variables

The interpretation of an estimated logistic regression model depends on an appropriate description of the coding of each independent variable. We checked whether articles provided that coding through units of measurement and the possible numerical values.

2.4.9 Fitting procedure

The procedure for entering independent variables into the logistic regression model should explicitly be stated because automated procedures can result in unstable models and affect the interpretation of statistical significance [32]. For the articles that provided such a statement, we coded whether (1) the model was collectively specified a priori, (2) successive models were manually built by adding variables structured in blocks, or (3) an automated procedure was used (e.g., forward inclusion, backward exclusion).

3 Results

3.1 Article selection

Figure 1 presents the selection process for peer-reviewed articles published in journals and conference proceedings, adapted from the Preferred Reporting Items for Systematic Reviews and Meta-Analyses statement [33]. We retrieved 361 records through database searching. Additionally, we considered four articles known to us because they dealt with the explanation of internet use by older adults using logistic regression; these articles were indexed in Scopus or PubMed, but limitations in their records prevented them from being identified automatically. After the removal of duplicates, we screened 281 articles based on the title, abstract, and list of keywords. Of the 63 articles that were relevant for the full-text assessment, 36 articles met the criteria for inclusion.

Fig. 1 Flow diagram of article selection

Table 1 provides an overview of the included articles. All but two articles report on the application of binomial logistic regression to examine general or specific internet use. The two remaining articles apply multinomial logistic regression to examine three different types of users. Six samples included more than 5000 participants, and five samples comprised fewer than 500 participants. Note that sample size refers to the regression analysis and is often smaller than the sample of the survey (due to missing values). The number of independent variables ranged between 3 and 22.

Table 1 Overview of the articles included in the analysis (N = 36)

Table 2 shows the subject areas of journals taken from the Scopus database, indicating that the most frequent areas were health informatics (12), geriatrics and gerontology (5), public health (5), and communication (3). With respect to their theory component, we categorized 16 articles as strong, of which eleven referred to specific theories or perspectives, and five described mechanisms of digital inequality. We categorized 20 articles as having a weak theory component to justify the regression model.

Table 2 Subject areas and categorization of theory components (N = 36)

3.2 Analytic criteria

Table 3 reports the results of five analytic criteria. The number of events per independent variable was greater than ten in 30 articles (83%), which included eight articles that did not allow determining the exact ratio but fulfilled this criterion.

Table 3 Fulfillment of analytic criteria except goodness-of-fit (N = 36)

Fifteen articles reported on regression models that included no continuous independent variable (42%); the remaining articles did not discuss the conformity with linear gradient.

The interactions criterion was fulfilled for the 16 articles that provided a theoretical justification for the tested regression model. Two articles actually tested theoretically underpinned moderated relationships [55, 69]. However, none of the articles categorized as having a weak theory component reported on the testing for interactions.

The collinearity problem was addressed in eleven articles (31%), with seven articles providing information about correlations, two articles discussing variance inflation factors, and two articles reporting both measures. All but seven articles provided both point and interval estimates for each independent variable (81%). The most frequently used measure of effect size was the odds ratio (n = 33), followed by the coefficient of the log-odds (n = 8) and the average marginal effect (n = 1). Exact p-values for the independent variables were available for 17 articles.

Table 4 presents the results of the goodness-of-fit criterion. Overall, nineteen articles reported at least one GOF measure (53%). Fifteen articles provided one or more pseudo R-squared measures. Two articles reported the concordance statistic (c-statistic), which indicates how well the model classifies users and non-users. While the c-statistic can range between 0.5 for a poor model and 1.0 for a perfect model, the observed values were 0.61 [38] and 0.82 [44], respectively. Percentage correct as an aggregate measure of classification success was available in five articles. No article mentioned the adoption of regression diagnostics, such as plotting of residuals.

Table 4 Fulfillment of the goodness-of-fit criterion (N = 36)

3.3 Documentation criteria

Every article provided an explanation for the selection of independent variables and described their coding, as shown in Table 5. In nine studies, variable selection was contingent upon the p values obtained in a prior test; furthermore, one study only retained variables that were statistically significant in the multivariable analysis. Twenty-nine articles included an explicit statement of the fitting procedure used (81%). The most frequently used procedure was a priori specification of the full model (n = 18), followed by manual stepwise regression in blocks (n = 8), and automated model fitting (n = 3).

Table 5 Fulfillment of documentation criteria (N = 36)

4 Discussion

4.1 Key findings and implications for methodology

This review contributes to the literature by providing comprehensive insights into the state of adoption of logistic regression to explain older adults’ internet use and thus complements previous findings of heterogeneity in the conceptualization and measurement of explanatory variables and internet use [70]. We analyzed thirty-six articles that applied logistic regression to examine older adults’ internet use. With respect to nine quality criteria commonly recommended for the reporting of logistic regression analysis, fulfillment ranged between four and nine, with half of the articles addressing at least six criteria. Three articles met eight, and one article [69] met nine criteria. Based on the primary subject areas in Scopus, we found no statistically significant difference in the quality scores between articles published in medical journals (n = 24, M = 6.13, SD = 1.42) and non-medical journals (n = 8, M = 6.25, SD = 0.71), using a Mann–Whitney U-test (U = 87.0, p = .717). Similarly, differences in each of the nine criteria were marginal (using Fisher’s exact test). Moreover, the total fulfillment did not increase over the period of eleven years (r = −0.06, p = .752, n = 36, Spearman correlation test). We observed the lowest reporting rates for testing for collinearity (31%), conformity with linear gradient (42%), interactions (44%), and goodness-of-fit (53%). Fulfillment was higher for fitting procedure (81%), main estimates (81%), and sufficient events per independent variable (83%), and complete for the selection and coding of independent variables.

It is unfortunate that the collinearity criterion was addressed in fewer than one-third of the articles. This finding is particularly worrisome for studies assessing the role of multiple health-related variables. Such variables might be strongly correlated due to the coexistence of chronic conditions among older adults (multi-morbidity) and the co-occurrence of physiological or psychological conditions with a primary condition (comorbidity) [71]. For instance, one study tested the roles of loneliness, social participation, number of friends, and health-related quality of life [66]. Similarly, another study included depression, loneliness, social network, number of chronic diseases, and health-related quality of life [58]. Given that collinearity can easily be determined using correlation analysis and variance inflation factors, articles should indicate the computed metrics and applied thresholds. Although presenting a correlation matrix would allow the reader to inspect all bivariate relationships in great detail [37], the reporting can effectively be limited to the highest correlations [62, 69]. In a similar vein, the computation of variance inflation factors can be summarized in a single sentence that reports the lowest and highest values and states whether the values are below a critical threshold [56].

The criterion for conformity with linear gradient was addressed in 42% of the articles by leaving the continuous independent variables out of the regression model. For the remaining articles, the estimates must be interpreted as every one-unit change in the independent variable having the same effect on the log-odds of internet use. However, as the assumption has not been covered in the reporting, the effect might be different for the range of possible values of the independent variable. This assumption attains particular importance due to many factors that are measured on a continuous scale in surveys of older adults. Studies test the role of psychometric constructs, such as loneliness [58] and trust [64], whereas their alternate transformation into dichotomous or ordinal scales may lose relevant information. In addition, continuous sociodemographic variables, such as age and number of chronic conditions, are frequently used; again, transformation into lower levels of measurement would reduce information and might undermine the model’s explanatory power.

The fulfillment of the interactions criterion (44%) was exclusively due to the articles that provided a theoretical justification of the tested regression model. However, we note that 20 articles lacked such a justification and also did not report on the testing for interactions. This testing would allow detecting whether estimated associations hold true for all subgroups of older adults studied and thus should become a standard procedure in studies that have an exploratory component. If no interactions are found, we recommend summarizing the results in a single sentence within the method section, such as “we tested for interactions between independent variables, but none of the interaction terms was statistically associated with the dependent variable, and thus none was retained in the regression model.”

Our review shows that the reporting rate of goodness-of-fit measures was low at 53%. In other words, for almost one-half of the articles, we do not know whether the proposed model fits the data from which it was estimated; hence, the inferences drawn from the model might be improper. Fifteen articles reported pseudo R-squared measures, but these measures are different from R-squared measures for linear regression and thus cannot be interpreted as the share of variance in the log-odds of internet usage explained by the model [18]. In five of these fifteen articles, pseudo R-squared was incorrectly interpreted as “percentage of the variance” explained. The remaining ten articles provided no interpretation but simply reported the measure. A more informative and intuitive measure is percentage correct, which indicates the percentage of observations (here: older adults) correctly mapped onto the categorical outcome (e.g., user and non-user classes). For instance, in a study by Quittschalle et al. [58], percentage correct was 73.8% and thus much greater than the size of the majority class (non-users accounted for 58.2%). Therefore, this measure signifies model fit and provides insights into the model’s explanatory power. Additionally, percentage correct can be differentiated for each class and therefore describe how accurate the model is in identifying users vis-à-vis non-users [54]. Classification performance can also be determined using the c-statistic and visualized in a receiver operating characteristic (ROC) curve, as demonstrated in a recent article [72]. This article reports statistical tests, pseudo R-squared, and discrimination statistics to paint a comprehensive picture of model fit. We recommend reporting at least one measure of classification performance.
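Why a pseudo R-squared is not a share of explained variance becomes apparent from how such a measure is computed. The following Python sketch uses McFadden's variant as one example, which compares the log-likelihood of the fitted model with that of an intercept-only model; the reviewed articles may have used other variants. The function names and data are hypothetical. The sketch also computes the size of the majority class, which is the baseline against which percentage correct should be judged.

```python
import math


def log_likelihood(y_true, p_hat):
    """Bernoulli log-likelihood of observed outcomes given predicted probabilities."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for y, p in zip(y_true, p_hat))


def mcfadden_r2(y_true, p_hat):
    """1 minus the ratio of model log-likelihood to intercept-only log-likelihood."""
    base_rate = sum(y_true) / len(y_true)  # both outcome categories must be present
    ll_null = log_likelihood(y_true, [base_rate] * len(y_true))
    ll_model = log_likelihood(y_true, p_hat)
    return 1.0 - ll_model / ll_null


def majority_class_share(y_true):
    """Percentage correct achieved by always predicting the more frequent category."""
    share = sum(y_true) / len(y_true)
    return 100.0 * max(share, 1.0 - share)


# Hypothetical outcomes (1 = user) and predicted probabilities of use.
y = [1, 1, 1, 0, 0]
p = [0.8, 0.7, 0.6, 0.4, 0.2]

print(round(mcfadden_r2(y, p), 3))
print(majority_class_share(y))  # 60.0
```

The measure is a ratio of log-likelihoods, so a value of, say, 0.3 describes an improvement in likelihood over the intercept-only model rather than 30% of variance explained.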

We note that the reporting of fitting procedures (81%) was considerably greater than in previous reviews examining the medical literature [73, 74]. Moreover, the higher prevalence of manual fitting compared to automated procedures is consistent with the hypothetico-deductive method in the enquiry of older adults’ internet use. In addition, most studies analyzed samples that included sufficient events per independent variable (83%), although the exact ratio could not be determined for some articles.

With respect to main estimates, most articles reported both point and interval estimates (81%) by displaying odds ratios, coefficients, or both, along with confidence intervals or standard errors. The interpretation of odds ratios and coefficients in logistic regression models must not be confused with a change in the probability of internet use, because the estimates do not represent the relative risk of belonging to the user class. This difficulty in interpreting odds ratios might explain why many articles simply reported the OR (or coefficient) without interpreting the strength of the relationship. A richer interpretation is made possible by calculating average marginal effects (AME), which represent the average change in the probability of internet use for a one-unit increase in the independent variable, as reported in the article by Anderberg et al. [35].
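For a continuous independent variable, a common way to obtain the average marginal effect is to average, over all observations, the derivative of the predicted probability with respect to that variable, which for a logistic model equals the coefficient times p(1 − p). This derivative approximates the change in probability for a one-unit increase. The following Python sketch applies this derivative-based definition; the function names, coefficients, and data are hypothetical and do not reproduce the analysis in [35].

```python
import math


def predicted_probability(intercept, coefficients, row):
    """Predicted probability of internet use for one observation."""
    linear = intercept + sum(b * v for b, v in zip(coefficients, row))
    return 1.0 / (1.0 + math.exp(-linear))


def average_marginal_effect(intercept, coefficients, rows, index):
    """Mean over observations of coefficient * p * (1 - p) for the variable at `index`."""
    effects = []
    for row in rows:
        p = predicted_probability(intercept, coefficients, row)
        effects.append(coefficients[index] * p * (1.0 - p))
    return sum(effects) / len(effects)


# Hypothetical model: intercept, then coefficients for age (years) and education (years).
intercept = 4.0
coefficients = [-0.08, 0.25]
rows = [[66, 12], [72, 9], [80, 10], [85, 8]]

ame_education = average_marginal_effect(intercept, coefficients, rows, index=1)
print(round(ame_education, 3))
```

Unlike an odds ratio, the result is expressed on the probability scale, so it can be read as an average change in the probability of use per additional unit of the variable.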

In light of the reliance on statistical significance of main estimates, it is notable that almost every second article omitted exact p-values and instead used somewhat arbitrary thresholds. This practice is problematic as it might conceal differences between extremely small values and values that are very near the threshold. Moreover, p-values are affected by the sample size so that even minuscule relationships of no practical importance will become "statistically significant" in very large samples [75]. Indeed, in four of the articles that only reported thresholds, models were estimated from extremely large samples with more than ten thousand observations. We suggest interpreting p-values in consideration of contextual factors, including a meaningful strength of relationships for the population of older adults, the width of confidence intervals, and sample size, rather than deriving scientific conclusions based on the passing of a specific threshold [76].
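The dependence of p-values on sample size can be made concrete with a Wald test. The following Python sketch holds a small coefficient fixed and assumes, for illustration, that the standard error shrinks in proportion to one over the square root of the sample size; the constant 2.0, the coefficient, and the function names are hypothetical.

```python
import math


def wald_p_value(coefficient, standard_error):
    """Two-sided p-value of a Wald z-test under a standard normal approximation."""
    z = coefficient / standard_error
    return math.erfc(abs(z) / math.sqrt(2.0))


def assumed_standard_error(n, scale=2.0):
    """Illustrative assumption: the standard error equals scale / sqrt(n)."""
    return scale / math.sqrt(n)


coefficient = 0.05  # hypothetical, very weak relationship on the log-odds scale

for n in (500, 5000, 50000):
    p_value = wald_p_value(coefficient, assumed_standard_error(n))
    print(n, p_value)
```

Under these assumptions, the same weak relationship is far from any conventional threshold at n = 500 but passes it easily at n = 50,000, although its practical importance is unchanged.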

It is noteworthy that all articles provided an explanation of the initial selection of independent variables based on the literature and described the coding of these variables. The latter finding is different from the medical literature, which often devises complex and intricate measurements but lacks a complete description of the coding, with fulfillment of this criterion below twenty percent [24, 73, 74]. On the other hand, the possible exclusion of independent variables that were not statistically associated in a prior analysis is contradictory to the hypothetico-deductive method. If the excluded variable operationalizes a theoretical factor, it should be retained in the model and the result should be related back to the factor in the context of older adults’ internet use. Likewise, the factor might represent an important control variable that influences the strength of relationship of another factor [77, 78].

4.2 Implications for theory and practice

The results of our review also have implications beyond methodology. First, almost one half of the articles exclusively examined general internet use. This limitation even holds true for six of the ten articles published in 2020. Given that older adults’ actual behavior has greatly changed in the past [79, 80], future research is required to examine more differentiated purposes of use, including purposes not related to health, such as leisure, entertainment, education, and daily errands. These endeavors can help understand the continuum of older adults’ internet use by illuminating non-use in specific areas. Second, our findings on different roles of theory in justifying logistic regression models suggest that explicit theory still plays a second-tier role in that literature. With fewer than one-third of the articles grounding their propositions in a specific theory, we suggest enhanced efforts to adopt and test theories already used for the wider study of digital inequality rather than rely on prior empirical findings that are bound to the target group of older adults. Third, the articles included in our review also exhibit considerable diversity in the number and range of factors tested, which then leads to a broad set of regression models. This finding points to the need for theoretical integration and consolidation to advance theory and maintain parsimony of theoretical models. As an initial step toward such integration, we recommend collating and synthesizing research evidence through systematic reviews. A standard element of such reviews is the assessment of risk of bias based on methodological criteria. With respect to survey research, criteria have been proposed for various fields, such as health care [81] and information systems [82]. The reporting criteria adopted in the current review can complement the methodological criteria. Fourth, when practitioners interpret results of empirical articles to devise digital interventions, training programs, and policy guidelines, we advise them to consider the strength of evidence, which is at least partly reflected in the accurate reporting of logistic regression models.

4.3 Limitations

The limitations of our review should be noted. Our review was restricted to the information available from the articles; thus, it is possible that researchers have performed additional analyses, including the testing of assumptions, and considered the obtained results in their model development. Space constraints of journals and proceedings might have hindered a complete and more elaborate reporting of all the procedures required for conducting logistic regression analysis. Because the articles included in the review originate from multiple disciplines and have been published in various journals and proceedings, the researchers might have adopted different guidelines, which could explain some part of the variance in the fulfillment of quality criteria.

5 Conclusion

This review examined the reporting quality in the application of logistic regression for explaining older adults’ internet use. We found the most substantial shortcomings for four of nine quality criteria: First, the lack of testing for collinearity can put the reported statistical associations of independent variables into question. Second, the absence of information on conformity with the linear gradient might raise doubts about the stability of reported effect sizes. Third, the low consideration of interactions might obscure that the observed effect varies for certain subgroups of older adults. Fourth, the shortage of information on the goodness of fit of models incurs the risk that a different model exists that fits the data better and thus exhibits greater explanatory power. The results point to the need for increased effort from authors to verify the assumptions underlying logistic regression analysis and to assess how accurately the models explain older adults’ internet use. While our study provides evidence for deficits, we also submit specific recommendations on how authors can improve the execution and reporting of logistic regression analysis for the explanation of older adults’ internet use. The identified deficits have important implications for the interpretation of articles and the accumulation of knowledge. It would be disadvantageous if digital interventions, training programs, and policy guidelines for greater digital inclusion of older adults were founded on empirical results that had not been rigorously validated. Therefore, our findings also highlight the need for heightened scrutiny by reviewers and editors. The quality criteria for the reporting are well established in the statistical literature and can readily assist in assessing the reporting of manuscripts under review. Editorial boards are advised to put more emphasis on a comprehensive and accurate description of the logistic regression analysis in the method and results sections of the articles. To ensure that review teams have sufficient methodological competence to verify the quality of logistic regression reports, the addition of independent statisticians or similarly qualified reviewers should be considered. Taken together, we believe that the proposed recommendations for authors, reviewers, and editors would not only enhance the validity, uniformity, and comparability of articles but also facilitate the integration of evidence to advance digital inequality theory.
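
To illustrate how little effort the first of these checks can require, the following minimal sketch computes a variance inflation factor (VIF) for a pair of independent variables. It is an illustration under stated assumptions, not a procedure taken from the reviewed articles: the variable names and the data for eight respondents are hypothetical, and the shortcut VIF = 1 / (1 - r^2) holds only for a model with exactly two independent variables. With more variables, each one would have to be regressed on all others. A VIF above 10 is a commonly cited rule of thumb for problematic collinearity, although stricter thresholds are also in use.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def vif_two_predictors(x, y):
    """VIF for a model with exactly two independent variables.

    With two predictors, the R^2 from regressing one on the other equals
    the squared Pearson correlation, so VIF = 1 / (1 - r^2).
    """
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical data for eight respondents (assumed for illustration only).
age = [66, 68, 71, 73, 75, 78, 80, 84]
years_retired = [1, 4, 5, 9, 10, 12, 16, 18]
education_years = [12, 16, 9, 13, 10, 17, 11, 14]

# Age and years since retirement move together, so their VIF is high.
print("age vs. years retired:", round(vifs := vif_two_predictors(age, years_retired), 1))
# Age and education are nearly unrelated here, so their VIF is close to 1.
print("age vs. education:", round(vif_two_predictors(age, education_years), 2))
```

In this hypothetical sample, entering both age and years since retirement into the same model would make their individual coefficients unstable, whereas age and education could be entered together without concern. Reporting such a diagnostic alongside the model would address the collinearity criterion.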