The development of smartphones and 5G technology has made Internet access easy and has changed people’s lives in China, for example in online payment and consumer behaviour. The Internet has become an important part of daily life. The 47th report from the China Internet Network Information Center [1] indicated that, as of December 2020, there were 989 million Internet users in China, who spent 26.2 hours online per week; 17.1% of users were under the age of 20. Most of them (99.7%) used a smartphone to access the Internet. Game Apps were the largest of the top four App categories in the market, accounting for 25.7% of all Apps. The adverse effects of Internet overuse are evident, including poor academic performance and psychological and physical health problems [2,3,4,5,6]. In China, Internet overuse has become a public health concern, especially among college students [7, 8]. Several terms have been used to describe maladaptive Internet use, including “Internet addiction, Internet addicts, Pathological Internet use, Internet Addiction Disorder, Problematic Internet use, maladaptive patterns of Internet use, computer-mediated communication addicts, computer junkies, etc.” [9,10,11,12,13]. In this study, the term “Pathological Internet use” (PIU) was used to describe an inability to control Internet use that in turn leads to physical, psychological and social problems and affects an individual’s social functioning and daily life [10, 11, 14].

The prevalence of PIU among college students varied across countries, ranging from 3.2 to 43% [15,16,17,18,19,20,21]. Apart from sample differences, inconsistent measurement instruments and cut-off points might contribute to the large discrepancy in PIU prevalence rates [15]. A review of existing measurement tools for Internet addiction found 45 tools for measuring PIU, most of which were not well validated [22]. A valid assessment tool is important in both clinical and research settings. Exploring the psychometric properties of existing tools in diverse cultures and age groups was deemed more efficient than developing a new scale [14, 22, 23].

Young’s Internet Addiction Test (IAT) was found to be the most validated and most frequently used instrument across studies in different countries [15, 22]. It has been validated in 17 languages, including English, Chinese, Italian, Greek, Korean, Thai, French, Turkish and Malay [14, 22, 24, 25]. It is also one of the instruments most frequently used to examine the prevalence of PIU in China [15]. Factor analyses of its construct validity have yielded varied results, with one- to six-factor models reported [20, 22, 26,27,28,29,30,31,32,33]. Previous validation studies of bilingual versions of the IAT found some problematic items, such as IAT7 and IAT11 [31] and IAT3 and IAT9 [34]. The wording or translation of some items should be updated or reformulated [22]. This study aimed to rephrase the Chinese version of the IAT and examine its item-level psychometric properties in a sample of college students, in order to improve the construct quality of the IAT in the Chinese context. The effects of socio-demographic and Internet use factors on the IAT were also examined after controlling for differential item functioning (DIF).

Methods

Participants and procedure

This study was carried out in two phases, which used different samples of three-year college students in Zhejiang, China. In the first phase, a total of 384 students from Hangzhou Vocational & Technical College answered the questionnaire so that the validity of the IAT items could be examined. There were 208 males and 140 females, with a mean age of 18.34 years (SD = 0.76); 184 students were the only child in their family (Table 1).

Table 1 Characteristics of 1st phase study sample

In the second phase, data were collected from four colleges (Zhejiang Institute of Mechanical & Electrical Engineering, Wenzhou Vocational College of Science & Technology, Hangzhou Vocational & Technical College, and Zhejiang Yuying College of Vocational and Technology). As shown in Table 2, a total of 1131 students participated in the 2nd phase study, of whom 598 were male and 533 were female. There were 408 students in the 1st year, 488 in the 2nd year and 235 in the 3rd year. The numbers of respondents from the four major fields were roughly comparable (344 from art, humanity and social science, 238 from science, 229 from engineering, and 320 from others). Students were divided into five Internet use groups according to their responses on their favorite online activity. Those who rated MMORPGs as their favorite activity were deemed MMORPG users (n = 229), those who rated cellphone games as their favorite activity were cellphone game users (n = 158), and those who chose SNS as their favorite activity were SNS users (n = 422). Those who generally tried various online activities and had no favorite activity were deemed general users (n = 179). The other users (n = 143) were those whose favorite Internet activity was neither SNS nor games, such as online searching, shopping, video or gambling.

Table 2 Characteristics of 2nd phase study sample

Measure

The questionnaire used in this study comprised two parts. The first part collected basic information on the college students, including gender, major field, time spent online, and years of Internet use experience. The second part was the Internet Addiction Test (IAT), a 20-item self-report instrument that measures an individual’s Internet use in terms of psychological symptoms and behaviors, such as psychological dependence, compulsive use and withdrawal, and problems with school, sleep, family and time management. It was developed based on Young’s YDQ [13, 14]. The original English version of the IAT was translated into Chinese using translation and back-translation procedures. Phrases were modified to fit the current Internet use situation and the sample background: in item 6, “grades/coursework/study” replaced the word “work”, and in item 7, “email” was changed to “online instant message (e.g. qq, wechat)”. The first version of the IAT was scored on a 5-point Likert scale: 1 for rarely, 2 for occasionally, 3 for frequently, 4 for often, 5 for always. The scoring was later modified: in Young and Nabuco de Abreu’s latest book, “Internet Addiction: A Handbook and Guide to Evaluation and Treatment”, the items are rated on a 6-point scale according to participants’ experience of their Internet use: 0 for not applicable, 1 for rarely, 2 for occasionally, 3 for frequently, 4 for often, 5 for always. The cut-off range for severe Internet addiction was 70–100 under the 5-point scoring and 80–100 under the 6-point scoring. This study adopted the latest scoring method (the 6-point rating scale) for the IAT items.
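The 6-point scoring described above can be illustrated with a short sketch. It assumes that the total score is the simple sum of the 20 item ratings (0 to 5 each, giving a range of 0 to 100) and that the severe range starts at 80, as stated in the text. The function and constant names are hypothetical and are not part of the original instrument.

```python
from typing import Sequence

N_ITEMS = 20          # the IAT has 20 items
MIN_RATING = 0        # 0 = not applicable
MAX_RATING = 5        # 5 = always
SEVERE_CUTOFF = 80    # severe range is 80-100 under the 6-point scoring


def iat_total(responses: Sequence[int]) -> int:
    """Return the total IAT score, assumed here to be the sum of item ratings."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    for rating in responses:
        if not MIN_RATING <= rating <= MAX_RATING:
            raise ValueError(f"rating {rating} is outside {MIN_RATING}-{MAX_RATING}")
    return sum(responses)


def is_severe(responses: Sequence[int]) -> bool:
    """Return True if the total score falls in the severe range (80-100)."""
    return iat_total(responses) >= SEVERE_CUTOFF
```

For example, a respondent who answers "often" (4) to every item would have a total of 80 and fall at the lower bound of the severe range.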

Statistical analyses

In the 1st phase study, Rasch model analysis was first applied to examine the unidimensionality assumption, rating scale properties, item fit and reliability, using Winsteps version 3.75.0. Principal components analysis (PCA) of residuals was used to test unidimensionality: the raw variance explained by the measures should exceed 40%, and the unexplained variance in the 1st contrast should have an eigenvalue below 2 [35]. The category structure was tested to examine the monotonic ordering of the 6-category rating scale. The mean-square (MNSQ) INFIT and OUTFIT statistics were the indices of item fit; values between 0.5 and 1.5 are deemed productive for measurement [36]. The separation coefficient is a signal-to-noise index: the ratio of the “true” standard deviation to the error standard deviation, that is, the square root of the ratio of “true” variance to error variance. Person reliability is equivalent to KR-20 or Cronbach’s alpha coefficient, and item reliability is equivalent to construct validity [37]. Second, exploratory factor analysis (EFA) was conducted in Mplus version 6 with the WLSMV estimator to identify the factor structure of the IAT [38].
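The link between the separation coefficient and reliability can be made concrete with a small sketch. It assumes the standard Rasch relationship R = G² / (1 + G²), where G is the separation coefficient and R the corresponding reliability. The function names are illustrative and do not correspond to any Winsteps output field.

```python
import math


def reliability_from_separation(separation: float) -> float:
    """Convert a separation coefficient G into reliability R = G^2 / (1 + G^2)."""
    if separation < 0:
        raise ValueError("separation must be non-negative")
    return separation ** 2 / (1 + separation ** 2)


def separation_from_reliability(reliability: float) -> float:
    """Convert reliability R into separation G = sqrt(R / (1 - R))."""
    if not 0 <= reliability < 1:
        raise ValueError("reliability must be in [0, 1)")
    return math.sqrt(reliability / (1 - reliability))
```

Under this relationship, a separation of 2 corresponds to a reliability of 0.8 and a separation of 3 to a reliability of 0.9, which is why larger separation values are read as evidence of a more reliable measure.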

In the 2nd phase study, the factor structure of the IAT was verified by confirmatory factor analysis (CFA). Differential item functioning (DIF) and the effects of covariates on the IAT latent factors were examined with a multiple indicators multiple causes (MIMIC) model. The covariates in the MIMIC model were Internet use variables and socio-demographic variables (Table 1). The Internet use variables included years of Internet use experience (M = 11.31, SD = 2.72), time spent online per day (M = 5.66 h, SD = 2.82) and favorite Internet activity (general users as the reference group). The socio-demographic variables were age (M = 20.05 years, SD = 2.43), programme year (3rd year as the reference group), gender (male as the reference group) and major (art, humanity and social science as the reference group).
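For readers less familiar with MIMIC models, the general structure can be written as two equations. This is a generic textbook formulation in illustrative notation, not the exact specification estimated in this study; all symbols are assumptions introduced here.

```latex
% Generic MIMIC model with direct (DIF) effects; notation is illustrative.
% y_j^*     : latent response underlying ordinal item j
% \eta_k    : latent factor k;  k(j) is the factor on which item j loads
% \mathbf{x}: vector of covariates (e.g. gender, time spent online)
\begin{align}
  y_j^{*} &= \lambda_j \, \eta_{k(j)}
             + \boldsymbol{\delta}_j^{\top} \mathbf{x}
             + \varepsilon_j
  && \text{(measurement part)} \\
  \eta_k  &= \boldsymbol{\gamma}_k^{\top} \mathbf{x} + \zeta_k
  && \text{(structural part)}
\end{align}
```

In this formulation, the coefficients in γ capture the effects of covariates on the latent factors, while a non-zero element of δ for item j represents uniform DIF: a direct effect of a covariate on the item that is not explained by the latent factor. A model without DIF fixes all δ to zero.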

Mplus reports a number of model fit indices. This study used the RMSEA, CFI, TLI and SRMR to evaluate model fit [39]. For the Root Mean Square Error of Approximation (RMSEA), values below 0.05 indicate good fit and values between 0.05 and 0.08 indicate acceptable fit. For the Standardized Root Mean Square Residual (SRMR), values between 0 and 0.05 indicate good fit and values between 0.05 and 0.10 indicate acceptable fit [39]. For the Comparative Fit Index (CFI), values above 0.95 were considered good fit and values above 0.90 acceptable fit [40]. For the Tucker-Lewis Index (TLI), also known as the Nonnormed Fit Index (NNFI), values above 0.90 were considered acceptable fit and values above 0.95 good fit [40].
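The cut-offs above can be expressed as a small set of helper functions. This is only a sketch of the decision rules as stated in the text. The label "poor" for values outside the acceptable range and the handling of values lying exactly on a boundary are assumptions made here, and the function names are hypothetical.

```python
def classify_rmsea(value: float) -> str:
    """RMSEA: below 0.05 is good, 0.05 up to 0.08 is acceptable."""
    if value < 0.05:
        return "good"
    if value < 0.08:
        return "acceptable"
    return "poor"  # label assumed; the text only defines good and acceptable


def classify_srmr(value: float) -> str:
    """SRMR: 0 to 0.05 is good, above 0.05 up to 0.10 is acceptable."""
    if value <= 0.05:
        return "good"
    if value <= 0.10:
        return "acceptable"
    return "poor"


def classify_cfi_tli(value: float) -> str:
    """CFI and TLI: above 0.95 is good, above 0.90 is acceptable."""
    if value > 0.95:
        return "good"
    if value > 0.90:
        return "acceptable"
    return "poor"


def classify_fit(rmsea: float, srmr: float, cfi: float, tli: float) -> dict:
    """Classify all four indices at once and return a label per index."""
    return {
        "RMSEA": classify_rmsea(rmsea),
        "SRMR": classify_srmr(srmr),
        "CFI": classify_cfi_tli(cfi),
        "TLI": classify_cfi_tli(tli),
    }
```

Classifying each index separately, rather than returning a single verdict, mirrors how mixed results are usually reported as "acceptable to good" fit.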

Results

1st phase study

The 1st phase study sample (n = 348) was used to test the item quality and validity of the IAT. Corrections may be necessary if they help the instrument meet the required psychometric properties. Rasch analysis was first used to evaluate the rating scale categories and item properties. The construct validity of the IAT was then examined by exploratory factor analysis (EFA).

The Rasch principal component analysis (PCA) of residuals in Table 3 showed that the raw variance explained by the measures was 43% and that the unexplained variance in the 1st contrast was 5.5%, with an eigenvalue of 1.9, indicating that the IAT fitted well as a unidimensional scale.

Table 3 IAT Standardized residual variance (in Eigenvalue units) (n = 348)

Evaluation of the category structure found disordered thresholds in the structure calibrations between the responses 1 (rarely) and 2 (occasionally) (Table 4). Therefore, the original 6-category rating scale was converted to a 5-category rating scale by collapsing the responses 1 (rarely) and 2 (occasionally). As shown in Table 4, the structure calibrations then increased with the category value, and the new category system performed better than the 6-category system. With the 5-category rating scale, the IAT showed good to excellent person and item separation (2.66 and 6.86, respectively) (Table 4).
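The collapsing step can be sketched as a simple recoding. The text specifies only that the responses 1 (rarely) and 2 (occasionally) were merged; the consecutive renumbering of the resulting categories from 0 to 4 is an assumption made here for illustration, and the names below are hypothetical.

```python
from typing import List, Sequence

# Original 6-category codes mapped to 5 categories.
# Codes 1 (rarely) and 2 (occasionally) are merged; the renumbering
# of the remaining categories to 0-4 is an assumption for illustration.
COLLAPSE_MAP = {0: 0, 1: 1, 2: 1, 3: 2, 4: 3, 5: 4}


def collapse_responses(responses: Sequence[int]) -> List[int]:
    """Recode 6-category responses (0-5) into the 5-category system (0-4)."""
    recoded = []
    for rating in responses:
        if rating not in COLLAPSE_MAP:
            raise ValueError(f"unexpected rating: {rating}")
        recoded.append(COLLAPSE_MAP[rating])
    return recoded
```

Applied to the full set of original codes, `collapse_responses([0, 1, 2, 3, 4, 5])` yields `[0, 1, 1, 2, 3, 4]`, so the ordering of categories is preserved while the two disordered categories share one code.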

Table 4 Summary of category structure on IAT 6- and 5- category rating scale (n = 348)

Table 5 presents the item fit statistics in misfit order. All point-measure correlations (CORR.) were positive and high, ranging from 0.41 to 0.63, and all were close to the expected correlations (EXP.). This implies that all items were aligned with the abilities of the persons. The item infit and outfit MNSQ values were close to 1 on average and ranged from 0.71 to 1.48.

Table 5 Item fit statistics of IAT in misfit order (n = 348)

As previous research has found one- to six-factor solutions for the IAT, this study fitted one- to six-factor models in Mplus. As shown in Table 6, a 3-factor model fitted better and was acceptable (χ2/df < 2, RMSEA = 0.031, SRMR = 0.037, CFI = 0.991, TLI = 0.988). All factor loadings were above 0.30 and significant, and the factors were moderately to highly correlated (r = 0.541–0.774). The loading cut-off was set low so that item loadings could be retained for further inspection in the CFA. A cross-loading was found for iat18; because its loading on factor 2 was much higher than its loading on factor 3, iat18 was assigned to factor 2. Factor 1 had five items (iat1, iat2, iat5, iat6, iat8) related to time management problems and the negative influence of Internet use on study or work. Factor 2 consisted of 11 items (iat10, iat11, iat12, iat13, iat14, iat15, iat16, iat17, iat18, iat19, iat20) that measure excessive use and the emotional conflict of Internet use. Factor 3 contained four items (iat3, iat4, iat7, iat9) related to neglect of social life due to Internet use.

Table 6 Factor loadings, factor Correlations of EFA for IAT (n = 348)

2nd phase study

The 2nd phase study was conducted on a sample of 1131 college students. It aimed to verify the structural validity of the IAT found in the 1st phase study, test for DIF in the IAT items, and examine the effects of covariates (socio-demographic and Internet use variables) on the IAT latent factors.

As shown in Table 7, the CFA model fit indices indicated acceptable to good fit (RMSEA = 0.065, CFI = 0.954, TLI = 0.948), and the factor loadings ranged from 0.487 to 0.814. The latent factors were significantly correlated with each other, with correlations ranging from 0.845 to 0.902.

Table 7 Factor loadings, factor correlation and fit indices of CFA model, MIMIC model, and MIMIC with DIF model by overall sample (n = 1131)

The MIMIC model results showed that the 3-factor model of the IAT with covariates fitted the data well (RMSEA = 0.040, CFI = 0.963, TLI = 0.957). Among the Internet use covariates, time spent online per day had a significant effect on the three latent factors (Table 8): it was positively related to Factor 1 (B = 0.078, p = 0.000, β = 0.315), Factor 2 (B = 0.080, p = 0.000, β = 0.317) and Factor 3 (B = 0.064, p = 0.000, β = 0.245).

Table 8 The impact of covariates on IAT latent factors and items

Significant group differences in the latent factor scores were found for gender, Internet use group and grade. Female users scored 0.132 SD lower than male users on Factor 2. Year 3 students had lower latent scores on all three latent factors than year 1 students (0.292 SD, 0.299 SD and 0.414 SD for factors 1, 2 and 3, respectively) and year 2 students (0.309 SD, 0.367 SD and 0.337 SD for factors 1, 2 and 3, respectively). General users had latent scores 0.246 SD and 0.275 SD lower than MMORPG users on factors 2 and 3, respectively.

Differential item functioning (DIF) was tested by inspecting the modification indices (MI), which indicate significant direct associations between covariates and IAT items in the model. As shown in Table 8, the final MIMIC model with DIF identified seven DIF effects involving six items and demonstrated good fit to the data (RMSEA = 0.040, CFI = 0.965, TLI = 0.959). Students who spent more time online were more likely to endorse lower scores on two items, IAT2 (B = −0.054, p = 0.000, β = −0.160) and IAT8 (B = −0.054, p = 0.000, β = −0.146), and to report higher scores on IAT12 (B = 0.039, p = 0.000, β = 0.103). Females were less likely than males to endorse IAT13 (B = −0.358, p = 0.000, β = −0.340). SNS users were likely to endorse higher scores on IAT12 (B = 0.333, p = 0.000, β = −0.309) and lower scores on IAT19 (B = −0.370, p = 0.000, β = −0.353). Other users tended to endorse lower scores on IAT4 (B = −0.444, p = 0.000, β = −0.434).

Comparing the MIMIC models with and without DIF in terms of the regression coefficients of the covariates on the latent factors (see Table 8), the only notable change was in the effect of being female on factor 2 (β changed from −0.132 to −0.092, from significant to non-significant). The other changes in regression coefficients were very small and did not contaminate the associations between the covariates and the three latent factors. For example, the coefficient of time spent online on factor 1 increased slightly (β changed from 0.315 to 0.388), and the coefficient of year 1 on factor 1 decreased (β changed from 0.292 to 0.283) (see Table 8).

Discussion

The objective of the 1st phase study was to examine the item quality and factor structure of the Chinese version of the IAT. The original IAT uses a 6-point rating scale. A study of a Greek version of the IAT suggested that a 3-point rating scale performed better [25]. Another study in Malaysia suggested keeping the 6-point rating scale for a bilingual version of the IAT (English and Malay) [41]. The Rasch analysis in this study found disordered thresholds in the 6-category rating scale, which suggested collapsing the responses 1 (rarely) and 2 (occasionally). The resulting 5-point rating scale worked better and was applied in the 2nd phase study. The unidimensional structure of the IAT was confirmed in this study, consistent with previous research [25, 41]. No item showed severe misfit, which implies that the items were productive for measurement. Overall, the good to excellent person and item separation (2.66 and 6.86) indicated that the Chinese version of the IAT with a 5-point rating scale is a reliable instrument for measuring PIU.

A 3-factor solution for the IAT was first identified in the 1st phase study sample and then confirmed in the 2nd phase study sample. The 3-factor structure was quite similar to that found among Hong Kong university students [31] and Hong Kong adolescents [26, 27]. A possible reason is that, although those studies were conducted in a different area of China, the research samples used the same language and shared a similar culture. The major difference concerned two items (IAT7 and IAT11), which were dropped in the Hong Kong studies [27, 31] because of poor performance in EFA (e.g. low factor loadings) but were retained in this study because of their good item fit and high factor loadings. The improvement in item 7 may be related to rephrasing “email” as “online instant message (e.g. qq, wechat)” in this study, as the word “email” may be associated with work, as found in the researcher’s previous study in Malaysia [34]. Consistent with most studies, no problem was found with IAT11 in this research. The difference may arise because Chang and Law (2008) set a higher cut-off for factor loadings (> 0.4) [31], whereas other researchers usually set a lower criterion (> 0.3) at the preliminary stage or in EFA so that relevant items could be included, as in studies of Greek adolescents [42], Italian adults [43] and Thai university students [24]. A number of other influences may also contribute to the variation, such as translation, sample, culture and data analysis method.

The MIMIC model in the 2nd phase study found significant DIF in six IAT items (IAT2, IAT4, IAT8, IAT12, IAT13, IAT19). Examining the effect of DIF on the IAT latent factors showed that only one item (IAT13, “snap, yell, or act annoyed if someone bothers you while you are online”), which loaded on factor 2 (excessive use and emotional conflict of Internet use), produced measurement bias by gender. The significant gender difference no longer existed once the DIF effect was corrected, which implies that DIF was the main reason for the gender difference on factor 2. This result was inconsistent with the study in Malaysia [34], which found that IAT14 showed DIF by gender but did not contaminate any latent factor scores of the IAT. It seems that males tended to be more sensitive on IAT13 when they experienced emotional symptoms of Internet use. Females in China may show fewer observable emotional symptoms related to Internet use. Comparing the MIMIC models with and without DIF indicated that the magnitude of DIF for the other five items was very limited and that its effect on the IAT latent factor scores was negligible. Deleting the item is not suggested, as the effect is limited to one latent factor of the IAT and the item is important for measuring the emotional symptoms of Internet overuse. DIF may be related to translation or culture. In addition, as this is the first study to validate the Chinese version of the IAT at the item level, relevant evidence in the Chinese context is scarce. Modification of the translation or wording of IAT13 may be necessary to control measurement bias by gender.

In this study, the covariates (socio-demographic and Internet use variables) with significant effects on the three latent factors of the IAT were time spent online, year 1, year 2 and general users. Time spent online was a significant predictor of all three IAT latent factors, which implies that students who spent more time online could experience a higher level of PIU symptoms. This result was consistent with most previous findings of a close relationship between duration of Internet use and PIU [34, 44,45,46,47]. This study found that college students spent 5.66 h (SD = 2.82) online per day. Comparison with past research in China suggests that daily time spent on the Internet is increasing among college and university students [48]. The popularity of smartphones may play a role in this increase, as smartphones make it easy to access the Internet. Students with PIU tended to spend more time online than those without PIU [49]. Findings on the impact of first using the Internet at an early age are inconsistent. Some studies found that Internet use experience and age at first Internet use were related to the level of PIU [34, 50], while other studies did not find this relation [44]. This study did not find any significant relation between Internet use experience and the three IAT latent factor scores.

Online games were deemed more attractive than offline games [51, 52]. Tone, Zhao and Yan (2014) found that the attraction of online games was the most important factor in PIU compared with other factors (personality, life events). MMORPG users were also more likely to develop PIU than other game users [53, 54]. This study divided the Internet users into five groups (general, MMORPG, cellphone game, SNS, others) according to their self-reported favorite Internet activities. General users reported significantly lower scores than MMORPG users on factors 2 and 3 of the IAT, whereas the other three groups (cellphone game, SNS, others) did not differ significantly from MMORPG users on the three IAT latent factors. This implies that users of other Internet activities, such as SNS and cellphone games, had the same risk of PIU as MMORPG users.

This study found that students in year 3 reported significantly lower scores than students in years 1 and 2 on all three latent factors of the IAT. This result differed from studies in Jiang Su [55] and Xin Jiang [56], China, which found that students in years 2 and 3 were more vulnerable to PIU because they had less coursework and more free time to go online. The inconsistent finding on grade may be related to the samples: the students in this study were 3-year college students, whereas those in the other studies were 4- or 5-year undergraduate students. Final-year students were not included in the Jiang Su and Xin Jiang studies, which took only students in years 1, 2 and 3 as their research samples. Li, Wang, & Wang (2009) included fourth-year students and did not find any grade difference related to PIU [57]. The third-year students in this study were in the final year of their college study. They were usually concentrating on their graduation project, internship and job search, which may decrease the risk of PIU.

Conclusion and future study

A 5-point scale is better suited to the Chinese version of the IAT. The item improvements were effective: the problematic items reported in the literature performed well in this study. The overall psychometric properties of this Chinese version of the IAT were good, with a limited DIF effect in one item. That item needs adaptation to control gender bias in future studies. A larger sample size and comparable sample sizes across grades are suggested.