Study design, setting, and period
A secondary analysis was conducted using data from a community-based national cross-sectional study. The study was conducted in Ethiopia, which is situated in the Horn of Africa. The country has nine regional states (Afar, Amhara, Benishangul-Gumuz, Gambela, Harari, Oromia, Somali, Southern Nations, Nationalities, and People’s Region (SNNPR), and Tigray) and two administrative cities (Addis Ababa and Dire-Dawa).
In 2016, the population of Ethiopia was estimated at 102 million, of whom 43.47% were aged less than 14 years; the birth rate was 36.5 births per 1000 population and the fertility rate was 4.46. Ethiopia is the 13th most populous country in the world and the 2nd most populous in Africa. The country has a three-tier health system: primary health care units (primary hospitals, health centers, health posts, primary clinics, and medium clinics), secondary health care (general hospitals, specialty clinics, and specialty centers), and tertiary health care (specialized and teaching hospitals). The number of hospitals varies from region to region in response to population differences. Oromia, which is the most densely populated region, has 30 hospitals, while Amhara and SNNPR have 19 and 20, respectively. Tigray has 16 hospitals, whereas Benishangul-Gumuz has two and Gambela has only one (Fig. 1).
The data for this study were taken from the 2016 Ethiopia Demographic and Health Survey (EDHS). It is the fourth comprehensive and nationally representative survey conducted as part of the worldwide Demographic and Health Surveys (DHS) project. The EDHS 2016 data were downloaded from the DHS website (http://dhsprogram.com) after authorization was obtained. The 2016 EDHS sample was selected using a stratified two-stage cluster sampling design. In the first stage, a total of 645 enumeration areas (EAs) (202 in urban areas and 443 in rural areas) were selected with probability proportional to EA size and with independent selection in each sampling stratum.
The source population was all women of reproductive age (15–49 years), and the study population was all women aged 15–49 years in the selected EAs. A total of 14,369 women were asked about their knowledge of HIV/AIDS. Potential predictors such as education, wealth index, place of residence (urban versus rural), type of contraceptive, sex of household head, and other variables were extracted from the dataset based on data availability and previous literature. Further methodological details are available in the EDHS 2016 report.
The outcome variable was comprehensive knowledge of HIV/AIDS, defined as correct knowledge of two mechanisms to prevent HIV and rejection of three misconceptions about HIV. To assess the comprehensive HIV/AIDS knowledge of women of reproductive age, every woman was asked the following questions: (1) can using condoms prevent HIV transmission? (2) can HIV be prevented by limiting sex to one faithful uninfected partner? (3) can a person get HIV from mosquito bites? (4) can a person get HIV by sharing food with someone infected? and (5) can a healthy-looking person have HIV? Accordingly, the outcome was coded as 1 for women with comprehensive knowledge of HIV/AIDS and 0 for women without it.
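The coding of the outcome can be illustrated with a minimal sketch. The item names and the answer key below are assumptions made for illustration: they are inferred from the definition above (the two prevention items and the healthy-looking item answered "yes", the mosquito and food-sharing items answered "no") and are not the EDHS variable names. Treating missing or "don't know" answers as incorrect is also an assumption.

```python
# Hypothetical item names; the answer key is inferred from the definition in
# the text and is an assumption, not the EDHS recode specification.
ANSWER_KEY = {
    "condom_prevents": "yes",
    "one_partner_prevents": "yes",
    "mosquito_transmits": "no",
    "sharing_food_transmits": "no",
    "healthy_can_have_hiv": "yes",
}


def comprehensive_knowledge(responses):
    """Return 1 if all five items match the answer key, otherwise 0.

    `responses` maps item name -> "yes" / "no" / other. A missing item or any
    other answer (for example "don't know") is counted as incorrect.
    """
    all_correct = all(
        responses.get(item) == correct for item, correct in ANSWER_KEY.items()
    )
    return int(all_correct)


# Example: a respondent who believes mosquitoes transmit HIV is coded 0.
respondent = dict(ANSWER_KEY, mosquito_transmits="yes")
outcome = comprehensive_knowledge(respondent)
```

Under this rule a single incorrect or missing item is enough to code a woman as 0, which matches an all-or-nothing reading of the definition.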
The independent variables were drawn from socio-demographic characteristics (age, educational status, residence, region, religion, sex of household head), health care and system-related factors (method of contraception, knowledge of a place to be tested for HIV), and socioeconomic factors (media exposure such as watching TV and reading newspapers, wealth index, mobile phone ownership).
Data management and methods of analysis
After data extraction, STATA software Version 14 (StataCorp LP, College Station, Texas 77845 USA) was used for cleaning, renaming, recoding, and further analysis. Sampling weights were applied to adjust for unequal probabilities of selection and for differences in non-response. First, we examined the socio-demographic characteristics of the sample using descriptive statistics. Because the EDHS has a hierarchical, clustered structure, the data were correlated (within-cluster correlation 0.26), which requires a model that accounts for the variability due to clustering. A mixed-effect binary logistic regression model, from the broad family of generalized linear mixed models, was therefore fitted. In the bivariable analysis, variables with p-values ≤ 0.2 were selected for adjustment in the final model. Finally, in the multivariable analysis, variables with p-values less than 0.05 were considered to be significantly associated with comprehensive knowledge of HIV/AIDS.
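Two steps of this workflow can be sketched in a few lines. The text does not state how the within-cluster correlation was computed; a common choice for a random-intercept logistic model is the latent-variable intraclass correlation, ICC = σ² / (σ² + π²/3), where σ² is the cluster-level variance, and that formula is assumed here. The function names and the p-values in the example are hypothetical and are not results from the study.

```python
import math


def latent_icc(cluster_variance):
    """Latent-variable ICC for a random-intercept logistic model.

    Assumes the individual-level residual variance is pi^2 / 3, the variance
    of the standard logistic distribution.
    """
    return cluster_variance / (cluster_variance + math.pi ** 2 / 3)


def screen_candidates(bivariable_p, threshold=0.2):
    """Keep variables whose bivariable p-value is <= threshold."""
    return [name for name, p in bivariable_p.items() if p <= threshold]


# Hypothetical bivariable p-values, for illustration only.
example_p = {"education": 0.001, "religion": 0.35, "residence": 0.2}
selected = screen_candidates(example_p)
```

In this example `selected` keeps `education` and `residence`, because the threshold is inclusive, and drops `religion`. The retained variables would then enter the multivariable mixed-effect model, where the 0.05 significance level applies.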