This article describes our process and algorithm for successfully creating a reproducible client-based dataset for quantifying the degree to which individuals in Ontario have multiple records of epidemiologically linked diseases, which is a first for infectious disease surveillance in Ontario. Moreover, our methodology serves as a platform from which a client-based dataset can be produced and updated to support the ongoing analysis of any combination of diseases reported in the province that are of interest for integrated surveillance based on related epidemiology or syndemic potential. The methodology is flexible enough that additional variables collected during routine public health follow-up of cases, such as demographics and risk factors, can be added to the dataset for in-depth analyses.
In addition to overcoming initial technical data manipulation challenges, our implementation was facilitated by three factors. First, under Ontario’s Personal Health Information Protection Act 2004 (Personal Health Information Protection Act 2004), PHO is permitted to collect, use, and disclose data for the purposes described in PHO’s enabling legislation (Ontario Agency for Health Protection and Promotion Act 2007), including surveillance. Second, having an integrated database with common client identifiers made disease record linkage easier than in other jurisdictions, where siloed data sources (i.e., HIV data stored separately) have been described as a barrier (Gasner et al. 2014). Third, analysis from a single provincial database enabled a more comprehensive assessment of multiple records per client, because a client’s disease records may be associated with different PHUs in the province. Smaller jurisdictions may be able to repeat these methods to inform local planning, but may not identify the existence of multiple records per client due to small counts. Smaller jurisdictions may also miss disease records if clients frequently move across their boundaries, because later disease events are then attributed to other jurisdictions whose records they cannot access for surveillance purposes.
Beyond establishing the proof-of-concept client-based dataset, our preliminary analyses using the eight included diseases also provided important, previously undescribed insights into the intersection of sexually transmitted and bloodborne infections (STBBI) and the related infections of TB and iGAS in the province. We found that 23.1% of clients had more than one disease record, and that these clients accounted for close to half (45.1%) of all disease records. We did not find other reports describing the proportion of clients with more than one record of the diseases included in our analysis against which to compare our findings. In terms of disease types, we found that the proportion of clients with at least one other disease type ranged from 12% (chlamydia, HBV, and TB) to 63% (syphilis). These proportions are similar to previously published results of an analysis from the New York City Department of Health and Mental Hygiene (NYC DOHMH), where 11% (HBV) to 64% (syphilis) of their client records had more than one disease type. However, their dataset did not include iGAS and required deterministic matching to link records across databases (Drobnik et al. 2014). In both our analysis and the findings from NYC DOHMH, infection with two or more types of diseases was most common among those with syphilis, followed by gonorrhea (52% in NYC DOHMH) (Drobnik et al. 2014). Further analysis is required to explore these results, particularly with respect to the timing of the syphilis or gonorrhea disease events in relation to other STBBI disease types, and the implications this may have for STI prophylaxis and prevention (Molina et al. 2018; Tan et al. 2017).
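As a minimal sketch of how the client-level proportions above could be derived, the following Python example summarizes a list of disease records keyed by a common client identifier. It is an illustration under stated assumptions, not the algorithm used in this analysis: the function names, the `(client_id, disease)` record layout, and the toy data are all hypothetical and do not reflect the iPHIS schema or the actual counts.

```python
from collections import Counter, defaultdict


def multi_record_summary(records):
    """Summarize how many clients hold more than one disease record.

    `records` is a list of (client_id, disease) tuples, one per disease
    record. This layout is an assumption made for illustration only.
    """
    per_client = Counter(client_id for client_id, _ in records)
    multi_clients = [c for c, n in per_client.items() if n > 1]
    records_from_multi = sum(per_client[c] for c in multi_clients)
    return {
        # Share of clients with more than one disease record.
        "pct_clients_multi": 100 * len(multi_clients) / len(per_client),
        # Share of all disease records that belong to those clients.
        "pct_records_multi": 100 * records_from_multi / len(records),
    }


def pct_with_other_type(records):
    """For each disease, percentage of its clients with another disease type."""
    types_by_client = defaultdict(set)
    for client_id, disease in records:
        types_by_client[client_id].add(disease)
    total, with_other = Counter(), Counter()
    for types in types_by_client.values():
        for disease in types:
            total[disease] += 1
            if len(types) > 1:
                with_other[disease] += 1
    return {d: 100 * with_other[d] / total[d] for d in total}


# Hypothetical toy data: client A has three records, the others have one each.
toy_records = [
    ("A", "chlamydia"),
    ("A", "gonorrhea"),
    ("A", "syphilis"),
    ("B", "chlamydia"),
    ("C", "HCV"),
    ("D", "TB"),
]

summary = multi_record_summary(toy_records)
by_type = pct_with_other_type(toy_records)
```

In this toy example, one of four clients holds half of the six records, which mirrors in miniature how a minority of clients can account for a large share of disease records.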
We also examined the presence of three specific disease type combinations with known interactions to begin to explore how these methods can be used to support surveillance in Ontario. First, we assessed HIV/AIDS and TB and found 2.6% of TB cases also had an HIV/AIDS record, which is similar to findings in previous analyses of Ontario data (Ontario Agency for Health Protection and Promotion (Public Health Ontario) 2015) and is in line with or lower than the percentage of TB cases with HIV/AIDS co-infection in other jurisdictions (BC Centre for Disease Control 2019; Public Health Agency of Canada 2014; Rivest et al. 2014). Further analysis is required to determine the timing of TB and HIV/AIDS relative to each other, and whether initiatives to routinely screen for HIV/AIDS or TB at the time of TB or HIV/AIDS diagnosis, respectively, have led to changes in timing of identification or prevalence of overlapping infections (Public Health Agency of Canada 2014; Public Health Agency of Canada 2019).
Second, despite the overlap in modes of acquisition and transmission among HIV/AIDS, hepatitis C, and iGAS, specifically the sharing of drug use equipment, there were only 59 clients in our dataset with all three disease types. Data from the aforementioned London, Ontario, iGAS outbreak investigation indicated that 9.5% of individuals (14/147) with iGAS infections were positive for both HCV and HIV (Dickson et al. 2018). These individuals would represent 23.7% (14/59) of the clients with HCV, HIV/AIDS, and iGAS records in our dataset. Because this outbreak occurred recently (between April 1, 2016 and February 28, 2018), the combination of HIV/AIDS, HCV, and iGAS may be an emerging issue in the province, particularly among people who use injection drugs. Ongoing monitoring of this disease overlap is important to inform prevention and response efforts.
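The two percentages above can be reproduced with the short calculation below. The variable names are hypothetical, and the second figure rests on the assumption, implied but not confirmed in the text, that all 14 outbreak individuals are among the 59 clients in the dataset.

```python
# Counts taken from the text (Dickson et al. 2018 and this dataset).
outbreak_coinfected = 14   # iGAS outbreak cases positive for both HCV and HIV
outbreak_igas_cases = 147  # all iGAS cases in the outbreak investigation
clients_all_three = 59     # clients with HCV, HIV/AIDS, and iGAS records

# Share of outbreak iGAS cases that were HCV and HIV positive.
pct_of_outbreak = round(100 * outbreak_coinfected / outbreak_igas_cases, 1)

# Share of the dataset's three-disease clients these individuals would
# represent, assuming all 14 are captured among the 59 clients.
pct_of_dataset = round(100 * outbreak_coinfected / clients_all_three, 1)

print(pct_of_outbreak, pct_of_dataset)
```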
Third, we identified 1274 clients in our dataset with records of HIV/AIDS, gonorrhea, and syphilis, representing over 10% of clients with HIV or syphilis records. While HIV–syphilis co-infection and HIV–gonorrhea co-infection have been previously assessed in separate unpublished analyses in Ontario, this is the first time records of all three have been examined together at a provincial level. As described above, understanding the overlaps of STBBIs can inform prevention activities. This type of analysis also supports integrated approaches to STBBIs as described in the Pan-Canadian Framework for Action on STBBIs (Public Health Agency of Canada 2018).
While our aim is to develop novel methods of analyzing infectious disease surveillance data, these data have several limitations. As with all passive surveillance systems, only cases of diseases reported to public health and recorded in iPHIS were included in our dataset, so our dataset may underrepresent the true disease burden in Ontario. This underreporting may vary by disease due to factors such as disease awareness, health-seeking behaviours, availability of health care, severity of illness, clinical practice, methods of laboratory testing, and reporting behaviours. Laboratory testing methods and provincial case definitions for some diseases have changed over time, which may also affect the number of cases reported (Ontario Agency for Health Protection and Promotion (Public Health Ontario) 2018). Our dataset may underestimate the number of people with multiple disease records and the number of records per client if an individual was entered in duplicate as two separate clients. Individuals who test positive for HIV anonymously (Government of Ontario 2014) are entered as an anonymous client with only the HIV record attached, so that record cannot be linked to the individual’s records of other disease types. Our dataset may also overestimate the number of records per client, particularly for STIs, where multiple instances of the same infection episode were entered as separate disease records, such as follow-up positive test results being counted as a new infection. However, this is likely minimized because PHUs assess whether each positive result represents the same infection or a new one. Conversely, it is also possible that separate infections occurred and were entered into iPHIS only once.