Linkage Methods for Connecting Children with Parents in Electronic Health Record and State Public Health Insurance Data
- Cite this article as:
- Angier, H., Gold, R., Crawford, C. et al. Matern Child Health J (2014) 18: 2025. doi:10.1007/s10995-014-1453-8
The objective of this study was to develop methodologies for creating child–parent ‘links’ in two healthcare-related data sources. We linked children and parents who were patients in a network of Oregon clinics with a shared electronic health record (EHR), using data that reported the child’s emergency contact information or the ‘guarantor’ for the child’s visits. We also linked children and parents enrolled in the Oregon Health Plan (OHP; Oregon’s public health insurance programs), using administrative data; here, we defined a ‘child’ as aged <19 years and identified potential ‘parents’ from among adults sharing the same OHP household identification (ID) number. In both data sources, parents had to be 12–55 years older than the child. We used OHP individual client ID and EHR patient ID numbers to assess the quality of our linkages through cross-validation. Of the 249,079 children in the EHR dataset, we identified 62,967 who had a ‘linkable’ parent with patient information in the EHR. In the OHP data, 889,452 household IDs were assigned to at least one child; 525,578 children with a household ID had a ‘linkable’ parent (272,578 households). Cross-validation of linkages revealed 99.8 % of EHR links validated in OHP data and 97.7 % of OHP links validated in EHR data. The ability to link children and their parents in healthcare-related datasets will be useful to inform efforts to improve children’s health. Thus, we developed strategies for linking children with their parents in an EHR and a public health insurance administrative dataset.
Keywords: Children; Electronic health records; Public health insurance; Family
When parents have health insurance, their children are more likely to be insured and to receive guideline-appropriate health care [1–5]. Because of this strong association between coverage for parents and optimal health insurance and care for children, efforts to optimize children’s health must be informed by coverage and health care services utilization data from both children and their parents [6–8]. Obtaining such family-level information requires the ability to link children with their parents in large healthcare-related databases.
In the past, all family members were commonly covered under the same employer-sponsored health insurance plan, resulting in claims data that included information on all family members. Recent changes in family health insurance coverage patterns, however, make it harder to obtain coverage and utilization data with linkages between parents and their children. As employer-sponsored health plans have grown more expensive for families, fewer families have all members covered by the same insurance plan; discordant coverage within families has increased [10–13]. Children and parents are now commonly covered by different payers (e.g., parent has private coverage but child has public coverage), or some family members have coverage while others do not (e.g., child has public coverage but parent has no coverage). Even if both child and parent have some type of public coverage, administrative datasets rarely include mechanisms that enable ‘linking’ a child to their parent.
The recent expansion of electronic health records (EHRs) presents a new source of data on insurance coverage status and receipt of healthcare for multiple family members, if they receive care from the same provider (e.g., family physician), the same clinic, or a group of clinics with a shared EHR [15, 16]. However, similar to the limited ability to link children with their parents in many state insurance administrative data systems [e.g., Medicaid and Children’s Health Insurance Programs (CHIP)], few EHRs include a mechanism for linking children with their parents.
While data on individual patients have been linked across multiple datasets for various purposes [17–20], we know of no published methods for achieving linkages between children and their parents in EHR or public health insurance administrative data. This paper describes methodologies we developed for linking data on children and their parents within: (1) EHR data from Oregon clinics that are members of the OCHIN community health information network, and (2) administrative data from the Oregon Health Plan (OHP), which covers individuals enrolled in Oregon’s Medicaid and CHIP programs.
OCHIN’s EHR Data
OCHIN centrally hosts and maintains an EpicCare© EHR system that is shared among many community health centers. (Originally called the Oregon Community Health Information Network, this organization is now “OCHIN, Inc.” as it has expanded its membership to >300 clinics in 16 states.) Patients have a single health record accessible across all sites, and OCHIN maintains an EHR data warehouse with an enterprise-wide master patient index. This study utilized EHR data from clinics in the OCHIN network serving both children and adults in Oregon (141 clinics). OCHIN’s comprehensive data warehouse has aggregate practice management data (e.g., appointments, diagnoses and procedures, similar to insurance claims data) and medical record data (e.g., problem lists, physician notes, prescription records, lab results, and referrals). The data are regularly checked and cleaned, and are stored in a central repository that can be searched electronically. OCHIN’s EHR is primarily used to support service delivery to individual patients at the point of care. Prior to our study, no mechanism was available for connecting children to their parents within the EHR data.
Oregon Health Plan Administrative Data
Oregon currently provides several public health insurance programs, most of which are operated by the OHP. Different programs have different eligibility criteria: for example, some children enroll via the CHIP, and others via Medicaid; pregnant women, children with special health care needs, low income adults, and disabled adults are all eligible to enroll in separate programs [23, 24]. Multiple family members may be enrolled in the OHP, yet each individual could join via different eligibility categories and/or at different times. Individuals can be tracked across programs via individual client identification (ID) numbers. OHP also identifies members of the same household with a household case ID number, but no family relationships are specified and thus, no mechanism is available to specifically identify and link children with their parents in OHP datasets.
Linkage Algorithm Descriptions
OCHIN’s EHR Data
Potential child–parent links were limited to children and parents who were both patients at one of the 141 Oregon clinics in OCHIN’s EHR network. Although requiring both children and parents to be patients limited the number of children whose parents we could identify, it yielded richer parent data (i.e., age, race/ethnicity, insurance status, medical diagnoses, past medical history, health care utilization patterns, etc.). Such parent-level data are necessary for analyses of how parental factors influence a child’s access to insurance and receipt of health care services.
To link children with their parents in the OCHIN EHR data, we first identified children <18 years of age with at least one visit to an Oregon OCHIN clinic in 2002–2010. We chose <18 years of age because persons aged ≥18 years are considered adults in the OCHIN EHR. To link parents and children, we used the only two EHR data fields containing information on an adult patient connected to a child patient: the ‘guarantor’ and ‘emergency contact’ fields.
OCHIN clinics use the guarantor field to identify the person responsible for paying for a given visit. The guarantor field includes both an ID number and the type of relationship the adult has with the child. Specifically, the ID number is the patient ID of the OCHIN adult financially responsible for the child, and the type of relationship includes possible familial relationships (e.g., parent, uncle, sister, etc.). Each OCHIN patient has only one patient ID; thus, the ID found in the guarantor field matches the patient record of the OCHIN adult financially responsible for the child. Twenty-six percent of the children identified in our sample had an adult OCHIN patient identified as their guarantor. OCHIN clinics also use an emergency contact field to identify a child’s parent or guardian. This field was populated with an adult OCHIN patient for 5 % of the children in our sample. Approximately half of the children who had an emergency contact adult in their EHR data also had an adult guarantor listed; within this group, 99 % of the guarantor and emergency contact fields agreed on parental relationship. We excluded any potential child–parent link if the parent was not 12–55 years older than the child.
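The EHR linkage rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Patient` record, its field names, and the use of days divided by 365.25 to approximate the age difference are all assumptions made for illustration, and the sketch ignores the relationship type stored with the guarantor because the text does not say how that value was used.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional

MIN_AGE_GAP_YEARS = 12  # parent must be at least this much older than the child
MAX_AGE_GAP_YEARS = 55  # and at most this much older


@dataclass
class Patient:
    """Hypothetical EHR patient record; field names are illustrative only."""
    patient_id: str
    birth_date: date
    guarantor_id: Optional[str] = None          # patient ID of adult guarantor, if any
    emergency_contact_id: Optional[str] = None  # patient ID of adult emergency contact, if any


def age_gap_years(child: Patient, adult: Patient) -> float:
    """Approximate number of years by which the adult is older than the child."""
    return (child.birth_date - adult.birth_date).days / 365.25


def link_ehr_parents(child: Patient, patients_by_id: Dict[str, Patient]) -> List[str]:
    """Return patient IDs of adults named in the guarantor or emergency
    contact field who are 12-55 years older than the child."""
    candidate_ids = {child.guarantor_id, child.emergency_contact_id} - {None}
    links = []
    for candidate_id in sorted(candidate_ids):
        adult = patients_by_id.get(candidate_id)
        if adult is None:
            continue  # the named adult is not a patient in the shared EHR
        if MIN_AGE_GAP_YEARS <= age_gap_years(child, adult) <= MAX_AGE_GAP_YEARS:
            links.append(candidate_id)
    return links
```

Because a candidate must resolve to a patient record before the age check runs, the sketch mirrors the requirement that both child and parent be patients in the same EHR.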
Oregon Health Plan Administrative Data
A ‘child’ was anyone aged <19 years at any point during the study period (2002–2010) who shared a household case ID number with at least one adult who was deemed a potentially linkable parent.
A potentially linkable ‘parent’ was anyone sharing a child’s household case ID number who was aged ≥19 years at some point during the study period and was 12–55 years older than at least one child with the same household case ID number. Any adults in the household who did not meet those criteria were excluded and assumed to have another type of relationship with the identified child (e.g., sibling, significant other, grandparent, etc.). If no potentially linkable parents were identified, that child was excluded.
Within the potential child–parent links identified, only one female was considered the ‘mother’ and only one male the ‘father.’ Household cases with more than one potential parent of either sex were excluded, as it could not be determined which adults were the parents versus grandparent, aunt, uncle, same-sex partnership parent, etc.
If multiple children shared a household case ID number, we repeated this process for each individual child until all children in the family were either excluded or linked to a potential parent.
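The household-based steps above can be sketched as follows. This is a hedged illustration rather than the study's code: the `Member` record, the exact study-period dates, the `"F"`/`"M"` sex coding, and the choice to apply the "more than one potential parent of either sex" exclusion separately for each child are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional

STUDY_START = date(2002, 1, 1)   # assumed first day of the 2002-2010 study period
STUDY_END = date(2010, 12, 31)   # assumed last day of the study period


@dataclass(frozen=True)
class Member:
    """Hypothetical OHP enrollee record; field names are illustrative only."""
    client_id: str
    household_id: str
    birth_date: date
    sex: str  # "F" or "M" (assumed coding)


def age_on(birth_date: date, on: date) -> int:
    """Age in completed years on a given date (negative if not yet born)."""
    years = on.year - birth_date.year
    if (on.month, on.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def is_child(m: Member) -> bool:
    # Aged <19 at some point in the period: born by the end, under 19 at the start.
    return m.birth_date <= STUDY_END and age_on(m.birth_date, STUDY_START) < 19


def is_adult(m: Member) -> bool:
    # Aged >=19 at some point in the period: at least 19 by the end.
    return age_on(m.birth_date, STUDY_END) >= 19


def link_household(members: List[Member]) -> Dict[str, Dict[str, Optional[str]]]:
    """Link each child in one household to at most one mother and one father."""
    links = {}
    for child in members:
        if not is_child(child):
            continue
        candidates = [
            a for a in members
            if a.client_id != child.client_id and is_adult(a)
            and 12 <= (child.birth_date - a.birth_date).days / 365.25 <= 55
        ]
        mothers = [a for a in candidates if a.sex == "F"]
        fathers = [a for a in candidates if a.sex == "M"]
        if not (mothers or fathers) or len(mothers) > 1 or len(fathers) > 1:
            continue  # no linkable parent, or ambiguous parents: exclude the child
        links[child.client_id] = {
            "mother": mothers[0].client_id if mothers else None,
            "father": fathers[0].client_id if fathers else None,
        }
    return links


def link_all(members: List[Member]) -> Dict[str, Dict[str, Optional[str]]]:
    """Group enrollees by household case ID and link within each household."""
    by_household = defaultdict(list)
    for m in members:
        by_household[m.household_id].append(m)
    links = {}
    for household_members in by_household.values():
        links.update(link_household(household_members))
    return links
```

In this sketch a household with a mother and a grandmother who both fall inside the 12–55 year window yields no link for the child, which matches the stated preference for precision over coverage.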
After we identified children who linked to at least one parent in both OCHIN EHR and OHP administrative data, we compared the demographic characteristics of these children to children who did not link to a parent within each data source.
Cross-Validation of Child–Parent Linkages
We were able to check the quality of our linkage processes by using the subset of cases from the OHP data in which both child and parent had an OCHIN EHR patient ID and the subset of OCHIN EHR cases in which the child had an OHP client ID. To perform this cross-validation, we compared the child–parent links in one dataset (tested data set) with the child–parent links in the other data set (validating data set). This study was reviewed and approved by our institutional review board [#00006727].
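The comparison of a tested data set against a validating data set can be sketched as below. The function name, the dictionary layout, and the rule that a tested pair counts as validated when the same parent appears for that child in the validating data are assumptions; the sketch also assumes that child and parent IDs have already been mapped to a common identifier.

```python
from typing import Dict, Optional, Set, Union


def cross_validate(
    tested: Dict[str, Set[str]],
    validating: Dict[str, Set[str]],
) -> Dict[str, Union[int, Optional[float]]]:
    """Count tested child-parent pairs that agree or conflict with the
    validating data set. Both inputs map a child ID to a set of parent IDs.
    Children absent from the validating data are not counted."""
    validated = 0
    conflicting = 0
    for child_id, tested_parents in tested.items():
        if child_id not in validating:
            continue  # child cannot be checked in the validating data set
        for parent_id in tested_parents:
            if parent_id in validating[child_id]:
                validated += 1      # no conflict in parent identified for child
            else:
                conflicting += 1    # different parent identified for child
    total = validated + conflicting
    return {
        "validated_pairs": validated,
        "conflicting_pairs": conflicting,
        "validated_share": validated / total if total else None,
    }
```

Running the function in both directions (EHR links tested against OHP links, then the reverse) would produce the two validation shares, one per tested data set.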
Linking Children and Parents in OCHIN’s EHR Data
Linking Children and Parents in the Oregon Health Plan Administrative Data
Demographic comparisons for not linked and linked children from the OHP and OCHIN EHR data
Children in OHP dataset between 2002 and 2010 (n = 1,017,984): Not linked (N = 492,406) (%); Linked (N = 525,578) (%)
Children in OCHIN EHR dataset between 2002 and 2010 (n = 249,079): Not linked (N = 186,112) (%); Linked (N = 62,967) (%)
Child year of birth: Prior to 2002
Cross-Validation of Child–Parent Linkages
Cross-validation of child–parent linkages identified within the OHP and OCHIN EHR data
Columns: Tested data set; Validating data set; N pairs (%)
Tested data set: OCHIN EHR child–parent links. Validating data set: All OHP child–parent linkages in which both child and parent had an OCHIN ID
- No conflict in parent(s) identified for child
- Different parent identified for child in OHP data
Tested data set: OHP child–parent links. Validating data set: All OCHIN EHR child–parent linkages in which the child had an OHP individual client ID
- No conflict in parent(s) identified for child
- OHP linked parent identified as non-parent in OCHIN EHR data, or different parent identified for child within OCHIN EHR data
Parental health insurance status is significantly associated with children’s insurance status and receipt of evidence-based health care [1–8]. Further, treating a patient within the context of the family is consistent with more comprehensive and holistic care [25–27]. Thus, reliable information about health insurance coverage status and health care for children and their parents is needed to inform efforts to optimize children’s health. However, no straightforward processes for linking children with parents currently exist for many datasets used in research, policy, and practice. To address this need, we developed methods for linking children and their parents within two commonly used data sources. Our methods may inform future efforts to link children and their parents in EHR data, public health insurance administrative data, and similar datasets.
Practice and Policy Implications
As the number of families receiving coverage through employer-sponsored programs decreases [11–13, 28], fewer insurance databases will include information about all family members on a single plan held by one parent (the covered employee). An increasing percentage of American children are now insured through Medicaid or CHIP, and their data are decoupled from those of parents who are insured elsewhere or are uninsured. With the Affordable Care Act (ACA), more individuals will have access to coverage and parents may gain insurance coverage through new plans; however, parents who gain new coverage might not be insured by the same plan as their children (e.g., a parent may obtain private coverage through health insurance exchanges, while their children obtain public coverage through the CHIP) [29–31]. Even when parents ‘join’ their children in public insurance programs through states’ expanded Medicaid eligibility, it may be difficult to link children with their parents in state administrative datasets because of differing eligibility requirements, enrollment dates, and programmatic enrollment procedures (as in Oregon). All-payer claims databases may mitigate some of these issues but still do not provide an easy way to link parents to their children. Thus, the methods described here will continue to be necessary to make these linkages, even for families where both children and parents have coverage through the same public program.
Even with ACA changes in policy, the current US policy environment makes it difficult for many families to enroll all family members in the same health insurance plan. For example, income requirements for public plans are different for children and adults. In addition, employer-sponsored plans only have to be affordable for the employee (not the family) to comply with ACA regulations, which may leave parents unable to afford such coverage for their children. With family members increasingly insured by different plans, it has become difficult to link children and parents in health insurance datasets. Further, as payment moves towards global capitation and away from fee-for-service, insurance claims data will likely contain less complete information about health care services. EHR data may provide an important alternative source of data containing information on both children and their parents and a richer source of information related to health care services received and the health status of individual patients and families. For family members who receive care at the same clinic, EHR data can provide more complete information than health insurance data. As health information exchanges improve, it will also be possible to link family members with information in different EHRs. As the use of EHR data for tracking and coordinating care and informing future policies increases, child–parent linkages in EHR data will become increasingly relevant to informing practice and policy decisions.
Linking children to their parents could inform research, policy, and practice in several ways. First, it could allow for treatment that targets the whole family. For example, because obesity is often a problem for both children and their parents, treatment options could be directed at the whole family if clinicians are aware of the child’s and parents’ weight and other relevant biomarkers. Second, it could help inform disease-specific recommendations. For example, if the parent of a child with asthma is a documented smoker in the EHR, this information could prompt a provider to offer additional education on the impact of smoking on the child’s disease and resources to help the parent quit. Third, having known linkages between parents and children in their medical records might facilitate better coordination of care for families. For example, a provider could be electronically prompted to remind a mother at her visit about her child’s overdue immunizations and to help get them scheduled. Lastly, these linkages could help clinicians better understand and diagnose children’s conditions by showing the medical issues experienced by their parents.
The linkage methods we developed were time-consuming and resource-intensive. The data had to be cleaned, managed, validated, and processed through algorithms using deductive logic to define child–parent links. Given the importance of using EHR data for the purposes described here, further investigation and validation of these linkage methods are needed. Processes for easily and automatically linking families within other large healthcare-related data sets are also needed.
Strengths and Limitations
One strength of our approach to linking children and parents within OCHIN’s EHR data was the use of all relevant relationship information. Another strength in using this methodology in an EHR dataset was the large number of children for whom a potential parent could be identified.
Our process shares a weakness inherent to all secondary data analyses: the quality of the available data is determined by the quality and consistency of the data entered into the system. For example, the use of the guarantor and emergency contact fields in the EHR may vary by clinic. We did not, however, assess the extent to which completion of these data fields differed between clinics, nor were we able to assess the percentage of children who could not be linked to a parent because of missing or erroneous data, which means we likely missed many potential links. Further, as stated above, we captured only a subset of child–parent links in the EHR because we only linked children with parents who were also patients within the same health care system. We also could not find parents to link with the majority of children in the OHP dataset because the eligibility requirements for children to qualify for Medicaid or CHIP are much more inclusive than the requirements for adults. Many parents of children enrolled in OHP do not qualify for this coverage because their income exceeds the limit for adult eligibility. Thus, demographics differed between children who linked to a parent versus children who did not link to a parent in the datasets because certain subpopulations are more likely to have both children and parents qualify for OHP coverage or to receive health care from the same clinic.
Certain assumptions were necessary. In most cases, we believe a ‘true’ parent was found; however, identification of a parent was premised on the assumption that an adult in the household who was 12–55 years older than the child was a ‘parent’ or one of the child’s primary guardians. We attempted to identify only children with the most probable primary guardians and excluded any in which this relationship was not easily identifiable (e.g., children linked to multiple adults of a single gender, or linked to more than two adults) so as to minimize the chances of linking to non-parent adults (e.g., grandparents, aunts/uncles, roommates, siblings, etc.). Thus, some children with same-sex parents who should have been included in the final number of linked pairs were likely excluded. Researchers wishing to exclude fewer children could relax these criteria, but doing so may yield linkages that are less precise. We were not able to assess the percentage of children who could not be linked to a parent because of missing or erroneous data. Of note, our results are specific to one state’s Medicaid and linked EHR data; applying these algorithms to different states and health systems may not be possible or may yield different linkage rates.
The ability to link children and their parents in large healthcare-relevant datasets is necessary for informing efforts to optimize children’s health care. We developed strategies for successfully linking children with their parents in two such data sets, which are being used to study the impact of recent policy changes that increased discordant health insurance patterns in families. These algorithms could also be used to inform and evaluate future practice and policy changes.
This work was financially supported by the Agency for Healthcare Research and Quality (AHRQ) (1 R01 HS018569), the Patient-Centered Outcomes Research Institute (PCORI), and the Oregon Health & Science University, Department of Family Medicine. The funding agencies had no involvement in the preparation, review, or approval of the manuscript. We would also like to acknowledge OCHIN, Inc. and all clinics in the network for participating in this research. The authors are grateful for editing and publication assistance from Ms. LeNeva Spires, Publications Manager, Department of Family Medicine, Oregon Health & Science University, Portland, OR.