International Multi-database Pharmacoepidemiology: Potentials and Pitfalls
- Cite this article as: Lai, E.CC., Stang, P., Yang, YH.K. et al. Curr Epidemiol Rep (2015) 2: 229. doi:10.1007/s40471-015-0059-z
With the rapid progress of computer technology and development of electronic health records, pharmacoepidemiologic studies using multiple databases across and within nations have become feasible and popular in recent years. Multinational pharmacoepidemiologic studies (MPES) provide opportunities to compare drug utilization and effects across countries, to study rare exposures and outcomes, and to collaborate across nations. Multiple networks in North America, Europe, and the Asia/Pacific region have emerged to support MPES and national and cross-national collaborations. We highlight the challenges of MPES and respective solutions with examples, including non-database pharmacoepidemiologic studies for low-resource countries, distributed network approaches to address data privacy issues, and considerations for biases due to variations in data structure, coding, and practice/behaviors of health care providers and patients. Because there are no standard recommendations for designing and conducting MPES, transparency is the key to adequate conduct and interpretation of the studies. More efforts in developing and standardizing the methods for MPES are required.
Keywords: Multinational pharmacoepidemiologic study; Across-nations study; Multiple databases study; Research network; Distributed network approach
Pre-marketing clinical trials are not sufficient to assess potential adverse effects and effectiveness of a pharmaceutical product in practice settings, and those trials that are multinational often do not include enough patients in many regions to offer meaningful analyses of these subpopulations. Similarly, clinical trials, even multinational ones, enforce a research construct that does not reflect the health care delivery system in which these patients (and interventions) are integrated. Pharmacoepidemiologic studies, a part of phase IV studies during the post-marketing period, may provide evidence of beneficial and hazardous effects of medical products in real-world populations. With the rapid progress and widespread use of computer technology and development of electronic health records in North America, Europe, and many parts of Asia, pharmacoepidemiologic studies using multiple databases across nations have become feasible and common in recent years. More researchers and policy makers are interested in developing and participating in multinational pharmacoepidemiologic studies (MPES), and not surprisingly, many research networks have been formed to provide a mechanism to support national and cross-national collaborations and MPES.
There are a number of advantages of MPES. First, researchers and policy makers can compare utilization, prescribing patterns, and safety profiles of medications and medical devices among countries and better understand whether the differences may be due to the differential selection of patients or the delivery system itself, or may suggest underlying biological or genetic differences. Second, drug effects differ among ethnic groups because of the different prevalence of polymorphisms of receptors and cytochromes; beneficial and adverse effects are also affected by lifestyle factors, such as dietary habits, that vary across countries. Third, a possible increase in sample size using multiple databases across nations allows researchers to study rare exposures and/or rare outcomes with sufficient power, providing a greater possibility to study rare adverse events or rare diseases. Fourth, not all medications, devices, and health policies are available in every country, so MPES offer opportunities for one country to learn from others to inform its own clinical and regulatory decisions regarding these interventions. Finally, MPES provide opportunities for collaborations and communications across nations, which fuel further advancement in the research community.
Coordinating center check list
Participating centers check list
Existing International Research Networks
A number of research networks and initiatives have emerged in North America, Europe, and the Asia-Pacific region to assist in the conduct of national and multinational pharmacoepidemiologic and other observational studies. Similarly, several networks with a specific focus (e.g., vaccine safety) or special populations (e.g., children or pregnant women) have also been formed to expand the possibility of establishing international networks for these specific topics/populations.
There are various administrative databases, such as the Health Services Databases in Saskatchewan and the Ontario Health Insurance Plan Databases, each of which is confined to its own province in Canada and arose from the provincial government’s need to manage health care delivery costs. This is typical of “claims” databases whose fundamental purpose is to allow fiscal tracking and accounting for the delivery of health care from a payer perspective. In the USA, most payers have claims data and even the US Government payers, Medicaid and Medicare, have databases that are used in research. The Medicare databases, for example, cover approximately 40 million US residents of age 65 and older and eight million younger people with disabilities. However, unlike other countries with universal health care managed by a single central government payer, neither the USA nor Canada has a single administrative database that covers the entire population in the nation. To address this, a few research networks have been initiated to facilitate multi-database studies that could provide national coverage. For example, the Canadian Drug Safety and Effectiveness Network (CDSEN) was initiated in 2007 by the Canadian government in order to connect researchers throughout provinces in Canada. CDSEN’s research teams form a coordinated network of over 150 researchers that is committed to excellence in pharmacoepidemiologic research and that reaches across most of the provincial databases. In the USA, the Food and Drug Administration (FDA) launched the Sentinel Initiative in 2008, aiming to refine safety signals with the major objective of developing a scalable and transparent organizational structure to study the safety of medical products by organizing multiple databases under a single research governance structure.
Although the Sentinel Initiative in the USA and CDSEN in Canada are not international, their large multi-database approach within a single country offers additional examples of research platforms that may be relevant for multinational approaches.
There are a number of initiatives in Europe with various interests and goals. For example, Exploring and Understanding Adverse Drug Reactions (EU-ADR), an initiative of the European Commission, is aimed at developing a surveillance system for drug safety on the basis of cross-database use in European countries. The EU-ADR uses clinical data from electronic healthcare records of over 30 million patients from participating countries to efficiently study drug safety issues. To enhance monitoring of the safety of medical products, a collaborative European project named the Pharmacoepidemiological Research on Outcomes of Therapeutics by a European ConsorTium (PROTECT) was initiated. As the first step of the PROTECT project, Sabaté et al. identified 19 collaborative international European working groups, networks, and research projects related to drug utilization, such as the European Drug Utilisation Group (EuroDURG) and the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP). The detailed characteristics and specific goals of these networks are available in their recent paper. The Nordic Pharmaco Epidemiological Network (NorPEN) was formed in the Nordic countries to promote collaboration among researchers and to create the possibility of carrying out cross-country population-based comparative research in pharmacoepidemiology, thereby promoting safer medication use.
The Asian Pharmacoepidemiology Network (AsPEN) was formed to provide a mechanism to support the conduct of pharmacoepidemiologic research in Asia and to facilitate the prompt identification and validation of emerging safety issues among the Asian countries. AsPEN was first suggested at the 3rd Asian Conference on Pharmacoepidemiology (ACPE) in Seoul, Korea, in 2008. AsPEN was initiated by four Asia-Pacific countries (Japan, Korea, Taiwan, and Australia) in 2008 and has expanded to Singapore, China, India, Hong Kong, and Thailand, adding a few more databases and ethnic groups. However, not all countries participating in AsPEN have databases, its infrastructure is still under development, and its operation is still in its infancy. The AsPEN studies as well as ongoing projects are listed on the website.
Networks and Initiatives with Specific Population/Goals
Some researchers with common interests have initiated research networks to increase the possibility of studying patient populations that are underrepresented in clinical trials, such as older people, children, and pregnant women. For example, the Task-force in Europe for Drug Development for the Young (TEDDY) [17, 18] was established to promote the safe and efficacious use of medications for children. The European network of population-based registries for the surveillance of congenital anomalies (EUROCAT) was initiated in 1979, with more than 1.7 million births surveyed per year in 23 countries, to provide early warning of new teratogenic exposures in Europe. Similarly in Europe, there are a number of initiatives under the Innovative Medicines Initiative (IMI) that leverage multiple databases across geographic regions for specific disease areas. An underlying focus of the IMI-funded EMIF effort (http://www.emif.eu/) includes the development of a platform to support use of patient-level data from a number of sources and countries. VACCINE.GRID is a global network of leading public health organizations to implement innovative and distributed vaccine effect research aimed at overcoming the boundaries of institutions, nations, and continents to inform vaccine benefit and risk assessments across the globe. Additionally, the International Society for Pharmacoepidemiology (ISPE) was formed as an international professional organization dedicated to advancing the health of the public by providing a forum for the open exchange of scientific information. Several special interest groups (SIGs) of ISPE have also been created to facilitate interactions among members with the same interests, e.g., drug safety in pregnancy, vaccine safety, or biologic safety.
Initiatives to Develop/Expand Infrastructures for National or International Multi-database Pharmacoepidemiologic Studies
In the last decade, a few important initiatives were developed to facilitate the research networks for national or multinational pharmacoepidemiologic studies. The Mini-Sentinel Operations Center (MSOC) was developed to coordinate the network of Mini-Sentinel Data Partners and lead the development of the Mini-Sentinel Common Data Model (MSCDM). MSCDM version 4 currently includes 11 tables that represent information for the data elements needed for Mini-Sentinel activities. Data partners transform their data locally according to the MSCDM, which enables them to execute standardized analytical programs and combine the results. The Observational Medical Outcomes Partnership (OMOP), funded by the Foundation for the National Institutes of Health (FNIH), was established to support data science and methodological research to empirically evaluate the performance of various analytical methods in the context of a network of databases. OMOP has established a common data model (OMOPCDM, currently version five), similar to the MSCDM established by Mini-Sentinel with some distinct differences. The unique aspects of OMOPCDM are that it can accommodate heterogeneous databases and execute large-scale statistical analyses [24, 25, 26, 27]. Many of the resources for the OMOPCDM conversion are provided and can be downloaded from the OMOP website, including the standard vocabulary specification, drug codes and mapping tables, and ready-to-use SAS programs for validating the quality of the OMOPCDM conversion. In 2013, some AsPEN researchers initiated and participated in the Surveillance of Health Care in Asian Network (SCAN) project in collaboration with key members of OMOP, with the goal of strengthening the infrastructure for future AsPEN studies and describing the current prevalence/incidence of medical conditions and the utilization of and trends in related medications among Asian countries. The OMOPCDM has become the first common data model (CDM) in Asian databases [16, 28].
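The local transformation step described above can be illustrated with a minimal sketch in Python. The row layout, the local field names (`pt_no`, `drug_cd`, `disp_dt`, `days`), and the code map are hypothetical and are not taken from the MSCDM or OMOPCDM specifications; the sketch only shows the general idea of each site converting its own records into a shared format before running a standardized program.

```python
from dataclasses import dataclass
from datetime import date


# Hypothetical common-format row; NOT the actual MSCDM or OMOPCDM schema.
@dataclass(frozen=True)
class DrugExposure:
    person_id: str
    ingredient: str   # standardized ingredient name
    start_date: date
    days_supply: int


# Hypothetical map from one site's domestic drug codes to standard ingredients.
LOCAL_TO_STANDARD = {
    "A001": "celecoxib",
    "A002": "diclofenac",
}


def to_common_model(local_rows, code_map):
    """Convert site-specific dispensing rows into common-format rows.

    Rows whose local code has no mapping are returned separately so the
    site can review them instead of silently dropping them.
    """
    converted, unmapped = [], []
    for row in local_rows:
        ingredient = code_map.get(row["drug_cd"])
        if ingredient is None:
            unmapped.append(row)
            continue
        converted.append(
            DrugExposure(
                person_id=row["pt_no"],
                ingredient=ingredient,
                start_date=date.fromisoformat(row["disp_dt"]),
                days_supply=int(row["days"]),
            )
        )
    return converted, unmapped


# Made-up local records, for illustration only.
local_rows = [
    {"pt_no": "p1", "drug_cd": "A001", "disp_dt": "2012-03-01", "days": "28"},
    {"pt_no": "p2", "drug_cd": "Z999", "disp_dt": "2012-04-15", "days": "7"},
]
converted, unmapped = to_common_model(local_rows, LOCAL_TO_STANDARD)
```

Returning the unmapped rows alongside the converted ones is a design choice in this sketch: it makes gaps in the code map visible to the site rather than hiding them in a smaller output table.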
Advantages of Multinational Pharmacoepidemiologic Studies
Multinational pharmacoepidemiologic studies (MPES) provide opportunities to compare and benchmark drug utilization patterns and the safety and effectiveness of medications and devices among countries. For example, Zito et al. compared the utilization of antidepressants among Denmark, Germany, the Netherlands, and the USA and found that utilization in the USA was at least three times higher than in the three European countries. The major variations in the use of subclasses included tricyclic antidepressants (TCAs), predominantly used in Germany, and selective serotonin reuptake inhibitors (SSRIs), predominantly used in the USA, Denmark, and the Netherlands. Also, the AsPEN investigators conducted a study to compare cardiovascular and gastrointestinal safety of nonsteroidal antiinflammatory drugs (NSAIDs) among Asian countries. Assessing the patterns of NSAID use among countries, they found that loxoprofen was only marketed and most commonly used in Japan and Korea and that mefenamic acid accounted for a small proportion of NSAID initiations in all countries. Additionally, they found the utilization pattern of celecoxib in Taiwan was different from other countries, because the national health insurance program in Taiwan only covers celecoxib for patients with certain conditions, including previous history of gastrointestinal ulcer or bleeding. The identification of drug utilization patterns can help practitioners, researchers, and policy makers evaluate the current status and trends by benchmarking, documenting these variations, and exploring reasons for variations, which may include differences in cultures, policies and regulations, and database structures and coding systems.
Racial and Ethnic Differences in Safety and Effectiveness of Medications
The effects of race or ethnicity on pharmacokinetics and pharmacodynamics have been widely discussed [3, 31]. Well-known examples include dry cough induced by angiotensin-converting enzyme (ACE) inhibitors being more prevalent in Asian populations than in Caucasians and the different effects of warfarin on Asian and Caucasian populations, requiring much smaller doses in Asian patients because of the genetic variants of hepatic enzymes. An advantage of MPES is that they can evaluate racial/ethnic effects using an identical study design. For example, the AsPEN investigators included multiple databases from several countries and evaluated the variation in the association between thiazolidinediones and heart failure across ethnic groups. They found that the risk of heart failure with thiazolidinediones was higher in predominantly Caucasian countries (Australia and Canada) than in the Asian countries (Hong Kong, Japan, Korea, and Taiwan). A likely explanation for the variation is the varying prevalence across races of polymorphic hepatic enzymes that metabolize thiazolidinediones [16, 34]. However, it is often unclear whether conflicting results from studies are solely due to racial/ethnic differences or to other factors related to design or analytic approaches. For example, Hsieh et al. found that the use of valproic acid may increase the risk of stroke in patients with epilepsy in Taiwan, but Olesen et al. demonstrated that valproic acid may attenuate the risk of stroke compared to carbamazepine in patients with epilepsy in Denmark. Careful standardization in coding and database infrastructure, analytic design, and statistical approaches, as well as knowledge about health policies and the behaviors of patients and health professionals, is the cornerstone of valid multinational pharmacoepidemiologic studies that aim to identify racial/ethnic variations in drug safety or effectiveness.
Increased Sample Size and Study Power
Detecting serious but rare adverse events is a challenge in pharmacoepidemiologic studies. Underpowered studies often generate less robust results and contribute little information. By drawing on a larger sample, MPES can potentially provide more power and a greater possibility of producing meaningful results. Although systematic reviews or meta-analyses can be conducted to summarize multiple underpowered studies, the validity of meta-analyses for observational studies is unclear given diverse study designs and analytic approaches. MPES avoid this problem by using a common protocol, which leads to fewer methodological discrepancies [37, 38].
MPES provide opportunities to develop/enhance collaborations among pharmacoepidemiologists across countries. Through our collaborative experiences in AsPEN and SCAN, we gained a better understanding of the health systems and cultural factors that influence prescribing behaviors and the use of medications. Through discussions and working together on projects, we are often able to come up with a better approach to overcome pitfalls in design or analyses and increase the validity of studies. However, there remain numerous challenges and limitations that we need to be aware of and sometimes may not be able to solve; these are discussed in the next section.
Challenges and Methodological Considerations
Having access to good data sources is the first step for most pharmacoepidemiologic studies. While electronic health care databases are spreading throughout the world, there remain many countries without accessible research quality databases. Most, if not all, databases that are available were not created for research but have been repurposed to serve the research needs. Data privacy is also a significant issue in Asia and other industrialized countries. MPES designs are complex and may face considerable limitations due to the substantial differences in health care systems, physician and patient behaviors, cultures, policies, drug availability, and persistence and adherence to the treatment. We discuss these challenges and highlight their methodological considerations in this section.
Availability of Data and Other Resources
Outside of North America and Europe, the availability of databases is relatively limited. In Asia, we found that there are three nationwide claims databases from Japan, South Korea, and Taiwan; two nationwide electronic health record databases from Hong Kong and Singapore; and a few regional electronic health record databases from China and Thailand; however, the availability of databases from other Asian countries remains limited. Furthermore, the availability of funding and of skilled pharmacoepidemiologists or programmers with relevant database experience is likely limited outside of North America and Europe. These deficiencies dramatically affect the practice of MPES outside of North America and Europe.
Non-database Pharmacoepidemiology for Countries with Limited Data and Other Resources
The establishment of databases is a crucial and long-term strategy, although this would be very difficult for low-resource countries. It also warrants painstaking efforts on data integrity and data conversion. Non-database MPES, which collect primary medical records or questionnaires, should therefore be considered in low-resource countries. The principles and methodology of non-database MPES are generally similar to those in database studies. Although non-database MPES are limited by small sample size or loss of patients to follow-up, primary data collection has advantages, and appropriate design and analytical methods can overcome or minimize the weaknesses. For example, Ooba et al. conducted a prospective stratified case-cohort study to assess the association between statins and diverse adverse drug reactions in Japan when a health care database was not yet available. This type of study can be expanded into multinational studies. Also, primary data collection allows ascertainment of confounding information that is not obtainable from the available databases, such as body height/weight, laboratory data, and use of non-prescription drugs. Investigators might also consider a homogeneous group and comparable controls to decrease heterogeneity of the sample for comparison. Modern technologies (e.g., electronic data capture and online surveys or questionnaires) may be employed to collect essential data efficiently.
The centralized data repository approach is considered to be the ideal method for MPES [40, 41]; that is, a coordinating research group collects all the data from participating countries and performs the analyses. Its strengths are efficiency, flexibility, and quality of analyses because the coordinating group analysts are able to access all the data directly and to generate results. However, data privacy issues make the centralized data repository approach impractical in many multi-database studies. Even with practices and techniques to preserve confidentiality, such as encryption of identifiers and/or strict policies and authentication mechanisms for data access, sharing individual-level data across nations is usually discouraged and/or prohibited.
Distributed Network Approach to Protect Data Privacy
The common protocol approach has become one of the alternatives to the centralized data repository approach for MPES and better complies with data privacy protection [42, 43]. With this approach, study design and analytical approaches are standardized in a common protocol, and each site then executes the analyses according to the protocol and sends only summarized results, not patient-level data, to the coordinating center. The major disadvantages of this approach are the redundancy in efforts to program and analyze the data and the need for in-depth and frequent communications to ensure researchers are aligned with the same concepts and to avoid confusion or misunderstanding of the study design.
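As a minimal sketch of this workflow in Python, the example below reduces patient-level rows at one site to aggregate counts and derives a crude log incidence rate ratio from them. The record layout and function names are hypothetical, the numbers are made up, and a real protocol would specify adjusted analyses; the point is only that the coordinating center receives aggregates rather than individual records.

```python
import math


def site_summary(records):
    """Run at each site: reduce patient-level rows to aggregate counts.

    Each record is a tuple (exposed: bool, person_years: float, event: bool).
    Only the returned dictionary would leave the site.
    """
    summary = {
        "exposed": {"events": 0, "person_years": 0.0},
        "unexposed": {"events": 0, "person_years": 0.0},
    }
    for exposed, person_years, event in records:
        group = summary["exposed" if exposed else "unexposed"]
        group["events"] += int(event)
        group["person_years"] += person_years
    return summary


def log_rate_ratio(summary):
    """Crude log incidence rate ratio and its standard error from aggregates."""
    exp, unexp = summary["exposed"], summary["unexposed"]
    if exp["events"] == 0 or unexp["events"] == 0:
        raise ValueError("need at least one event in each exposure group")
    rate_exposed = exp["events"] / exp["person_years"]
    rate_unexposed = unexp["events"] / unexp["person_years"]
    estimate = math.log(rate_exposed / rate_unexposed)
    # Large-sample standard error of a log rate ratio: sqrt(1/a + 1/b),
    # where a and b are the event counts in the two groups.
    std_error = math.sqrt(1 / exp["events"] + 1 / unexp["events"])
    return estimate, std_error


# Made-up patient-level rows held at a single site.
site_records = [
    (True, 50.0, True),
    (True, 50.0, True),
    (False, 60.0, True),
    (False, 40.0, False),
]
summary = site_summary(site_records)
estimate, std_error = log_rate_ratio(summary)
```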
The CDM could be developed for each study or could be a global model for the entire database used for most routine pharmacoepidemiologic studies. Both the Mini-Sentinel Common Data Model (MSCDM) and the Observational Medical Outcomes Partnership Common Data Model (OMOPCDM) are global models that preserve many of the components including patients’ diagnoses, procedures, prescriptions, and relevant health utilization information. A few large-scale empirical evaluations of statistical methods for risk identification in longitudinal health care data in the USA have been completed based on OMOPCDM. The OMOPCDM was also applied to a collection of heterogeneous data from the UK, other countries in Europe [47, 48], and the Asia-Pacific region. These well-established CDMs can facilitate the conduct of MPES by the distributed network approach.
The common protocol approach or CDM approach has limited options in pooling results, however, because of the unavailability of individual data from independent sites. Meta-analysis by the generic inverse variance method is the simplest statistical approach to pool the results from sites, but the flexibility to make post hoc changes is limited [50, 51]. Confounding factors should be carefully considered and addressed, particularly when analyzing the results at a summarized level. To improve the performance of MPES and results pooling and avoid conflicts with data privacy, Rassen et al. proposed a propensity score-based approach to pool data, which allows the adjustment of covariates without compromising data privacy. This approach divides covariates into “shareable” (e.g., age in decades, sex, year of cohort entry) and “private” categories and calculates a propensity score (PS) as a composite measure that accounts for the private covariates without allowing patients to be identified. However, in our experience in AsPEN, this approach was not practical as sharing any individual-level data was not permitted in most countries participating in AsPEN. Wolfson et al. proposed an approach on the basis of the concept of “data aggregation through anonymous summary-statistics from harmonized individual-level databases” (DataSHIELD) to pool data. They proposed that the coordinating center collect the non-identifying summary parameters, allowing the passing of data between computers and the refining of the coefficients of the regression models. Through a series of refinements of regression models that repeatedly incorporate summarized statistics from iteration to iteration based on sites’ summary parameters, DataSHIELD generates a final regression model with estimates that are stabilized and robust.
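The generic inverse variance method mentioned above can be sketched in a few lines of Python. This is a fixed-effect illustration under the assumption that each site reports an estimate on the log scale (for example, a log hazard ratio) together with its standard error; the three site results are made up for illustration and do not come from any AsPEN study.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Generic inverse-variance (fixed-effect) pooling of site estimates.

    Each site estimate is weighted by 1 / SE^2, so more precise sites
    contribute more to the pooled estimate.
    """
    if not estimates or len(estimates) != len(std_errors):
        raise ValueError("need one standard error per estimate")
    weights = [1 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return pooled, pooled_se


# Hypothetical site-level log hazard ratios and standard errors.
site_log_hr = [math.log(1.2), math.log(1.5), math.log(0.9)]
site_se = [0.10, 0.20, 0.25]

pooled, pooled_se = pool_fixed_effect(site_log_hr, site_se)
ci_low = math.exp(pooled - 1.96 * pooled_se)
ci_high = math.exp(pooled + 1.96 * pooled_se)
print(f"pooled HR {math.exp(pooled):.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Because only the estimates and standard errors are available, any change to the model (a different covariate set, a different exposure definition) requires every site to rerun its analysis, which is the limited post hoc flexibility noted above.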
Confounding and Other Biases
One of the major limitations of using secondary data sources in observational studies is the potential for bias due to confounding, selection, and misclassification. The issues with these biases are more complex to deal with in MPES because of the existing differences in data architectures, coding systems, and practices and behaviors of health care professionals and patients across multi-national databases. For example, while the differences in data architectures and coding systems can be addressed by mapping domestic coding systems to a common international coding system and employing a CDM, international variations in coding practices will remain problematic, if not addressed, leading to misclassification bias or other types of biases. Because of the differences in physician behaviors and practices or health policies leading to different indications for the same treatment, confounding or selection bias is not easy to identify and has to be addressed within each database. For example, from our experience in AsPEN, the risk of hospitalized peptic ulcers and gastrointestinal bleeding in celecoxib users was lower than that in diclofenac users in most Asian countries including Japan, Korea, and Hong Kong, but not in Taiwan. We found that this was likely due to confounding by indication predominantly seen in Taiwan, because Taiwanese National Insurance reimbursed celecoxib only for patients with high risk of gastrointestinal events including those with previous gastrointestinal bleeding, concomitant anticoagulants, or rheumatoid arthritis requiring long-term steroid treatment. While CDM is a useful tool to standardize the database and coding systems, it does not standardize how the data is coded or what the coded data imply.
Preventing/Adjusting Biases in Multinational Pharmacoepidemiologic Studies
Pharmacoepidemiologic principles also apply to MPES in preventing/adjusting for biases. The distributed network approach using CDM is the most practical and widely used approach in multi-database MPES. The first crucial step to valid MPES is to ensure that the conversion to the CDM does not cause any data corruption or loss of data integrity in each data source, especially while mapping between domestic codes and standardized coding systems. Based on our experiences with the SCAN project, for example, we found that most countries already employed a common coding system such as the International Classification of Diseases (ICD), making it easier to apply a CDM. Many databases also use a domestic coding system for drugs, and therefore drug ingredients, strength, and route of administration became the foundation for mapping these codes to a standard coding system. Drug mapping was especially challenging for combination products. We broke down the combination products to the ingredient level to improve the mapping.
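The ingredient-level breakdown of combination products can be sketched as follows in Python. The product description format, the `+` separator, the milligram-only strengths, and the standard identifiers (`ING-1`, `ING-2`) are all hypothetical choices made for this illustration; they do not reproduce the mapping tables used in SCAN or any real vocabulary.

```python
from typing import NamedTuple


class Component(NamedTuple):
    ingredient: str
    strength_mg: float
    route: str


# Hypothetical standard ingredient identifiers (not a real vocabulary).
STANDARD_INGREDIENT_IDS = {
    "amlodipine": "ING-1",
    "valsartan": "ING-2",
}


def split_combination(description, route):
    """Split e.g. 'amlodipine 5 mg + valsartan 80 mg' into components."""
    components = []
    for part in description.split("+"):
        # Split from the right so multi-word ingredient names stay intact.
        name, amount, unit = part.strip().rsplit(None, 2)
        if unit != "mg":
            raise ValueError(f"unsupported unit in this sketch: {unit}")
        components.append(Component(name.lower(), float(amount), route))
    return components


def map_product(description, route, ingredient_ids):
    """Map each ingredient of a product to a standard id.

    Raises KeyError for an unknown ingredient so that incomplete
    mappings are reviewed rather than silently dropped.
    """
    return [
        (ingredient_ids[c.ingredient], c.strength_mg, c.route)
        for c in split_combination(description, route)
    ]


mapped = map_product(
    "amlodipine 5 mg + valsartan 80 mg", "oral", STANDARD_INGREDIENT_IDS
)
```

A single-ingredient product passes through the same function and yields a one-element list, so combination and non-combination products can share one mapping path.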
As in standard practice in pharmacoepidemiologic research, it is important to conduct extensive descriptive analyses at each site to understand, compare, and examine the original data and, if the CDM approach is used, the converted data before considering or conducting analytic pharmacoepidemiologic studies. The goals of the descriptive analysis are to understand coding behaviors for major diagnoses used to define cohorts or outcomes, exposure predictors, physicians’ practice patterns, and patient preferences for therapeutic modalities at each site. These are likely to be heavily affected by the local culture and health policies, which may or may not be predictable by the researchers. Because variations in coding behaviors across countries can produce large heterogeneity of results, validation within each site of the diagnosis codes that are used to define the major outcome of interest is crucial to ensure the validity of MPES. At a minimum, we encourage participating researchers to investigate and understand the coding practices in each country and to reconsider and revise protocols based on the evaluation. For any specific MPES, a carefully developed study protocol that considers the coding/practicing behaviors and variations in the data is helpful to limit the potential for bias. Participating data partners should be knowledgeable and experienced with their own data and health care system and should share site-specific information and experience when reviewing the protocol and making suggestions. Once a protocol is defined, analyzing the data from each site and assessing the heterogeneity of the results before pooling are essential.
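One common way to assess heterogeneity before pooling is Cochran's Q statistic together with the I² index; the source does not state which measure the authors used, so the Python sketch below is an illustration under that assumption. The inputs are site-level estimates on the log scale with their standard errors, and the example values are made up.

```python
def heterogeneity(estimates, std_errors):
    """Cochran's Q, its degrees of freedom, and I^2 (percent).

    Q sums the inverse-variance-weighted squared deviations of each site
    estimate from the fixed-effect pooled estimate. I^2 expresses the
    share of total variation attributable to between-site heterogeneity.
    """
    if len(estimates) < 2 or len(estimates) != len(std_errors):
        raise ValueError("need at least two sites, one SE per estimate")
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    df = len(estimates) - 1
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared


# Made-up site estimates (log scale) and standard errors.
q, df, i_squared = heterogeneity([0.0, 1.0], [0.5, 0.5])
print(f"Q = {q:.2f} on {df} df, I^2 = {i_squared:.0f}%")
```

A p-value for Q would come from a chi-square distribution with `df` degrees of freedom, which is omitted here because it is not available in the Python standard library. A large I² at this stage is a prompt to revisit site-level coding practices and indications, as in the celecoxib example from Taiwan, rather than to pool regardless.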
Future Vision and Conclusion
The worldwide pharmacoepidemiology community is engaged in developing and participating in MPES. To date, multiple academic research networks have been formed to provide a mechanism to support these cross-national collaborations and MPES. MPES provide opportunities to compare and benchmark drug uses and effects and to explore the interaction between drugs and countries. They allow researchers to study rare exposures and/or rare outcomes with a larger sample size and sufficient power and provide opportunities for collaborations and communications across nations. However, numerous challenges exist in developing and maintaining working networks and ensuring the validity of studies coming out of these networks. These include limited data resources and a limited number of skilled pharmacoepidemiologists and data analysts/programmers, especially in Asian countries and other low-resource countries; data privacy issues; and confounding effects and biases due to variations in data structure, different coding systems/practices, and behaviors of health care professionals and patients among sites or countries. As no standard recommendations exist for designing and conducting MPES, transparency is the key to adequate interpretation of the results from MPES. Registering studies and informing the general public about ongoing research could be one of the best approaches to maintain transparency; however, compliance varies. For example, the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP) electronic register of studies provides a publicly accessible resource for the registration of pharmacoepidemiologic and pharmacovigilance studies conducted anywhere in the world. The study registry will also help to reduce publication bias and promote the exchange of information. Many academic networks have published the protocols for their ongoing research and the results of the completed studies on their official websites or bulletins [16, 17, 19].
While more efforts in developing and standardizing the practice and methods for MPES are needed, academic networks and researchers should, as in any pharmacoepidemiologic study, always hold their studies and reports to scientific standards and ensure transparency and independence from conflicts of interest. More frequent communications between academic networks and regulatory authorities in each country could be critical to strengthening national drug/device surveillance practices.
Compliance with Ethics Guidelines
Conflict of Interest
Edward Chia-Cheng Lai, Yea-Huei Kao Yang, Kiyoshi Kubota, Ian C K Wong, and Soko Setoguchi declare that they have no conflict of interest.
Paul Stang declares that he is an employee of Janssen Pharmaceuticals R&D who sponsor, fund, and collaborate on research involving observational databases with a number of institutions in Asia-Pacific and other parts of the world.
Human and Animal Rights and Informed Consent
This article does not contain any studies with human or animal subjects performed by any of the authors.