Key Points

Enormous progress has been made in Italy and globally in the use of evidence derived from patients’ clinical data as they access their routine medical care.

Several data sources in Italy have been used to assess the effectiveness, safety and economic value of medications, including claims databases, electronic health records and patient registries.

1 Introduction

The US FDA defines real-world data as routinely collected medical data relating to patient health status and/or the delivery of healthcare. Such data are available through electronic health records (EHRs), medical claims, drug and disease registries, patient lifestyle-related activities and health-monitoring devices [1]. The increasing global use of electronic rather than paper trails following patient–doctor encounters, as well as billing and administrative activities, has led to a significant increase in real-world data sources, particularly in developed countries [2]. The nature of such data sources is necessarily tied to their regional and national setting and varies considerably from country to country. Italy is one such country whose growing real-world data resources need to be understood and used in the complex context of the regional and national clinical, regulatory and governmental setting, particularly concerning healthcare data. Several European countries, including the UK, the Scandinavian countries, France, Italy, the Netherlands and Spain, have harnessed their electronic healthcare data in the last decade to provide information on drug use and safety. Increasing collaboration between countries means it is essential for relevant European stakeholders to be informed of the nature of the data available in other countries, including their strengths and limitations. One major reason for this is the increasing interest in multi-database studies, notably conducted as post-authorisation safety studies, where data from different regions and/or countries are pooled.

Italy is rich in observational electronic healthcare data at the local, regional and national level, and these data have increasingly been harnessed in the last decade. Real-world evidence can provide answers to several drug-related questions on a large scale, including those concerning drug utilisation [3] and post-marketing comparative drug safety (i.e. signal detection [4], refinement and validation [5]) and effectiveness. However, the great value of observational data emerges when considering the special populations that can be studied with these data sources, such as children [6], pregnant women [7], and the elderly [8]. It can also inform decisions concerning healthcare policies aimed at improving drug utilisation and safety, such as health technology assessment (HTA) and the evaluation of risk-minimisation measures (RMMs) implemented by national drug agencies [9] to promote the appropriate, safe and cost-effective use of drugs. Randomised clinical trials, while being the gold standard to evaluate very specific drug-related issues such as efficacy, are by their nature not often able to evaluate any of the above topics. Italian observational databases are therefore a necessary tool to provide real-world evidence concerning the Italian population. However, despite how commonly Italian real-world data are used, to our knowledge no review has been published describing data source characteristics in detail and how this affects research on drug utilisation and safety.

Most Italian research centres involved in pharmacoepidemiology studies using real-world data are affiliated with the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP), a network coordinated by the European Medicines Agency (EMA) with the aim of addressing some of the above-mentioned issues. ENCePP facilitates the process of carrying out high-quality, multi-centre, independent post-authorisation studies. The aim of this narrative review is to describe the main Italian healthcare databases used to study drug utilisation and safety as well as some of the main contributions of these data sources, with a focus on healthcare databases and networks that are registered in the ENCePP inventory.

2 Types of Healthcare Databases in Italy

2.1 Healthcare Databases in the Context of the Italian Healthcare System

The characteristics of the most common Italian databases, i.e. claims databases, are rooted in the structure of the Italian national health service (NHS). The latter is largely funded by national taxes and provides universal healthcare, with outpatient (including dispensing of drugs, diagnostic tests/specialist examinations, etc.) and inpatient care (including inpatient medical procedures, drug dispensing, etc.) covered by the NHS, with co-payment in some cases. The decision to include drugs in the national drug formulary, i.e. deciding which drugs will be covered by the NHS, falls within the remit of the Italian Medicines Agency (Agenzia Italiana del Farmaco [AIFA]). In general, all “life-saving” and chronic medications are refundable, i.e. approximately 70% of all marketed drugs in Italy.

The NHS is decentralised and is organised at the regional level. At the national level, government funding for healthcare is provided to regional centres, which then provide regional healthcare services within their catchment area. The Italian regions deliver care through healthcare providers called local health units (LHUs). If patients seek medical care outside the catchment area of their LHU, their LHU reimburses the healthcare services delivered by other LHU healthcare providers within the same region or other regions, and collects claims data in a format established by the government at a national level. Primary care is provided by a family paediatrician for minors aged ≤ 14 years and by a general practitioner (GP) for those aged ≥ 15 years. Access to non-emergency healthcare is filtered by primary care physicians, who therefore play a ‘gatekeeper’ role. Specialist physicians also act as ‘gatekeepers’ for non-emergency healthcare with regard to restricted drug prescribing [10]. As a result, LHU or regional claims databases can be accessed by several research organisations to conduct real-world studies. Moreover, several research centres have access to primary care medical record databases [11, 12] as well as drug or disease registries.

Most Italian healthcare databases and networks that have been used for pharmacoepidemiology research are currently registered with an ENCePP database inventory, as listed in the Electronic Supplementary Material (ESM) #1.

2.2 Claims Databases

In the early 2000s, the Italian government made the collection of claims data mandatory to account for regional healthcare service provision and the resulting expenditures. All Italian healthcare claims contain data on healthcare utilisation in separate data tables or databases, such as NHS-covered drug dispensing in community and hospital pharmacies, hospital discharge records, co-payment exemptions, outpatient specialist care, diagnostic tests and outpatient procedures, emergency department visits and birth certificates. These basic elements of the claims data are collected in the same way in each LHU; as a result, pooling data from single LHUs is straightforward and feasible. In many LHUs, these data tables can be linked through a de-identified unique patient code, although this is still impractical at the national level. The different types of claims databases that are available for each LHU or region are described in Table 1.
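The linkage of separate claims tables through a de-identified patient code can be illustrated with a minimal Python sketch. The table names, field names and records below are hypothetical and chosen only for illustration; they do not reproduce the actual LHU data format.

```python
from collections import defaultdict

# Hypothetical, simplified extracts of two claims tables from one LHU.
# Field names and values are illustrative assumptions, not a real schema.
drug_dispensing = [
    {"patient_code": "A001", "atc_code": "C10AA05", "date": "2015-03-02"},
    {"patient_code": "A001", "atc_code": "C10AA05", "date": "2015-04-01"},
    {"patient_code": "B002", "atc_code": "M01AB15", "date": "2015-05-10"},
]
hospital_discharges = [
    {"patient_code": "A001", "icd9_code": "410", "date": "2015-06-20"},
    {"patient_code": "C003", "icd9_code": "578", "date": "2015-07-01"},
]


def link_by_patient(tables):
    """Group the records of several claims tables under each patient code."""
    linked = defaultdict(lambda: {name: [] for name in tables})
    for name, records in tables.items():
        for record in records:
            linked[record["patient_code"]][name].append(record)
    return dict(linked)


linked = link_by_patient(
    {"dispensing": drug_dispensing, "discharges": hospital_discharges}
)

# Patients with at least one dispensing and at least one hospital discharge.
in_both = sorted(
    code
    for code, tables in linked.items()
    if tables["dispensing"] and tables["discharges"]
)
```

Because the patient code is the only key shared across tables, a patient who appears in one table but not in another (such as the hypothetical B002 and C003) remains in the linked structure with an empty list for the missing table.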

Table 1 Claims databases commonly available in Italian local health units and regions

2.3 Electronic Health Records

There has been an increase in the number and quality of EHRs in Europe that are used for pharmacoepidemiology research, the most well-known of which are perhaps The Health Improvement Network (THIN) in the UK, the Integrated Primary Care Information (IPCI) database in the Netherlands and Base de Datos para la Investigación Farmacoepidemiológica en Atención Primaria (BIFAP) in Spain.

The largest EHR database of this kind in Italy is the Health Search IMS Health Longitudinal Patient Database (HSD). This database was set up by members of the Italian College of General Practitioners and contains data from a network of over 900 physicians uniformly distributed across Italy who share their de-identified clinical records in this database [13]. Patients registered with the GP practices in HSD are representative of the adult population of all Italian regions in terms of age and sex distribution according to the official statistics provided by the Italian Institute of Statistics. The Pedianet network is similar to the HSD as far as data structure is concerned, but it contains only paediatric data [14]. Although it is uncommon, EHRs can be linked to claims databases. This was done for the EHR database Arianna, which was linked to the claims of the Caserta LHU database [15,16,17].

2.4 Medical Registries

Medical registries are the result of a systematic collection of information on all the cases of a particular disease or exposure to a particular drug. AIFA developed the AIFA drug-monitoring registries system primarily to improve early access to innovative therapies, to guarantee the sustainability and affordability of therapies, to collect epidemiological data and to monitor the appropriate use of several drugs. In 2005, AIFA implemented the PSOCARE registry to monitor the prescription of selected drugs for psoriasis. This registry system was active until 2010 and held data on over 20,000 patients [18]. The REACT registry is another dermatology-related registry set up to identify the causative agents underlying rare and serious conditions such as Stevens–Johnson syndrome, toxic epidermal necrolysis and drug reaction with eosinophilia and systemic symptoms (DRESS) and to report the final outcome of the reaction [19].

Several other registries concern drug use in specific therapeutic areas, including diabetes, oncology, ophthalmology, rheumatology, respiratory and neurological diseases. In general, monitored drugs are high-cost medicines and many are biotechnological products. Although AIFA drug registries have much potential as data sources for the post-marketing evaluation of the use and safety of newly marketed drugs, as yet they have rarely been used to generate real-world evidence. Another important registry is the European EUROCAT registry of congenital anomalies, in which two Italian regions participate (Emilia-Romagna and Tuscany) [7].

Drug registries can, in theory, be linked probabilistically to other available healthcare databases using general demographic information. This would guarantee the protection of data privacy while providing opportunities for post-marketing medicine assessment, which would be particularly useful for rare exposures and outcomes [20]. Registries of patients treated with orphan drugs are particularly relevant, as they enable evidence on epidemiology of rare diseases and the effectiveness and safety of the treatment to be gathered [21].
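As an illustration of how probabilistic linkage on demographic information might work, the following Python sketch scores candidate record pairs with agreement weights in the style of the Fellegi–Sunter model. This is an assumed approach, not a method described for the AIFA registries; the fields, the m- and u-probabilities, the threshold and the records are all hypothetical.

```python
import math

# Assumed m- and u-probabilities for each field (illustrative values only):
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELD_PROBS = {
    "sex": (0.99, 0.50),
    "birth_date": (0.97, 0.001),
    "municipality": (0.90, 0.05),
}


def match_weight(rec_a, rec_b):
    """Sum the log2 likelihood-ratio weights over the compared fields."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def link(registry_records, claims_records, threshold):
    """Return (registry id, claims id, weight) for pairs at or above threshold."""
    pairs = []
    for reg in registry_records:
        for claim in claims_records:
            weight = match_weight(reg, claim)
            if weight >= threshold:
                pairs.append((reg["id"], claim["id"], weight))
    return pairs


# Hypothetical records containing only general demographic information.
registry = [
    {"id": "r1", "sex": "F", "birth_date": "1950-02-11", "municipality": "Caserta"},
]
claims = [
    {"id": "c1", "sex": "F", "birth_date": "1950-02-11", "municipality": "Caserta"},
    {"id": "c2", "sex": "F", "birth_date": "1950-02-11", "municipality": "Napoli"},
    {"id": "c3", "sex": "M", "birth_date": "1971-09-30", "municipality": "Caserta"},
]

links = link(registry, claims, threshold=10.0)
```

In practice the probabilities and the threshold would need to be estimated and validated on the data at hand, and pairs with intermediate weights would typically require manual review.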

3 Quality

Good-quality data in the framework of real-world investigations can be thought of as data that accurately (1) capture individuals potentially needing healthcare; (2) identify patients accessing care, distinguishing between new and continuing use of healthcare; (3) describe them as far as possible regarding their clinical and socioeconomic features; (4) record their use of recommended healthcare; and (5) identify those who experience important clinical outcomes. Given this complex definition, the three types of sources of Italian real-world data—claims, EHRs and registries—have complementary strengths and limitations.

First, capturing all individuals belonging to a given target population who need drug therapy requires the population-based recording of healthcare utilisation of that entire population. From this point of view, claims databases should be considered the cornerstone for investigations concerning complete catchment areas. However, limitations arise regarding their ability to capture all individuals needing healthcare, since they only capture healthcare reimbursed by the NHS. Moreover, only diseases requiring inpatient care and/or immediate medical attention and/or specific co-payment exemptions, treatments and/or diagnostic services can be traced from this source, and suitable algorithms combining these data sources must be developed and validated [22]. On the other hand, drugs dispensed during a hospital stay, as well as those that are not supplied by the NHS, including over-the-counter medicines, cannot be identified from claims databases. Conversely, medicines dispensed upon specialist prescription only, as is the case for several innovative drugs, can be tracked accurately. In addition, surrogate outcomes (e.g. serum cholesterol to verify that target values have been reached with statin therapy) cannot be identified using claims data.

Given the gatekeeper role of GPs and family paediatricians in Italy, primary care medical records provide a comprehensive picture of the overall health condition of patients and are particularly suitable to identify chronic conditions and those not linked to a healthcare claim. Moreover, the indications for the drug are recorded for every prescription. GPs contributing to the HSD or Arianna databases as well as family paediatricians contributing to Pedianet are trained to use a particular type of electronic medical record software, which is essential to maintain high-quality data recording. On the other hand, emergency admissions and acute medical episodes may be missed in such a primary care medical records database. The first prescription of reimbursed drugs prescribed by a specialist may also not be captured if leading to first drug treatment cycle dispensing from a hospital pharmacy.

AIFA-endorsed drug registries are rich in clinical details but cover only information related to a specific drug, and follow-up timespans may be shorter than in the other two types of databases. Table 2 summarises the strengths and weaknesses of Italian claims databases, EHRs and drug registries. The complementary strengths and limitations of claims databases, EHRs and registries provide a compelling argument for linking the three. Nevertheless, certain limitations are common to all three types of data sources. For example, while the International Classification of Diseases, ninth revision (ICD-9) disease coding system has been widely used in pharmacoepidemiology, it has a relatively small number of terms, estimated at 28,000; as a result, covariates must be defined through a laborious search among potentially useful codes present under various headings [23]. ICD-10 coding, which may be more appropriate for detecting adverse drug reactions, is rarely used in Italian data sources.

Table 2 Summary of strengths and weaknesses of Italian claims databases, electronic health records and drug registries

All data sources undergo thorough quality checks before studies are carried out. The completeness of the data is checked in-house (e.g. missing records, correct and/or valid data format, duplicate records); for internal validation, data quantity and quality are compared over time to ensure consistency (e.g. comparing new and past validated data); and external validation is conducted by benchmarking, i.e. by comparing the demographic characteristics of the population under study, frequency of drug utilisation and comorbidities in the single databases with other databases and/or national estimates as provided by the Italian national statistics office. In addition, several validation studies have been carried out, notably to validate diagnoses of type 2 diabetes mellitus, hypertension, ischaemic heart disease and heart failure as well as their levels of severity in HSD [13]; to validate the prevalence of chronic pulmonary disease in HSD against the prevalence in claims data from five Italian regions (Veneto, Emilia-Romagna, Tuscany, Marche and Sicily) [24]; and to validate upper gastrointestinal bleeding in HSD and the Agenzia Regionale Sanità (ARS) linkage NHS claims database compared with two non-Italian data sources: the Dutch electronic medical record database IPCI and the Danish Aarhus NHS-linked regional database [25]. Registries are validated in a similar way [26].
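The in-house completeness checks mentioned above (missing values, valid data format, duplicate records) could be sketched as follows in Python. The required fields, the date format and the sample records are assumptions made for illustration and do not describe the checks actually run on any specific Italian database.

```python
from datetime import datetime

# Assumed minimal set of required fields for a dispensing record.
REQUIRED_FIELDS = ("patient_code", "atc_code", "date")


def check_records(records):
    """Return the indices of records failing each basic completeness check."""
    report = {"missing": [], "invalid_date": [], "duplicate": []}
    seen = set()
    for index, record in enumerate(records):
        # 1. Missing or empty required fields.
        if any(not record.get(field) for field in REQUIRED_FIELDS):
            report["missing"].append(index)
            continue
        # 2. Date not in the assumed ISO format (YYYY-MM-DD).
        try:
            datetime.strptime(record["date"], "%Y-%m-%d")
        except ValueError:
            report["invalid_date"].append(index)
        # 3. Exact duplicate of a record already seen.
        key = tuple(record[field] for field in REQUIRED_FIELDS)
        if key in seen:
            report["duplicate"].append(index)
        seen.add(key)
    return report


# Hypothetical dispensing records with deliberately introduced problems.
sample = [
    {"patient_code": "A001", "atc_code": "C10AA05", "date": "2015-03-02"},
    {"patient_code": "A001", "atc_code": "C10AA05", "date": "2015-03-02"},
    {"patient_code": "B002", "atc_code": "", "date": "2015-05-10"},
    {"patient_code": "C003", "atc_code": "M01AB15", "date": "10/05/2015"},
]

report = check_records(sample)
```

Internal and external validation (comparison over time and benchmarking against other sources) would build on such record-level checks by comparing aggregate counts and rates rather than individual records.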

4 Applications of Italian Healthcare Databases to Post-Marketing Drug Utilisation and Safety Research

4.1 Drug Utilisation Research

Italy is frequently included in international drug utilisation comparisons aiming to identify discrepancies in patterns of drug use. Such comparisons, which should be interpreted in the light of the Italian healthcare system, enable identification of critical areas needing regulatory or educational interventions [27]. Moreover, Italian databases are often included in large European pharmacoepidemiological networks in which drug utilisation research is a key tool to define settings suitable for outcomes research [28] as well as to estimate the impact of drug utilisation in specific populations [29, 30]. The ARITMO project is an example of both these applications, aiming to evaluate the proarrhythmic risk of drugs. This project provided stimulating and sometimes unexpected findings on drug utilisation in various drug classes in Italy compared with other European countries [31,32,33].

The inappropriate use of medicines has been recognized as a major problem for global health [34]. As a result, strict monitoring of potentially inappropriate use might be the most important role of drug utilisation research. A number of Italian studies have evaluated the quality of prescribing, particularly in the elderly [35, 36]. The AIFA Geriatrics Working Group used data from the National Report on Medicines Use in Italy (OsMed) database to develop 13 quality indicators addressing polypharmacy, adherence to treatment of chronic diseases, prescribing cascade, under-treatment, drug–drug interactions and drugs to be avoided [35]. Recently, the Italian Group for Appropriate Drug Prescription in the Elderly (I-GrADE) identified a list of prescribing indicators for older adults with cardiovascular diseases and other chronic comorbidities, and is now validating them using ad hoc population samples [36, 37]. The appropriateness of drug prescriptions has also been extensively investigated in the paediatric population, particularly concerning antibiotic use in Italy, for which wide interregional differences in quantitative and qualitative prescription profile were found [38, 39]. See section 4.5 for further information on special populations.

Italian databases have also been widely used to assess adherence to medications and to evaluate predictors of poor adherence and the impact of ad hoc interventions [40,41,42,43,44]. Other studies used multi-level approaches to thoroughly evaluate geographical variations in adherence to therapy for different conditions [45, 46]. Future research will also have to take into account temporal variations to better capture associations between adherence and health outcomes [47].

A recent Italian population-based cohort study provided important data on potential predictors of treatment failure, such as poor adherence, in a large sample of young people affected by schizophrenia through the analysis of a claims database in the Lombardy region [44]. Claims records have also been used to construct ad hoc algorithms for the identification of selected neurological disorders. One such disease is epilepsy. Using an independent source of patients with epilepsy in a district of Lombardy, northern Italy, a diagnostic algorithm including electroencephalogram and selected treatment schedules was found to be moderately sensitive for the detection of epilepsy and seizures [48].

The enrichment of the Italian healthcare databases with clinical data will be essential to improve the quality of drug utilisation research. As discussed (see Tables 1 and 2), clinical information in Italian claims databases is scarce and mainly restricted to hospital discharge records, emergency admission and exemptions from co-payment. On the other hand, Italian EHR databases such as HSD do have clinical information, but they usually cannot be linked to claims data. The recent introduction of e-prescribing in Italy could constitute an advance in drug utilisation research, as the new prescription form includes a field specifying the indication for drug use.

4.2 Implementation and Impact of Risk-Minimisation Measures

RMMs are public health interventions aimed at preventing or reducing the frequency and/or severity of adverse drug reactions (ADRs) [49]. A recent review reported that almost 75% of all RMM evaluation studies around the world were conducted using EHRs or claims databases [50]. This section explores two RMM evaluations carried out using Italian data sources: antipsychotic use in older people with dementia and off-label use of ketorolac and the related suspected ADRs reported.

Antipsychotic drugs are often used to manage the behavioural and psychological symptoms of dementia. However, these drugs were found to lead to an increased risk of all-cause mortality and stroke. Several drug agencies around the world issued drug safety warnings on this risk, leading to a number of observational studies [51,52,53,54,55]. In Italy, the national investigation into antipsychotic drug utilisation by people with dementia using the GP database HSD showed that the international (EMA) and national (AIFA) public health interventions promoting the appropriate use of these drugs did not lead to reduced prevalence of antipsychotic use, especially when compared with another country such as the UK, where use decreased over the study period (2000–2012) [3]. This study was instrumental in showing the effect of RMMs in Italy, and is the only one to do so to date.

Another public health intervention concerns the widespread off-label use of ketorolac. Ketorolac is associated with a higher risk of upper gastrointestinal bleeding and a risk of precipitating renal failure [56]. As an RMM, in 2007 AIFA included ketorolac in the national list of drugs under strict monitoring and launched an information campaign among physicians by circulating a ‘Dear Doctor’ letter to ensure the appropriate use of this drug. However, despite EMA recommendations to restrict ketorolac prescribing to specialists only, Italy opted to allow GPs to issue non-repeatable prescriptions for this drug [57]. An Italian study explored the extent of ketorolac off-label use in Italy and its association with the risk for ADRs after the RMM adopted in 2007. The study included two data sources: the Arianna EHR database linked to the Caserta LHU database from 2002 to 2013 and the Italian pharmacovigilance spontaneous reporting system database from 2001 to 2014. Findings from the EHR indicated that the proportion of ketorolac prescriptions with an inappropriate indication for use decreased during the 3 years after the AIFA RMM (from 53% in 2007 to 45–46% in 2008–2010) but increased again after 2011 (from 50% in 2011 to 53% in 2013). Serious ADR reports including upper gastrointestinal bleeding due to ketorolac in the Italian Pharmacovigilance Database did not decrease after the RMM in 2007 [58].

4.3 Drug Safety

Drug safety information in the pre-marketing phase is inevitably limited because randomised controlled trials (RCTs) are not usually designed to evaluate drug safety profiles [59]. In fact, during drug safety assessment, minor deviations from the intended randomisation could significantly affect the stability of the risk estimates [59]. By the time a drug reaches the market, in general fewer than 5000 individuals have been exposed to it. As a result, only the most common ADRs can be detected. At least 30,000 people need to be exposed to a drug to capture an ADR with an incidence of 1 in 10,000 exposed individuals. Furthermore, information on very rare but serious ADRs, long-term safety, use in special groups or drug–drug interactions is often incomplete or unavailable, even for commonly used drugs [60, 61].
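The figure of 30,000 exposed individuals is consistent with the so-called rule of three. The following Python sketch shows the underlying calculation, under the assumptions (not stated in the text) that ADRs occur independently across exposed individuals and that a 95% probability of observing at least one case is required. The function names are illustrative.

```python
import math


def min_exposed(incidence, detection_prob=0.95):
    """Smallest n such that P(at least one ADR among n exposed) >= detection_prob.

    Assumes each exposed individual independently experiences the ADR
    with probability `incidence`, so P(no case in n) = (1 - incidence) ** n.
    """
    if not 0 < incidence < 1:
        raise ValueError("incidence must be strictly between 0 and 1")
    if not 0 < detection_prob < 1:
        raise ValueError("detection_prob must be strictly between 0 and 1")
    return math.ceil(math.log(1 - detection_prob) / math.log(1 - incidence))


def detection_probability(n_exposed, incidence):
    """Probability of observing at least one ADR among n_exposed individuals."""
    return 1 - (1 - incidence) ** n_exposed


# An ADR with an incidence of 1 in 10,000 exposed individuals.
n_required = min_exposed(1 / 10_000)

# Chance of seeing at least one such ADR among 5000 pre-marketing exposures.
p_premarketing = detection_probability(5000, 1 / 10_000)

print(n_required, round(p_premarketing, 2))
```

For small incidences the exact result is closely approximated by 3 divided by the incidence, which gives the 30,000 quoted above. Under the same assumptions, 5000 exposed individuals would have well under a 50% chance of revealing even a single case of an ADR with this incidence.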

Post-marketing surveillance systems may be passive or proactive; however, the most commonly used approach is passive drug surveillance relying on spontaneous reporting systems. This approach is susceptible to underreporting, lack of denominator information concerning drug exposure (needed to estimate risk), and difficulty in distinguishing ADRs from events associated with underlying diseases or other factors, as well as reporting bias from excessive media attention [62]. To strengthen post-marketing drug safety measures and compensate for their limitations, a different data source containing such information, as well as other methods to analyse such data, are required [63]. Claims data and EHRs can adequately address these limitations.

Safety studies using large healthcare databases form an essential component of post-marketing drug surveillance [64]. By increasing the population sample size and heterogeneity of exposure, this approach offers the opportunity to design studies including very large ‘unselected’ populations and to generate real-world evidence in several fields of healthcare. Additionally, compared with traditional spontaneous reporting systems, the post-marketing approach using electronic healthcare data does not suffer from underreporting and reporting biases, thus potentially facilitating a timelier identification of signals. Compared with spontaneous reporting systems for ADRs, real-world healthcare databases may have a greater potential for detecting signals for events with low frequency in the general population and for events that are commonly not considered as potentially drug induced [65].

As for drug utilisation research, Italian databases often contribute to European healthcare database networks for large-scale monitoring of drug safety [66]. For example, the EU-ADR project used data from eight databases set in Denmark, the Netherlands, the UK and Italy and the well-known association between nonsteroidal anti-inflammatory drugs (NSAIDs) and upper gastrointestinal bleeding to clearly show that combining data from European EHR databases to identify adverse drug event associations is feasible and can set the stage for changing and enlarging the scale for drug safety monitoring [67]. Since then, at least three other European multi-country collaborative projects involving Italian electronic healthcare databases have been developed: the ARITMO project [28], the SOS (Safety of Non-steroidal Anti-inflammatory Drugs) project [68] and the SAFEGUARD (Safety Evaluation of Adverse Reactions in Diabetes) project. Alongside these multi-national projects, several multi-regional projects have been developed and are still ongoing. One such example is the AIFA-funded multi-regional study, AIFA-BEST (Bisphosphonates Effectiveness-Safety Trade-off), aiming to provide an assessment of the benefit–risk profile of bisphosphonate use in Italy [69, 70].

Italian databases have also contributed to several new drug safety signals concerning, for example, (1) the risk of acute liver injury in children and adolescents using domperidone, flunisolide and human insulin [4]; (2) the differential risk of chronic kidney disease associated with individual NSAIDs, particularly ketorolac, and long-term use of oxicams, especially meloxicam and piroxicam [71]; and (3) the risk of diabetes onset associated with statins [72]. Drug exposure can also be assessed as a risk factor (or a protective factor) for the occurrence of a disease. The association between angiotensin-converting enzyme inhibitors (ACEIs), angiotensin II receptor blockers (ARBs) and motor neuron disease (MND) was investigated using administrative records from the Lombardy region. Contrary to a previous report that showed an inverse association between ACEIs and MND, the study found no significant association between MND/ALS and antecedent use of ACEIs or ARBs [73].

4.4 Comparative Effectiveness Research and Health Technology Assessment

Comparative effectiveness research (CER) concerns the direct comparison of existing healthcare interventions, identifying which patient groups will benefit most from a given intervention and which are more liable to harm. The availability of routinely collected healthcare data in Italy offers a unique opportunity for research in the context of CER, comparing the benefits and harms of therapeutic strategies available in routine practice, for the prevention, diagnosis or treatment of a given health condition aiming to determine which treatment works best, for whom, and under what circumstances [74]. CER can be, and is, carried out using common analytic methods applied in observational research based on healthcare databases.

In recent years, a large number of studies focusing on effectiveness and safety have been performed in Italy, at both local and multi-centre levels and in international contexts. One example is a multi-centre Italian CER study funded by AIFA, known as the OUTPUL project. This project investigated the outcomes of inhaled therapy in patients with chronic obstructive pulmonary disease (COPD). Data from three regions were pooled to address several effectiveness and safety issues [75, 76]. In addition, the multi-regional project on the evaluation of biologic drugs in cancer patients (Farmaci Biologici in Oncologia [FABIO]) was recently supported by AIFA with the aim of assessing drug utilisation, effectiveness and safety of oncologic drugs, with a specific focus on biological drugs, using the regional LHU databases of seven Italian regions: Lombardy, Tuscany, Marche, Abruzzo, Lazio, Sicily and Sardinia. Several other Italian studies have investigated the gap between scientific evidence from RCTs and clinical practice regarding drugs for hypertension [43, 77], hyperlipidaemia [78], diabetes [79], cancer [80, 81], chronic respiratory diseases [82, 83], osteoporosis [70, 84] and schizophrenia [85], among others. Finally, it is worth highlighting that some attempts to evaluate the cost-effectiveness profile of health interventions were based entirely on data derived from Italian electronic healthcare data [86, 87].

HTA concerns the systematic evaluation of properties, effects and/or impacts of health technology, and observational data also have an important role. Since HTA is an aid to decision makers, decisions are to be grounded in the context in which they are to be taken. Real-world data sources are extremely useful tools for HTA practice but should be always put in the context of their strengths and weaknesses. Several sources of data in Italy have been used for HTA purposes [88,89,90].

4.5 Post-Marketing Drug Safety Assessment in Special Populations

Drug use and safety monitoring among children, pregnant women and the elderly is a priority and a challenge as these vulnerable populations are normally excluded from RCTs [91].

4.5.1 Children and Adolescents

Limited data are available on drug use in children and adolescents, mainly related to differing doses and formulations, which often results in off-label use in such populations [92]. However, it is well known that children differ from adult patients not only in size and weight but also in several physiological characteristics, resulting in significantly different pharmacokinetic and pharmacodynamic profiles compared with adults [93]. As a result, efficacy and safety data cannot be fully extrapolated from adult data [94]. Several commonly used medicines and medicinal products, including vaccines, antibiotics, antipyretics, NSAIDs and drugs used for the symptomatic treatment of colds or for gastrointestinal and metabolic disorders, could more frequently induce ADRs in children than in adults [95].

Many studies, some of which have used Italian healthcare databases, have investigated the safety profile of drugs in children and adolescents. They have shown that (1) anti-infective agents for systemic use, drugs for central nervous system disorders, and NSAIDs most frequently caused ADRs in children [96]; (2) several drugs were most frequently reported as suspected causes of hepatic injury (e.g. paracetamol, valproic acid, carbamazepine, methotrexate and others) [97]. Other studies investigated the effectiveness and safety of methylphenidate and atomoxetine in the treatment of attention-deficit/hyperactivity disorder (ADHD) in daily clinical practice [98,99,100].

Linking different healthcare claims databases (e.g. drug prescription, hospital admission, specialist visits) permits the monitoring of diagnostic and therapeutic approaches in paediatric patients with chronic disorders (e.g. asthma, epilepsy) and the evaluation of compliance between daily clinical practice and guidelines [101, 102].

The availability of routinely collected EHR data through the Pedianet network of Italian family paediatricians offers a unique opportunity for studying drug effectiveness and safety profiles among children (http://www.pedianet.it/en/). The safety of several drugs in children has been investigated using Pedianet, including NSAIDs [103, 104] and antibiotics [11, 105].

4.5.2 Pregnancy and Lactation

The need to monitor post-marketing drug use and safety in pregnancy was highlighted by the thalidomide tragedy, in which this drug, prescribed as a therapy for morning sickness in pregnant women, was found to be teratogenic, causing phocomelia [106]. Nevertheless, few studies have evaluated the safety and efficacy of drugs in pregnant or breast-feeding women, even though many women use medicines during pregnancy or lactation, whether for chronic or acute conditions. Initiatives aiming to address this information gap are increasing [103, 104]. In fact, the long-standing concern about protecting the foetus, and the resulting exclusion of pregnant women from clinical trials, is slowly being re-evaluated [109], and regulatory authorities are recognising the need for reliable information on the use of medicines during pregnancy and lactation. Italian data sources used for post-marketing studies include pregnancy registries, medical claims databases and spontaneous ADR-reporting databases.

Spontaneous ADR-reporting system databases can often provide important signals of teratogenicity or foetal toxicity after drug exposure. For instance, an evaluation of adverse event (AE) reports in the Italian Pharmacovigilance AE Spontaneous Reporting System (Rete Nazionale di Farmacovigilanza [RNF]) concerning the pandemic influenza vaccine, which was also administered to pregnant women from October 2009 to June 2010, suggested a positive benefit–risk balance in this population [110].

An important development concerns the contribution of Italian healthcare databases to the EUROCAT network (which investigates congenital malformations) and to EUROmediCAT (which investigates drug-related aspects of congenital malformations). Both the Tuscany and Emilia-Romagna regions contribute claims data and registries [7, 111,112,113,114,115,116].

Finally, Teratology Information Services (TIS) provide both the public and health professionals with tailor-made information on drug risks in pregnancy. TIS in Italy is a partner of the European Network of Teratology Information Services (ENTIS), the aim of which is to optimise the interpretation of teratogenic risk, risk communication and risk management as well as recommendations for drug treatment in pregnant women [117]. Italian data sources have been used to conduct several investigations on gestational diabetes [118] and the use and effects of antidepressants during pregnancy [115, 116]. Some of these studies link other claims data to routinely collected Italian birth registry data (see Table 1).

4.5.3 Elderly Residing in Nursing Homes

The elderly are generally considered susceptible to ADRs because of several factors, including widespread polypharmacy and multi-morbidity. However, elderly people residing in nursing homes (NHs) constitute a sub-population that deserves special interest because they are likely to be among the frailest, but health data in this setting may be difficult to capture. Polypharmacy is a concern in NHs, because elderly NH residents often have a large number of diseases and frequently require multiple medications. Polypharmacy and potentially inappropriate medications were found to be more frequent among institutionalized people than among elderly people living in the community [121], and inappropriate drug use was shown to increase the risk of adverse health outcomes (hospitalizations, emergency department visits, death) in NH residents.

Data on drug therapy in NHs for large pharmacoepidemiological studies have been retrieved outside of Italy from EHRs or from insurance claims and pill dispenser systems [122], whereas in Italy only ad hoc surveys on specific sub-populations are presently available [119, 120]. The absence of drug-related information in Italian NHs is partly because specific claims databases for this population are unavailable in many regions, which is in turn due to the different procedures for medication purchase and reimbursement in NHs compared with the general population.

Fortunately, NHs in Italy are beginning to make use of electronic case report forms, where data on drugs are integrated with other clinical data on diseases, functional status, etc. This is a good opportunity that could be exploited to conduct pharmacoepidemiological studies using a wealth of other information, particularly if all NHs could collect a standardized set of data, such as the ‘minimum data set’ that US NHs are mandated to supply for all residents in Medicare and Medicaid-certified NHs [125]. Despite the general lack of systematic data collection in NHs, some forms of routinely collected geriatric data can be used in observational research. One such example is geriatric data collected through a multi-dimensional assessment instrument used for administrative and clinical purposes, known as interRAI, which contains information on sociodemographics, medical history, functional status and cognition and has been used in many regions in Italy [37]. This tool has been used to study the risk of hospitalisation and death with anticholinergic drug use in elderly individuals in NHs.

4.6 Use and Safety of Generics and Biosimilars

Generic drugs are important for healthcare systems because they cost less than the originators and must be proven as effective as the originator in order to be marketed. Physicians are therefore usually advised to prescribe generics rather than originators [122, 123]. However, whether generic drugs are as therapeutically effective and safe as their originator counterparts is still occasionally questioned by clinicians [128]. An Italian population-based cohort study using the healthcare utilisation (claims) databases of the Lombardy region compared patients treated with generic and brand-name statins in terms of therapeutic interruption and cardiovascular outcomes [129]. This study showed that generic statins were not associated with a greater risk of discontinuation or hospitalization for major cardiovascular events than originator statins. Such a study, as well as others, has a key role in providing evidence to reassure physicians on the safety of generic drugs [130, 131]. As for biological drugs, concerns about the safety and effectiveness of biosimilar drugs are still widespread. These drugs are commonly used for high-burden therapeutic areas, such as dermatology, gastroenterology, rheumatology and oncology, providing on average at least a 20–30% purchase cost reduction in comparison to the reference [132].

Since 2006, many biological drugs have lost their patent; of a total of 41 biosimilars approved by the EMA [133], 36 are currently authorised in the EU (e.g. somatropin, epoetin alfa, filgrastim, follitropin alfa, insulin glargine, infliximab, etanercept and, recently, enoxaparin sodium, teriparatide, rituximab and adalimumab) and an additional 18 biosimilars are awaiting marketing authorization [134]. Biosimilars are biological drugs similar to a previously approved original biological drug (reference product) and are considered effective and safe therapeutic alternatives to their reference products, based on data from a comparability exercise [135]. However, in Italy, biosimilar use was suboptimal in clinical practice, for reasons that include safety concerns [136]. An Italian project funded by the Italian Health Ministry addressed this issue through a network of six claims databases from Palermo, Caserta, Treviso LHUs and the Tuscany, Umbria and Lazio regions, covering a total population of around 13 million inhabitants (25% of the Italian population). Four observational studies conducted in the course of this project showed an increasing trend in the use of biosimilar epoetins, filgrastim and somatropin; however, the proportion of those using biosimilar somatropin was rather low compared with biosimilar epoetins or filgrastim [133,134,135]. These studies also highlighted that therapeutic substitution between biosimilars and originators of the same therapeutic class is frequent in routine care, despite ongoing debates about their comparative safety and effectiveness. Italian post-marketing studies and analyses of pharmacovigilance databases have so far provided reassuring data on the effectiveness and safety of biosimilars of epoetins [139, 140], as well as the safety of switching between reference products and biosimilars of epoetins, filgrastim, somatropin and infliximab [141, 142].

4.7 Vaccines

In the last few years, vaccine use and safety have been of particular interest. Specific issues, such as concern about the 2009 A/H1N1 pandemic influenza, prompted the FDA and the EMA to approve the influenza vaccine without prior regular pre-marketing clinical testing. It was therefore important for studies in subsequent years to confirm the efficacy and safety of the vaccine. On this occasion, for the first time, pregnant women were vaccinated. A study was carried out in pregnant women vaccinated with the influenza vaccine [143], using the vaccine database of the Lombardy region in Italy to identify the exposure status, and the birth registry and the hospital discharge database to ascertain the outcomes.

Many Italian regions have computerized databases of vaccinated subjects that can be linked to demographic, outpatient and/or hospital datasets. These databases have formed the basis of several important studies, although the use of these registers has been limited to a few research groups within specific regions. Indeed, in Italy, immunization registries are created and updated at the regional level. Unfortunately, there is no single national vaccine registry [144].

Registry data are important not only in evaluating vaccine coverage but also in providing exposure data to compare with the spontaneous reporting data. The reporting rate (number of reports divided by number of exposed individuals) is used much more in vaccinovigilance than in pharmacovigilance, whereas the disproportionality analysis used in signal detection is more difficult to apply to vaccine data. Registry data allow the comparison of the event reporting rate associated with a vaccine with the background rate of the same event in the population. The linkage of vaccine databases with data from the national pharmacovigilance network has been the basis of several studies, some of which concern the safety of the hexavalent vaccine [145, 146] and of various pneumococcal vaccines. One study using the database of the Italian Pharmacovigilance Network (Rete Nazionale di Farmacovigilanza) and drug utilisation data for the three vaccines against measles, rubella and mumps (MMR) showed a higher rate of hypersensitivity to Morupar than to the other vaccines [147,148,149]. As a result, Morupar was withdrawn and alternative vaccines were used instead. The types of data used to study vaccine safety are expanding in both number and nature. There is growing interest in the way social media can be used to capture patient-generated information on drug and vaccine safety. For example, a recent Italian study investigated the internet and public confidence in the MMR vaccination in Italy [150].
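The reporting rate and its comparison with a background rate, as described above, can be illustrated with a minimal sketch. The function names, the person-time approximation based on a fixed risk window and all numbers are assumptions made for illustration only; they are not taken from the cited studies, and spontaneous reports are subject to under-reporting, so such ratios are at best a crude screening aid.

```python
def reporting_rate(n_reports: int, n_exposed: int, per: int = 100_000) -> float:
    """Spontaneous reports per `per` exposed (vaccinated) individuals."""
    if n_exposed <= 0:
        raise ValueError("n_exposed must be positive")
    return n_reports / n_exposed * per


def observed_to_expected(
    n_reports: int,
    n_exposed: int,
    background_per_100k_py: float,
    risk_window_days: float,
) -> float:
    """Ratio of observed reports to the number expected from the background rate.

    Assumption: each vaccinated individual contributes `risk_window_days`
    of person-time at risk, and the background incidence is expressed
    per 100,000 person-years.
    """
    person_years = n_exposed * risk_window_days / 365.25
    expected = background_per_100k_py / 100_000 * person_years
    if expected <= 0:
        raise ValueError("expected number of events must be positive")
    return n_reports / expected


if __name__ == "__main__":
    # Hypothetical figures: 12 reports among 400,000 vaccinees recorded in a
    # regional immunization registry, a background incidence of 10 per
    # 100,000 person-years and a 42-day risk window.
    print(reporting_rate(12, 400_000))
    print(observed_to_expected(12, 400_000, 10.0, 42))
```

The denominator is the quantity that an immunization registry supplies. Without it, only disproportionality measures based on report counts would be available.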

5 Challenges

5.1 Multi-Database Studies

Data from different databases, including those in different countries or regions, can be pooled to increase statistical power when studying rare outcomes. An additional benefit of such pooling is the greater generalisability of the evidence generated to broad and heterogeneous populations, increasing the value of such evidence [151]. By combining databases, the effects of a wide variety of healthcare services, including medications and other medical interventions, can be studied and compared on ever increasing scales [66, 67, 152].

Several examples of projects and studies based on multiple databases from European countries, including Italy [31,32,33, 68], as well as from Italian regions [36, 69, 70, 84, 153, 154], have been described in the previous paragraphs. Many such studies combined all individual patient records coming from single databases, generating an overall database on which the effect of interest is estimated, i.e. by what is known as a one-stage meta-analysis [155]. However, the new European regulations on data protection [156] limit the use of electronic health data in this manner, making it more difficult to use a one-stage approach when combining individual participant data from different countries, and even from different regions within the same country. For this reason, alternative approaches to the one-stage meta-analysis need to be considered. Among these, the so-called two-stage meta-analysis involves separate analyses within each database and subsequent combination of database-specific estimates through standard meta-analytic techniques [157]. This approach should be in line with the above-mentioned European regulations and has been widely used in many international, European and Italian initiatives [158].
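The second stage of a two-stage meta-analysis can be sketched as follows. This is a minimal illustration that assumes each database shares only an aggregate log hazard ratio and its standard error, and that these are combined with a fixed-effect inverse-variance method; the text above does not specify which meta-analytic technique is used, and a random-effects model would be an equally plausible choice. The function names and input values are hypothetical.

```python
import math


def pool_fixed_effect(estimates):
    """Inverse-variance (fixed-effect) pooling of database-specific estimates.

    `estimates` is a list of (log_hazard_ratio, standard_error) tuples, one
    per database. Only these aggregate results leave each site; no individual
    patient records are shared.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * est for w, (est, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


def hr_with_ci(log_hr, se, z=1.96):
    """Back-transform a log hazard ratio to the HR scale with a 95% CI."""
    return math.exp(log_hr), math.exp(log_hr - z * se), math.exp(log_hr + z * se)


if __name__ == "__main__":
    # Hypothetical stage-one results from three regional databases.
    stage_one = [(0.10, 0.20), (0.30, 0.25), (0.18, 0.15)]
    log_hr, se = pool_fixed_effect(stage_one)
    print(hr_with_ci(log_hr, se))
```

In a one-stage analysis, by contrast, a single model would be fitted to the stacked individual records of all databases, which is the step that data protection rules make difficult.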

Some initiatives aimed at comparing one-stage and two-stage approaches in the field of multiple database studies have been conducted [159, 160]. In the meantime, multi-regional studies using the two-stage approach are ongoing. For example, some current initiatives of the Italian Ministry of Health are monitoring healthcare pathways in Italian regions, particularly for diabetes, heart failure, COPD and breast and colorectal cancers.

5.2 Linkage with Clinical Databases

Many research centres in Italy have access to one or more databases, including claims databases, EHRs, or medical registries, and perform individual-level record linkage among databases. While Italy is rich in claims databases, the data remain largely of a non-clinical nature and are not collected for clinical purposes. The value of claims databases and similar data sources stands to improve significantly with the addition of a clinical perspective, through linkage with clinical databases, such as laboratory test values, radiographic imaging and geriatric evaluations. However, the linkage between claims and non-claims data, whether electronic medical records, registries or other types of data, is very challenging. A notable difficulty concerns the need to accurately and thoroughly harmonise not only the methods applied but also the identification of covariates such as medical events. This is a challenging and lengthy task that requires mapping different terminologies to specified concepts, such as those of the Unified Medical Language System [161]. Such obstacles also hamper linkage of this kind outside Italy, as shown by the paucity of clinical data included in real-world data analyses. This paucity is a major limitation to the proper evaluation of drug safety, and clinical data are essential in assessments of effectiveness.

Nevertheless, some initiatives to link claims data to clinical data are underway in Italy. The linkage of claims with clinical databases has an important caveat: the clinical outcomes require validation to ensure they truly represent a clinical condition. Some important examples of clinical databases that can be linked to claims databases include (1) the comprehensive geriatric evaluations (including laboratory tests, activities of daily living, cognitive performance and body mass index) collected by a network of geriatricians from the Caserta LHU (almost 25% of the catchment area is available), (2) population-based cancer registries (reporting data with good quality and completeness involving all individuals belonging to the population covered by the registry who received a cancer diagnosis), (3) the hospital-based cancer registry of the National Cancer Institute of Milan (reporting histological, radiological and immunohistochemical data on lung cancer inpatients who have undergone surgery), (4) population-based cross-sectional surveys, such as those included in the Health Examination Survey (OEC/HES) coordinated by the National Health Institute (Istituto Superiore di Sanità) (reporting data on lifestyles, laboratory tests and health examinations of individuals randomly selected in the 1990s from resident populations within each Italian region) and (5) the Fondazione IRCAB Registry of cardioverter-defibrillator implants [162]. However, these initiatives are currently the exception rather than the rule in the Italian setting, mainly because of privacy concerns. Nevertheless, progress is being made in this area. For instance, a recent study obtained specific permission from the National Privacy Authority to perform record linkage at the individual level between claims and EHR primary care records of more than 30,000 individuals in five Italian regions [163] for the purpose of validation. The appropriate use of probabilistic record linkage techniques remains an important way forward to link clinical and claims databases.
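To make the idea of probabilistic record linkage concrete, the sketch below scores a candidate pair of records using agreement weights in the style of the Fellegi-Sunter model. It is an illustration under stated assumptions, not the method of any study cited here: the field names and the m- and u-probabilities (the probability that a field agrees among true matches and among non-matches, respectively) are hypothetical and would in practice be estimated from the data.

```python
import math

# Hypothetical (m, u) probabilities per comparison field:
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELDS = {
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
    "municipality": (0.90, 0.02),
}


def match_weight(rec_a: dict, rec_b: dict, fields: dict = FIELDS) -> float:
    """Sum of log2 agreement/disagreement weights over the comparison fields.

    Fields missing from either record contribute no evidence (weight 0).
    """
    total = 0.0
    for field, (m, u) in fields.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue
        if a == b:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


if __name__ == "__main__":
    # Hypothetical claims and clinical records without a shared identifier.
    claims_rec = {"birth_date": "1941-03-07", "sex": "F", "municipality": "Caserta"}
    clinic_rec = {"birth_date": "1941-03-07", "sex": "F", "municipality": "Caserta"}
    print(match_weight(claims_rec, clinic_rec))
```

Pairs scoring above an upper threshold would be accepted as links, those below a lower threshold rejected, and those in between sent for manual review. The thresholds themselves are a study-specific choice.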

6 Conclusions

Although RCTs are undoubtedly the gold standard to demonstrate the efficacy and, to a lesser extent, the safety of new medicines before marketing approval, they have several limitations that can be addressed using healthcare databases. Italy is rich in healthcare databases, including EHRs, claims databases and AIFA-endorsed drug-monitoring registries. While much progress has been made in recent decades to use these data sources extensively to address various research questions on post-marketing use and safety of medicines, concerted efforts are needed to also harness data that are currently being collected but not used [164].