1 Introduction

Rare diseases affect between 25 and 30 million people in the United States, or 10% of the population, and an estimated 350 million people worldwide [1,2,3]. Although the definition of a rare disease varies somewhat across regions, a disease or disorder is considered rare in the United States when there are fewer than 200,000 cases at any given time, and in Europe, when the condition affects fewer than one in 2000 individuals [3, 4]. Globally, 75% of rare diseases are pediatric and 30% of the affected children do not live past the age of 5 years [1]. Despite the staggering number of individuals whose lives are altered due to a rare disease diagnosis, 90% of the approximately 7000 known rare disorders have no US Food and Drug Administration (FDA)-approved treatment [1, 5]. Barriers to research for effective treatments include restricted funding support, limited foundational disease-specific knowledge, gaps in understanding of the heterogeneity of the condition, caution around risk–benefit thresholds and clinically meaningful impact for the patient population, small and dispersed patient communities that challenge traditional methodologies, and fragmentation of efforts that impedes timely scientific discovery [1, 6]. To reduce and mitigate some of these barriers, patient registries are increasingly utilized by experts within the rare disease research field to facilitate learning networks and research collaborations between industry, scientific researchers, regulators, clinicians, community organizations, and patients and families [7,8,9].
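The two regional definitions above rest on different measures (an absolute case count in the United States, a prevalence ratio in Europe), so the same condition can be classified differently depending on the population considered. The following is a minimal sketch of that comparison; the function names and the example figures are hypothetical and chosen only for illustration, while the thresholds are those stated in the text.

```python
# Thresholds as stated in the text; everything else here is illustrative.
US_CASE_THRESHOLD = 200_000   # rare in the US: fewer than 200,000 cases
EU_ONE_IN = 2000              # rare in Europe: fewer than 1 in 2000 individuals


def is_rare_us(cases: int) -> bool:
    """US-style definition: an absolute number of cases."""
    return cases < US_CASE_THRESHOLD


def is_rare_eu(cases: int, population: int) -> bool:
    """European-style definition: a prevalence ratio.

    Integer arithmetic avoids floating-point rounding at the boundary:
    cases / population < 1 / 2000  is equivalent to  cases * 2000 < population.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return cases * EU_ONE_IN < population


if __name__ == "__main__":
    # Hypothetical condition: 180,000 cases in a population of 300 million.
    cases, population = 180_000, 300_000_000
    print("rare by case count:", is_rare_us(cases))
    print("rare by prevalence:", is_rare_eu(cases, population))
```

With these hypothetical figures the condition falls under the case-count threshold but not under the prevalence threshold, which illustrates why the definition in use should be stated whenever registry data are compared across regions.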

By definition, a patient registry is “an organized system that uses observational study methods to collect uniform data (clinical and other) to evaluate specified outcomes for a population defined by a particular disease, condition, or exposure, and that serves one or more predetermined scientific, clinical, or policy purposes” [10, 11]. The Agency for Healthcare Research and Quality’s registries guide identifies four primary use cases for patient registries: (1) describing a disease’s natural history; (2) determining clinical effectiveness of treatments; (3) assessing the safety of treatments; and (4) evaluating or improving quality of care [9, 12]. Often, registries collect additional individual-level details not typically captured within traditional clinical settings that can inform the design of clinical trials to better reflect the needs of the patient population [13].

Patient registry frameworks vary significantly, reflecting design differences that accommodate intended use cases and a predetermined purpose. Recognizable examples include (1) public health and epidemiological disease (tracking) registries that monitor the prevalence, incidence, and trends of specific diseases; (2) clinical registries which gather physician-entered information regarding a patient’s disease progression, treatment, and symptom management, often with the collection of biological samples; (3) product registries which capture data on the efficacy and safety of new or repurposed drugs, medical devices, and other therapeutic and pharmaceutical products; and (4) natural history registries, particularly relevant for rare disease research, that generate patient-reported data to document the foundational characteristics of a condition [6, 7, 13].

The challenges to rare disease research are many, including the heterogeneity of disease presentation and limited knowledge of the true natural history both within individual rare diseases and across them. We are at a critical moment: rare diseases are gaining recognition as a public health priority, and as such, the marketplace is rapidly expanding, resulting in community fragmentation from redundant initiatives in already small patient populations. In addition, emerging trends in precision medicine have focused on exploring more common, complex conditions, such as diabetes and heart disease, with the goal of identifying individual-level variations and smaller subgroups based on genetic subtypes, variations in drug response, and/or social and environmental disparities [14, 15]. The identification of subgroups effectively creates rare disease subtypes of common conditions, which will likely be subject to the same persistent limitations that affect rare diseases in research methods, data collection, standards of care, clinical specialization, and policy.

As the FDA turns its focus to natural history studies, real-world data, and accelerated approvals for rare diseases, there is a parallel emergent trend and broad acknowledgement of the importance of building collaborative relationships in rare disease research that empower patients and community organizations, while supporting formal partnerships with academic, industry, and government agencies to advance high-value, high-utility research for rare conditions. The recent FDA Draft Guidance “Rare Diseases: Natural History Studies for Drug Development” clearly states the importance of natural history registry studies in the drug development process and also acknowledges additional benefits through establishing communication pathways for the community, identifying centers of excellence for rare conditions, and evaluating differences and reducing variations in treatment practice, while establishing improved standards of care and tracking the disease to provide demographic and prevalence data [16]. In relation to the utility of real-world data, which is the basis for real-world evidence, the “Framework for FDA’s Real-World Evidence Program” recognizes the importance of patient registries as a source, and highlights that processes to minimize missing or incomplete data and the collection of rigorous, high-quality data are critical to ensuring that the data generated are fit for use as real-world evidence [11]. In concordance with certain provisions of the 21st Century Cures Act, foundational, structural, semantic, and organizational interoperability processes must be implemented and widely adopted, thereby optimizing the utility of data, in order to accelerate research and development [17, 18]. 
This is particularly salient for rare diseases, where harmonizing data from different sources through the use of common data elements, core outcome sets, and standardized data structures can support the exchange and comparability of data across datasets and the utility and scalability of patient registries [19,20,21,22]. Finally, the FDA has prioritized, through the 21st Century Cures Act and the institution of the Accelerated Approval Program, the expedited approval of drugs that fill a critical unmet medical need for treating serious conditions based on a surrogate endpoint thought to predict clinical benefit, rather than on an initial measure of clinical benefit itself [18, 23]. The subsequent increase in new molecular entity (NME) submissions and approvals for rare diseases over the last 5 years represents a much-needed expansion of research and product development. Patient registries and natural history studies are a critical piece of the puzzle, promoting the acceleration of scientific advancement and product development informed by the lived experiences of the community. This manuscript outlines use cases and specific considerations for the implementation and application of patient registries and natural history studies.
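The harmonization of data from different sources through common data elements, described above, can be illustrated with a minimal sketch. The registry names, local field names, common data element names, and the unit conversion below are all hypothetical; they are not drawn from any published standard and serve only to show the general shape of a field-mapping step.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical mapping from each registry's local field names to shared
# common data element (CDE) names.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "registry_a": {
        "dob": "birth_date",
        "dx_age": "age_at_diagnosis_years",
        "sex": "sex",
    },
    "registry_b": {
        "date_of_birth": "birth_date",
        "age_dx_months": "age_at_diagnosis_years",
        "gender": "sex",
    },
}

# Hypothetical unit conversions needed to express a local value in the CDE's unit.
CONVERTERS: Dict[Tuple[str, str], Callable[[Any], Any]] = {
    ("registry_b", "age_dx_months"): lambda months: months / 12,
}


def harmonize(source: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Rename a record's fields to CDE names and convert units where needed."""
    mapping = FIELD_MAPS[source]
    harmonized: Dict[str, Any] = {"source": source}
    for local_name, value in record.items():
        cde_name = mapping.get(local_name)
        if cde_name is None:
            # Unmapped fields are dropped in this sketch; a real registry
            # might instead retain them as registry-specific data elements.
            continue
        convert = CONVERTERS.get((source, local_name))
        harmonized[cde_name] = convert(value) if convert else value
    return harmonized


if __name__ == "__main__":
    a = harmonize("registry_a", {"dob": "2014-09-30", "dx_age": 2, "sex": "M"})
    b = harmonize("registry_b", {"date_of_birth": "2015-04-01",
                                 "age_dx_months": 18, "gender": "F"})
    # Both records now share the same field names and units.
    print(a)
    print(b)
```

Keeping the mapping in data rather than in code is one possible design choice: adding a new source then requires only a new entry in the mapping table.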

2 Rationale and Considerations

2.1 Community Ownership and Multi-Stakeholder Collaboration

With the surge of big data and real-world evidence, there is no shortage of personal health information and patient-generated data. The evolution from disease-centered healthcare to individualized precision medicine and patient-centered healthcare models will continue to shift the landscape of research and data collection techniques [24]. With the rise in shared decision-making, individuals and community organizations are increasingly regarded as research partners, particularly when it comes to patient registry study design and governance [25]. By centering the perspectives of individuals affected by a rare disease diagnosis and empowering their participation in the planning of patient registries, there is increased potential for collaboration and engagement with researchers and the development of outcomes that are meaningful to the community [8, 26].

Incorporating collaborative research processes that include a variety of stakeholders such as patient organizations, researchers, clinicians, industry, and government agencies in the planning and design of patient registries is essential for rare disease research [7, 8]. Although stakeholders may have different objectives, the development of a patient registry has the potential to align interests around a centralized initiative [7]. A fundamental step in the research process is to begin with the patient community, supporting community ownership of the registry and data in order to ensure that the assets are enduring, longitudinal resources for the patient population that are not subject to disruptions in funding, resources, and business priorities or impacted by proprietary or legacy ownership restrictions [6]. In addition, community advisory groups, Patient Listening Sessions, and Patient-Focused Drug Development meetings are tools that can be used to support and inform the early and ongoing design, development, and implementation of a patient registry [27,28,29]. Multi-stakeholder priority-setting processes help to ensure that the study reflects the needs, preferences, and priorities of the patients, incorporates the perspectives of industry to help minimize the development of proprietary solutions, and accommodates advancements in disease understanding over time [18, 24, 30]. Early-stage planning sessions can establish clear objectives for a proposed study, ensuring that the aims are well-defined prior to launching the patient registry [7, 8].
Multi-stakeholder engagement, such as through research consortia, allows for different models of data governance and management and supports the inclusion of diverse perspectives in the overall study design, definition of purpose, incorporation of common and unique data elements, collection of standard measurement items and core outcome sets, refinement of inclusion and exclusion criteria for study participants, ethical requirements, analytical approaches, and dissemination plans [31]. There are a number of successful patient registry models and approaches that have linked both patient-reported outcomes and consolidated clinical health records to accelerate advancements for rare diseases [10]. While planning in advance of launching a registry is best practice, it is important to note that as registries mature, the objectives may evolve over time, particularly for registries designed for longevity, without a specified study end date [7]. Patient registries must be built with a modular framework, with the agility and ability to expand as studies progress, incorporating new data elements and measurement tools as scientific discovery continues to evolve and in-depth disease-specific knowledge emerges [7].

2.2 Identifying Modifiable Targets and Improving Efficiency and Quality of Clinical Trial Design

The development of rare disease clinical trials challenges traditional methodologies and study framework designs [32, 33]. It is well established that for rare diseases, inadequate information about disease progression and heterogeneity contributes to diagnostic and scientific delays; small, dispersed patient populations challenge recruitment practices and add considerations for study site location, sustainability, and travel for participants; and traditional clinical trial designs may not allow for adjustments in sample size and control groups that better suit many rare populations [33]. Despite the current rate of scientific advancement and product development, it has been estimated that it will take approximately 2000 years before every rare disease has an approved treatment, and that in order to expedite progress, the field of rare disease research must transform from individualized disease-specific efforts to cross-disease, multi-condition approaches [34, 35]. Comprehensive disease knowledge can help sponsors conduct well-controlled clinical trials incorporating designs that are of adequate and reasonable duration and that include appropriately powered samples that maximize treatment and minimize controls, reducing the need for large sample targets that may be unreasonable for small patient populations and lead to delays in trial initiation, while still allowing for the ability to demonstrate clinically meaningful change, safety, and efficacy [16, 32]. Patient registries are increasingly utilized to inform the design of clinical trials within rare disease research, serving as a tool to capture information, directly from the population living with the rare condition, on targeted needs and unique requirements that can and should inform the design of clinical trials for the community [36].

The data collected and analyzed have the potential to advance understanding of the natural history of a disease, illustrate important safety considerations, and define shortcomings in study designs related to demographically representative samples, variations in standards of care and treatment protocols, and heterogeneity in disease expression in terms of rates of progression and disease presentation. These insights can inform nuanced trial designs and approvals, as well as additional actionable study design components [13, 33, 37]. Contribution of data to patient registries provides the opportunity to construct clinical trials in a manner that enables rare disease research to become more patient-centered, reshaping traditional methods of establishing protocols for rare disease clinical trial designs through, for example, identifying patient preferences, understanding meaningful patient endpoints, and informing study designs that reduce the burden of participation on patients and families. By creating disease-specific patient registries, individuals can participate regardless of their geographic location in a centralized data resource that can inform the design of trials and contribute to improved prognosis for the rare disease community.

2.3 Benefits and Challenges of Long-Term Monitoring and Follow-Up

Patient registries have the ability to enhance existing knowledge and disease characterization of specific rare diseases. Additionally, once a drug or device has been approved, patient registries provide an important source of safety data for post-market surveillance. As Ieva et al. stated, “When accurate measurement tools rigorously capture data that demonstrates disease progression over time, researchers can more reliably detect small effects and nuanced heterogenic characteristics in the population to inform research studies, trials, and therapeutic product development” [26]. Despite the convenience of patient registries, there are also challenges associated with long-term monitoring within observational and longitudinal studies. For instance, early recruitment may be difficult as education, messaging, and marketing are rolled out and patients are learning about the opportunity while assessing their level of interest, comfort, and trust in the study and team. Once participants have chosen to enroll in a registry study, there may be retention and engagement barriers that affect consistent participation throughout the duration of the study. Low and inconsistent participation can result in incomplete or missing values within a dataset, thus limiting the value and strength of the study [9]. Building relationships with and engaging and supporting community patient organizations is a powerful approach to enhancing partnership and understanding barriers and facilitators to participant engagement, and when done authentically, reinforces the overarching study objectives [9, 33].
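Because low or inconsistent participation shows up as incomplete or missing values, one simple way to monitor it is to track the proportion of records in which each expected field is filled in. The sketch below assumes records are plain dictionaries and treats `None` and empty strings as missing; the field names and the records themselves are hypothetical.

```python
from typing import Any, Dict, Iterable, List


def field_completeness(records: List[Dict[str, Any]],
                       required_fields: Iterable[str]) -> Dict[str, float]:
    """Return, for each required field, the fraction of records with a value.

    A value counts as missing when the key is absent, or when it is None
    or an empty string (an assumption of this sketch).
    """
    fields = list(required_fields)
    total = len(records)
    if total == 0:
        return {name: 0.0 for name in fields}
    completeness = {}
    for name in fields:
        present = sum(1 for record in records
                      if record.get(name) not in (None, ""))
        completeness[name] = present / total
    return completeness


if __name__ == "__main__":
    # Hypothetical survey responses from four participants.
    responses = [
        {"medication": "A", "dose_mg": 10},
        {"medication": "", "dose_mg": 5},
        {"medication": "B"},
        {"medication": "C", "dose_mg": None},
    ]
    for name, share in field_completeness(responses, ["medication", "dose_mg"]).items():
        print(f"{name}: {share:.0%} complete")
```

Tracking such proportions over time, per field and per survey wave, is one way a registry team might notice where engagement is dropping before the gaps limit later analyses.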

As with all study designs, the risk of bias can be a challenge within patient-focused registries [38]. Strategies to mitigate response bias and recall bias should be considered when developing the study protocol and designing the registry. Ensuring that the questions asked in a registry are truly needed for the study purpose, notifying participants before sensitive information is asked, and allowing participants to opt out of responding to personal questions may help to reduce response bias for self- or proxy-reported information [38]. Expanding the data sources in a patient registry, either through data integration processes or allowing participants to upload supporting information, may help participants as they navigate through the registry, reducing recall bias for detailed questions such as medication name and dosage, while also providing a secondary source to validate patient-entered information. Alternatively, a streamlined approach to adding real-time, ad hoc updates to the data record may reduce the time between reporting, thus reducing additional challenges related to recall [38]. Long-term monitoring requires the sustainability of a patient registry to be assessed prior to study implementation and frequently thereafter [7]. Factors such as funding, ownership, partnerships, data analysis, quality controls, security audits, and regular and responsible communication with the community all contribute to overall reliability and sustainability [7].

2.4 Governance and Data Stewardship

Proper handling of ethical, legal, social, and privacy issues must be a foundational component of the design, implementation, and long-term sustainability of a patient registry [7]. A research study that involves the collection of identifiable information from human subjects requires formal review and approval by an Institutional Review Board (IRB), an independent ethics committee that reviews research studies and ensures that the study protocol, governance, protections, and methods are ethical and appropriate [7]. Participation in research is always voluntary, and participants are allowed to withdraw at any time. Before any personal data are shared, a participant or their legally authorized representative must provide informed consent for the collection, storage, and use of their personal health data.
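The consent and withdrawal requirements described above are often enforced in software as a gate that is checked before any record is used. The following is a minimal sketch under stated assumptions: the `Participant` structure and the `may_use_data` function are hypothetical, and the rule that data are not used from the withdrawal date onward is an assumption of this example. How previously collected data are handled after withdrawal depends on the approved protocol and applicable regulations.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Participant:
    """Hypothetical consent record for one registry participant."""
    participant_id: str
    consent_date: Optional[date] = None      # None: consent never given
    withdrawal_date: Optional[date] = None   # None: has not withdrawn


def may_use_data(participant: Participant, on: date) -> bool:
    """Return True only if consent is in effect on the given date."""
    if participant.consent_date is None or participant.consent_date > on:
        return False  # no consent on file yet
    if participant.withdrawal_date is not None and participant.withdrawal_date <= on:
        return False  # assumption: use stops from the withdrawal date onward
    return True


if __name__ == "__main__":
    today = date(2024, 6, 1)
    cohort = [
        Participant("p1", consent_date=date(2023, 1, 10)),
        Participant("p2"),
        Participant("p3", consent_date=date(2023, 2, 1),
                    withdrawal_date=date(2024, 3, 15)),
    ]
    usable = [p.participant_id for p in cohort if may_use_data(p, today)]
    print(usable)
```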

It is the obligation of researchers to be data stewards and protect individual patient data within the registry study. Sound data governance protocols include a well-defined set of procedures to ensure that protections for the participants are met and that the overall management of data security, integrity, and availability is monitored and regulated [7]. Patient registries are required to comply with the data collection and sharing regulations of their region [7]. In the United States, the collection, storage, and usage of medical information are governed by the Health Insurance Portability and Accountability Act (HIPAA), which was enacted in 1996 [7]. More recently, in May 2018, the European Union implemented the General Data Protection Regulation (GDPR) [7]. The regulation protects individuals within Europe with respect to the sharing and use of their personal data [7]. These are examples for consideration and are not intended to be an exhaustive list of security and compliance regulations. Each registry owner must ascertain that their study meets the necessary requirements for compliance.

3 Conclusion

Patient-centered registries collect cohort data to inform researchers of the natural progression of a disease; assist in the recruitment of participants for clinical trials; enable the monitoring of clinical treatments and outcomes in patients; and provide support for the establishment of disease-specific standards and care [7, 26]. The wide use of these registries increases research accessibility for individuals affected by rare diseases and provides researchers efficient access to valuable patient data, a cornerstone of improving disease-specific knowledge, management, and treatments. Throughout the development of a patient registry, it is essential for all stakeholders to clearly define the study objectives and ensure that the registry is designed for sustainability and ethically governed, that the data purpose and analysis plans are well established, and that sustainability and transition protections are set forth [7]. Although patient registries provide substantial value to specific rare disease communities, it is important to note that they may have greater impact when combined with multiple data sources [9].

The current global movement towards innovative and patient-centered healthcare is enabling patient registries to increasingly emerge as valuable tools within the rare disease research field [7, 9]. The United Nations, on September 23, 2019, adopted a declaration on Universal Health Coverage, as part of the 2030 Sustainable Development Agenda, that included, for the first time, recognition of rare diseases, marking a major milestone and priority indication for the population [39]. The declaration provides leverage for policy makers and practitioners to advocate for national action toward providing health services for all people affected by rare diseases, ensuring that no population is left behind. This declaration represents a critical shift in the dynamics of rare disease policy and research, with the potential to transform how national and international goals are set, prioritized, and pursued in order to best address the needs of and accelerate progress for the rare disease community. In the United States, the FDA recently announced a landmark initiative, the Rare Disease Cures Accelerator–Data and Analytics Platform (RDCA–DAP), a centralized, standardized infrastructure platform to support and accelerate rare disease characterization, scientific discovery, and drug development [40]. The program establishes processes for the characterization of rare diseases and addresses some of the most complex and persistent research challenges to support innovation and expedite clinical trial design and regulatory review. The investments and advancements that are made today will be felt for generations to come. The power of patients is undeniable, and for rare diseases, it has never been more clear: the time to act is now.