Evaluating existing security and privacy requirements for legal compliance
- Cite this article as:
- Massey, A.K., Otto, P.N., Hayward, L.J. et al. Requirements Eng (2010) 15: 119. doi:10.1007/s00766-009-0089-5
Governments enact laws and regulations to safeguard the security and privacy of their citizens. In response, requirements engineers must specify compliant system requirements to satisfy applicable legal security and privacy obligations. Specifying legally compliant requirements is challenging because legal texts are complex and ambiguous by nature. In this paper, we discuss our evaluation of the requirements for iTrust, an open-source Electronic Health Records system, for compliance with legal requirements governing security and privacy in the healthcare domain. We begin with an overview of the method we developed, using existing requirements engineering techniques, and then summarize our experiences in applying our method to the iTrust system. We illustrate some of the challenges that practitioners face when specifying requirements for a system that must comply with law and close with a discussion of needed future research focusing on security and privacy requirements.
Keywords: Security requirements, Privacy requirements, Legal compliance, Refactoring requirements
All that may come to my knowledge in the exercise of my profession or in daily commerce with men, which ought not to be spread abroad, I will keep secret and will never reveal.
The Hippocratic Oath
Although centuries old, the well-known Hippocratic Oath still influences our cultural understanding of ethics in healthcare. The Hippocratic Oath may be known best for its “do no harm” clause, but the privacy promise quoted above is extremely motivating to information security and privacy professionals, particularly in the field of medicine. Perhaps partly to fulfill the ancient privacy promise in the modern age, the United States passed the Health Insurance Portability and Accountability Act of 1996 (HIPAA). The resulting HIPAA regulations govern patient health information usage by providing detailed security and privacy procedures to support the practical needs of insurance companies, healthcare providers, law enforcement and other organizations that have a bona fide need for access to patient health information.
Although HIPAA’s focus is broader than computer-based systems, it was intentionally constructed to cover them. Healthcare facilities have been using computers to manage basic health information, such as patient admission and billing, for decades. However, most of these computer-based systems currently do not use software designed and built for this purpose, called an Electronic Health Records (EHR) system. EHR systems store, update and transmit a patient’s health records digitally. They are designed to reduce error rates in the administration and billing of medical services by storing and transmitting continuously updated medical records. EHR adoption has progressed slowly: a recent survey published in the New England Journal of Medicine found that only 4% of respondents had “an extensive, fully functional EHR system,” while 13% had a more basic EHR system.
This paper describes shortcomings and improvements with respect to HIPAA compliance for the security and privacy software requirements in the iTrust medical records system. iTrust is an open-source EHR system developed as a project in a graduate-level software testing and reliability course at North Carolina State University. The initial requirements for iTrust were expressed as Unified Modeling Language (UML) use cases and developed in consultation with both a practicing physician and a professional from the North Carolina Healthcare Information and Communications Alliance (NCHICA). Notably, the initial iTrust requirements called for HIPAA compliance.
Although iTrust is being designed for future use in medical settings, to date it has been used exclusively in a classroom setting where it has evolved over the course of eight semester-long courses. Each semester, students are responsible for adding and testing both new and existing functionality. From an educational standpoint, the project exposes students to realistic challenges in software engineering: working within non-trivial code bases, managing information security and privacy issues, and using modern integrated development and testing environments.
The iTrust system shares many characteristics with other existing software systems that must comply with laws and regulations. Financial services, industrial management, education, and many engineering industries also use computer systems extensively and are heavily regulated. Such systems must be evaluated, updated, and improved to achieve compliance with each enactment or amendment of relevant laws and regulations.
Requirements engineers evaluating systems like iTrust for legal compliance must be able to establish and maintain traceability from software artifacts to the relevant governing legal texts to demonstrate due diligence in satisfying security and privacy requirements. Herein, due diligence refers to careful efforts taken to satisfy a legal requirement or to fulfill an obligation. This paper demonstrates how requirements engineers can evaluate security and privacy software requirements for compliance with relevant law in the form of a specific legal text. Specifically, we discuss our study of the iTrust security and privacy requirements with respect to the HIPAA regulation. Our study entailed three primary activities: (1) mapping iTrust system terminology to the corresponding HIPAA terminology; (2) clarifying software security and privacy requirements with respect to legal requirements; and (3) improving traceability between the HIPAA regulations and the software requirements. To comply with legal texts governing security and privacy, we analyzed the entire iTrust requirements documentation to identify and improve the security and privacy requirements.
The remainder of this paper is structured as follows. Section 2 describes the related work and background material that forms the basis for our methodology. Section 3 describes our methodology in detail. Section 4 relates our experiences applying the methodology to iTrust. Section 5 discusses lessons we learned from our study. Section 6 offers several directions for future work in this area. The Appendix provides three sample legal requirements analyzed during our study.
2 Related work
Researchers advocate aligning software requirements and organizational security and privacy policies. Others have used regulations as a starting point for requirements identification. In contrast, the methodology presented herein entails evaluating and improving a set of existing requirements that must comply with relevant privacy and security law. This section discusses relevant work in legal compliance relating to software requirements as well as requirements engineering techniques employed in our methodology.
2.1 Legal compliance and security/privacy requirements
Legal compliance is a challenging software engineering issue. Monitoring systems for compliance has been recognized as a significant problem. Legal texts, however, present special problems for managing compliance: ambiguities, cross-references, domain-specific definitions, acronyms, and the evolution in law have all been identified as confounding factors facing requirements engineers working with such texts.
The HIPAA regulations, in particular, cover a diverse range of healthcare organizations. The regulations’ broad and, at times, ambiguous nature complicates the legal compliance landscape for requirements engineers attempting to implement healthcare software systems. Whereas many healthcare organizations and software companies are building EHR systems as a viable mechanism for improving HIPAA compliance, smaller organizations may experience difficulties because they lack resources, tools, and training. This is not surprising given that size has been shown to be a limiting factor in adoption of other software engineering techniques (such as requirements management, project planning, and quality assurance) by smaller organizations.
In the United States, penalties for HIPAA non-compliance can be severe. The regulations (specifically, 45 C.F.R. §160.402) provide for civil penalties of up to $25,000 per individual per violation and criminal penalties of up to $250,000 and 10 years in prison. Furthermore, criminal violations of HIPAA can also involve violations of other laws. For example, in United States v. Gibson, Gibson pled guilty not only to illegally obtaining Protected Health Information (PHI), but also to committing mail and wire fraud with that PHI.
Attempts to achieve legal compliance in EHR systems have been based on summarizing and simplifying regulations. Some of these attempts depend on checklists with guidance for auditing current practices. For example, Beaver and Herold’s approach to HIPAA compliance contains 22 practical checklists to improve HIPAA compliance. Other efforts utilize HIPAA Privacy and Security Rule summaries provided by the U.S. Department of Health and Human Services (HHS), rather than the actual regulatory text. For example, iTrust, the system to which we applied our methodology, references HIPAA through the HHS website. These efforts distance engineers from the actual legal text, creating numerous obstacles and vulnerabilities for legal compliance.
We employ the term “due diligence” to refer to the legal concept of due diligence, which means “the diligence reasonably expected from, and ordinarily exercised by, a person who seeks to satisfy a legal requirement or to discharge an obligation”. In addition, “legal compliance” refers to having “a defensible position in a court of law”, which requires that a system has both requirements that accurately reflect the law as well as implementation and test procedures that demonstrate legal due diligence. We focus on evaluating and improving security and privacy requirements to achieve legal compliance and assume the existence of implementation and test procedures. Because our focus is on security and privacy requirements, our legal compliance efforts are focused on Part 164 of the HIPAA regulations (titled “Security and Privacy”) and §160.103, which contains definitions that are a part of the “General Provisions” governing all HIPAA regulations.
Researchers are developing techniques to ensure that software requirements comply with policies and governing legal texts [4, 5]. Recent efforts begin with deriving requirements directly from legal texts and stress maintaining traceability between requirements and regulations (e.g., [6, 12]). Other efforts focus on measuring compliance (e.g., [13–15]). Each of these approaches has merits that have influenced our methodology, as we now discuss.
Barth et al. developed a first-order temporal logic framework to model privacy-related information found in legal texts. This framework can be used to measure compliance but does not maintain the traceability necessary to manage change in either legal texts or source code over time, making it difficult to map terminology between the two domains. Furthermore, the framework’s formalization does not capture certain types of legal obligations, such as modeling group attributes.
Breaux et al. developed a method to extract rights and obligations from legal texts, called the Frame-Based Requirements Analysis Method (FBRAM). This methodology enables requirements engineers to identify ambiguities and implied rights and obligations. The rights and obligations can be used to build prioritizations and exceptions for software systems while maintaining traceability to the original legal text. This methodology has been used to examine the HIPAA Privacy Rule but has only recently begun to be applied more broadly to other regulations. In addition, it does not obviously extend to evaluating or improving existing software requirements.
May et al. took an access-control approach to legal compliance. The construction of mandatory access-control rules from legal texts provides an API (Application Programmer Interface) that can be used to perform simple compliance verification. May et al. assumed ambiguities within a legal text have a satisfiable condition; however, some ambiguities do not have a logically satisfiable condition because they are specified with the intention of being disambiguated and interpreted by courts or regulatory agencies. Furthermore, May et al. assume that external references do not conflict or incur additional ambiguity for a software system. These external references, however, leave an organization vulnerable to non-compliance and data breaches. As with the FBRAM, this approach starts with legal texts but does not obviously extend to evaluating or improving existing software requirements.
Massacci et al. adopt a goal-modeling approach to requirements compliance. This framework, called SecureTropos, has been used to evaluate compliance with the Italian Data Protection Act. The approach requires manual extraction of goals, soft goals, resources, tasks and relationships, but it does not capture traceability information. The primary focus of this approach is on security requirements; it is unknown whether this approach would cover non-security aspects of legal texts.
2.2 Requirements engineering techniques
The original iTrust requirements were expressed as Unified Modeling Language (UML) use cases. Use cases alone do not satisfy the needs of requirements engineers because they lack contextual information. Within the context of a medical care system, Glinz details nine specific problems resulting from employing use cases to specify the requirements. These nine problems include four issues Glinz labels as essential: the inability to model structure between use cases, challenges modeling use case interaction, the problem of expressing state-based systems, and the difficulty of modeling information flow for systems composed of subsystems. We encountered each of these four essential issues in the iTrust requirements specification. Glinz shows that no workarounds are possible for these essential problems. Therefore, we chose to pursue a natural language requirements approach.
Scenarios and use cases are important sources of requirements [18–23]. With this in mind, we applied a full iteration of the Inquiry Cycle model to the iTrust requirements document as a part of our requirements analysis efforts. An inquiry-driven approach entails asking and answering a series of questions to, for example, detail when and where information flows through a proposed system. Potts et al. identify several question types that are useful for requirements elicitation and analysis including what-is, when, what-if, and who. These types of questions directly relate to our three focus areas: What-is questions surface definitional issues, which we use to improve the terminology mapping and traceability of the requirements. When and What-if questions highlight timing considerations and exceptions respectively, which we use to improve prioritization. Who questions identify actor roles, which we use to improve the terminology mapping and traceability of the requirements.
Antón et al. present techniques for removing ambiguity and inconsistency in requirements to avoid and resolve conflicts between policy documents and the systems they govern. They describe four relationship types (constrains, depends, supports, and operationalizes) that align and classify requirements and policy documents, and allow engineers to identify potential areas of incongruous behavior. We adopt the definitions for each of these relationship types in our methodology and identify them in our requirements to improve prioritization and traceability. Relationship questions have also been identified as useful when used as a part of the Inquiry Cycle model.
Antón et al. also classify ambiguity into two groups: ambiguity and incomplete ambiguity. They define ambiguity as “differences between terms used within documentation in which there is a need to qualify or further refine some term” and incomplete ambiguity as a “specialized form of ambiguity that results from terms being left out of the documentation”. Unintended ambiguities are ambiguous statements resulting from both natural language syntax and semantics. Because unintended ambiguities need to be addressed, especially in the context of legal compliance, we documented any ambiguities in the original definitions or use cases. As shown in Sect. 3, we adopt both definitions for ambiguity and incomplete ambiguity provided by Antón et al.
3 Methodology to evaluate legal compliance of security and privacy requirements
Terminology mapping takes as inputs the set of terms used in the software requirements and the set of terms described in the legal text with which those requirements must comply and provides as an output a mapping between those two sets. Relevant terminology broadly falls into three possible categories: Actors, Data Objects, and Actions.
Requirements identification and disambiguation takes as an input the set of requirements defined for the software system and the terminology mapping output. This activity produces as output a different set of requirements that cover the original software system and requirements implied by the legal text with which the system must comply.
Requirements elaboration takes as an input the set of requirements produced by the requirements identification and disambiguation process. The output is a set of disambiguated requirements that have been elaborated so that software engineers can understand them without specialized legal domain knowledge.
Tracing requirements to legal texts establishes traceability links for each requirement from the set of requirements produced by the requirements disambiguation and elaboration activity to the relevant statements in the legal text with which each requirement must comply.
3.1 Terminology mapping
In the first methodology activity, analysts create a terminology mapping. During terminology mapping, analysts identify terms in both software documentation and legal texts. The associations between these software terms and legal terms are then recorded. This terminology mapping provides an essential minimal level of traceability for establishing due diligence in security- and privacy-sensitive systems.
Actors/Stakeholders are individuals or organizations explicitly mentioned in either the software documentation under analysis or the legal text with which compliance must be achieved. This definition does not include individuals broadly considered to be stakeholders (e.g., customers, developers) unless they are explicitly mentioned in the law or in the software documentation.
Data objects are those information elements explicitly mentioned in either the software documentation under analysis or the legal text with which compliance must be achieved. For example, because HIPAA explicitly defines Protected Health Information (PHI), PHI is considered a data object.
Actions are tasks performed by actors, which may or may not be legally compliant. Actions are explicitly mentioned in either the software documentation under analysis or the legal text with which compliance must be achieved.
Legal domain experts and requirements engineers must first determine the focus of the legal text under analysis. Some legal texts, such as HIPAA, focus primarily on actor roles, whereas others, such as the Gramm-Leach-Bliley Act, focus on data objects. Terminology mapping must be done systematically based on the focus of the legal text to reduce the effort required in the mapping process. Once determined, the primary focus area should be the first terminology type mapped. Actions should always be mapped last because they are dependent on actors and data objects. For example, if the legal text being studied is actor-focused, then the mapping order should be actors, followed by data objects, and then actions. If the legal text is data object-focused, then the mapping order should start with data objects, followed by actors, and then actions. Herein, we examine legal compliance with HIPAA, which is actor-focused. Thus, the terminology mapping is described in the following order: (1) Actors, (2) Data Objects, and (3) Actions. We now discuss the mapping of actors, data objects, and actions in turn.
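The ordering rule described above can be sketched in a few lines of Python. This is only an illustration: the names `TermType` and `mapping_order` are hypothetical, and rejecting an action-focused text is an assumption made here because the discussion covers only actor-focused and data object-focused texts.

```python
from enum import Enum


class TermType(Enum):
    ACTOR = "actor"
    DATA_OBJECT = "data object"
    ACTION = "action"


def mapping_order(focus: TermType) -> list[TermType]:
    """Return the order in which terminology types are mapped.

    The focus of the legal text is mapped first; actions are always last
    because they depend on actors and data objects.
    """
    if focus is TermType.ACTION:
        # Assumption: only actor- and data-object-focused texts are handled.
        raise ValueError("focus must be ACTOR or DATA_OBJECT")
    other = TermType.DATA_OBJECT if focus is TermType.ACTOR else TermType.ACTOR
    return [focus, other, TermType.ACTION]


# HIPAA is treated as actor-focused in this study.
hipaa_order = mapping_order(TermType.ACTOR)
```

For a data object-focused text, `mapping_order(TermType.DATA_OBJECT)` would place data objects first and still end with actions.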
3.1.1 Mapping actors
Identifying actors in a system and mapping them to the corresponding actors specified in the relevant legal text is an important initial step for establishing compliance with laws and regulations governing security and privacy. Mapping actors is complex because a single actor in a requirements document may fulfill zero, one, or many roles in a legal text. In addition, multiple actors in a requirements document may map to the same actor in a legal text.
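One way to picture this many-to-many relationship is a dictionary from requirements-document actors to sets of legal roles. The sketch below is illustrative only: the actor and role names are hypothetical placeholders and are not the mapping produced in our study.

```python
from collections import defaultdict

# Hypothetical requirements-document actors mapped to hypothetical legal roles.
# An empty set models an actor that fulfills no role in the legal text.
actor_map: dict[str, set[str]] = {
    "Patient": {"Individual"},
    "Surgeon": {"Healthcare Provider", "Workforce Member"},
    "Nurse": {"Workforce Member"},
    "Software Tester": set(),
}


def unmapped_actors(mapping: dict[str, set[str]]) -> list[str]:
    """Actors in the requirements that map to zero legal roles."""
    return sorted(actor for actor, roles in mapping.items() if not roles)


def invert(mapping: dict[str, set[str]]) -> dict[str, set[str]]:
    """Legal role -> requirements actors, exposing many-to-one mappings."""
    inverse: dict[str, set[str]] = defaultdict(set)
    for actor, roles in mapping.items():
        for role in roles:
            inverse[role].add(actor)
    return dict(inverse)


shared_roles = {role: actors for role, actors in invert(actor_map).items() if len(actors) > 1}
```

Entries in `shared_roles` and in `unmapped_actors(actor_map)` are the cases an analyst would likely want to review with a legal expert.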
Prior research has shown that it is helpful to create a stakeholder role hierarchy to clearly delineate roles. Creating this hierarchy helps analysts identify all actors both explicitly and implicitly referenced within the source materials. Hierarchies that appropriately express the legal actor roles as they relate to one another aid analysts in mapping from legal roles to those defined in software. Creating a stakeholder hierarchy from a legal text may further require a legal expert to properly disambiguate relationships among actors.
Mapping actors provides a basis for checking that the specified actions for the system actors are consistent with the corresponding legal actions for the actors specified in relevant law. It may also uncover important security, privacy, and legal conflicts, particularly when multiple mappings occur. In the example mapping above, surgeons may inherit legal obligations through their general classification as medical professionals, such as the requirement to report evidence of potential child abuse incumbent upon all medical professionals. If this obligation is not recognized by the software system, then surgical residents may not be able to report suspected child abuse properly. Conflicts must be resolved by adapting actors in the software requirements to those of the legal text for the resulting system to be able to demonstrate legal compliance. Software requirements actors must be adapted because the legal specification is presumed to be unchangeable from the requirements engineer’s perspective.
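The inheritance of obligations through a role hierarchy can be sketched as a walk from a role up to the root. The hierarchy and obligation text below are hypothetical and merely echo the surgeon example; they are not taken from any legal text.

```python
from typing import Optional

# Hypothetical role hierarchy: each role points to its more general parent role.
PARENT: dict[str, Optional[str]] = {
    "Surgeon": "Medical Professional",
    "Medical Professional": None,
}

# Hypothetical obligations attached directly to each role.
OBLIGATIONS: dict[str, set[str]] = {
    "Medical Professional": {"report suspected child abuse"},
    "Surgeon": set(),
}


def inherited_obligations(role: str) -> set[str]:
    """Collect obligations attached to a role and to all of its ancestors."""
    collected: set[str] = set()
    current: Optional[str] = role
    while current is not None:
        collected |= OBLIGATIONS.get(current, set())
        current = PARENT.get(current)
    return collected
```

A software actor mapped to "Surgeon" would then need system support for every obligation returned by `inherited_obligations("Surgeon")`, including those stated only for the more general role.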
3.1.2 Data object mapping
The next activity entails identifying and mapping the data objects. Even if the legal text with which the software must comply is actor-focused, it still may be necessary to create a data object hierarchy for the legal text and map data objects from the requirements document to the legal text. This decision should be made in consultation with a legal expert. In HIPAA, identifying and mapping data objects is more straightforward than identifying and mapping actors because HIPAA is primarily actor-focused. Given our prior experience with HIPAA [6, 12, 26], we determined that each iTrust data object could be mapped to a single data object in HIPAA for the purposes of our high level requirements analysis without the need for data object hierarchies.
Data object hierarchies may still be used, even with HIPAA, if requirements engineers and legal experts believe the complexity of either the requirements document or the legal text calls for such a mapping. For example, consider the term “Approved Diagnostic Information” to be a requirements document definition meaning “the set of diagnostic information a patient allows any licensed health care professional to view.” In addition, consider Protected Health Information (PHI) to be a requirements document definition meaning “any spoken, written, or recorded information relating to the past, present, or future health of an individual.” This definition for PHI could be textually substituted for that found in HIPAA §160.103, with key differences being exceptions listed in HIPAA and defined in other regulations.
These definitions of approved diagnostic information and PHI are not perfectly compatible since the sets of information are not equivalent. For example, a conversation with a physician would be considered PHI but may not be considered as approved diagnostic information. A data object mapping may clarify the legal implications of this example situation. Storing a recording of every conversation between patient and doctor is outside the scope of an EHR system. A summary of procedures performed, however, clearly falls within the scope of an EHR system. In the iTrust system, a summary of procedures performed could be considered approved diagnostic information. Determinations of this nature must be made through the data object mapping process.
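The mismatch between the two definitions can be illustrated with plain sets. The members listed here are hypothetical examples drawn from the discussion above, and the helper name `mapping_gap` is an assumption for illustration.

```python
# Hypothetical example members of each data object.
phi = frozenset({
    "conversation with physician",
    "summary of procedures performed",
})
approved_diagnostic_information = frozenset({
    "summary of procedures performed",
})


def mapping_gap(legal_items: frozenset, requirement_items: frozenset) -> frozenset:
    """Items covered by the legal term but not by the requirements term."""
    return legal_items - requirement_items


gap = mapping_gap(phi, approved_diagnostic_information)
```

A non-empty `gap` signals that the two terms cannot simply be treated as equivalent and that the data object mapping needs an explicit decision.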
3.1.3 Action mapping
After both actors and data objects have been identified and mapped, the actions themselves must be identified and mapped. Actions are tuples that consist of an actor, an operation, and a target, where the target can be an actor, a data object, or null. Actions are defined more explicitly in some legal texts than others and may or may not form a hierarchical structure. Requirements engineers must identify all actions described by the software requirements and map them to those described in the relevant legal texts.
Security, privacy, and legal compliance issues detected while performing action mapping usually result in actors or data objects being refined. Consider the action that takes place when a patient views his or her medical records using an EHR system. The action tuple consists of the patient as the actor, ‘viewing’ as the operation, and the medical record as the target (e.g., <patient, viewing, medical record>). In HIPAA, patients generally have the right to view their medical records, but there are some notable security exceptions. For example, patients are not allowed to review psychotherapy notes regarding their own mental health. In this example, requirements engineers must refine the definition of the medical record data object because the definition makes no distinction between psychotherapy notes and other diagnostic information. This refinement must target the data objects because of the distinction discovered between psychotherapy notes and other medical record information. Refining or removing the “viewing” operation would result in legal non-compliance because patients do have the legal right to view some parts of their medical records.
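The refinement described above can be sketched with a small action type that follows the actor, operation, target definition from this section. The refinement table and the set of non-permitted actions are hypothetical and only restate the psychotherapy notes example; they are not a model of HIPAA.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Action:
    actor: str
    operation: str
    target: Optional[str] = None  # an actor, a data object, or None


# Hypothetical refinement: split one data object into finer-grained ones.
REFINEMENT = {
    "medical record": ["psychotherapy notes", "other medical record information"],
}

# Hypothetical set of actions the legal text does not permit.
NOT_PERMITTED = {Action("patient", "view", "psychotherapy notes")}


def refine(action: Action) -> list[Action]:
    """Replace an action on a coarse data object with actions on its parts."""
    parts = REFINEMENT.get(action.target)
    if parts is None:
        return [action]
    return [Action(action.actor, action.operation, part) for part in parts]


def is_permitted(action: Action) -> bool:
    return action not in NOT_PERMITTED


coarse = Action("patient", "view", "medical record")
refined = refine(coarse)
```

Before refinement, the coarse action cannot express the exception at all. After refinement, the operation is unchanged and only the target distinguishes the permitted action from the non-permitted one.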
3.2 Requirements identification and disambiguation
Once an initial terminology mapping is completed, the natural language requirements must be extracted from all available sources. Our approach adopts three analysis techniques: the Inquiry Cycle model, ambiguity classification, and relationship identification. During the analysis activities described in this section, any newly identified requirements are added and annotated accordingly, noting the origin for that new requirement.
Q1: What does this mean? (what-is question type)
Q2: Why does it exist? (what-is question type)
Q3: How often does this happen? (when question type)
Q4: What if this does not happen or was not here? (what-if question type)
Q5: Who is involved in this description or action? (who question type)
Q6: Do we have any questions about this user role or use case? (follow-on question type)
Q7: Does this user role or use case have any open issues? (assumption documentation)
Q8: Does this user role or use case have legal relevance? (what-kinds-of question type)
Q9: What is the context of this user role or use case? (relationship question type)
Every requirement is annotated with the answers to these questions as well as any additional open issues, such as security and/or legal issues, or potential engineering challenges. The primary mechanism by which the requirements are disambiguated is by answering these questions.
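A requirement annotation of this kind could be recorded as a simple structure keyed by question. The sketch below is one possible representation; the class name `Annotation`, the identifier `REQ-1`, and the sample answer are hypothetical.

```python
from dataclasses import dataclass, field

QUESTIONS = {
    "Q1": "What does this mean?",
    "Q2": "Why does it exist?",
    "Q3": "How often does this happen?",
    "Q4": "What if this does not happen or was not here?",
    "Q5": "Who is involved in this description or action?",
    "Q6": "Do we have any questions about this user role or use case?",
    "Q7": "Does this user role or use case have any open issues?",
    "Q8": "Does this user role or use case have legal relevance?",
    "Q9": "What is the context of this user role or use case?",
}


@dataclass
class Annotation:
    requirement_id: str
    answers: dict[str, str] = field(default_factory=dict)
    open_issues: list[str] = field(default_factory=list)

    def unanswered(self) -> list[str]:
        """Question ids that still lack a non-empty answer."""
        return [q for q in QUESTIONS if not self.answers.get(q, "").strip()]


note = Annotation("REQ-1")  # hypothetical requirement id
note.answers["Q1"] = "A patient views his or her own medical records."
note.open_issues.append("Unclear which record sections a patient may view.")
```

Calling `note.unanswered()` gives the analyst a checklist of the questions that remain for that requirement.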
R1: Constrains. Item A constrains Item B.
R2: Depends. Item A depends upon Item B.
R3: Operationalizes. Item A operationalizes Item B.
R4: Supports. Item A supports Item B in some fashion.
We focused exclusively on identifying relationships from one requirement to another and did not consider relationships from requirements to legal texts in this activity because relationships from requirements to legal texts are captured in the terminology mapping activity described in Sect. 3.1. When requirements are properly extracted from legal texts, all relationships identifiable in the legal texts will be mirrored in the requirements. We discuss the traceability connections between requirements and legal texts in Sects. 3.3 and 3.4.
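The four relationship types can be recorded as typed links between requirements. The following sketch assumes hypothetical requirement identifiers and link data; only the relationship names R1 through R4 come from the list above.

```python
from enum import Enum
from typing import NamedTuple


class Relationship(Enum):
    CONSTRAINS = "R1"
    DEPENDS = "R2"
    OPERATIONALIZES = "R3"
    SUPPORTS = "R4"


class Link(NamedTuple):
    source: str  # Item A
    relationship: Relationship
    target: str  # Item B


# Hypothetical requirement-to-requirement links.
LINKS = [
    Link("REQ-2", Relationship.DEPENDS, "REQ-1"),
    Link("REQ-3", Relationship.CONSTRAINS, "REQ-2"),
    Link("REQ-4", Relationship.SUPPORTS, "REQ-1"),
]


def targets_of(source: str, relationship: Relationship, links=LINKS) -> list[str]:
    """Requirements that `source` relates to through the given relationship."""
    return [l.target for l in links if l.source == source and l.relationship is relationship]


def sources_affecting(target: str, links=LINKS) -> list[str]:
    """Requirements holding any relationship toward `target`."""
    return sorted({l.source for l in links if l.target == target})
```

A query such as `sources_affecting("REQ-1")` shows which requirements may need review when `REQ-1` changes, which is one way such links can inform prioritization.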
3.3 Requirements elaboration
The elaboration activity requires the analyst to document each requirement’s priority and origin, add mapping of initial user roles to new user roles, and move original supplementary material to appendices of the new requirements document. Analysts should also address any unresolved security and privacy issues in the annotations created during requirements identification. Newly identified requirements should be documented and annotated with newly identified issues.
3.3.1 Document each requirement’s priority and origin
Critical: These requirements are the core elements of the software system; without these requirements, the system simply cannot function at all. This prioritization category includes security and privacy requirements that protect core functionality.
High: These requirements are not core elements in that they will not cause the system to fail if not met; these requirements may be elaborated with additional domain knowledge. This prioritization category contains all remaining security and privacy requirements.
Medium: These requirements are added by the requirements engineers, based on their own domain knowledge; in addition, these requirements elaborate the primary benefits of the software system.
Low: These requirements primarily address minor system aspects that enhance a user’s experience during use of the software system.
Requirements engineers must document any specific legal implications uncovered during this activity. Note that requirements with no identified legal issues may still have legal implications. For example, a requirement that has no identified legal issue within the HIPAA regulations may still have a legal implication with respect to other laws. The intent is to capture known legal implications to prioritize the requirements accordingly, not to ensure that all possible legal implications are documented. We further discuss potential future work regarding legal prioritizations in Sect. 6.
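The four priority categories and the rule for security and privacy requirements can be sketched as follows. The class and function names, the requirement identifiers, and the numeric ranks are assumptions for illustration; only the category names and the placement of security and privacy requirements come from the text.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Priority(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class Requirement:
    req_id: str
    priority: Priority
    legal_implications: list[str] = field(default_factory=list)


def security_priority(protects_core_functionality: bool) -> Priority:
    """Security and privacy requirements are Critical when they protect core
    functionality and High otherwise."""
    return Priority.CRITICAL if protects_core_functionality else Priority.HIGH


def by_priority(requirements: list[Requirement]) -> list[Requirement]:
    """Most important requirements first."""
    return sorted(requirements, key=lambda r: r.priority, reverse=True)


# Hypothetical requirements.
reqs = [
    Requirement("REQ-10", Priority.LOW),
    Requirement("REQ-11", security_priority(True), ["HIPAA"]),
    Requirement("REQ-12", security_priority(False)),
]
ordered = [r.req_id for r in by_priority(reqs)]
```

An empty `legal_implications` list records only that no implication is known, consistent with the caveat above that undocumented implications may still exist.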
Original (Use Case Number): Requirements extracted from the baseline iTrust requirements document. The use case from which the requirement was extracted is denoted in parentheses.
Legal (HIPAA Regulatory Section Number): Requirements with a known legal issue. The regulatory section from which the issue originated is denoted in parentheses.
Common Domain Knowledge (Developer Name): Requirements commonly found in EHR systems that support basic healthcare information technology.
Development Experience (Developer Name): Requirements commonly found in web applications that support basic web application maintenance and usage.
The domain knowledge and development origins are defined herein in terms of the healthcare domain and web application development, respectively, due to the nature of the iTrust system; these would be adapted for other domains and development approaches as needed.
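The origin categories lend themselves to a small tagged record. In this sketch the category labels come from the list above, while the class names, the use case number `UC-7`, and the choice of section are hypothetical placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class OriginKind(Enum):
    ORIGINAL = "Original"
    LEGAL = "Legal"
    COMMON_DOMAIN_KNOWLEDGE = "Common Domain Knowledge"
    DEVELOPMENT_EXPERIENCE = "Development Experience"


@dataclass(frozen=True)
class Origin:
    kind: OriginKind
    detail: str  # use case number, regulatory section, or developer name

    def label(self) -> str:
        """Render the origin in the 'Category (Detail)' form used above."""
        return f"{self.kind.value} ({self.detail})"


# Hypothetical origins for two requirements.
from_use_case = Origin(OriginKind.ORIGINAL, "UC-7")
from_regulation = Origin(OriginKind.LEGAL, "§160.103")
```

For another domain, the last two members of `OriginKind` would be renamed to match that domain and development approach.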
3.3.2 Record document provenance
Existing large software systems may have extensive documentation and may need to comply with several laws and regulations. In addition, legal texts, security policies, and privacy policies may evolve over time. In these cases, legal requirements analysis may need to be conducted iteratively or in stages. Requirements engineers should record the provenance of original documentation materials to support future iterations or compliance efforts. This documentation maintains traceability to the original elements of the software documentation when it is not possible to immediately update the requirements through the identification process. Furthermore, existing design documentation or source code may still refer to the original user roles; a mapping from original user roles to new user roles will prove useful when these software artifacts are updated to reflect the new requirements.
Although it is important to update supplementary material—such as original use cases or diagrams—to achieve legal compliance, it may not be practical for requirements engineers to make such changes to existing documentation while focusing on evaluating and improving the legal compliance of the requirements. To facilitate traceability to the original documentation, requirements engineers must move this supplementary material to appendices. (If the document is available in an electronic form, hypertext links provide excellent traceability support.) In addition, recording the provenance of original documentation supports security and privacy efforts by providing engineers with the ability to analyze both the new requirements and the original requirements for the same security flaws.
3.4 Tracing requirements to legal texts
In the fourth activity, requirements engineers must attempt to resolve all remaining issues in the annotations and explicitly trace each requirement to the relevant regulatory section. Ideally, requirements engineers and any consulting legal domain experts will be able to accurately identify each requirement with its legal implications along with the specific origin of that requirement. The legal origin of some requirements may have been identified in the previous activity, but requirements engineers must focus exclusively on tracing requirements to the relevant regulation as a separate activity.
Engineering Issues: Issues relating to the software engineering process itself, including requirements, design, implementation, and testing.
Security Issues: Issues relating to the security and privacy of information stored in or services provided by the software system.
Legal Issues: Issues with legal implications that cannot be traced by the requirements engineer to the relevant regulatory section of the legal text.
Consider the example action discussed in Sect. 3.1.3 in which a patient uses iTrust to view his or her medical records. If a requirements engineer has completed all the steps in Sect. 3.2 through Sect. 3.3 and still has an annotated issue regarding this requirement, then that issue must be classified and clarified for future consultation with a legal domain expert. The requirements engineer may have Engineering Issues related to implementing this feature because it may require additional password management for patients. The requirements engineer may have a Security Issue related to ensuring that patients are not able to use the feature to view medical records belonging to other patients. Although such a Security Issue may also be a Legal Issue, the requirements engineer may wish to document a specific Legal Issue regarding psychotherapy notes if they are unable to find the specific section within HIPAA regulating patient access to psychotherapy notes.
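The classification described in this example can be sketched as a small data structure. This is a minimal Python sketch under our own assumptions: the class names, the `regulatory_section` field, and the wording of the three sample issues are hypothetical paraphrases of the example above, not entries from the iTrust requirements document.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class IssueType(Enum):
    ENGINEERING = "Engineering Issue"
    SECURITY = "Security Issue"
    LEGAL = "Legal Issue"


@dataclass
class Issue:
    kind: IssueType
    text: str
    # Assumed field: set once the issue has been traced to a regulatory section.
    regulatory_section: Optional[str] = None

    def needs_legal_expert(self) -> bool:
        # A Legal Issue is one the engineer could not trace to the legal text.
        return self.kind is IssueType.LEGAL and self.regulatory_section is None


# Paraphrased from the "patient views his or her medical records" example.
issues: List[Issue] = [
    Issue(IssueType.ENGINEERING, "May require additional password management for patients."),
    Issue(IssueType.SECURITY, "Patients must not be able to view other patients' records."),
    Issue(IssueType.LEGAL, "Which HIPAA section regulates patient access to psychotherapy notes?"),
]

for_consultation = [issue for issue in issues if issue.needs_legal_expert()]
```

Keeping the untraced legal issues in a separate list reflects the intent of this activity: they are the items queued for future consultation with a legal domain expert.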
4 Evaluating software requirements for compliance with legal texts
The objective in applying our methodology was to improve the security and privacy of the iTrust software requirements and to evaluate their compliance with the HIPAA regulations. The artifacts available to us include the HIPAA regulations and version 12 of the iTrust software requirements specification. The requirements specification includes a brief introduction containing 12 glossary items, 27 use cases classified as functional requirements, and four non-functional requirements (one of which calls for HIPAA compliance). As tends to happen when a collection of use cases is employed as a substitute for an actual requirements specification, these use cases function as both requirements and design descriptions of the intended functionality within the iTrust system. In addition to the use cases, the iTrust requirements specification includes a data field formats section that describes the attributes of the data stored in the database.
The iTrust requirements we identified using our methodology, along with the original use cases, are available in the iTrust Medical Care Requirements Specification, a wiki-based document. We identified 73 total requirements, of which 63 were functional requirements and 10 were non-functional requirements. The final requirements specification contained five sections, not counting appendices, as follows: Introduction, Description of User Roles, Functional Requirements, Non-Functional Requirements, and Glossary. The final requirements specification has automatic version control and enables one-click access to an index of term mappings, software requirements, use cases, and external references. We now present our findings separately for each of the analysis activities in Fig. 1 in the following subsections.
4.1 Results from terminology mapping
The iTrust documentation discusses eight actors: Patient, Administrator, Healthcare Professional (iTrust HCP), Licensed Healthcare Professional (LHCP), Designated Licensed Healthcare Professional (DLHCP), Unlicensed Authorized Personnel (UAP), Software Tester, and Personal Representative. In contrast, there are at least 27 distinct actor roles discussed in the HIPAA regulations (specifically, the portions we described in Sect. 2.1), highlighting that the HIPAA regulations encompass a larger scope and spectrum of actors than the iTrust system. To create a terminology mapping, each actor in the iTrust system must be mapped to the appropriate and/or corresponding actor(s) in the legal texts.
Index of term mappings
Direct HIPAA actor
Indirect HIPAA actor
Health care professional
HCP, person, CE, workforce
HCP, person, CE, workforce
Licensed health care professional
HCP, person, CE, workforce
Designated licensed health care professional
HCP, person, CE, workforce
Unlicensed authorized personnel
HCP, person, CE, Workforce
Individual, other persons
Identifying the iTrust actors proved much easier than identifying the data objects or actions because the original iTrust requirements document consisted of use cases, which describe user interactions with the resulting system. The HIPAA regulations do not explicitly define data objects for all data sets on which actors perform actions. HIPAA does not clearly define the relationship between two data sets and the boundary between them. HIPAA broadly defines the following sometimes-overlapping data sets: electronic Protected Health Information, Health Information, Individually Identifiable Health Information, and Designated Record Set.
Because HIPAA’s definitions of data are not always discrete sets, the same data object easily can fall into more than one category. Data object classification depends, in part, on the action being performed and the actor performing that action. For example, a standard health record could be considered as PHI or as a part of a Designated Record Set, which is defined in §164.501 as “a group of records maintained by or for a Covered Entity (CE).” These broad data sets apply to a variety of healthcare providers because every CE does not collect the same patient health information elements. For example, psychotherapy notes are not present in all CE practices’ data records.
Aligning the different actions of iTrust and HIPAA enables the requirements engineer to specify the security and privacy requirements so that all actions are legally compliant. Identifying security and privacy actions in HIPAA is more difficult than identifying actions in iTrust, however, because regulations are intended to be general, whereas use cases are intended to describe specific usage scenarios. For example, iTrust has an action “enter or edit demographic data” that could map to several different HIPAA actions including access control, disclosure of PHI, or transactions.
In several cases, actions are explicitly defined in HIPAA. In §160.103, for example, a transaction is defined as “the transmission of information between two parties to carry out financial or administrative activities related to health care.” This section also defines eleven types of information transmissions, some of which have different conditions that must be satisfied depending on the information transferred. For example, the action “coordination of health benefits” would have all the conditions for transferring PHI (defined in §160.103) as well as all the conditions for transferring payments (defined in §164.501). These conditions must maintain the same precedence in the requirements and the software system as they have in legal texts; care also must be taken to ensure that these conditions do not conflict with one another during secure operations. In contrast, iTrust has fewer and less detailed transactions, which are logged at the completion of an iTrust action.
Actions described in the software requirements that do not map to or conflict with actions described in the legal text should be annotated as either a missing requirement or an unrelated element of the system. For example, iTrust has a Software Tester actor that performs a “determine operational profile” action, which is not explicitly defined in the HIPAA regulations. The data for this action is aggregate system usage information per user type. This action serves as part of system maintenance and testing and therefore may be considered as an unrelated element to establish legal compliance. It could be argued that including such a feature mixes software maintenance processes with software products, but operational profiles are neither prohibited nor permitted by HIPAA.
4.2 Results from requirements identification and disambiguation
In Sect. 3.2, we explain how requirements identification and disambiguation entails applying the Inquiry Cycle model, identifying ambiguities, and recording relationships between requirements. In the iTrust system, this requires us to examine the glossary terms, introductory statements and the 27 use cases found in the iTrust requirements specification. From this analysis, we identified 53 new software requirements. Because we focused on security and privacy requirements stemming from the HIPAA regulations, requirements beyond the scope of an EHR were tabled for future analysis.
Recording the rationale and context for each iTrust requirement is challenging for security, privacy, and legal compliance until one can justify each requirement by tracing it to a specific section in the law. Business rationale is important for legal compliance because business requirements may conflict with legal requirements. If such a conflict occurs, the requirements engineer should use the legal rationale to guide the resolution of the conflict. Business rationale was missing from the original iTrust requirements specification, so we did not encounter this type of conflict. However, we did develop legal rationale for the requirements identified in this activity.
User ID: A user ID is a random number that is assigned at account creation time. Note that a user ID may identify multiple roles (e.g., a physician that is also a patient at the hospital).
Password: A password is an ordered set of alphanumeric characters between 6 and 20 characters long.
Security Issue: Should we add a requirement to indicate iTrust should cryptographically hash passwords?
Security Issue: We are explicitly not changing a password when provided only a user ID to avoid allowing anyone who knows a user ID the ability to change the password associated with that user ID.
Engineering Issue: Users may also forget their user names. This is currently not handled by the system.
Security Issue: If a user has forgotten their password and has a new email address, they cannot access their account in any fashion.
Engineering Issue: How should personal representatives be able to request a password for people they represent? What are the privacy issues with this process?
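The Password definition above is precise enough to be checked mechanically. The following Python sketch covers only the format rule (6 to 20 alphanumeric characters); treating "alphanumeric" as ASCII letters and digits is our assumption, and the function name is hypothetical. It does not address the open Security Issue about cryptographic hashing.

```python
import re

# 6 to 20 alphanumeric characters, per the Password definition above.
# Restricting "alphanumeric" to ASCII letters and digits is an assumption.
PASSWORD_PATTERN = re.compile(r"[A-Za-z0-9]{6,20}")


def is_valid_password_format(candidate: str) -> bool:
    """Return True when the whole string satisfies the format rule."""
    return PASSWORD_PATTERN.fullmatch(candidate) is not None
```

Using `fullmatch` rather than `match` ensures that trailing characters outside the allowed set cause the check to fail.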
The Inquiry Cycle questions Q8 and Q9 also allowed us to establish a legal rationale supporting the development of requirements in this scenario. In answering Q8, we determined that there was legal relevance for this scenario found in HIPAA §164.308(a)(5)(ii)(D). In answering Q9, we determined that there was an important relationship between authenticating and unauthenticating.
Requirement 12 Description: iTrust shall support user authentication by allowing a user to input their user ID and password.
Requirement 13 Description: iTrust shall support user unauthentication by allowing a user to explicitly indicate they would like to unauthenticate.
Requirement 15 Description: iTrust shall allow a user, using their authorized account, to request their current password be emailed to their recorded email address and shall require the user’s user ID be provided to satisfy this request.
Requirement 16 Description: After a user is authenticated, iTrust shall automatically unauthenticate the user and redirect to the login screen for any user session in which the user has remained idle for the timeout limit.
Requirement 17 Description: iTrust shall, when installed, assign the timeout limit to be 5 min and shall, at any time after installation, allow a system administrator to configure the timeout limit.
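To make the relationship between Requirements 16 and 17 concrete, the following Python sketch models a configurable idle timeout with a 5 min default. It is one possible reading of the requirements, not the iTrust implementation: the class and method names are hypothetical, and treating a session as expired once the idle time is greater than or equal to the limit is our assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Requirement 17: the timeout limit is 5 min when the system is installed.
DEFAULT_TIMEOUT = timedelta(minutes=5)


@dataclass
class SessionPolicy:
    timeout: timedelta = DEFAULT_TIMEOUT

    def configure(self, minutes: int) -> None:
        """Requirement 17: an administrator may change the limit later."""
        if minutes <= 0:
            raise ValueError("timeout must be a positive number of minutes")
        self.timeout = timedelta(minutes=minutes)

    def is_expired(self, last_activity: datetime, now: datetime) -> bool:
        """Requirement 16: idle for the timeout limit means unauthenticate."""
        return now - last_activity >= self.timeout


policy = SessionPolicy()
start = datetime(2010, 1, 1, 12, 0, 0)
expired_at_default = policy.is_expired(start, start + timedelta(minutes=5))
```

A caller that finds `is_expired` to be true would then unauthenticate the user and redirect to the login screen, as Requirement 16 specifies.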
Properly identifying ambiguity types and relationships is an important yet challenging task, as many subtle intricacies are involved in precisely identifying and categorizing an ambiguity or a relationship. As a result, we recorded ambiguities that clearly fit within one of the defined ambiguity classifications A1, A2, or A3. In UC3, for example, the statement “UC1/UC2 has completed” was identified as an instance of ambiguity: it could refer to either UC1 or UC2 completing, or it could refer to both UC1 and UC2 completing. In addition, we recorded relationship types that clearly fit into one of the defined relationship classifications. For example, the same statement from UC3 exhibited two instances of the depends relationship, represented as R2, one for depending on UC1 and another for depending on UC2.
4.3 Results from requirements elaboration
Number of requirements
Number of requirements
Original use cases
Common domain knowledge
We also updated requirements and definitions extracted in the second activity as we analyzed our annotations from the previous activity and found statements that needed further elaboration.
4.4 Results from tracing requirements to legal texts
Answer: Unknown at this time.
We were unable to answer this question because although HIPAA §164.520 regulates “Notices of Privacy Practices,” it is unclear whether a particular user role in an EHR system should have exclusive access to create or edit these policy statements. We must consult with legal and healthcare domain experts to answer this question.
Type of issue
Number of issues
Requirement 15 Description: iTrust shall allow a user, using their authorized account, to request their current password be emailed to their recorded email address and shall require the user’s user ID be provided to satisfy this request.
Engineering Issue: Users may also forget their user names. This is currently not handled by the system.
To address this engineering issue, requirements could specify a user ID recovery mechanism. These requirements must be analyzed for security and legal compliance.
Security Issue: If a user has forgotten their password and has a new email address, they cannot access their account in any fashion.
To address this security issue, requirements could be created to allow users to notify an administrator regarding their inability to access their own accounts. The requirements for this new feature must also be analyzed for security and legal compliance.
Requirement 21 Description: iTrust shall allow a patient, using their authorized account, to read or update their demographic information, the demographic information for the patients they represent, their list of personal representatives, the list of personal representatives for the patients they represent, their list of designated physicians, and the list of designated physicians for the patients they represent.
Legal Issue: We are not sure whether a personal representative should be allowed to remove himself from a list of personal representatives.
If a personal representative were able to remove his or her name from the list of personal representatives, then patients may be incapable of making medical decisions for themselves and also no longer have a personal representative. A legal domain expert must address this legal issue and similar legal issues for which we are lacking domain knowledge; we are consulting with lawyers as we continue our work on the iTrust system.
It is important to note that each of our legal issues could also be considered security issues because we limited our focus to legal issues involving security and privacy. However, we chose to record them as legal issues because of the importance of demonstrating due diligence and avoiding legal non-compliance.
Requirement 67 Description: iTrust code shall adhere to the Java Coding Standards.
5 Lessons learned
We now highlight four specific lessons learned through the development and application of our legal compliance methodology. In particular, we discuss lessons regarding the use of actor (or stakeholder) hierarchies, complications resulting from unresolved ambiguity, requirements prioritization, and tool support.
5.1 Actor hierarchies are essential for security and legal compliance
Building actor hierarchies for both the requirements document and the legal text allows the requirements engineer to clearly and unambiguously state the legal rights and obligations for actors in the resulting software. Without explicitly constructing these hierarchies, the rights and obligations for actors that have indirect mappings to actors in the legal text are much harder to identify.
The differences between the iTrust Health Care Professional (iTrust HCP) and the HIPAA term Health Care Professional (HIPAA HCP) exemplify the usefulness of these hierarchies. For reference, Fig. 2 depicts the iTrust stakeholder hierarchy, and Fig. 3 depicts the HIPAA stakeholder hierarchy. In iTrust, Health Care Personnel (iTrust HCP) refers to any health care professional allowed to access health records; this iTrust actor maps to multiple actors in HIPAA. Specifically, the iTrust HCP could map to either the HIPAA Covered Health Care Provider (HIPAA CHCP)10 or the HIPAA Health Care Providers (HIPAA HCP), defined in §162.402 and §160.103 of the HIPAA regulations, respectively.
Both of these mappings are necessary because the HIPAA HCP and the HIPAA CHCP are each assigned different sets of rights and obligations by the law. For example, when we directly map the iTrust HCP to the HIPAA CHCPs, then the iTrust HCP also indirectly maps, as explained in Sect. 3.1.1, to HIPAA HCP, Persons, and Covered Entities (CE), each of which is defined in §160.103 and shown in Fig. 3. As a result, the iTrust HCP must satisfy the rights and obligations for HIPAA CHCPs as well as HIPAA HCPs, Persons and CEs.
Alternatively, when we map the iTrust HCP to the HIPAA HCP, the iTrust HCP directly maps to the HIPAA HCP and indirectly maps to the Person actor. In this case, the iTrust HCP does not have to satisfy the rights and obligations for HIPAA CHCPs or CEs, but it would have to satisfy the rights and obligations of the HIPAA HCP and Person.
A system designer implementing the requirements for an iTrust HCP could easily overlook some of these responsibilities if they are not explicitly mapped. For example, a HIPAA CE is responsible for obtaining authorization to disclose PHI to an employer because it is outside the normal course of treatment, payment, and operations. This is the sort of responsibility that should have a single actor with an explicitly defined role in an EHR system, which correlates to multiple HIPAA actors. By explicitly noting this relationship, we improve the likelihood that system developers will implement it properly.
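The direct and indirect mappings discussed in this section can be derived mechanically once the hierarchy is written down. The following Python sketch is illustrative only: the parent links in `HIPAA_PARENTS` are an assumption chosen to reproduce the two mappings described above and may not match every edge of Fig. 3, and the function names are hypothetical.

```python
from typing import Dict, List, Set

# Assumed parent links, chosen only so that the results agree with the text:
# CHCP inherits from HCP and CE, and HCP inherits from Person.
HIPAA_PARENTS: Dict[str, List[str]] = {
    "CHCP": ["HCP", "CE"],
    "HCP": ["Person"],
    "CE": [],
    "Person": [],
}


def indirect_actors(direct: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Collect every ancestor of the directly mapped actor."""
    found: Set[str] = set()
    stack = list(parents.get(direct, []))
    while stack:
        actor = stack.pop()
        if actor not in found:
            found.add(actor)
            stack.extend(parents.get(actor, []))
    return found


def obligation_scope(direct: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Actors whose rights and obligations the mapped iTrust actor inherits."""
    return {direct} | indirect_actors(direct, parents)


scope_via_chcp = obligation_scope("CHCP", HIPAA_PARENTS)
scope_via_hcp = obligation_scope("HCP", HIPAA_PARENTS)
```

Under these assumed links, mapping the iTrust HCP to the HIPAA CHCP yields the CHCP, HCP, Person, and CE actors, whereas mapping it to the HIPAA HCP yields only HCP and Person, which matches the two cases described above.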
5.2 Unresolved ambiguity can lead to security and privacy non-compliance
Identifying, classifying, and resolving the ambiguity found in the requirements is one of the most challenging and rewarding aspects of our methodology. To resolve ambiguity properly, requirements engineers must be able to infer the intended meaning in the statements. Evaluating each requirement to determine whether it contains ambiguity or incomplete ambiguity makes it easier for requirements engineers to identify ambiguity at all.
Upon initial inspection, this use case appears quite detailed, but several terms, including demographic information, are not clearly defined in the rest of the document. Each ill-defined term is an instance of incomplete ambiguity reflecting a need for clarification.
We also identified two clear instances of ambiguity upon further inspection. First, the entire use case details two separate actions (entering and editing demographic information) but does not clearly state if they are conducted using the same interface. Second, error condition 2, denoted as [E2], does not state clearly whether or not the second submission of demographic information is checked for errors by the iTrust system. By analyzing and classifying the ambiguities in UC4, we were able to identify the affected requirements and state them more clearly.
5.3 Prioritizing requirements is helpful for identifying critical security requirements
As part of our requirements elaboration, we prioritized requirements according to the following categories: Critical, High, Medium, and Low. For example, Requirement 14 was classified as a Critical priority requirement, which indicates that it may have serious security implications and it should be built early in the implementation phase. The description for this requirement is “iTrust shall generate a unique user ID and default password upon account creation.” If iTrust were to generate a duplicate user ID, then a patient could access another patient’s medical records, which is a clear HIPAA violation under §164.321 and a serious security issue.
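The uniqueness property that makes Requirement 14 critical can be illustrated with a short Python sketch. It is not the iTrust implementation: the function name, the in-memory set of existing IDs, and the 10-digit ID width are assumptions made for the example. The default password portion of the requirement is not covered.

```python
import secrets
from typing import Set


def generate_user_id(existing_ids: Set[int], digits: int = 10) -> int:
    """Return a random ID that is not in existing_ids, and record it.

    The ID width is an assumption; the requirement only calls for uniqueness.
    """
    upper = 10 ** digits
    if len(existing_ids) >= upper:
        raise RuntimeError("no unused user IDs remain")
    while True:
        candidate = secrets.randbelow(upper)
        # Reject duplicates so that two accounts can never share an ID.
        if candidate not in existing_ids:
            existing_ids.add(candidate)
            return candidate
```

In a deployed system, a uniqueness constraint in the data store would likely be needed as well, because an in-memory check alone does not protect against concurrent account creation.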
5.4 Requirements engineers need tool support for determining legal compliance
Automatic generation of links to glossary terms.
Automatic ordering of both requirements and glossary terms.
Support for generating documents tailored to different audiences (e.g., a document showing only the requirements that deal with a particular user role or a document showing only requirements of a particular priority).
Support for denoting issues, which could be automatically and optionally removed from the final document.
Support for foldable nodes for sections in the requirements document similar to that found in many source code editors.
Support for a mechanism that allows easy defining of glossary terms and keywords.
Support for wiki-style editing and formatting.
Support for searching particular elements of the document while ignoring others (e.g., searching just the issues and not the requirements or the glossary terms).
These features address several key elements identified by Otto and Antón as supporting legal compliance within requirements engineering: automatic ordering of both requirements and glossary terms supports prioritizing requirements. The use of wiki-style editing supports management of evolving legal texts. The use of hyperlinks supports tracing from requirements to legal texts. Including a mechanism that allows easy defining of glossary terms supports the creation and management of a data dictionary. Support for searching particular elements of the document while ignoring others enables semi-automated navigation and searching.
6 Summary and future work
Evaluating legal compliance in existing systems is a persistent problem that will not be solved on its own. Although new systems are being developed constantly, the majority of software development involves maintaining or improving existing software. Furthermore, laws rarely are created proactively; instead, many laws and regulations are reactionary and created to address known problems. The regular evolution of laws and regulations creates a pressing need for a methodology for evaluating the legal compliance of existing software. This paper describes our effort to create such a methodology.
Herein, we presented a basic approach for evaluating and improving the legal compliance of an existing software system. The sample legal requirements that appear in the Appendix should be useful to requirements engineers who are developing software systems that must comply with regulations that govern healthcare information. Several avenues of future work follow from our efforts; we now discuss a subset of these avenues.
Systems built prior to the HIPAA regulations coming into effect or without HIPAA domain knowledge face legal compliance issues similar to those found in iTrust. Many of these software systems have insufficient requirements documentation and poor traceability to both software artifacts and legal texts. Many existing systems also are similar to iTrust in that they are past the initial design phase and are being maintained with extensive test suites and the occasional new feature. Testing techniques that verify functionality alone are not enough to ensure security, privacy, or legal compliance; the software requirements and design must be validated against the relevant legal texts.
Although we attempted to evaluate and improve legal compliance as a part of our methodology, we have by no means provided a comprehensive overview of the extensive legal compliance issues faced by an EHR system [4, 6, 12, 13, 26]. However, we believe our experience applying our methodology has provided the groundwork that can enable further research in legal compliance. In particular, a full requirements document allows more detailed traceability between requirements and regulation. If a design document were created, then this traceability could be extended from the requirements phase to the design phase and eventually to the test phase of software development. A design document would also enable the refactoring of the iTrust source code.
In addition to improving traceability, our plans for future work include refining the requirements by interviewing domain experts. The regulatory landscape is complex and the functionality needed by healthcare facilities for their health records management is extensive. Interviews with healthcare professionals having expertise in both healthcare regulation and administration are needed so that we can properly represent domain knowledge from each perspective. Incorporating this domain knowledge would appreciably improve the accuracy and usefulness of all software artifacts related to iTrust.
We are exploring requirements prioritization based on the number of traceability links from a requirement to the areas in the relevant legal text where that requirement is mapped. Prioritization would be proportional to the number of links, where a higher number of links would indicate a higher priority. Although simplistic, this method would be an easier prioritization to generate and maintain while still providing a useful prioritization. We would like to compare and contrast this approach with a prioritization based on legal expertise and domain knowledge.
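The link-count prioritization described above can be sketched in a few lines of Python. This is a minimal sketch of the idea, not a tool we have built: the function name is hypothetical, and the requirement identifiers and section labels in the example are invented placeholders rather than actual traceability data.

```python
from typing import Dict, List, Tuple


def prioritize_by_links(trace_links: Dict[str, List[str]]) -> List[Tuple[str, int]]:
    """Rank requirements by their number of distinct legal traceability links."""
    ranked = [(req, len(set(sections))) for req, sections in trace_links.items()]
    # More links means higher priority; ties are broken by requirement name.
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Invented placeholder data: requirement ID -> regulatory sections it traces to.
example_links = {
    "R-A": ["section-1", "section-2", "section-2"],
    "R-B": ["section-1"],
    "R-C": [],
}
ranking = prioritize_by_links(example_links)
```

Counting distinct sections avoids inflating a requirement's priority when the same section is linked more than once; whether repeated links should count separately is a design choice this sketch leaves open.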
Future work for iTrust in scenario modeling for requirements, scenario modeling for UML, and goal-based requirements analysis and modeling appears quite promising. These traditional areas of requirements engineering have a long research history associated with disambiguating and clarifying the issues that we have listed throughout our study [16, 29–34]. Our engineering, security, and legal issues may be better modeled through scenarios or goals. Scenario and goal models could provide clarifying questions and answers that could be added to the requirements document or they may directly resolve any conflicts resulting from the original issues.
Pub. L. No. 104–191, 110 Stat. 1936.
Note that Choi et al. use the term electronic medical records (EMR) system rather than electronic health records (EHR) system [CCK06]; we treat these terms as interchangeable for the purposes of this paper.
This paper uses “legal texts” to refer to both legislation and regulations.
All subsequent references in this paper to HIPAA regulatory sections are from Title 45 of the Code of Federal Regulations (C.F.R.).
United States v. Gibson, No. CR04-0374RSM, 2004 WL 2188280 (W.D. Wash. Aug. 19, 2004).
Pub. L. 106–102, 113 Stat. 1338 (1999).
Note that our work has a more limited definition of stakeholder than that used by Breaux and Antón. Specifically, we only consider stakeholders that are explicitly mentioned in the requirements document and in the relevant legal texts, whereas Breaux and Antón use stakeholder to refer to any entity with a ‘stake’ in the outcome.
This inheritance is similar to that found in UML class diagrams. The difference is that legal rights and obligations (instead of data or methods) are inherited by the subclass.
The CHCP definition appears in a section of HIPAA beyond the scope of our analysis, but HIPAA compliance beyond Part 164 would require its contemplation by requirements engineers during the use of our methodology.
The authors would like to thank Mr. Andy Meneely for hosting our wiki site throughout our study and Dr. Williams, Dr. Xie, and Mr. Meneely for their work in building iTrust, as well as ThePrivacyPlace.Org Reading Group members. This work was supported by NSF ITR Grant #0325269, NSF Cyber Trust Grant #0430166, and NSF Science of Design Grant #0725144.