Office 365 is built on the principle that the information contained in the cloud service is owned by the business. The data your company places in Office 365 is your data. Microsoft has a strict policy not to mine or process your data for any business purpose. If you choose to leave Office 365 for some other service, the data you leave behind will be destroyed within 90–120 days of your subscription termination.
There are two parts to compliance: Microsoft’s management of the Office 365 service, and your business processes in the management of your Office 365 data. Microsoft’s management of the Office 365 service and its service standards are published on the Microsoft “trust” website (see Figure 9-1). If you need a HIPAA (Health Insurance Portability and Accountability Act of 1996) Business Associate Agreement certification or a copy of the service audit logs, you can request those directly from Microsoft. Microsoft is transparent in its process on Office 365 and built the service around the protection of your company information. This is in contrast to other cloud services that require an intellectual property rights assignment, which allows them to use your information to sell advertising, among other things.
When we refer to Office 365 compliance, we are referring to the capabilities of Office 365 data governance to preserve and manage information. Compliance and regulatory settings are the services you enable on the Office 365 site that meet your business needs or regulatory requirements. As an example, you can group information into three different categories: compliance, information review, and business data retention.
All information that you keep falls into these categories. For example, HIPAA requires you to manage certain types of data in a way that protects the information. To meet HIPAA requirements, you must protect personal information by encrypting it before it is sent outside the organization. One of the HIPAA requirements is that the service you are using provides a Business Associate Agreement (BAA) for its services.
Information review typically means that the information is subject to an audit and is immutable—meaning it cannot be changed or deleted by the users or the organization—prior to review. Any type of regulator review requires that the data is immutable. The most common is litigation. When an organization enters into litigation, all information is frozen at that period in time. We refer to that as litigation hold. Regulator reviews such as FINRA are nothing more than an extension of a litigation hold.
Business data retention is nothing more than the business processes used to maintain information, subject to the regulatory requirements. As an example, if the business policy (or user policy) deletes information subject to the retention policy, the information is deleted from the user perspective, but may be kept for a very long time subject to the compliance needs of the organization. The user may delete information, but the compliance setting keeps the information in an area where it is immutable and fully searchable and hidden from the user.
The Office 365 administrator has complete control over the configuration of the compliance and retention policies. The administrator can enable these settings, and all actions are auditable. The settings can be changed by using the Exchange Admin Center or using PowerShell commands. As Microsoft enhances the Office 365 service, these settings are simplified in an easy-to-use graphical interface.
The rest of this chapter discusses these concepts and provides a step-by-step implementation with examples of data loss protection (compliance), regulatory review (discovery), and business data retention policies. These three areas make up Office 365 data governance.
If you find that you need to perform discovery or mailbox searches, all users subject to search must be on Enterprise Subscription “Exchange Plan 2,” and there needs to be at least one E3 subscription to use the Electronic Discovery Center.
Data Governance Concepts
Microsoft provides the management service on Office 365 that meets or exceeds regulatory compliance requirements. The data in Office 365 (and the subscription types) is managed and owned by the individual users. The Office 365 business owners need to look at the business and decide what makes sense based on its needs. To put this in perspective, when an external entity looks at email storage, it is considered modifiable by the user and is noncompliant with certain regulations. A compliant system requires that the mail and document storage systems be incapable of being modified, or immutable. The owner of a mailbox must not be able to go in and delete the information or document. These capabilities are options in the Office 365 enterprise plan and are included at no charge in some of the subscription suites (such as the Enterprise E3 subscription).
You are probably familiar with the various CSI and NCIS shows. A key message that these shows highlight lies in the evidentiary collection of information, and that there must be a “chain of custody” regarding information collected. Think of data governance in the same context as you would a murder with the collection of information for the legal prosecution of the suspect. It is all about chain of custody. Data governance on Office 365 is the same. Information that is under discovery cannot be tampered with. Further, access is recorded and auditable for all those who access the information. This is the data governance model of Office 365.
Archive and retention policies are implementations of our ability to manage the data to meet our data governance needs. Traditional approaches, such as journaling, record information external to the organization structure, and mostly just contain copies of the email communications. This archaic journaling approach does not address the changing landscape of data governance and data management. Journaling does not link data from storage sites and draft documents in an integrated form. Even an archive is nothing more than another mailbox that is used to store information.
Immutability, audit policy, archive/retention, and data loss prevention are all part of the Office 365 data governance structure. It is designed around chain of custody and the preservation of information—information that cannot be tampered with. If it is tampered with, then a full audit trail of access, as well as the original information that was modified, is created.
Before we discuss the practical aspects of the configuration of retention policy and eDiscovery, we need to frame the discussion with a definition of each of the four key areas of data governance to put them in perspective.
There has been much written about information immutability, and there are many misconceptions as to what this is and how it is managed in Office 365. The definition is simple: immutability is the preservation of data in its original form. The data cannot be changed and is kept in a form that is discoverable.
Recall the discussion of chain of custody. It is not enough that the information you access and provide for data governance remains unchanged; you must not have the ability to change it. In addition, any access to the information must be fully traceable. If you access information, the information that you extract will not change the underlying information.
The best example is to look at an email that flows in or is created by a user in the cloud (see Figure 9-2). In this case, information that arrives or is in a user mailbox can be changed and modified by the user. This is the normal process that we use in writing an email. An email that is immutable, on the other hand, keeps all parts of the message in a form that can be fully discoverable through searches. When an email message is drafted, all changes and drafts are kept and not deleted. Nothing is purged—all information is fully discoverable.
When we refer to compliance, we are referring to our ability to access communications and documents that are immutable. Retention rules are based on business policies in the management of email communications, specifically what email is visible to the user in the mailbox, and what is kept in the archive. For example, you may have a business policy that dictates the movement of email from a user mailbox to an archive if the email is too old, or if the user deletes an email. One company has a retention policy of 90 days; after 90 days, user incoming email is moved into the compliance archive. These retention rules move the mail from the user mailbox (or delete folder) into the archive. These rules can be systems level (user has no control), or they can be local level (user has complete control), or any combination.
Litigation hold is an action that is placed on a mailbox to meet compliance requirements for future discovery and searching. Litigation hold ensures that the data in a user mailbox is immutable. As an example, if the user tries to delete an email, the email is deleted (or purged) from the user’s view, but the litigation hold function blocks the email from being deleted in the system, and the email remains fully discoverable by the administrator (or compliance officer).
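The behavior described above can be sketched as a small conceptual model. This is only an illustration of the idea, not how Exchange Online stores mail: the `Mailbox` class, its `visible` and `recoverable` lists, and the method names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Mailbox:
    """Toy model of a mailbox; not the Exchange Online storage model."""
    litigation_hold: bool = False
    visible: list = field(default_factory=list)      # what the user sees
    recoverable: list = field(default_factory=list)  # hidden from the user

    def deliver(self, message: str) -> None:
        self.visible.append(message)

    def user_delete(self, message: str) -> None:
        """Remove a message from the user's view."""
        self.visible.remove(message)
        if self.litigation_hold:
            # Under hold, the item is preserved out of the user's sight.
            self.recoverable.append(message)

    def discovery_search(self, keyword: str) -> list:
        """Search everything the compliance officer can reach."""
        return [m for m in self.visible + self.recoverable if keyword in m]


held = Mailbox(litigation_hold=True)
held.deliver("Q3 contract draft")
held.user_delete("Q3 contract draft")

unheld = Mailbox(litigation_hold=False)
unheld.deliver("Q3 contract draft")
unheld.user_delete("Q3 contract draft")
```

In this sketch the user's view of `held` is empty after the delete, yet a discovery search for "contract" still returns the message. The same delete on `unheld` leaves nothing to find.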
Referring back to Figure 9-2, we see the life of an email in a user mailbox. In Figure 9-2, the user only sees the message in steps 1–3. The compliance officer has access to all transactions in steps 1–6. When a discovery action—a search—is executed, all information is displayed in the search request, including the information in the deleted items, purges, and draft folders.
Companies in the cloud need to know who has access to their company data. The ability to monitor access and produce the necessary reports is part of the Office 365 audit capability. Companies need to be able to do the following:
Verify that their mailbox data isn’t being accessed by Microsoft.
Enforce compliance and privacy regulations and control access by nonowners.
Determine who has access to data at a given time in a specific mailbox.
Identify unauthorized access to mailbox data by users inside and outside the organization.
The ability to monitor the mailbox data is a fundamental part of the Office 365 organization (see Figure 9-3). Once the audit capabilities are enabled (via PowerShell), the audit reports can be generated by the administrator or an individual who has been given this capability.
The audit reports are displayed in the search results in the Exchange Administrator Panel. However, if the audit reports are not enabled, the information is not logged. Each audit report contains the following information:
Who accessed the mailbox and when
The actions performed by the nonowner
The affected message and its folder location
Whether the action was successful
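To make the four items above concrete, here is a minimal sketch of an audit entry as a data record, with a filter that keeps only nonowner access. The field names are hypothetical and chosen to mirror the list; they are not the actual Exchange audit log schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class AuditEntry:
    """One row of a hypothetical nonowner access report."""
    accessed_by: str        # who accessed the mailbox
    accessed_at: datetime   # when the access happened
    action: str             # what the accessor did
    message_subject: str    # the affected message
    folder: str             # where the message lives
    succeeded: bool         # whether the action was successful


def non_owner_entries(entries, owner):
    """Keep only entries where someone other than the owner acted."""
    return [e for e in entries if e.accessed_by != owner]


log = [
    AuditEntry("alice@example.com", datetime(2014, 3, 1, 9, 0),
               "Read", "Budget", "Inbox", True),
    AuditEntry("admin@example.com", datetime(2014, 3, 1, 9, 5),
               "Move", "Budget", "Inbox", True),
]
report = non_owner_entries(log, owner="alice@example.com")
```

The record is frozen so that an entry cannot be edited after it is written, which matches the chain-of-custody idea in a very loose way.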
The first step in setting up a compliant organization is to enable the audit capabilities to ensure that you have a complete record of all accesses to user mailbox data by nonowner users. This information is used to supplement future reports. Figure 9-4 provides a descriptive explanation of the terms in the audit reports.
The audit reports that are generated contain detailed information about who has accessed the information and how they have changed it. As you’ll see in Figure 9-4, users have different levels of access, and that access can be tracked in audit logs. If a legal hold was placed on the user mailbox, then the search of the user mailbox will show the history of access by non-mailbox owners. The areas marked “Yes” are those that can be tracked in the audit logs. This is different from the tracking of the information in the discovery center. The discovery center can track all information that is placed on legal hold. The audit logs track the non-mailbox owners who access information.
Information immutability takes this one step further and integrates Lync communications and SharePoint documents (as well as SkyDrive Pro document synchronization) into the equation. The Office 365 approach is designed to reduce the amount of information returned by removing duplicate information from search results. This reduces the complexity of the searches and allows the compliance officer to clearly see the thread of the information and the root cause (if any) of the discovery request. The searched data can be exported in the industry-standard Electronic Discovery Reference Model (EDRM) XML format to provide content to a third party. The Office 365 approach removes duplicate data from searches only; it does not remove any data from the user’s SharePoint site or mailbox. The data stays where it is and is immutable.
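The distinction between removing duplicates from search results and removing data from the source can be shown with a short sketch. It assumes duplicates are detected by hashing the item body, which is only an illustrative choice; the text does not describe how Office 365 identifies duplicates, and the `dedupe_results` helper is hypothetical.

```python
import hashlib


def dedupe_results(items):
    """Return one hit per distinct body, leaving `items` untouched."""
    seen = set()
    unique = []
    for item in items:
        digest = hashlib.sha256(item["body"].encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(item)
    return unique


hits = [
    {"source": "alice mailbox", "body": "Pricing proposal v2"},
    {"source": "bob mailbox", "body": "Pricing proposal v2"},
    {"source": "SharePoint", "body": "Pricing proposal v3"},
]
unique_hits = dedupe_results(hits)
```

The function builds a new list and never modifies `hits`, so the underlying copies stay in place while the reviewer sees each distinct item once.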
In Office 365, data governance and compliance are simplified. The scope of the discovery is reduced to a specific set of keywords and can be easily restricted to the few users in question. It is not uncommon that an eDiscovery request on Office 365 would cost 90 percent less than an eDiscovery request using an older journaling system for email communication management.
As you read the rest of this chapter, the discussion of archive and retention policies is built around data immutability to manage an organization’s compliance needs. In Office 365, this is referred to as compliance management. Administrators can set up controls based on the business policies of the organization.
Office 365 Archiving and Retention
The term archive is overused. It often implies more than what it really is. An archive is nothing more than a second mailbox designed for long-term storage. The relevancy of an archive is based on the business process rules that are used to manage it. This is where immutability and retention policies come into play. Immutability refers to how information is retained (in a form that can’t be changed) in the mailbox and the archive. Retention policies (see Figure 9-5) describe the length of time you need to keep the data that is not subject to any legal action (legal hold to guarantee immutability).
There are two types of archive in Office 365: personal archives and server archives (see Table 9-1). Server archives can be immutable (meaning they can be configured to ignore any change using litigation hold or in-place hold). Personal archives are stored locally on the user desktop and are not immutable (users can change the contents). The retention policies only refer to the moving of data from the user mailbox to the archive.
Retention policy is nothing more than the business processes that define the movement of data. Retention policies are a set of rules that are executed concerning a message (see Figure 9-6). A retention policy is a combination of different retention tags, which are actions placed on a message. You can have only one retention policy applied to a mailbox. In an organization where you have compliance requirements, retention tags are used to manage the user mailbox information and to control mailbox sizes.
Retention tags define and apply the retention settings to messages and folders in the user mailbox. These tags specify how long a message is kept and what action is taken when a message reaches the retention age. Retention tags are used to control the amount of information that is on the user’s desktop. Typically this means that the message is moved to the archive folder or it is deleted. Looking at Figure 9-6, you can see three types of retention tags: Default retention tags, Policy retention tags, and Personal retention tags (described below):
The default policy applies to all items in a mailbox that do not have a retention tag applied.
Policy tags are applied to folders (inbox, deleted items, and so on) and override the default policy tags. The only retention action for policy tags is to delete items.
Personal tags are only used for Outlook clients to move data to custom folders in the user’s mailboxes.
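The three tag types above can be sketched as a small lookup. The model follows the text in letting a folder's policy tag override the default tag. It also assumes that a personal tag explicitly applied by the user takes precedence over both, which the text does not state. All names here (`RetentionTag`, `effective_tag`, `action_due`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionTag:
    name: str
    age_limit_days: int
    action: str  # "archive" or "delete"


def effective_tag(default_tag, folder_tags, folder, personal_tag=None):
    """Pick the tag that governs an item.

    Assumed precedence: personal tag, then folder policy tag, then default.
    """
    if personal_tag is not None:
        return personal_tag
    return folder_tags.get(folder, default_tag)


def action_due(tag, age_days):
    """Return the tag's action once the item reaches the retention age."""
    return tag.action if age_days >= tag.age_limit_days else None


default = RetentionTag("Default 90-day archive", 90, "archive")
folder_tags = {"Deleted Items": RetentionTag("Deleted Items 30-day", 30, "delete")}

inbox_tag = effective_tag(default, folder_tags, "Inbox")
deleted_tag = effective_tag(default, folder_tags, "Deleted Items")
```

With these sample values, a 120-day-old Inbox message is due for the "archive" action, while a 45-day-old message in Deleted Items is due for "delete".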
The best way to understand retention policy is to follow the example in our implementation section (later in this chapter). Keep in mind that the implementation of a retention policy directly affects the amount of information kept in a user mailbox. Retention tags (which make up the retention policy) are just another tool used for information management. Depending on your business needs, you may have different retention policies to manage information of different groups in your organization. In one organization we managed, the data retention policy was 90 days, unless the mailbox was placed on in-place hold for litigation or discovery.
Compliance archives may or may not have a retention policy applied to them, but they will have the mailbox placed under litigation hold and the data retention policy of the SharePoint site also placed under litigation hold. User mailboxes that are placed under litigation hold with the external audit enabled meet all compliance requirements, because the data is immutable.
Data Loss Prevention
Data loss prevention (DLP) operates with either a template rule (see Figure 9-7), or with a trigger from the Rights Management Service based on business policy. The purpose of DLP is to execute an action based on rules. DLP does not prevent an individual from doing something bad. All DLP does is to limit the information flow in case someone sends electronic communications to a third party that violates business policy.
What DLP does is minimize mistakes that individuals make in sending information to individuals that do not have a business need to know the information. Add to this capability auditing and discovery, and you will be able to determine which individual had last access to the information.
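The idea of a rule that triggers an action on outbound mail can be sketched as follows. This is a simplified illustration, not one of the Office 365 DLP templates: the pattern for a U.S. Social Security number format, the `example.com` internal domain, and the `evaluate_outbound` function are all assumptions made for the example.

```python
import re

# Hypothetical rule: digits in the form 123-45-6789 count as sensitive.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
INTERNAL_DOMAIN = "example.com"  # hypothetical company domain


def evaluate_outbound(recipient: str, body: str) -> str:
    """Return "block" if sensitive content is leaving the organization."""
    is_external = not recipient.lower().endswith("@" + INTERNAL_DOMAIN)
    if is_external and SSN_PATTERN.search(body):
        return "block"
    return "allow"


external_result = evaluate_outbound("partner@other.org", "SSN 123-45-6789")
internal_result = evaluate_outbound("bob@example.com", "SSN 123-45-6789")
```

The rule only reacts to a pattern in the message, which matches the point above: it catches mistakes in what is sent, and it does not stop a determined individual.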
There are many rules that you can select to implement in addition to the rights management rules on Office 365. Figure 9-7 shows the different templates that can be managed in your organization to control information to meet federal and state regulations. Rights management is the extension of DLP to manage internal documents and information using Active Directory. DLP functions are managed using both the Office 365 interface and PowerShell commands (Figure 9-8).