Enabling Persistence Using a Real-World Application

This chapter turns the theory from Chapter 4 (and other chapters) into practice. We show how an application can take advantage of persistent memory by building a persistent memory-aware database storage engine. We use MariaDB (https://mariadb.org/), a popular open source database, because it provides a pluggable storage engine model.


The Database Example
A tremendous number of existing applications can be categorized in many ways. For the purpose of this chapter, we explore applications from the common components perspective, including an interface, a business layer, and a store. The interface interacts with the user, the business layer is a tier where the application's logic is implemented, and the store is where data is kept and processed by the application.
With so many applications available today, choosing one to include in this book that would satisfy all or most of our requirements was difficult. We chose to use a database as an example because a unified way of accessing data is a common denominator for many applications.

Different Persistent Memory Enablement Approaches
The main advantages of persistent memory include:
• It provides access latencies that are lower than flash SSDs.
• It has higher throughput than NAND storage devices.
• Real-time access to data allows ultrafast access to large datasets.
• Data persists in memory after a power interruption.
Persistent memory can be used in a variety of ways to deliver lower latency for many applications:
• In-memory databases: In-memory databases can leverage persistent memory's larger capacities and significantly reduce restart times. Once the database memory maps the index, tables, and other files, the data is immediately accessible. This avoids lengthy startup times where the data is traditionally read from disk and paged in to memory before it can be accessed or processed.
• Fraud detection: Financial institutions and insurance companies can perform real-time data analytics on millions of records to detect fraudulent transactions.
• Cyber threat analysis: Companies can quickly detect and defend against increasing cyber threats.
• Web-scale personalization: Companies can tailor online user experiences by returning relevant content and advertisements, resulting in higher user click-through rate and more e-commerce revenue opportunities.
• Financial trading: Financial trading applications can rapidly process and execute financial transactions, allowing them to gain a competitive advantage and create a higher revenue opportunity.
• Internet of Things (IoT): Faster data ingest and processing of huge datasets in real-time reduces time to value.
• Content delivery networks (CDN): A CDN is a highly distributed network of edge servers strategically placed across the globe with the purpose of rapidly delivering digital content to users. With a larger memory capacity, each CDN node can cache more data and reduce the total number of servers, while networks can reliably deliver low-latency data to their clients. If the CDN cache is persisted, a node can restart with a warm cache and sync only the data it missed while it was out of the cluster.
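The restart benefit described for in-memory databases comes from memory mapping: once a file is mapped, its contents are addressable without an explicit load phase. The following is a minimal sketch of that idea, assuming a POSIX system and an ordinary file rather than a DAX-mounted persistent memory device. CounterTable, map_table(), and unmap_table() are hypothetical names introduced only for illustration and are not part of the storage engine built in this chapter.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <fcntl.h>
#include <string>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical fixed-size "table header" that lives directly in the mapped file.
struct CounterTable {
    std::uint64_t row_count;
    char          name[56];
};

// Map the file at `path` read/write, creating and sizing it if needed.
// A newly created file is zero-filled by ftruncate(). Returns nullptr on failure.
CounterTable* map_table(const std::string& path) {
    int fd = ::open(path.c_str(), O_RDWR | O_CREAT, 0600);
    if (fd < 0) return nullptr;
    if (::ftruncate(fd, sizeof(CounterTable)) != 0) {
        ::close(fd);
        return nullptr;
    }
    void* addr = ::mmap(nullptr, sizeof(CounterTable),
                        PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    ::close(fd);  // the mapping remains valid after the descriptor is closed
    return addr == MAP_FAILED ? nullptr : static_cast<CounterTable*>(addr);
}

// Flush the mapped range to the backing file, then remove the mapping.
bool unmap_table(CounterTable* table) {
    assert(table != nullptr);
    bool synced = ::msync(table, sizeof(CounterTable), MS_SYNC) == 0;
    bool unmapped = ::munmap(table, sizeof(CounterTable)) == 0;
    return synced && unmapped;
}
```

A caller would map the file, modify the structure in place, unmap it, and later map it again to find the same values without any parsing or copying step. On real persistent memory, a library such as libpmemobj handles the mapping and also provides the consistency guarantees that this sketch omits.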

Developing a Persistent Memory-Aware MariaDB* Storage Engine
The storage engine developed here is not production quality and does not implement all the functionality expected by most database administrators. To demonstrate the concepts described earlier, we kept the example simple, implementing table create(), open(), and close() operations and INSERT, UPDATE, DELETE, and SELECT SQL operations. Because the storage engine capabilities are quite limited without indexing, we include a simple indexing system using volatile memory to provide faster access to the data residing in persistent memory. Although MariaDB has many storage engines to which we could add persistent memory, we are building a new storage engine from scratch in this chapter. To learn more about the MariaDB storage engine API and how storage engines work, we suggest reading the MariaDB "Storage Engine Development" documentation (https://mariadb.com/kb/en/library/storage-engines-storage-engine-development/). Since MariaDB is based on MySQL, you can also refer to the MySQL "Writing a Custom Storage Engine" documentation (https://dev.mysql.com/doc/internals/en/customengine.html) to find all the information for creating an engine from scratch.

Understanding the Storage Layer
MariaDB provides a pluggable architecture for storage engines that makes it easier to develop and deploy new storage engines. A pluggable storage engine architecture also makes it possible to create new storage engines and add them to a running MariaDB server without recompiling the server itself. The storage engine provides data storage and index management for MariaDB. The MariaDB server communicates with the storage engines through a well-defined API.
In our code, we implement a prototype of a pluggable persistent memory-enabled storage engine for MariaDB using the libpmemobj library from the Persistent Memory Development Kit (PMDK).

Creating a Storage Engine Class
The implementation of the storage engine described here is single-threaded to support a single session, a single user, and single table requests. A multi-threaded implementation would detract from the focus of this chapter. Chapter 14 discusses concurrency in more detail. The MariaDB server communicates with storage engines through a well-defined handler interface that includes a handlerton, which is a singleton handler that is connected to a table handler. The handlerton defines the storage engine and contains pointers to the methods that apply to the persistent memory storage engine.
The first method the storage engine needs to support is to enable the call for a new handler instance, shown in Listing 13-1. When a handler instance is created, the MariaDB server sends commands to the handler to perform data storage and retrieval tasks such as opening a table, manipulating rows, managing indexes, and transactions. When a handler is instantiated, the first required operation is the opening of a table. Since the storage engine is a single-user and single-threaded implementation, only one handler instance is created.
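The relationship between a singleton engine descriptor and the handler instances it creates can be sketched in plain C++. This is an illustration of the pattern only, not the MariaDB API: TableHandler, PmemTableHandler, EngineDescriptor, create_pmem_handler(), and pmem_engine() are all hypothetical names, and the real handlerton and handler classes carry many more members.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for the server's abstract per-table handler interface.
class TableHandler {
public:
    virtual ~TableHandler() = default;
    virtual int open(const std::string& table_name) = 0;
    virtual int close() = 0;
    virtual bool is_open() const = 0;
};

// Hypothetical engine-specific handler; a real one would open a memory pool here.
class PmemTableHandler : public TableHandler {
public:
    int open(const std::string& table_name) override {
        name_ = table_name;
        open_ = true;
        return 0;  // 0 signals success in this sketch
    }
    int close() override {
        open_ = false;
        return 0;
    }
    bool is_open() const override { return open_; }

private:
    std::string name_;
    bool open_ = false;
};

// Hypothetical engine descriptor: one per engine, holding pointers to
// engine-wide functions such as the handler factory.
struct EngineDescriptor {
    const char* name;
    std::unique_ptr<TableHandler> (*create)();
};

// Factory the server would call to obtain a new handler instance.
std::unique_ptr<TableHandler> create_pmem_handler() {
    return std::unique_ptr<TableHandler>(new PmemTableHandler());
}

// Singleton accessor: every caller sees the same descriptor object.
const EngineDescriptor& pmem_engine() {
    static const EngineDescriptor engine{"pmem_sketch", &create_pmem_handler};
    return engine;
}
```

In this sketch the server side would call pmem_engine().create() once, then open() on the returned handler before any row operation, which mirrors the order of operations described above.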
Various handler methods are also implemented; they apply to the storage engine as a whole, as opposed to methods like create() and open() that work on a per-table basis. The abstract methods defined in the handler class are implemented to work with persistent memory. An internal representation of the objects in persistent memory is created using a singly linked list (SLL). This internal representation makes it easy to iterate through the records, which helps performance.
To perform a variety of operations and gain faster and easier access to data, we used the simple row structure shown in Listing 13-3 to hold the pointer to persistent memory and the associated field value in the buffer.
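A singly linked list of rows, each holding a link to the next row and a buffer of field values, can be sketched as follows. This is a volatile-memory illustration under stated assumptions: the raw Row* link stands in for a persistent pointer, std::vector stands in for the persistent buffer, and Row, RowList, payload(), scan_init(), and scan_next() are hypothetical names that only loosely mirror the engine's row structure and table-scan methods.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical row node: a link to the next row plus the packed field values.
struct Row {
    Row* next;                       // stand-in for a persistent pointer
    std::vector<unsigned char> buf;  // stand-in for the row buffer
};

// Convenience helper that views a row buffer as text.
std::string payload(const Row& row) {
    return std::string(row.buf.begin(), row.buf.end());
}

// Singly linked list of rows with a forward-only scan cursor.
class RowList {
public:
    RowList() = default;
    RowList(const RowList&) = delete;
    RowList& operator=(const RowList&) = delete;
    ~RowList() {
        while (head_ != nullptr) {
            Row* next = head_->next;
            delete head_;
            head_ = next;
        }
    }

    // Append a row at the tail so scans return rows in insertion order.
    Row* append(const std::string& packed) {
        Row* row = new Row{nullptr,
                           std::vector<unsigned char>(packed.begin(), packed.end())};
        if (tail_ != nullptr) {
            tail_->next = row;
        } else {
            head_ = row;
        }
        tail_ = row;
        return row;
    }

    // Reset the cursor to the first row before a full scan.
    void scan_init() { cursor_ = head_; }

    // Return the current row and advance; nullptr marks the end of the table.
    const Row* scan_next() {
        const Row* current = cursor_;
        if (cursor_ != nullptr) cursor_ = cursor_->next;
        return current;
    }

private:
    Row* head_ = nullptr;
    Row* tail_ = nullptr;
    Row* cursor_ = nullptr;
};
```

Keeping a tail pointer makes appends constant time, and the cursor gives the scan a position that can be reset without touching the rows themselves.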

Closing a Database Table
When the server is finished working with a table, it calls the closeTable() method to close the file using pmemobj_close() and release any other resources (see Listing 13-6). The pmemobj_close() function closes the memory pool indicated by objtab and deletes the memory pool handle.

INSERT Operation
The INSERT operation is implemented in the write_row() method, shown in Listing 13-7. During an INSERT, the row objects are maintained in a singly linked list. If the table is indexed, the index table container in volatile memory is updated with the new row objects after the persistent operation completes successfully. write_row() is an important method because, in addition to allocating persistent pool storage for the rows, it is used to populate the indexing containers. For inserts, write_row() uses pmemobj_tx_alloc(), which transactionally allocates a new object of a given size and type_num.
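The ordering rule described above, where the volatile index is updated only after the persistent step succeeds, can be sketched without PMDK. In this illustration the private persist() function stands in for the transactional allocation and can be told to fail once; IndexedTable, fail_next_persist(), and the int key are hypothetical and chosen only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

struct Row {
    int key;
    std::string payload;
};

class IndexedTable {
public:
    IndexedTable() = default;
    IndexedTable(const IndexedTable&) = delete;
    IndexedTable& operator=(const IndexedTable&) = delete;

    // Store the row first; touch the volatile index only if storing succeeded.
    bool write_row(int key, const std::string& payload) {
        Row* stored = nullptr;
        try {
            stored = persist(Row{key, payload});
        } catch (const std::exception&) {
            return false;  // the index is left exactly as it was
        }
        index_.emplace(key, stored);
        return true;
    }

    std::size_t count(int key) const { return index_.count(key); }
    std::size_t size() const { return size_; }

    // Test hook: make the next persist() call throw, simulating an aborted transaction.
    void fail_next_persist() { fail_next_ = true; }

private:
    // Stand-in for a transactional allocation plus linking the row into the list.
    Row* persist(Row row) {
        if (fail_next_) {
            fail_next_ = false;
            throw std::runtime_error("simulated allocation failure");
        }
        tail_ = rows_.insert_after(tail_, std::move(row));
        ++size_;
        return &*tail_;
    }

    std::forward_list<Row> rows_;  // singly linked list of rows
    std::forward_list<Row>::iterator tail_ = rows_.before_begin();
    std::multimap<int, Row*> index_;  // volatile index: key -> row
    std::size_t size_ = 0;
    bool fail_next_ = false;
};
```

Because the index lives in volatile memory, it never needs to be rolled back: if the persistent step fails, the index is simply not updated, and after a restart it can be rebuilt by scanning the row list.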

UPDATE Operation
The server executes UPDATE statements by performing a rnd_init() or index_init() table scan until it locates a row matching the key value in the WHERE clause of the UPDATE statement before calling the update_row() method. If the table is an indexed table, the index container is also updated after this operation is successful. In the update_row() method defined in Listing 13-9, the old_data field will have the previous row record in it, while new_data will have the new data.
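The update flow, locating the row that matches the old record, overwriting it, and then re-keying the index, might look like the following sketch. It uses standard containers in place of persistent memory (std::list here rather than the engine's singly linked list), and Table, insert(), find(), and this particular update_row() signature are assumptions made for the example rather than the handler API.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>

struct Row {
    int key;
    std::string payload;
};

class Table {
public:
    void insert(const Row& row) {
        rows_.push_back(row);
        index_.emplace(row.key, &rows_.back());
    }

    // Replace the first row equal to old_data with new_data.
    // Returns false, leaving table and index untouched, if no such row exists.
    bool update_row(const Row& old_data, const Row& new_data) {
        auto range = index_.equal_range(old_data.key);
        for (auto it = range.first; it != range.second; ++it) {
            Row* stored = it->second;
            if (stored->payload != old_data.payload) continue;
            *stored = new_data;  // stand-in for a transactional overwrite
            // Re-key the index only after the row itself has been updated.
            index_.erase(it);
            index_.emplace(new_data.key, stored);
            return true;
        }
        return false;
    }

    // Look up one row by key through the index; nullptr if absent.
    const Row* find(int key) const {
        auto it = index_.find(key);
        return it == index_.end() ? nullptr : it->second;
    }

private:
    std::list<Row> rows_;             // node addresses stay stable across inserts
    std::multimap<int, Row*> index_;  // volatile index: key -> row
};
```

Comparing against old_data before overwriting guards against updating a different row that happens to share the same key.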

DELETE Operation
The DELETE operation is implemented using the delete_row() method. Three different scenarios should be considered:
• Deleting an indexed value from the indexed

SELECT Operation
SELECT is an important operation that is required by several methods. Many methods that are implemented for the SELECT operation are also called from other methods. The rnd_init() method is used to prepare for a table scan for non-indexed tables, resetting counters and pointers to the start of the table. If the table is an indexed table,
This concludes the basic functionality our persistent memory-enabled storage engine set out to achieve. We encourage you to continue the development of this storage engine to introduce more features and functionality.

Summary
This chapter provided a walk-through using libpmemobj from the PMDK to create a persistent memory-aware storage engine for the popular open source MariaDB database. Using persistent memory in an application can provide continuity in the event of an unplanned system shutdown along with improved performance gained by storing your data close to the CPU where you can access it at the speed of the memory bus. While database engines commonly use in-memory caches for performance, which take time to warm up, persistent memory offers an immediately warm cache upon application startup.
Open Access. This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made. The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.