
Temporal Event Tracing on Big Healthcare Data Analytics

  • Chin-Ho Lin
  • Liang-Cheng Huang
  • Seng-Cho T. Chou
  • Chih-Ho Liu
  • Han-Fang Cheng
  • I-Jen Chiang
Chapter
Part of the International Series on Computer Entertainment and Media Technology book series (ISCEMT)

Abstract

This study presents a comprehensive method for rapidly processing, storing, retrieving, and analyzing big healthcare data. A patient-driven data architecture based on NoSQL (not only SQL) is proposed to enable rapid storage and flexible expansion of data. Thus, the schema differences among hospitals can be overcome, and flexibility for field alteration and addition is ensured. A timeline mode can easily be used to generate a visual representation of patient records, providing physicians with a reference during consultation. A sharding-key is used for data partitioning to generate patient data for various populations. Data reformulation is then conducted to produce additional temporal and spatial data, and a cloud computing method based on query-MapReduce-shard is provided to enhance the search performance of data mining. Target data can be rapidly searched and filtered, particularly when analyzing temporal events and interactive effects.

Keywords

Big medical data · NoSQL · Temporal event analysis · Shard · Data mining · Medical record

1 Introduction

Five pillars of healthcare outcome policy priorities have been identified: (1) improving quality, safety, and efficiency while reducing health disparities; (2) engaging patients and families in their health; (3) increasing the coordination of care; (4) improving the health of the population; and (5) ensuring patient privacy and security protection for personal health information [1]. Accordingly, sophisticated information technology efforts have been devoted to adopting and implementing eligible electronic health records. In practice, medical records are maintained to support patient care assessments and related decisions, legitimate medical practice reports, healthcare institution management and planning, and clinician and nurse education. In addition, medical records serve as primary sources for studies investigating the enhancement of medical care and health [2, 3]. However, medical data must be rapidly processed, preserved, acquired, analyzed, and used to ensure its contribution to public health promotion.

The rapid development of science and medical technology has yielded rapid and effective methods for verifying, detecting, preventing, and treating diseases. This phenomenon has generated big healthcare data, specifically (1) a rapid accumulation of medical records; (2) an increased number of medical evaluation factors (e.g., items investigated during laboratory, biochemical, and genetic tests); (3) diverse data types (e.g., text and numerals, diagrams, tables, images, and handwritten documents); and (4) difficulties in processing and managing data. Combined, these aspects of big data delay response times and increase costs. Using a traditional relational database management system (RDBMS) conserves storage space, avoids repeated data, and supports complex table structures, enabling users to engage in various queries [4]. However, as the use of big medical data becomes increasingly prevalent, the relational model methods previously adopted to ensure data efficiency and consistency (e.g., categorizing medical records in various tables or databases) can no longer be used to effectively collect and analyze data. Thus, a bottleneck negatively affects system performance, hindering the data compilation process. Although an RDBMS is advantageous regarding storage space conservation and data quality, it cannot efficiently be used to manage or ensure the meaningful use of big medical data.

NoSQL (not only SQL) has become increasingly popular [5], and scholars have indicated that it exhibits more advantages compared with RDBMS. However, research regarding the use of NoSQL in big medical data remains nascent; thus, various vital and urgent research concerns must be addressed. In this study, methods involving NoSQL structural approaches were proposed for rapidly processing, preserving, acquiring, analyzing, and using big medical data. Currently, a big medical data environment that can rapidly expand and be used to effectively process accumulated health records is required. In addition to using cloud computing to rapidly analyze relevant medical records and assess patient conditions, medical researchers can incorporate empirical data (PubMed) to produce evidence-based clinical guidelines, providing references for physician diagnoses and treatments. Consequently, scholars can rapidly compile various data types for use in analyses and mining (i.e., tracing and analyzing temporal events) to facilitate medical research on the interactions and relationships among disease-inducing factors.

The proposed method is divided into three parts based on the model-view-controller (MVC) design pattern. Part 1 represents the model (M), which adopts the NoSQL data platform and a patient-driven data format design and partitions the data into distributed environments based on the sharding-key to build highly efficient data cubes for big data analytics. Part 2 denotes the view (V), which employs a Web-based user interface. The search input interface accepts combined search conditions, such as diseases, drugs, population characteristics, and temporal sequences, and displays the search results using various statistical distribution charts and data tables. Individual patient records are presented using a visual timeline. Finally, Part 3 is the controller (C), which involves a data-restructuring algorithm that converts the original relational data model into NoSQL data, provides additional temporal information through data reformulation, and uses query-MapReduce-shard in a cloud computing environment to improve search performance. To verify this method, we performed a prototype system test using the MongoDB database platform [6] and produced the related programs. The data comprised information from Taiwan's National Health Insurance Research Database (NHIRD) for 2010; the sample involved one million people, spanned 15 years, and contained 1,175,186,752 medical records [7]. We successfully imported the data to the test platform, and all functions passed the test. In addition to demonstrating the advantages of NoSQL, the empirical results indicated that the query-MapReduce-shard approach enhanced performance by more than tenfold. Regarding temporal event tracing and analysis, superior performance was observed in medical record search and display, as well as in complex data mining.

2 Related Work

The Taiwanese Health Insurance program covers 23 million people [8], and more than one million new medical records are generated each day. In addition to these records, other data, such as insurance claim data, clinical data, patient records, empirical literature, and medical images, are accumulated over time, yielding a considerable amount of data that cannot easily be quantified. Similar situations have occurred in numerous fields such as social media, customer behavior, and sensing devices. Big data has become a concept typically described using the 3Vs (volume, variety, and velocity). Simply expressed, big data refers to large volumes of versatile data that require high temporal effectiveness to retrieve, analyze, process, and store; this cannot be achieved within a limited time by using current database query software tools. For example, RDBMSs experience the following problems: (1) the SQL JOIN syntax must be used to perform cross-table queries; however, when data volumes accumulate, JOIN causes a performance bottleneck; and (2) the pre-designed schema architecture generally impedes field updates, particularly when vast data volumes are involved. Updates require redefining the schema, which potentially disturbs the relational logic within the database or even changes the data format. For example, when Twitter sought to modify a data field, 2 weeks were required to perform an alter table command that would change the definition of an existing data table. Such a situation severely influences data collection and analysis. Traditional data warehouse or intelligent analytic systems are adept at handling structured data, but not semi-structured data (e.g., XML, logs, click-streams, and RFID tags) or unstructured data (e.g., Web pages, email, multimedia, and instant messages). These systems will be unable to cope with the rapidly increasing data volumes of the present day.

Numerous studies have focused on designing relevant strategies and suggesting future directions for databases [9, 10]. Furthermore, enterprises have developed NoSQL databases according to their application needs. For example, Amazon developed Dynamo [11], Google established Bigtable [12], and Facebook proposed Cassandra [13]. Compared with traditional relational database systems, NoSQL systems break the restrictions of schema fields, providing schema-less data storage. NoSQL has become widespread in recent years, and more than 150 types of NoSQL databases are used for various applications and purposes [5]. Studies have confirmed the advantages of using NoSQL over traditional relational databases [14, 15, 16, 17, 18], such as its flexibility of data design, system performance, storage capacity, scalability, and low cost. Integrating the NoSQL database architecture with a cloud platform enables MapReduce distributed computing. In addition, NoSQL can expand horizontally: new database nodes can be added dynamically, and the old nodes automatically copy data to the new nodes to balance the data access loads between them [19, 20]. Thus, common database partitioning procedures, whereby databases are normalized, tables are segmented, data are copied, and application links are manually specified, are rendered superfluous.

In contrast to relational databases, no fundamental theory exists for non-relational databases; thus, a universal data modeling technique is lacking. Non-relational databases can be categorized as key-value, column-oriented, document-oriented, and graph databases [5, 21]. Considering the distinct types and diversity of healthcare data, the temporal sequence characteristics, and the continuously increasing number of data items, document-oriented and column-oriented NoSQL databases are most suitable for constructing healthcare analytic databases. During data searching and mining analysis, it is often necessary to change, recompile, and recombine the data. To verify or refute hypotheses, various arms must frequently be combined for comparative analysis. Temporal and spatial information is typically a crucial factor in such analyses; however, time and space are often separated, rendering the visual representation of search results extremely difficult when collecting and analyzing healthcare data. This problem must be resolved. After careful consideration and based on our previous study [22], we employed cloud computing as the foundation of our long-term research and plans, collecting, archiving, analyzing, and visually presenting the obtained data to establish a mining and knowledge-based big healthcare data service platform.

3 Methods

The methods proposed in this paper and the overall operational architecture are presented in Fig. 1. In this section, the MVC design pattern is used to explain the architecture in three layers.
Fig. 1

Overall operational architecture of the proposed methods

3.1 Model Layer

To respond to the increasing variety and velocity of healthcare data, we compiled all healthcare data of each patient into a separate document called a patient-driven medical document (PaMeDoc). Each patient has an independent PaMeDoc. PaMeDocs are used in document-oriented databases, and their basic elements are key-value pairs, such as (Birthday, "2013-12-31"), where "Birthday" is the key and "2013-12-31" is the value. The key is used for identification and is unique within a PaMeDoc; thus, it cannot be repeated. The value can be arbitrary data. One PaMeDoc can contain another PaMeDoc, forming a tree structure; thus, the medical data of an individual patient form a tree. The PaMeDoc tree structure can grow both vertically (in depth) and horizontally, which facilitates the rapid retrieval and flexible expansion of data, rendering the structure suitable for use with big healthcare data.
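As an illustration only (the field names and values below are hypothetical and do not reproduce the NHIRD schema), a PaMeDoc might look like the following nested JSON document, with visit records and medical orders embedded as sub-documents:

{
  "_id": "ENC0001",
  "Birthday": "1956-04-17",
  "Gender": "F",
  "visits": [
    { "FUNC_DATE": "2010-03-15",
      "ICD9": "250.00",
      "orders": [
        { "Drug_Code": "A01", "Total_QTY": 28, "Drug_Use": 1, "Drug_Fre": 1 }
      ]
    }
  ]
}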

PaMeDocs are a type of high-dimensional data that can be concentrated into specific dimensions or perspectives by using sharding to form the patient data of a specific population (Fig. 2). Sharding is a type of horizontal partitioning and is highly expandable, increasing the level of search performance [23, 24]. When data are partitioned into multiple small shards, these shards become individually searchable. Multiple shards can be combined using MapReduce [25] to conduct parallel computing and provide increased operating speed. Key-value pairs are the basic elements of PaMeDocs, making them suitable for key-based sharding. A sharding-key can be formed using one or several key fields; the value of the sharding-key is the basis for sharding the data and can be a single value or a range of values. Using the birthday example, which involves a yyyy-mm-dd format, the year (yyyy), month (mm), year and month (yyyy-mm), or a certain time frame can be used to produce shards comprising populations of various ages or those born in various seasons. However, only meaningful shards improve search performance; thus, the data characteristics and query purpose must be considered when selecting suitable sharding-keys and values. For example, if disease codes are used as sharding-keys, the code arrangements can be used to determine the values.
Fig. 2

The data were divided into season shards according to birthday
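A minimal sketch of this idea on MongoDB follows (it is not the authors' code; the database, collection, and field names are hypothetical, and a sharded cluster accessed through a mongos router is assumed). A derived field such as birth_month can be stored in each PaMeDoc during reformulation and then declared as the sharding-key:

from pymongo import MongoClient

client = MongoClient("mongodb://mongos-host:27017")   # assumed mongos router of a sharded cluster
client.admin.command("enableSharding", "nhird")       # hypothetical database name
client.admin.command("shardCollection", "nhird.pamedocs",
                     key={"birth_month": 1})          # birth_month: assumed derived field
# PaMeDocs are now partitioned by birth_month, so a query restricted to the spring
# months (2, 3, 4) touches only the shards that hold those chunks.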

The PaMeDoc storage platform can use an existing NoSQL database, preferably the document-oriented type (e.g., MongoDB). PaMeDocs can easily be expressed using the JSON (JavaScript Object Notation) [26] or BSON (binary-encoded serialization of JSON-like documents) [27] data formats. If other NoSQL platform types are used, the data format can be converted to import the PaMeDocs into the database.
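For example, with MongoDB, a PaMeDoc written as a Python dictionary (such as the JSON sketch above) can be imported directly through the official driver; the snippet below is a sketch under the same hypothetical names and assumes pymongo 3 or later:

from pymongo import MongoClient

client = MongoClient()                        # assumed local MongoDB instance
pamedocs = client["nhird"]["pamedocs"]        # hypothetical database/collection names

pamedoc = {"_id": "ENC0001",                  # encrypted patient ID used as the document key
           "Birthday": "1956-04-17",
           "visits": []}                      # nested records accumulate over time
pamedocs.insert_one(pamedoc)                  # stored as a BSON document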

In addition to the general advantages of a NoSQL database, this model benefits from PaMeDocs and sharding, enabling clinicians to rapidly view all relevant records of the current patient (the data are collected in the same PaMeDoc). Thus, medical institutions can easily and efficiently manage and exchange patient medical records in a schema-less manner. PaMeDocs also exhibit superior performance in the complex searching and filtering of data (e.g., data on the interactive effects among diseases, drugs, and treatments) and in temporal analysis, because the search is ultimately conducted within the same PaMeDoc.

3.2 View Layer

A Web-based user interface was adopted. The interface receives user requests and displays the results. Shared standards such as Ajax (asynchronous JavaScript and XML) [28] and D3.js (Data-Driven Documents) [29] can be used as the page rendering technology. As described in Sect. 3.1, various diseases, drugs, populations, and temporal sequence characteristics can be combined to form complex query conditions (Fig. 3). Based on the PaMeDoc and shard design, Ajax and D3.js can easily render various statistical distribution charts and data tables, particularly the timeline visual representation of individual patient records.
Fig. 3

The web-based search functions and results
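The prototype described in Sect. 4 used Apache, PHP, and Java on the server side; purely as an illustrative sketch (the endpoint, database, and field names are hypothetical), a server route of the following kind could return one patient's PaMeDoc records as JSON for the Ajax/D3.js timeline:

from flask import Flask, jsonify
from pymongo import MongoClient

app = Flask(__name__)
pamedocs = MongoClient()["nhird"]["pamedocs"]    # hypothetical names, as in Sect. 3.1

@app.route("/api/timeline/<patient_id>")
def timeline(patient_id):
    # Fetch the single PaMeDoc for this patient and flatten its visit
    # sub-documents into (date, category) points for the timeline.
    doc = pamedocs.find_one({"_id": patient_id}, {"visits": 1})
    if doc is None:
        return jsonify([]), 404
    points = [{"date": v.get("FUNC_DATE"), "category": v.get("ICD9")}
              for v in doc.get("visits", [])]
    return jsonify(sorted(points, key=lambda p: p["date"] or ""))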

3.3 Controller Layer

In this subsection, the approach used for restructuring the data into PaMeDocs is described. In addition, data reformulation, sharding, the use of MapReduce, and the conduct of targeted queries and searches are explained.
  1) Restructuring Relations to PaMeDocs: When attempting to restructure medical records, which are stored in the tables of a relational database, into PaMeDocs, the clinical operations of hospitals must first be understood. At a patient's first consultation, the hospital collects the patient's demographic information, which is stored in the patient data table (ID). In clinics, the physician records outpatient prescriptions and treatments; each time the patient attends a clinic for treatment, multiple new medical orders (drug prescriptions and medical examination sheets) may be issued. The medical operations of inpatient care, outpatient care, and pharmacies generate various data tables, which are mutually concatenated using primary keys (PK) and foreign keys (FK). Therefore, when restructuring these data into patient PaMeDocs, the basic information can be acquired from the ID table, after which the PK-FK relationships are followed to obtain the target data from the linked data tables. This recursive procedure continues until all patient PaMeDocs have been completed, as shown by the Restructure() algorithm listed at the end of this section. In practice, redundant keys or duplicated fields in the data tables are deleted during restructuring to conserve storage space.
  2) Temporal and Spatial Data Reformulation: Temporal and spatial data are vital factors in event analysis, but they may not be obtainable directly from the existing data. By analyzing past research related to the NHIRD, we found that obtaining the patient age and year of seeking medical advice, the patient residential area, the drug type, the days of drug use after a single hospital/clinic visit, the total days of drug use, the duration of the medical history, and the time interval between the occurrence of diseases requires additional processing or computation. However, when the data volume is substantial, such processing and computation may take so long that waiting becomes impossible. Therefore, these factors can be computed in advance from the currently known data (Table 1), and the new fields and collections can be generated and saved for immediate access in future research, eliminating the need for recomputation (a code sketch is given after the Restructure() listing at the end of this section).

     Table 1
     Newly added data items and generating methods

     Newly added key fields   Computing methods                          Property
     FUNC_AGE                 FUNC_DATE - ID_BIRTHDAY                    Numerical
     FUNC_YEAR                FUNC_DATE(YYYY)                            Time
     Residential_Area         AREA_NO_I                                  Category
     Drug_Category            Searching drug classification table        Category
     Drug_Use_Day             Total_QTY / Drug_Use × Drug_Fre            Numerical
     Total_Drug_Use_Day       SUM(Drug_Use_Day)                          Numerical
     Disease_History          CURRENT_DATE - FIRST_FUNC_DATE             Time
     Interval_Diseases        FIRST_FUNC_DATE_D2 - FIRST_FUNC_DATE_D1    Time
  3) Using MapReduce and the Sharding-Key for Targeted Queries and Searches: When performing conditional searches, MapReduce uses the sharding-key to perform a targeted query, enhancing the search and computing performance. Because the information is dispersed across various shards, when a user searches the information, the query conditions are mapped to the corresponding sharding-key values, parallel searching is conducted on the corresponding shards, and finally all search results are compiled (Reduce). For example, Fig. 2 shows the field birthday used as a sharding-key: the data table is separated into four seasonal shards, where birthdays yyyy-(02, 03, 04)-dd fall in spring, yyyy-(05, 06, 07)-dd in summer, yyyy-(08, 09, 10)-dd in autumn, and yyyy-(11, 12, 01)-dd in winter. When counting the number of people born in each season, the search and computation can be performed directly in the corresponding shard (see the sketch immediately below).
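The following is a minimal sketch of such a targeted query with the Python driver (it is not the authors' code: the collection and field names are hypothetical, the cluster is assumed to be sharded on the derived birth_month field as in Sect. 3.1, and the legacy map-reduce command of MongoDB 2.x-era drivers is assumed; newer drivers favor the aggregation pipeline). It counts patients born in the spring months only, so the map and reduce phases run in parallel on the relevant shards alone:

from pymongo import MongoClient
from bson.code import Code

pamedocs = MongoClient()["nhird"]["pamedocs"]    # hypothetical database/collection names

mapper = Code("function () { emit(this.birth_month, 1); }")
reducer = Code("function (key, values) { return Array.sum(values); }")

# The query restricts the job to the shards holding the spring months (2, 3, 4).
result = pamedocs.map_reduce(mapper, reducer, "spring_counts",
                             query={"birth_month": {"$in": [2, 3, 4]}})
for doc in result.find():
    print(doc["_id"], doc["value"])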

Restructure()
  // build a tree of all tables rooted at the patient data table (ID),
  // following the PK-FK links between tables
  tt <- construct all tables as a tree rooted at table ID;
  pamedocs <- empty set;
  for each tuple in table ID
    pamedocs <- pamedocs + Sub_restruct(tuple, tt);
  return pamedocs;

Sub_restruct(tuple, tt)
  doc <- new pamedoc();
  (doc.keys, doc.values) <- tuple;                    // copy the tuple's fields into the document
  for each subtt in tt.subtrees {
    doc1 <- new pamedoc();                            // container for the records of this child table
    for each tuple1 in table subtt with tuple1.FK == tuple.PK
      doc1.subdocs++ <- Sub_restruct(tuple1, subtt);  // recurse into the linked tables
    doc.subdocs++ <- doc1;
  }
  return doc;
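Returning to the data reformulation step summarized in Table 1, the following is a minimal sketch of precomputing two of the derived fields (FUNC_AGE and Drug_Use_Day) and writing them back into each PaMeDoc. It is illustrative only: the collection and sub-document names are hypothetical, a recent pymongo driver is assumed, dates are assumed to be stored as yyyy-mm-dd strings, and Drug_Use_Day is interpreted as the total quantity divided by the dose per administration times the daily frequency.

from pymongo import MongoClient

pamedocs = MongoClient()["nhird"]["pamedocs"]    # hypothetical database/collection names

def year_of(yyyy_mm_dd):
    return int(yyyy_mm_dd[:4])                   # dates assumed stored as "yyyy-mm-dd" strings

for doc in pamedocs.find({}, {"Birthday": 1, "visits": 1}):
    visits = doc.get("visits", [])
    for v in visits:
        # FUNC_AGE: patient age in the year of the visit (FUNC_DATE - ID_BIRTHDAY in Table 1)
        v["FUNC_AGE"] = year_of(v["FUNC_DATE"]) - year_of(doc["Birthday"])
        for o in v.get("orders", []):
            # Drug_Use_Day: days of drug supply for this order (assumed reading of Table 1)
            o["Drug_Use_Day"] = o["Total_QTY"] / (o["Drug_Use"] * o["Drug_Fre"])
    pamedocs.update_one({"_id": doc["_id"]}, {"$set": {"visits": visits}})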

4 Results and Discussion

The experimental material was gathered from the 2010 Taiwanese NHIRD, which contained all the medical data of one million people randomly sampled from the 2010 Registry for Beneficiaries of the NHIRD. These data were then linked to all of the sampled patients' medical data between 1996 and 2010, yielding 1,175,186,752 medical records. The data were separated into seven types of documents: registry for beneficiaries (ID), ambulatory care expenditures by visits (CD), inpatient expenditures by admissions (DD), expenditures for prescriptions dispensed at contracted pharmacies (GD), details of ambulatory care orders (OO), details of inpatient orders (DO), and details of prescriptions dispensed at contracted pharmacies (GO). All documents were connected using a key value. Legal access to these data can be gained only for research purposes, and an access application must be filed. Basic information about the patients is contained within the ID document; however, this document does not contain patient names, and addresses are indicated only by area. The encrypted ID numbers are used as key values to connect to the other detail documents.

MongoDB 2.2.0 was used as the database platform, and the experimental equipment comprised a Windows 7 server with a four-core CPU, 16 GB of memory, and 3 TB of storage. We used Apache, PHP, Java, Ajax, D3.js, and the Google Chrome browser as tools to design and produce the related components and programs (e.g., data restructuring and reformulation, BSON-format PaMeDocs, data importation, search and MapReduce, and Web pages). After completion, all PaMeDocs were successfully imported into MongoDB, and the diagnostic, drug, and operation codes were used as sharding-keys to establish corresponding shards and complete the related function tests.

To filter and view the results of various combined conditions, we designed a Web-based search function (Fig. 3), which consists of three parts: (1) targeted query: users can input a patient ID, ICD code or disease name, drug code or component, product name, or operation code to retrieve population proportion and distribution charts in accordance with the search conditions; (2) data mining: data mining can be conducted on specified populations, diseases, drugs, procedures, and temporal conditions to produce population distribution charts, statistics regarding comorbidities, the time interval between two query conditions, the types of drugs taken, and statistical analysis results; and (3) code description: descriptions of various codes can be searched. The search results of targeted queries and data mining also list patient information and medical records.

The sharding-key evaluation used patients with diabetes as query targets and three methods: (1) a direct search without shards; (2) a search using system-defined shards; and (3) a search using user-defined shards. These methods identified 88,601 diabetic patients in the total population of one million. The search durations for Methods 1, 2, and 3 were 791.969, 50.142, and 20.466 s, respectively, demonstrating that sharding considerably enhances search performance. Specifically, when users defined the shards, the performance increased approximately 40-fold. Inconvenient key-value settings can be avoided by using system-defined shards; however, if users can set an appropriate value, the search performance is further enhanced. According to our previous experience and analyses of NHIRD research plans, researchers and physicians most commonly explore diseases, drugs, and operation procedures, making these items suitable sharding-keys. Nevertheless, the acquisition of appropriate key values requires careful analysis to ensure that shard effectiveness is maximized.

Temporal event tracing and analysis were conducted using the case study "taking the drug Januvia might cause acute pancreatitis in diabetic patients" as an example [30]. We used the data mining function of the system and set four data-filtering conditions: diabetic population, acute pancreatitis, Januvia, and the chronological order of occurrences. The system generated various related statistical charts for researchers to reference, including population distribution charts of various aspects, statistics on chronic complications, the time interval between the occurrence of the two specific diseases (Fig. 4), drug type statistics, and days of drug use. Table 2 shows the statistical analysis results generated by the system. The odds ratio was 1.626, verifying that the results corresponded to the warning statement. In practice, such analysis is often performed gradually, extensively, and repeatedly. By contrast, the proposed method involving PaMeDocs, temporal information, and query-MapReduce-shard can rapidly yield results by focusing the data and conducting parallel searches based on the filtering conditions. In addition, filtering conditions can be set for new drugs or specific populations, and the system can perform periodic automatic statistical analyses, providing a monitoring function.
Fig. 4

Statistical charts: (a) days of drug Januvia use; (b) interval between acute pancreatitis and diabetes mellitus

Table 2

Statistical analysis results (diabetes mellitus population)

Drug Januvia   Acute pancreatitis: Yes   Acute pancreatitis: No   Sum
+              32                        2,830                    2,862
-              592                       85,147                   85,793
Total          624                       87,977                   88,601

Odds ratio: 1.626
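For reference, the odds ratio reported in Table 2 follows directly from the 2 × 2 counts:

# Odds ratio from Table 2: exposure = Januvia, outcome = acute pancreatitis,
# within the diabetic population.
a, b = 32, 2830       # Januvia users: with / without acute pancreatitis
c, d = 592, 85147     # non-users:     with / without acute pancreatitis
print(round((a / b) / (c / d), 3))   # (a * d) / (b * c) = 1.626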

The system visually represents the medical records of individual patients on a timeline. In Fig. 5, each point represents a medical record, and the point colors represent categories. The points may be expanded to reveal detailed content on diagnoses, operations, medical orders, and drugs. The filtering conditions can be set to show only the desired records (Fig. 5). Visual representations of patient cases are produced using Ajax and D3.js; specifically, because patient information is compiled in PaMeDocs, the visual representation is particularly rapid and simple. In practice, physicians can use the system as a reference for making diagnoses by tracing patient medical histories and using the data to elucidate the patients' current conditions. Furthermore, this information can provide a healthcare reference for family physicians and patients.
Fig. 5

Timeline representation of medical records: (a) patient list; (b) all medical records of one patient; and (c) desired records of the patient

5 Conclusion

Although no fundamental theory or universal data modeling technique exists for NoSQL databases, their strong expandability and flexibility make it possible to abandon the relational data model for a novel cloud database. When adopting a NoSQL storage architecture, the processing, storing, accessing, and analyzing of big healthcare data must be rapid for meaningful use to be possible and public health promotion to be achieved. Based on NoSQL, this study proposed the PaMeDoc data tree structure, in which the tree can grow both in depth and in breadth. This facilitates rapid storage and flexible expansion, rendering the proposed structure suitable for managing big healthcare data. Regarding management, using PaMeDocs can overcome the schema differences between various medical institutions, enabling flexible field alteration and addition. Concerning treatment, using PaMeDocs simplifies the timeline visual representation of patient medical histories and provides references for physicians in clinics. For research analysis, we used sharding-keys to partition the PaMeDocs, generating patient information for various populations. Data reformulation was performed in advance to generate temporal and spatial information, facilitating analyses of temporal events and interactive effects. We used cloud computing combined with query-MapReduce-shard to enhance the search performance of data mining. Although the experimental data volume was not extremely large, all of the proposed methods were verified during the test, and their superior performance was particularly evident when searching and analyzing temporal events.

References

  1. W. Hersh et al., Health-care hit or miss? Nature 470, 327–329 (2011)
  2. M. Porta, J.M. Last, A Dictionary of Epidemiology (Oxford University Press, New York, 2008)
  3. M.A. Musen, J.H. Bemmel, Handbook of Medical Informatics (Bohn Stafleu Van Loghum, Houten, 1999)
  4. E.F. Codd, A relational model of data for large shared data banks. Commun. ACM 13(6), 377–387 (1970)
  5. NoSQL Databases. Available: http://www.nosql-database.org/
  6. 10gen, MongoDB. Available: http://www.mongodb.org/
  7. National Health Insurance Research Database. Available: http://nhird.nhri.org.tw/en/index.htm
  8. National Health Insurance Administration. Available: http://www.nhi.gov.tw/english/index.aspx
  9. P.A. Bernstein et al., Future directions in DBMS research – the Laguna Beach Participants. ACM SIGMOD Record 18(1), 17–26 (1989)
  10. A. Silberschatz, S. Zdonik, Strategic directions in database systems – breaking out of the box. ACM Comput. Surv. 28(4), 764–778 (1996)
  11. G. DeCandia et al., Dynamo: Amazon's highly available key-value store. ACM SIGOPS 41(6), 205–220 (2007)
  12. F. Chang et al., Bigtable: a distributed storage system for structured data. ACM T. Comput. Syst. 26(2), art. 4 (2006)
  13. A. Lakshman, P. Malik, Cassandra: a decentralized structured storage system. ACM SIGOPS 44(2), 35–40 (2010)
  14. N. Jatana, S. Puri, M. Ahuja, I. Kathuria, D. Gosain, A survey and comparison of relational and non-relational database. Int. J. Eng. Res. Tech. 1(6) (2012)
  15. R. Cattell, Scalable SQL and NoSQL data stores. ACM SIGMOD Record 39(4), 12–27 (2010)
  16. M. Stonebraker, SQL databases v. NoSQL databases. Commun. ACM 53(4), 10–11 (2010)
  17. A.B.M. Moniruzzaman, S.A. Hossain, NoSQL database: new era of database for big data analytics – classification, characteristics and comparison. Int. J. Database Theor. App. 6(4), 1–14 (2013)
  18. I. Lungu, B.G. Tudorica, The development of a benchmark tool for NoSQL databases. Database Syst. J. 4(2), 13–20 (2013)
  19. J. Pokorny, NoSQL databases: a step to database scalability in web environment. Int. J. Web Inform. Syst. 9(1), 69–82 (2013)
  20. M. Ward, NoSQL database in the cloud: MongoDB on AWS. Amazon Web Services (2013)
  21. J. Han, E. Haihong, L. Guan, J. Du, A survey on NoSQL databases, in Int. Conf. on Pervasive Computing and Applications (ICPCA), IEEE Press, Oct 2011, pp. 363–366. doi:10.1109/ICPCA.2011.6106531
  22. C.H. Lin, P.H. Tseng, L.C. Huang, Y.J. Oyang, M.S. Wu, S.C. Chou, A multi-level cloud-based virtual health exam system on health cloud. J. Med. Biol. Eng. 33(4), 373–379 (2013)
  23. A. Pavlo, C. Curino, S. Zdonik, Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems, in ACM SIGMOD, May 2012, pp. 61–72
  24. Y. Liu, Y. Wang, Y. Jin, Research on the improvement of MongoDB auto-sharding in cloud environment, in IEEE ICCSE, July 2012, pp. 851–854
  25. J. Dean, S. Ghemawat, MapReduce: simplified data processing on large clusters. Commun. ACM 51(1), 107–113 (2008)
  26. JSON. Available: http://www.json.org/
  27. BSON. Available: http://bsonspec.org/
  28. J.J. Garrett, Ajax: a new approach to Web applications (Adaptive Path, CA, 2005)
  29. D3.js. Available: http://d3js.org/
  30.
Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Chin-Ho Lin (1)
  • Liang-Cheng Huang (1)
  • Seng-Cho T. Chou (1)
  • Chih-Ho Liu (2)
  • Han-Fang Cheng (2)
  • I-Jen Chiang (2)

  1. Department of Information Management, National Taiwan University, Taipei, Republic of China
  2. Institute of Biomedical Engineering, National Taiwan University, Taipei, Republic of China
