Use of NoSQL Database for Handling Semi Structured Data: An Empirical Study of News RSS Feeds
Abstract
The evolution of Web 2.0 has rapidly increased the volume and variety of data. Semi structured and unstructured data are among the varieties generated by different sources in Web 2.0. The challenge is to handle semi structured and unstructured data, which do not have a consistent format. Handling semi structured data, whose format varies, calls for a DBMS that is less restrictive about the structure of the stored data. This paper discusses the features, available data models and query models of NoSQL databases that are capable of handling semi structured data. The document-oriented NoSQL database MongoDB is compared with the relational database MySQL in terms of query response time. This comparison is presented as a case study on a news dataset. News items are collected from various news channels in the form of RSS feeds, which generate data in varying formats and are therefore semi structured. Handling RSS feeds with a relational database requires defining a schema and preprocessing the feeds. In contrast, data generated by these heterogeneous sources can be handled efficiently by NoSQL without any preprocessing. The comparison of MongoDB with MySQL shows that NoSQL databases are better than relational databases for semi structured data, both in constructing the database structure and in query response time.
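The contrast between schema-free storage and a fixed relational schema can be illustrated with a minimal sketch. It is not the authors' implementation: the two feed snippets, the column list `COLUMNS`, and the helper names `items_as_documents` and `document_to_row` are all hypothetical, and no database is contacted. Each RSS item is read into a dictionary that keeps exactly the fields the item carries (the shape a document store would accept as is), and is then forced into a fixed set of columns (the preprocessing a relational table would need).

```python
import xml.etree.ElementTree as ET

# Two hypothetical RSS items from different channels. The second one has
# extra fields (author, category) and no description.
FEED_A = """<rss><channel><item>
<title>Budget announced</title>
<link>http://example.com/a</link>
<description>Summary text</description>
</item></channel></rss>"""

FEED_B = """<rss><channel><item>
<title>Match report</title>
<link>http://example.com/b</link>
<author>desk@example.com</author>
<category>Sports</category>
</item></channel></rss>"""

# Assumed fixed relational schema (hypothetical column list).
COLUMNS = ("title", "link", "description", "pubDate")


def items_as_documents(xml_text):
    """Return each <item> as a dict holding exactly the fields it has."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in item}
        for item in root.iter("item")
    ]


def document_to_row(doc, columns=COLUMNS):
    """Fit a document to the fixed schema: missing columns become None,
    fields outside the schema are dropped."""
    return tuple(doc.get(col) for col in columns)


# Document view: items keep their own, differing sets of fields.
documents = items_as_documents(FEED_A) + items_as_documents(FEED_B)

# Relational view: every item is padded or truncated to the same columns.
rows = [document_to_row(doc) for doc in documents]
```

In this sketch the second item keeps its `author` and `category` fields in the document view, while the relational view drops them and pads `description` and `pubDate` with `None`, unless the schema is altered first.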
Keywords
NoSQL, MongoDB, Semi structured data, RSS feeds
Acknowledgments
The authors duly acknowledge the University of Delhi for extending its support via research grant number RC/2014/6820 and the University Grants Commission (UGC) for funding this research work via a UGC Junior Research Fellowship (JRF), Ref. No. 3492/(NET-DEC. 2012).