The VLDB Journal, Volume 25, Issue 4, pp 519–544

Mining billion-scale tensors: algorithms and discoveries

  • Inah Jeon
  • Evangelos E. Papalexakis
  • Christos Faloutsos
  • Lee Sael
  • U. Kang
Regular Paper

DOI: 10.1007/s00778-016-0427-4

Cite this article as:
Jeon, I., Papalexakis, E.E., Faloutsos, C. et al. The VLDB Journal (2016) 25: 519. doi:10.1007/s00778-016-0427-4

Abstract

How can we analyze large-scale real-world data with various attributes? Many real-world datasets with multiple attributes (e.g., network traffic logs, web data, social networks, knowledge bases, and sensor streams) are represented as multi-dimensional arrays, called tensors. Tensor decompositions are widely used to analyze such tensors in many data mining applications: detecting malicious attackers in network traffic logs (with source IP, destination IP, port number, timestamp), finding telemarketers in a phone call history (with sender, receiver, date), and identifying interesting concepts in a knowledge base (with subject, object, relation). However, current tensor decomposition methods do not scale to large and sparse real-world tensors with millions of rows, columns, and ‘fibers.’ In this paper, we propose HaTen2, a distributed method for large-scale tensor decompositions that runs on the MapReduce framework. Our careful design and implementation of HaTen2 dramatically reduce the size of intermediate data and the number of jobs, which leads to high scalability compared with the state-of-the-art method. Thanks to HaTen2, we analyze big real-world sparse tensors that cannot be handled by the current state of the art, and discover hidden concepts.

Keywords

  • Tensor
  • Distributed computing
  • Big data
  • MapReduce
  • Hadoop

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Inah Jeon (1)
  • Evangelos E. Papalexakis (2)
  • Christos Faloutsos (2)
  • Lee Sael (3)
  • U. Kang (4)
  1. LG Electronics, Seoul, Korea
  2. Computer Science Department and iLab, CMU, Pittsburgh, USA
  3. Department of Computer Science, SUNY, Incheon, Korea
  4. Department of Computer Science and Engineering, Seoul National University, Seoul, Korea
