Heuristic Ranking Classification Method for Complex Large-Scale Survival Data

  • Nasser Fard
  • Keivan Sadeghzadeh
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 360)

Abstract

Unlike traditional datasets with a few explanatory variables, datasets with a large number of explanatory variables require different analytical approaches. Identifying the effective explanatory variables, particularly in complex and large-scale data, offers an opportunity to increase efficiency and reduce costs. In large-scale data with many variables, a variable selection technique can be used to specify a subset of explanatory variables that are significantly more valuable to analyze, especially in survival data analysis. This work presents a heuristic variable selection method based on ranking classification for analyzing large-scale survival data. The method reduces redundant information and facilitates practical decision-making by evaluating variable efficiency (the correlation between a variable and survival time). A numerical simulation experiment is developed to investigate the performance of the proposed method and to validate it.
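
The idea of ranking variables by their correlation with survival time can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes fully observed (uncensored) survival times, uses the absolute Pearson correlation as a stand-in for "variable efficiency", and keeps the top k variables. The function names (pearson, rank_variables, select_top, simulate) and the simulated data are hypothetical and exist only for illustration.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no ranking information
    return sxy / math.sqrt(sxx * syy)


def rank_variables(columns, survival_times):
    """Return (name, score) pairs sorted by |correlation| with survival time.

    Assumption: every survival time is observed; censoring is ignored here.
    """
    scores = {
        name: abs(pearson(values, survival_times))
        for name, values in columns.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def select_top(columns, survival_times, k):
    """Keep the names of the k highest-ranked variables."""
    return [name for name, _ in rank_variables(columns, survival_times)[:k]]


def simulate(n=200, seed=0):
    """Hypothetical simulation: only x1 influences the survival time."""
    rng = random.Random(seed)
    columns = {
        name: [rng.gauss(0, 1) for _ in range(n)]
        for name in ("x1", "x2", "x3")
    }
    times = [math.exp(0.5 * v + 0.1 * rng.gauss(0, 1)) for v in columns["x1"]]
    return columns, times


if __name__ == "__main__":
    columns, times = simulate()
    for name, score in rank_variables(columns, times):
        print(f"{name}: {score:.3f}")
```

In this toy setting the informative variable x1 should rank first, while x2 and x3 receive scores near zero. Real survival data usually contain censored observations, which a plain correlation does not handle.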

Keywords

ranking classification, decision-making, variable selection, large-scale data, survival data

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Department of Mechanical and Industrial Engineering, Northeastern University, Boston, USA
