Large Data Matrices: Random Walk Model and Application of Entropy in HIV Mother to Child Transmission (MTCT)

  • D. M. Basavarajaiah
  • Bhamidipati Narasimha Murthy


The random walk model (RWM) is an important tool for data reduction in massive data sets (Brand M et al: IJMS 98:21–25, 1998; Abello PM, Pardalos MG, Resende C et al: Handbook of massive data sets. Kluwer Academic Publishers, Norwell, 2002; Andrijich SM, Caccetta L et al: J Nonlinear Analysis 47:5525–5536, 2001). In the RWM formulated here, variables related to HIV mother to child transmission are taken randomly, and the model is intended to reduce the massive data set without loss of information. The variables include ARV prophylaxis at the time of onset and before the birth of the neonate, prolonged breastfeeding, placental abruption, high plasma RNA viral load, drug-induced toxicity, HIV-associated illness, lower CD4 counts, WHO advanced clinical stage, and opportunistic infections (OIs). The main goal of highly active anti-retroviral therapy (HAART) is to increase longevity and sustain a good quality of life (QOL) by improving immune function. This is achieved by reducing the amount of replicating virus to as low a level as possible in all sites where HIV-infected cells are present, thereby preventing infection of new cells and further damage to the immune system. The amount of replicating virus in plasma can be assayed by measuring the concentration of HIV RNA, referred to as the viral load (VL). In practice, an increasing trend in VL has been observed in pregnant mothers who do not adhere to HAART. In addition, a plan for a salvage or second-line regimen must be considered if initial therapy fails. HIV infection is a life-threatening disease that requires lifelong treatment. Treatment follow-up records are therefore presumed to generate various parametric and non-parametric high-dimensional life data sets.
In practice, clinicians and researchers face many problems when analysing high-dimensional data sets, interpreting the results meaningfully, and taking appropriate clinical decisions about the population. Many investigators have used traditional data reduction techniques such as divergent analysis (DA), principal component analysis (PCA), and factor analysis. Many authors have reported that analytical intervention is essential for solving realistic problems of HIV progression; closing this research gap requires formulating and testing new analytical mathematical models for smoothing HIV data sets (Agrawal R, Gehrke JD, Raghaven P et al: ACM Sigmod 27(Suppl 2):94–105, 1998; Ahuja RK, Magnanti TL, Orlin JB et al: Network flows-theory on algorithm and applications, vol 185. Prentice Hall, Englewood Cliffs, pp 1851–1856, 1993; Artho P, Hirase H, Monoconduit L, Zugaro M et al: J Neurophys 92:600–608, 2004). To address these problems, the present chapter fits a random walk model to massive data sets on HIV mother to child transmission (MTCT) and explores the application of various entropy deviation methods.
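The abstract does not specify which entropy measures the chapter uses. As a minimal illustration only, and assuming plain Shannon entropy over categorical variables, the sketch below scores each variable by the entropy of its empirical distribution and ranks the variables from most to least informative. The function names, variable names, and toy records are hypothetical and are not taken from the chapter or its data.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    n = len(values)
    if n == 0:
        raise ValueError("values must be non-empty")
    counts = Counter(values)
    # H = -sum(p * log2(p)) over the observed categories.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def rank_by_entropy(table):
    """Return variable names sorted from highest to lowest entropy."""
    return sorted(table, key=lambda name: shannon_entropy(table[name]), reverse=True)


# Hypothetical toy records (illustration only, not study data).
toy = {
    "arv_prophylaxis": ["yes", "yes", "no", "no"],
    "prolonged_breastfeeding": ["yes", "yes", "yes", "no"],
    "who_stage_advanced": ["no", "no", "no", "no"],
}

if __name__ == "__main__":
    for name in rank_by_entropy(toy):
        print(name, round(shannon_entropy(toy[name]), 4))
```

A variable that takes the same value in every record has zero entropy and carries no information for distinguishing cases, which is one simple reason an entropy score can guide which variables to retain during data reduction.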


  1. Abello PM, Pardalos MG, Resende C et al (2002) Handbook of massive data sets. Kluwer Academic Publishers, Norwell
  2. Agrawal R, Gehrke JD, Raghaven P et al (1998) Automatic subspace clustering of high dimensional data for data mining. ACM Sigmod 27(Suppl 2):94–105
  3. Ahuja RK, Magnanti TL, Orlin JB et al (1993) Network flows-theory on algorithm and applications, vol 185. Prentice Hall, Englewood Cliffs, pp 1851–1856
  4. Anderson RP, Gomez-Laverde M, Peterson AT (2002) Geographical distributions of spiny pocket mice in South America: insights from predictive models. Glob Ecol Biogeogr 11:131–141
  5. Andrijich SM, Caccetta L et al (2001) Solving the multisensor data association problem. J Nonlinear Analysis 47:5525–5536
  6. Artho P, Hirase H, Monoconduit L, Zugaro M et al (2004) Characterization of neocortical principal cells and interneurons by network interactions and extracellular features. J Neurophys 92:600–608
  7. Aspinall R et al (1993) An inductive modeling procedure based on Bayes’ theorem for analysis of pattern in spatial data. Int J Geogr Inf Syst 6(2):105–121
  8. Beggs JM, Plenz D (2004) Neuronal avalanches are diverse and precise activity patterns that are stable for many hours in cortical slice cultures. J Neurosci 24:5216–5229
  9. Bellingham J, Richards A et al (2002) Receding horizon control of autonomous aerial vehicles. In: Proceedings of the American control conference, Anchorage, AK, pp 8–10
  10. Berry MJ, Lino G et al (1997) Data mining techniques for marketing, sales and customer support. Wiley, New York
  11. Blake CL, Merz CJ et al (1998) UCI repository of machine learning databases
  12. Brand M et al (1998) Pattern discovery via entropy minimization, Uncertainty 99: International workshop on artificial intelligence and statistics. IJMS 98:21–25
  13. Brown JH, Lomolino MV Biogeography, 2nd edn. Sinauer Associates, Sunderland
  14. Busby JR (1986) A biogeographical analysis of Nothofagus cunninghamii (Hook.) Oerst. in southeastern Australia. Aust J Ecol 11:1–7
  15. Chaovaliwongse W (2003) Optimization and dynamical approaches in nonlinear time series analysis with applications in Bioengineering. Ph.D. Thesis. University of Florida
  16. Cheng C, Fu AW, Zhang Y et al (1999) Entropy-based subspace clustering for mining numerical data. In: Proceedings of international conference on knowledge discovery and data mining 18:84–93
  17. Cormen T, Leiserson C, Rivest L (2001) Introduction to algorithms. MIT Press, Cambridge
  18. Dewar R et al (2003) Information theory explanation of the fluctuation theorem, maximum entropy production and self-organized criticality in non-equilibrium stationary states. J Phys A Math Gen 36:631–641
  19. Duba RO, Hart PE (1974) Pattern classification and scene analysis. Wiley-Interscience, New York
  20. Fischer Y, Wittner L, Freund TF, Gahwiler BH (2002) Simultaneous activation of gamma and theta network oscillations in rat hippocampal slice cultures. J Physiol Lond 539:857–868
  21. Fisher RA (1939) The sampling distribution of some statistics obtained from nonlinear equations. Ann Eugenics 9:238–249
  22. Guisan A, Edwards TC, Hasti T (2002) Generalized linear and generalized additive models in studies of species distributions: setting the scene. Ecol Model 157:89–100
  23. Hanley J, McNeil B (1982) The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 143:29–36
  24. Hanley J, McNeil B (1983) A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148:839–843
  25. Hershkovitz P et al (1998) Report on some sigmodontine rodents collected in southeastern Brazil with descriptions of a new genus and six new species. Bonner zoologische Beitrage 47:193–256
  26. Hirzel AH, Hausser J, Chessel D, Perrin N (2002) Ecological niche factor analysis: how to compute habitat-suitability maps without absence data? Ecology 87:2027–2036
  27. Holdridge L, Grenke W, Hatheway W, Liang T, Tosi J Jr (1997) Forest Environments. In: Tropical life zones: a pilot study. Pergamon Press, New York
  28. Hutchinson GE et al (1957) Concluding remarks. In: Cold spring harbor symposia on quantitative biology 22:415–427
  29. Phillips SJ, Dudík M, Schapire RE (2004) A maximum entropy approach to species distribution modeling. In: Proceedings of the 21st international conference on machine learning. ACM Press, New York 45:655–662
  30. Ponder WF, Carter GA, Flemons P, Chapman RR (2001) Evaluation of museum collection data for use in biodiversity assessment. Conserv Biol 15:648–657
  31. Pulliam HR et al (2000) On the relationship between niche and distribution. Ecol Lett 3:349–361
  32. Ratnaparkhi A et al (1998) Maximum entropy models for natural language ambiguity resolution. Ph.D. Thesis. University of Pennsylvania, Philadelphia
  33. Reddy S, Davalos LM et al (2003) Geographical sampling bias and its implications for conservation priorities in Africa. J Biogeogr 30:1719–1727
  34. Root T et al (1998) Environmental factors associated with avian distributional boundaries. J Biogeogr 15:489–505
  35. Stoppini L, Buchs PA, Muller D (1991) A simple method for organotypic cultures of nervous tissue. J Neurosci Methods 37:173–182
  36. Tang A, Jackson D, Hobbs J, Chen W, Smith J, Patel H, Beggs JM (2007) A second-order maximum entropy model predicts correlated network states, but not their evolution over time (Computational and systems neuroscience, Salt Lake City) Abstract II-62
  37. Wagenaar DA, Madhavan R, Pine J, Potter SM (2005) Controlling bursting with closed-loop multi-electrode stimulation. J Neurosci 25:680–688
  38. Welk E, Schubert K, Hoffmann M et al (2002) Present and potential distribution of invasive mustard (Alliara petiolata) in North America. Divers Distrib 8:219–233
  39. Wiley EO, McNyset KM, Peterson AT, Robins CR, Stewart AM (2003) Niche modeling and geographic range predictions in the marine environment using a machine-learning algorithm. Oceanography 16(3):120–127
  40. Williams PM et al (1995) Bayesian regularization and pruning using a Laplace prior. Neural Comput 7(1):117–143
  41. Wu JY, Guan L, Tsau Y (1999) Propagating activation during oscillations and evoked responses in neocortical slices. J Neurosci 19:5005–5015
  42. Zaniewski AE, Lehmann A, Overton JM (2002) Predicting species spatial distributions using presence-only data: a case study of native New Zealand ferns. Ecol Model 157:261–280
  43. Zweig MH, Campbell G (1993) Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem 39(4):561–577

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • D. M. Basavarajaiah (1)
  • Bhamidipati Narasimha Murthy (2)

  1. Department of Statistics and Computer Science, Veterinary Animal and Fisheries Sciences University, Bengaluru, India
  2. Department of Biostatistics, National Institute of Epidemiology, ICMR, Chennai, India
