Contextual Probability Estimation from Data Samples – A Generalisation

  • Hui Wang
  • Bowen Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11103)


Contextual probability (G) provides an alternative, efficient, and principled way of estimating (primary) probability (P). G is defined in terms of P in a combinatorial way, and the two have a simple linear relationship, so if one is known the other can be calculated. G can in turn be estimated from a set of data samples through a simple process called neighbourhood counting. Many results about contextual probability rest on the assumption that the event space is the power set of the sample space. In practice, however, this is usually not the case. For example, in a multidimensional sample space the event space is typically the set of hyper tuples, which is much smaller than the power set. In this paper, we generalise contextual probability to multidimensional sample spaces whose attributes may be categorical or numerical. We present results about the normalisation constant, the relationship between G and P, and the neighbourhood counting process.
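The linear relationship between G and P can be illustrated on a toy power-set event space. The sketch below is not taken from the paper. It assumes one common form of the definition, G(t) = (1/K) Σ_{E∋t} P(E)/|E| with K = Σ_E P(E), summed over all non-empty subsets E of a small finite sample space. The function names `events`, `contextual_probability` and `linear_coefficients` are hypothetical, and the generalisation to hyper tuples is not covered.

```python
from itertools import combinations
from math import comb, isclose


def events(omega):
    """Yield every non-empty subset of omega (power-set event space)."""
    items = list(omega)
    for k in range(1, len(items) + 1):
        for subset in combinations(items, k):
            yield frozenset(subset)


def contextual_probability(p):
    """Brute-force G for each point of p, under the ASSUMED definition
    G(t) = (1/K) * sum over events E containing t of P(E) / |E|."""
    all_events = list(events(p))
    prob = {e: sum(p[x] for x in e) for e in all_events}
    k_norm = sum(prob.values())  # normalisation constant K
    return {
        t: sum(prob[e] / len(e) for e in all_events if t in e) / k_norm
        for t in p
    }


def linear_coefficients(n):
    """Return (alpha, beta) with G(t) = alpha * P(t) + beta for |omega| = n.
    Derived for the assumed definition above, not quoted from the paper."""
    # a: sum of 1/|E| over events containing t
    a = sum(comb(n - 1, k - 1) / k for k in range(1, n + 1))
    # b: sum of 1/|E| over events containing both t and another fixed point
    b = sum(comb(n - 2, k - 2) / k for k in range(2, n + 1))
    k_norm = 2 ** (n - 1)  # K for the power set, since each point lies in 2^(n-1) events
    return (a - b) / k_norm, b / k_norm


# Illustrative probabilities only; not data from the paper.
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
g = contextual_probability(p)
alpha, beta = linear_coefficients(len(p))
```

Under this assumed definition, the values in `g` sum to 1 and each `g[t]` equals `alpha * p[t] + beta`, so P can be recovered from G by inverting the linear map. Replacing `p` with relative frequencies from a sample turns the same sum into a count over the events that cover both `t` and each sample point, which is the intuition behind neighbourhood counting in this toy setting.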


Probability estimation; Contextual probability; Neighbourhood counting



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Ulster University, Jordanstown, UK
  2. Mavern Securities, London, UK
