IJCRS 2018: Rough Sets pp 337-349

# Contextual Probability Estimation from Data Samples – A Generalisation

• Hui Wang
• Bowen Wang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11103)

## Abstract

Contextual probability (G) provides an alternative, efficient, and principled way of estimating (primary) probability (P). G is defined in terms of P in a combinatorial way, and the two have a simple linear relationship; consequently, if one is known, the other can be calculated. It turns out that G can be estimated from a set of data samples through a simple process called neighbourhood counting. Many results about contextual probability rely on the assumption that the event space is the power set of the sample space. However, this is usually not the case in the real world. For example, in a multidimensional sample space, the event space is typically the set of hyper tuples, which is much smaller than the power set. In this paper, we generalise contextual probability to multidimensional sample spaces where the attributes may be categorical or numerical. We present results about the normalisation constant, the relationship between G and P, and the neighbourhood counting process.

## Keywords

Probability estimation · Contextual probability · Neighbourhood counting

## Copyright information

© Springer Nature Switzerland AG 2018

## Authors and Affiliations

1. Ulster University, Jordanstown, UK
2. Mavern Securities, London, UK