Integration of Data-Space and Statistics-Space Boundary-Based Test to Control the False Positive Rate

  • Jin-Xiong Lv
  • Shikui Tu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10956)


Many multivariate statistical methods have been applied to detect differences between case and control populations. However, it is difficult to control the false positive rate, especially when the sample size is small. Traditional family-wise error rate or false discovery rate procedures adjust the p values based on the distribution or ranks of the p values within the same set of multiple tests. In this paper, we investigated the performance of integrating the Data-space boundary-based test (BBT) and the Statistics-space BBT to control the false positive rate, under a previously proposed framework called Integrative Hypothesis Tests (IHT). The classification accuracy obtained from the Data-space BBT provides valuable information that complements the p value from the Statistics-space BBT. The simulation results demonstrated that the integration effectively controls the false positive rate even in small-sample-size cases. Experiments on a real-world bipolar disorder dataset also validated the effectiveness of the integration.
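The idea of pairing a data-space accuracy with a statistics-space p value can be illustrated with a minimal sketch. This is not the authors' BBT implementation: the squared distance between group means stands in for the statistics-space test, a leave-one-out nearest-centroid classifier stands in for the data-space boundary, and the function names and the thresholds `alpha` and `min_accuracy` are hypothetical choices made only for illustration.

```python
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_gap(cases, controls):
    """Statistics-space stand-in: squared distance between group means."""
    return sq_dist(centroid(cases), centroid(controls))


def permutation_p_value(cases, controls, n_perm=500, seed=0):
    """Permutation p value of mean_gap under random relabelling."""
    rng = random.Random(seed)
    observed = mean_gap(cases, controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if mean_gap(pooled[:n_cases], pooled[n_cases:]) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def loo_accuracy(cases, controls):
    """Data-space stand-in: leave-one-out accuracy of a nearest-centroid rule."""
    labelled = [(row, 1) for row in cases] + [(row, 0) for row in controls]
    correct = 0
    for i, (row, label) in enumerate(labelled):
        rest = labelled[:i] + labelled[i + 1:]
        c1 = centroid([r for r, lab in rest if lab == 1])
        c0 = centroid([r for r, lab in rest if lab == 0])
        predicted = 1 if sq_dist(row, c1) < sq_dist(row, c0) else 0
        correct += predicted == label
    return correct / len(labelled)


def integrated_call(cases, controls, alpha=0.05, min_accuracy=0.7,
                    n_perm=500, seed=0):
    """Report a difference only when both criteria agree (hypothetical rule)."""
    p_value = permutation_p_value(cases, controls, n_perm, seed)
    accuracy = loo_accuracy(cases, controls)
    return p_value, accuracy, p_value <= alpha and accuracy >= min_accuracy


# Small synthetic example: 15 cases shifted away from 15 controls.
rng = random.Random(1)
cases = [[rng.gauss(2.0, 1.0) for _ in range(3)] for _ in range(15)]
controls = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(15)]
print(integrated_call(cases, controls))
```

Requiring both a small p value and a high classification accuracy makes the combined rule at least as conservative as either criterion alone, which is one simple way the two sources of evidence could be combined; the paper's actual integration rule may differ.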


Integrative Hypothesis Test; Boundary-based test; False positive rate; Multivariate statistical method; Joint-SNVs analysis; Bipolar disorder



This work was supported by a grant from Shanghai Jiao Tong University (No. WF220403029).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Computer Science and Engineering, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai, China
