Measuring Reproducibility of High-Throughput Deep-Sequencing Experiments Based on Self-adaptive Mixture Copula

  • Qian Zhang
  • Junping Zhang
  • Chenghai Xue
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7818)


Measuring the statistical reproducibility between replicates of a biological experiment is a vital first step in any bioinformatics analysis that aims to mine meaningful biological discoveries from large-scale data. To distinguish biologically relevant signals from artifacts, the irreproducible discovery rate (IDR) has been put forth; it employs a copula, which separates the dependence structure of the data from its marginal distributions. However, IDR uses a Gaussian copula, which may cause underestimation of risk and limit the robustness of the method. To address this issue, we propose a Self-adaptive Mixture Copula (SaMiC) to measure the reproducibility of experiment replicates from high-throughput deep-sequencing data. Simple and easy to implement, SaMiC self-adaptively tunes its coefficients so that the measurement of reproducibility is more effective for general distributions. Experiments on simulated and real data indicate that, compared with IDR, SaMiC better estimates reproducibility between replicate samples.
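The abstract does not spell out how the mixture is built or tuned, so the following is only a minimal sketch of the general idea of a mixture copula with adaptively estimated coefficients, not the SaMiC algorithm itself. It assumes three component families (Clayton, Gumbel, Frank) with fixed, hand-picked parameters, and it estimates only the mixing weights with a standard EM update. The function names, the choice of families, and the parameter values are all illustrative assumptions. Replicate scores are first rank-transformed, which is how a copula model sets the marginal distributions aside and keeps only the dependence structure.

```python
import numpy as np


def pseudo_observations(x):
    """Rank-transform a sample into (0, 1), discarding its marginal distribution."""
    ranks = np.argsort(np.argsort(x)) + 1
    return ranks / (len(x) + 1.0)


def clayton_density(u, v, theta):
    """Clayton copula density (theta > 0); has lower-tail dependence."""
    s = u ** -theta + v ** -theta - 1.0
    return (1.0 + theta) * (u * v) ** (-(1.0 + theta)) * s ** (-(2.0 + 1.0 / theta))


def gumbel_density(u, v, theta):
    """Gumbel copula density (theta >= 1); has upper-tail dependence."""
    x, y = -np.log(u), -np.log(v)
    s = x ** theta + y ** theta
    a = s ** (1.0 / theta)
    return (np.exp(-a) / (u * v) * (x * y) ** (theta - 1.0)
            * s ** (1.0 / theta - 2.0) * (a + theta - 1.0))


def frank_density(u, v, theta):
    """Frank copula density (theta > 0 here); has no tail dependence."""
    g = 1.0 - np.exp(-theta)
    num = theta * g * np.exp(-theta * (u + v))
    den = (g - (1.0 - np.exp(-theta * u)) * (1.0 - np.exp(-theta * v))) ** 2
    return num / den


def fit_mixture_weights(u, v, densities, n_iter=200):
    """EM for the mixing weights only; component parameters stay fixed (assumption)."""
    dens = np.column_stack([c(u, v) for c in densities])  # shape (n, K)
    weights = np.full(dens.shape[1], 1.0 / dens.shape[1])
    loglik = []
    for _ in range(n_iter):
        mix = dens @ weights                      # mixture density at each point
        loglik.append(np.log(mix).sum())
        resp = dens * weights / mix[:, None]      # E-step: responsibilities
        weights = resp.mean(axis=0)               # M-step: updated weights
    return weights, loglik


# Hypothetical replicate scores: a Clayton-dependent pair pushed through
# arbitrary monotone transforms so that the two margins differ.
rng = np.random.default_rng(0)
n, theta = 500, 2.0
u0, t = rng.uniform(size=n), rng.uniform(size=n)
v0 = (u0 ** -theta * (t ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
rep1, rep2 = np.exp(3.0 * u0), v0 ** 3

components = [
    lambda p, q: clayton_density(p, q, 2.0),
    lambda p, q: gumbel_density(p, q, 2.0),
    lambda p, q: frank_density(p, q, 5.0),
]
weights, loglik = fit_mixture_weights(
    pseudo_observations(rep1), pseudo_observations(rep2), components
)
print(dict(zip(["clayton", "gumbel", "frank"], np.round(weights, 3))))
```

Mixing families with different tail behaviour is one way to avoid committing to a single dependence shape such as the Gaussian copula, which has no tail dependence. A fuller treatment would also estimate each component's parameter rather than fixing it.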


Marginal Distribution, Dependence Structure, Tail Dependence, Copula Model, Gaussian Copula
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Qian Zhang (1)
  • Junping Zhang (1)
  • Chenghai Xue (2)
  1. Shanghai Key Lab of Intelligent Information Processing, School of Computer Science, Fudan University, China
  2. Cold Spring Harbor Laboratory, USA
