Abstract
Intrinsically disordered proteins are abundant in nature and can be accurately identified from their sequences using computational predictors. Although predictions of disorder are relatively easy to obtain, there are no tools to assess their quality for a particular amino acid or protein: quality assessment (QA) scores that quantify the correctness of the predictions are not available. We define QA for the prediction of intrinsic disorder and use a large dataset of over 25 thousand proteins and ten modern predictors of disorder to empirically assess the first approach to quantifying QA scores. We formulate the QA scores based on the readily available propensities for intrinsic disorder generated by the ten methods. Our evaluation reveals that these QA scores offer good predictive performance for native structured residues (AUC > 0.74) but poor predictive performance for native disordered residues (AUC < 0.67). Specifically, we show that most of the native disordered residues that are incorrectly predicted as structured have high QA values, which wrongly suggest that these predictions are correct. Consequently, more research is needed to develop high-quality QA scores. We also outline three possible directions for future research.
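The evaluation summarized above can be illustrated with a minimal sketch. The abstract does not give the exact QA formulation, so everything here is an assumption made for illustration: the QA score is taken to be the normalized distance of a predictor's propensity from a binarization threshold of 0.5, the function names (`qa_score`, `auc`, `qa_auc_by_native_state`) are hypothetical, and the propensities are made-up toy values rather than data from the chapter. The AUC measures how well the QA score separates correct from incorrect predictions, computed separately for native disordered and native structured residues.

```python
def qa_score(propensity, threshold=0.5):
    """Hypothetical QA score in [0, 1]: how far the propensity lies from the
    binarization threshold, normalized by the distance to the nearer end of
    the [0, 1] range on that side. This is an assumed formulation."""
    span = (1.0 - threshold) if propensity >= threshold else threshold
    return abs(propensity - threshold) / span if span > 0 else 0.0


def auc(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive item has the higher score; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def qa_auc_by_native_state(propensities, native_disordered, threshold=0.5):
    """AUC of the QA score for predicting whether a disorder prediction is
    correct, reported separately for native disordered and structured residues."""
    result = {}
    for state, name in ((True, "disordered"), (False, "structured")):
        scores, correct = [], []
        for p, is_disordered in zip(propensities, native_disordered):
            if is_disordered != state:
                continue
            predicted_disordered = p >= threshold
            scores.append(qa_score(p, threshold))
            correct.append(predicted_disordered == is_disordered)
        result[name] = auc(scores, correct)
    return result


# Toy data (invented for illustration only): five native disordered residues
# followed by six native structured residues.
toy_propensities = [0.95, 0.7, 0.15, 0.1, 0.4,
                    0.05, 0.1, 0.2, 0.35, 0.55, 0.6]
toy_native_disordered = [True] * 5 + [False] * 6

result = qa_auc_by_native_state(toy_propensities, toy_native_disordered)
print(result)
```

In the toy data, two disordered residues are confidently mispredicted as structured (propensities 0.15 and 0.1), so they receive high QA values despite being wrong. This lowers the AUC for disordered residues relative to structured ones, which mirrors the failure mode the abstract describes.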
References
Dunker, A.K., Babu, M.M., Barbar, E., Blackledge, M., Bondos, S.E., Dosztányi, Z., Dyson, H.J., Forman-Kay, J., Fuxreiter, M., Gsponer, J., Han, K.-H., Jones, D.T., Longhi, S., Metallo, S.J., Nishikawa, K., Nussinov, R., Obradovic, Z., Pappu, R.V., Rost, B., Selenko, P., Subramaniam, V., Sussman, J.L., Tompa, P., Uversky, V.N.: What’s in a name? Why these proteins are intrinsically disordered. Intrinsically Disord. Proteins 1, e24157 (2013)
van der Lee, R., Buljan, M., Lang, B., Weatheritt, R.J., Daughdrill, G.W., Dunker, A.K., Fuxreiter, M., Gough, J., Gsponer, J., Jones, D.T., Kim, P.M., Kriwacki, R.W., Oldfield, C.J., Pappu, R.V., Tompa, P., Uversky, V.N., Wright, P.E., Babu, M.M.: Classification of intrinsically disordered regions and proteins. Chem. Rev. 114, 6589–6631 (2014)
Peng, Z., Yan, J., Fan, X., Mizianty, M.J., Xue, B., Wang, K., Hu, G., Uversky, V.N., Kurgan, L.: Exceptionally abundant exceptions: comprehensive characterization of intrinsic disorder in all domains of life. Cell. Mol. Life Sci. 72, 137–151 (2015)
Xue, B., Dunker, A.K., Uversky, V.N.: Orderly order in protein intrinsic disorder distribution: disorder in 3500 proteomes from viruses and the three domains of life. J. Biomol. Struct. Dyn. 30, 137–149 (2012)
Ward, J.J., Sodhi, J.S., McGuffin, L.J., Buxton, B.F., Jones, D.T.: Prediction and functional analysis of native disorder in proteins from the three kingdoms of life. J. Mol. Biol. 337, 635–645 (2004)
Fuxreiter, M., Toth-Petroczy, A., Kraut, D.A., Matouschek, A., Lim, R.Y., Xue, B., Kurgan, L., Uversky, V.N.: Disordered proteinaceous machines. Chem. Rev. 114, 6806–6843 (2014)
Xue, B., Blocquel, D., Habchi, J., Uversky, A.V., Kurgan, L., Uversky, V.N., Longhi, S.: Structural disorder in viral proteins. Chem. Rev. 114, 6880–6911 (2014)
Kozlowski, L.P., Bujnicki, J.M.: MetaDisorder: a meta-server for the prediction of intrinsic disorder in proteins. BMC Bioinform. 13, 1–11 (2012)
Peng, Z., Oldfield, C.J., Xue, B., Mizianty, M.J., Dunker, A.K., Kurgan, L., Uversky, V.N.: A creature with a hundred waggly tails: intrinsically disordered proteins in the ribosome. Cell. Mol. Life Sci. 71, 1477–1504 (2014)
Xue, B., Mizianty, M.J., Kurgan, L., Uversky, V.N.: Protein intrinsic disorder as a flexible armor and a weapon of HIV-1. Cell. Mol. Life Sci. 69, 1211–1259 (2012)
Pentony, M.M., Jones, D.T.: Modularity of intrinsic disorder in the human proteome. Proteins 78, 212–221 (2010)
Wang, C., Uversky, V.N., Kurgan, L.: Disordered nucleiome: abundance of intrinsic disorder in the DNA- and RNA-binding proteins in 1121 species from Eukaryota, Bacteria and Archaea. Proteomics 16, 1486–1498 (2016)
Peng, Z., Xue, B., Kurgan, L., Uversky, V.N.: Resilience of death: intrinsic disorder in proteins involved in the programmed cell death. Cell Death Differ. 20, 1257–1267 (2013)
Oldfield, C.J., Xue, B., Van, Y.Y., Ulrich, E.L., Markley, J.L., Dunker, A.K., Uversky, V.N.: Utilization of protein intrinsic disorder knowledge in structural proteomics. Biochim. Biophys. Acta 1834, 487–498 (2013)
Potenza, E., Di Domenico, T., Walsh, I., Tosatto, S.C.E.: MobiDB 2.0: an improved database of intrinsically disordered and mobile proteins. Nucleic Acids Res. 43, D315–D320 (2015)
Di Domenico, T., Walsh, I., Martin, A.J.M., Tosatto, S.C.E.: MobiDB: a comprehensive database of intrinsic protein disorder annotations. Bioinformatics 28, 2080–2081 (2012)
Oates, M.E., Romero, P., Ishida, T., Ghalwash, M., Mizianty, M.J., Xue, B., Dosztányi, Z., Uversky, V.N., Obradovic, Z., Kurgan, L., Dunker, A.K., Gough, J.: D2P2: database of disordered protein predictions. Nucleic Acids Res. 41, D508–D516 (2013)
Deng, X., Eickholt, J., Cheng, J.: A comprehensive overview of computational protein disorder prediction methods. Mol. BioSyst. 8, 114–121 (2012)
Monastyrskyy, B., Fidelis, K., Moult, J., Tramontano, A., Kryshtafovych, A.: Evaluation of disorder predictions in CASP9. Proteins 79(Suppl 10), 107–118 (2011)
Monastyrskyy, B., Kryshtafovych, A., Moult, J., Tramontano, A., Fidelis, K.: Assessment of protein disorder region predictions in CASP10. Proteins 82(Suppl 2), 127–137 (2014)
Peng, Z.L., Kurgan, L.: Comprehensive comparative assessment of in-silico predictors of disordered regions. Curr. Protein Pept. Sci. 13, 6–18 (2012)
Walsh, I., Giollo, M., Di Domenico, T., Ferrari, C., Zimmermann, O., Tosatto, S.C.: Comprehensive large-scale assessment of intrinsic protein disorder. Bioinformatics 31, 201–208 (2015)
Noivirt-Brik, O., Prilusky, J., Sussman, J.L.: Assessment of disorder predictions in CASP8. Proteins 77(Suppl 9), 210–216 (2009)
Kihara, D., Chen, H., Yang, Y.D.: Quality assessment of protein structure models. Curr. Protein Pept. Sci. 10, 216–228 (2009)
Skwark, M.J., Elofsson, A.: PconsD: ultra rapid, accurate model quality assessment for protein structure prediction. Bioinformatics 29, 1817–1818 (2013)
McGuffin, L.J., Buenavista, M.T., Roche, D.B.: The ModFOLD4 server for the quality assessment of 3D protein models. Nucleic Acids Res. 41, W368–W372 (2013)
Cao, R., Bhattacharya, D., Adhikari, B., Li, J., Cheng, J.: Massive integration of diverse protein quality assessment methods to improve template based modeling in CASP11. Proteins 84(Suppl 1), 247–259 (2016)
Kryshtafovych, A., Fidelis, K.: Protein structure prediction and model quality assessment. Drug Discov. Today 14, 386–393 (2009)
UniProt Consortium: UniProt: a hub for protein information. Nucleic Acids Res. 43, D204–D212 (2015)
Fu, L., Niu, B., Zhu, Z., Wu, S., Li, W.: CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012)
Walsh, I., Martin, A.J., Di Domenico, T., Tosatto, S.C.: ESpritz: accurate and fast prediction of protein disorder. Bioinformatics 28, 503–509 (2012)
Sickmeier, M., Hamilton, J.A., LeGall, T., Vacic, V., Cortese, M.S., Tantos, A., Szabo, B., Tompa, P., Chen, J., Uversky, V.N., Obradovic, Z., Dunker, A.K.: DisProt: the database of disordered proteins. Nucleic Acids Res. 35, D786–D793 (2007)
Dosztányi, Z., Csizmok, V., Tompa, P., Simon, I.: IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content. Bioinformatics 21, 3433–3434 (2005)
Linding, R., Jensen, L.J., Diella, F., Bork, P., Gibson, T.J., Russell, R.B.: Protein disorder prediction: implications for structural proteomics. Structure 11, 1453–1459 (2003)
Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E.: The protein data bank. Nucleic Acids Res. 28, 235–242 (2000)
Yang, Z.R., Thomson, R., McNeil, P., Esnouf, R.M.: RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins. Bioinformatics 21, 3369–3376 (2005)
Peng, K., Radivojac, P., Vucetic, S., Dunker, A.K., Obradovic, Z.: Length-dependent prediction of protein intrinsic disorder. BMC Bioinform. 7, 208 (2006)
Linding, R., Russell, R.B., Neduva, V., Gibson, T.J.: GlobPlot: exploring protein sequences for globularity and disorder. Nucleic Acids Res. 31, 3701–3708 (2003)
Acknowledgments
We thank Dr. Silvio Tosatto and his research group from the University of Padova for sharing their dataset and predictions of disorder, which they published in ref. [22]. This research was supported in part by the National Science Foundation grant 1617369 and by the Qimonda Endowed Chair from Virginia Commonwealth University to L.K.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Wu, Z., Hu, G., Wang, K., Kurgan, L. (2017). Exploratory Analysis of Quality Assessment of Putative Intrinsic Disorder in Proteins. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2017. Lecture Notes in Computer Science, vol 10245. Springer, Cham. https://doi.org/10.1007/978-3-319-59063-9_65
DOI: https://doi.org/10.1007/978-3-319-59063-9_65
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59062-2
Online ISBN: 978-3-319-59063-9
eBook Packages: Computer Science, Computer Science (R0)