Sequence-Based Prediction of Protein Secretion Success in Aspergillus niger

  • Bastiaan A. van den Berg
  • Jurgen F. Nijkamp
  • Marcel J. T. Reinders
  • Liang Wu
  • Herman J. Pel
  • Johannes A. Roubos
  • Dick de Ridder
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6282)

Abstract

The cell factory Aspergillus niger is widely used for industrial enzyme production. To select candidate proteins for large-scale production, we developed a sequence-based classifier that predicts whether an over-expressed homologous protein will be successfully produced and secreted. A dataset of 638 proteins was used to train and validate the classifier under a 10-fold cross-validation protocol. With a linear discriminant classifier, an average accuracy of 0.85 was achieved. Feature selection results indicate which features contribute most to successful protein production, which could be a useful lead for coupling sequence characteristics to the biological processes involved in protein production and secretion.
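As a rough illustration of the protocol described above, the following sketch computes amino acid composition features from sequences and estimates accuracy by 10-fold cross-validation. It rests on several assumptions: the abstract does not list the features used, so amino acid composition is only a plausible stand-in; a nearest-mean classifier (a linear decision rule that assumes identity covariance) replaces the linear discriminant classifier to keep the example dependency-free; and the sequences are synthetic toy data generated with arbitrary residue biases, not the 638-protein dataset. All function names are hypothetical.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Return the 20-dimensional amino acid frequency vector of a sequence."""
    counts = [seq.count(a) for a in AMINO_ACIDS]
    total = sum(counts) or 1  # guard against empty sequences
    return [c / total for c in counts]

def kfold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def class_means(X, y):
    """Mean feature vector per class label."""
    means = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return means

def predict(means, x):
    """Assign x to the class with the nearest mean (a linear decision rule)."""
    def sqdist(m):
        return sum((a - b) ** 2 for a, b in zip(x, m))
    return min(means, key=lambda label: sqdist(means[label]))

def cross_validate(X, y, k=10, seed=0):
    """Average accuracy over k folds; each fold is held out exactly once."""
    folds = kfold_indices(len(X), k, random.Random(seed))
    accs = []
    for test in folds:
        held = set(test)
        train = [i for i in range(len(X)) if i not in held]
        means = class_means([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(means, X[i]) == y[i] for i in test)
        accs.append(hits / len(test))
    return sum(accs) / k

def toy_sequence(rng, enriched, length=60):
    """Synthetic sequence whose composition is biased toward `enriched`."""
    weights = [3 if a in enriched else 1 for a in AMINO_ACIDS]
    return "".join(rng.choices(AMINO_ACIDS, weights=weights, k=length))

# Toy data only: the two residue biases are arbitrary, not biological claims.
rng = random.Random(1)
seqs = [toy_sequence(rng, "STN") for _ in range(50)]
seqs += [toy_sequence(rng, "KRE") for _ in range(50)]
labels = [1] * 50 + [0] * 50  # 1 = "secreted", 0 = "not secreted" (toy labels)

features = [composition(s) for s in seqs]
mean_accuracy = cross_validate(features, labels, k=10)
print(f"mean 10-fold accuracy on toy data: {mean_accuracy:.2f}")
```

The class means are re-estimated inside each fold, so no information from the held-out proteins reaches the classifier. The same care would apply to any feature selection step, which would also need to be repeated within each training fold.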

Keywords

Aspergillus niger, protein secretion, sequence-based prediction, classification

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Bastiaan A. van den Berg (1, 2, 4)
  • Jurgen F. Nijkamp (1, 4)
  • Marcel J. T. Reinders (1, 2, 4)
  • Liang Wu (3)
  • Herman J. Pel (3)
  • Johannes A. Roubos (3)
  • Dick de Ridder (1, 2, 4)

  1. The Delft Bioinformatics Lab, Delft University of Technology, The Netherlands
  2. Netherlands Bioinformatics Centre, The Netherlands
  3. DSM Biotechnology Center, The Netherlands
  4. Kluyver Centre for Genomics of Industrial Fermentation, The Netherlands
