Abstract
In 2001, the release of the first draft of the human genome marked the beginning of the Big Data era for the biological sciences. Since then, the complexity of datasets generated by laboratories worldwide has increased exponentially. Public repositories such as the Protein Data Bank, which exceeded 200,000 entries in 2023, have been instrumental not only in collecting, organizing, and distilling this enormous research output but also in promoting further research. The achievements of artificial intelligence programs such as AlphaFold would not have been possible without the collective efforts of countless researchers who made their work publicly available. Here, I provide a practical, but far from exhaustive, list of resources useful for investigating protease function.
References
Lander ES, Linton LM, Birren B et al (2001) Initial sequencing and analysis of the human genome. Nature 409(6822):860–921
Zhou Y, Zhou B, Pache L et al (2019) Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat Commun 10(1):1523
Ochoa D, Hercules A, Carmona M et al (2023) The next-generation open targets platform: reimagined, redesigned, rebuilt. Nucleic Acids Res 51(D1):D1353–D1359
Landrum MJ, Chitipiralla S, Brown GR et al (2020) ClinVar: improvements to accessing data. Nucleic Acids Res 48(D1):D835–D844
Tate JG, Bamford S, Jubb HC et al (2019) COSMIC: the catalogue of somatic mutations in cancer. Nucleic Acids Res 47(D1):D941–D947
Grossman RL, Heath AP, Ferretti V et al (2016) Toward a shared vision for cancer genomic data. N Engl J Med 375(12):1109–1112
Groza T, Gomez FL, Mashhadi HH et al (2023) The International Mouse Phenotyping Consortium: comprehensive knockout phenotyping underpinning the study of human disease. Nucleic Acids Res 51(D1):D1038–D1045
Krupke DM, Begley DA, Sundberg JP et al (2017) The mouse tumor biology database: a comprehensive resource for mouse models of human cancer. Cancer Res 77(21):e67–e70
Uhlén M, Fagerberg L, Hallström BM et al (2015) Proteomics. Tissue-based map of the human proteome. Science 347(6220):1260419
GTEx Consortium (2013) The Genotype-Tissue Expression (GTEx) project. Nat Genet 45(6):580–585
Chang A, Jeske L, Ulbrich S et al (2021) BRENDA, the ELIXIR core data resource in 2021: new developments and updates. Nucleic Acids Res 49(D1):D498–D508
Rawlings ND, Barrett AJ, Thomas PD et al (2018) The MEROPS database of proteolytic enzymes, their substrates and inhibitors in 2017 and a comparison with peptidases in the PANTHER database. Nucleic Acids Res 46(D1):D624–D632
Santamaria S, Buemi F, Nuti E et al (2021) Development of a fluorogenic ADAMTS-7 substrate. J Enzyme Inhib Med Chem 36(1):2160–2169
Santamaria S, Nagase H (2018) Measurement of protease activities using fluorogenic substrates. Methods Mol Biol 1731:107–122
Cuffaro D, Ciccone L, Rossello A et al (2022) Targeting aggrecanases for osteoarthritis therapy: from zinc chelation to exosite inhibition. J Med Chem 65(20):13505–13532
Schechter I, Berger A (1967) On the size of the active site in proteases. I. Papain. Biochem Biophys Res Commun 27(2):157–162
Colaert N, Helsens K, Martens L et al (2009) Improved visualization of protein consensus sequences by iceLogo. Nat Methods 6(11):786–787
Crooks GE, Hon G, Chandonia JM et al (2004) WebLogo: a sequence logo generator. Genome Res 14(6):1188–1190
Pérez-Silva JG, Español Y, Velasco G et al (2016) The Degradome database: expanding roles of mammalian proteases in life and disease. Nucleic Acids Res 44(D1):D351–D355
Thomas PD, Ebert D, Muruganujan A et al (2022) PANTHER: making genome-scale phylogenetics accessible to all. Protein Sci 31(1):8–22
Duvaud S, Gabella C, Lisacek F et al (2021) Expasy, the Swiss Bioinformatics Resource Portal, as designed by its users. Nucleic Acids Res 49(W1):W216–W227
Gupta R, Brunak S (2002) Prediction of glycosylation across the human proteome and the correlation to protein function. Pac Symp Biocomput:310–322
Taherzadeh G, Dehzangi A, Golchin M et al (2019) SPRINT-Gly: predicting N- and O-linked glycosylation sites of human and mouse proteins by using sequence and predicted structural properties. Bioinformatics 35(20):4140–4146
Blom N, Gammeltoft S, Brunak S (1999) Sequence and structure-based prediction of eukaryotic protein phosphorylation sites. J Mol Biol 294(5):1351–1362
Madeira F, Pearce M, Tivey ARN et al (2022) Search and sequence analysis tools services from EMBL-EBI in 2022. Nucleic Acids Res 50(W1):W276–W279
Klein J, Eales J, Zürbig P et al (2013) Proteasix: a tool for automated and large-scale prediction of proteases involved in naturally occurring peptide generation. Proteomics 13(7):1077–1082
Burley SK, Bhikadiya C, Bi C et al (2023) RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million computed structure models of proteins from artificial intelligence/machine learning. Nucleic Acids Res 51(D1):D488–D508
Pettersen EF, Goddard TD, Huang CC et al (2004) UCSF Chimera – a visualization system for exploratory research and analysis. J Comput Chem 25(13):1605–1612
Yang J, Zhang Y (2015) I-TASSER server: new development for protein structure and function predictions. Nucleic Acids Res 43(W1):W174–W181
Kryshtafovych A, Schwede T, Topf M et al (2019) Critical assessment of methods of protein structure prediction (CASP)-Round XIII. Proteins 87(12):1011–1020
Zheng W, Zhang C, Li Y et al (2021) Folding non-homology proteins by coupling deep-learning contact maps with I-TASSER assembly simulations. Cell Rep Methods 1:100014
Jumper J, Evans R, Pritzel A et al (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596(7873):583–589
Bienert S, Waterhouse A, de Beer TA et al (2017) The SWISS-MODEL Repository-new features and functionality. Nucleic Acids Res 45(D1):D313–D319
Oughtred R, Rust J, Chang C et al (2021) The BioGRID database: a comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci 30(1):187–200
Orchard S, Ammari M, Aranda B et al (2014) The MIntAct project – IntAct as a common curation platform for 11 molecular interaction databases. Nucleic Acids Res 42(Database issue):D358–D363
Szklarczyk D, Kirsch R, Koutrouli M et al (2023) The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res 51(D1):D638–D646
Fortelny N, Yang S, Pavlidis P et al (2015) Proteome TopFIND 3.0 with TopFINDer and PathFINDer: database and analysis tools for the association of protein termini to pre- and posttranslational events. Nucleic Acids Res 43(D1):D290–D297
Naba A (2023) Ten years of extracellular matrix proteomics: accomplishments, challenges, and future perspectives. Mol Cell Proteomics 22(4):100528
Shao X, Gomez CD, Kapoor N et al (2023) MatrisomeDB 2.0: 2023 updates to the ECM-protein knowledge database. Nucleic Acids Res 51(D1):D1519–D1530
Clerc O, Deniaud M, Vallet SD et al (2019) MatrixDB: integration of new data with a focus on glycosaminoglycan interactions. Nucleic Acids Res 47(D1):D376–D381
Kontio J, Soñora VR, Pesola V et al (2022) Analysis of extracellular matrix network dynamics in cancer using the MatriNet database. Matrix Biol 110:141–150
Acknowledgments
The Santamaria Lab is supported by the British Heart Foundation (FS/IBSRF/20/25032).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Santamaria, S. (2024). Web-Based Resources to Investigate Protease Function. In: Santamaria, S. (eds) Proteases and Cancer. Methods in Molecular Biology, vol 2747. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-3589-6_1
DOI: https://doi.org/10.1007/978-1-0716-3589-6_1
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-3588-9
Online ISBN: 978-1-0716-3589-6
eBook Packages: Springer Protocols