Search for proteins with similarity to the CFTR R domain using an optimized RDBMS solution, mBioSQL
Cite this article as: Hegedűs, T. & Riordan, J.R. Cent. Eur. J. Biol. (2006) 1: 29. doi:10.2478/s11535-006-0003-9
The cystic fibrosis transmembrane conductance regulator (CFTR) comprises ATP-binding and transmembrane domains, and a unique regulatory (R) domain not found in other ATP-binding cassette proteins. Phosphorylation of the R domain at different sites by PKA and PKC is obligatory for the chloride channel function of CFTR. Sequence similarity searches with the R domain were uninformative. Furthermore, R domains from different species show low sequence similarity. Since these R domains resemble each other only in the location of the phosphorylation sites, we generated different R domain patterns in which the amino acids between these sites were masked. Because of the large number of generated patterns, we expected a large number of matches from the UniProt database. Therefore, a relational database management system (RDBMS) was set up to handle the results. During software development, our system grew into a general package, which we term Modular BioSQL (mBioSQL). It has higher performance than other solutions and provides a general method for storing biological result sets in an RDBMS, allowing convenient further analysis. Application of this approach revealed that the R domain phosphorylation pattern is most similar to those in nuclear proteins, including transcription and splicing factors.
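The masking idea described above can be illustrated with a minimal Python sketch. It is not the mBioSQL implementation: the standard `re` and `sqlite3` modules stand in for the pattern search and the RDBMS, and the function names (`masked_pattern`, `store_matches`), the `hit` table layout, and the toy sequences are all hypothetical, chosen only for illustration.

```python
import re
import sqlite3


def masked_pattern(sequence, keep_positions):
    """Return a regex that keeps the residues at keep_positions (0-based)
    and replaces every residue between them with the wildcard '.'."""
    keep = set(keep_positions)
    first, last = min(keep), max(keep)
    return "".join(
        re.escape(sequence[i]) if i in keep else "."
        for i in range(first, last + 1)
    )


def store_matches(conn, pattern, proteins):
    """Search each protein for the pattern and store every hit in a table.
    The table name and columns are assumptions, not the mBioSQL schema."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hit "
        "(pattern TEXT, accession TEXT, start INTEGER)"
    )
    regex = re.compile(pattern)
    rows = [
        (pattern, accession, match.start())
        for accession, seq in proteins.items()
        for match in regex.finditer(seq)
    ]
    conn.executemany("INSERT INTO hit VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Invented toy data: two R-R-x-S motifs (a PKA-like consensus) in a short
# sequence; these are not real CFTR residues or UniProt entries.
toy_domain = "AARRKSLLRRASGG"
site_positions = [2, 3, 5, 8, 9, 11]
pattern = masked_pattern(toy_domain, site_positions)

toy_proteins = {"P1": "MMRRGSPPRRTSQQ", "P2": "MKLVRRAS"}
conn = sqlite3.connect(":memory:")
n_hits = store_matches(conn, pattern, toy_proteins)
hits = conn.execute("SELECT accession, start FROM hit").fetchall()
```

Once the hits are in a table, further analysis becomes an ordinary SQL query, for example counting hits per pattern with `SELECT pattern, COUNT(*) FROM hit GROUP BY pattern`.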