Characteristics and prediction of domain linker sequences in multi-domain proteins
- Cite this article as:
- Tanaka, T., Kuroda, Y. & Yokoyama, S. J Struct Func Genom (2003) 4: 79. doi:10.1023/A:1026163008203
To facilitate swift structural characterizations, structural genomic/proteomic projects need to divide large multi-domain proteins into structural domains and to determine their structures separately. Thus, the assignment of structural domains based solely on sequence information, especially on the physico-chemical properties of the amino acid sequences, could be very helpful for such projects. In this study, we examined the characteristics of ‘domain linker sequences’, which are loop sequences connecting two structural domains. To this end, we prepared a set of 101 non-redundant multi-domain protein sequences with known structures, and performed an analysis of the linker sequences. The analysis revealed that the frequencies of five amino acid residues (Pro, Gly, Asp, Asn, Lys) differed significantly between the linker and non-linker loop sequences. Moreover, we observed a similar deviation in the residue pair frequencies between the two types of loop sequences. Finally, we describe an automated method, based on the above analysis, to detect loops in a protein sequence that have a high probability of being domain linkers.
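As an illustration of the kind of comparison described above, the following Python sketch computes per-residue frequencies in two sets of loop sequences and turns them into a log-odds table that can score a candidate loop. It is a minimal sketch, not the authors' method: the abstract does not specify the scoring scheme, so the log-odds formulation, the pseudocount, and the function names (`residue_frequencies`, `log_odds`, `score_loop`) are assumptions. The toy sequences are invented for the example and do not reflect the data set or the direction of the deviations reported in the paper.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_frequencies(seqs, pseudocount=1.0):
    """Relative frequency of each of the 20 residues, with a pseudocount
    so that residues absent from a small sample do not get frequency 0."""
    counts = Counter()
    for seq in seqs:
        counts.update(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts[a] + pseudocount) / total for a in AMINO_ACIDS}


def log_odds(linker_seqs, nonlinker_seqs, pseudocount=1.0):
    """log(f_linker / f_nonlinker) per residue (assumed scoring scheme)."""
    f_linker = residue_frequencies(linker_seqs, pseudocount)
    f_nonlinker = residue_frequencies(nonlinker_seqs, pseudocount)
    return {a: math.log(f_linker[a] / f_nonlinker[a]) for a in AMINO_ACIDS}


def score_loop(loop, table):
    """Mean log-odds over the residues of a loop; positive values mean
    the composition looks more like the linker set than the other set."""
    residues = [c for c in loop.upper() if c in table]
    if not residues:
        return 0.0
    return sum(table[c] for c in residues) / len(residues)


# Hypothetical toy data, invented for illustration only.
TOY_LINKERS = ["PPTPSPE", "GSPAPTP", "EPPKPTS"]
TOY_NONLINKERS = ["GDGNG", "NGKDG", "DGSGN"]

table = log_odds(TOY_LINKERS, TOY_NONLINKERS)

if __name__ == "__main__":
    for loop in ["PTPEP", "GNGDG"]:
        print(loop, round(score_loop(loop, table), 3))
```

A residue pair analysis of the sort mentioned in the abstract could follow the same pattern by counting adjacent two-residue windows instead of single residues, although a much larger sample would be needed to estimate 400 pair frequencies reliably.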