'In Nature's infinite book of secrecy, a little I can read.'

Antony and Cleopatra [Act I, Scene 2], William Shakespeare

Pathological mutations occurring within the extended consensus sequences of exon-intron splice junctions account for ~10 per cent of all inherited lesions logged in The Human Gene Mutation Database (HGMD®; http://www.hgmd.org)[1] and are frequently encountered in mutation screening studies [2]. Mutations residing in other intronic locations (including the canonical branch-point sequence,[3] 5'-YURAY-3'), however, may often go undetected unless patient RNA can be analysed and the mutations in question induce aberrant splicing (eg exon skipping or cryptic splice site utilisation) that is readily distinguishable qualitatively or quantitatively from normal (and/or normal alternative) splicing. Indeed, introns probably represent a substantially larger mutational target than has hitherto been appreciated, on account of their containing a multiplicity of functional elements, including intron splice enhancers and silencers that regulate alternative splicing,[4, 5] trans-splicing elements [6] and other regulatory elements, some of which may be deeply embedded within very large introns [7].
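To make the branch-point consensus concrete, the following minimal sketch expands 5'-YURAY-3' using the standard IUPAC codes (Y = C or U; R = A or G) and scans a sequence for candidate matches. The function name and the example fragment are hypothetical, invented purely for illustration. A motif this short and degenerate will match many times by chance, so a hit is no more than a candidate and says nothing about function.

```python
import re

# IUPAC codes: Y = pyrimidine (C or U), R = purine (A or G).
# The consensus 5'-YURAY-3' therefore expands to [CU]U[AG]A[CU].
# The lookahead wrapper lets overlapping candidates be reported too.
BRANCH_POINT = re.compile(r"(?=([CU]U[AG]A[CU]))")


def find_branch_point_candidates(intron_seq):
    """Return (0-based start, motif) for every consensus match.

    Accepts DNA or RNA input in either case; T is converted to U.
    """
    seq = intron_seq.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in BRANCH_POINT.finditer(seq)]


# Hypothetical intron fragment, invented for illustration only.
example = "GUAAGUCCUGACUUUCUAACAG"
for start, motif in find_branch_point_candidates(example):
    print(start, motif)
```

An intronic variant that converts a matching pentamer into a non-matching one (or vice versa) could be flagged in this way for follow-up, but any effect on splicing would still have to be established experimentally.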

In addition to pathological mutations sensu stricto, introns also harbour functional polymorphisms that can influence the expression of the genes that host them. Some of these intronic variants may also confer susceptibility to disease or otherwise modulate the genotype-phenotype relationship. For the reasons discussed above, it is very likely that such variants will have been seriously under-ascertained to date. Although most of these variants are single nucleotide polymorphisms (SNPs), others may be of the insertion/deletion type [8]. With the advent of genome-wide association studies (GWAS), an increasing number of potentially functional intronic variants are being identified [9]. In the majority of cases, however, it is unclear whether such variants are of direct functional significance, as opposed to simply being in linkage disequilibrium with another (as yet unidentified) functional SNP in the vicinity [10]. Even when GWAS deem a newly identified intronic polymorphism to be 'functional', it should be appreciated that such a term may often be ascribed solely on the basis of an observed association between a specific allele and a plasma protein level, enzymatic activity or a clinical/laboratory phenotype, even though in reality such associations cannot readily distinguish a bona fide functional SNP from a linkage disequilibrium effect.
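The linkage disequilibrium caveat can be illustrated numerically. The sketch below computes the standard r² measure between two biallelic SNPs from phased haplotype counts, using D = p(AB) - p(A)p(B) and r² = D² / [p(A)(1 - p(A))p(B)(1 - p(B))]. The function name and all counts are hypothetical, and the availability of phased haplotypes is an assumption made for simplicity.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """r^2 between two biallelic SNPs from phased haplotype counts.

    n_AB is the number of haplotypes carrying allele A at the first SNP
    and allele B at the second, and so on for the other three classes.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total   # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total   # frequency of allele B at SNP 2
    d = p_AB - p_A * p_B          # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Hypothetical counts: the two SNPs always travel together.
print(ld_r_squared(50, 0, 0, 50))
# Hypothetical counts: the two SNPs are statistically independent.
print(ld_r_squared(25, 25, 25, 25))
```

When r² approaches 1, an association signal at one SNP is statistically indistinguishable from a signal at the other, which is why association data alone cannot identify which of the two variants is functional.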

As has been noted with pathological mutations, the vast majority of known functional intronic polymorphisms are located within the extended consensus sequences of exon-intron splice junctions [2]. Some intronic polymorphic variants do not occur within the splice junctions, however, but nevertheless still act so as to change the splicing phenotype as a consequence of their being located within an intron splice enhancer or branchpoint site, or by activating a cryptic splice site [11, 12]. This is, from a biological point of view, a more interesting category of intronic SNP to study, since the mechanisms by which these variants exert their effects on the splicing phenotype are often unclear and may be quite subtle. In the pages of this issue, Millar et al.[13] report that a SNP, buried deep within intron 4 of the human growth hormone (GH1) gene, is of direct functional significance by virtue of its influence on the expression of this gene. This polymorphism therefore joins the ranks of the hitherto relatively small number of human intronic SNPs located outwith exon-intron splice junctions that have been shown by various methods of in vitro characterisation to be of direct functional significance. Table 1 lists some of the best characterised examples of such functional SNPs, most of which are located at least ~30 base pairs (bp) from the nearest splice site. These SNPs have been shown to influence either the transcriptional activity or the splicing efficiency of their host genes, or instead to alter the expression of alternative transcripts.

Table 1 Selected examples of in vitro characterised human functional intronic polymorphisms located more than ~30 bp from the nearest splice site

How should we go about increasing the number of identified functional intronic polymorphisms? One approach would be to employ exon-tiling microarrays to perform genome-wide scans to identify intronic SNPs responsible for inter-individual differences in the splicing phenotype [11, 14, 15]. Since currently available bioinformatics tools are inadequate to the task of predicting splicing consequences,[14] however, all SNPs identified in this way would have to be further validated using mini-gene constructs to determine the resulting splicing phenotype [14]. One feature that might prove helpful in identifying functional intronic SNPs is that such variants are often located within gene regions that are characterised by a reduced level of genetic variation [16].

Precisely because we invariably adopt a gene-centric approach to screening introns for functional polymorphisms, we should be wary of the existence of overlapping genes, a not infrequent occurrence in our complex genome. Thus, for example, the functional SNP rs4988235, located 13.9 kilobases upstream of the lactase (LCT) gene and associated with adult-type hypolactasia, actually resides deep within intron 13 of the minichromosome maintenance complex component 6 (MCM6) gene [17-19]. In addition, since disease-associated intronic SNPs that play a role in long-range gene regulation have also recently been identified,[20, 21] we should be aware that some SNPs may influence the expression of remote genes at a distance, rather than the expression of those genes which actually host them. These caveats notwithstanding, new techniques such as chromosome conformational capture [22] and chromatin immunoprecipitation followed by deep sequencing (ChIP-seq)[23] promise greatly to increase the number of functional intronic polymorphisms identified, thereby potentially pinpointing the locations of a whole new lexicon of intron-located regulatory elements, which will increase our understanding of intron structure and function.