Introduction

Single nucleotide changes that affect gene expression by altering gene regulatory sequences such as promoters, enhancers, and silencers are known as regulatory SNPs (rSNPs) [1–4]. An rSNP within a transcription factor binding site (TFBS) can change a transcription factor’s (TF’s) ability to bind to its TFBS [5–8], in which case the TF would be unable to regulate its target gene effectively [9–13]. This concept is examined for three rSNPs (rs2010963, rs1570360 and rs699947) in the promoter region of the vascular endothelial growth factor A (VEGFA) gene and their allelic association with TFBS and human disease. The human VEGFA gene is located on chromosome 6, and its protein product is usually expressed as a 46-kDa disulfide-linked homodimer. VEGFA is a signaling protein involved in the regulation of angiogenesis, vasculogenesis and endothelial cell growth. It induces endothelial cell proliferation, promotes cell migration, inhibits apoptosis, and induces permeabilization of blood vessels. The VEGFA rs2010963 [14–18], rs1570360 [15, 19–21] and rs699947 [18] rSNPs have been associated with several human diseases or conditions (Supplement). In this report, I discuss the association of these rSNPs with changes in potential TFBS and their possible relationship to the reported diseases or conditions. The potential TFBS for these rSNPs have previously been discussed in association with high altitude sickness [22].

Materials and methods

Identifying TFBS

The JASPAR CORE database [23, 24] and ConSite [25] were used to identify the TFBS in this study. JASPAR is a collection of transcription factor DNA-binding preferences used for scanning genomic sequences, and ConSite is a web-based tool for finding cis-regulatory elements in genomic sequences. The TFBS and the rSNP locations within the binding sites have previously been discussed [22, 26]. The Vector NTI Advance 11 computer program (Invitrogen, Life Technologies) was used to locate the TFBS in the VEGFA gene (NCBI RefSeq NM_001171626).
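The kind of scan that matrix-based tools such as JASPAR and ConSite perform can be illustrated with a minimal sketch. It is not the procedure used in this study: the count matrix, the flanking sequence and the function names below are all hypothetical, and only the forward strand is scanned. The sketch converts a count matrix to log-odds scores and compares the best-scoring window for each allele of a SNP.

```python
import math

# Hypothetical count matrix for a 4-bp motif (NOT a real JASPAR profile).
# Each list holds the counts of one base at motif positions 0..3.
TOY_COUNTS = {
    "A": [0, 0, 0, 2],
    "C": [2, 0, 20, 0],
    "G": [18, 20, 0, 16],
    "T": [0, 0, 0, 2],
}


def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert base counts to log2 odds against a uniform background."""
    width = len(counts["A"])
    matrix = []
    for i in range(width):
        total = sum(counts[base][i] for base in "ACGT") + 4 * pseudocount
        column = {}
        for base in "ACGT":
            frequency = (counts[base][i] + pseudocount) / total
            column[base] = math.log2(frequency / background)
        matrix.append(column)
    return matrix


def best_hit(sequence, matrix):
    """Return (score, offset) of the best window on the forward strand."""
    width = len(matrix)
    best = None
    for offset in range(len(sequence) - width + 1):
        window = sequence[offset:offset + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if best is None or score > best[0]:
            best = (score, offset)
    return best


def allele_scores(flank5, alleles, flank3, matrix):
    """Score each allele in the same (made-up) sequence context."""
    return {a: best_hit(flank5 + a + flank3, matrix) for a in alleles}


TOY_MATRIX = log_odds_matrix(TOY_COUNTS)
# The flanks are invented for illustration; they are not VEGFA sequence.
SCORES = allele_scores("TTGG", ["C", "G"], "GAA", TOY_MATRIX)
```

With this toy matrix the C allele yields a higher best score than the G allele, which is the pattern expected when one allele creates a potential binding site that the other lacks. A real analysis would use curated matrices, score both strands and apply a score threshold.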

Results

VEGFA rSNPs and TFBS

The rs2010963 rSNP (C/G) is located −634 base pairs (bp) from the VEGFA transcription start site (TSS), while the rs1570360 rSNP (A/G) is at −1,154 bp and the rs699947 rSNP (A/C) is at −2,578 bp (Table 1). The rSNPs are located in several TFBS which have previously been reported [22]. In some instances the rSNP alleles do not change the TFBS, but in others each allele provides a unique TFBS, as shown in Table 1. As an example, the rs2010963 VEGFA-C allele generates potential binding sites for the GA-binding protein alpha (GABPα) and the interferon regulatory factors 1 and 2 (IRF1, 2) TFs, while the VEGFA-G allele generates a potential binding site for the specificity protein 1 (SP1) TF. The rs1570360 VEGFA-A allele generates potential binding sites for the SP1 and zinc finger protein 354C (ZNF354C) TFs, while the VEGFA-G allele generates potential binding sites for the Krueppel-like factor 4 (KLF4) and methyl-CpG-binding protein 2-interacting zinc finger protein (MIZF) TFs. The rs699947 VEGFA-A allele generates a potential binding site for the nuclear factor 1 C-type (NFIC) TF, while the VEGFA-C allele generates potential binding sites for the GATA binding protein 3 (GATA3), the hypoxia-inducible factor 1::aryl hydrocarbon receptor nuclear translocator (HIF1α::ARNT) and the T cell acute lymphocytic leukemia 1::transcription factor 3 (TAL1::TCF3) TFs. The function of these TFs has previously been reported and discussed [22].
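The allele-to-TFBS relationships described above can be written down as a small lookup table, which makes it easy to ask which TFs are specific to one allele and which recur across rSNPs. The sketch below is only an illustration: the positions and TF assignments are taken from the text, whereas the dictionary layout, the ASCII TF labels (e.g. GABPA, HIF1A::ARNT) and the function names are assumptions introduced here.

```python
# Positions (bp relative to the TSS) and allele-specific TFs as stated in
# the text. ASCII labels are used for the TFs; the layout is hypothetical.
VEGFA_RSNPS = {
    "rs2010963": {
        "offset_bp": -634,
        "tfbs": {"C": {"GABPA", "IRF1", "IRF2"}, "G": {"SP1"}},
    },
    "rs1570360": {
        "offset_bp": -1154,
        "tfbs": {"A": {"SP1", "ZNF354C"}, "G": {"KLF4", "MIZF"}},
    },
    "rs699947": {
        "offset_bp": -2578,
        "tfbs": {"A": {"NFIC"}, "C": {"GATA3", "HIF1A::ARNT", "TAL1::TCF3"}},
    },
}


def allele_specific_sites(record):
    """For each allele, return the TFs predicted for that allele only."""
    result = {}
    for allele, tfs in record["tfbs"].items():
        other_sets = [t for a, t in record["tfbs"].items() if a != allele]
        others = set().union(*other_sets)
        result[allele] = tfs - others
    return result


def tfs_shared_across_snps(table):
    """Return TFs that appear for more than one (rSNP, allele) pair."""
    seen = {}
    for rsid, record in table.items():
        for allele, tfs in record["tfbs"].items():
            for tf in tfs:
                seen.setdefault(tf, []).append((rsid, allele))
    return {tf: hits for tf, hits in seen.items() if len(hits) > 1}


SPECIFIC = allele_specific_sites(VEGFA_RSNPS["rs2010963"])
SHARED = tfs_shared_across_snps(VEGFA_RSNPS)
```

In this table SP1 is the only TF listed for more than one rSNP: it is associated with the G allele of rs2010963 and the A allele of rs1570360.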

Table 1 Human diseases and VEGFA rSNPs found to be significantly associated in the referenced study

VEGFA rSNP alleles and disease associations

A number of human diseases or conditions have been significantly associated with the VEGFA rSNP alleles shown in bold lettering in the table, along with the potential TFBS generated by each allele. These diseases or conditions are listed together with the VEGFA rSNP genotypes and allele frequencies for patients with the disease versus their controls (Supplement). The allele listed in bold lettering in the table is the one found at increased frequency in patients compared to the controls (Supplement). As an example, the rs2010963 VEGFA-G allele has been found to be significantly more frequent in patients with severe ischemic complications in giant cell arteritis (GCA) [27]. This allele generates the potential SP1 binding site (Table 1), where SP1 can activate or repress transcription in response to physiological and pathological stimuli. SP1 can regulate the expression of a large number of genes involved in a variety of processes such as cell growth, apoptosis, differentiation and immune responses. In contrast, the rs2010963 VEGFA-C allele has been found to be significantly more frequent in patients with coronary artery disease (CAD) [18]. This allele generates the potential GABPα and IRF1, 2 binding sites (Table 1), where GABPα is involved in the activation of cytochrome oxidase expression and the nuclear control of mitochondrial function, and IRF1, 2 are involved in interferon regulation. The rs1570360 VEGFA-A allele has been found to be significantly more frequent in patients with proliferative diabetic retinopathy (PDR) compared to their controls [19]. This allele generates the potential SP1 and ZNF354C binding sites, where the ZNF354C TF functions as a transcriptional repressor. The rs1570360 VEGFA-G allele has been found to be significantly more frequent in patients with sporadic Alzheimer’s disease (SAD) [20]. This allele generates the potential KLF4 and MIZF binding sites, where KLF4 acts as both an activator and a repressor and MIZF plays a role in DNA methylation and transcriptional repression.
The rs699947 VEGFA-A allele has been found to be significantly more frequent in CAD patients compared to their controls [18]. This allele generates the potential NFIC binding site. NFIC is a member of the NFI gene family and is expressed in numerous tissues including brain, liver, spleen and heart [28]. The proteins from these genes are individually capable of activating transcription and replication. Other significant disease associations with these VEGFA rSNP alleles are listed in Table 1.
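A comparison of allele frequencies between patients and controls is commonly summarized with an odds ratio and a chi-square statistic on a 2×2 table of allele counts. The sketch below shows that calculation under stated assumptions: the counts are invented for illustration and are not taken from the Supplement or from any cited study, the function names are hypothetical, and the cited studies may have used different tests.

```python
def allele_odds_ratio(case_risk, case_other, control_risk, control_other):
    """Odds of carrying the risk allele in cases relative to controls."""
    return (case_risk * control_other) / (case_other * control_risk)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 degree of freedom, no continuity correction)
    for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Invented allele counts (chromosomes, not individuals), for illustration only.
CASES = {"risk": 120, "other": 80}
CONTROLS = {"risk": 90, "other": 110}

ODDS_RATIO = allele_odds_ratio(
    CASES["risk"], CASES["other"], CONTROLS["risk"], CONTROLS["other"]
)
CHI_SQUARE = chi_square_2x2(
    CASES["risk"], CASES["other"], CONTROLS["risk"], CONTROLS["other"]
)

# 3.841 is the critical value for 1 degree of freedom at alpha = 0.05.
IS_SIGNIFICANT = CHI_SQUARE > 3.841
```

An odds ratio above 1 together with a chi-square value above the critical threshold corresponds to the situation described in the text, in which an allele is significantly more frequent in patients than in controls.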

Discussion

Genome-wide association studies (GWAS) over the last decade have identified nearly 6,500 disease- or trait-predisposing SNPs, of which only 7 % are located in protein-coding regions of the genome [29, 30]; the remaining 93 % are located within non-coding areas [31, 32], such as regulatory or intergenic regions. SNPs that occur in the putative regulatory region of a gene, where a single base change in the DNA sequence of a potential TFBS may affect gene expression, are drawing more attention [1, 3, 33]. A SNP in a TFBS can have multiple consequences. Often, the SNP neither changes the TFBS interaction nor alters gene expression, since a TF will usually recognize a number of different binding sites in the gene. In some cases the SNP may increase or decrease TF binding, which results in allele-specific gene expression. In rare cases, a SNP may eliminate the natural binding site or generate a new binding site, in which case the gene is no longer regulated by the original TF. Therefore, functional rSNPs in TFBS may result in differences in gene expression, phenotypes and susceptibility to environmental exposure [33]. Examples of rSNPs associated with disease susceptibility are numerous and several reviews have been published [33–36].

Human diseases or conditions that have been significantly associated with rSNPs of the VEGFA gene are shown in Table 1, along with the rSNP allele-specific TFBS. A change in the rSNP allele can alter the DNA landscape around the SNP to which potential TFs attach to regulate a gene. This change in the regulatory landscape can alter gene regulation, which in turn can result in human disease or a change in condition or illness. In this report, several examples have been described to illustrate that different rSNP alleles can provide different TFBS and that these alleles are also significantly associated with human disease.