Tumor Biology

Volume 34, Issue 2, pp 947–952

Germline copy number variations associated with breast cancer susceptibility in a Japanese population

Authors

    • Department of Oncology & Laboratory Medicine, Yamaguchi University Graduate School of Medicine
  • Takae Okada
    • Department of Pathology, Yamaguchi University Graduate School of Medicine
  • Naoya Shikamoto
    • Department of Oncology & Laboratory Medicine, Yamaguchi University Graduate School of Medicine
  • Yibo Zhan
    • Department of Oncology & Laboratory Medicine, Yamaguchi University Graduate School of Medicine
  • Kohei Sakai
    • Department of Oncology & Laboratory Medicine, Yamaguchi University Graduate School of Medicine
  • Naoko Okayama
    • Department of Oncology & Laboratory Medicine, Yamaguchi University Graduate School of Medicine
  • Mitsuaki Nishioka
    • Department of Oncology & Laboratory Medicine, Yamaguchi University Graduate School of Medicine
  • Tomoko Furuya
    • Department of Pathology, Yamaguchi University Graduate School of Medicine
  • Atsunori Oga
    • Department of Pathology, Yamaguchi University Graduate School of Medicine
  • Shigeto Kawauchi
    • Department of Pathology, Yamaguchi University Graduate School of Medicine
  • Noriko Maeda
    • Department of Digestive Surgery and Surgical Oncology (Surgery II), Yamaguchi University Graduate School of Medicine
  • Michiko Tamesa
    • Department of Digestive Surgery and Surgical Oncology (Surgery II), Yamaguchi University Graduate School of Medicine
  • Yukiko Nagashima
    • Department of Digestive Surgery and Surgical Oncology (Surgery II), Yamaguchi University Graduate School of Medicine
  • Shigeru Yamamoto
    • Department of Digestive Surgery and Surgical Oncology (Surgery II), Yamaguchi University Graduate School of Medicine
  • Masaaki Oka
    • Department of Digestive Surgery and Surgical Oncology (Surgery II), Yamaguchi University Graduate School of Medicine
  • Yuji Hinoda
    • Department of Oncology & Laboratory Medicine, Yamaguchi University Graduate School of Medicine
  • Kohsuke Sasaki
    • Department of Pathology, Yamaguchi University Graduate School of Medicine
Open Access Research Article

DOI: 10.1007/s13277-012-0630-x

Cite this article as:
Suehiro, Y., Okada, T., Shikamoto, N. et al. Tumor Biol. (2013) 34: 947. doi:10.1007/s13277-012-0630-x

Abstract

Although copy number variations (CNVs) are expected to affect various diseases, little is known about the association between CNVs and breast cancer susceptibility. Therefore, we investigated this relation. Array comparative genomic hybridization was performed to search for candidate CNVs related to breast cancer susceptibility. Subsequent quantitative real-time polymerase chain reaction was carried out for confirmation. We found seven CNV markers associated with breast cancer risk. The means of the relative copy numbers of patients with a history of breast cancer and women in the control group were 0.8 and 1.8 for Hs06535529_cn on 1p36.12 (P < 0.0001), 2.9 and 2.2 for Hs03103056_cn on 3q26.1 (P < 0.0001), 1.2 and 1.8 for Hs03899300_cn on 15q26.3 (P < 0.0001), 1.0 and 1.5 for Hs03908783_cn on 15q26.3 (P < 0.0001), and 1.1 and 1.7 for Hs03898338_cn on 15q26.3 (P < 0.0001), respectively. Interestingly, nine or more copies of Hs04093415_cn on 22q12.3 were found only in 8/193 (4.1 %) patients with a history of breast cancer and in none of the controls (P = 0.0081). Similarly, 12 or more copies of Hs04090898_cn on 22q12.3 were found only in 7/193 (3.6 %) patients with a history of breast cancer and in none of the controls (P = 0.016). A combination of two CNVs resulted in 80.3 % sensitivity, 80.6 % specificity, 82.4 % positive predictive value, and 78.3 % negative predictive value for the prediction of breast cancer susceptibility. These findings may lead to a new means of risk assessment for breast cancer. Confirmatory studies using independent data sets are needed to support our findings.

Keywords

CNV; Breast cancer susceptibility; CGH; Real-time PCR; Digital PCR

Abbreviations

bp: Base pair

CNV: Copy number variation

CGH: Comparative genomic hybridization

PCR: Polymerase chain reaction

SNPs: Single-nucleotide polymorphisms

Introduction

Breast cancer is a multifactorial disease caused by genetic and environmental factors [1]. So far, genetic studies have identified four high-penetrance genes (BRCA1, BRCA2, TP53, and PTEN) related to breast cancer [2]. In addition, genetic variations including single-nucleotide polymorphisms (SNPs), small insertion–deletion polymorphisms, and variable numbers of repetitive sequences have reportedly been associated with breast cancer risk. These comprise 51 variants in 40 genes, with the association graded as strong for 10 variants in 6 genes (ATM, CASP8, CHEK2, CTLA4, NBN, and TP53), moderate for 4 variants in 4 genes (ATM, CYP19A1, TERT, and XRCC3), and weak for 37 variants [3].

Another source of variation in the human genome is genomic structural variants, including copy number variations (CNVs) [4]. CNVs involve gains or losses of several to hundreds of kilobases of genomic DNA among phenotypically normal individuals, and at least 11,700 CNV regions larger than 443 bp have been identified [5]. CNVs have been shown to significantly influence messenger RNA expression levels [6, 7], and recent studies have described associations of CNVs with various common disorders [8] as well as with mental illness [9]. For example, the Wellcome Trust Case Control Consortium identified three CNVs associated with common diseases: IRGM for Crohn’s disease; HLA for Crohn’s disease, rheumatoid arthritis, and type 1 diabetes; and TSPAN8 for type 2 diabetes [10]. With regard to neoplasms, CNVs have recently been reported as factors predisposing individuals to neuroblastoma, prostate cancer, pancreatic cancer, colorectal cancer, and BRCA1-associated ovarian cancer [6, 11–15]. Although CNVs are expected to affect breast cancer risk, little is known about this association except for a previous report in which the proportion of rare CNVs was excessive in patients with hereditary breast cancer without BRCA1/BRCA2 mutations compared with controls [16]. These gaps in our knowledge prompted us to study this relation. Here, we report that CNVs significantly affect the susceptibility to breast cancer.

Materials and methods

The study protocol was approved by the institutional review board of Yamaguchi University Graduate School of Medicine, and informed consent was obtained from each patient.

Screening of CNVs by array comparative genomic hybridization

We obtained 30 DNA samples from the peripheral blood of women without a history of breast cancer and 30 DNA samples from the peripheral blood of patients with a history of breast cancer. A pool of blood-derived DNA from the 30 healthy women was used as a reference sample for all hybridizations performed. Assessment of the CNVs in the human genome by oligonucleotide array comparative genomic hybridization (CGH) (human CGH 2.1 M whole-genome tiling array; Roche NimbleGen) was performed according to the manufacturer’s protocol. Array image analysis and normalization were performed with NimbleScan version 2.5 software (Roche NimbleGen). The normalized data were then processed using Nexus Copy Number version 5.0 software (BioDiscovery).

Copy number validation by real-time polymerase chain reaction

Quantitative real-time polymerase chain reaction (PCR) using predesigned TaqMan® Copy Number Assays (Applied Biosystems) containing a primer pair and a FAM dye-labeled minor groove binder (MGB) probe was performed to detect the copy number of the genomic sequence of interest using a larger cohort. For the internal control, a predesigned TaqMan® Copy Number Reference Assay RNase P (Applied Biosystems), which is known to exist in two copies in a diploid genome, was used. We obtained 193 DNA samples from the peripheral blood of patients with a history of breast cancer and 170 DNA samples from age-matched women without a history of breast cancer. The mean age was 57.3 years in the patient group and 55.6 years in the control group. There was no statistical difference in age distribution between the groups. The calibrator sample for quantitative real-time PCR was the DNA pooled from 30 healthy women; the same was used as the reference in the array CGH assay, and the copy number of the calibrator sample was assumed to be 2. The 7900HT system and the StepOnePlus system (Applied Biosystems) were used for the quantitative real-time PCR analysis. The PCRs were carried out according to the manufacturer’s protocol.
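The copy number calculation implied by this setup can be sketched as follows. This is a minimal illustration that assumes the comparative Ct (ΔΔCt) method commonly used with this type of assay; the text itself states only that the manufacturer’s protocol was followed. The function name and the Ct values are hypothetical. Only the two-copy RNase P reference and the calibrator copy number of 2 come from the text.

```python
def relative_copy_number(ct_target, ct_reference,
                         calibrator_ct_target, calibrator_ct_reference,
                         calibrator_copies=2.0):
    """Estimate copy number with the comparative Ct (delta-delta Ct) method.

    ct_target / ct_reference: Ct of the CNV assay and of the RNase P
    reference assay in the test sample.
    calibrator_*: the same two Ct values measured in the calibrator
    (here, the pooled DNA assumed to carry 2 copies).
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = calibrator_ct_target - calibrator_ct_reference
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    # Each extra cycle needed by the target corresponds to half the template.
    return calibrator_copies * 2 ** (-delta_delta_ct)


# Hypothetical Ct values, for illustration only.
same_as_calibrator = relative_copy_number(26.0, 25.0, 26.0, 25.0)
one_cycle_later = relative_copy_number(27.0, 25.0, 26.0, 25.0)
print(same_as_calibrator, one_cycle_later)
```

A sample whose target amplifies one cycle later than in the calibrator, relative to RNase P, is estimated at half the calibrator copy number.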

TA cloning

To confirm the DNA sequence, a portion of the real-time PCR products was gel purified and cloned into the T/A cloning vector pGEM-T Easy (Promega). At least five subclones were isolated and identified by direct sequencing.

Copy number validation by digital PCR

Digital PCR was available for six CNVs (Hs06535529_cn, Hs03899300_cn, Hs03908783_cn, Hs03898338_cn, Hs04090898_cn, and Hs04093415_cn) to evaluate absolute copy numbers. For Hs03103056_cn, digital PCR was not available because of difficulties in designing primers and probes. To evaluate the copy number of Hs03899300_cn, we designed forward and reverse primers and a TaqMan® MGB probe for the Hs03899300_cn region and for hTERT. hTERT was used as the internal control because it is known to exist in two copies in a diploid genome [17]. The primers were 5′-TGCCTGGCACTAAGGTTTAGAGTT-3′ (forward) and 5′-CACTCAGAGGGTTAAGTGAAGTGACA-3′ (reverse) for the Hs03899300_cn region and 5′-GGGTCCTCGCCTGTGTACAG-3′ (forward) and 5′-CCTGGGAGCTCTGGGAATTT-3′ (reverse) for hTERT. The probes were 5′-FAM-TGAGTCGGTGCTTCC-MGB-3′ for the Hs03899300_cn region and 5′-VIC-CACACCTTTGGTCACTC-MGB-3′ for hTERT. We designed these primers and probes to avoid SNPs. For the other CNVs, the same Copy Number Assays used in the real-time PCR were available. Reaction mixtures of 20-μL volume comprising 1× ddPCR Master Mix (Bio-Rad), forward and reverse primers and probes for a target and a reference, and DNA were prepared. PCR amplification was performed for a total of 40 cycles with an annealing temperature of 58 °C. Digital PCR was carried out using a QX100 droplet digital PCR system (Bio-Rad) according to the manufacturer’s protocol [18].
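As a quick consistency check on the primer design described above, the following sketch locates the Hs03899300_cn forward primer, probe, and reverse primer within the assay sequence listed for Hs03899300_cn in Table 1. The sequences are copied from the text; the variable and function names are hypothetical, and the check is an illustration rather than part of the authors’ workflow.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


# Assay sequence for Hs03899300_cn as listed in Table 1 (chr15, 106 bp).
AMPLICON = (
    "ACTGCCTGGCACTAAGGTTTAGAGTTATGA"
    "GTCGGTGCTTCCCTGTCACTTCACTTAACCC"
    "TCTGAGTGTGCAGTTTGTAGATTTGTTAACT"
    "GCACTGAGAGGTCC"
)
FORWARD = "TGCCTGGCACTAAGGTTTAGAGTT"
REVERSE = "CACTCAGAGGGTTAAGTGAAGTGACA"
PROBE = "TGAGTCGGTGCTTCC"

# The forward primer and probe match the listed strand directly; the reverse
# primer anneals to the opposite strand, so search for its reverse complement.
fwd_start = AMPLICON.find(FORWARD)
probe_start = AMPLICON.find(PROBE)
rev_start = AMPLICON.find(reverse_complement(REVERSE))

print("forward primer starts at", fwd_start)
print("probe starts at", probe_start)
print("reverse primer site starts at", rev_start)
```

With the sequences as printed, the probe falls between the two primer sites on the listed strand, which is the arrangement a hydrolysis-probe assay requires.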

Statistical analysis

A Fisher’s exact test, an unpaired t test, a Mann–Whitney test, linear regression analysis, and linear discriminant analysis were used to compare variables. A P value of <0.05 was considered to be significant. Data were analyzed with GraphPad Prism version 4.03, GraphPad InStat version 3.10 (GraphPad Software), and Ekuseru-Toukei 2008 (Social Survey Research Information).
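To illustrate the kind of comparison that Fisher’s exact test supports here, the sketch below computes a two-sided P value for a 2 × 2 table in pure Python. It is not the GraphPad implementation used in the study; the function name is hypothetical, and the two-sided rule (summing all tables no more probable than the observed one) is an assumption about how the reported values were obtained. The example counts (8 of 193 patients versus 0 of 170 controls, and 7 of 193 versus 0 of 170) are taken from the abstract.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more probable than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def table_probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = table_probability(a)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (table_probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Rows: patients, controls. Columns: at or above threshold, below threshold.
p_hs04093415 = fisher_exact_two_sided(8, 185, 0, 170)
p_hs04090898 = fisher_exact_two_sided(7, 186, 0, 170)
print(round(p_hs04093415, 4), round(p_hs04090898, 4))
```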

Results

Using array CGH, we found four CNV regions with significant differences in the frequency of copy number changes between the patient group and the control group. The CNV positions were chr1:21,500,972-21,505,481; chr3:162,215,705-162,235,598; chr15:102,029,706-102,034,387; and chr22:37,142,958-37,147,755 (GRCh37/hg19). The CNVs detected by array CGH, however, could be false positives because a poor signal-to-noise ratio of hybridizations leads to considerable variation in the reported CGH ratio [19], and smaller CNVs are much more likely to be false positives than are large CNVs [20]. Therefore, quantitative real-time PCR with a larger cohort was carried out to confirm the CNVs associated with breast cancer susceptibility. We identified seven CNV markers related to breast cancer risk as shown in Table 1. The means of the relative copy numbers of patients with a history of breast cancer and those of women in the control group were 0.8 and 1.8 for Hs06535529_cn on 1p36.12 (P < 0.0001), 2.9 and 2.2 for Hs03103056_cn on 3q26.1 (P < 0.0001), 1.2 and 1.8 for Hs03899300_cn on 15q26.3 (P < 0.0001), 1.0 and 1.5 for Hs03908783_cn on 15q26.3 (P < 0.0001), and 1.1 and 1.7 for Hs03898338_cn on 15q26.3 (P < 0.0001), respectively (Fig. 1). The copy number of the Hs03899300_cn region on 15q26.3 by digital PCR was consistent with that by real-time PCR (Fig. 2), and the coefficient of determination (r²) was 0.9801. Copy numbers of the other CNVs by digital PCR and by real-time PCR were also well correlated: r² was 0.9201 for Hs06535529_cn, 0.8450 for Hs03908783_cn, 0.8909 for Hs03898338_cn, 0.9958 for Hs04090898_cn, and 0.9491 for Hs04093415_cn. Interestingly, nine or more copies of Hs04093415_cn on 22q12.3 were found only in 8 (4.1 %) patients with a history of breast cancer and in none of the controls (P = 0.0081, Fig. 1 and Table 2). Similarly, 12 or more copies of Hs04090898_cn on 22q12.3 were found only in 7 (3.6 %) patients with a history of breast cancer and in none of the controls (P = 0.0160, Fig. 1 and Table 2). After setting a copy number threshold, we evaluated the relation between the copy number events and breast cancer susceptibility. The sensitivity and specificity were 83.9 and 41.2 % for Hs06535529_cn, 39.4 and 90.0 % for Hs03103056_cn, 76.7 and 70.0 % for Hs03899300_cn, 79.8 and 45.3 % for Hs03908783_cn, 83.4 and 65.9 % for Hs03898338_cn, 4.1 and 100.0 % for Hs04093415_cn, and 3.6 and 100.0 % for Hs04090898_cn (Table 2). Linear discriminant analysis with a combination of two CNVs resulted in 80.3 % sensitivity, 80.6 % specificity, 82.4 % positive predictive value, and 78.3 % negative predictive value for the prediction of breast cancer susceptibility. The discriminant score was calculated as follows: Y = −6.9X1 + 3.2X2 + 6.1, where X1 = the copy number of Hs03899300_cn and X2 = the copy number of Hs03908783_cn.
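The discriminant score and the reported predictive values can be worked through in a short sketch. Two things here are assumptions rather than statements from the text: that a score above 0 is the cutoff for classifying a sample as breast cancer, and that the underlying counts are 155 true positives and 137 true negatives, which are back-calculated from the reported 80.3 % sensitivity and 80.6 % specificity in 193 patients and 170 controls. The function names are hypothetical.

```python
def discriminant_score(x1, x2):
    """Y = -6.9*X1 + 3.2*X2 + 6.1, with X1 and X2 the copy numbers of
    Hs03899300_cn and Hs03908783_cn, as given in the text."""
    return -6.9 * x1 + 3.2 * x2 + 6.1


def diagnostic_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, PPV, and NPV as percentages."""
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Group mean copy numbers from the text, used only as example inputs.
score_patient_means = discriminant_score(1.2, 1.0)
score_control_means = discriminant_score(1.8, 1.5)

# Counts inferred from the reported percentages (assumption, see lead-in).
metrics = diagnostic_metrics(tp=155, fn=193 - 155, tn=137, fp=170 - 137)

print(round(score_patient_means, 2), round(score_control_means, 2))
print({name: round(value, 1) for name, value in metrics.items()})
```

Under these assumptions the patient-group means give a positive score and the control-group means a negative one, and the inferred counts reproduce all four reported percentages.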
Table 1

CNV markers related to breast cancer risk

Copy number assay ID: Hs06535529_cn
Sequence: TCGCTGTGCCTGATTTCAGAGCCGGTTTCTGCGGTAAACTCATGGCAAAGCGAAGCCACCAACCCCCCCAGAGCGGGACCGG
Location (GRCh37/hg19): chr1:21,502,843-21,502,924
Gene: EIF4G3
Copy number variation ID (Database of genomic variants): None

Copy number assay ID: Hs03103056_cn
Sequence: TGGCAACATCTCAATATCCRCAGAATTTTCATATTTATCCAGGTAGAATTGATAAACAGAAAATTCCACAAGAACCATAAATTATTTAACACATACACACACACACTCAAATTTAG
Location (GRCh37/hg19): chr3:162,223,478-162,223,593
Gene: None
Copy number variation ID (Database of genomic variants): 2483, 62120, 103483, 115882, 32527, 37991, 30185, 50989, 2483, 62120, 103483, 115882, 32527, 37991, 30185, 50989

Copy number assay ID: Hs03899300_cn
Sequence: ACTGCCTGGCACTAAGGTTTAGAGTTATGAGTCGGTGCTTCCCTGTCACTTCACTTAACCCTCTGAGTGTGCAGTTTGTAGATTTGTTAACTGCACTGAGAGGTCC
Location (GRCh37/hg19): chr15:102,028,397-102,028,502
Gene: PCSK6
Copy number variation ID (Database of genomic variants): 34506, 5327, 3984

Copy number assay ID: Hs03908783_cn
Sequence: GCCTGCCTCCCRGCATGGGCCGCGGCCTCCGCCATGGGCTCCGTGCGGTGGTTTCTCGGGTACACGCTCGTGAGCCYGGCTGATGCGCCACATGCCT
Location (GRCh37/hg19): chr15:102,030,424-102,030,520
Gene: None
Copy number variation ID (Database of genomic variants): 34506, 66907, 5327, 3984

Copy number assay ID: Hs03898338_cn
Sequence: ATCGCTGCTGGATCTCTTCTGTCATCCCTCCCAGGACCCATTGGTCCTACTGGCCCACTTCCAGAAAGCAAGCCATC
Location (GRCh37/hg19): chr15:102,031,024-102,031,100
Gene: None
Copy number variation ID (Database of genomic variants): 34506, 5327, 3984

Copy number assay ID: Hs04093415_cn
Sequence: GTGTCGAGGCTGCTCCTTAAAYGCTTCTTGCCTGCACGCTGTGCGTGGAAACCCAAAGAAGTGAGAGACGCGAGG
Location (GRCh37/hg19): chr22:37,143,784-37,143,858
Gene: None
Copy number variation ID (Database of genomic variants): 36022, 36023, 7346, 110470, 36024, 22687, 103172, 23103, 115199, 62002, 6148, 115197, 59075, 79571, 22687, 103172, 23103, 115199, 62002, 6148, 115197, 59075, 79571, 110470, 36024, 36022, 36023, 7346

Copy number assay ID: Hs04090898_cn
Sequence: CTCCTAGTGGGATCCTACAACTCTCAGAACAACAGGGTCCCCCTGGACTGTGAGCACAGTAGAACCAGCTCTTTCTTGGGATTTTAAGAAAACAGACAAGCTTCGCG
Location (GRCh37/hg19): chr22:37,145,991-37,146,097
Gene: None
Copy number variation ID (Database of genomic variants): 36022, 36023, 36024, 22687, 103172, 23103, 62002, 6148, 115197, 59075, 91054, 7347, 79570, 91053, 36022, 36023, 36024, 22687, 103172, 23103, 62002, 6148, 115197, 59075, 91054, 7347, 79570, 91053


https://static-content.springer.com/image/art%3A10.1007%2Fs13277-012-0630-x/MediaObjects/13277_2012_630_Fig1_HTML.gif
Fig. 1

Distribution of copy numbers in patients with a history of breast cancer and in women in the control group. Each sample is indicated by an open circle. The horizontal lines represent the mean copy number in each group

https://static-content.springer.com/image/art%3A10.1007%2Fs13277-012-0630-x/MediaObjects/13277_2012_630_Fig2_HTML.gif
Fig. 2

Comparison of Hs03899300_cn copy number between real-time PCR and digital PCR evaluation. Dark and light gray bars represent the copy numbers evaluated by real-time PCR and by digital PCR, respectively

Table 2

Relation between CNVs and breast cancer susceptibility

Copy number assay ID   Copy number threshold   Breast cancer (n = 193, %)   Control (n = 170, %)   Odds ratio   P value
Hs06535529_cn          <1.5                    162 (83.9)                   100 (58.8)             3.7          <0.0001
Hs03103056_cn          ≥3.5                    76 (39.4)                    17 (10.0)              5.8          <0.0001
Hs03899300_cn          <1.5                    148 (76.7)                   51 (30.0)              7.7          <0.0001
Hs03908783_cn          <1.5                    154 (79.8)                   93 (54.7)              3.3          <0.0001
Hs03898338_cn          <1.5                    161 (83.4)                   58 (34.1)              9.7          <0.0001
Hs04093415_cn          ≥9.0                    8 (4.1)                      0 (0.0)                15.6         0.0081
Hs04090898_cn          ≥12.0                   7 (3.6)                      0 (0.0)                13.7         0.0160
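
The odds ratios in Table 2 can be recomputed from the listed counts with a short sketch. The text does not say how tables containing a zero cell were handled; adding 0.5 to every cell of such a table (the Haldane-Anscombe correction) is an assumption that happens to reproduce the tabulated values for the two 22q12.3 markers. The function name is hypothetical.

```python
def odds_ratio(case_positive, case_total, control_positive, control_total):
    """Odds ratio for meeting the copy number threshold, patients vs controls.

    Assumption: when any cell of the 2 x 2 table is zero, 0.5 is added to
    every cell (Haldane-Anscombe correction) before the ratio is taken.
    """
    a, b = case_positive, case_total - case_positive
    c, d = control_positive, control_total - control_positive
    if 0 in (a, b, c, d):
        a, b, c, d = (cell + 0.5 for cell in (a, b, c, d))
    return (a * d) / (b * c)


# (assay ID, patients meeting threshold, controls meeting threshold), Table 2.
TABLE_2_COUNTS = [
    ("Hs06535529_cn", 162, 100),
    ("Hs03103056_cn", 76, 17),
    ("Hs03899300_cn", 148, 51),
    ("Hs03908783_cn", 154, 93),
    ("Hs03898338_cn", 161, 58),
    ("Hs04093415_cn", 8, 0),
    ("Hs04090898_cn", 7, 0),
]

odds_ratios = {
    assay: round(odds_ratio(cases, 193, controls, 170), 1)
    for assay, cases, controls in TABLE_2_COUNTS
}
for assay, value in odds_ratios.items():
    print(assay, value)
```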

Discussion

In the current study, we identified CNV loci associated with breast cancer susceptibility. Our results, however, contrast with the study of Craddock et al. [10], who reported that there was no association between CNVs and breast cancer risk. This discrepancy is likely caused by differences in the array-CGH platforms and analytic tools used. Different calling algorithms in the analytic tools give substantially different quantities and qualities of CNV calls even when identical raw data are used as the input [21]. Differences in preprocessing, labeling, and hybridization protocols, which were performed according to the various manufacturers’ specifications, could contribute to the occurrence of false-negative and false-positive calls [22]. Therefore, comparison of data sets resulting from different platforms and/or different analytic tools will cause problems in association analysis and can create false association signals [21]. To evaluate a copy number exactly, it is necessary to follow up with a validation study using a different methodology such as real-time PCR [22].

In the current study, we found that the copy numbers of Hs03899300_cn, Hs03908783_cn, and Hs03898338_cn, which are located close to each other on 15q26.3, were similar by real-time PCR. The same was observed for Hs04093415_cn and Hs04090898_cn on 22q12.3. Furthermore, the copy numbers of six CNVs (Hs06535529_cn, Hs03899300_cn, Hs03908783_cn, Hs03898338_cn, Hs04090898_cn, and Hs04093415_cn) evaluated by digital PCR confirmed the accuracy of the data from the real-time PCR. Thus, false positives and negatives from the real-time PCR could be excluded. To our knowledge, this is the first report to show a distinct relation between CNVs and breast cancer risk.

Interestingly, 9 or more copies of Hs04093415_cn and 12 or more copies of Hs04090898_cn were observed only in patients with a history of breast cancer, and odds ratios for breast cancer susceptibility were 19.8 and 17.4, respectively. Such high odds ratios suggest a strong oncogenic effect in these regions. Because mutations of high-penetrance genes for breast cancer (BRCA1, BRCA2, TP53, and PTEN) were not tested and family history was not available in the present study, further studies are required to elucidate the association between the CNVs and hereditary breast cancer syndromes.

In the current study, some of the CNV regions related to breast cancer susceptibility contained genes such as EIF4G3 and PCSK6. Eukaryotic initiation factor 4 gamma 3 (EIF4G3) is a protein critical for initiation of protein translation [23]. To date, no relation of EIF4G3 with cancer development has been reported. We hypothesize that the decrease in the germline copy number of EIF4G3 may lead to a reduction or failure in translation of some transcripts and possibly give malignant potential to cells. Further examination will be required to test this hypothesis. Proprotein convertase subtilisin/kexin type 6 (PCSK6) is a member of the protease family of proprotein convertases that activate precursor proteins by cleaving at the specific recognition sequence RXK/RR [24]. The relation between PCSK6 expression and carcinogenesis is controversial. Some investigations reported that overexpression of PCSK6 in immortalized nontumorigenic or papilloma-derived keratinocytes increased their invasiveness [25], whereas other studies linked absent or reduced PCSK6 expression levels to ovarian cancer [26]. Regarding breast cancer, overexpression of prosegment ppPCSK6 resulted in significant enhancement in cell motility, migration, and invasion of collagen in vitro [27]. However, because the effect of the reduced copy number of PCSK6 on normal mammary gland cells has not yet been investigated, further examination will be required to understand the function of PCSK6 in the neoplastic process.

The fact that no genes were mapped to the rest of the CNV regions raises the question of how such CNVs affect breast cancer development. A possible explanation is that new gene transcripts may exist within the CNVs. Indeed, Diskin et al. found a new gene transcript related to neuroblastoma within the 1q21.1 CNV region where no known genes had been mapped [6]. Another hypothesis is that noncoding RNAs may be involved, such as long intergenic noncoding RNAs that regulate chromatin states and epigenetic inheritance, but knowledge of the molecular mechanisms of their function is still lacking [28]. Because the function of the CNVs is still unknown, further examinations will be required.

In summary, we found several unique CNVs associated with breast cancer. These CNVs may be feasible markers for assessing the risk of breast cancer. However, because the lifetime risk of developing breast cancer in Japan is 6 % [29], we cannot exclude the possibility that some women without a history of breast cancer will develop it in the future. Confirmatory studies using independent data sets are therefore needed to support our findings.

Acknowledgments

This work was supported by Grants-in-Aid for Science Research from the Ministry of Education, Culture, Sports, Science and Technology of Japan; the Ministry of Economy, Trade and Industry; and the Japan Science and Technology Agency.

Conflicts of interest

None

Copyright information

© The Author(s) 2012

Open Access This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.