20.1 Introduction

Oral cancer, predominantly oral squamous cell carcinoma (OSCC), is the sixth most common human cancer affecting over 300,000 people worldwide annually (Greenlee et al. 2000; Parkin et al. 2005). The American Cancer Society estimates that 35,310 new cases of oral cancer were diagnosed in 2008, and 7,590 people died from this disease (Oral cancer 2008). The main risk factors for oral cancer include tobacco use, alcohol consumption, and human papilloma virus infection. Despite the treatment advances in surgery, chemotherapy and radiotherapy, the survival rates for patients with oral cancer have not been significantly improved in the past few decades. The high mortality rate of oral cancer can be attributed to factors including limited understanding of the disease mechanism, lack of clinical tools for early cancer detection, and nonresponsiveness to therapeutic treatments.

OSCC evolves through a series of histopathological stages, including hyperplasia, dysplasia of varying degrees, carcinoma in situ, and eventually invasive SCC (Brinkman and Wong 2006). At the molecular level, the development of OSCC is initiated by chemical/biological insults to the normal oral epithelium by carcinogens such as tobacco and alcohol or by human papilloma virus infection. This results in increasing genetic instability and a number of genetic alterations such as loss of heterozygosity, gene inactivation by methylation, and gene amplification. Many regulatory proteins such as p16, p53, Rb, cyclin D1, epidermal growth factor receptor, and transforming growth factor-alpha are also aberrantly altered (Deshpande and Wong 2008). Alterations in these important signaling pathways promote cell proliferation, cell survival, and/or transformation capabilities, leading to the formation of preneoplastic lesions, subsequent loss of cellular organization, and eventually invasive penetration through the basement membrane. Nevertheless, our understanding of the related molecular mechanism is far from complete due to the cellular and molecular heterogeneity of OSCC development. Meanwhile, the fact that there are many additional genes potentially involved in oral carcinogenesis emphasizes the importance of studying gene alterations on a global scale by genomics and proteomics. In this chapter, we aim to summarize modern genomics and proteomics technologies that can significantly facilitate our understanding of the molecular events underlying the development of OSCC. Understanding the molecular and genetic alterations in the pathogenesis of OSCC will help elucidate the mechanisms involved in tumor formation as well as identify potential targets for improved treatment of OSCC.

20.2 Genomic Analysis at Chromosomal Level

Genetic aberrations at the chromosomal level can be analyzed using several high-throughput genomic techniques, including chromosome banding (also known as karyotyping), loss of heterozygosity (LOH), comparative genomic hybridization (CGH), digital karyotyping (Wang et al. 2002), fluorescence in situ hybridization, restriction landmark genome scanning (Imoto et al. 1994), and representational difference analysis (Lisitsyn and Wigler 1993). These analyses enable the identification of a broad range of chromosomal abnormalities in cancer.

20.2.1 Comparative Genomic Hybridization (CGH) − Systematic Copy Number Analysis

CGH technique was developed to detect gene copy-number changes (amplifications and deletions) between normal and neoplastic tissue or cells (Kallioniemi et al. 1992). In a typical CGH experiment, the DNA from test (disease) and reference (normal) samples are differentially labeled with different fluorescence dyes, and then cohybridized to the normal metaphase chromosomes to generate fluorescence ratios along the length of chromosomes. This ratio provides a cytogenetic representation of the relative DNA copy-number variation. CGH was the first effective tool to examine the entire genome for variations in DNA copy-number changes (Pinkel and Albertson 2005a,b). However, this earlier version metaphase chromosome-based CGH has a limited mapping resolution (∼20 Mb). Array-based CGH is the second generation CGH in which fluorescence ratios on arrayed DNA elements provide a locus-by-locus gene copy-number measure (Pinkel et al. 1998; Ishkanian et al. 2004). While this approach increases mapping resolution, most array-CGHs utilize large genomic clones (e.g., bacterial artificial chromosomes) which limit spatial sensitivity. Furthermore, using large genomic clones may also lead to reduced specificity as a result of their inclusion of repeats (e.g., Alu) and segments of extensive sequence similarity (e.g., pseudogenes) (Mantripragada et al. 2004). Recently, several novel CGH platforms have become available with the completion of the human genome sequence. These include cDNA array-based CGH (Pollack et al. 1999; Zhou et al. 2004a), oligonucleotide array-based CGH (Brennan et al. 2004; Lucito et al. 2003), tiling array-based CGH (Ishkanian et al. 2004), and copy number analysis using high-density SNP microarrays (Bignell et al. 2004; Zhao et al. 2004, 2005; Zhou et al. 2004c). The tiling array and the SNP array-based approaches have drawn more attention due to their remarkable mapping resolution. 
Tiling arrays have the potential to detect small chromosomal gains and losses (resolution ∼40 kb, almost at single gene level) that might be overlooked by marker-based arrays (Ishkanian et al. 2004; Davies et al. 2005). We envision that in the near future, it will be possible to survey copy number changes at bp resolution using tiling arrays that contain billions of overlapping probes covering the entire genome. The SNP array-based CGH provides the unique advantage of combining CGH and LOH analysis in a single experiment, which will be discussed in the following section (Zhao et al. 2004; Zhou et al. 2004c).
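The fluorescence-ratio idea behind CGH can be illustrated with a minimal sketch. The intensities, the log2 transformation, and the ±0.3 calling thresholds below are illustrative assumptions rather than values taken from the studies cited above; real array-CGH pipelines add normalization and segmentation steps that are omitted here.

```python
import math

def log2_ratios(test_signal, reference_signal):
    """Return log2(test / reference) for each arrayed locus."""
    return [math.log2(t / r) for t, r in zip(test_signal, reference_signal)]

def call_copy_number(log_ratio, gain_threshold=0.3, loss_threshold=-0.3):
    """Classify one locus; the thresholds are hypothetical cut-offs."""
    if log_ratio >= gain_threshold:
        return "gain"
    if log_ratio <= loss_threshold:
        return "loss"
    return "neutral"

# Hypothetical fluorescence intensities for four loci
# (test = tumor DNA, reference = normal DNA).
test_signal = [1000.0, 2100.0, 480.0, 990.0]
reference_signal = [1000.0, 1000.0, 1000.0, 1000.0]

cgh_ratios = log2_ratios(test_signal, reference_signal)
cgh_calls = [call_copy_number(r) for r in cgh_ratios]
print(cgh_calls)  # ['neutral', 'gain', 'loss', 'neutral']
```

A ratio near zero on the log2 scale indicates equal copy number in the two samples, whereas positive and negative values suggest amplification and deletion, respectively.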

20.2.2 Loss of Heterozygosity (LOH) − Systematic Allelic Imbalance Analysis

Chromosomal aberrations such as allelic losses, which are caused by mitotic recombination, gene conversion, or nondisjunction, cannot be detected by CGH. These allelic imbalances can be detected based on LOH at polymorphic loci. This approach is based on the Knudson two-hit hypothesis (Knudson 1971, 1996) for tumor-suppressor genes. Examples include the discovery of the first tumor suppressor gene, RB1 (Friend et al. 1986), in which a recessive mutation was discovered in one allele, and the loss of the other wild-type allele was detected by LOH. Traditionally, polymorphic markers (e.g., restriction fragment length polymorphisms [RFLPs] and microsatellite markers) have been employed for LOH analysis (Vogelstein et al. 1989). However, these approaches are labor intensive and require large amounts of DNA, allowing only a modest number of markers to be screened. The recent advances in human genome projects have led to the identification of millions of SNP loci (http://www.ncbi.nlm.nih.gov/SNP/), which makes them ideal markers for various genetic analyses, including LOH. SNP markers have significant advantages over RFLPs and microsatellite markers in terms of abundance, spacing, and stability across the genome. Several high-density SNP arrays have recently been developed to support large-scale high-throughput SNP genotyping (Wang et al. 1998). The LOH patterns generated by SNP array analysis have a high degree of concordance with previous microsatellite analyses of the same cancer samples (Lindblad-Toh et al. 2000), and have been utilized in a number of studies for the molecular classification of various types of cancers (Zhou et al. 2004b,c; Janne et al. 2004; Wang et al. 2004; Hoque et al. 2003; Lieberfarb et al. 2003). One unique advantage of this SNP array-based approach is that the intensity of sample hybridization to the array probes can also be used to infer copy number changes (similar to CGH) (Bignell et al. 2004; Zhao et al. 2004; Zhou et al. 2004c).
This unique feature has been explored by algorithms implemented in several independent bioinformatics/statistical software packages, including dChipSNP (Zhao et al. 2004), Copy Number Analysis Tool (Huang et al. 2004), and FASeg (Yu et al. 2007). Based on these novel data analysis tools, we are able to perform concurrent copy number analysis and LOH analysis with a single experiment (Zhou et al. 2004c).
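The logic of an LOH call at polymorphic loci can be sketched as follows. This is a simplified illustration that assumes paired normal and tumor genotypes written as two-letter strings ("AA", "AB", "BB"); it is not the algorithm used by dChipSNP or the other packages named above, and the genotypes are made up.

```python
def loh_calls(normal_genotypes, tumor_genotypes):
    """Classify each SNP as 'LOH', 'retained', or 'uninformative'.

    Only SNPs that are heterozygous in the normal sample are informative:
    a homozygous normal genotype cannot reveal the loss of one allele.
    """
    calls = []
    for normal, tumor in zip(normal_genotypes, tumor_genotypes):
        if normal[0] == normal[1]:
            calls.append("uninformative")
        elif tumor[0] == tumor[1]:
            calls.append("LOH")       # heterozygous in normal, homozygous in tumor
        else:
            calls.append("retained")  # heterozygosity preserved in tumor
    return calls

# Hypothetical genotypes at five consecutive SNP loci.
normal_sample = ["AB", "AA", "AB", "AB", "BB"]
tumor_sample = ["AA", "AA", "AB", "BB", "BB"]

snp_calls = loh_calls(normal_sample, tumor_sample)
print(snp_calls)
```

Because only heterozygous loci are informative, the density of SNP markers directly determines how finely a region of allelic loss can be mapped.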

20.2.3 Cytogenetic-Based Approaches − The Chromosome Staining Techniques

Cytogenetics is a set of relatively old techniques which are based on the chromosome-banding method introduced in 1969 (Caspersson et al. 1969a,b). One major drawback of these approaches is the requirement of in vitro culture and metaphase preparation of the cells. Nevertheless, cytogenetic approaches will always have their place in genomic studies because they provide direct visualization of chromosomal abnormalities. Furthermore, these cytogenetic techniques complement the high-throughput techniques (e.g., CGH and LOH) by providing information on chromosomal structural rearrangements that are not readily resolved by DNA copy number analyses. For example, translocations are common genomic abnormalities in cancer (Futreal et al. 2004), but they cannot be detected by CGH or LOH. An experienced cytogeneticist, however, can easily detect many forms of chromosomal translocations using classical cytogenetic techniques, such as chromosome banding (also known as karyotyping). A typical karyotyping analysis involves blocking the cells in mitosis, staining the chromosomes with Giemsa dye (which stains AT-rich regions of chromosomes and produces dark bands), and visualizing them under a light microscope. Karyotype analyses are performed as part of standard clinical tests for prenatal and postnatal screening, as well as for the diagnosis of specific types of cancers (e.g., hematological malignancies). However, many cancer cells have complex karyotypes that are difficult to interpret based on standard karyotyping images. Recently, a number of new cytogenetic labeling methods have been developed, including spectral karyotyping, multicolor fluorescence in situ hybridization, cross-species color banding, and multicolor chromosome banding. These techniques permit the simultaneous visualization of all chromosomes in different colors, and thus considerably improve the detection of translocations or deletions. With the introduction of these techniques (Schrock et al. 1996; Liyanage et al. 1996; Speicher et al. 1996), the comprehensive analysis of complex chromosomal rearrangements present in tumor karyotypes has been greatly improved.

20.3 Transcriptome Profiling Techniques

The genome-wide transcriptome profiling became a reality when several epoch-making genomic techniques were introduced, including DNA microarray and Serial Analysis of Gene Expression (SAGE). Both DNA microarray and SAGE are powerful tools targeted at global gene expression. While the microarray technology requires prior knowledge of the sequence of the genes to be analyzed, SAGE technology can analyze gene expression in organisms with uncharacterized genomes. The obvious advantage of the microarray is the ability to measure gene expression in cell and tissue samples, and commercial platforms (e.g., GeneChip from Affymetrix, Inc.) are available for flexible research design. With the Human Genome Project completed at the beginning of the new millennium, the microarray takes center stage in investigating genome-wide gene expression in all aspects of human cancer.

A DNA microarray is typically a small solid support (e.g., a glass microscope slide) on which known sequences of tens of thousands of genes are immobilized. Commonly used immobilization methods include “ink-jet” printing, pin-spotting, and direct synthesis. The parallel presence of so many genes (often covering the whole genome of an organism) on a single microarray has allowed genomic studies to be performed in a high-throughput fashion. For example, expression changes of all genes on a whole genome can be monitored simultaneously. Doing this “one gene at a time” would be unthinkable.

20.3.1 Transcriptome Profiling-Based Analysis on OSCC Metastasis – A Promising Example

Microarray expression profiling has proven to be a powerful approach for characterizing the genome-wide expression changes associated with progression of OSCC, such as metastasis. To gain a better understanding of the underlying molecular biological processes that dictate the observed expression changes, it may be more fruitful to focus on a higher level of biological information (e.g., alterations of groups of genes, certain pathways, or biological processes in lymph node metastasis of OSCC), rather than focusing on specific genes. The following sections highlight the recent advances in the understanding of OSCC metastasis based on this genome-wide systems biology approach.

20.3.1.1 Matrix Metalloproteases and Tissue Inhibitors of Metalloproteases

Hyperactivation of matrix metalloproteases (MMPs) is a hallmark of invasive cancers for which it constitutes a mechanistic prerequisite for the degradation of the basement membrane and extracellular matrix (ECM) thus allowing tumor cells to leave the primary tumor site and enter blood or lymphatic vessels for dissemination (Bachmeier et al. 2005). The role of MMPs in metastasis of OSCCs is well established (de Vicente et al. 2005a,b; 2007; Kato et al. 2005; Kim et al. 2006; Lyons and Jones 2007; Patel et al. 2005; Roy et al. 2007; Ziober et al. 2006) and highlighted by a number of transcriptome profiling studies (Chiang et al. 2008; Lin et al. 2004; Ito et al. 2003; Kashiwazaki et al. 2008; Kondoh et al. 2008; Nagata et al. 2003; Roepman et al. 2005; Zhou et al. 2006).

The biological activities of MMPs are regulated by four endogenous protease inhibitors of MMP: TIMP1, TIMP2, TIMP3, and TIMP4 (Jiang et al. 2002). Since specific MMPs can promote cancer progression, it is reasonable to hypothesize that high levels of endogenous TIMPs would prevent cancer progression, and consequently, tumors with high TIMP levels would have a better prognosis than those with low TIMP levels. In OSCCs, several studies have shown that elevated expression of TIMPs, especially TIMP1 and TIMP2, in metastatic carcinoma is associated with a good prognosis (Baker et al. 2006; Ikebe et al. 1999; Kurahara et al. 1999). However, contradictory findings have also been reported (de Vicente et al. 2005a; Nakamura et al. 2005; Katayama et al. 1984). These discrepancies may arise because TIMPs are multifunctional proteins whose effects on tumor progression are context- and concentration-dependent. Therefore, further studies are warranted to fully explore the roles of TIMPs in OSCC metastasis.

The expression of MMPs and their endogenous inhibitors is regulated by a variety of cytokines, growth factors, and transcription factors that participate in tissue remodeling. TIMP2 is an essential factor for efficient activation of pro-MMP2. TIMP2 accomplishes this activation by acting as a bridge between MMP2 and membrane type 1-MMP (MT1-MMP) on the cell membrane. This trimolecular complex allows a second MT1-MMP molecule to cleave the pro-domain of MMP2 (Lander et al. 2001). In two independent studies, MT1-MMP was found to be up-regulated by laminin 5 (Yamamoto et al. 1986) and E1AF (an ets-oncogene family transcription factor) (Izumiyama et al. 2005) and therefore activated the expression of MMP2 in tumor cells. Using a xenograft model, Miyazaki et al. (2008) demonstrated that both MMP2 and MMP9 levels were increased under hypoxic conditions. MMP2 was predominantly expressed in the hypoxic region of tumor tissue, while MMP9 was mainly detected in neighboring stromal tissues containing blood vessels. Interestingly, the actions of MMPs and their inhibitors also depend on their concentrations. Baker et al. (2006) reported that tissue concentrations of a subset of these factors correlated with tumor progression, suggesting that it is the balance between MMPs and their corresponding TIMPs that controls tissue degradation at each stage of tumor invasion and metastasis. These findings support the hypothesis that specific TIMPs, under specific conditions and at concentrations found in vivo, may play a role in promoting rather than inhibiting cancer progression. Additional studies are currently underway to investigate the role of the MMP/TIMP system in tumor invasion and metastasis.

20.3.1.2 Urinary Plasminogen Activator and its Receptor

Recently, the critical roles of urinary plasminogen activator (PLAU) (also known as urokinase-type plasminogen activator, uPA) in ECM remodeling, tumor invasion and metastasis have become evident. PLAU is a serine protease that binds to a surface-anchored receptor (PLAUR) (also known as uPAR), which localizes its proteolytic activity to the pericellular milieu. Furthermore, PLAU and PLAUR interact with a number of transmembrane proteins to regulate multiple signal transduction pathways and influence a wide variety of cellular behaviors, including cell adhesion, migration, chemotaxis and tissue remodeling (Blasi 1996, 1997; Shi and Stack 2007). The PLAUR expression levels in tumor tissues and their prognostic values have been studied in a number of cancer types, including breast (Han et al. 2005), lung (Volm et al. 1999), prostate (Shariat et al. 2007), ovarian (Begum et al. 2004) and colorectal cancers (Seetoo et al. 2003). Results from these studies have shown that elevated PLAUR expression correlates with poor prognosis, thereby making it a potential biomarker for molecular classification of cancers. Enhanced expression of PLAU and PLAUR has also been found in OSCC, and correlates with tumor differentiation grade, lymph node metastasis and prognosis (Shi and Stack 2007; Bacchiocchi et al. 2008; Baker et al. 2007; Hundsdorfer et al. 2005; Li et al. 2006). A functional study has shown that silencing the endogenous PLAUR expression in highly malignant OSCC cells resulted in a dramatic reduction of tumor cell proliferation, adhesion, migration and invasion in vitro (Weng et al. 2008). Recently, genome-wide profiling studies have identified PLAU as a strong biomarker for predicting poor disease outcome of OSCC using a “gene signature” approach (Ziober et al. 2006; Nagata et al. 2003). Furthermore, Ghosh et al. demonstrated that PLAU expression and PLAUR relocalization are regulated by the α3β1 integrin-activated Src/MEK/ERK signaling pathway in oral keratinocytes (Ghosh et al. 2000). Conversely, blocking the PLAUR-α3β1 integrin interaction results in significant inhibition of PLAU expression, suggesting the functional relevance of the PLAUR-α3β1 integrin association in protease regulatory pathways (Ghosh et al. 2006). Collectively, these studies implicate the PLAU-PLAUR system as an important contributor to invasion and metastasis of OSCC.

20.3.1.3 SDF-1/CXCR4 Signaling Axis

It has been demonstrated that the G-protein-coupled seven-span transmembrane receptor CXCR4 is expressed in numerous types of embryonic cells and that the α-chemokine stromal-derived factor 1 (SDF-1) has chemoattractant effects on these cells (Knaut et al. 2003; McGrath et al. 1999; Rehimi et al. 2008). Animal models in which SDF-1/CXCR4 signaling has been interrupted exhibit a number of phenotypes that can be explained by inhibition of SDF-1-mediated chemoattraction of stem/progenitor cells (Ma et al. 1998; Mizuno et al. 1994; Tachibana et al. 1998; Zou et al. 1998). Furthermore, the expression patterns for both SDF-1 and CXCR4 are highly consistent with the possibility that they have shifted developmental patterns in the formation of many different tissues. These observations suggest a crucial role for the SDF-1/CXCR4 signaling axis in regulating the migration of different types of stem/progenitor cells. It is believed that cancer stem cells, much like normal stem/progenitor cells, can give rise to tumor cells in primary tumors and can also metastasize to seed tumors in a second site. In this case, one may postulate that the SDF-1/CXCR4 signaling axis may influence the biology of tumors and direct the metastasis of CXCR4-expressing tumor cells by chemoattracting them to organs that express high levels of SDF-1 (e.g., lung, liver, bones, and lymph nodes). Supporting this notion, it has been recently reported that several CXCR4-expressing cancers, including breast, prostate, ovarian cancer and neuroblastoma (Geminder et al. 2001; Gerber et al. 2003; Muller et al. 2001; Kikuchi et al. 2003), metastasize to specific organs in an SDF-1-dependent manner. The role of the SDF-1/CXCR4 signaling axis in lymphatic metastasis of OSCC has also been investigated in the past several years (Almofti et al. 2004; Delilbasi et al. 2004; Ishikawa et al. 2006; Oliveira-Neto et al. 2008; Onoue et al. 2006; Uchida et al. 2003, 2004, 2007).
SDF-1α expression was detected mainly in the stromal cells, but also occasionally in the tumor cells metastasized to the regional lymph nodes (Uchida et al. 2003). CXCR4 expression in metastatic cancer tissues was significantly higher than that in nonmetastatic cancer tissues, and its expression was strongly associated with invasion, recurrence, and lymph node metastasis. Additional studies have shown that SDF-1α rapidly activates extracellular signal-regulated kinase (ERK) 1/2, Akt/protein kinase B (PKB) and Src family kinases in CXCR4-expressing cancer cells (Onoue et al. 2006; Uchida et al. 2003). More importantly, recombinant SDF-1α stimulates in vitro invasiveness and scattering in CXCR4-expressing OSCC cells, and induces metastasis of these cells to the cervical lymph node in an orthotopic nude mice model (Uchida et al. 2004). Taken together, these results indicate that SDF-1/CXCR4 signaling mediates the lymph node metastasis in OSCC via ERK1/2 and/or Akt/PKB pathway.

20.3.1.4 Epithelial-Mesenchymal Transition (EMT)

EMT, in which epithelial cells lose their polarity and become motile mesenchymal cells, occurs during the development process and is also a key step in tumor progression towards metastasis, including metastasis of OSCC (Onoue et al. 2006; Kudo et al. 2006; Takayama et al. 2009; Takkunen et al. 2008). Accumulating evidence indicates that EMT can contribute to metastasis by changing the adhesive properties of tumor cells and promoting their motility, thereby increasing their invasiveness. More strikingly, a variety of EMT markers, including downregulation of Cadherin 1 (CDH1) (also known as E-cadherin) and cytokeratin (Onoue et al. 2006), increased expression of Cadherin 2 (CDH2) (also known as N-cadherin) (Pyo et al. 2007), MMPs (Kashiwazaki et al. 2008; Roepman et al. 2005; Zhou et al. 2006; Higashikawa et al. 2008) and transcription factors such as Snail 1 (Snail) (Sun et al. 2008), SIP1/ZEB2 (Maeda et al. 2005), NF-κB (Hu et al. 2007), and occludins (Bello et al. 2008), have also been found in lymph-node metastatic OSCC cells. Moreover, cadherin switching, which has been known to play a central role in the EMT, has also been implicated in OSCC metastasis (Pyo et al. 2007). The presence of the EMT markers in tumor tissues indicates an important role for EMT in promoting invasion and metastasis of OSCCs.

The continuously evolving microarray technology has the potential to revolutionize the clinical practice. Physicians in the future may be empowered with a handheld device to monitor health status in real time during a routine physical examination to detect any health problems at an early stage and even to suggest the best treatment options based on the characteristics of an individual’s genome. With the powerful microarray technology for molecular profiling, we are witnessing the beginning of the personalized medicine era.

20.4 Small RNA and MicroRNA Profiling Technologies

MicroRNAs are newly recognized, non-coding, regulatory RNA molecules, about 22 nucleotides in length. It is estimated that the human genome has approximately 800–1,000 microRNAs (Bentwich et al. 2005). While not involved directly in protein coding, microRNAs are believed to control the expression of more than one third of the protein-coding genes in the human genome (Lewis et al. 2003, 2005; Xie et al. 2005). Each microRNA can target and regulate the mRNA transcripts of hundreds of genes. One microRNA can have multiple target sites in the mRNA transcript of a gene, while one mRNA can be targeted by multiple microRNAs. Therefore, microRNAs act as a newly recognized level of regulation of gene expression. They are pivotal regulators of diverse cellular processes including proliferation, differentiation, apoptosis, survival, motility, and morphogenesis. High-throughput microRNA profiling is a technical challenge. The short length of microRNAs renders many conventional tools ineffective: very small RNA molecules are difficult to amplify or label reliably without bias. There are three common approaches for microRNA profiling: hybridization-based methods, PCR-based detection, and cloning methods. Here, we provide an overview of these technologies and highlight the recent progress in microRNA profiling.

20.4.1 Hybridization-Based MicroRNA Profiling − Microarray

The common hybridization based microRNA detection methods include Northern blotting, in situ hybridization, bead-based flow-cytometry, and more recently, microarray. The majority of the published studies reporting microRNA profiling analysis were performed using different microarray technologies. The differences in these microarray platforms are mainly in their probe design, probe immobilization chemistry, sample labeling, and signal detection methods (see (Yin et al. 2008) for comprehensive review on array-based microRNA profiling). Similar to the early mRNA microarrays, most of the early stage microRNA arrays were custom made. With the recent introduction of several commercially available microRNA array platforms, the study design and data analysis became more streamlined.

While the currently available commercial microRNA arrays make profiling studies on microRNA much easier for biomedical investigators, new developments in the biotech field have emerged as potential opportunities to further improve the microRNA microarrays. The locked nucleic acid (LNA) has recently emerged as a popular tool in various biological and biomedical studies due to its high affinity and specificity to the complementary RNA. LNA is a conformational analogue of the RNA molecule that contains at least one LNA monomer. The unprecedented thermal stability between LNA molecules and their target RNAs enables visualization of microRNA by in situ hybridization. In addition, the LNA molecules are metabolically highly stable, which makes them ideal tools for novel therapeutic approaches that target cancer-associated microRNAs. The LNA-based probes have also been used in the design of microarrays for microRNA profiling (Castoldi et al. 2006, 2007, 2008), which appears to improve mismatch discrimination. This array has recently been used in studying several malignancies, including chronic myeloid leukemia and breast cancer (Venturini et al. 2007; Sempere et al. 2007). Stenvang et al. provided a comprehensive review of recent advances in LNA-based microRNA detection in cancer (Stenvang et al. 2008).

Other attempts to improve the microarray-based microRNA profiling include the RNA-primed array-based Klenow enzyme (RAKE) assay (Nelson et al. 2004) and the modified versions of the RAKE assay (Berezikov et al. 2006). The RAKE assay is based on the ability of an RNA molecule to function as a primer for Klenow polymerase-dependent extension when fully base-paired with a single-stranded DNA molecule. Combined with microarray technology, the RAKE assay appears to provide better specificity than other conventional microarray platforms. It has been reported that with the RAKE assay, microRNAs isolated from formalin-fixed paraffin-embedded tissue can be used to generate optimal quality microRNA profiles (Nelson et al. 2004, 2006), which opens new opportunities for analyses of small RNAs from archival clinical tissue samples.

20.4.2 Quantitative Real-Time PCR (qRT-PCR)-Based MicroRNA Profiling

While the microRNA microarrays described above provide excellent throughput and high coverage, these methods do not amplify the microRNA and thus often compromise the sensitivity. The qRT-PCR technology provides unparalleled sensitivity and specificity. However, it is technically challenging to amplify and quantify mature microRNA because the mature microRNA is only around 22 nucleotides in length, roughly the size of a typical PCR primer. Therefore, earlier versions of qRT-PCR assays are usually designed to quantify microRNA precursors. While the relative level of most mature microRNAs may be projected based on the level of corresponding precursors, additional tests will be needed to ensure that the levels of the mature microRNAs are reflected by the level of their precursors.

Recently, the second generation of qRT-PCR assays has been developed to directly quantify the mature microRNA. These assays typically incorporate a target-specific stem-loop reverse transcription primer. This innovative design addresses a fundamental challenge in microRNA quantification: the short length of mature microRNAs (∼22 nucleotides). The stem-loop structure provides specificity for the mature microRNA target and forms an RT primer/mature microRNA chimera that extends the 3′ end of the microRNA. The resulting longer RT product presents a template amenable to standard real-time PCR-based quantification using TaqMan Assays. These qRT-PCR assays are now commercially available (e.g., TaqMan MicroRNA Assay from Applied Biosystems). To improve the throughput, these qRT-PCR assays have been packaged into convenient, pre-configured microfluidic cards that contain up to 384 unique TaqMan assays and are compatible with most of the common qPCR instruments.

20.4.3 Cloning and Deep Sequencing-Based MicroRNA Profiling

The microRNA profiling methods described above rely on primers or probes designed to detect known microRNAs. They can only detect microRNA species that were previously identified by sequencing or homology search. Moreover, the huge range of microRNA levels, from tens of thousands to just a few molecules per cell, complicates the detection of microRNAs expressed at low copy numbers. Therefore, many undetected microRNAs may exist even in well-explored species. The cloning and deep sequencing-based microRNA profiling approach allows both the quantification of expression levels and the identification of new microRNAs at high speed and sensitivity and at low cost.

This approach was developed by combining aspects of microRNA cloning and SAGE technology, which led to its original name − miRAGE (Cummins et al. 2006). Similar to traditional cloning approaches, miRAGE starts with the isolation of 18- to 26-base RNA molecules, to which specialized linkers are ligated and which are then reverse-transcribed into cDNA. However, subsequent steps, including amplification of the complex mixture of cDNAs using PCR, tag purification, concatenation, cloning, and sequencing, are performed using SAGE methodology optimized for small RNA species.

SAGE was originally designed to characterize gene expression profiles. It has a potential to be a high-throughput gene expression profiling tool. Over the years, much improvement has been made to increase sequencing efficiency and reduce input RNA amount requirement (Datson 2008; Matsumura et al. 2008; So et al. 2004; de Hoon and Hayashizaki 2008; Torres et al. 2008; Hene et al. 2007). Although it is not as popular as microarrays and qRT-PCR due to technological and economical challenges, this technology has the unique advantage of combining discovery and quantification. The introduced “next-generation” sequencing technologies, such as massively parallel signature sequencing and more recently the Roche/454 and Illumina’s GAII systems, offer inexpensive increases in throughput. With the added depth of sequencing now possible, we have an opportunity to identify low abundance microRNAs or those exhibiting modest expression differences between samples, which may not be detected by hybridization-based or qRT-PCR-based methods. The continuation of advances in the sequencing technologies, coupled with the unique features of microRNA (e.g., short length, difficult to amplify and label without introducing bias), tends to suggest that deep sequencing may be the optimal approach for high-throughput profiling of microRNA (and other small RNA).
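The combination of discovery and quantification offered by sequencing can be illustrated with a small sketch. It assumes that adapter-trimmed reads are already available as plain strings and that known microRNAs are supplied as a sequence-to-name lookup; the sequences, the name `known_miR_1`, and the reads-per-million normalization are illustrative assumptions, and only the 18- to 26-base size window comes from the miRAGE description above.

```python
from collections import Counter

def profile_small_rnas(reads, known_mirnas, min_len=18, max_len=26):
    """Count reads per unique sequence and flag sequences not yet annotated.

    Returns {sequence: (label, raw_count, reads_per_million)}.
    """
    sized = [r for r in reads if min_len <= len(r) <= max_len]
    total = len(sized)
    profile = {}
    for seq, count in Counter(sized).items():
        label = known_mirnas.get(seq, "novel_candidate")
        profile[seq] = (label, count, count * 1_000_000 / total)
    return profile

# Hypothetical 22-nt sequences; the annotation name is made up for illustration.
seq_known = "TAGCTTATCAGACTGATGTTGA"
seq_unknown = "ACGTACGTACGTACGTACGTAC"
too_short = "ACGTACGT"  # outside the 18- to 26-base window, so it is discarded

reads = [seq_known] * 3 + [seq_unknown] + [too_short] * 2
rna_profile = profile_small_rnas(reads, {seq_known: "known_miR_1"})

for seq, (label, count, rpm) in rna_profile.items():
    print(seq, label, count, rpm)
```

Unlike a probe-based assay, the unannotated sequence is still counted and reported, which is what allows new microRNA candidates to be identified in the same experiment.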

20.5 Mass Spectrometry-Based Proteomics

Proteomics is a novel molecular technology that may significantly accelerate oral cancer research. Cellular functions are mainly performed by proteins, and the majority of anticancer drugs target proteins. A promising application of oral cancer proteomics is to reveal key target proteins and signaling pathways underlying the development of oral cancer. Such studies may also identify novel therapeutic targets and discover protein biomarkers for cancer diagnosis and prognosis.

Modern proteomics is primarily driven by mass spectrometry (MS), an exquisite analytical technology that measures the mass-to-charge ratio of ionized molecules. Early MS-based proteomics studies focused mostly on the identification of proteins of interest. This can be done using either peptide mass fingerprinting (PMF) or tandem MS (MS/MS). In PMF, an isolated, unknown protein is cleaved using a proteolytic enzyme and the resulting peptides are usually measured by matrix-assisted laser desorption/ionization with time-of-flight MS (MALDI-TOF MS) (Pappin et al. 1993). The premise of PMF is that every unique protein will have a unique set of peptides and hence unique peptide masses. Identification is accomplished by matching the observed peptide masses to the theoretical masses derived from a sequence database. This technique is well suited for identification of proteins in two-dimensional gel spots, where the protein purity is high. However, PMF protein identification can run into difficulties with a mixture of proteins, which typically requires the use of tandem MS to achieve confident identification.
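
The matching step in PMF can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the scoring scheme of any particular search engine: trypsin is assumed as the protease (cleaving after K or R, but not before P), no missed cleavages or modifications are considered, the score is a plain count of matched masses, and the two database entries are hypothetical toy sequences. The residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONO[aa] for aa in peptide) + WATER


def pmf_score(observed_masses, sequence, tol=0.2):
    """Number of observed masses within tol (Da) of any theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed_masses)


def best_match(observed_masses, database, tol=0.2):
    """Name of the database entry explaining the most observed masses."""
    return max(database, key=lambda name: pmf_score(observed_masses, database[name], tol))


# Hypothetical toy database; the sequences are invented for illustration.
TOY_DATABASE = {
    "toy_protein_A": "MAGLKTPEVRSSWYK",
    "toy_protein_B": "MCDEFKGHILRNPQK",
}

if __name__ == "__main__":
    observed = [518.29, 600.32, 669.31]  # simulated neutral peptide masses
    print(best_match(observed, TOY_DATABASE))
```

The sketch also makes the limitation noted above concrete: if the observed masses came from a mixture, several entries would each explain only part of the peak list, and a simple count of matches would no longer single out one protein.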

Tandem MS, also known as MS/MS, involves multiple stages of MS analysis, with some form of fragmentation occurring between the stages (Aebersold and Mann 2003). It can be done using physically separated mass analyzers with a collision cell between them for molecule fragmentation. For example, one mass analyzer can isolate a single peptide ion from the many entering the mass spectrometer. The peptide is then broken into smaller fragments in the collision cell by collision-induced dissociation (CID), and a second mass analyzer measures the fragments produced from the peptide precursor. Tandem MS can also be done using ion trap or Fourier transform ion cyclotron resonance mass spectrometers, where precursor or fragment ions are trapped in a single mass analyzer with multiple MS steps taking place over time. Similar to the PMF approach, the masses obtained from tandem MS (including precursor and fragment ions) are then compared in silico with either a proteome or genome database to find the best-matched protein. This is achieved by using a computer program for database searching (e.g., Mascot or Sequest), which calculates the masses of theoretical peptides and fragments from each protein in the database and then compares the corresponding masses of the unknown protein with those of each protein in the database. Currently, large-scale identification of proteins in a specific proteome (e.g., plasma or saliva proteomes) mainly relies on tandem MS and database-searching algorithms for peptide and protein identification (Hu et al. 2005, 2006; Denny et al. 2008).
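
The theoretical fragment masses that a database-searching program compares against an MS/MS spectrum can be computed as in the following Python sketch. It assumes singly charged b- and y-type ions, which are the fragment types commonly produced by CID, unmodified residues, and a simple peak-counting comparison; the function names are hypothetical, and programs such as Mascot or Sequest use considerably more elaborate scoring.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as m/z lists for singly charged fragments.

    b_ions[i] is b(i+1), an N-terminal fragment; y_ions[i] is y(i+1), a C-terminal one.
    """
    masses = [MONO[aa] for aa in peptide]
    b_ions, running = [], 0.0
    for m in masses[:-1]:
        running += m
        b_ions.append(running + PROTON)
    y_ions, running = [], WATER
    for m in reversed(masses[1:]):
        running += m
        y_ions.append(running + PROTON)
    return b_ions, y_ions


def count_matched_peaks(observed_mz, peptide, tol=0.5):
    """Count observed peaks lying within tol (m/z units) of any theoretical b or y ion."""
    b_ions, y_ions = fragment_ions(peptide)
    theoretical = b_ions + y_ions
    return sum(any(abs(mz - t) <= tol for t in theoretical) for mz in observed_mz)


if __name__ == "__main__":
    b, y = fragment_ions("PEPTIDE")
    print([round(x, 4) for x in b])
    print([round(x, 4) for x in y])
```

A search program would rank all candidate peptides whose precursor mass matches the isolated ion by how well such theoretical fragment lists explain the observed spectrum.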

As proteomics tools evolve, quantitative analysis/profiling of proteins in defined biological or disease samples (quantitative proteomics) has become a central application of proteomics. A commonly used quantitative proteomics approach is based on stable isotope labeling of proteins/peptides, followed by tandem MS to compare the relative abundance of the proteins in different samples. Stable isotope labeling with amino acids in cell culture (SILAC) is a straightforward approach for in vivo incorporation of isotope tags into cellular proteins for MS-based quantitative proteomics. The method relies on metabolic incorporation of amino acids with substituted stable isotopes (e.g., 13C, 15N), and is particularly useful when studying cell line models (Ong et al. 2002). For clinical samples such as tissue or body fluids from patients, quantitative proteomic analysis can be performed using tandem MS coupled with stable isotope labeling techniques such as isotope-coded affinity tagging (ICAT) (Gygi et al. 1999), isobaric tags for relative and absolute quantitation (iTRAQ) (Ross et al. 2004), isotope-coded protein labeling (Schmidt et al. 2005), or proteolytic 18O labeling (Miyagi and Rao 2007). These isotope tags label either proteins or proteolytic peptides from different samples for comparative analysis. If the intact proteins are tagged (e.g., ICAT), the labeled samples are subsequently combined, digested with trypsin, and then analyzed with LC-MS/MS. Because isotope-labeled peptide pairs are chemically identical, they coelute during LC separation. Relative quantitation is determined from the ratio of ion intensities of coeluting isotope-labeled peptides in the MS survey scan, which reflects the ratio between the parent proteins in the starting samples. Meanwhile, MS/MS analysis of the peptides allows identification of the protein by sequence database searching.
If the proteolytic peptides are labeled with tandem mass tags, then both quantitation and identification rely on MS/MS spectra. For instance, iTRAQ utilizes isobaric tags that are cleaved during CID to yield an isotope series (reporter ions) representing the quantity of a peptide from different samples. Because the peptide remains attached to the isobaric tags until CID is conducted, the resulting MS/MS spectrum allows for simultaneous identification (based on fragment ions) and quantitation (based on reporter ions) of the peptide.
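
The two quantitation strategies can be contrasted in a short Python sketch. It rests on several assumptions that are not part of the original text: the peptide-pair intensities and the spectrum are invented toy data, the protein ratio is summarized as the median of its peptide ratios (one common choice among several), and the reporter-ion m/z values are the approximate figures commonly quoted for 4-plex iTRAQ (about 114.1, 115.1, 116.1, and 117.1). The function names are hypothetical.

```python
from statistics import median

# Approximate 4-plex iTRAQ reporter-ion m/z values (assumed, commonly quoted figures).
REPORTER_MZ = {"114": 114.1112, "115": 115.1083, "116": 116.1116, "117": 117.1150}


def protein_ratio_from_pairs(pairs):
    """Survey-scan quantitation: median heavy/light ratio over coeluting peptide pairs.

    pairs is a list of (light_intensity, heavy_intensity) tuples for one protein.
    """
    ratios = [heavy / light for light, heavy in pairs if light > 0]
    if not ratios:
        raise ValueError("no usable peptide pairs")
    return median(ratios)


def reporter_intensities(spectrum, tol=0.05):
    """MS/MS quantitation: sum peak intensity within tol of each reporter m/z.

    spectrum is a list of (mz, intensity) tuples from one MS/MS scan.
    """
    totals = {channel: 0.0 for channel in REPORTER_MZ}
    for mz, intensity in spectrum:
        for channel, ref_mz in REPORTER_MZ.items():
            if abs(mz - ref_mz) <= tol:
                totals[channel] += intensity
    return totals


def relative_ratios(totals, reference="114"):
    """Express each reporter channel relative to the chosen reference channel."""
    ref = totals[reference]
    if ref <= 0:
        raise ValueError("reference channel has no signal")
    return {channel: value / ref for channel, value in totals.items()}


if __name__ == "__main__":
    # Toy data: three peptide pairs from one protein, and one MS/MS spectrum.
    print(protein_ratio_from_pairs([(100.0, 200.0), (50.0, 110.0), (80.0, 150.0)]))
    toy_spectrum = [(114.11, 1000.0), (115.11, 2000.0), (116.11, 500.0),
                    (117.12, 1000.0), (227.10, 300.0)]
    print(relative_ratios(reporter_intensities(toy_spectrum)))
```

In the first function the ratio comes from precursor intensities in the survey scan, whereas in the second it comes from the low-mass reporter region of the same MS/MS spectrum whose remaining fragment ions identify the peptide.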

By using MS-based proteomics to investigate global protein alterations in patients with OSCC or OSCC-derived cell lines, a number of tumor-associated proteins have been identified (Hu and Wong 2007). Although the function of these proteins in oral carcinogenesis remains unclear, some of them indeed show a regulatory role in the development of OSCC and may have clinical or therapeutic implications for OSCC (Weng et al. 2008; Ralhan et al. 2009; Hu et al. 2008; Patel et al. 2008; Wang et al. 2008). The extensive protein alterations observed in these studies also indicate that multiple cellular and etiological pathways are involved in the process of oncogenesis, and suggest that multiple protein molecules should be simultaneously targeted as an effective strategy to counter the disease (Chen et al. 2004). Proteomics may play a significant role in anticancer drug discovery because this technology can be used to discover and validate therapeutic targets, to assess drug efficacy and toxicity, and to identify disease subgroups for targeted therapy. It is also promising for identifying protein targets of anticancer drug action, thereby providing further insight for new drug development (Lee et al. 2006; Sung et al. 2006).

20.6 Summary

Merely 20 years ago, the prevalent mode of biomedical research was centered on the “one gene at a time” model: cloning and characterizing a single gene or a few closely related genes. This remained the gold standard until the mid-1990s, when the “genomics era” began with the establishment of several epoch-making genomic techniques (e.g., DNA microarray). Together with the completion of the human genome project at the beginning of the new millennium, the resulting exponential boom of new knowledge brought us into the “post-genomics” era. Emerging genomics and proteomics technologies are rapidly reshaping cancer research, allowing a transition from traditional genetic studies to a new paradigm based on systems biology. Compared with traditional studies, the systems biology approach allows us to investigate complex biological networks as a whole. Based on integration of data from multidimensional (genomic/transcriptomic/proteomic) analyses, the systems biology approach will also increase the reliability of discovering causative genes for complex diseases (Hu et al. 2009). Considering the multifactorial etiology of OSCC and the heterogeneity of its oncogenic pathways, a systems biology approach that integrates genomic and proteomic data may be necessary for a more profound understanding of the molecular mechanisms underlying oral carcinogenesis. We are now witnessing an exciting new era in cancer research. By applying and translating this newfound knowledge, we are developing more efficacious treatments for OSCC that will be of great relevance to other cancers as well.