Connectivity mapping uncovers small molecules that modulate neurodegeneration in Huntington’s disease models

Abstract
Huntington’s disease (HD) is a genetic disease caused by a CAG trinucleotide repeat expansion encoding a polyglutamine tract in the huntingtin (HTT) protein, ultimately leading to neuronal loss and consequent cognitive decline and death. As no treatments for HD currently exist, several chemical screens have been performed using cell-based models of mutant HTT toxicity. These screens measured single disease-related endpoints, such as cell death, but had low ‘hit rates’ and limited dimensionality for therapeutic detection. Here, we have employed gene expression microarray analysis of HD samples (a snapshot of the expression of 25,000 genes) to define a gene expression signature for HD from publicly available data. We used this information to mine a database for chemicals positively and negatively correlated with the HD gene expression signature using the Connectivity Map, a tool for comparing large sets of gene expression patterns. Chemicals with negatively correlated expression profiles were highly enriched for protective characteristics against mutant HTT fragment toxicity in in vitro and in vivo models. This study demonstrates the potential of using gene expression to mine chemical activity, guide chemical screening, and detect potential novel therapeutic compounds.

Key messages
Single-endpoint chemical screens have low therapeutic discovery hit rates.
In the context of HD, we guided a chemical screen using gene expression data.
The resulting chemicals were highly enriched for suppressors of mutant HTT fragment toxicity.
This study provides a proof of concept for wider usage in all chemical screening.

Electronic supplementary material
The online version of this article (doi:10.1007/s00109-015-1344-5) contains supplementary material, which is available to authorized users.


Drosophila compound feeding and assays
For drug feeding experiments, maize media was heated until liquid and distributed into vials. Deferoxamine or chlorzoxazone (Sigma Aldrich, UK) was freshly prepared in DMSO as a 1000× stock and added to the media. Newly emerged HTT93Q exon 1 flies were transferred to vials containing control or treated food, which was changed daily for 7 days. At day 7, flies were anaesthetized with CO2, and their heads were removed and mounted face-up on microscope slides. A Nikon Optiphot-2 microscope at 40× magnification was used to count rhabdomeres in approximately 100 ommatidia per fly, with 12 flies per treatment.

Microarray analysis
Microarray data for diagnosed HD and control postmortem brain samples [1] were obtained from ArrayExpress (www.ebi.ac.uk/arrayexpress); cMap data were taken from build 2 of the cMap database. CEL files were imported into ArrayTrack software [2] and summarized using aspects of the MAS5 algorithm, in which the background fluorescence is subtracted and each mismatch probe intensity is subtracted from its respective perfect match probe [3]. The resulting expression data were normalized using a quantile scaling method [4] and samples were compared using a Welch t-test. The entire dataset was filtered to remove statistically unchanged genes and any genes with a mean channel intensity (MCI) lower than 50, because MAS5 summarization produces a high rate of false positives at low MCIs owing to the subtraction of sometimes high mismatch probe intensities [3]. Visualization and hierarchical clustering of the dataset were carried out by Euclidean distance (Figure 1) or Pearson correlation (Figure 5) using GENE-E software (www.broadinstitute.org).

A gene expression signature for HD was defined by selecting the most changed genes from grade 2 caudate nucleus of HD versus control. Once the dataset had been filtered for significance (P < 0.05), the 100 most differentially expressed genes were selected by absolute log2 fold change. This method of gene selection was chosen to best mimic the manner in which genes are ordered in the cMap reference database.

The gene signature was used to query build 2 of the cMap dataset [5] with the algorithm of Zhang and Gant [6,7]. Equal weighting is applied to each gene in the signature, which is compared with around 3000 profiles in the cMap database. Within each profile, genes are ranked linearly by absolute fold change, and the rank is signed positive for upregulation and negative for downregulation. The similarity score is the sum of the scores for each gene in the signature divided by the maximum possible score. Perfect positive and negative scores are therefore 1 and -1, respectively.
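The scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration written from the description in this section, not the authors' code. The function names and the toy reference profile are hypothetical, and every signature gene is assumed to be present in the reference profile.

```python
def signed_ranks(reference_log_fc):
    """Rank genes in one reference profile by absolute fold change.

    The least changed gene gets rank 1 and the most changed gets rank N;
    the rank is negative for downregulated genes.
    """
    ordered = sorted(reference_log_fc, key=lambda g: abs(reference_log_fc[g]))
    return {
        gene: (position + 1) * (1 if reference_log_fc[gene] >= 0 else -1)
        for position, gene in enumerate(ordered)
    }


def connection_score(signature, reference_log_fc):
    """Score an equally weighted signature against one reference profile.

    signature maps gene -> +1 (up in disease) or -1 (down in disease).
    Assumption: every signature gene occurs in the reference profile.
    """
    ranks = signed_ranks(reference_log_fc)
    n_genes, n_signature = len(ranks), len(signature)
    strength = sum(sign * ranks[gene] for gene, sign in signature.items())
    # Maximum possible strength: the signature genes occupy the top ranks
    # of the reference profile with matching signs.
    max_strength = sum(n_genes - i for i in range(n_signature))
    return strength / max_strength


# Toy reference profile (hypothetical log2 fold changes).
toy_reference = {"A": 3.0, "B": -2.0, "C": 0.5, "D": -0.1}
print(connection_score({"A": 1, "B": -1}, toy_reference))  # 1.0
```

A chemical whose profile scores close to -1 against the HD signature reverses the disease-associated changes, which is the property used here to prioritise candidates.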

Cell viability
Cell viability was measured using an MTS assay kit (Promega, UK) according to the manufacturer's instructions. Cells were seeded in a 96-well plate (Greiner, UK) at a density of 5 × 10^3 cells per well in 100 μl of medium. The cells were allowed to adhere for 24 hours prior to treatment with several concentrations of each chemical dissolved in DMSO (final concentration 0.1 %) for 72 hours. Following treatment, the media was aspirated and pre-warmed fresh media was added to each well. MTS assay reagent (20 μl) was added to each well and incubated at 37 °C and 5 % CO2 for 3 hours before the absorbance at 490 nm was measured using a Wallac Victor 2 spectrophotometer. Values were normalized to the DMSO control.
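The normalization to the DMSO control amounts to a simple ratio. The sketch below assumes that viability is expressed as a percentage of the mean absorbance of the DMSO-only wells on the same plate; the function name and the example readings are hypothetical.

```python
from statistics import mean


def percent_viability(treated_abs, dmso_abs):
    """Express 490 nm absorbance readings as a percentage of the DMSO control.

    Assumption: the control value is the mean of the DMSO-only wells.
    """
    control = mean(dmso_abs)
    return [100.0 * reading / control for reading in treated_abs]


# Hypothetical absorbance readings for two treated wells and two control wells.
print(percent_viability([0.5, 1.0], [1.0, 1.0]))  # [50.0, 100.0]
```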

Caspase 3/7 assay
Caspase activation was determined using a Caspase-Glo 3/7 assay kit (Promega, UK) according to the manufacturer's instructions. PC12 cells were seeded in a white 96-well plate (Greiner, UK) at a density of 5 × 10^3 cells per well in 100 μl of medium. The cells were allowed to adhere for 24 hours prior to treatment. Induced and uninduced cells were treated with a range of sub-cytotoxic chemical concentrations. Induced cells were concomitantly treated with 5 µM ponasterone A for 72 hours. Following treatment, 50 μl of Caspase-Glo reagent was added to each well. The plate was shaken for 30 seconds at 300 rpm and incubated in the dark at room temperature for 1 hour. The luminescence of each sample was measured using a BMG Labtech FLUOstar Omega luminometer/spectrophotometer. Assays were performed in triplicate and values were normalized to uninduced (minimum) and induced (maximum) controls. Ebselen, an antioxidant with documented protective effects [8], was included as a positive control on every plate.
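Normalizing to a minimum and a maximum control is a min-max rescaling. The following sketch assumes the result is reported on a 0 to 100 scale, where 0 corresponds to the mean of the uninduced wells and 100 to the mean of the induced wells; the function name and the luminescence values are hypothetical.

```python
from statistics import mean


def normalize_caspase(sample_lum, uninduced_lum, induced_lum):
    """Rescale luminescence so uninduced = 0 and induced = 100.

    Assumption: both controls are summarized by the mean of their wells.
    """
    minimum = mean(uninduced_lum)
    maximum = mean(induced_lum)
    if maximum == minimum:
        raise ValueError("induced and uninduced controls are identical")
    return [100.0 * (value - minimum) / (maximum - minimum) for value in sample_lum]


# Hypothetical readings: a compound that halves the induced caspase signal.
print(normalize_caspase([200], [100, 100, 100], [300, 300, 300]))  # [50.0]
```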

Cellomics assay
PC12 cells were seeded in a 24-well plate at 1 × 10^4 cells per well and allowed to adhere for 24 hours prior to treatment. The cells were treated with a range of sub-cytotoxic chemical concentrations and induced with ponasterone A for 48 hours. The cell media was removed and the cells were fixed using 4 % paraformaldehyde in PBS for 10 minutes at room temperature. The paraformaldehyde was removed and the cells were stained with a 50 μg/ml solution of Hoechst 33342 in BSA for 10 minutes at room temperature. The Hoechst solution was removed, 1 ml of PBS was added, and samples were stored at 4 °C until analysis. The plate was transferred to the Cellomics machine and the 'Spotcount' bioapplication was used to detect HTT aggregates. Hoechst-stained nuclei were detected in channel 1 with a 386 nm filter, and a mask was fitted to define the area of the nucleus and predict the cell boundary. GFP-labeled aggregates above an intensity threshold within the cytoplasmic area were counted in channel 2 with a 485 nm filter. In each well, 4000 cells were counted in randomly selected fields containing >50 cells, with a maximum of 100 fields per well. Values were normalized to the induced control.
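The field-sampling rule at the end of this protocol can be read as a stopping condition. The sketch below is one possible interpretation and does not describe how the Cellomics software is implemented: fields arrive in random order, fields with 50 cells or fewer are skipped, and counting stops at 4000 cells or 100 accepted fields. Whether skipped fields count towards the 100-field limit is not stated in the text; here they do not. All names are hypothetical.

```python
def count_until_target(fields, target_cells=4000, min_cells=50, max_fields=100):
    """Accumulate cells and aggregates over imaging fields.

    fields is an iterable of (cells_in_field, aggregates_in_field) pairs,
    assumed to be already in random order.
    Returns (total_cells, total_aggregates, fields_used).
    """
    cells = aggregates = used = 0
    for n_cells, n_aggregates in fields:
        if cells >= target_cells or used >= max_fields:
            break
        if n_cells <= min_cells:
            continue  # assumption: sparse fields are skipped and not counted
        cells += n_cells
        aggregates += n_aggregates
        used += 1
    return cells, aggregates, used


def relative_aggregation(sample, induced_control):
    """Aggregates per cell in a sample, relative to the induced control."""
    sample_rate = sample[1] / sample[0]
    control_rate = induced_control[1] / induced_control[0]
    return sample_rate / control_rate
```

With this reading, a well in which every field holds 60 cells would stop after 67 fields, because 67 × 60 = 4020 is the first total to reach 4000.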

Statistical Analyses
For all non-microarray-based experiments, statistical significance was measured using a one-way ANOVA and the results were a product of 3 biological replicates (N = 3) unless otherwise stated. The P value threshold for statistical significance for all experiments was 0.05.
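For reference, the F statistic behind a one-way ANOVA can be computed directly from the group values. The sketch below uses only the Python standard library and stops at the F statistic and its degrees of freedom; the P value would then be read from the F distribution using a statistics package. The function name and the example data are hypothetical and are not taken from the study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Three hypothetical treatment groups with N = 3 replicates each.
print(one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]))
# (3.0, 2, 6)
```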
Figure S1. Chemicals that induced transcriptional changes that correlated with, or were inverse to, the HD gene signature were tested for cytotoxicity in PC12 cells. The cells were exposed to the chemicals shown for 72 hours (10 nM to 100 μM). Cell viability was determined by MTS assay. The data represent mean ± SEM (N = 3). [Plot axes: chemical concentration (μM) against cell viability normalised to control (%); recoverable panel title: Valinomycin.]

Table S1. The 100 largest gene-expression changes in the caudate nucleus of Grade 2 HD patients compared to age- and sex-matched controls. These data were used to create a gene signature to query the Connectivity Map. Microarray data were obtained from a publicly available dataset (E-GEOD-3790). Data were normalized by mean scaling and filtered for genes with a mean fluorescent channel intensity greater than 50 and deemed significantly changed (P < 0.05). The top 100 most changed genes were selected by highest absolute fold change.
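The gene selection described for Table S1 can be sketched as a filter followed by a sort. This is an illustration under stated assumptions rather than the authors' pipeline: each gene is represented by a dictionary with hypothetical keys `gene`, `log2_fc`, `p_value` and `mci`, and genes with an intensity of exactly 50 are kept.

```python
def select_signature(records, top_n=100, p_cutoff=0.05, min_mci=50.0):
    """Pick the top_n most changed genes that pass both filters.

    Returns a dict mapping gene -> +1 (upregulated) or -1 (downregulated),
    ordered from the largest to the smallest absolute log2 fold change.
    """
    kept = [
        record for record in records
        if record["p_value"] < p_cutoff and record["mci"] >= min_mci
    ]
    kept.sort(key=lambda record: abs(record["log2_fc"]), reverse=True)
    return {
        record["gene"]: (1 if record["log2_fc"] > 0 else -1)
        for record in kept[:top_n]
    }


# Hypothetical records: G3 fails the significance filter, G4 the intensity filter.
example = [
    {"gene": "G1", "log2_fc": 2.0, "p_value": 0.01, "mci": 200},
    {"gene": "G2", "log2_fc": -3.0, "p_value": 0.001, "mci": 120},
    {"gene": "G3", "log2_fc": 5.0, "p_value": 0.20, "mci": 300},
    {"gene": "G4", "log2_fc": -4.0, "p_value": 0.01, "mci": 30},
    {"gene": "G5", "log2_fc": 0.5, "p_value": 0.04, "mci": 80},
]
print(select_signature(example, top_n=2))  # {'G2': -1, 'G1': 1}
```

The signed output has the same shape as the equally weighted signature used to query the Connectivity Map, where only the direction of change of each selected gene is carried forward.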