Workpackage 6: Transcriptomics




To identify genes and functional gene categories differentially expressed in target tissues from ALS patients, to improve understanding of the pathophysiological mechanisms of disease, and its systemic manifestations.


To identify clusters of fALS and sALS patients with similar gene expression profiles, reflecting a common aetiology.


To use gene expression profiling correlated with clinical parameters to determine genomic indices of fast and slow disease progression in ALS.


To determine whether differentially expressed genes identified map to previously identified quantitative trait loci, which will direct further investigation of potential candidate genes.


To isolate panels of differentially expressed genes in muscle that can distinguish ALS in the early stages from related motor neuron diseases, and determine the specificity of candidate biomarker muscle Nogo-A expression in discriminating between ALS and related motor neuron diseases.


To interrogate cellular and mouse models of new genetic variants of ALS (TDP-43 and FUS) to gain an understanding of the pathobiology of these newly identified subtypes of disease. 


Resources underpinning the GEP activities: Gene expression profiling (GEP) will be undertaken in a dedicated Microarray Core Facility with Affymetrix and Agilent platforms (Partners 14 and 1). A Veritas laser capture microdissection (LCM) system will be used for extraction of motor neurons (MN) from tissue samples. Bioinformatics expertise specifically for microarray analysis is available through Partners 1 (Jansen) and 14 (Rattray, Laurence and Milo).

Patients: RNA samples from whole blood are already available from 400 ALS patients and 400 controls. Additional samples will be taken from the prospective cohort of 400 ALS patients and 400 controls (Partners 1, 8, 9, 10) as described in the Workpackage "Clinical coordination". Where possible, patients will be recruited at the time of diagnostic investigation to ensure that they are drug naïve. Familial cases will have been screened for known genetic variants of ALS, including mutations in SOD1, TDP-43 and FUS. Blood samples are taken following an overnight fast, and controls are age- and gender-matched to minimise variability from factors other than disease state (Radich et al, 2004; Eady et al, 2005).

Optimally prepared cryopreserved, snap-frozen CNS tissue is already available from 150 ALS cases and 25 control cases (Partner 14).  New material will be collected during years 1 to 3 of the programme (Partners 1, 2, 3, 5, 8, 10, 12, 14). Muscle biopsy tissue (quadriceps) is already available from 50 subjects (10 ALS; 10 FTD; 10 ALS/FTD; 20 controls) and further samples (n=100) will be collected by partners 3 and 4.

Collection of samples and RNA extraction: Blood will be collected into PAXgene blood tubes and RNA extracted following the PAXgene Blood RNA Isolation System protocol (BD). Globin RNA will be removed using GlobinClear (Ambion), which improves the percentage of present calls and the detection of low-level transcripts. For CNS samples, 10 µm sections of lumbar (L3-5) spinal cord will undergo rapid toluidine blue staining, and the MN will be isolated from the surrounding material using the Veritas Laser Capture Microdissection System. 1000 MN will be collected from each case. RNA will be extracted using the PicoPure RNA Isolation Kit for MN (Molecular Devices) and the FastRNA Pro Green kit for muscle tissue (Qbiogene). RNA quality and quantity will be assessed at each stage using the Agilent Bioanalyser and Nanodrop Spectrophotometer, respectively.

Induced Pluripotent Stem Cells (iPS cells) generated from patients with ALS and controls: iPS cells derived from control donors or ALS patients will be used to generate motor neurons and glial cells in vitro, which will in turn be used to study the differential properties of ALS cells. iPS cells will be generated in WP8. In short, cells will be generated by lentiviral transduction of patient fibroblasts (skin biopsy) with four “reprogramming” factors: OCT4, SOX2, KLF4 and cMYC. This results in fibroblast reprogramming and the emergence of pluripotent iPS cells. The cells will be treated with an agonist of the sonic hedgehog (SHH) signalling pathway and retinoic acid (RA). Next, we will introduce an HB9-GFP reporter construct into these iPS cells to allow the identification of developing motor neurons (Di Giorgio et al, 2008). In vitro differentiation of the iPS cell lines will result in the induction of GFP-positive putative motor neurons, which will be further characterised using molecular methods and immunofluorescence microscopy to confirm their motor neuron status.

Microarray hybridisation and data analysis: The Whole Transcript Sense Target Labelling Assay (Affymetrix) will be used to generate biotin-labelled cDNA fragments, which are hybridised to the Exon 1.0 ST Arrays overnight. The arrays contain 1 million known or predicted exons with a redundancy of 4 probes per exon and an average of 40 probes per gene. This allows mapping of splice variants within the human genome as well as improved sensitivity of gene-level expression analysis. Following stringency washes and quality control, the background level of hybridisation is subtracted and the signal intensities from each of the probes are imported into ArrayAssist for gene-level and exon-level analysis. For each sample type, the expression profiles from the genetic variants of ALS will be compared against those from classical sporadic ALS as well as disease and normal controls. We are interested both in changes in gene expression and in alternatively spliced transcripts that are common to all ALS samples, as well as in those that subgroup the heterogeneous familial and sporadic ALS cases. It is currently unknown how many subgroups these samples will divide into. Analytical tools such as hierarchical clustering, principal component analysis and self-organising maps will be used to establish key genes/transcripts that clearly distinguish the different subgroups of ALS.
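As an illustration of how samples might be grouped when the number of subgroups is not known in advance, the following minimal Python sketch applies average-linkage hierarchical clustering to a correlation distance (1 - r) and stops merging at a distance cutoff. The function names, the toy profiles and the cutoff of 0.5 are assumptions chosen for illustration only; they are not part of the planned ArrayAssist analysis.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_distance(profiles):
    """Pairwise distance 1 - r between sample profiles."""
    n = len(profiles)
    return [[1.0 - pearson(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def average_linkage(dist, cutoff):
    """Merge the two closest clusters repeatedly; stop once the smallest
    average between-cluster distance exceeds the cutoff (hypothetical value)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = mean(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > cutoff:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy data: four samples (rows) measured on four transcripts (columns).
samples = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [4.2, 2.9, 2.1, 1.0],
]
groups = average_linkage(correlation_distance(samples), cutoff=0.5)
```

With these toy profiles the first two and the last two samples fall into separate groups. In practice the cutoff would have to be chosen from the data, for example by inspecting the dendrogram.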

For mRNA isolated from mouse models and murine cell lines we will use Mouse Exon 1.0 ST Arrays. 

The results will be analysed initially using GeneChip Operating Software (GCOS) to automatically acquire and analyse image data and compute an intensity value for each transcript, and subsequently using ArrayAssist Software 5.5 (Iobion) for statistical analysis of the relative gene expression in each sample. The analysis will be performed using the probe logarithmic intensity error (PLIER) algorithm. Transcripts will be defined as differentially expressed between samples if there is a 1.5-fold or greater difference in gene expression level and a p-value ≤0.05. The statistical test applied by ArrayAssist is an unpaired two-tailed t-test. Differentially expressed probes will be classified according to Gene Ontology terms, which allow classification of genes according to their molecular function, biological process and cellular component, and according to their chromosomal localisation. In order to identify alterations in specific pathways, PathwayArchitect (Iobion), GenMAPP 1.1, Biocarta and DAVID (http://david.abcc.ncifcrf.gov/) will be used initially (Huang et al, 2009). Further details of the data analysis methods are given in section 3 of this WP.
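The differential expression criterion (fold change of at least 1.5 together with p ≤0.05 from an unpaired two-tailed t-test) can be sketched as follows. This is an illustrative stand-alone implementation, not the ArrayAssist code. It assumes log2-scale intensities and a pooled-variance (Student) t-test, neither of which is specified in the text, and it obtains the p-value by numerical integration of the t density so that only the Python standard library is needed.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson integration of the density on [0, |t|]."""
    t = abs(t)
    if t == 0.0:
        return 1.0
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def unpaired_t_test(a, b):
    """Student's unpaired t-test with pooled variance (an assumption here)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, two_tailed_p(t, df)


def is_differentially_expressed(als_log2, ctrl_log2, min_fold=1.5, max_p=0.05):
    """Apply both thresholds: fold change >= min_fold and p <= max_p."""
    fold = 2 ** abs(mean(als_log2) - mean(ctrl_log2))
    _, p = unpaired_t_test(als_log2, ctrl_log2)
    return fold >= min_fold and p <= max_p


# Toy log2 intensities for one transcript (made-up values).
als = [8.0, 8.2, 8.1, 7.9]
ctrl = [7.0, 7.1, 6.9, 7.0]
flag = is_differentially_expressed(als, ctrl)
```

Requiring both thresholds guards against calling a transcript differentially expressed on the basis of a large but noisy difference, or of a statistically significant but very small one.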

Validation: Quantitative PCR (Q-PCR) will be used to confirm differential expression of key selected genes which clearly differentiate ALS from controls or subgroups of ALS. For differentially expressed alternatively spliced transcripts, standard reverse transcriptase PCR will be performed with primers designed to both the normal and the alternate transcript. Following confirmation, primers specific to each transcript will be used for Q-PCR to obtain a quantitative measure of the relative abundance of the two. Those genes/transcripts which show robust and distinctive expression in subtypes of ALS will be validated in a second cohort of ALS patients.

Those genes/transcripts which are most informative (give greatest sensitivity and reliability) will then be selected and included on an Agilent Custom Array, which allows the simultaneous hybridisation of 8 RNA samples per array. This will allow high-throughput analysis of the validation cohort. In addition, once the individual cases have been subdivided according to their transcriptional profiles, clinical and pathological data will be interrogated to determine which parameters cross-correlate.

Task 1: Establish protocols for sample collection

A standardised protocol for collecting samples and extracting RNA will be established in the first 2 months. It will then be used by all Partners across the consortium who collect samples, to ensure that high-quality, reproducible microarray data are obtained. For blood samples, it is necessary that samples are collected after an overnight fast and that parameters such as gender and age are matched in the control samples (Whitney et al, 2003; Radich et al, 2004; Eady et al, 2005; Kim et al, 2007). Use of GlobinClear removes interference from beta-globin transcripts to improve the microarray data (Field et al, 2007). For post-mortem samples, CNS tissue should be cryopreserved by snap freezing to preserve the RNA quality. Tissue pH will be measured in CNS samples prior to LCM, as pH has been shown to be a good indicator of RNA integrity (Harrison et al, 1995; Kingsbury et al, 1995). The quality of all RNA samples will also be assessed using the RNA Integrity Number (RIN), which is calculated by the Agilent Bioanalyser.

Task 2: Transcriptomics of whole blood from ALS and control patients

Partner 1 will lead these analyses. Blood samples from sALS patients and age- and sex-matched controls (n=400 of each for the discovery cohort and the validation cohort) will be collected from the participating centres across Europe. These data will be made available to the Workpackage "ALS systems biology model development", to allow for causal anchoring of the data, as described there.

Differential expression will be analysed using a parametric approach, using the residuals after regressing the original data on gender, laboratory of origin, ancestry (based on the genome-wide SNP genotyping that will be performed in the Genomics Workpackage), riluzole use and an RNA quality parameter (RIN). In addition, we will apply weighted gene co-expression network analysis (WGCNA). In brief, the first step in WGCNA is to calculate a standard correlation matrix between all pairs of probes in the residuals of the original data. The correlation matrix will be transformed into an n × n adjacency matrix. To group genes with coherent expression profiles into modules, we will use average linkage hierarchical clustering on the topological overlap measures derived from the adjacency matrix (Zhang and Horvath, 2005). Hub genes are defined by an intramodular connectivity value at least two standard deviations above the mean module connectivity (Webster et al, 2009).
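The WGCNA steps outlined above (correlation, adjacency, topological overlap, hub gene selection) can be sketched in a few lines of Python. This is a simplified illustration and not the WGCNA software itself. The adjacency is taken as |r| raised to a soft-thresholding power, following Zhang and Horvath (2005); the power beta=6, the function names and the toy data are assumptions.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two probe profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(profiles, beta=6):
    """n x n adjacency a_ij = |r_ij| ** beta, with a zero diagonal.
    beta is an assumed soft-thresholding power, not given in the text."""
    n = len(profiles)
    return [[0.0 if i == j else abs(pearson(profiles[i], profiles[j])) ** beta
             for j in range(n)] for i in range(n)]


def topological_overlap(adj):
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    n = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each probe
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                shared = sum(adj[i][u] * adj[u][j]
                             for u in range(n) if u not in (i, j))
                tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


def hub_genes(connectivity, n_sd=2.0):
    """Genes whose intramodular connectivity is at least n_sd standard
    deviations above the mean connectivity of the module."""
    values = list(connectivity.values())
    threshold = mean(values) + n_sd * stdev(values)
    return [g for g, k in connectivity.items() if k >= threshold]


# Toy residual profiles for three probes across five samples.
probes = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [5, 3, 4, 1, 2]]
adj = adjacency(probes)
tom = topological_overlap(adj)

# Toy intramodular connectivities for one module (made-up values).
module_k = {f"gene{i}": 1.0 for i in range(9)}
module_k["geneX"] = 10.0
hubs = hub_genes(module_k)
```

Modules would then be obtained by average linkage hierarchical clustering on the dissimilarity 1 - TOM. Note that the hub criterion can only be met in modules with a reasonable number of genes, since in very small modules no value can lie two standard deviations above the mean.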

Finally, prediction analysis for microarrays (PAM) will be performed in order to identify the set of genes that best discriminates between ALS and controls in the population-based discovery cohort and validation cohort.

Task 3: Transcriptomics of lymphoblastoid cell lines from ALS and control patients

From 2003 to 2010, Partners 12 and 14 have established an ALS DNA Bank which to date includes more than 1200 sALS cases, 110 non-SOD1 fALS cases and 1247 controls, collected and stored as frozen DNA and as EBV-transformed lymphocytes in the European Collection of Cell Cultures, together with a detailed database recording relevant clinical parameters. The discovery cohort will consist of samples from 250 sALS cases, 120 non-SOD1 fALS cases and 100 controls, and the validation cohort of samples from 250 sALS cases and 100 controls. This bioresource will be used by Partner 14 to:

1.    Identify changes in gene expression which correlate with fast (<2 years' survival) and slow (>5 years' survival) disease progression in sporadic ALS.

2.    Identify gene expression profiles which distinguish genetic subgroups of non-SOD1 fALS patients and clustered groups of sALS cases.
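The survival thresholds used to define fast and slow progression can be made explicit with a small helper. This is a minimal sketch; the function name and the "intermediate" label for cases surviving between 2 and 5 years are assumptions, since the text defines only the two extreme groups.

```python
def progression_group(survival_years):
    """Assign a case to a progression group from survival in years."""
    if survival_years < 2:
        return "fast"
    if survival_years > 5:
        return "slow"
    # Cases between the two thresholds belong to neither contrast group
    # (label is hypothetical).
    return "intermediate"


# Made-up survival times in years.
labels = [progression_group(s) for s in (1.5, 3.0, 7.2)]
```

Cases in the intermediate range would not enter the fast versus slow comparison, which keeps the two contrast groups clearly separated.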

Task 4: Transcriptomics on human post-mortem samples

CNS post-mortem material will be collected from ALS cases and controls, and the isolated MN analysed (n=50 ALS and 25 controls for the discovery cohort and n=50 ALS and 25 controls for the validation cohort) (Partners 1, 3, 5, 14). Partner 14 will undertake this analysis.

Task 5: Transcriptomics on human muscle biopsy material (Partners 3 and 4)

For the identification of multi-gene markers, microarray analysis will be performed using Affymetrix Human Exon 1.0 ST Arrays as described above. To confirm the specificity of Nogo-A, Western blot and quantitative RT-PCR will be employed to determine the amounts of Nogo-A at the protein and mRNA level.

Task 6: Cellular models of fALS: Cultured NSC34 cell lines transfected with normal and mutant TARDBP and FUS constructs are available from Partners 2 and 7. RNA will be extracted in triplicate from WT, mutant and vector-only transfected cells using the RNeasy Kit (Qiagen). Following QC, the RNA will be labelled and hybridised to the Mouse Exon 1.0 ST Arrays as described above. Partner 14 will undertake this analysis.

Task 7: Transcriptomics of primary motor neurons isolated from murine models of ALS: Transgenic mouse models of ALS carrying either the normal or mutant forms of the new ALS-related genes TARDBP and FUS, as well as mice deficient for Unc13a, are available to the Euro-MOTOR collaboration (Partners 1, 3, 7). Gene expression profiling of primary MN cultures from the 3 sets of mice will allow evaluation of the earliest response to the presence of disease-causing mutations at the developmental stage, and also interrogation of gene expression in the axonal compartment. Triplicate samples will be used. RNA will be extracted and, following QC, labelled and hybridised to the Mouse Exon 1.0 ST Arrays. Partner 7 will undertake this analysis.

Task 8: Transcriptomics on iPS cells generated from patients with ALS and controls

For the identification of multi-gene markers, microarray analysis will be performed using Affymetrix Human Exon 1.0 ST Arrays as described above. HB9-expressing MN will be isolated from the mixed cultures obtained from iPS cells using an established LCM protocol. iPS-derived motor neurons will be generated from 40 ALS patients and 10 controls. Partner 1 will undertake this analysis.

Task 9: Integration of transcriptomics data with data from other -omics levels

These analyses will be performed in the Workpackage "ALS systems biology model development". The primary objective of the current Workpackage is to make available for integrative analysis as soon as possible, the various transcriptomic data that will be generated, including data derived from: human CNS tissue from autopsies, whole blood, in vitro and in vivo ALS models, and the iPS cells derived from patients and controls.