You do not need to use RStudio to answer any of the following coursework questions.
(i.) Gene microarray systems comprise either ‘one colour’ or ‘spotted’ arrays. State which of these approaches can be used to study two samples simultaneously on the same microarray (1 mark).
What readout results from this array approach? (1 mark)
The following Table appears in a research publication showing data for 4 genes from a one-colour microarray experiment where the authors have compared a drug-responsive control cancer cell line A with a drug-resistant cell line B:
(ii.) Explain what is meant by the term ‘probeset’ (Column 4 in the Table) in the one-colour microarray procedure. (4 marks)
(iii.) Use the open-access resource GeneCards to determine the three missing gene descriptions in Column 9 of the Table. (3 marks)
(iv.) The researchers used biological triplicate samples from each of the two cancer cell lines to generate the mean intensity data for the Table (Columns 1, 2). How does a biological replicate differ from a technical replicate? (2 marks)
(v.) What Quality Control (QC) step can be performed to view such triplicate array data in order to check it is of a good enough standard for differential expression analysis? (1 mark)
(vi.) State the two processes that are used to correct microarray raw intensity data in order to eliminate non-biological differences (such as brightness bias or spatial bias) between microarrays and to allow graphical display of non-skewed expression data within a manageable scale. (2 marks)
(vii.) Using their microarray intensity data, the researchers obtained the list of genes in the Table that were differentially expressed between the two cell lines by using a statistical test to generate a p value for every gene, as well as by analysing their fold increase.
Give an example of a statistical test that is applicable to compare gene expression in these two cell lines using their triplicate intensity data (1 mark).
Why is it very important to adjust p value results for multiple testing in microarray analysis? (2 marks)
Provide a named example of such a p value correction approach (1 mark).
(viii.) In this instance, the absolute fold changes in gene expression have been displayed in Column 6 of the Table, but why do bioinformaticians often prefer to display fold change as a log2 value? (2 marks)
(ix.) Using all the various data provided in the Table, state which genes you consider to be convincingly altered in expression in the drug-resistant cell line and explain why. (4 marks)
(x.) State a laboratory technique that could be used to explore whether any of these genes had an important function in drug resistant growth using cell line B. (1 mark)
(i.) Clustering of gene microarray data using bioinformatic tools such as Morpheus and studying the associated dendrograms are often used when profiling clinical or experimental cancer samples.
What is clustering (1 mark), and what is a dendrogram? (1 mark)
(ii.) What step should initially be performed with microarray data before clustering so that the clustering results are more manageable and easier to view? (1 mark)
(iii.) The following Heatmap shows the expression levels of 10 genes (A-J) in lung cancer samples from six patients. The Heatmap colour scale runs from bright red (high expression), through black, down to bright green (low expression).
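For orientation only, the colour scale described above can be sketched in a few lines of Python. This is an illustrative sketch, not the method used to draw the Heatmap: it assumes expression values have been centred per gene (for example as z-scores) and are clipped at a hypothetical limit of 2. The function name `expression_to_rgb` and the `limit` parameter are made up for this example.

```python
def expression_to_rgb(value, limit=2.0):
    """Map a centred expression value to an (R, G, B) tuple.

    Assumption (not from the Heatmap itself): values are centred on 0
    and anything beyond +/- limit is shown at full brightness.
    """
    # Clip to [-limit, limit], then rescale to the range [-1, 1].
    scaled = max(-limit, min(limit, value)) / limit
    if scaled >= 0:
        # Above-average expression: black fading up to bright red.
        return (round(255 * scaled), 0, 0)
    # Below-average expression: black fading down to bright green.
    return (0, round(255 * -scaled), 0)
```

Under these assumptions, `expression_to_rgb(2.0)` gives bright red `(255, 0, 0)`, `expression_to_rgb(0.0)` gives black `(0, 0, 0)`, and `expression_to_rgb(-2.0)` gives bright green `(0, 255, 0)`.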
What evidence is there that these expression data have undergone clustering? (2 marks)
What does the sample dendrogram tell us about tumours 1-6? (3 marks)
State 2 ways in which the large coloured ‘blocks’ of genes (e.g. the cluster of genes A-F) displayed on the heatmap could be useful to cancer biologists or clinicians. (2 marks)