Residuals of Poor-Quality Microarray Hybridisation Data After Model Fit

While genes themselves are hardwired, their degree of expression depends on the type and condition of the organ, on cellular conditions and on the chemical environment. Comprehensive information on gene expression is therefore key to the diagnosis and prognosis of complex genetic diseases such as cancers, cardiovascular diseases and brain disorders.

High-throughput measurement technologies for gene expression have opened up new avenues for biomedical research. However, because these technologies assess the concentration of fragile macromolecules in chemical assays, the data are typically noisy and biased. This has led to irreproducible scientific results, undermining the credibility of the new technologies.

For these technologies to reach their full potential, the statistical challenges related to the size and complexity of these new types of data sets need to be tackled. This work on statistical quality assessment and visualisation of gene expression data has provided scientists with tools to rate the quality of their data and to detect concrete reasons for poor quality, such as certain lab conditions.

Such quality statistics can serve as guides to increase the reproducibility of future experiments. One example of the impact of these new statistical quality assessment tools is their role in the Microarray Quality Control project, an FDA initiative to establish quality standards for high-throughput gene expression data. Another is their role in the development of a diagnostic tool for thyroid cancer that has greatly reduced the number of unnecessary surgeries.