Abstract

Large Scale Statistical Analysis of GEO Datasets

Bernard Ycart, Konstantina Charmpi, Sophie Rousseaux and Jean-Jacques Fournié

The problem addressed here is the simultaneous treatment of several gene expression datasets, possibly collected under different experimental conditions and/or on different platforms. Using robust statistics, a large-scale statistical analysis has been conducted over 20 datasets downloaded from the Gene Expression Omnibus (GEO) repository. The differences between datasets are compared to the variability within a given dataset. Evidence is provided that meaningful biological information can be extracted by merging different sources.