ASSESS ~ Analysis of Sample Set Enrichment Scores |
Examples |
Leukemia Example
Expression data | Classes | Gene sets | Scores | Heat map
A very natural consequence of obtaining enrichment scores for each sample in the data set is that classification
and clustering can now be preformed in the space of gene sets rather than individual genes. In this example,
ASSESS was applied to an expression data set with acute myelogenous leukemia (AML) and acute lymphoblastic
leukemia (ALL) samples on a database of 377 gene sets.
Enrichment scores can be found here and shown graphically
here.
A matrix factorization method (NMF) was applied to this space of enrichment scores. With k=2, clustering obtained from the enrichment space accurately differentiated ALL from AML samples. Clustering with k=3 (see image on right) accurately selected three subsets of the samples: ALL-T, ALL-B, and AML all had greater confidence than that from the raw expression data.
Gender Example
The enrichment scores from affymetrix gene expression data with probes from 22,283 human genes with gender
labels was computed in 8 gene sets. Gene sets were defined by cytogenetic bands and pathway or functional
properties. As expected for males chromosome Y as well as its two bands (Yp11 and Yq11) and the gene set
corresponding to genes enriched in male reproductive tissue (testis) were over expressed.
For females two gene sets of genes that escape X-inactivation were over expressed in addition
to a gene set corresponding to genes enriched in female reproductive tissue (uterus).
The Myc gene set was used as a control in that it is not expected to be enriched with respect
to the male/female distinction and indeed this is the case. The heatmap (see picture below) produced by ASSESS are provided here.