A visual data mining tool that facilitates reconstruction of transcription regulatory networks.
BACKGROUND: Although the use of microarray technology has seen exponential growth, analysis of microarray data remains a challenge for many investigators. One difficulty lies in interpreting a list of differentially expressed genes, or in planning new experiments given that knowledge. Clustering methods can be used to identify groups of genes with similar expression patterns, and genes with unknown function can be provisionally annotated based on the concept of "guilt by association", where function is tentatively inferred from the known functions of genes with similar expression patterns. These methods frequently suffer from two limitations: (1) visualization usually gives access only to group membership, rather than specific information about nearest neighbors, and (2) the resolution or quality of the relationships is not easily inferred.

METHODOLOGY/PRINCIPAL FINDINGS: We have addressed these issues by improving the precision of similarity detection over that of a single experiment and by creating a tool to visualize tractable association networks. We (1) performed a meta-analysis computation of correlation coefficients for all gene pairs in a heterogeneous data set collected from 2,145 publicly available microarray samples in mouse, (2) filtered the resulting distribution of over 130 million correlation coefficients to build new, more tractable distributions from the strongest correlations, and (3) designed and implemented a new Web-based tool (StarNet, http://vanburenlab.medicine.tamhsc.edu/starnet.html) for visualization of sub-networks of the correlation coefficients built according to user-specified parameters.

CONCLUSIONS/SIGNIFICANCE: Correlations were calculated across a heterogeneous collection of publicly available microarray data. Users can access this analysis through a new, freely available Web-based application for visualizing tractable correlation networks that are flexibly specified by the user. This new resource enables rapid hypothesis development for transcription regulatory relationships.
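
The three steps described in the methodology (pairwise correlation, filtering to the strongest correlations, and building a sub-network from user-specified parameters) can be illustrated with a minimal sketch. This is not StarNet's implementation. The function names, the toy expression matrix, the use of Pearson correlation ranked by absolute value, and the `seed`, `k`, and `depth` parameters are all assumptions made for illustration only.

```python
import numpy as np


def correlation_matrix(expr):
    """Pearson correlation between every pair of rows (genes) of expr."""
    return np.corrcoef(expr)


def strongest_neighbors(corr, gene_idx, k):
    """Indices of the k genes most strongly correlated with gene_idx.

    Assumption: "strongest" is taken to mean largest absolute correlation.
    """
    strength = np.abs(corr[gene_idx]).copy()
    strength[gene_idx] = -np.inf  # exclude the gene's correlation with itself
    order = np.argsort(strength)[::-1]
    return [int(i) for i in order[:k]]


def build_subnetwork(corr, genes, seed, k=2, depth=2):
    """Expand outward from a seed gene, keeping the k strongest edges per node.

    Hypothetical parameters: seed (starting gene), k (edges kept per node),
    depth (number of expansion levels). Returns {(gene_a, gene_b): r}.
    """
    index = {g: i for i, g in enumerate(genes)}
    edges = {}
    visited = {seed}
    frontier = [seed]
    for _ in range(depth):
        next_frontier = []
        for g in frontier:
            for j in strongest_neighbors(corr, index[g], k):
                h = genes[j]
                edges[tuple(sorted((g, h)))] = float(corr[index[g], j])
                if h not in visited:
                    visited.add(h)
                    next_frontier.append(h)
        frontier = next_frontier
    return edges


# Toy data (invented for illustration): 4 genes measured in 5 samples.
genes = ["A", "B", "C", "D"]
expr = np.array([
    [1, 2, 3, 4, 5],
    [1, 2, 3, 4, 6],
    [5, 3, 4, 1, 2],
    [2, 1, 2, 1, 2],
], dtype=float)

corr = correlation_matrix(expr)
network = build_subnetwork(corr, genes, seed="A", k=2, depth=2)
for (a, b), r in sorted(network.items()):
    print(f"{a} -- {b}: r = {r:+.3f}")
```

Keeping only the `k` strongest edges per node is one way to keep the displayed network tractable. At the scale described above (over 130 million gene pairs), a dense in-memory matrix like this one would likely need to be replaced by a precomputed, filtered table of correlations.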