Using computational analysis of gene expression both to determine the sequence of lineage choices made by multipotent cells and to identify the genes influencing these decisions is challenging. [...] set of key genes whose expression patterns reflect these relationships. DOI: http://dx.doi.org/10.7554/eLife.20488.001

[...] which reached a clear minimum in cell type [...] (for A, B, and C unrelated cell types), and for each triplet, we computed the entropy [...] in those cell types.

We determined the probability of each possible topological relationship between the cell types [...], the probability of each gene being a marker gene (denoted by [...] = 1), i.e., a gene showing the clear maximum pattern, and the probability of it being a transition gene (denoted by [...] = 1), i.e., a gene having a unique minimum. The term [...] is the probability that the mean of the distribution of the expression levels of gene i in the root cell type is less than the mean in the other two cell types. The odds [...] implicitly contain the only free parameter in our analysis, the prior odds [...] against the cell type in which its mean expression is minimal being the root. Further, this vote is weighted by the odds [...] of gene i being a transition gene. Thus, genes with a clearer minimum pattern get larger votes in determining which cell type is not the root. In practice, these quantities are computed numerically (Materials and methods).

We note further that if a substantial number of genes cast votes against each of the cell types, the probability of the null topology increases. We computed the probability of obtaining the null topology for the 150 related triplets and 100 unrelated triplets from our training set. The distribution of this probability differed considerably between the related and the unrelated triplets, with an AUC of 0.96 (Figure 1-figure supplement 3B–C).
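As an illustration of how an AUC such as the one quoted above can be obtained from two sets of null-topology probabilities, the following minimal sketch computes it as the fraction of (unrelated, related) pairs in which the unrelated triplet receives the higher probability, with ties counted as one half. The function name, the toy values, and the assumption that unrelated triplets are the class expected to score higher are illustrative and are not taken from the paper.

```python
def pairwise_auc(higher_class, lower_class):
    """AUC as the probability that a score drawn from `higher_class`
    exceeds a score drawn from `lower_class` (ties count as 0.5)."""
    wins = 0.0
    for h in higher_class:
        for l in lower_class:
            if h > l:
                wins += 1.0
            elif h == l:
                wins += 0.5
    return wins / (len(higher_class) * len(lower_class))


# Toy, made-up null-topology probabilities (NOT data from the paper).
# Assumption: unrelated triplets tend to receive the higher null probability.
unrelated_null_prob = [0.9, 0.8, 0.4]
related_null_prob = [0.1, 0.3, 0.5]

auc_value = pairwise_auc(unrelated_null_prob, related_null_prob)
```

In this toy example, 8 of the 9 pairs are ordered correctly, so `auc_value` is 8/9. An AUC of 0.5 would mean the two distributions cannot be told apart, and 1.0 would mean perfect separation.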
Application to hematopoietic gene expression data

We used our statistical framework to recreate the lineage of early hematopoietic differentiation. We considered 11 early hematopoietic progenitors from the ImmGen Consortium microarray data set (Heng et al., 2008) (Figure 2-source data 1). These cell types and their associated relationships were not included in the data set used earlier to study the correlations of the two patterns and lineage topologies. Several features of the early hematopoietic lineage tree are debated (Adolfsson et al., 2005; Akashi and Iwasaki, 2007) (Figure 2-figure supplement 2A). Given only the gene expression data for these different subpopulations of cells, we determined the lineage relationships and the key factors associated with each lineage decision.

We calculated the probabilities of topology and of marker and transition genes for the possible triplets of cell types using our statistical framework (Figure 2-source data 2). To illustrate our method, we first describe the analysis of the expression data from two such triplets of cell types: CMP/ST/MPP and MEP/GMP/FrBC (Figure 2B–G). We then assembled the triplets to form an undirected lineage tree (Figure 3; Video 1).

[...] the topology for which [...] is the maximum, versus the odds [...] of that gene being a transition gene (Figure 2B). We find two groups of genes that are much more likely to be transition genes than any of the other genes, with values of [...] or [...], and which vote against [...] (cell type CMP is the intermediate) or against [...] (cell type MPP is the intermediate). Together, these genes with high odds of being transition genes appear to most support topology [...] (Figure 2C; Figure 2-figure supplement 2B).
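The weighted voting described above can be pictured with a deliberately simplified sketch. Each gene votes against the cell type in which its mean expression is minimal, and the vote is weighted by the gene's odds of being a transition gene. The additive tally below is an assumption made for illustration and is not the Bayesian computation used in the paper. The gene names (`g1` to `g5`) and odds values are hypothetical, and only the cell type labels come from the text.

```python
from collections import defaultdict


def tally_votes(genes):
    """Sum the weighted votes cast against each cell type.

    `genes` is an iterable of (name, min_cell_type, odds) tuples, where
    `min_cell_type` is the cell type with the lowest mean expression and
    `odds` is the (hypothetical) odds of the gene being a transition gene.
    """
    against = defaultdict(float)
    for _name, min_cell_type, odds in genes:
        against[min_cell_type] += odds
    return dict(against)


def least_opposed(cell_types, against):
    """Return the cell type with the smallest total weight voted against it."""
    return min(cell_types, key=lambda c: against.get(c, 0.0))


# Hypothetical genes and odds, for illustration only (NOT values from the paper).
toy_genes = [
    ("g1", "CMP", 8.0),
    ("g2", "CMP", 6.0),
    ("g3", "MPP", 7.0),
    ("g4", "MPP", 5.0),
    ("g5", "ST", 9.0),
]

against = tally_votes(toy_genes)
intermediate = least_opposed(["CMP", "ST", "MPP"], against)
```

In this toy tally, the single strong vote against ST is outweighed by the combined votes against CMP and against MPP, so ST remains the least-opposed candidate for the intermediate cell type.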
Although gene [...] (Figure 2B, italic font) is strongly downregulated in ST and is expressed at higher levels in CMP and MPP (Figure 2C), its signal is overwhelmed by the large number of genes downregulated in either CMP or MPP, illustrating the statistical nature of the framework. For each triplet, we evaluated each gene's probability of being a transition or marker gene (Figure 2-source data 3). Figure 2D shows the names and associated probabilities of the 12 genes most likely to be transition genes for the triplet CMP/ST/MPP. The transition genes fall into two groups, corresponding to the two groups in Figure 2B. One group, which includes genes [...] (consistent with [Goossens et al., 2011; Kurotaki et al., 2013; Ragu et al., 2010; Robert-Moreno et al., [...]
