Background: Frequent pattern mining analysis applied to microarray datasets appears to be a promising strategy for identifying relationships between gene expression levels. The resulting itemsets were evaluated using previous literature and analyzed by a Gene Ontology enrichment method.

Conclusions: In this study, the proposed method was evaluated on 2 publicly available time course microarray datasets with 2 different experimental conditions. In both datasets, the method identified potential itemsets of co-expressed genes that were evaluated from the literature, and it showed higher accuracies than the 2 corresponding control methods: i) performing the method without considering the gene expression differentiation between the 2 experimental conditions and ii) performing it with a constant weight for each gene. The proposed method found several new gene regulations in these itemsets that were useful for biologists and provided further insights into the mechanisms underpinning biological processes. The Java source code and other related materials used in this study are available at http://websystem.csie.ncku.edu.tw/TIIM_Program.rar.

Background: Identification of relationships between gene regulatory events is one of the main methods through which the biological effects of stimuli or changes in the environment are revealed. Microarrays are a highly efficient way to measure the expression of massive numbers of genes simultaneously. In this respect, multiple microarrays can further be used to quantify the expression of each gene during time course experiments. However, analyzing these large-scale datasets and properly presenting the biological insights they contain is a major challenge. Currently, frequent pattern-based mining analysis is widely used to identify groups of genes that are frequently co-expressed in most biological conditions in a microarray dataset.
These methods include the apriori algorithm [1], half-spaces [2], relational-based analysis [3], a gene annotation integrated method [4], a row enumeration-based method [5], a column enumeration-based method [6], a temporal-based method [7], rule induction [8], and the FP-tree algorithm [9]. A frequent gene itemset indicates that the regulation events it contains, such as the upregulation of 2 genes, frequently occur at the same time. The support of an itemset is defined as the proportion of transactions in the data set that contain the itemset. Only gene itemsets whose support values are no less than a user-set minimum support threshold can be defined as frequent itemsets. A gene itemset with a high support value could have a high probability of becoming an interactome within a biological process.

Although traditional frequent pattern-based mining methods have been proposed successfully in previously published studies, these methods give the same weight to each gene during the execution process. In other words, they assume that all genes have similar importance, which is often not true in actual applications. To address this limitation, studies on utility mining [10-17] have become predominant topics in the field of data mining. The utility value of an itemset is the summation of each item multiplied by its matched weight/importance in the co-expression transactions. An itemset is called a high utility itemset as long as its utility value is not less than a user-specified minimum utility threshold. However, utility mining could not ensure that the items contained in a high utility itemset individually possess high utility values, since a longer itemset containing more items tends to have a higher utility value than shorter itemsets. To tackle this problem, a newer algorithm for mining average utility itemsets [18,19] was proposed, in which the utility of a discovered itemset is normalized by the number of items within the itemset. Only the resulting itemsets whose average utility values were not less than a user-specified threshold would be preserved. In this study, a method is proposed for mining gene itemsets from time course comparative gene expression datasets.
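The three measures described above (support, utility, and average utility) can be illustrated with a minimal Python sketch. It is not the authors' Java implementation. The transactions, gene names, and weights are hypothetical toy values, and the sketch assumes that each item occurs at most once per transaction, so the utility of an itemset reduces to the sum of its item weights multiplied by the number of transactions that contain it.

```python
# Hypothetical toy data: each transaction lists the regulation events
# observed together in one sample ("+" = upregulated, "-" = downregulated).
transactions = [
    {"geneA+", "geneB+", "geneC-"},
    {"geneA+", "geneB+"},
    {"geneA+", "geneC-"},
    {"geneB+", "geneC-"},
]

# Hypothetical per-item weights (importance); not values from the study.
weights = {"geneA+": 3.0, "geneB+": 1.0, "geneC-": 2.0}


def count_hits(itemset, transactions):
    """Number of transactions that contain every item of the itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Proportion of transactions that contain the itemset."""
    return count_hits(itemset, transactions) / len(transactions)


def utility(itemset, transactions, weights):
    """Sum of item weights, accumulated over every containing transaction.

    Assumes each item has quantity 1 in a transaction (an assumption of
    this sketch, not a statement about the original method).
    """
    per_occurrence = sum(weights[item] for item in itemset)
    return per_occurrence * count_hits(itemset, transactions)


def average_utility(itemset, transactions, weights):
    """Utility normalized by the number of items in the itemset."""
    return utility(itemset, transactions, weights) / len(itemset)
```

With this toy data, the itemset {geneA+, geneB+} has a support of 0.5, a utility of 8.0, and an average utility of 4.0. The itemset {geneB+, geneC-} has a higher utility (6.0) than {geneB+} alone (3.0) simply because it is longer, whereas both have the same average utility (3.0), which is the effect the normalization is meant to correct.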
The proposed method only requires specifying a user-desired number to explore the most significantly differential gene itemsets between 2 experimental conditions in a microarray dataset. For each gene, the summation of frequencies at the same time point was defined as the [...], so that the [...] with the most significant changes in gene expression can be efficiently explored. An [...] considered more than just the node degrees (i.e., the number of neighboring genes in the gene regulatory network, GRN) of each gene contained in the itemset. First, the [...] (transformed from the gene expression values) of each gene contained in an itemset was used as an important reference to calculate the [...] of the itemset. Second, only the number [...]
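Asking the user for a number of itemsets rather than a threshold can be sketched as follows. This is a hedged illustration only: the function name `top_k_itemsets` and the precomputed `scores` mapping are hypothetical, and the scoring measure used by the original method is not reproduced here.

```python
import heapq


def top_k_itemsets(scores, k):
    """Return the k (itemset, score) pairs with the highest scores.

    `scores` is a hypothetical mapping from an itemset (a frozenset of
    items) to a precomputed numeric score; how the score is computed is
    outside the scope of this sketch.
    """
    return heapq.nlargest(k, scores.items(), key=lambda pair: pair[1])


# Hypothetical scores for three candidate itemsets.
example_scores = {
    frozenset({"geneA+"}): 1.0,
    frozenset({"geneB+", "geneC-"}): 3.0,
    frozenset({"geneA+", "geneC-"}): 2.0,
}
best_two = top_k_itemsets(example_scores, 2)
```

The design point is that the user chooses how many itemsets to inspect, so no threshold has to be tuned by trial and error for each dataset.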
