Consider a microarray gene expression data set with m genes and n samples, represented as an m × n matrix W = [w_ij], where w_ij is the measured expression level of gene A_i in the jth sample, and m and n represent the total number of genes and samples, respectively. Each row in the expression table corresponds to one particular gene and each column to a sample [17]. However, for most gene expression data, the number of training samples is still very small compared to the large number of genes involved in the experiments [17]. For example, the colon cancer data set consists of 62 samples and 2,000 genes, and the leukemia data set contains 72 samples and 7,129 genes. The number of samples is likely to remain small for many areas of investigation, especially for human data, due to the difficulty of collecting and processing microarray samples [17]. When the number of genes is significantly greater than the number of samples, it is still possible to find biologically relevant correlations of gene behavior with the sample categories [37].
An f-information measure-based method has been reported in [43] for selection of discriminative genes from microarray data using the mRMR criterion. In this regard, it should be noted that the mRMR criterion is also used in [23] and [44] for gene selection, based on the concepts of neighborhood mutual information and fuzzy-rough sets, respectively.
To compute the gene-class relevance, measures such as the t-test, F-test [10, 34], entropy, information gain, mutual information [10, 55], normalized mutual information [39], and f-information measures [43] are typically used, and the same or a different metric, such as mutual information, f-information, Euclidean distance, or Pearson's correlation coefficient [10, 27, 55], is employed to calculate the gene-gene redundancy. However, as the t-test, F-test, Euclidean distance, and Pearson's correlation depend on the actual gene expression values of the microarray data, they are very sensitive to noise or outliers in the data set [10, 22, 27, 55]. On the other hand, as information measures depend only on the probability distribution of a random variable rather than on its actual values, they are more effective for evaluating both gene-class relevance and gene-gene redundancy [18, 39, 55].
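Because information measures depend only on probability distributions, the relevance of a discretized gene with respect to the class label can be estimated from empirical frequencies alone. A minimal sketch for the mutual-information case (the function name and the Python rendering are illustrative, not the chapter's C implementation):

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, between two
    equal-length sequences of discrete values (e.g., discretized
    expression states of a gene and the class labels)."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # I(X;Y) = sum_{x,y} P(x,y) * log2( P(x,y) / (P(x)P(y)) )
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())
```

A gene whose discretized states track the class labels exactly attains the class entropy; an unrelated gene scores near zero.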
Mutual information is a special case of the broader class of f-information (respectively, f-divergence) measures [43, 56, 66]. In this chapter, several f-information measures are compared with mutual information by applying them to the selection of genes from microarray data. The performance of different information measures is studied using the predictive accuracy of the naive Bayes classifier, the K-nearest neighbor rule, and the support vector machine. The effectiveness of different f-information measures, along with a comparison with mutual information, is demonstrated on three cancer microarray data sets, namely, the breast cancer, leukemia, and colon cancer data sets.
The rest of this chapter is organized as follows: Sect. 5.2 presents gene selection using the mRMR criterion and several f-information measures. A few case studies and a comparison among different f-information measures are reported in Sect. 5.3. Concluding remarks are given in Sect. 5.4.

5.2 Gene Selection Using f-Information Measures
5.2.1 Minimum Redundancy-Maximum Relevance Criterion
Let C = {A_1, A_2, ..., A_m} be the set of m genes of a given microarray gene expression data set and S ⊆ C be the set of selected genes. Define f(A_i, D) as the relevance of the gene A_i with respect to the class label D, and f(A_i, A_j) as the redundancy between two genes A_i and A_j. The total relevance of all selected genes is, therefore, given by

J_relev = Σ_{A_i ∈ S} f(A_i, D),

while the total redundancy among the selected genes is

J_redun = Σ_{A_i, A_j ∈ S} f(A_i, A_j).

Selecting a subset S of relevant and nonredundant genes from the whole set C of m genes is thus equivalent to maximizing J_relev while minimizing J_redun, that is, to maximizing the objective function J, where

J = Σ_{A_i ∈ S} f(A_i, D) − (1 / |S|) Σ_{A_i, A_j ∈ S} f(A_i, A_j).

A greedy procedure solves this problem incrementally:

1. Compute the relevance f(A_i, D) of each gene A_i ∈ C.
2. Select the gene A_i with the highest relevance f(A_i, D) as the most relevant gene. In effect, A_i ∈ S and C ← C \ {A_i}.
3. Compute the redundancy between each already-selected gene of S and each of the remaining genes of C.
4. From the remaining genes of C, select the gene A_j that maximizes the difference between its relevance f(A_j, D) and its average redundancy (1 / |S|) Σ_{A_i ∈ S} f(A_i, A_j); then A_j ∈ S and C ← C \ {A_j}.
5. Repeat steps 3 and 4 until the desired number of genes is selected.
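The greedy selection above can be sketched as follows (an illustrative Python rendering, not the chapter's C implementation; `relevance` and `redundancy` are assumed to be user-supplied callables, e.g., an f-information measure evaluated on discretized data):

```python
def mrmr_select(genes, relevance, redundancy, k):
    """Greedy mRMR: repeatedly pick the gene maximizing relevance
    minus average redundancy with the genes selected so far."""
    candidates = set(genes)
    best = max(candidates, key=relevance)   # most relevant gene first
    selected = [best]
    candidates.remove(best)
    while len(selected) < k and candidates:
        def score(g):
            # relevance penalized by mean redundancy with S
            return relevance(g) - sum(redundancy(g, s) for s in selected) / len(selected)
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Note that a highly relevant gene may be passed over in favor of a slightly less relevant but far less redundant one, which is exactly the intent of the criterion.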
5.2.2 f-Information Measures for Gene Selection
In this section, f-information measures are reported to compute both gene-class relevance and gene-gene redundancy for selection of genes from microarray data. The f-information measures calculate the distance between a given joint probability P(x, y) and the joint probability when the variables are independent, P(x)P(y). In the following analysis, it is assumed that all probability distributions are complete, that is, Σ_i p_i = 1.
The f-information measures form a subclass of the f-divergence measures [56, 66]. For two discrete probability distributions P = {p_i | i = 1, ..., n} and Q = {q_i | i = 1, ..., n}, the f-divergence is defined as

f(P‖Q) = Σ_i q_i f(p_i / q_i).   (5.6)

The assumptions on the function f are that f(1) = 0, so that the divergence of two identical distributions is zero; that f is continuous and convex on [0, ∞); that 0 · f(0/0) = 0; and that 0 · f(x/0) = x · lim_{u→∞} f(u)/u. The last two conventions, based on limit values, complete the definition of the f-divergence for the two cases for which (5.6) is not otherwise defined: p_i = q_i = 0, and q_i = 0 with p_i > 0.
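Definition (5.6), together with the two limit conventions, can be transcribed directly (an illustrative sketch; `f` is any function satisfying the conditions above, and `f_slope_at_inf` supplies lim_{u→∞} f(u)/u):

```python
def f_divergence(p, q, f, f_slope_at_inf):
    """f(P||Q) = sum_i q_i * f(p_i / q_i), with the conventions
    0*f(0/0) = 0 and 0*f(x/0) = x * lim_{u->inf} f(u)/u."""
    total = 0.0
    for pi, qi in zip(p, q):
        if qi > 0:
            total += qi * f(pi / qi)
        elif pi > 0:                 # q_i = 0, p_i > 0
            total += pi * f_slope_at_inf
        # p_i = q_i = 0 contributes nothing
    return total
```

With f(x) = |x - 1| (whose slope at infinity is 1), this yields the absolute-distance measure used below.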
A specific subclass of the f-divergence measures is formed by the f-information measures. These are defined similarly to f-divergence measures, but apply only to specific probability distributions, namely, the joint probability P(x, y) of two variables and the product of their marginal probabilities, P(x)P(y). Thus, the f-information is a measure of dependence: it measures the distance between a given joint probability and the joint probability when the variables are independent [56, 66]. The frequently used functions that can be used to form f-information measures include the V-information, Iα-information, Mα-information, and χα-information. On the other hand, Renyi's distance measure does not fall in the class of f-divergence measures, as it does not satisfy the definition of f-divergence. However, it is a divergence measure in the sense that it measures the distance between two distributions, and it is directly related to the Iα-divergence.

5.2.2.1 V-Information
The V-information results from the function f(x) = |x − 1|, which leads to the measure [56, 66]

V(X, Y) = Σ_x Σ_y |P(x, y) − P(x)P(y)|,

where P(x), P(y), and P(x, y) represent the two marginal probability distributions and their joint probability distribution, respectively. Hence, the V-information calculates the absolute distance between the joint probability of two variables and the product of their marginal probabilities.

5.2.2.2 Iα-Information

The Iα-information, for α ≠ 0 and α ≠ 1, is given by

Iα(X, Y) = (1 / (α(α − 1))) [Σ_x Σ_y (P(x, y))^α / (P(x)P(y))^(α − 1) − 1],

which approaches the mutual information as α approaches 1.
5.2.2.3 Mα-Information

The Mα-information, defined by Matusita [56, 66], uses the function

Mα(x) = |x^α − 1|^(1/α),  0 < α ≤ 1.

By substituting this function in the definition of the f-information measure, the resulting Mα-information measures are

Mα(X, Y) = Σ_x Σ_y |(P(x, y))^α − (P(x)P(y))^α|^(1/α),  0 < α ≤ 1.

These constitute a generalized version of the V-information; that is, the Mα-information is identical to the V-information for α = 1.

5.2.2.4 χα-Information
The class of χα-information measures, proposed by Liese and Vajda [66], uses the function

χα(x) = |1 − x^α|^(1/α),  0 < α ≤ 1.

For α = 1, this function equals the V function; the χα-information and Mα-information measures are, therefore, also identical for α = 1. For α > 1, the χα-information can be written as

χα(X, Y) = Σ_x Σ_y |P(x, y) − P(x)P(y)|^α / (P(x)P(y))^(α − 1).
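For concreteness, the V-, Mα-, and χα-information of a discrete joint distribution can be computed directly from the definitions above (an illustrative sketch assuming the joint distribution is given as a probability table; note that Mα with α = 1 reproduces V):

```python
def marginals(joint):
    """Row and column marginals of a joint probability table."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return px, py

def v_information(joint):
    # V(X, Y) = sum_{x,y} |P(x,y) - P(x)P(y)|
    px, py = marginals(joint)
    return sum(abs(joint[i][j] - px[i] * py[j])
               for i in range(len(px)) for j in range(len(py)))

def m_information(joint, alpha):
    # M_alpha(X, Y) = sum_{x,y} |P(x,y)^a - (P(x)P(y))^a|^(1/a), 0 < a <= 1
    px, py = marginals(joint)
    return sum(abs(joint[i][j] ** alpha - (px[i] * py[j]) ** alpha) ** (1.0 / alpha)
               for i in range(len(px)) for j in range(len(py)))

def chi_information(joint, alpha):
    # chi^alpha(X, Y) = sum_{x,y} |P(x,y) - P(x)P(y)|^a / (P(x)P(y))^(a-1), a > 1
    px, py = marginals(joint)
    return sum(abs(joint[i][j] - px[i] * py[j]) ** alpha / (px[i] * py[j]) ** (alpha - 1)
               for i in range(len(px)) for j in range(len(py)))
```

All three vanish on an independent joint table, since P(x, y) = P(x)P(y) everywhere.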
5.2.2.5 Renyi Distance

The Renyi distance, a measure of information of order α [56, 66], can be defined as

Rα(X, Y) = (1 / (α − 1)) log Σ_x Σ_y (P(x, y))^α / (P(x)P(y))^(α − 1),  α > 0, α ≠ 1.

It reaches its minimum value when P(x, y) and P(x)P(y) are identical, in which case the summation reduces to the sum of the joint probabilities Σ_x Σ_y P(x, y). As complete probability distributions are assumed, this sum is one, and the minimum value of the measure is, therefore, equal to zero. The limit of Renyi's measure for α approaching 1 equals the mutual information I(X, Y).

5.2.3 Discretization
To compute the f-information measures [43], the continuous expression values of a gene are divided into several discrete partitions. The a priori (marginal) probabilities and their joint probabilities are then calculated to compute both gene-class relevance and gene-gene redundancy using the definitions for discrete cases. In this chapter, the discretization method reported in [10, 43, 55] is employed to discretize the continuous gene expression values. The expression values of a gene are discretized using the mean μ and standard deviation σ computed over the n expression values of that gene: any value larger than (μ + σ/2) is transformed to state 1; any value between (μ − σ/2) and (μ + σ/2) is transformed to state 0; and any value smaller than (μ − σ/2) is transformed to state −1. These three states correspond to the over-expression, baseline, and under-expression of a gene.

5.3 Experimental Results
In this section, the performance of different f-information measures is extensively compared with that of mutual information and normalized mutual information. Based on the argumentation given in Sect. 5.2.2, the following information measures are chosen for inclusion in the study:

– Iα- and Rα-information measures, for values of α both smaller and larger than 1;
– V-information (identical to the M1- and χ1-information);
– Mα-information measure for 0 < α ≤ 1;
– χα-information measure for α > 1.

The f-information based mRMR (f-mRMR) algorithm [43], written in C language, is available at http://www.isical.ac.in/~bibl/results/fmRMR/fmRMR.html. All the information measures are implemented in C and run in a Linux environment on a machine with a Pentium IV 3.2 GHz processor, 1 MB cache, and 1 GB RAM.
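The three-state discretization of Sect. 5.2.3, applied to each gene before any probabilities are estimated, can be sketched as follows (an illustrative rendering; the thresholds are the gene's mean shifted by half its standard deviation):

```python
from statistics import mean, stdev

def discretize(expr):
    """Map a gene's continuous expression values to the states
    1 (over-expression), 0 (baseline), -1 (under-expression)."""
    mu, sigma = mean(expr), stdev(expr)
    lo, hi = mu - sigma / 2, mu + sigma / 2
    return [1 if v > hi else (-1 if v < lo else 0) for v in expr]
```

Each discretized gene then enters the relevance and redundancy computations as a three-valued discrete variable.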
To analyze the performance of different f-information measures, the experimentation is done on three microarray gene expression data sets. The major metric for evaluating the performance of different measures is the classification accuracy of the support vector machine (SVM) [67], the K-nearest neighbor (K-NN) rule [12], and the naive Bayes (NB) classifier [12].

5.3.1 Gene Expression Data Sets
The performance of different f-information measures is compared using the following binary-class data sets.

5.3.1.1 Breast Cancer Data Set
5.3.1.2 Leukemia Data Set

The leukemia data set contains 72 samples and 7,129 genes.
5.3.1.3 Colon Cancer Data Set

The colon cancer data set consists of 62 samples and 2,000 genes.
5.3.2 Class Prediction Methods
The SVM, K-NN rule, and NB classifier are used to evaluate the performance of different f-information measures. A brief introduction to the SVM is reported in Chaps. 3 and 4; in this work, linear kernels are used in the SVM to construct the decision boundary. Descriptions of both the K-NN rule and the NB classifier are reported next.

5.3.2.1 K-Nearest Neighbor Rule

The K-NN rule classifies an unlabeled sample by assigning it the class most frequent among its K nearest training samples, where proximity is computed over the expression values of the selected genes.
5.3.2.2 Naive Bayes Classifier
Given the jth sample x_j with m gene expression levels (x_1j, x_2j, ..., x_mj) for the m genes, the posterior probability that x_j belongs to class c is

p(c | x_j) ∝ p(c) Π_i p(x_ij | c),

where the p(x_ij | c) are conditional tables or conditional densities estimated from the training examples.

5.3.3 Performance Analysis
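Before turning to the results, the naive Bayes rule of Sect. 5.3.2.2, applied to discretized (-1/0/1) expression states, can be sketched as follows (an illustrative rendering; the Laplace smoothing is our addition to avoid zero counts, not something stated in the chapter):

```python
from collections import Counter, defaultdict

def nb_train(X, y):
    """X: samples as lists of discretized gene states; y: class labels.
    Returns class priors and per-(gene, class) state count tables."""
    priors = Counter(y)
    cond = defaultdict(Counter)          # (gene index, class) -> state counts
    for states, label in zip(X, y):
        for i, s in enumerate(states):
            cond[(i, label)][s] += 1
    return priors, cond

def nb_predict(priors, cond, states):
    """Pick the class maximizing p(c) * prod_i p(x_i | c),
    with Laplace smoothing over the three possible states."""
    n = sum(priors.values())
    best, best_score = None, -1.0
    for c, nc in priors.items():
        score = nc / n
        for i, s in enumerate(states):
            score *= (cond[(i, c)][s] + 1) / (nc + 3)
        if score > best_score:
            best, best_score = c, score
    return best
```

For leave-one-out cross-validation, this train/predict pair is simply rerun once per sample, each time holding that sample out of the training set.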
Tables 5.1, 5.4, and 5.7 and Tables 5.2, 5.5, and 5.8 report the performance of different f-information measures using the NB and SVM, respectively, while Tables 5.3, 5.6, and 5.9 show the results using the K-NN rule. The values of α investigated for the α-parameterized f-information measures are 0.2, 0.5, 0.8, 1.5, 2.0, 3.0, and 4.0. Some measures resemble mutual information for α approaching 1 (Iα and Rα), and some resemble another measure (M1 and χ1 equal V). To compute the prediction accuracy of the NB, SVM, and K-NN, leave-one-out cross-validation is performed on each gene expression data set. The number of selected genes ranges from 2 to 50, and each data set is preprocessed by standardizing each sample to zero mean and unit variance.

| f-Information measures | Number of selected genes | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 2 | 5 | 8 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
| 95.9 | 98.0 | 98.0 | 98.0 | 100 | 100 | 100 | 100 | 100 | 100 | 98.0 | 98.0 |
| 95.9 | 98.0 | 98.0 | 98.0 | 100 | 100 | 100 | 98.0 | 98.0 | 98.0 | 98.0 | 98.0 |
| 95.9 | 100 | 95.9 | 98.0 | 98.0 | 98.0 | 95.9 | 93.9 | 91.8 | 91.8 | 89.8 | 89.8 |
| 95.9 | 98.0 | 95.9 | 100 | 98.0 | 93.9 | 93.9 | 89.8 | 87.8 | 87.8 | 87.8 | 87.8 |
| 95.9 | 98.0 | 95.9 | 93.9 | 93.9 | 91.8 | 91.8 | 89.8 | 85.7 | 83.7 | 83.7 | 81.6 |
| 95.9 | 95.9 | 95.9 | 93.9 | 91.8 | 91.8 | 91.8 | 87.8 | 87.8 | 83.7 | 83.7 | 81.6 |
| 95.9 | 95.9 | 95.9 | 93.9 | 91.8 | 91.8 | 89.8 | 87.8 | 87.8 | 83.7 | 83.7 | 81.6 |
| 95.9 | 95.9 | 95.9 | 91.8 | 91.8 | 89.8 | 87.8 | 87.8 | 85.7 | 83.7 | 81.6 | 81.6 |
| 85.7 | 95.9 | 95.9 | 98.0 | 100 | 100 | 100 | 100 | 98.0 | 98.0 | 98.0 | 98.0 |
| 95.9 | 98.0 | 98.0 | 98.0 | 100 | 100 | 100 | 98.0 | 98.0 | 98.0 | 98.0 | 98.0 |
| 95.9 | 93.9 | 95.9 | 98.0 | 93.9 | 91.8 | 91.8 | 87.8 | 85.7 | 85.7 | 85.7 | 79.6 |
| 87.8 | 89.8 | 83.7 | 85.7 | 89.8 | 87.8 | 87.8 | 83.7 | 85.7 | 85.7 | 83.7 | 83.7 |
| 95.9 | 98.0 | 95.9 | 98.0 | 93.9 | 89.8 | 89.8 | 85.7 | 83.7 | 81.6 | 79.6 | 79.6 |
| 95.9 | 95.9 | 95.9 | 93.9 | 91.8 | 91.8 | 91.8 | 87.8 | 87.8 | 83.7 | 83.7 | 81.6 |
| 95.9 | 95.9 | 95.9 | 93.9 | 93.9 | 93.9 | 93.9 | 89.8 | 87.8 | 85.7 | 83.7 | 83.7 |
| 95.9 | 98.0 | 100 | 95.9 | 95.9 | 93.9 | 93.9 | 89.8 | 85.7 | 85.7 | 85.7 | 85.7 |
| 95.9 | 98.0 | 98.0 | 98.0 | 100 | 100 | 100 | 100 | 100 | 100 | 98.0 | 98.0 |
| 95.9 | 98.0 | 98.0 | 98.0 | 100 | 100 | 100 | 98.0 | 98.0 | 98.0 | 98.0 | 98.0 |
| 95.9 | 100 | 95.9 | 95.9 | 98.0 | 98.0 | 95.9 | 93.9 | 91.8 | 91.8 | 89.8 | 89.8 |
| 95.9 | 98.0 | 95.9 | 100 | 98.0 | 93.9 | 93.9 | 89.8 | 87.8 | 87.8 | 87.8 | 87.8 |
| 95.9 | 98.0 | 95.9 | 93.9 | 91.8 | 91.8 | 91.8 | 89.8 | 87.8 | 83.7 | 83.7 | 83.7 |
| 95.9 | 91.8 | 95.9 | 95.9 | 91.8 | 91.8 | 91.8 | 89.8 | 85.7 | 83.7 | 83.7 | 81.6 |
| 93.9 | 89.8 | 93.9 | 93.9 | 93.9 | 91.8 | 91.8 | 91.8 | 89.8 | 85.7 | 83.7 | 79.6 |
| 93.9 | 93.9 | 91.8 | 91.8 | 91.8 | 91.8 | 89.8 | 89.8 | 87.8 | 83.7 | 83.7 | 81.6 |
| 95.9 | 98.0 | 98.0 | 100 | 95.9 | 93.9 | 93.9 | 91.8 | 91.8 | 89.8 | 89.8 | 89.8 |
| f-Information measures | Number of selected genes | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 2 | 5 | 8 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
| 81.6 | 100 | 95.9 | 98.0 | 98.0 | 100 | 95.9 | 95.9 | 98.0 | 98.0 | 98.0 | 95.9 |
| 81.6 | 100 | 100 | 100 | 95.9 | 95.9 | 100 | 95.9 | 95.9 | 95.9 | 98.0 | 98.0 |
| 81.6 | 98.0 | 100 | 100 | 98.0 | 95.9 | 95.9 | 98.0 | 98.0 | 95.9 | 98.0 | 95.9 |
| 81.6 | 98.0 | 100 | 100 | 98.0 | 95.9 | 95.9 | 93.9 | 93.9 | 93.9 | 95.9 | 95.9 |
| 85.7 | 91.8 | 98.0 | 100 | 98.0 | 100 | 95.9 | 95.9 | 95.9 | 95.9 | 93.9 | 93.9 |
| 85.7 | 95.9 | 98.0 | 100 | 100 | 100 | 95.9 | 95.9 | 95.9 | 93.9 | 93.9 | 93.9 |
| 85.7 | 95.9 | 98.0 | 100 | 100 | 95.9 | 95.9 | 95.9 | 95.9 | 95.9 | 93.9 | 93.9 |
| 85.7 | 89.8 | 100 | 98.0 | 100 | 95.9 | 95.9 | 95.9 | 95.9 | 95.9 | 95.9 | 95.9 |
| 77.6 | 95.9 | 91.8 | 89.8 | 87.8 | 93.9 | 93.9 | 95.9 | 95.9 | 95.9 | 95.9 | 98.0 |
| 81.6 | 100 | 100 | 100 | 95.9 | 95.9 | 100 | 95.9 | 95.9 | 95.9 | 98.0 | 98.0 |
| 85.7 | 89.8 | 93.9 | 89.8 | 93.9 | 95.9 | 93.9 | 93.9 | 93.9 | 91.8 | 93.9 | 93.9 |
| 83.7 | 81.6 | 87.8 | 91.8 | 87.8 | 83.7 | 83.7 | 83.7 | 85.7 | 83.7 | 87.8 | 85.7 |
| 85.7 | 87.8 | 91.8 | 89.8 | 93.9 | 91.8 | 95.9 | 95.9 | 93.9 | 93.9 | 93.9 | 93.9 |
| 85.7 | 95.9 | 98.0 | 100 | 100 | 100 | 95.9 | 95.9 | 95.9 | 93.9 | 93.9 | 93.9 |
| 85.7 | 89.8 | 100 | 95.9 | 98.0 | 95.9 | 98.0 | 93.9 | 93.9 | 93.9 | 93.9 | 93.9 |
| 85.7 | 91.8 | 100 | 100 | 98.0 | 95.9 | 95.9 | 95.9 | 95.9 | 95.9 | 95.9 | 95.9 |
| 81.6 | 100 | 95.9 | 98.0 | 98.0 | 98.0 | 95.9 | 95.9 | 95.9 | 98.0 | 98.0 | 98.0 |
| 81.6 | 100 | 100 | 100 | 95.9 | 95.9 | 100 | 95.9 | 95.9 | 93.9 | 98.0 | 98.0 |
| 81.6 | 98.0 | 100 | 100 | 98.0 | 95.9 | 95.9 | 98.0 | 98.0 | 95.9 | 98.0 | 95.9 |
| 81.6 | 98.0 | 100 | 100 | 98.0 | 95.9 | 95.9 | 93.9 | 93.9 | 93.9 | 95.9 | 95.9 |
| 85.7 | 91.8 | 98.0 | 100 | 98.0 | 100 | 95.9 | 95.9 | 95.9 | 95.9 | 93.9 | 93.9 |
| 85.7 | 89.8 | 95.9 | 95.9 | 98.0 | 100 | 95.9 | 95.9 | 95.9 | 95.9 | 93.9 | 93.9 |
| 87.8 | 87.8 | 100 | 100 | 93.9 | 95.9 | 93.9 | 95.9 | 95.9 | 95.9 | 95.9 | 95.9 |
| 87.8 | 89.8 | 89.8 | 93.9 | 98.0 | 100 | 100 | 98.0 | 98.0 | 95.9 | 95.9 | 93.9 |
| 81.6 | 98.0 | 100 | 100 | 98.0 | 95.9 | 98.0 | 95.9 | 95.9 | 95.9 | 95.9 | 93.9 |
| f-Information measures | Number of selected genes | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 2 | 5 | 8 | 10 | 15 | 20 | 25 | 30 | 35 | 40 | 45 | 50 |
| 89.8 | 93.9 | 93.9 | 95.9 | 98.0 | 95.9 | 95.9 | 93.9 | 95.9 | 98.0 | 98.0 | 98.0 |
| 89.8 | 93.9 | 95.9 | 95.9 | 98.0 | 98.0 | 95.9 | 95.9 | 98.0 | 98.0 | 95.9 | 98.0 |
| 89.8 | 98.0 | 95.9 | 95.9 | 98.0 | 95.9 |






























































