A Quantum Statistic Algorithm for Missing Value Estimation in Gene Expression Matrices

By Wei Jing
University Scholars Program, National University of Singapore
Department of Computer Science, School of Computing, National University of Singapore
Department of Philosophy, Faculty of Arts and Social Sciences, National University of Singapore

Project Title: A Quantum Statistic Algorithm for Missing Value Estimation in Gene Expression Matrices
Project Supervisor: Dr Kuldip Singh

Abstract

Gene expression matrices are essential for mining gene functions and information. However, expression matrices obtained from gene microarray experiments are sometimes incomplete, so estimating the missing values becomes an important step in recovering the full matrices. Several statistical methods for estimating missing values have been developed, such as K-Nearest Neighbor (KNN) and Singular Value Decomposition (SVD). In contrast to the existing methods, this essay introduces a quantum statistic algorithm (qStas) for approximating the missing values in gene expression matrices. The new algorithm integrates quantum theory with statistical estimation.

Keywords: quantum method, qStas algorithm, microarrays, missing value estimation.
Acknowledgements

Dr Kuldip Singh, Lecturer of UIT2205: Quantum Computing
Prof Wong Limsoon, Lecturer of CS2220: Introduction to Computational Biology
UIT2205 and CS2220 classmates
University Scholars Program, National University of Singapore
Department of Computer Science, School of Computing, National University of Singapore

Table of Contents

Abstract
Acknowledgements
1 Introduction
2 Inspiration
  2.1 A Simplified Model of Gene Expression Matrix
  2.2 SVD Algorithm
  2.3 Quantum Theory
3 Quantum Statistic Algorithm
  3.1 Idea
  3.2 Workflow
  3.3 Analysis
4 Results and Discussion
5 Conclusion
Appendix A
References

1 Introduction

In bioinformatics, gene expression matrices, which contain gene expression values, are important for gene analysis. However, the expression matrices obtained from microarray experiments sometimes have missing values, for diverse reasons; for example, some cells of a microarray might be covered by dust. Unfortunately, many gene analysis algorithms, such as the C4.5 decision tree algorithm and K-clustering, require the expression matrices to be complete; otherwise their accuracy and effectiveness are weakened. Recovering complete expression matrices is therefore a significant issue. Researchers in bioinformatics, statistics and artificial intelligence have developed several effective methods for approximating the missing expression values, for instance Singular Value Decomposition (SVD) and K-Nearest Neighbor (KNN).
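To illustrate the statistical flavour of these methods, a K-Nearest-Neighbor imputation can be sketched in a few lines. This is a hypothetical minimal sketch, not the exact scheme of Troyanskaya et al.: the function name, the Euclidean distance metric and the default k = 2 are illustrative choices only.

```python
import numpy as np

def knn_impute(A, k=2):
    """Estimate each missing entry (NaN) from the k most similar genes
    (rows) that have that entry observed. Illustrative sketch only."""
    A = A.astype(float).copy()
    # Temporary copy with NaNs replaced by row means, so distances are defined
    filled = np.where(np.isnan(A), np.nanmean(A, axis=1, keepdims=True), A)
    for i, j in zip(*np.where(np.isnan(A))):
        # Candidate genes: every other row with column j observed
        candidates = [r for r in range(A.shape[0])
                      if r != i and not np.isnan(A[r, j])]
        dist = [np.linalg.norm(filled[i] - filled[r]) for r in candidates]
        nearest = [candidates[t] for t in np.argsort(dist)[:k]]
        # Average the neighbours' values at the missing position
        A[i, j] = np.mean([A[r, j] for r in nearest])
    return A
```

On the simplified matrix used later in this paper, the two nearest neighbours of Gene 2 turn out to be Genes 3 and 1, so the missing entry is filled with the mean of their third-column values.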
Most of these methods are statistical, making use of the relations between the rows and columns of the expression matrices. This paper introduces a new algorithm, the quantum statistic algorithm (qStas for short), which integrates the idea of a statistical method (SVD) with concepts developed in quantum mechanics.

In this paper, the Inspiration section briefly describes a simplified model of a gene expression matrix, the SVD algorithm and quantum theory. After that, the Algorithm section discusses in detail the idea, workflow and analysis of the qStas algorithm. Experimental results are presented in the Results section.

2 Inspiration

2.1 A Simplified Model of Gene Expression Matrix

A gene expression matrix (denoted A) contains the data obtained from gene microarray experiments: each row of the matrix contains the expression values of one gene under different conditions, and each column contains the expression values of different genes under one condition. In other words, each entry A[i][j] represents the expression value of the i-th gene under the j-th condition.

However, an actual expression matrix is usually huge, involving hundreds or thousands of genes and tens of conditions, and the values in different rows are often of different scales. To make analysis easier, the experiments in this paper are conducted on simplified models. A simplified gene expression matrix contains seven rows and four columns (Table 1), and the values in all columns are tuned to a similar scale. If the algorithm works for the simplified model, it can also be applied to actual gene expression matrices.

\        Condition 1   Condition 2   Condition 3   Condition 4
Gene 1   1.00          3.00          9.00          13.00
Gene 2   1.20          3.39          *(8.80)       12.59
Gene 3   1.50          4.00          8.49          12.00
Gene 4   2.00          5.01          8.01          *(11.00)
Gene 5   2.40          5.81          7.61          *(10.20)
Gene 6   3.00          7.01          6.99          9.02
Gene 7   4.00          8.99          *(6.00)       6.98

Table 1. A simplified gene expression matrix.
The (i, j)-entry represents the expression value of the i-th gene under the j-th condition. Missing values are marked with *, with the corresponding true value in brackets.

2.2 SVD Algorithm

The Singular Value Decomposition (SVD) algorithm seeks mutually orthogonal gene expression patterns that can be linearly combined to approximate the expressions of all genes in the matrix. These mutually orthogonal patterns are termed eigengenes, and they can be obtained from the decomposition

    A = U D Vᵀ

where A is the gene expression matrix containing the expression values of m genes under n conditions, U and V are orthogonal matrices, and D is a diagonal matrix. Each row of Vᵀ is an eigengene, whose corresponding eigenvalue is a diagonal entry of D. Given A = U D Vᵀ, the following equation can be derived:

    AᵀA = (U D Vᵀ)ᵀ(U D Vᵀ) = V (DᵀD) Vᵀ

Thus V can be obtained by orthogonalizing AᵀA. The SVD algorithm then chooses the k eigengenes in Vᵀ that correspond to the k largest eigenvalues in DᵀD. After that, it estimates the missing expression values by linearly regressing the genes against the chosen eigengenes, that is, by computing the least-squares solution X to

    U_J X = A    (0)

where U_J is a new matrix derived from U after choosing J < p eigengenes.¹ The workflow of the SVD algorithm is:

Step 1: Fill the missing entries with the average value of the corresponding row
Repeat:
    Step 2: Obtain eigengenes and eigenvalues
    Step 3: Choose the k eigengenes that correspond to the k largest eigenvalues
    Step 4: Estimate the expression values by linearly regressing the genes against the k eigengenes
Until the estimated values converge (convergence test)

2.3 Quantum Theory

Quantum theory holds that a state (represented by a vector) collapses to an eigenstate (eigenvector) of a Hermitian operator (a Hermitian matrix) when a measurement is made:

    Q̂ : |x⟩ → |eᵢ⟩

¹ For a detailed explanation of the SVD algorithm and the equation, please refer to Alter et al.'s and Troyanskaya et al.'s papers.
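The iterative SVD scheme of Section 2.2 can be sketched in NumPy. This is a minimal illustration, not Troyanskaya et al.'s exact implementation: the function name, the choice k = 2 and the use of `lstsq` over the observed coordinates are assumptions made for the sketch.

```python
import numpy as np

def svd_impute(A, k=2, tol=1e-3, max_iter=100):
    """Iterative SVD-based imputation (minimal sketch). NaN marks a missing entry."""
    A = A.astype(float)
    miss = np.isnan(A)
    # Step 1: fill missing entries with the corresponding row average
    filled = np.where(miss, np.nanmean(A, axis=1, keepdims=True), A)
    for _ in range(max_iter):
        # Step 2: eigengenes are the rows of V^T from A = U D V^T
        U, D, Vt = np.linalg.svd(filled, full_matrices=False)
        # Step 3: keep the k eigengenes with the largest singular values
        Vk = Vt[:k]
        # Step 4: regress each incomplete gene on the k eigengenes,
        # using only its observed coordinates
        new = filled.copy()
        for i in np.where(miss.any(axis=1))[0]:
            obs = ~miss[i]
            coef, *_ = np.linalg.lstsq(Vk[:, obs].T, filled[i, obs], rcond=None)
            new[i, miss[i]] = coef @ Vk[:, miss[i]]
        if np.max(np.abs(new - filled)) < tol:  # convergence test
            return new
        filled = new
    return filled
```

Observed entries are never altered; only the initially missing positions are re-estimated in each pass, mirroring the workflow above.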
Let the eigenvectors of a Hermitian operator Q̂ be |e₁⟩, |e₂⟩, …, |eₙ⟩ and the corresponding eigenvalues be c₁, c₂, …, cₙ. The probability that a (normalized) state |x⟩ collapses to eigenstate |eᵢ⟩ is given by

    P(|x⟩ → |eᵢ⟩) = ⟨x|eᵢ⟩⟨eᵢ|x⟩    (1)

where the sandwiched term |eᵢ⟩⟨eᵢ| is called a projector. The summation of all projectors is the identity matrix, since the eigenvectors are complete:

    Σ_{i=1}^{n} |eᵢ⟩⟨eᵢ| = I    (2)

And the operator can be written as the summation of the projectors weighted by the corresponding eigenvalues:

    Q̂ = Σ_{i=1}^{n} cᵢ |eᵢ⟩⟨eᵢ|    (3)

Equations (1), (2) and (3) are key concepts in quantum theory and are used for developing and analyzing the quantum statistic algorithm (qStas).

3 Quantum Statistic Algorithm

3.1 Idea

Inspired by the SVD algorithm and quantum theory, the Quantum Statistic Algorithm (qStas) is a hybrid of the two. It adopts the same schema as SVD: gradually adjusting the values of the missing entries and terminating the process with a convergence test. However, instead of choosing eigengenes and performing classical linear regression (Steps 2 to 4 using Equation (0)), it adopts the quantum probability rule of Equation (1) to approximate the missing values.

The operator Q̂ in the qStas algorithm is simply AᵀA, whose Hermitian property can be proven. For any 1 ≤ i, j ≤ n, the (i, j) entry of Q̂ = AᵀA is

    Q̂[i][j] = Σ_{k=1}^{m} Aᵀ[i][k] A[k][j] = Σ_{k=1}^{m} A[k][i] A[k][j] = Σ_{k=1}^{m} Aᵀ[j][k] A[k][i] = Q̂[j][i]

Thus Q̂ = AᵀA is a symmetric matrix; in other words, it is a Hermitian matrix in which the imaginary component of every entry is 0.

The eigenvectors can be obtained by orthogonalizing Q̂. As in SVD, the eigenvectors of Q̂ are termed eigengenes, each of which corresponds to a column of an orthogonal matrix V. Let |Gₓ⟩ be a gene, whose corresponding bra vector ⟨Gₓ| is represented as a row in the expression matrix, and let Q̂ = AᵀA be the Hermitian operator whose eigenstates |eᵢ⟩ are the eigengenes.
If Q̂ is operated on a state (gene) |Gₓ⟩, the probability that |Gₓ⟩ collapses to an eigenstate (eigengene) |eᵢ⟩ is ⟨Gₓ|eᵢ⟩⟨eᵢ|Gₓ⟩, by Equation (1). Therefore, the expected state that |Gₓ⟩ collapses to when operated on by Q̂ is

    Exp(Q, |Gₓ⟩) = Σ_{i=1}^{n} ⟨Gₓ|eᵢ⟩⟨eᵢ|Gₓ⟩ |eᵢ⟩    (4)

obtained by accumulating over all eigenstates and their corresponding probabilities (this is slightly different from the quantum mechanics tradition, which uses the eigenvalues as the possible outcomes).

However, Equation (4) exhibits a problem: the probabilities do not sum to 1, because |Gₓ⟩ is not normalized (the norm of |Gₓ⟩ is not guaranteed to be 1). A solution is to normalize |Gₓ⟩ before projecting it onto the eigenstates, and to restore its norm after all the projections. Let |G′ₓ⟩ = |Gₓ⟩ / ‖|Gₓ⟩‖. The expected state that |Gₓ⟩ collapses to can then be given by

    Exp(Q, |Gₓ⟩) = ‖|Gₓ⟩‖ Σ_{i=1}^{n} ⟨G′ₓ|eᵢ⟩⟨eᵢ|G′ₓ⟩ |eᵢ⟩    (5)

By substituting in |G′ₓ⟩ = |Gₓ⟩ / ‖|Gₓ⟩‖, Equation (5) simplifies to

    Exp(Q, |Gₓ⟩) = (1 / ‖|Gₓ⟩‖) Σ_{i=1}^{n} ⟨Gₓ|eᵢ⟩⟨eᵢ|Gₓ⟩ |eᵢ⟩    (6)

The qStas algorithm relies on Equation (6) to approximate the missing values, by operating the Hermitian operator on every gene |Gₓ⟩ that initially has missing expression values. Like the SVD algorithm, qStas also starts by filling the missing entries with the corresponding row averages, and terminates when the estimated values converge.

3.2 Workflow

The workflow of the qStas algorithm is similar to that of SVD, except that it uses Equation (6), derived from quantum theory, to perform the approximation.

Step 1: Fill the missing values with the corresponding row averages
Repeat:
    Step 2: Obtain the Hermitian operator Q̂ and its eigenstates (eigengenes)
    Step 3: Operate Q̂ on the genes that have missing values, using Equation (6)
    Step 4: Replace the missing values with the newly estimated values, while all other entries remain unchanged
Until the estimated values converge

*Note: "missing values" here means the values that are missing in the initial matrix.
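The workflow above can be sketched in NumPy (the paper's own experiments used MATLAB 7.0.1, so this is a re-implementation sketch, not the original code). One caveat: Equation (6) is sensitive to the sign of each eigenvector, which the paper leaves unspecified; the convention below, orienting each eigenvector so that ⟨g|eᵢ⟩ ≥ 0, is an assumption of this sketch.

```python
import numpy as np

def qstas(A, tol=1e-3, max_iter=100):
    """Quantum statistic imputation sketch. NaN marks a missing entry."""
    A = A.astype(float)
    miss = np.isnan(A)
    # Step 1: fill missing entries with the corresponding row averages
    G = np.where(miss, np.nanmean(A, axis=1, keepdims=True), A)
    for _ in range(max_iter):
        # Step 2: Hermitian operator Q = A^T A and its eigenstates (eigengenes)
        Q = G.T @ G
        _, E = np.linalg.eigh(Q)          # columns of E are the |e_i>
        new = G.copy()
        for i in np.where(miss.any(axis=1))[0]:
            g = G[i]
            c = g @ E                     # <g|e_i> for each eigengene
            # Assumed orientation convention: flip eigenvectors so <g|e_i> >= 0
            E_i = E * np.sign(np.where(c == 0, 1.0, c))
            p = c ** 2                    # <g|e_i><e_i|g>
            # Step 3: expected collapsed state, Equation (6)
            exp_state = (E_i @ p) / np.linalg.norm(g)
            # Step 4: replace only the initially missing entries
            new[i, miss[i]] = exp_state[miss[i]]
        if np.max(np.abs(new - G)) < tol:  # convergence test
            break
        G = new
    return G
```

As in the workflow, the observed entries are never modified; only the initially missing positions are updated, and iteration stops once the largest change falls below the 10⁻³ threshold used in Section 4 (or the iteration cap is reached).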
The algorithm can also be treated as a black box containing a Hermitian operator Q̂ and an evaluation function f. The black box simulates the behaviour of the qStas algorithm: 1) it receives a gene expression matrix A; 2) it operates on A and evaluates the resulting matrix A′ using f; 3) it outputs A′ if the estimated values converge; 4) otherwise it feeds A′ back into the box and returns to step 1.

3.3 Analysis

Optimality and termination are the usual considerations when analyzing whether an algorithm is effective. For an estimation algorithm, optimality (accuracy) depends on its approximation method; hence the accuracy of qStas relies on Equation (6), which is derived from quantum theory. Therefore, qStas is a plausible estimation algorithm provided it terminates after a finite number of steps. The following section argues for its termination using a physical model.

3.3.1 Physical model of the qStas algorithm

In this model, the eigengenes and the genes with missing values are modeled as electric bars lying in an n-dimensional space, each with one end fixed at the origin. The electric bars of the eigengenes are always orthogonal to each other, and they try to repel the electric bars of the genes. For simplicity, we explain this for the 2-dimensional case (a plane), but the argument can be extended to higher dimensions.

Let |NGₓ⟩ = |Gₓ⟩ / ‖|Gₓ⟩‖ be the normalized form of gene |Gₓ⟩, and let p = ⟨NGₓ|eᵢ⟩ be the projection of |NGₓ⟩ on eigengene |eᵢ⟩. Clearly |p| ≤ 1. After projecting |NGₓ⟩ onto the projector |eᵢ⟩⟨eᵢ|, the contribution along |eᵢ⟩ is

    ⟨NGₓ|eᵢ⟩⟨eᵢ|NGₓ⟩ |eᵢ⟩ = p² |eᵢ⟩

so the new projection of the gene on |eᵢ⟩ is p′ = p² ≤ |p|. Suppose p ≥ 0. Since the projection of |NGₓ⟩ is decreased, the angle between |NGₓ⟩ and |eᵢ⟩ is enlarged if |NGₓ⟩ keeps a similar length. In the p < 0 case, the angle between |NGₓ⟩ and |eᵢ⟩ or −|eᵢ⟩ is enlarged. In summary,
|eᵢ⟩ repels the genes, so that the angle between |NGₓ⟩ and |eᵢ⟩ tends toward 90°.

(Figure: the projection of the gene on |eᵢ⟩ shrinks from p to p² under the operation.)

Therefore, the eigengenes make the genes rotate in each operation, while also gently adjusting themselves as a whole after the operation. Classical physics holds that a perpetual motion machine does not exist. Thus the electric bars, as a closed system (the operators are generated from the system itself), must come to rest after finitely many rotations. In other words, the estimated values converge and the algorithm terminates after a finite number of operations.

3.3.2 Time complexity

The running time of qStas depends on two factors: the size of the matrix and the number of iterations before termination. Since the highest time complexity among the matrix operations involved in the algorithm is O(m²n) (for computing AᵀA), the overall time complexity is O(km²n), where k is the number of operations performed before the estimated values converge. This complexity is the same as that of the SVD algorithm.

4 Results and Discussion

The qStas algorithm is tested on the simplified gene expression matrix shown in Table 2. Matrix operations are conducted using MATLAB 7.0.1, and the discrepancy threshold for determining convergence is set to 10⁻³.

\        Condition 1   Condition 2   Condition 3   Condition 4
Gene 1   1.00          3.00          9.00          13.00
Gene 2   1.20          3.39          *(8.80)       12.59
Gene 3   1.50          4.00          8.49          12.00
Gene 4   2.00          5.01          8.01          *(11.00)
Gene 5   2.40          5.81          7.61          *(10.20)
Gene 6   3.00          7.01          6.99          9.02
Gene 7   4.00          8.99          *(6.00)       6.98

Table 2. A simplified gene expression matrix. The (i, j)-entry represents the expression value of the i-th gene under the j-th condition. Missing values are marked with *, with the corresponding true value in brackets.

In the initiation step, the missing values are filled with the corresponding row averages: A[2][3]=5.73, A[4][4]=5.00, A[5][4]=5.27 and A[7][3]=6.66. After that, by orthogonalizing Q̂ = AᵀA, the eigengenes |e₁⟩, …, |e₄⟩ are obtained, each corresponding to a column of the orthogonal matrix V1.

After the first operation on the genes using Equation (6), the missing values are estimated to be A[2][3]=7.7107, A[4][4]=6.4547, A[5][4]=6.6801 and A[7][3]=7.2017. Compared with the initial values, three of the estimates (A[2][3], A[4][4], A[5][4]) move closer to the true values, while one of them (A[7][3]) moves away from its expected value. On average, however, the missing values approach the corresponding expected values. After the missing values are replaced by these estimates, the expression matrix A is fed back for a second operation, which yields a new eigengene matrix V2.

Operating the new Hermitian operator on A gives new estimates: A[2][3]=8.3673, A[4][4]=7.6975, A[5][4]=7.8773 and A[7][3]=7.1293. In this round, all the missing values move toward the true values, although they have not yet converged.

After 8 operations, the approximated values are already very close to the expected values: A[2][3]=8.2280 (8.8000), A[4][4]=10.6039 (11.0000), A[5][4]=10.8094 (10.2000) and A[7][3]=6.3903 (6.0000). Although the results have still not converged, the rate of change is smaller than in the previous operations. Another important observation is that the eigengene matrix in each round becomes very similar to that of the previous round; for example, the entries of the eigengene matrices in the eighth and ninth iterations differ only slightly. This implies that the eigengenes, and with them the estimates of the missing values, are converging. The intermediate expression matrices for the test example can be found in Appendix A.
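The error statistics for the converged estimates reported below can be reproduced directly from the final estimated values and the true values in Table 2 (a check in Python; the paper's own computations were done in MATLAB):

```python
import numpy as np

# Converged estimates (14th operation) and true values, from Table 2:
# entries A[2][3], A[4][4], A[5][4], A[7][3]
est = np.array([8.0838, 10.9689, 11.1889, 6.2527])
true = np.array([8.8000, 11.0000, 10.2000, 6.0000])

rmsd = np.sqrt(np.mean((est - true) ** 2))  # root-mean-square difference
cv = rmsd / true.mean()                     # coefficient of variance

print(round(rmsd, 4))      # 0.6236
print(round(cv * 100, 2))  # 6.93
```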
Unsurprisingly, the estimated values converge at the 14th operation, where the estimates are A[2][3]=8.0838, A[4][4]=10.9689, A[5][4]=11.1889 and A[7][3]=6.2527. Compared with the 13th operation, where A[2][3]=8.0937, A[4][4]=10.9443, A[5][4]=11.1632 and A[7][3]=6.2621, the discrepancy is only 0.09%, which is smaller than the threshold of 10⁻³. Compared with the true values, the root-mean-square difference of the estimates is 0.6236 and the coefficient of variance is 6.93%; the precision of the estimation is thus 93.07%, which is very high for an approximation algorithm. Given its O(km²n) time complexity, the qStas algorithm is also scalable to gene expression matrices of actual size.

5 Conclusion

Missing value estimation in expression matrices is an important issue for gene analysis. The quantum statistic algorithm (qStas) adopts a probability rule from quantum mechanics to approximate the missing expression values. Its termination has been argued using a classical physical model, and its time complexity is O(km²n). The new algorithm works effectively on the simplified model and produces estimates of high precision; predictably, it should also work well when applied to actual gene expression matrices of greater size.

Quantum mechanics has developed a set of state-of-the-art theorems and methods. These methods are useful not only for quantum physics itself, but also for other disciplines such as statistics, computer science and analytic philosophy. The qStas algorithm developed in this paper is an example of applying quantum theory in bioinformatics.

Appendix A

The intermediate results of operating qStas on the simplified gene expression matrix.
Expected expression matrix (true values):

Exp =
 1.0000   3.0000   9.0000  13.0000
 1.2000   3.3900   8.8000  12.5900
 1.5000   4.0000   8.4900  12.0000
 2.0000   5.0100   8.0100  11.0000
 2.4000   5.8100   7.6100  10.2000
 3.0000   7.0100   6.9900   9.0200
 4.0000   8.9900   6.0000   6.9800

Original matrix, with the missing entries filled by row averages:

A0 =
 1.0000   3.0000   9.0000  13.0000
 1.2000   3.3900   5.7300  12.5900
 1.5000   4.0000   8.4900  12.0000
 2.0000   5.0100   8.0100   5.0000
 2.4000   5.8100   7.6100   5.2700
 3.0000   7.0100   6.9900   9.0200
 4.0000   8.9900   6.6600   6.9800

Since Step 4 of the algorithm changes only the initially missing entries, each intermediate matrix Ai is summarized by its four estimated entries:

Operation   A[2][3]   A[4][4]   A[5][4]   A[7][3]
A1          7.7107    6.4547    6.6801    7.2017
A2          8.3673    7.6975    7.8773    7.1293
A3          8.5241    8.6301    8.7796    6.8609
A4          8.5132    9.3985    9.5690    6.7294
A5          8.4405    9.8881   10.0715    6.6179
A6          8.3584   10.2158   10.4086    6.5233
A7          8.2860   10.4429   10.6428    6.4479
A8          8.2280   10.6039   10.8094    6.3903
A9          8.1836   10.7198   10.9296    6.3473
A10         8.1503   10.8040   11.0171    6.3155
A11         8.1255   10.8656   11.0812    6.2921
A12         8.1072   10.9109   11.1284    6.2748
A13         8.0937   10.9443   11.1632    6.2621
A14         8.0838   10.9689   11.1889    6.2527

References

Alter, O., Brown, P. O., & Botstein, D. (2000). Singular value decomposition for genome-wide expression data processing and modeling. Proc. Natl Acad. Sci. USA, 97, 10101–10106.

Li, J., & Wong, L. (2004). Techniques for analysis of gene expression data.

Singh, K. (2011). UIT2205 Quantum Computing lecture notes. National University of Singapore.

Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., Botstein, D., & Altman, R. B. (2001). Missing value estimation methods for DNA microarrays. Bioinformatics, 17(6), 520–525.

Yu, T., Peng, H., & Sun, W. (2011). Incorporating nonlinear relationships in microarray missing value imputation. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 8(3).