G. W. O'BRIEN, Information Management Tools for Updating an SVD-Encoded Indexing Scheme, Master's thesis, The University of Tennessee, Knoxville, TN, 1994.
.... operations such as parsing document texts, creating a term-by-document matrix, computing the truncated SVD of this matrix, creating the LSI database of singular values and vectors for retrieval, matching user queries to documents, and adding new terms or documents to an existing LSI database [5, 24]. The bulk of LSI processing time is spent in computing the truncated SVD of the large sparse term-by-document matrices. Section 2 is a review of basic concepts needed to understand LSI. Section 3 uses a constructive example to illustrate how LSI represents terms and documents in the same ....
....Four terms are defined below to avoid confusion when discussing updating. Updating refers to the general process of adding new terms and/or documents to an existing LSI-generated database. Updating can mean either folding-in or SVD-updating. SVD-updating is the new method of updating developed in [24]. Folding-in terms or documents is a much simpler alternative that uses an existing SVD to represent new information. Recomputing the SVD is not an updating method, but a way of creating an LSI-generated database with new terms and/or documents from scratch which can be compared to either updating ....
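The distinction drawn in the excerpt above between folding-in and recomputing can be illustrated with a minimal sketch. It assumes NumPy, a small made-up term-by-document matrix, and the projection d_hat = d^T U_k S_k^{-1} commonly given for folding-in in the LSI literature. The function names are hypothetical, and the SVD-updating method of the cited thesis is not shown.

```python
import numpy as np

def truncated_svd(A, k):
    """Rank-k truncated SVD: A is approximated by U_k @ diag(s_k) @ Vt_k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in_document(d, U_k, s_k):
    """Project a new document (a term-count vector d) into the existing k-dim space."""
    return (d @ U_k) / s_k

# Hypothetical 5-term x 4-document count matrix (illustrative data only).
A = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
])
k = 2
U_k, s_k, Vt_k = truncated_svd(A, k)

# Folding-in: reuse the existing U_k and s_k; no new SVD is computed.
d_new = np.array([0.0, 1.0, 1.0, 1.0, 0.0])
d_hat = fold_in_document(d_new, U_k, s_k)

# Recomputing: append the document as a new column and redo the SVD from scratch.
A_new = np.column_stack([A, d_new])
U_r, s_r, Vt_r = truncated_svd(A_new, k)
```

Folding-in leaves `U_k` and `s_k` untouched, so the new document does not influence the term representation. Recomputing lets it do so, at the cost of a full decomposition.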
[Article contains additional citation context not shown here]
G. W. O'BRIEN, Information Management Tools for Updating an SVD-Encoded Indexing Scheme, Master's thesis, The University of Tennessee, Knoxville, TN, 1994.
....the first one is mainly for handling sparse matrix cases, we choose the second algorithm assuming we will be dealing with more general cases. Performing an SVD computation is expensive, which makes a quick SVD update method that avoids recomputing the whole decomposition attractive. Several methods exist [O'B94, BDO95, ZS97] for doing SVD updates to approximate the effect of an SVD recomputation, but none of them gives a solid, interactive metric for making run-time decisions regarding the loss of accuracy due to these approximations. We proposed a simple metric for this and the detail will be ....
....we have seen in the previous sections, SVD is still considered an expensive computation. An SVD update method that requires no recomputation of the SVD when encountering new data (row data or column data) will be favored. So far, at least three such methods have been proposed [DDL 90, O'B94, ZS97]. In this project, we choose the simplest one [DDL 90] of the three for its low complexity and the possibility of interactive accuracy estimation, while the other two [O'B94, ZS97] claim to be more accurate but do not readily provide a run-time loss-of-accuracy estimate. The method we ....
[Article contains additional citation context not shown here]
G.W. O'Brien. Information management tools for updating an SVD-encoded indexing scheme. 1994. Master's thesis.
....if the word usage in the new documents is different from that in the documents that already are in the training set. In this case, the new word usage data may potentially be lost or misrepresented. A third method, SVD updating, which deals with this problem, has recently been developed [20]. However, SVD updating requires slightly more time and memory than the folding-in approach, meaning that neither approach appears to be uniformly superior to the other. ....
G. W. O'Brien, Information Management Tools for Updating an SVD-Encoded Indexing Scheme, Master's thesis, The University of Tennessee, Knoxville, 1994.
....into the reduced feature spaces. It may be pointed out that since the training set was selected so that it is representative of the documents available on the web, and since the size of the training set is considerable, it may not be necessary to retrain the system. Standard techniques have been reported in [2] for updating SVD-based indexing schemes. After the training step, we indexed a set of 10,000 images disjoint from the training set and stored them in the database. A subset of 100 images in the database was selected (randomly) to be retrieved. Two subjects were asked to find each of the images ....
G. O'Brien. Information Management Tools for Updating an SVD-Encoded Indexing Scheme. TR UT-CS-94-259, U. Tenn., 1994.
....sparse matrices, but this is only a one-time expense. Computational comparisons with the SVD are given in Section 8.4. In many information retrieval settings, the document database is constantly being updated. Much work has been done on updating the SVD approximation to the term-document matrix [4, 49], but it can be as expensive as computing the original SVD. Efficient algorithms for updating the SDD are given in Section 8.5. 8.2 LSI via the SVD. LSI is an improvement on the vector space model. In LSI, we can use a matrix approximation to the term-document matrix generated by the SVD. The SVD ....
....the document collection to be dynamic: new documents are added, and old documents are removed. Thus, the list of terms might also change. In this section, we will focus on the problem of modifying an SDD decomposition when the document collection changes. SVD updating has been studied by O'Brien [49]. He reports that updating the SVD takes almost as much time as recomputing it, but that it requires less memory. His methods are similar to what we do in Method 1 in the next section. 8.5.1 Adding or Deleting Documents. Suppose that we have an SDD approximation for a document collection and ....
Gavin O'Brien. Information management tools for updating an SVD-encoded indexing scheme. Master's thesis, University of Tennessee, Knoxville, TN 37996-1301, 1994.
.... to perform operations such as parsing document texts, creating a term-by-document matrix, computing the truncated SVD of this matrix, creating the LSI database of singular values and vectors for retrieval, matching user queries to documents, and adding new terms or documents to a database [9,19]. However, the bulk of LSI processing time can be spent in computing the truncated SVD of the large sparse term-by-document matrices, especially when several new terms or documents are to be added to the database. The SVD is the most common example of a two-sided (or complete) orthogonal ....
.... used in the solution of unconstrained linear least squares problems, matrix rank estimation, and canonical correlation analysis [2]. Although the SVD provides very accurate subspace information, it is computationally demanding and difficult to update for either dense [5] or sparse matrices [1,19]. This can be a drawback for recursive procedures which require simple matrix updates (e.g., appending or deleting a row or column). Alternatively, rank-revealing QR (RRQR) algorithms such as those by Foster [15], Chan [6], and modifications [4] can be used to obtain subspace information from ....
[Article contains additional citation context not shown here]
G. W. O'Brien. Information Management Tools for Updating an SVD-Encoded Indexing Scheme. Master's thesis, The University of Tennessee, Knoxville, TN, 1994.
....document collection each time it changes, updating (adding documents to the collection) and downdating (removing documents from the collection) would require less time and memory, especially for rapidly changing collections. A prototype of an LSI updating algorithm was described in [20], although it was never fully integrated into the LSI system. Downdating the SVD is an interesting problem that has not yet been attempted with LSI. With efficient updating and downdating facilities, LSI will become useful for large, rapidly changing document collections like the World Wide Web ....
G. O'Brien. Information management tools for updating an SVD-encoded indexing scheme. Master's thesis, University of Tennessee, Knoxville, Tennessee, December 1994.
.... operations such as parsing document texts, creating a term-by-document matrix, computing the truncated SVD of this matrix, creating the LSI database of singular values and vectors for retrieval, matching user queries to documents, and adding new terms or documents to an existing LSI database [4, 23]. The bulk of LSI processing time is spent in computing the truncated SVD of the large sparse term-by-document matrices. Section 2 is a review of basic concepts needed to understand LSI. Section 3 uses a constructive example to illustrate how LSI represents terms and documents in the same semantic ....
....Four terms are defined below to avoid confusion when discussing updating. Updating refers to the general process of adding new terms and/or documents to an existing LSI-generated database. Updating can mean either folding-in or SVD-updating. SVD-updating is the new method of updating developed in [23]. Folding-in terms or documents is a much simpler alternative that uses an existing SVD to represent new information. Recomputing the SVD is not an updating method, but a way of creating an LSI-generated database with new terms and/or documents from scratch which can be compared to either updating ....
[Article contains additional citation context not shown here]
G. W. O'Brien, Information Management Tools for Updating an SVD-Encoded Indexing Scheme, Master's thesis, The University of Tennessee, Knoxville, TN, 1994.
No context found.
O'Brien, G. W. (1994). Information management tools for updating an SVD-encoded indexing scheme. Technical Report UT-CS-94-259, University of Tennessee.