dc.contributor.advisor: Mitra, Suman K.
dc.contributor.author: Khalpada, Vaidehi S.
dc.date.accessioned: 2019-03-19T09:30:53Z
dc.date.available: 2019-03-19T09:30:53Z
dc.date.issued: 2018
dc.identifier.citation: Khalpada, Vaidehi S. (2018). Document Representation using Extended Locality Preserving Indexing. Dhirubhai Ambani Institute of Information and Communication Technology, vii, 36 p. (Acc. No: T00711)
dc.identifier.uri: http://drsr.daiict.ac.in//handle/123456789/745
dc.description.abstract: The main purpose of web search is to obtain information relevant to a user's need from the documents available on the Internet. Each term (word) in a document contributes a dimension, and it is challenging to process such high-dimensional data. Not all terms convey important meaning; some terms are related to each other and some are synonyms. This redundancy in the document collection increases the dimensionality of the document space. Processing this high-dimensional document collection to obtain useful information requires a lot of storage space and computation time. Dimensionality reduction plays an important role here: it reduces the data dimension so that computation is faster and less storage is required. The documents are represented as vectors in a high-dimensional space. Our main aim is to obtain a representation of the documents in the reduced subspace such that the relations among the documents in the subspace do not change from those in the original vector space. Accordingly, the accuracy of the document similarity measure obtained in the subspace is evaluated. Document representation in terms of a term-document matrix is an important step in document indexing. Document indexing is the process of obtaining an index that helps in retrieving relevant documents effectively, analogous to the index of a book. Latent Semantic Indexing (LSI) is a global structure preserving approach, while Locality Preserving Indexing (LPI) is a local structure preserving approach. LPI assigns weights to the neighbours to obtain the reduced representation while preserving local structure. However, it does not retain any information about non-neighbours. A new approach, Extended Locality Preserving Indexing (ELPI), is proposed, which preserves the topology of the document space by modifying the weighting scheme. Experiments for evaluating document similarity and for classification show a small but encouraging improvement using ELPI as compared to LPI.
dc.publisher: Dhirubhai Ambani Institute of Information and Communication Technology
dc.subject: Computer applications
dc.subject: Latent semantic indexing
dc.subject: Locality preserving indexing
dc.subject: k-nearest neighbour classifier
dc.subject: Information retrieval
dc.subject: Information storage
dc.subject: Automatic indexing
dc.classification.ddc: 025.30285 KHA
dc.title: Document representation using extended locality preserving indexing
dc.type: Dissertation
dc.degree: M. Tech
dc.student.id: 201611020
dc.accession.number: T00711
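
The pipeline outlined in the abstract (term-document matrix, a reduced representation, and a check of how well document similarity survives the reduction) can be illustrated with a minimal sketch. This is not the dissertation's implementation: the toy documents, the function names, the use of raw term counts, and the cosine-weighted k-nearest-neighbour graph are all assumptions made for illustration. The reduction shown is LSI via a truncated SVD; the LPI projection and the ELPI weighting scheme are not described in enough detail in the abstract to reproduce here, so only a neighbour-weight matrix of the kind LPI starts from is sketched.

```python
import numpy as np

def term_document_matrix(docs):
    """Build a raw-count term-document matrix (rows: terms, columns: documents)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            X[index[w], j] += 1.0
    return X, vocab

def cosine_similarity_matrix(V):
    """Pairwise cosine similarity between the columns (documents) of V."""
    norms = np.linalg.norm(V, axis=0, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty documents
    U = V / norms
    return U.T @ U

def lsi_embed(X, k):
    """LSI: keep the top-k singular directions; returns a k x n_docs matrix."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    return np.diag(s[:k]) @ Vt[:k, :]

def knn_weight_matrix(S, k):
    """Assumed neighbour weighting: cosine weight for each document's k nearest
    neighbours, zero for non-neighbours, symmetrised afterwards."""
    n = S.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(-S[i])                       # most similar first
        neighbours = [j for j in order if j != i][:k]   # exclude the document itself
        W[i, neighbours] = S[i, neighbours]
    return np.maximum(W, W.T)

# Hypothetical toy collection, for illustration only.
docs = [
    "locality preserving indexing of documents",
    "latent semantic indexing of documents",
    "nearest neighbour classifier",
    "neighbour graph weights",
]

X, vocab = term_document_matrix(docs)
S_full = cosine_similarity_matrix(X)      # similarities in the original space
Z = lsi_embed(X, k=2)                     # reduced 2-dimensional representation
S_reduced = cosine_similarity_matrix(Z)   # similarities in the subspace
W = knn_weight_matrix(S_full, k=1)        # non-neighbours receive weight zero

# One simple way to quantify how much the similarity structure changed.
max_change = np.abs(S_full - S_reduced).max()
```

The zeros in `W` correspond to the point the abstract raises about LPI: pairs of documents that are not neighbours contribute nothing to the weight matrix, which is the information the abstract says ELPI aims to retain by modifying the weighting scheme.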

