@MISC{_effectivenessand, author = {}, title = {EFFECTIVENESS AND ROBUSTNESS OF HETEROGENEOUS WEB DOCUMENTS}, year = {} }
Abstract
The Internet [1][2] now plays a major role in daily life, for example in e-commerce and e-seva services, and its usage is increasing rapidly. The World Wide Web (WWW) is widely used to publish and access information on the Internet. The web pages of many websites are populated automatically by filling common templates with content. This creates a bottleneck on the end user's system and reduces both system performance and the speed of operation. To reduce this bottleneck, a novel approach is presented for heterogeneous web documents. The documents and their paths are represented as a matrix, and the minimum description length (MDL) principle is used to manage the unknown number of clusters. The MinHash technique [6] is used to speed up the clustering process. The approach reduces cost and maintenance effort.
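The abstract does not give the details of its MinHash step, so the following is only a minimal Python sketch of the general technique, not the authors' implementation. It assumes each document is reduced to a set of path strings (the example paths, the `minhash_signature` and `estimate_jaccard` helpers, and the choice of 256 hash functions are all illustrative assumptions). The idea is that two short signatures can be compared instead of two full path sets, which is what makes pairwise similarity checks during clustering cheaper.

```python
import hashlib


def _hash(item: str, seed: int) -> int:
    """Deterministic 64-bit hash of an item, salted with a seed."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=256):
    """Return the MinHash signature of a non-empty set of strings.

    For each of num_hashes seeded hash functions, keep the minimum
    hash value observed over all items in the set.
    """
    items = set(items)
    if not items:
        raise ValueError("cannot build a signature for an empty set")
    return [min(_hash(x, seed) for x in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching positions."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity, used here only to check the estimate."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical path sets for two pages built from the same template
# and one unrelated page (illustrative data, not from the paper).
doc_a = {"html/body/div/h1", "html/body/div/p", "html/body/ul/li", "html/head/title"}
doc_b = {"html/body/div/h1", "html/body/div/p", "html/body/ul/li", "html/body/table/tr"}
doc_c = {"rss/channel/item/title", "rss/channel/item/link"}

sig_a = minhash_signature(doc_a)
sig_b = minhash_signature(doc_b)
sig_c = minhash_signature(doc_c)

print("a vs b:", estimate_jaccard(sig_a, sig_b), "exact:", exact_jaccard(doc_a, doc_b))
print("a vs c:", estimate_jaccard(sig_a, sig_c), "exact:", exact_jaccard(doc_a, doc_c))
```

The estimate approaches the exact Jaccard similarity as the number of hash functions grows; a smaller signature trades accuracy for speed.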