D. Embley and L. Xu. Record location and reconfiguration in unstructured multiple-record Web documents. In ACM International Workshop on the Web and Databases (WebDB), 2000.
No context found.
D.W. Embley and L. Xu. Record location and reconfiguration in unstructured multiple-record Web documents. In Proceedings of the Third International Workshop on the Web and Databases (WebDB2000), pages 123--128, Dallas, Texas, May 2000.
....We also retrieve any useful linked information to build a complete record. We also remove any extraneous advertisement to obtain a list of cleaned records. We do some of these processes automatically and others manually; automating this process is not the focus of this project. See [17] and [19] for descriptions of how to automate this document preprocessing work. (http://www.cia.gov/cia/publications/factbook/index.html) Figure 4.1 shows the sequence of the preprocessing stages for the training documents. First we clean the HTML body by removing all irrelevant parts like ....
D.W. Embley and L. Xu. Record location and reconfiguration in unstructured multiple-record Web documents. In Proceedings of the Third International Workshop on the Web and Databases (WebDB2000.
No context found.