@MISC{Ullah_sem12at, author = {Md. Zia Ullah and Masaki Aono}, title = {SEM12 at the NTCIR-10 INTENT-2 English Subtopic Mining Subtask}, year = {} }
Abstract
Users express their information needs as search engine queries to find relevant documents on the Internet. However, search queries are usually short, ambiguous, and/or underspecified. Subtopic mining plays an important role in understanding a user's search intent and has attracted attention in recent years. In this paper, we describe our approach to identifying and then ranking a user's intents for a query (or topic) from query logs, which is the English Subtopic Mining subtask of the NTCIR-10 INTENT-2 task. We extract subtopics that are semantically and lexically related to the topic, and measure their weights based on the co-occurrence of a subtopic across search engine query logs and the edit distance between a topic and a subtopic. These weighted subtopic strings are then ranked as candidate subtopics (or intents). In the experiment section, we show the revised subtopic mining results of our method as evaluated by the organizers. The best perfor-