Title page for 974203008



Student Number 974203008
Author Tsung-han Yang(楊宗翰)
Author's Email Address Not public.
Statistics This thesis has been viewed 575 times and downloaded 291 times.
Department Information Management
Year 2009
Semester 2
Degree Master
Type of Document Master's Thesis
Language zh-TW.Big5 Chinese
Title A Query Expansion and Document Re-Ranking System Based on Single User Profile
Date of Defense 2010-06-21
Page Count 48
Keyword
  • Document Re-ranking
  • Information Filtering
  • Query Expansion
  • User Profile
Abstract With the rapid growth of information on the Internet, helping users filter out useless information and reducing their browsing burden has become an important issue. This thesis therefore proposes a query expansion and document re-ranking system based on a single user profile to achieve personalized information retrieval. A web crawler collects the Web pages the user has browsed in the past, and these pages are used to build a single user profile that covers multiple topics, which avoids the traditional need to build a separate profile for each topic the user is interested in. When the user submits a query, the system recommends personalized expansion words based on the user profile in order to filter out documents of no interest. Because only one profile has to be consulted to produce the expansion words, the proposed system is more efficient than the traditional process, which must first determine which topic profile the query belongs to before any expansion words can be produced. The system also re-ranks the search results according to the user profile, so that the user can quickly find useful and interesting personalized results and spend less effort browsing.
    The experimental results show that automatic query expansion based on a single user profile improves retrieval performance, and that retrieval performance improves significantly once the search results are re-ranked. The proposed system makes personalized query expansion more efficient, recommends expansion words that match the user's interests so that irrelevant Web documents are filtered out and the information actually needed is retrieved, and effectively reduces the user's browsing burden.
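    The pipeline described in the abstract (one profile, profile-based expansion words, profile-based re-ranking) can be illustrated with a minimal Python sketch. It is not the thesis's implementation and does not reproduce the Nootropia algorithm named in the table of contents: it assumes raw term counts as profile weights and page-level co-occurrence counts as term associations, and the function names `build_profile`, `expand_query`, and `rerank` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_profile(pages):
    """Build ONE profile from tokenized pages (assumed weighting: raw counts)."""
    weights = Counter()            # term -> weight
    links = defaultdict(Counter)   # term -> co-occurring term -> count
    for tokens in pages:
        weights.update(tokens)
        # Link every pair of distinct terms that share a page.
        for a, b in combinations(sorted(set(tokens)), 2):
            links[a][b] += 1
            links[b][a] += 1
    return weights, links


def expand_query(query_terms, profile, k=2):
    """Suggest up to k profile terms most strongly linked to the query terms."""
    weights, links = profile
    scores = Counter()
    for q in query_terms:
        for term, count in links.get(q, {}).items():
            if term not in query_terms:
                scores[term] += count * weights[term]
    # Sort by score (descending), then alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def rerank(results, profile):
    """Order tokenized search results by total profile weight of their terms."""
    weights, _ = profile

    def score(tokens):
        return sum(weights[t] for t in set(tokens))

    return sorted(results, key=score, reverse=True)


if __name__ == "__main__":
    browsed = [["jaguar", "car", "engine"],
               ["jaguar", "car", "speed"],
               ["python", "code"]]
    profile = build_profile(browsed)
    print(expand_query(["jaguar"], profile, k=1))
    print(rerank([["jaguar", "animal", "zoo"],
                  ["jaguar", "car", "engine"]], profile))
```

    Because all topics live in one structure, the expansion step looks up the query terms directly and never has to decide first which topic profile applies, which is the efficiency argument the abstract makes.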
    Table of Contents List of Figures (p. v)
    List of Tables (p. vi)
    Chapter 1 Introduction (p. 1)
    1.1 Research Motivation (p. 1)
    1.2 Research Objectives (p. 1)
    1.3 Research Limitations (p. 2)
    1.4 Thesis Organization (p. 2)
    Chapter 2 Literature Review (p. 3)
    2.1 Information Retrieval (p. 3)
    2.2 User Profiles (p. 3)
    2.2.1 Traditional Construction Methods (p. 3)
    2.2.2 The Nootropia Algorithm: Building a Single Multi-Topic User Profile (p. 4)
    2.3 Query Expansion (p. 7)
    2.4 Document Re-ranking (p. 9)
    2.4.1 Traditional Document Re-ranking Methods (p. 9)
    2.4.2 Personalized Document Re-ranking with the Nootropia Algorithm (p. 9)
    Chapter 3 System Analysis and Design (p. 12)
    3.1 System Architecture (p. 12)
    3.2 Document Preprocessing (p. 14)
    3.3 Term Extraction (p. 15)
    3.4 Profile Construction (p. 16)
    3.4.1 Computing Term Weights (p. 17)
    3.4.2 Computing Associations Between Terms (p. 18)
    3.4.3 Ranking Terms by Weight (p. 20)
    3.5 Expansion Term Recommendation (p. 20)
    3.6 Search Result Re-ranking (p. 21)
    Chapter 4 System Implementation and Evaluation (p. 23)
    4.1 Evaluation Criteria (p. 23)
    4.2 Experimental Design, Results, and Analysis (p. 24)
    4.2.1 Experimental Design (p. 24)
    4.2.2 Experimental Results (p. 27)
    4.2.3 Analysis of Experimental Results (p. 31)
    Chapter 5 Conclusions and Future Research Directions (p. 33)
    5.1 Conclusions and Contributions (p. 33)
    5.2 Future Research Directions (p. 34)
    References (p. 36)
    Reference [1] P. Maes, Agents that Reduce Work and Information Overload, Communications of the ACM, Vol. 37, 1994, pp. 30-40.
    [2] Z. Zhu, J. Xu, X. Ren, Y. Tian, and L. Li, Query Expansion Based on a Personalized Web Search Model, Proceedings of the Third International Conference on Semantics, Knowledge and Grid, IEEE Computer Society, 2007, pp. 128-133.
    [3] N. Nanas, V. Uren, and A. de Roeck, Building and Applying a Concept Hierarchy Representation of a User Profile, Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Toronto, Canada: ACM, 2003, pp. 198-204.
    [4] N. Nanas, V. Uren, A. de Roeck, and J. Domingue, Multi-topic Information Filtering with a Single User Profile, Methods and Applications of Artificial Intelligence, 2004, pp. 400-409.
    [5] Dae-Won Kim and K. Lee, A New Fuzzy Information Retrieval System Based on User Preference Model, 10th IEEE International Conference on Fuzzy Systems, pp. 127-130.
    [6] R.A. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Addison-Wesley Longman Publishing Co., Inc., 1999.
    [7] A. Pretschner and S. Gauch, Ontology Based Personalized Search, Proceedings of the 11th IEEE International Conference on Tools with Artificial Intelligence, IEEE Computer Society, 1999, p. 391.
    [8] N. Nanas, V.S. Uren, and A. de Roeck, Nootropia: A User Profiling Model Based on a Self-Organising Term Network, Artificial Immune Systems, 2004, pp. 146-160.
    [9] G. Salton and M.J. McGill, Introduction to Modern Information Retrieval, McGraw-Hill, Inc., 1986.
    [10] S.E. Robertson and K.S. Jones, Relevance Weighting of Search Terms, Document Retrieval Systems, Taylor Graham Publishing, 1988, pp. 143-160.
    [11] F. Sebastiani, Machine Learning in Automated Text Categorization, ACM Computing Surveys, Vol. 34, 2002, pp. 1-47.
    [12] G. Amati, D. D'Aloisi, V. Giannini, and F. Ubaldini, A Framework for Filtering News and Managing Distributed Data, Journal of Universal Computer Science, Vol. 3, 1997, pp. 1007-1021.
    [13] M. Pazzani, J. Muramatsu, and D. Billsus, Syskill & Webert: Identifying Interesting Web Sites, In Proceedings of the Thirteenth National Conference on Artificial Intelligence, 1996, pp. 54-61.
    [14] B. Krulwich and C. Burkey, The InfoFinder Agent: Learning User Interests through Heuristic Phrase Extraction, IEEE Intelligent Systems, Vol. 12, 1997, pp. 22-27.
    [15] L.B. Doyle, Semantic Road Maps for Literature Searchers, Journal of the ACM, Vol. 8, 1961, pp. 553-578.
    [16] H. Sorensen, A. O'Riordan, and C. O'Riordan, Profiling with the INFOrmer Text Filtering Agent, Journal of Universal Computer Science, Vol. 3, 1997, pp. 988-1006.
    [17] M. Sanderson and B. Croft, Deriving Concept Hierarchies from Text, Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Berkeley, California, United States: ACM, 1999, pp. 206-213.
    [18] A. Maedche and S. Staab, Ontology Learning for the Semantic Web, IEEE Intelligent Systems, Vol. 16, 2001, pp. 72-79.
    [19] R. Forsyth and R. Rada, Machine Learning: Expert Systems and Information Retrieval, Ellis Horwood, London, 1986.
    [20] B.J. Jansen, A. Spink, and T. Saracevic, Real Life, Real Users, and Real Needs: A Study and Analysis of User Queries on the Web, Information Processing & Management, Vol. 36, 2000, pp. 207-227.
    [21] A. Spink, D. Wolfram, M.B.J. Jansen, and T. Saracevic, Searching the Web: The Public and Their Queries, Journal of the American Society for Information Science and Technology, Vol. 52, 2001, pp. 226-234.
    [22] C. Buckley, Automatic Query Expansion Using SMART: TREC 3, In Proceedings of the Third Text Retrieval Conference, 1994, pp. 69-80.
    [23] H.J. Peat and P. Willett, The Limitations of Term Co-Occurrence Data for Query Expansion in Document Retrieval Systems, Journal of the American Society for Information Science, Vol. 42, 1991, pp. 378-383.
    [24] J. Xu and W.B. Croft, Query Expansion using Local and Global Document Analysis, Proceedings of the 19th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Zurich, Switzerland: ACM, 1996, pp. 4-11.
    [25] W. Woods, Conceptual Indexing: A Better Way to Organize Knowledge, Technical Report of Sun Microsystems, 1997.
    [26] P.-A. Chirita, C.S. Firan, and W. Nejdl, Personalized Query Expansion for the Web, Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Amsterdam, The Netherlands: ACM, 2007, pp. 7-14.
    [27] J. Xu and W.B. Croft, Improving the Effectiveness of Information Retrieval with Local Context Analysis, ACM Transactions on Information Systems, Vol. 18, 2000, pp. 79-112.
    [28] Q. Youli, X. Guowei, and W. Jun, Rerank Method Based on Individual Thesaurus, Proceedings of NTCIR2 Workshop, 2002.
    [29] S. Lovic, M. Lu, and D. Zhang, Enhancing Search Engine Performance using Expert Systems, 2006 IEEE International Conference on Information Reuse and Integration, Waikoloa Village, HI, USA: 2006, pp. 567-572.
    [30] C.D. Manning, P. Raghavan, and H. Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008.
    [31] CKIP Chinese Word Segmentation System (中文斷詞系統), http://ckipsvr.iis.sinica.edu.tw/
Advisor
  • Yih-chearng Shiue(薛義誠)
Files
  • 974203008.pdf
  • approve immediately
Date of Submission 2010-07-12
