Title page for 93423039



Student Number 93423039
Author Liang-Lu Shih
Author's Email Address Not public.
Statistics This thesis has been viewed 1357 times and downloaded 803 times.
Department Information Management
Year 2005
Semester 2
Degree Master
Type of Document Master's Thesis
Language English
Title Applying Ontology to Relevant Document Discovery
Date of Defense 2006-06-28
Page Count 56
Keyword
  • Ontology
  • Ontology Extraction
  • Relevant Document Discovery
Abstract
    Research on relevant document discovery is practical and attractive to many
    researchers, and several solutions to the problem exist. Some solutions have been
    adopted in real-world environments, such as publishers of electronic articles. These
    publishers offer search options such as keyword, full-text, phrase, and Boolean
    expression search for users to retrieve documents. Most relevant document discovery
    techniques originate in the field of information retrieval. The core concept of the
    Semantic Web is ontology, which has been applied in various domains, such as web
    services, agent communication, and knowledge management. However, few papers have
    applied ontology to relevant document discovery. Therefore, in this paper, ontology
    is applied to the problem of relevant document discovery, and a prototype system is
    built to implement the proposed method. Given a user-selected input document, the
    prototype system generates a number of closely related documents from those already
    stored in the repository. The process of the prototype system can be divided into
    the following steps: (1) transforming the input text document into OWL format; (2)
    determining whether the input document already exists in the ontology repository of
    the system; (3) if the input document does not exist in the ontology repository,
    calculating the similarity between the input ontology and the documents already
    stored in the ontology repository, and retrieving the related documents with higher
    similarity values. Ontology extraction and similarity calculation are the core
    components that apply the concept of ontology in the prototype system. The objective
    of ontology extraction is to transform TXT-format documents into OWL format according
    to the characteristics of ontology. Similarity calculation is composed of two
    methods, concept similarity and instance similarity, both of which are proposed and
    implemented in the prototype system.
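    The retrieval step described in the abstract can be sketched roughly as follows. This
    is only an illustrative sketch, not the thesis's implementation: the DocOntology
    class, the use of Jaccard overlap for both concept and instance similarity, the equal
    default weight, and the top_k cutoff are all assumptions made for the example. The
    abstract does not give the actual formulas or weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocOntology:
    """Hypothetical stand-in for the ontology extracted from one document."""
    doc_id: str
    concepts: frozenset
    instances: frozenset


def jaccard(a, b):
    """Set overlap in [0, 1]; an assumed measure, not the thesis's formula."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity(x, y, w_concept=0.5):
    """Weighted mix of concept and instance similarity (weight is assumed)."""
    concept_sim = jaccard(x.concepts, y.concepts)
    instance_sim = jaccard(x.instances, y.instances)
    return w_concept * concept_sim + (1.0 - w_concept) * instance_sim


def find_relevant(query, repository, top_k=3, w_concept=0.5):
    """Rank stored documents by similarity to the query, skipping the query itself."""
    scored = [(similarity(query, doc, w_concept), doc.doc_id)
              for doc in repository if doc.doc_id != query.doc_id]
    # Highest score first; ties are broken by document id for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_k]


if __name__ == "__main__":
    repo = [
        DocOntology("d1", frozenset({"Ontology", "Similarity"}),
                    frozenset({"OWL", "WordNet"})),
        DocOntology("d2", frozenset({"Retrieval"}), frozenset({"TF-IDF"})),
    ]
    query = DocOntology("q", frozenset({"Ontology"}), frozenset({"OWL"}))
    print(find_relevant(query, repo))
```

    Keeping the two measures separate and combining them with a single weight makes it
    easy to run concept-only (w_concept=1.0) or instance-only (w_concept=0.0) rankings.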
Table of Contents
    1. Introduction ..... 1
    1.1 Research Background ..... 1
    1.2 Research Motivation ..... 2
    1.3 Purpose ..... 3
    2. Literature Review ..... 5
    2.1 OWL Ontology ..... 5
    2.2 Ontology Extraction ..... 7
    2.3 Similarity Calculation ..... 9
    3. Method of Relevant Document Discovery ..... 12
    3.1 System Architecture ..... 12
    3.2 Ontology Extraction ..... 14
    3.2.1 Preprocess ..... 15
    3.2.2 Find the Associated Content of Schema ..... 16
    3.2.3 Extract Instances from Content ..... 17
    3.2.4 Constructing Ontology ..... 20
    3.3 Similarity Calculation ..... 21
    3.3.1 Definition of Similarity Calculation ..... 21
    3.3.2 Similarity Method 1: Concept Similarity ..... 22
    3.3.3 Similarity Method 2: Instance Similarity ..... 24
    3.3.4 Operational Definition of Instance Similarity ..... 26
    3.3.5 Weights of Similarity Measures ..... 27
    4. Implementation and Evaluation ..... 29
    4.1 Implementation Tools and Environment ..... 29
    4.2 Evaluation of Ontology Extraction ..... 29
    4.2.1 Implement Sentences as Instances ..... 30
    4.2.2 Implement Multi-words as Instances ..... 32
    4.3 Evaluation of the Prototype System ..... 34
    4.3.1 Evaluation Method ..... 34
    4.3.2 Experiment 1: Only Concept Similarity ..... 36
    4.3.3 Experiment 2: Only Instance Similarity ..... 36
    4.3.4 Experiment 3: Concept and Instance Similarity ..... 38
    5. Conclusion and Future Direction ..... 41
    5.1 Conclusion ..... 41
    5.2 Contribution ..... 41
    5.3 Limitation ..... 41
    5.4 Future Direction ..... 42
    References ..... 44
    Advisor
  • Eric Y. Cheng
Files
  • 93423039.pdf
  • approve immediately
    Date of Submission 2006-07-12
