Information Retrieval on the World Wide Web and Active Logic: A Survey and Problem Definition

Barfourosh, A. Abdollahzadeh; Nezhad, H. R. Motahary; Anderson, M. L.; Perlis, D.

dc.contributor.author	Barfourosh, A. Abdollahzadeh	en_US
dc.contributor.author	Nezhad, H. R. Motahary	en_US
dc.contributor.author	Anderson, M. L.	en_US
dc.contributor.author	Perlis, D.	en_US
dc.date.accessioned	2004-05-31T23:13:04Z
dc.date.available	2004-05-31T23:13:04Z
dc.date.created	2002-08	en_US
dc.date.issued	2002-08-30	en_US
dc.description.abstract	As more information becomes available on the World Wide Web (there are currently over 4 billion pages covering most areas of human endeavor), it becomes more difficult to provide effective search tools for information access. Today, people access web information through two main kinds of search interfaces: Browsers (clicking and following hyperlinks) and Query Engines (queries in the form of a set of keywords showing the topic of interest). The first process is tentative and time-consuming, and the second may not satisfy the user because it returns many inaccurate and irrelevant results. Web search tools need better support for expressing one's information need and for returning high-quality search results. There appears to be a need for systems that reason under uncertainty and are flexible enough to recover from the contradictions, inconsistencies, and irregularities that such reasoning involves. Active Logic is a formalism that has been developed with real-world applications and their challenges in mind. Motivating its design is the thought that one of the factors that supports the flexibility of human reasoning is that it takes place step-wise, in time. Active Logic is one of a family of inference engines (step-logics) that explicitly reason in time and incorporate a history of their reasoning as they run. This characteristic makes Active Logic systems more flexible than traditional AI systems and therefore more suitable for commonsense, real-world reasoning. In this report we mainly survey recent advances in machine learning and crawling problems related to the web. We review the continuum of supervised to semi-supervised to unsupervised learning problems, highlight the specific challenges that distinguish information retrieval in the hypertext domain, and summarize the key areas of recent and ongoing research.
We concentrate on topic-specific search engines and focused crawling, and finally propose an Information Integration Environment based on the Active Logic framework. Keywords: Web Information Retrieval, Web Crawling, Focused Crawling, Machine Learning, Active Logic (Also UMIACS-TR-2001-69)	en_US
dc.format.extent	312585 bytes
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/1903/1153
dc.language.iso	en_US
dc.relation.isAvailableAt	Digital Repository at the University of Maryland	en_US
dc.relation.isAvailableAt	University of Maryland (College Park, Md.)	en_US
dc.relation.isAvailableAt	Tech Reports in Computer Science and Engineering	en_US
dc.relation.isAvailableAt	UMIACS Technical Reports	en_US
dc.relation.ispartofseries	UM Computer Science Department; CS-TR-4291	en_US
dc.relation.ispartofseries	UMIACS; UMIACS-TR-2001-69	en_US
dc.title	Information Retrieval on the World Wide Web and Active Logic: A Survey and Problem Definition	en_US
dc.type	Technical Report	en_US

Files

Original bundle

Name: CS-TR-4291.pdf
Size: 305.26 KB
Format: Adobe Portable Document Format

Collections

Technical Reports from UMIACS
Technical Reports of the Computer Science Department