Personalizable Knowledge Integration

Thumbnail Image
Publication or External Link
Martinez, Maria Vanina
Subrahmanian, Venkatramanan S
Large repositories of data are used daily as knowledge bases (KBs) feeding computer systems that support decision making processes, such as in medical or financial applications. Unfortunately, the larger a KB is, the harder it is to ensure its consistency and completeness. The problem of handling KBs of this kind has been studied in the AI and databases communities, but most approaches focus on computing answers locally to the KB, assuming there is some single, epistemically correct solution. It is important to recognize that for some applications, as part of the decision making process, users consider far more knowledge than that which is contained in the knowledge base, and that sometimes inconsistent data may help in directing reasoning; for instance, inconsistency in taxpayer records can serve as evidence of a possible fraud. Thus, the handling of this type of data needs to be context-sensitive, creating a synergy with the user in order to build useful, flexible data management systems. Inconsistent and incomplete information is ubiquitous and presents a substantial problem when trying to reason about the data: how can we derive an adequate model of the world, from the point of view of a given user, from a KB that may be inconsistent or incomplete? In this thesis we argue that in many cases users need to bring their application-specific knowledge to bear in order to inform the data management process. Therefore, we provide different approaches to handle, in a personalized fashion, some of the most common issues that arise in knowledge management. Specifically, we focus on (1) inconsistency management in relational databases, general knowledge bases, and a special kind of knowledge base designed for news reports; (2) management of incomplete information in the form of different types of null values; and (3) answering queries in the presence of uncertain schema matchings. We allow users to define policies to manage both inconsistent and incomplete information in their application in a way that takes both the user's knowledge of his problem, and his attitude to error/risk, into account. Using the frameworks and tools proposed here, users can specify when and how they want to manage/solve the issues that arise due to inconsistency and incompleteness in their data, in the way that best suits their needs.