TY - BOOK
T1 - Functional Inferences Over Heterogeneous Data
AU - Nuamah, Kwabena
PY - 2018
Y1 - 2018
N2 - This thesis focuses on the rich inference framework (RIF), which uses a variety of inference operations to respond to queries for which no suitable answer is readily contained in any available data source. Inference enables an agent to create new knowledge from old or to discover implicit relationships between concepts in knowledge bases (KBs), provided that appropriate techniques are employed to deal with ambiguous, incomplete and sometimes erroneous data. The ever-increasing volume of KBs on the web presents an opportunity to improve the inference process in automated query answering systems. Most question answering and information retrieval systems assume that answers to queries are stored in some form in the KB, thereby limiting the range of answers they can find. We take an approach motivated by rich forms of inference, using techniques such as regression for prediction. For instance, RIF can answer “what country in Europe will have the largest population in 2021?” by decomposing Europe geo-spatially, using regression on country population for past years and selecting the country with the largest predicted value. Our technique, which we refer to as Rich Inference, combines heuristics, logic and statistical methods to infer novel answers to queries. It also determines what facts are needed for inference, searches for them, and then integrates the diverse facts and their formalisms into the local query-specific inference tree. Our primary contribution in this thesis is the inference algorithm on which RIF works. This includes (1) the process of recursively decomposing frames in a way that allows variables in the query to be instantiated by facts in KBs; (2) the use of aggregate functions to perform arithmetic and statistical operations (e.g. prediction); and (3) the estimation and propagation of uncertainty values into the returned answer, based on errors introduced by noise in the KBs or by aggregate functions. We also discuss the core concepts and modules that constitute RIF. We explain the internal "frame" representation of RIF and the grammar of a simple query language that allows users to express queries formally, so that we avoid the complexities of natural language queries, a problem that falls outside the scope of this thesis. We evaluate the framework with datasets from open sources.
AB - This thesis focuses on the rich inference framework (RIF), which uses a variety of inference operations to respond to queries for which no suitable answer is readily contained in any available data source. Inference enables an agent to create new knowledge from old or to discover implicit relationships between concepts in knowledge bases (KBs), provided that appropriate techniques are employed to deal with ambiguous, incomplete and sometimes erroneous data. The ever-increasing volume of KBs on the web presents an opportunity to improve the inference process in automated query answering systems. Most question answering and information retrieval systems assume that answers to queries are stored in some form in the KB, thereby limiting the range of answers they can find. We take an approach motivated by rich forms of inference, using techniques such as regression for prediction. For instance, RIF can answer “what country in Europe will have the largest population in 2021?” by decomposing Europe geo-spatially, using regression on country population for past years and selecting the country with the largest predicted value. Our technique, which we refer to as Rich Inference, combines heuristics, logic and statistical methods to infer novel answers to queries. It also determines what facts are needed for inference, searches for them, and then integrates the diverse facts and their formalisms into the local query-specific inference tree. Our primary contribution in this thesis is the inference algorithm on which RIF works. This includes (1) the process of recursively decomposing frames in a way that allows variables in the query to be instantiated by facts in KBs; (2) the use of aggregate functions to perform arithmetic and statistical operations (e.g. prediction); and (3) the estimation and propagation of uncertainty values into the returned answer, based on errors introduced by noise in the KBs or by aggregate functions. We also discuss the core concepts and modules that constitute RIF. We explain the internal "frame" representation of RIF and the grammar of a simple query language that allows users to express queries formally, so that we avoid the complexities of natural language queries, a problem that falls outside the scope of this thesis. We evaluate the framework with datasets from open sources.
KW - inference
KW - question-answering
KW - uncertainty
KW - Functional inferences over heterogeneous data
M3 - Doctoral Thesis
ER -