Description
We present QUASAR, a system for question answering over unstructured text, structured tables, and knowledge graphs that treats all sources in a unified way. The system adopts a RAG-based architecture: a pipeline of evidence retrieval followed by answer generation, the latter powered by a moderate-sized language model. Additionally and uniquely, QUASAR has components for question understanding, which derive crisper input for evidence retrieval, and for re-ranking and filtering the retrieved evidence before the most informative pieces are fed into answer generation. Experiments on three different benchmarks demonstrate the high answering quality of our approach, which is on par with or better than large GPT models while keeping computational cost and energy consumption orders of magnitude lower.
The QUASAR system is a pipeline of four major stages, as illustrated in the Figure above. First, the input question is
analyzed and decomposed in order to compute a structured intent (SI) representation, which is passed on to the
subsequent steps along with the original question. Second, the SI is used to retrieve pieces of evidence from
the different sources: text, knowledge graphs (KG), and tables. Third, this pool of potentially useful evidence is
filtered down, with iterative re-ranking, to arrive at a tractably small set of the most promising evidence. The final
stage generates the answer from this evidence, passing back the answer as well as evidence snippets that serve as a
user-comprehensible explanation.
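The four stages can be pictured with a minimal Python sketch. This is an illustration of the control flow only, not the QUASAR implementation: every name here (`StructuredIntent`, `Evidence`, `understand`, `retrieve`, `rerank_and_filter`, `generate`, `answer_question`) is hypothetical, the SI is reduced to a keyword list, scoring is naive keyword overlap, and the answer generator is a placeholder where a real system would call a language model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StructuredIntent:
    # Hypothetical stand-in for the SI: the original question plus keywords.
    question: str
    keywords: list[str]


@dataclass
class Evidence:
    source: str  # "text", "kg", or "table"
    content: str
    score: float = 0.0


STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "who", "what", "which", "did"}


def understand(question: str) -> StructuredIntent:
    """Stage 1 (toy): keep the non-stopword tokens as the intent."""
    tokens = [t.strip("?.,").lower() for t in question.split()]
    return StructuredIntent(question, [t for t in tokens if t and t not in STOPWORDS])


def retrieve(si: StructuredIntent, corpus: dict[str, list[str]]) -> list[Evidence]:
    """Stage 2 (toy): collect items from every source that mention a keyword."""
    return [
        Evidence(source, item)
        for source, items in corpus.items()
        for item in items
        if any(k in item.lower() for k in si.keywords)
    ]


def rerank_and_filter(si: StructuredIntent, pool: list[Evidence],
                      rounds: tuple[int, ...] = (4, 2)) -> list[Evidence]:
    """Stage 3 (toy): re-score and shrink the pool once per round."""
    for keep in rounds:
        for ev in pool:
            ev.score = float(sum(k in ev.content.lower() for k in si.keywords))
        pool = sorted(pool, key=lambda e: e.score, reverse=True)[:keep]
    return pool


def generate(si: StructuredIntent, evidence: list[Evidence]) -> dict:
    """Stage 4 (placeholder): return the top evidence instead of calling a model."""
    answer = evidence[0].content if evidence else "unknown"
    return {"answer": answer, "evidence": [e.content for e in evidence]}


def answer_question(question: str, corpus: dict[str, list[str]]) -> dict:
    si = understand(question)
    pool = retrieve(si, corpus)
    top = rerank_and_filter(si, pool)
    return generate(si, top)
```

Calling `answer_question(question, corpus)` with a `corpus` dictionary keyed by `"text"`, `"kg"`, and `"table"` returns both the answer and the evidence snippets that survived filtering, mirroring the way the final stage passes back evidence for explanation.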
Further details are available in our IEEE Data Engineering Bulletin paper.
Code
GitHub link to QUASAR code (coming soon)
Directly download QUASAR code (coming soon)
Contact
For feedback and clarifications, please contact: