Full metadata record
DC Element | Value | Language
dc.contributor.author | Saha Roy, Rishiraj | -
dc.contributor.author | Schlotthauer, Joel | -
dc.contributor.author | Hinze, Chris | -
dc.contributor.author | Foltyn, Andreas | -
dc.contributor.author | Hahn, Luzian | -
dc.contributor.author | Küch, Fabian | -
dc.date.accessioned | 2025-03-20T14:01:23Z | -
dc.date.available | 2025-03-20T14:01:23Z | -
dc.date.issued | 2025-03-10 | -
dc.identifier.uri | https://fordatis.fraunhofer.de/handle/fordatis/435 | -
dc.identifier.uri | http://dx.doi.org/10.24406/fordatis/390 | -
dc.description.abstract | Retrieval Augmented Generation (RAG) works as a backbone for interacting with an enterprise's own data via Conversational Question Answering (ConvQA). In a RAG system, a retriever fetches passages from a collection in response to a question, which are then included in the prompt of a large language model (LLM) for generating a natural language (NL) answer. However, several RAG systems today suffer from two shortcomings: (i) retrieved passages usually contain only their raw text and lack appropriate document context, negatively impacting both retrieval and answering quality; and (ii) attribution strategies that explain answer generation typically rely only on similarity between the answer and the retrieved passages, thereby generating only plausible but not causal explanations. In this work, we demonstrate RAGONITE, a RAG system that remedies the above concerns by: (i) contextualizing evidence with source metadata and surrounding text; and (ii) computing counterfactual attribution, a causal explanation approach where the contribution of a piece of evidence to an answer is determined by the similarity of the original response to the answer obtained by removing that evidence. To evaluate our proposals, we release a new benchmark, ConfQuestions: it has 300 hand-created conversational questions, each in English and German, coupled with ground truth URLs, completed questions, and answers from 215 public Confluence pages. These documents are typical of enterprise wiki spaces with heterogeneous elements. Experiments with RAGONITE on ConfQuestions show the viability of our ideas: contextualization improves RAG performance, and counterfactual explanations outperform standard attribution. | en
dc.language.iso | en | en
dc.rights.uri | https://creativecommons.org/licenses/by-nc-nd/4.0/ | en
dc.subject | Conversations | en
dc.subject | Question answering | en
dc.subject | Retrieval augmented generation | en
dc.subject.ddc | DDC::000 Informatik, Informationswissenschaft, allgemeine Werke | en
dc.title | Evidence Contextualization and Counterfactual Attribution for Conversational QA over Heterogeneous Data with RAG Systems | en
dc.type | Textual Data | en
dc.contributor.funder | Bundesministerium für Wirtschaft und Klimaschutz BMWK (Deutschland) | en
dc.relation.issupplementto | 10.1145/3701551.3704126 | -
fordatis.group | Innovationsforschung | en
fordatis.institute | IIS Fraunhofer-Institut für Integrierte Schaltungen | en
fordatis.rawdata | false | en
fordatis.sponsorship.projectid | 68GX21007D | en
fordatis.sponsorship.projectname | OpenGPT-X | en
fordatis.date.start | 2024-07-01 | -
fordatis.date.end | 2024-10-01 | -
Appears in collections: Fraunhofer-Institut für Integrierte Schaltungen IIS

Files in this item:
File | Description | Size | Format
confquestions.zip | | 679.07 kB | ZIP


This item is published under a Creative Commons license (CC BY-NC-ND 4.0).
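The counterfactual attribution approach described in the abstract, where a piece of evidence's contribution is measured by how much the answer changes when that evidence is removed, can be sketched as a leave-one-out loop. This is a minimal illustration, not the RAGONITE implementation: `generate` stands in for an LLM call on a prompt with the given evidences, and word-set Jaccard overlap stands in for whatever answer-similarity measure the system actually uses.

```python
def similarity(a: str, b: str) -> float:
    """Toy answer similarity: Jaccard overlap of word sets.
    A real system would use a learned similarity (e.g. embeddings)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def counterfactual_attribution(question, evidences, generate):
    """Score each evidence by the answer drift caused by removing it.

    `generate(question, evidences)` is a stand-in for the LLM answering
    step of a RAG pipeline (an assumption, not an actual API).
    Returns the original answer and one score per evidence, where a
    higher score means the answer changed more without that evidence,
    i.e. the evidence contributed more (causally) to the answer.
    """
    original = generate(question, evidences)
    scores = []
    for i in range(len(evidences)):
        reduced = evidences[:i] + evidences[i + 1:]
        counterfactual = generate(question, reduced)
        # Low similarity to the original answer => high contribution.
        scores.append(1.0 - similarity(original, counterfactual))
    return original, scores
```

For example, with a trivial generator that just concatenates its evidences, removing an evidence whose words appear nowhere else yields a visibly lower similarity and hence a higher attribution score for that evidence.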