Download Training Set (4,966 questions)
Download Dev Set (1,680 questions)
Download Test Set (2,764 questions)

The CompMix benchmark is licensed under a Creative Commons Attribution 4.0 International License.
CompMix collates the completed versions of the conversational questions in ConvMix, which were provided directly by crowdworkers on Amazon Mechanical Turk (AMT). Questions in CompMix exhibit complex phenomena such as multiple entities, relations, temporal conditions, comparisons, and aggregations. The benchmark is aimed at evaluating QA methods that operate over a mixture of heterogeneous input sources (KB, text, tables, infoboxes). The dataset has 9,410 questions, split into train (4,966 questions), dev (1,680), and test (2,764) sets. All answers provided in the CompMix dataset are grounded to the KB (except for dates, which are normalized, and other literals such as names).
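As a quick sanity check on the split sizes stated above, the following minimal sketch adds up the three sets and computes each split's share of the benchmark. The dictionary name `splits` is only illustrative; the counts are the ones given in the text.

```python
# Split sizes as stated in the dataset description.
splits = {"train": 4966, "dev": 1680, "test": 2764}

# The three splits should add up to the stated total of 9,410 questions.
total = sum(splits.values())

# Share of each split, as a percentage rounded to one decimal place.
shares = {name: round(100 * count / total, 1) for name, count in splits.items()}

print(total)
print(shares)
```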

Further details will be provided in a dedicated write-up soon.

How was CompMix created?

CompMix collates the completed versions of the conversational questions in ConvMix, which were provided directly by the crowdworkers.

The ConvMix benchmark, on which CompMix is based, was created by real humans. We tried to ensure that the collected data is as natural as possible. Master crowdworkers on Amazon Mechanical Turk (AMT) selected an entity of interest in a specific domain, and then started issuing conversational questions on this entity, potentially drifting to other topics of interest over the course of the conversation. By letting users choose the entities themselves, we aimed to ensure that they were interested in the topics the conversations are based on. After writing a question, users were asked to find the answer in either Wikidata, Wikipedia text, a Wikipedia table, or a Wikipedia infobox, whichever they found most natural for the specific question at hand. Since Wikidata requires some basic understanding of knowledge bases, we provided video guidelines that illustrated how Wikidata can be used for finding answers, following an example conversation. For each conversational question, which might be incomplete, the crowdworker provided a completed question that is intent-explicit and can be answered without the conversational context. These questions constitute the CompMix dataset. We also provide the answer source in which the user found the answer, as well as the question entities.

How do questions in CompMix start?

What do answers in CompMix look like?


For feedback and clarifications, please contact: Philipp Christmann (pchristm AT mpi HYPHEN inf DOT mpg DOT de), Rishiraj Saha Roy (rishiraj AT mpi HYPHEN inf DOT mpg DOT de) or Gerhard Weikum (weikum AT mpi HYPHEN inf DOT mpg DOT de).

"CompMix: A Benchmark for Heterogeneous Question Answering",
Philipp Christmann, Rishiraj Saha Roy, and Gerhard Weikum.
WWW '24 (Resource Track).

CompMix Leaderboard

Model                                         P@1     MRR     Hit@5
Lehmann et al. '24                            0.655   -       -
Zhang et al. '24                              0.565   -       -
GPT-3 (text-davinci-003), Brown et al. '20    0.502   -       -
Christmann et al. '23                         0.442   0.518   0.617
Oguz et al. '22                               0.440   0.467   0.494
Christmann et al. '22                         0.407   0.437   0.483

* Result computed on a random sample of 200 questions.
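The three leaderboard metrics can be illustrated with a minimal sketch, assuming each system returns a ranked list of answer strings per question and that a prediction counts as correct on exact match with a gold answer. The function names and the toy predictions below are made up for illustration; this is not the benchmark's official evaluation script, which may normalize answers differently.

```python
from typing import Dict, List, Sequence, Set


def precision_at_1(ranked: Sequence[str], gold: Set[str]) -> float:
    """Return 1.0 if the top-ranked answer is a gold answer, else 0.0."""
    return 1.0 if ranked and ranked[0] in gold else 0.0


def reciprocal_rank(ranked: Sequence[str], gold: Set[str]) -> float:
    """Return 1/rank of the first gold answer, or 0.0 if none is found."""
    for rank, answer in enumerate(ranked, start=1):
        if answer in gold:
            return 1.0 / rank
    return 0.0


def hit_at_k(ranked: Sequence[str], gold: Set[str], k: int = 5) -> float:
    """Return 1.0 if any gold answer appears in the top k, else 0.0."""
    return 1.0 if any(answer in gold for answer in ranked[:k]) else 0.0


def evaluate(predictions: List[List[str]], gold: List[Set[str]]) -> Dict[str, float]:
    """Average the per-question metrics over all questions."""
    if len(predictions) != len(gold) or not gold:
        raise ValueError("predictions and gold must be non-empty and aligned")
    n = len(gold)
    pairs = list(zip(predictions, gold))
    return {
        "P@1": sum(precision_at_1(p, g) for p, g in pairs) / n,
        "MRR": sum(reciprocal_rank(p, g) for p, g in pairs) / n,
        "Hit@5": sum(hit_at_k(p, g, k=5) for p, g in pairs) / n,
    }


# Toy ranked predictions (invented for illustration only).
toy_predictions = [
    ["Hamlet", "Gone with the Wind"],   # gold answer at rank 1
    ["Truman Capote", "Harper Lee"],    # gold answer at rank 2
    ["1980", "1981"],                   # gold answer missing
]
toy_gold = [{"Hamlet"}, {"Harper Lee"}, {"1979"}]

scores = evaluate(toy_predictions, toy_gold)
print(scores)
```

On this toy input, one of three questions is answered correctly at rank 1, one at rank 2, and one not at all, so MRR is the mean of 1, 1/2, and 0.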

What do questions in CompMix look like?

The sources in square brackets are the ones the respective answer can be found in.

Which movie is longer, Hamlet or Gone with the Wind?
    Hamlet [KB, Infobox]

Which soccer player scored the most number of goals in the UEFA Euro 2004 tournament?
    Milan Baroš [KB, Text, Infobox, Table]

How many matches has João Félix played for Portugal in 2019?
    5 [Table]

Where did the Uruguay national football team play their first recorded match?
    Paso del Molino [Text]

Who was the kit manufacturer of Chelsea Football Club from 1981 to 1983?
    Le Coq sportif [Text, Table]

Multiple complexities
Which player was awarded the most number of Man of the match titles in the FIFA world cup of 2006?
    Andrea Pirlo [KB, Text]

Author of the book To Kill a Mockingbird?
    Harper Lee [KB, Text, Infobox]

In what year was André Jardine born?
    1979 [KB, Text, Infobox]
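The bracket notation used in the examples above (an answer followed by its sources in square brackets) can be split into its parts with a small helper. This is a sketch for the display format on this page only; `parse_answer_line` is a hypothetical name, and the released dataset files may encode answers and sources differently.

```python
import re
from typing import List, Tuple

# Matches "<answer> [<source>, <source>, ...]" as shown in the examples.
# This pattern is an assumption about the page's display format only.
_ANSWER_LINE = re.compile(r"^(?P<answer>.*?)\s*\[(?P<sources>[^\]]*)\]\s*$")


def parse_answer_line(line: str) -> Tuple[str, List[str]]:
    """Split an example answer line into the answer text and its source list."""
    match = _ANSWER_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not in 'answer [sources]' form: {line!r}")
    sources = [s.strip() for s in match.group("sources").split(",") if s.strip()]
    return match.group("answer"), sources


answer, sources = parse_answer_line("Milan Baroš [KB, Text, Infobox, Table]")
print(answer, sources)
```

A parsed source list makes it easy to, for instance, count how many example answers can be found in tables versus the KB.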