The inCLINC dataset (incremental intent annotations of the CLINC dataset) contains 121 distinct utterances (queries directed to a voice assistant), given both in their complete form and in partial forms, for a total of 538 utterances. The utterances were labeled with intent categories by 126 coders in a crowdsourcing study. The tagset consists of 37 intent categories plus one out-of-scope category. Each utterance was annotated by 6 to 9 coders.
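Because each utterance carries labels from several coders, one way to summarize agreement on a partial or complete utterance is the Shannon entropy of its label distribution. The following is a minimal sketch under stated assumptions: the function name `label_entropy`, the placeholder intent names, and the example annotations are all hypothetical and are not taken from the dataset files or from the paper's code.

```python
from collections import Counter
from math import log2


def label_entropy(labels):
    """Shannon entropy (in bits) of the intent labels given by the coders.

    `labels` is a list with one intent label per coder. This helper is
    hypothetical and only illustrates the idea; it is not part of inCLINC.
    """
    counts = Counter(labels)
    total = len(labels)
    # Sum p * log2(1/p) over the observed labels; 0.0 when all coders agree.
    return sum((c / total) * log2(total / c) for c in counts.values())


# Made-up annotations from 6 coders (placeholder intent names).
partial_labels = ["intent_a", "intent_a", "intent_b",
                  "intent_b", "intent_c", "intent_c"]
complete_labels = ["intent_a"] * 6

partial_entropy = label_entropy(partial_labels)    # coders split three ways
complete_entropy = label_entropy(complete_labels)  # coders agree
```

In this made-up case the entropy drops from the partial to the complete utterance, which is the kind of change one might look at when comparing partial and complete forms.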
To refer to inCLINC in any publication, please cite the following paper:
Hrycyk, L., Zarcone, A., & Hahn, L. (2021). Not So Fast, Classifier – Accuracy and Entropy Reduction in Incremental Intent Classification. In Proceedings of the 3rd Workshop on NLP for Conversational AI (NLP4ConvAI 2021).