Reasoning, Ambiguity, and Clarification: Probing LLMs in Interactive Minecraft Tasks
Description
We examine the capabilities and challenges of using Large Language Models (LLMs) in task-oriented dialogue settings, particularly in situated, dynamic Minecraft-like environments. Our work focuses on two interconnected aspects: using LLMs as Minecraft agents in builder and architect roles, and their ability to ask clarification questions in asynchronous instruction-giver/instruction-follower settings. To this end, we prepared a new unified corpus that combines annotations for reference, ambiguity, and discourse structure, enabling systematic evaluation of clarification behavior. Through platform-based interaction and comparison with human data, we find notable differences: humans rarely ask clarification questions about referential ambiguity but often do about task uncertainty, while LLMs show the opposite tendency. We further explore whether LLMs’ question-asking behavior is influenced by their reasoning capabilities, observing that explicit reasoning increases both the frequency and the relevance of clarification questions. Our findings highlight both the promise and the current limitations of LLMs in handling ambiguity and improving interactive task performance.