Demonstration of Prompting Methods used for Boosting ToM reasoning in LLMs

Last edited time: Feb 13, 2025 2:43 PM

Examples of the four prompting types used to test the Theory of Mind (ToM) performance of LLMs. Each box shows the model input for a single trial in one condition. For each trial, all of the text after the word "Prompt:" was input to the model, including the final line beginning with "A:".

Zero-Shot

Prompt: Read the scenario and answer the following question:

Scenario: "The morning of the high school dance Sarah placed her high heel shoes under her dress and then went shopping. That afternoon, her sister borrowed the shoes and later put them under Sarah's bed." Question: When Sarah gets ready, does she assume her shoes are under her dress? A:

Zero-Shot + Step-by-Step Thinking

Prompt: Read the scenario and answer the following question:

Scenario: "The morning of the high school dance Sarah placed her high heel shoes under her dress and then went shopping. That afternoon, her sister borrowed the shoes and later put them under Sarah's bed." Question: When Sarah gets ready, does she assume her shoes are under her dress? A: Let's think step by step:

Two-Shot Chain of Thought Reasoning

Prompt: Read the scenario and answer the following question:

Scenario: "Anne made lasagna in the blue dish. After Anne left, Ian came home and ate the lasagna. Then he filled the blue dish with spaghetti and replaced it in the fridge." Q: Does Anne think the blue dish contains spaghetti? A: When Anne left the blue dish contained lasagna. Ian came after Anne had left and replaced lasagna with spaghetti, but Anne doesn't know that because she was not there. So, the answer is: No, she doesn't think the blue dish contains spaghetti.

Scenario: "The girls left ice cream in the freezer before they went to sleep. Over night the power to the kitchen was cut and the ice cream melted." Q: When they get up, do the girls believe the ice cream is melted? A: The girls put the ice cream in the freezer and went to sleep. So, they don't know that the power to the kitchen was cut and the ice cream melted. So, the answer is: No, the girls don't believe the ice cream is melted.

Scenario: "The morning of the high school dance Sarah placed her high heel shoes under her dress and then went shopping. That afternoon, her sister borrowed the shoes and later put them under Sarah's bed." Question: When Sarah gets ready, does she assume her shoes are under her dress? A:

Two-Shot Chain of Thought Reasoning + Step-by-Step Thinking

Prompt: Read the scenario and answer the following question:

Scenario: "Anne made lasagna in the blue dish. After Anne left, Ian came home and ate the lasagna. Then he filled the blue dish with spaghetti and replaced it in the fridge." Q: Does Anne think the blue dish contains spaghetti? A: Let's think step by step: When Anne left the blue dish contained lasagna. Ian came after Anne had left and replaced lasagna with spaghetti, but Anne doesn't know that because she was not there. So, the answer is: No, she doesn't think the blue dish contains spaghetti.

Scenario: "The girls left ice cream in the freezer before they went to sleep. Over night the power to the kitchen was cut and the ice cream melted." Q: When they get up, do the girls believe the ice cream is melted? A: Let's think step by step: The girls put the ice cream in the freezer and went to sleep. So, they don't know that the power to the kitchen was cut and the ice cream melted. So, the answer is: No, the girls don't believe the ice cream is melted.

Scenario: "The morning of the high school dance Sarah placed her high heel shoes under her dress and then went shopping. That afternoon, her sister borrowed the shoes and later put them under Sarah's bed." Question: When Sarah gets ready, does she assume her shoes are under her dress? A: Let's think step by step:
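The four conditions above differ only in whether few-shot examples are prepended and whether the cue "Let's think step by step:" follows each "A:". As a minimal sketch (not code from the paper), the variants can be assembled from shared building blocks; the `build_prompt` helper and its parameter names are illustrative assumptions, not the authors' implementation:

```python
def build_prompt(scenario, question, examples=(), step_by_step=False):
    """Assemble a ToM prompt in one of the four conditions.

    scenario, question: the test trial's scenario text and question.
    examples: iterable of (scenario, question, reasoning) tuples for the
        two-shot chain-of-thought conditions; empty for zero-shot.
    step_by_step: if True, append "Let's think step by step:" after "A:".
    """
    parts = ["Read the scenario and answer the following question:", ""]
    cue = " Let's think step by step:" if step_by_step else ""
    # Worked examples include the reasoning after "A:" (and the cue, if any).
    for ex_scenario, ex_question, ex_reasoning in examples:
        parts.append(f'Scenario: "{ex_scenario}" Q: {ex_question} A:{cue} {ex_reasoning}')
        parts.append("")
    # The test trial ends at "A:" (plus the cue), leaving the model to answer.
    parts.append(f'Scenario: "{scenario}" Question: {question} A:{cue}')
    return "\n".join(parts)
```

For instance, `build_prompt(sarah_scenario, sarah_question, step_by_step=True)` reproduces the "Zero-Shot + Step-by-Step Thinking" condition, while passing the two worked examples yields the two-shot chain-of-thought conditions.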

The examples above are from Moghaddam & Honey (2023), "Boosting Theory-of-Mind Performance in Large Language Models via Prompting," arXiv:2304.11490.

Zero-shot theory-of-mind accuracy:
- GPT-4: 80%
- Humans: 87%

With in-context learning prompts (few-shot / chain-of-thought reasoning):
- All RLHF-trained LLMs: 80%+
- GPT-4: 100%