AI Wars - ChatGPT versus Bard on The Wason Selection Task

This time we decided to Google “famous logic and reasoning tests”, and the first Wikipedia item that came up was the Wason Selection Task. After reading it several times, we decided it would be a suitable logic test to throw at Bard and ChatGPT. We were curious to see whether they would recognize that we had given them a well-known test, and whether they seemed to be using logic to solve the question or would just spit back an answer that we could have gotten through Google. So without further ado, here we go …

Prompt: You are shown a set of four cards placed on a table, each of which has a number on one side and a color on the other. The visible faces of the cards show 3, 8, blue and red. Which card(s) must you turn over in order to test that if a card shows an even number on one face, then its opposite face is blue?

ChatGPT’s Response

In this logic problem, we are trying to test the rule: "if a card shows an even number on one face, then its opposite face is blue". This type of rule is known as an "if P, then Q" rule.

To test this rule, you need to confirm two things:

Cards with even numbers must be blue on the other side (direct verification of the rule, i.e., if P, then Q)
Cards that are not blue on one side must not have an even number on the other side (contrapositive of the rule, i.e., if not Q, then not P)

Hence, in this case, you would need to turn over:

The "8" card to make sure that it is blue on the other side (direct verification of the rule).
The "red" card to make sure that it does not have an even number on the other side (contrapositive of the rule).

You don't need to turn over the "3" card because the rule doesn't state anything about odd numbers. You also don't need to turn over the "blue" card, because the rule doesn't say that only even numbers can have blue on the other side.

Bard’s Response

The correct answer is to turn over the 8 card and the red card.

The rule is: If the card shows an even number on one face, then its opposite face is blue.

We can test this rule by turning over cards that could potentially invalidate it. The 3 card does not invalidate the rule, because it shows an odd number on one face. The blue card does not invalidate the rule, because it shows blue on one face. However, the 8 card could invalidate the rule, if it shows an even number on one face but is not blue on the other face. The red card could also invalidate the rule, if it shows an even number on one face but is blue on the other face.

Therefore, we need to turn over the 8 card and the red card in order to test the rule. If either of these cards shows an even number on one face but is not blue on the other face, then the rule is invalidated. If neither of these cards shows an even number on one face, then the rule is supported.
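As a sanity check on both answers, here is a minimal Python sketch (ours, not something either chatbot produced) that brute-forces the puzzle. It assumes the hidden face of a number card is either blue or red, and that the hidden face of a color card is either an odd or an even number, represented here by 3 and 8. The names `rule_holds` and `must_turn` are made up for illustration. A card needs turning only if some possible hidden face would break the rule.

```python
# Representative hidden faces (an assumption: one odd and one even number,
# and the two colors that appear in the prompt).
NUMBERS = [3, 8]
COLORS = ["blue", "red"]

def rule_holds(number, color):
    """The rule under test: if the number is even, the other side is blue."""
    return number % 2 != 0 or color == "blue"

def must_turn(visible):
    """A card must be turned if some possible hidden face violates the rule."""
    if isinstance(visible, int):
        # Visible number: the hidden face is a color.
        return any(not rule_holds(visible, color) for color in COLORS)
    # Visible color: the hidden face is a number.
    return any(not rule_holds(number, visible) for number in NUMBERS)

cards = [3, 8, "blue", "red"]
to_turn = [card for card in cards if must_turn(card)]
print(to_turn)  # [8, 'red']
```

The 3 and blue cards drop out because no hidden face can make them violate the rule; the red card matters only if an even number is hiding behind it.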

The Results of the Battle

Well, neither response mentions that this is a famous logic question, which leads us to believe that both Bard and ChatGPT did solve it using logic. That is not actual proof, though: they could easily have found the answer using cosine similarity, that is, by matching our prompt against similar text they had already seen, assuming Wikipedia was a source for one or both models (which may not be a valid assumption either).
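To make "cosine similarity" concrete, here is a minimal sketch, assuming a plain bag-of-words comparison between a prompt and a couple of stored texts. The function name `cosine_similarity` and the stored snippets are hypothetical, and nothing here is a claim about how Bard or ChatGPT actually work internally. It only illustrates what matching a prompt to similar text means.

```python
import math
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the word-count vectors of two texts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[word] * vb[word] for word in va)
    norm_a = math.sqrt(sum(count * count for count in va.values()))
    norm_b = math.sqrt(sum(count * count for count in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical stored texts standing in for previously seen material.
stored = {
    "wason": "four cards on a table which cards must you turn over to test the rule",
    "recipe": "mix flour and sugar then bake the cake",
}

query = "which cards must you turn over to test the rule"
best = max(stored, key=lambda name: cosine_similarity(query, stored[name]))
print(best)  # wason
```

A lookup like this would return a stored answer without any reasoning about the cards, which is why a correct answer alone cannot tell the two apart.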

The bottom line is that, as humans, we cannot tell how Bard or ChatGPT came to the correct solution, but we are interested to hear what all of you think. Do you see anything in either answer that indicates that Bard and ChatGPT used logic and reasoning rather than just finding the answer through cosine similarity or similar algorithms? Let us know on Twitter @ebookfairs.

and the same classical test is not good for humans and other intelligent systems! (Sorry, but there is a 200 word limit, I don't know why!?)

mersereg

It's no wonder, since ChatGPT has already encountered this test during training. This confirms that it is necessary to develop comparative psychology,...

mersereg

For this, ChatGPT gave me a good answer, but its reasoning was wrong. After that, I asked if he knew what kind of cognitive psychological test it was, and he immediately said it was the Wason test.

mersereg

Statement: If a coin shows a round number on one side, then its opposite side shows a face of a man.

mersereg

Prompt (continued): The visible sides of the coins show 5, 100, a face of a man and a face of a queen. Which coin or coins must you turn over in order to test the truth of the following statement?

mersereg

Prompt: You are shown a set of four foreign coins placed on a table, each of which has a number on one side and a face of a man or a woman on the other side.

mersereg

At the very least, it is worth reformulating the test for other objects instead of cards for a proper study. I tried the following.

mersereg

If we use the original form of the Wason test with four cards in the prompt, you can be sure that the language model recognizes it, because the other name of the Wason test is the four-card problem.

mersereg

Sorry, but this is not a proper study, and the conclusion is not correct!

mersereg
