This time we Googled “famous logic and reasoning tests,” and the first Wikipedia result that came up was the Wason Selection Task. After reading it several times, we decided it would be a suitable logic test to throw at Bard and ChatGPT. We were curious to see whether they would recognize it as a well-known test, and whether they seemed to use logic to solve it or would just spit back an answer we could have found through Google. So without further ado, here we go …
Prompt: You are shown a set of four cards placed on a table, each of which has a number on one side and a color on the other. The visible faces of the cards show 3, 8, blue and red. Which card(s) must you turn over in order to test that if a card shows an even number on one face, then its opposite face is blue?
ChatGPT’s Response
In this logic problem, we are trying to test the rule: "if a card shows an even number on one face, then its opposite face is blue". This type of rule is known as an "if P, then Q" rule.
To test this rule, you need to confirm two things:
Hence, in this case, you would need to turn over:
You don't need to turn over the "3" card because the rule doesn't state anything about odd numbers. You also don't need to turn over the "blue" card, because the rule doesn't say that only even numbers can have blue on the other side.
Bard’s Response
The correct answer is to turn over the 8 card and the red card.
The rule is: If the card shows an even number on one face, then its opposite face is blue.
We can test this rule by turning over cards that could potentially invalidate it. The 3 card does not invalidate the rule, because it shows an odd number on one face. The blue card does not invalidate the rule, because it shows blue on one face. However, the 8 card could invalidate the rule, if it shows an even number on one face but is not blue on the other face. The red card could also invalidate the rule, if it shows an even number on one face but is blue on the other face.
Therefore, we need to turn over the 8 card and the red card in order to test the rule. If either of these cards shows an even number on one face but is not blue on the other face, then the rule is invalidated. If neither of these cards shows an even number on one face, then the rule is supported.
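As a sanity check on both answers, we sketched the falsification idea in a few lines of Python. This is only a rough illustration under our own assumptions: the names `violates` and `must_turn` are made up for this post, and we let "3" and "8" stand in for "any odd number" and "any even number", and "red" for "any color that is not blue". A card is worth turning over only if some possible hidden face would break the rule.

```python
# Visible faces from the prompt; the hidden faces are unknown, so we try every option.
VISIBLE = ["3", "8", "blue", "red"]
NUMBERS = ["3", "8"]      # assumption: stand-ins for "odd number" and "even number"
COLORS = ["blue", "red"]  # assumption: stand-ins for "blue" and "not blue"


def violates(number: str, color: str) -> bool:
    """A card breaks the rule only if its number is even and its color is not blue."""
    return int(number) % 2 == 0 and color != "blue"


def must_turn(visible: str) -> bool:
    """A card must be turned if at least one possible hidden face would break the rule."""
    if visible.isdigit():
        # Number is showing, so the hidden face is a color.
        return any(violates(visible, color) for color in COLORS)
    # Color is showing, so the hidden face is a number.
    return any(violates(number, visible) for number in NUMBERS)


cards_to_turn = [card for card in VISIBLE if must_turn(card)]
print(cards_to_turn)  # ['8', 'red']
```

Under these assumptions the check agrees with both chatbots: only the 8 card and the red card can ever expose a violation. The red card matters because its hidden number might be even, while the 3 card and the blue card cannot break the rule no matter what is on their other side.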
The Results of the Battle
Well, neither response mentions that this is a famous logic question, which leads us to believe that both Bard and ChatGPT did solve it using logic. That’s not actual proof, though: they could easily have found the answer using cosine similarity, assuming Wikipedia was a source for one or both models (which may not be a valid assumption either).
The bottom line is that, as humans, we cannot tell how Bard or ChatGPT came to the correct solution, but we are interested to hear what all of you think. Do you see anything in either answer that indicates that Bard and ChatGPT used logic and reasoning rather than just finding the answer through cosine similarity or a similar algorithm? Let us know on Twitter @ebookfairs.