One question nagging at many of us when it comes to artificial intelligence is how we can tell whether the models are really “thinking” or just memorizing answers. Some people may ask what the difference is. As a software developer, I could learn how and why a method works, or I could simply remember that it works and use it. The difference is being able to explain why it works versus not caring why and being satisfied that it does. That is a simple way of describing the importance of critical thinking. In fact, when interviewing candidates for a software engineering position, I am far more interested in their critical thinking than in whether they have memorized the exact syntax (code) for a specific task.
I searched the web for basic critical thinking tests and found several that could be useful for this week’s AI battle. I settled on this one from Psychology Today, and I recommend reading that article for more background on what the test is trying to prove. It may not be perfect, but for the purpose of challenging ChatGPT and Bard, it works for me.
So here is the prompt:
Given the following statement:
Five percent of people in your age and health cohort have disease D. Your doctor gives you a test for D that's 80 percent accurate, and you test positive for D. What are the chances you have D? (To say the test is 80 percent accurate is to say that if you have D, 80 percent of the time the test will say you have D and 20 percent of the time it'll say you don't; and if you don't have D, 80 percent of the time the test will say you don't have D and 20 percent of the time it'll say you do.)
Show your reasoning in your answer.
ChatGPT’s Response
To solve this problem, we can use Bayes’ Theorem. Let's break down the information given:
1. P(D) = Probability that a person has disease D = 5% or 0.05
2. P(~D) = Probability that a person does not have disease D = 95% or 0.95
3. P(+|D) = Probability of testing positive given that a person has disease D = 80% or 0.80
4. P(-|D) = Probability of testing negative given that a person has disease D = 20% or 0.20
5. P(+|~D) = Probability of testing positive given that a person does not have disease D = 20% or 0.20
6. P(-|~D) = Probability of testing negative given that a person does not have disease D = 80% or 0.80
We want to find the probability that you have disease D given that you tested positive, which is P(D|+).
Using Bayes’ Theorem:
P(D|+) = P(+|D)P(D) / [P(+|D)P(D) + P(+|~D)P(~D)] = (0.80 × 0.05) / [(0.80 × 0.05) + (0.20 × 0.95)] = 0.04 / 0.23 ≈ 0.1739
So, given that you tested positive, there is approximately a 17.39% chance that you actually have disease D.
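For readers who want to double-check that arithmetic themselves, here is a minimal Python sketch of my own (it is not part of either model’s response, and the function name bayes_posterior is just an illustrative choice) that plugs the numbers from the prompt into Bayes’ Theorem:

```python
def bayes_posterior(prior, sensitivity, false_positive_rate):
    """Posterior probability of having the disease given a positive test."""
    # P(+) = P(+|D)P(D) + P(+|~D)P(~D)
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    # Bayes' Theorem: P(D|+) = P(+|D)P(D) / P(+)
    return (sensitivity * prior) / p_positive

# Values from the prompt: 5% prevalence, test correct 80% of the time either way
print(bayes_posterior(prior=0.05, sensitivity=0.80, false_positive_rate=0.20))
# -> 0.17391304347826086, i.e. about 17.39%
```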
Bard’s Response
Sure. The probability that you have the disease, given that you tested positive, can be calculated using Bayes’ theorem:
P(D|+) = P(+|D)P(D) / [P(+|D)P(D) + P(+|~D)P(~D)]
where P(D) = 0.05, P(~D) = 0.95, P(+|D) = 0.8, and P(+|~D) = 0.2.
Plugging in the values from the question, we get:
P(D|+) = (0.8)(0.05) / [(0.8)(0.05) + (0.2)(0.95)] = 0.173913
Therefore, the chances that you have the disease given a positive test result are approximately 17.39%.
Here is how we can interpret this result:
It is important to note that this is just a probability, and it is not a definitive diagnosis. If you are concerned about your health, you should talk to your doctor.
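Both models land on roughly the same number, and a quick simulation can confirm it empirically. The sketch below is my own addition (the sample size and random seed are arbitrary choices): it draws a large synthetic population with 5% prevalence, applies a test that is right 80% of the time, and measures what fraction of the positive results actually have the disease.

```python
import random

random.seed(42)          # arbitrary seed, for reproducibility only
trials = 1_000_000       # assumed sample size for the simulation

true_positives = 0
all_positives = 0
for _ in range(trials):
    has_d = random.random() < 0.05                     # 5% prevalence
    accurate = random.random() < 0.80                  # test is correct 80% of the time
    tests_positive = has_d if accurate else not has_d  # an inaccurate test flips the result
    if tests_positive:
        all_positives += 1
        if has_d:
            true_positives += 1

print(true_positives / all_positives)  # ~0.174, matching the 17.39% from Bayes' Theorem
```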
Conclusion
So, based on these two answers, can we really determine whether ChatGPT and Bard used critical thinking to reach the result? Perhaps they simply have Bayes’ Theorem in their memory and applied it. I would say Bard gives us a little more hope, since it at least adds that if you are concerned about your health you should talk to your doctor. But is that a logical leap? As always, we would love your feedback and thoughts. Let us know on Facebook, Twitter, or shoot us an email. Thanks!