In today’s world of generative AI chatbots, we’ve witnessed the sudden rise of OpenAI’s ChatGPT, introduced in November 2022, followed by Bing Chat in February 2023 and Google’s Bard in March 2023. We decided to put these chatbots through their paces with an assortment of tasks to determine which one reigns supreme in the AI chatbot arena. Since Bing Chat uses GPT-4 technology similar to that of the latest ChatGPT model, we opted to focus on two titans of AI chatbot technology: OpenAI and Google.
We tested ChatGPT and Bard in seven critical categories: dad jokes, argument dialog, mathematical word problems, summarization, factual retrieval, creative writing, and coding. For each test, we fed the exact same instruction (called a “prompt”) into ChatGPT (with GPT-4) and Google Bard. We used the first result, with no cherry-picking.
It’s worth noting that a version of ChatGPT based on the earlier GPT-3.5 model is also available, but we did not use that in the test. Since we used GPT-4 only, we will refer to ChatGPT as “ChatGPT-4” in this article to reduce confusion.
Obviously, this is not a scientific study and is intended to be a fun comparison of the chatbots’ capabilities. Outputs can vary between sessions due to random elements, and further evaluations with different prompts will produce different results. Also, the capabilities of these models will change rapidly over time as Google and OpenAI continue to upgrade them. But for now, this is how things stand in early April 2023.
To warm up our contest of wits, we asked ChatGPT and Bard to write some jokes. And since the pinnacle of comedy can be found in the form of dad jokes, we wondered if the two chatbots could author some unique ones.
Prompt: Write 5 original dad jokes
Of Bard’s five dad jokes, we found three verbatim on the Internet using a Google search. One of the examples (the “grapes” one) is half-borrowed from a tweet of a Mitch Hedberg joke, but it’s corrupted by regrettable wordplay that we’d rather not attempt to interpret. And surprisingly, there is one seemingly original joke (about the snail) that we can’t find anywhere else, but it doesn’t make sense.
Meanwhile, ChatGPT-4’s five dad jokes were 100 percent unoriginal, all lifted completely from other sources, but they were delivered accurately. Since dad jokes should arguably be more groan-worthy than clever, it seems that Bard edged out ChatGPT-4 here. Bard also attempted to create original jokes (following our instruction), although some failed embarrassingly (which is dad-like), and it even unintentionally put its foot in its mouth, so to speak (also dad-like).