
Exploring Multimodal AI: Why Google’s Gemini and OpenAI’s GPT-4o Chose This Path | ChatCAT and the Future of Interspecies Communication | Episode 23
05/20/24 • 10 min
The recent spring updates and demos from both Google (Gemini) and OpenAI (GPT-4o) prominently feature their multimodal capabilities. In this episode, we discuss the advantages of multimodal AI over models focused on a single modality, such as language. Through the example of ChatCAT, a hypothetical AI that helps owners understand their cats, we explore multimodal AI's promise of a more holistic understanding. Please enjoy this episode.
For more information, check out https://www.superprompt.fm. There you can contact me and/or sign up for our newsletter.
Previous Episode

Google Gemini Cheatsheet | Episode 22
Google recently announced Gemini, a family of large-scale multimodal AI models: Nano, Pro, and Ultra. This podcast is a brief summary of Google's models and the OpenAI comparables, e.g. GPT-3, GPT-4, and ChatGPT. You can take Gemini for a spin at https://gemini.google.com. (Note: I am not sponsored by Google.) Long-time listeners will probably notice a change to our theme music and intro. I hope you like it!
For more information, check out https://www.superprompt.fm. There you can contact me and/or sign up for our newsletter.
Next Episode

Anthropic's Claude chatbot | Benchmarking LLMs | LMSYS Leaderboard | Episode 24
In this solo episode, we go beyond Google's Gemini and OpenAI's ChatGPT to take a look at Anthropic, a startup that made headlines after securing a $4 billion investment from Amazon. We'll also dive into the importance of AI industry benchmarks. Learn about LMSYS's Arena Elo and MMLU (Measuring Massive Multitask Language Understanding), including how these benchmarks are constructed and used to evaluate the performance of large language models objectively. Discover how benchmarks can help you identify promising chatbots in the market. Enjoy the episode!
Anthropic's Claude
https://claude.ai
LMSYS Leaderboard
https://chat.lmsys.org/?leaderboard
For more information, check out https://www.superprompt.fm. There you can contact me and/or sign up for our newsletter.