https://slatestarcodex.com/2020/06/10/the-obligatory-gpt-3-post/
I.
I would be failing my brand if I didn’t write something about GPT-3, but I’m not an expert and discussion is still in its early stages. Consider this a summary of some of the interesting questions I’ve heard posed elsewhere, especially comments by gwern and nostalgebraist. Both of them are smart people who I broadly trust on AI issues, and both have done great work with GPT-2. Gwern has gotten it to write poetry, compose music, and even sort of play some chess; nostalgebraist has created nostalgebraist-autoresponder (a Tumblr written by GPT-2 trained on nostalgebraist’s own Tumblr output). They disagree pretty strongly with each other about the implications of GPT-3. I don’t know enough to resolve that disagreement, so this will be a kind of incoherent post, which will hopefully stimulate some more productive comments. So:
OpenAI has released a new paper, Language Models Are Few-Shot Learners, introducing GPT-3, the successor to the wildly successful language-processing AI GPT-2.
GPT-3 doesn’t have any revolutionary new advances over its predecessor. It’s just much bigger. GPT-2 had 1.5 billion parameters. GPT-3 has 175 billion. The researchers involved are very open about how it’s the same thing but bigger. Their research goal was to test how GPT-like neural networks scale.
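(A rough illustration of what “the same thing but bigger” means: in a GPT-style transformer, most parameters live in the attention and feed-forward layers, so the count scales roughly as 12 × layers × d_model². The back-of-the-envelope sketch below uses the published hyperparameters for GPT-2 XL and the largest GPT-3 model; it is my own approximation, not code from the paper, and it ignores biases and layer norms.)

# Back-of-the-envelope parameter count for a GPT-style transformer.
# Per layer: ~4*d_model^2 for attention + ~8*d_model^2 for the feed-forward block.
def approx_params(n_layers, d_model, vocab_size=50257):
    per_layer = 12 * d_model ** 2        # attention + feed-forward weights
    embeddings = vocab_size * d_model    # token embedding matrix
    return n_layers * per_layer + embeddings

print(f"GPT-2 XL: ~{approx_params(48, 1600) / 1e9:.2f}B")   # close to the reported 1.5 billion
print(f"GPT-3:    ~{approx_params(96, 12288) / 1e9:.0f}B")  # close to the reported 175 billion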
Before we get into the weeds, let’s get a quick gestalt impression of how GPT-3 does compared to GPT-2.
Here’s a sample of GPT-2 trying to write an article: