Generating Questionable Pandemic Advice with GPT-2

I have created a site that shares highly questionable advice for life during a pandemic. The advice was generated using OpenAI's GPT-2, which you may remember as the AI that was initially hyped as being too dangerous to release.

Some of the advice is reasonable:

  • Remain calm.
  • Avoid spreading germs.
  • Protect areas of the body from contamination.
  • Consider taking a long leave of absence from work.
  • When entering or exiting a venue, keep your hands off people.

On the other hand, some of the advice is definitely not a good idea:

  • Wrap your mouth with tape.
  • Drink water with table salt.
  • Avoid using any sort of soap.

And some advice is very odd:

  • Stay away from close contact with death-like objects.
  • Remove fingerprints from your hands if possible.
  • Report any unexpected sound to the appropriate authorities.

What's Going On Here?

A message from the future (October 2020): A while after writing this article, GPT-3 was revealed. It's like GPT-2, only more so. This time around the press focused on the few-shot learning thing a lot more. Keep that in mind for context as you read this outdated article.

GPT-2 is a language model, meaning that it is trained to predict what comes next when given a chunk of English text. By repeatedly predicting the next word[1], the model can be used to generate text.
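To make the "repeatedly predict the next word" loop concrete, here is a minimal sketch in Python. The tiny lookup table standing in for the model is entirely made up for illustration; GPT-2 itself conditions on the whole preceding text and works with sub-word tokens. As the footnote on sub-word units notes, the model produces a probability distribution over next tokens, so the sketch turns scores into probabilities and samples from them.

```python
import math
import random

# Hypothetical toy "model": maps the previous token to unnormalized scores
# (logits) for each candidate next token. This table is invented for the
# example; a real model like GPT-2 looks at the entire context so far.
TOY_LOGITS = {
    "<start>": {"Remain": 2.0, "Avoid": 1.5},
    "Remain": {"calm": 3.0, "indoors": 1.0},
    "Avoid": {"crowds": 2.0, "germs": 2.0},
    "calm": {".": 5.0},
    "indoors": {".": 5.0},
    "crowds": {".": 5.0},
    "germs": {".": 5.0},
}


def softmax(logits):
    """Turn a dict of scores into a probability distribution."""
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - top) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def generate(rng, max_tokens=10):
    """Sample one token at a time until a full stop or the length limit."""
    tokens = []
    prev = "<start>"
    for _ in range(max_tokens):
        probs = softmax(TOY_LOGITS[prev])
        # Sample from the distribution instead of always taking the top token.
        nxt = rng.choices(list(probs), weights=list(probs.values()))[0]
        if nxt == ".":
            break
        tokens.append(nxt)
        prev = nxt
    return " ".join(tokens) + "."


if __name__ == "__main__":
    print(generate(random.Random()))
```

Sampling, as opposed to always picking the most likely token, is what lets the same prompt produce different text on each run.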

One of my goals with this blog is to only write about things that haven't already been covered by every other computer-science-y blogger on the internet. I won't write about GPT-2 in detail, because it's been done repeatedly, but I do want to mention the aspect that I find most interesting.

The pre-trained model released by OpenAI was trained on a massive dataset (consisting of every outgoing link on Reddit with at least 3 karma), and as a result it understands many different domains of English text. This means that I didn't need to do any "fine-tuning" (training the model on domain-specific data) to generate my pandemic advice. I simply showed the model a list of 3 examples of pandemic advice, prefaced by the text "In the event of a global pandemic, remember:", and then asked it to predict what comes next.
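The prompt described above could be assembled roughly like this. This is a sketch, not the script behind the site: the three example items are placeholders (the article doesn't say which ones were used), the bullet format and sampling settings are guesses, and the helper names are made up. The generation step assumes Hugging Face's transformers library and its text-generation pipeline with the "gpt2" checkpoint.

```python
PREFACE = "In the event of a global pandemic, remember:"

# Placeholder examples; the three items actually used are not given in the article.
EXAMPLES = [
    "Wash your hands often.",
    "Stay home if you feel ill.",
    "Keep your distance from other people.",
]


def build_prompt(preface, examples):
    """Preface, one bullet per example, then an open bullet for the model to fill."""
    lines = [preface] + [f"- {example}" for example in examples] + ["-"]
    return "\n".join(lines)


def first_new_item(prompt, generated_text):
    """Drop the echoed prompt and keep only the first newly generated line."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.split("\n", 1)[0].strip()


def generate_advice(prompt, n=5):
    """Sample n continuations from the pre-trained model, with no fine-tuning."""
    # Imported here so the helpers above work without the library installed.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")
    outputs = generator(
        prompt,
        max_new_tokens=30,       # assumed length budget for one short item
        do_sample=True,          # sample instead of greedy decoding
        num_return_sequences=n,
    )
    return [first_new_item(prompt, out["generated_text"]) for out in outputs]


if __name__ == "__main__":
    for item in generate_advice(build_prompt(PREFACE, EXAMPLES)):
        print(item)
```

Ending the prompt on an open bullet nudges the model towards continuing the list, and keeping only the first generated line throws away whatever it rambles on about afterwards.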

This ability to perform a specific task (generating pandemic advice) without fine-tuning is very convenient, since training a model this large requires a lot of computing power, not to mention an appropriate dataset.

In fact, attempting to accomplish specific tasks without additional task-specific training is the focus of the paper introducing GPT-2 by Radford et al., which is titled "Language models are unsupervised multitask learners." They show that a general language model can perform ok-ishly on tasks including reading comprehension, text summarization, and even French-to-English translation, despite the fact that the model was only trained on English texts[2].

This is known as "few-shot" learning[3], and I think it's neat that language models can do this[4]. I'm kinda surprised that most of the media coverage of GPT-2 didn't mention this at all, mainly focusing on how plausible the examples of generated text looked[5] and the whole "too dangerous to release" angle. This is especially strange considering that they could have spun it as a big step towards general AI and "the Singularity"; then again, I guess there's a limit to how much sensationalist spin you can put on one story.

Try It Yourself

You can experiment with GPT-2 at talktotransformer.com, and if you're more technically inclined[6] you can take a look at my script for generating lists of short sentences/phrases (which in turn uses Hugging Face's transformers library).


  1. GPT-2 actually operates on sub-word units. For instance, "-ing" or "-able" might be predicted. Specifically, Byte Pair Encoding is used. Also, the model doesn't "predict" a single next token, instead estimating the probability distribution for the next token.
  2. Of course, some English texts contain occasional French phrases, allowing the model to learn a bit of French.
  3. Or at least in the same ballpark as few-shot learning.
  4. Although a language model that can solve arbitrary tasks well would probably need to be unimaginably massive. Note from the future: GPT-3 is much better at this than GPT-2, but it does accomplish this mainly by being truly massive.
  5. In particular, most articles heavily featured a silly text about unicorns that only appears in the appendix of the paper.
  6. and have gone to the trouble of getting CUDA working properly on your system