A week with GPT3
Posted on December 27, 2020
On the 19th of December, I woke up bleary-eyed on a Saturday after another late night of doing CTFs and reached for my phone to check the time, expecting a lazy weekend ahead to cool down, when I noticed an email from OpenAI. Suffice it to say, that woke me up faster than any amount of binge drinking Monster energy drinks would. The email just said I now had trial access to GPT3, either through March or until I hit my quota, whichever came first. So I jumped right out of bed and over to my PC to read the docs.
The Project - Discord Technical Helper
I have a private Discord server for a few friends to talk about ideas, current events and projects, and to shitpost together. I've already made a chatbot previously (writeup coming before the heat death of the universe) that was pretty successful and also a good avenue to practice my JavaScript. With access to GPT3, I figured a chatbot designed to assist with our technical tasks would be a great project. While we're not all programmers, all of us work in a STEM-related field.
GPT3 works based on prompts. You provide a prompt, set the stop sequences, and submit it via an API call to get a generated response back. There are a few parameters to tweak, but the only real learning curve is figuring out how to use the stop sequences.
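A minimal sketch of what such a call could look like with the Python client of the time, using the Completion endpoint; the prompt text and parameter values here are placeholders of my own, not the project's:

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

response = openai.Completion.create(
    engine="davinci",                    # the most capable (and priciest) engine
    prompt="Human: What does DNS actually do?\nAI:",
    max_tokens=150,
    temperature=0.9,
    stop=["\nHuman:", "\nAI:"],          # generation halts as soon as either sequence appears
)

print(response.choices[0].text.strip())  # just the reply, without the next conversational turn
```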
Enter November, or 11 for short, named so because I made the first commit on this project in November. All the code used is available on my GitHub here. OpenAI has an example for a chatbot with a personality, so I decided to give mine one too and wrote the prompt in an attempt to create a brainy and smug android.
Since discord.py's event handlers are quite strict about their signatures, I made a helper that sends the Discord message to GPT3 together with the prompt, and imported it into the main bot code as a function called ask().
Main code:
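A minimal sketch of how the two pieces could fit together; the module name gpt3.py, the BASE_PROMPT text, the !11 trigger and the generation parameters are my own assumptions rather than the project's actual code:

```python
# gpt3.py -- hypothetical helper module exposing ask()
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Assumed stand-in for the "brainy and smug android" personality prompt.
BASE_PROMPT = (
    "The following is a conversation with 11, a brainy and rather smug android "
    "who answers technical questions accurately but can't resist a snarky remark.\n\n"
)

def ask(question):
    """Send the Discord message, prefixed with the personality prompt, to GPT3."""
    response = openai.Completion.create(
        engine="davinci",
        prompt=BASE_PROMPT + f"Human: {question}\n11:",
        max_tokens=150,
        temperature=0.9,
        stop=["\nHuman:", "\n11:"],
    )
    return response.choices[0].text.strip()
```

```python
# bot.py -- minimal discord.py wiring around ask()
import discord
from gpt3 import ask

intents = discord.Intents.default()
intents.message_content = True          # needed on discord.py 2.x; older versions skip this line
client = discord.Client(intents=intents)

@client.event
async def on_message(message):
    if message.author == client.user:   # never answer ourselves
        return
    if message.content.startswith("!11 "):
        reply = ask(message.content[len("!11 "):])
        await message.channel.send(reply)

client.run("YOUR_DISCORD_TOKEN")        # placeholder
```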
Let's give it a whirl:
Wonderful. Time for some field testing. I asked my friends to ask it questions about things they're knowledgeable about.
Even knowledge about some more obscure projects:
Some python code, despite it not being included in the prompt:
Pretty impressive, isn't it?
The Good - Impressive results
Even with a fairly short prompt, the results were really impressive. The responses were articulate and had a twinge of personality to them. GPT3 proved knowledgeable across all the fields we probed it on, even getting snarky when we made typos or sent a message before we'd finished writing it. It was also incredibly fast, with most responses taking less than a second to be processed and sent back, even over Discord. The real magic came from those little personality touches. The occasional snarky response or burn over a typo got us laughing, and some of its replies were so insightful that for a second we forgot we were talking to an API, which was a somewhat sobering feeling.
The trial is also reasonably generous, with 300k tokens being enough to get a good idea about what you're dealing with, although you can definitely burn through them surprisingly fast.
The Bad - It's expensive
While very flexible in terms of pricing, the fact that every request has to include the whole prompt can drive the cost up significantly. Our queries, together with the prompt and response, came to around 500 tokens, and with the "best of" parameter and the davinci engine, the cost worked out to somewhere between 3 and 5 cents per question. Not a lot at first glance, but 20–30 questions gets you to a dollar, and that's not a huge number of questions. A dollar per day is 30 dollars per month. Not a huge sum, but consider all the other things you could get for less than 30 dollars: your own domain, a HackTheBox sub and a Netflix sub. The flexible pricing is great: it lets me run the chatbot for 20–30 dollars a month without having to go for the 100 dollar option that offers about 2 million tokens, and you can also downgrade the quality of the responses to cut the price further. The real issue is that you can burn through tokens prohibitively fast during initial testing, as even requests made in the playground count towards your cap.
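A quick back-of-the-envelope calculation with the figures above; the per-question cost and query volume are rough assumptions rather than measurements:

```python
# Rough cost estimate using the numbers quoted above
cost_per_question = 0.04   # assumed ~3-5 cents per ~500-token davinci query with best_of
questions_per_day = 25     # assumed 20-30 questions on a typical day

daily_cost = cost_per_question * questions_per_day
print(f"~${daily_cost:.2f}/day, ~${daily_cost * 30:.0f}/month")   # roughly $1/day, $30/month
```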
Since it's an invite-only program, there's not a lot of documentation available on the web. OpenAI has some examples on their website, but it might take you a moment to get your head around them. Also, some of the responses above were cherry-picked: while about 90% of the responses were good to great, the other 10% were complete nonsense.
The Great - It just works
Once you figure out how the prompt system works and plug it into something else, it just works, and with great results. Generating the responses involves zero coding on your part, and anyone can start playing around with it regardless of how much experience they have.
OpenAI's website automatically loads your API key into their documentation, while also providing great guidelines and examples. The playground prompts can be exported as Python code or curl requests, making integration really painless. You can quite literally go from zero to a working prototype in a matter of hours, and that's remarkable. And best of all, you don't have to spend time training anything: the prompt that gets submitted is the training, so keeping the dataset intact is easy.
Upgrades, people, upgrades
The main complaint from the entitled server users was the lack of persistence. Given how the prompt system worked, GPT3 would consider each query in a vacuum. People demanded persistence and I looked for a way to do it that wouldn't blow through my remaining tokens.
So I decided I'd use a cache system. Once someone interacted with 11, a cache of the dialogue would start being built up and sent alongside the prompt and the query. This allowed 11 to discuss the same topic in depth and appear more human. The real issue came from integrating this system: since the discord.py library is very strict about what parameters its functions can accept, I had to resort to using global variables.
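A minimal sketch of the cache, building on the same hypothetical ask() helper as before; the variable names and prompt format are placeholders:

```python
# gpt3.py -- cache-aware version of the hypothetical ask() helper
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

BASE_PROMPT = "..."  # the assumed personality prompt from the earlier sketch

# Module-level conversation history. A global is the path of least resistance here,
# since discord.py event handlers don't make it easy to pass extra state around.
cache = ""

def ask(question):
    global cache
    prompt = BASE_PROMPT + cache + f"Human: {question}\n11:"
    response = openai.Completion.create(
        engine="davinci",
        prompt=prompt,
        max_tokens=150,
        temperature=0.9,
        stop=["\nHuman:", "\n11:"],
    )
    answer = response.choices[0].text.strip()
    cache += f"Human: {question}\n11: {answer}\n"  # append this exchange to the running dialogue
    return answer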
This created a new problem: token usage. Traffic was rather heavy, so the cache would quickly balloon in size and burn through a ton of tokens. So I decided to implement a timer: if you stopped interacting with 11 for more than 2 hours, the cache would get emptied. A generous window, but I figured it'd be enough to keep a conversation going while accounting for lulls in it. Given the nature of the discord.py library, the timer couldn't just block, so I went for the elegant solution and used multithreading. First, define a timer function that spins up a new thread with threading.Timer:
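Something along these lines, assuming the timer lives next to the cache in the same module; the function names are my own placeholders:

```python
# gpt3.py (continued) -- timer that wipes the cache after two idle hours
import threading

CACHE_TIMEOUT = 2 * 60 * 60        # two hours, per the rule above

cache_timer = None                 # handle to the currently running countdown

def clear_cache():
    """Wipe the conversation history so the next query starts from a clean slate."""
    global cache
    cache = ""

def reset_cache_timer():
    """(Re)start the two-hour countdown on a background thread."""
    global cache_timer
    if cache_timer is not None:
        cache_timer.cancel()       # throw away the previous countdown
    cache_timer = threading.Timer(CACHE_TIMEOUT, clear_cache)
    cache_timer.daemon = True      # don't let a pending timer keep the process alive
    cache_timer.start()
```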
Next, simply call it after every query:
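In the bot's on_message handler, that could look something like this, again building on the earlier bot.py sketch:

```python
# bot.py (excerpt) -- same handler as before, now resetting the timer on every query
@client.event
async def on_message(message):
    if message.author == client.user:
        return
    if message.content.startswith("!11 "):
        reply = ask(message.content[len("!11 "):])
        reset_cache_timer()        # every query pushes the two-hour wipe back
        await message.channel.send(reply)
```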
Done and done. The AI is now ready to roll. Let's test it.
It works! I wonder what in depth conversations will users have with it now tha-
Well, this is unfortunate. Checking my quota usage on OpenAI revealed that I'd used almost all of it:
I won't lie, it was a real bummer. I had more features in mind, such as the ability to differentiate between users. I also tested some other ideas in the playground, such as static code analysis and sentiment analysis, and I was planning to integrate them into some other projects. Unfortunately, I've had to put a stop to it all for the foreseeable future.
Conclusion - Impressive but limited
Much like me during university, GPT3 may appear extremely intelligent, but deeper probing reveals how mindless it really is. It is an extraordinary achievement, and with adequate tweaking it can produce some incredible results. It's user-friendly and very fast to prototype and "train". It won't replace programmers anytime soon, but I can certainly see Elon Musk's point that AI can augment people's work instead of replacing it. You can certainly use OpenAI to automate away dreary tasks, and quickly too. It's not artificial general intelligence, but it has moments when it makes you forget it's just a bot.
Downsides aside, it's an amazing step towards Artificial General Intelligence, and it suddenly makes the need to talk about AI ethics feel a lot more urgent. OpenAI has knocked it out of the park by making it affordable for small-scale developers and easy to use for companies with deep pockets, and I think we're gonna see some really neat projects as more and more people are granted access to the API. At the end of this week, I only wish I had more tokens.