Fine-Tuning GPT-2 from Human Preferences (openai.com)

71 points | by runesoerensen 1680 days ago | 5 comments

  • honoredb 1679 days ago
    The story of the bug that caused the AI to optimize for maximally disturbing text, and that went unchecked because the only people authorized to stop it were asleep, is a great illustration of how easy it is for an AI to "go evil" when you're not worrying about safety.
  • apolinario 1674 days ago
    Is there already a way for a regular person to fine-tune the 774M model in a similar fashion as it was possible for the 124M and 355M with gpt-2-simple?
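    For reference, the gpt-2-simple workflow for the smaller checkpoints looks roughly like the sketch below (the corpus path and step count are placeholders, not from the thread); whether the same recipe works for the 774M weights on a single consumer or Colab GPU is exactly the open question here.

```python
# Rough sketch of the gpt-2-simple workflow for the 124M/355M checkpoints;
# corpus filename and hyperparameters are placeholders, not from the thread.
import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="355M")      # fetch the pretrained checkpoint

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="my_corpus.txt",     # plain-text training file (placeholder)
              model_name="355M",
              steps=1000)                  # adjust to taste

gpt2.generate(sess, prefix="Once upon a time")
```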
  • ayw 1679 days ago
    founder of Scale (scale.com) here! We worked with OpenAI to produce the human preferences to power this research, and are generally very excited about it :)
    • gwern 1679 days ago
      Any thoughts about offering this as a service? There are lots of hobbyists who have been playing around with GPT-2 text generation, and it'd be sweet if you could just fire up a simple form URL with two text snippets, two options, and it trains on feedback.
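      A minimal sketch of the kind of pairwise-preference endpoint being suggested might look like this (Flask is an arbitrary choice; none of this is Scale's or OpenAI's actual tooling):

```python
# Minimal sketch of a pairwise-preference collection endpoint; an assumption
# for illustration, not an existing Scale or OpenAI service.
from flask import Flask, request, jsonify

app = Flask(__name__)
preferences = []  # in-memory store; a real service would persist these

@app.route("/compare", methods=["POST"])
def compare():
    data = request.get_json()
    preferences.append({
        "prompt": data["prompt"],
        "option_a": data["option_a"],
        "option_b": data["option_b"],
        "chosen": data["chosen"],  # "a" or "b", as picked by the user
    })
    # Periodically, these (prompt, pair, choice) records would be used to
    # train a reward model, as in the OpenAI post.
    return jsonify(count=len(preferences))

if __name__ == "__main__":
    app.run()
```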
      • ayw 1679 days ago
        It's a good idea! They didn't demonstrate a lot of the inputs as the models were training, but what they did show was very entertaining, of course.
    • GregoryPerry 1679 days ago
      Super great! Perhaps you would actually release your GPT-2 774M model instead of the "OpenAI way of just talking about it" for media exposure purposes :)
  • jeffshek 1679 days ago
    In regards to improving on the original dataset of the 345M/774M models, I've been running into this while building an open-sourced tool at https://writeup.ai

    I'm not certain how random users feel knowing their selection is being used to improve an algorithm. In this case, it's relatively easy for me to log the prompt and the option that was selected -- but just doing it felt a little ... bad, and I'm a little too scared of GDPR for a fun project.
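    A rough sketch of the kind of selection logging described above; the function and field names are assumptions, not taken from writeup.ai's codebase.

```python
# Rough sketch of logging a (prompt, candidate completions, user choice)
# record; names are illustrative, not from writeup.ai.
import datetime
import json

def log_selection(prompt, options, chosen_index, log_path="selections.jsonl"):
    """Append one preference record to a JSON-lines file."""
    record = {
        "timestamp": datetime.datetime.utcnow().isoformat(),
        "prompt": prompt,
        "options": options,      # list of generated completions shown
        "chosen": chosen_index,  # index of the option the user picked
        # Deliberately no user identifier, which is the GDPR-shaped worry.
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
```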

    The other thing is that humans sometimes select the funnier option even though it may not necessarily be the best one. In a show-and-tell post on Reddit, the most upvoted response involves the Game of Thrones character Tyrion and a brothel story.

    https://www.reddit.com/r/FanFiction/comments/d5s9yh/i_made_a...

    Open-sourced Code: https://github.com/jeffshek/writeup-frontend

    • Jack000 1679 days ago
      Just curious, how are you hosting this? The cheapest GCP GPU instance is $0.50/hr, which is pretty expensive long-term.
      • jeffshek 1679 days ago
        GCP instances that scale up and down based on usage, plus a lot of image freezing (pre-baked machine images) for faster boots, etc.

        One hack: for the medium-sized models, you can actually run inference on Cascade Lake CPUs (which are somewhat better optimized for ML than older processors) instead of a GPU. There's about a 30% performance gain there. When you do the math on GPU vs. CPU performance at inference time, you're paying an annoying premium for the GPU!
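        For a sense of what CPU-only inference looks like in practice, here is a rough sketch using the Hugging Face transformers library; this is not writeup.ai's actual serving code, and the model name, thread count, and sampling settings are placeholders.

```python
# Rough sketch of CPU-only GPT-2 medium inference; not writeup.ai's serving
# stack. Thread count and sampling parameters are placeholders.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

torch.set_num_threads(8)  # roughly match the instance's vCPU count

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")
model.eval()

prompt = "The medieval fair was"
input_ids = tokenizer.encode(prompt, return_tensors="pt")
with torch.no_grad():
    output = model.generate(input_ids, max_length=60, do_sample=True, top_k=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```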

        Right now, the default writing style "medium" is running on Cascade Lake (no GPU). I also (over)optimized the microservices running the endpoints.

        • Jack000 1679 days ago
          Didn't know about Cascade Lake, thanks! Every time I did the numbers, it didn't add up for cloud GPU. I could buy a new gaming desktop, serve the model from that, and make my money back in 2-3 months.
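          A back-of-envelope version of that break-even estimate; the $0.50/hr rate comes from the parent comment, and the desktop price is a made-up figure.

```python
# Back-of-envelope break-even: cloud GPU rate from the parent comment,
# desktop price is a hypothetical figure for a gaming PC with a consumer GPU.
gpu_instance_per_hour = 0.50
cloud_per_month = gpu_instance_per_hour * 24 * 30   # ~$360/month if left running 24/7
desktop_price = 1000                                 # hypothetical one-time purchase
print(desktop_price / cloud_per_month)               # ~2.8 months to break even
```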
  • mordae 1679 days ago
    TL;DR Can you give me a summary? :-)