
How Large Language Models Reflect Human Judgment


Artificial intelligence is built around prediction. But decision making requires both prediction and judgment. That leaves a role for people: supplying judgment about which outcomes are better and which are worse. Large language models, however, represent a significant advance. OpenAI has found a way to teach its AI human judgment through a simple form of human feedback, delivered via chat. That opens the door to a new way for people to work with AI, essentially communicating to the machine which outcomes are better or worse for any given type of decision.

Artificial intelligences are prediction machines. They can tell you it's likely to rain today, but they can't tell you whether you should pack an umbrella. That's because the umbrella decision requires more than a prediction: if the chance of rain is 10%, some people will choose to carry an umbrella and others will not. Why do different people behave differently even when faced with the same information? Because they want different things. In this case, some people care more than others about staying dry. Only you, or someone who knows you well, can weigh the costs and benefits of carrying an umbrella. Making that decision requires judgment based on your preferences.
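The umbrella example can be made concrete. Here is a minimal sketch, with hypothetical payoff numbers, of how the same AI prediction combines with different personal preferences to yield different decisions:

```python
# The AI supplies the prediction; each person supplies the judgment,
# expressed here as hypothetical costs (in arbitrary "annoyance" units).

def carry_umbrella(p_rain, cost_carry, cost_wet):
    """Carry the umbrella when the expected cost of getting wet
    exceeds the certain cost of carrying it around."""
    return p_rain * cost_wet > cost_carry

p_rain = 0.10  # the prediction: a 10% chance of rain

# Person A hates getting wet; Person B hates lugging an umbrella.
print(carry_umbrella(p_rain, cost_carry=1, cost_wet=20))  # True
print(carry_umbrella(p_rain, cost_carry=1, cost_wet=5))   # False
```

The prediction (`p_rain`) is identical in both calls; only the judgment about costs differs, and that is what flips the decision.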

AIs excel at prediction, but they lack judgment. Certainly, there are many decisions where the rewards are known or easy to specify. We know what a driver should do in most situations (accelerate, brake, turn) because we know the consequences of doing the wrong thing. But ask Google for advice about a new dishwasher, and the best it can do is predict which pages are likely to contain the information you're looking for. It cannot tell you what to do. Likewise, even if you worry that your employer will use an AI to determine whether you should be fired, the machine is not ultimately responsible. The AI can provide a prediction about your performance, but your employer applies the judgment and decides whom to fire.

This is why, in our 2018 book Prediction Machines, we saw a role for reward function engineers, whose job is to determine the rewards to assign to various actions, given the predictions made by the AI. AI prediction can improve so many decisions that reward function engineers who understand both prediction and decision-making under uncertainty should be a valued complement as AI adoption continues to accelerate. But innovation in reward function engineering has been slow. There had been little progress in developing tools to codify appropriate human judgment in machines before deploying them at scale.

Until recently, that is. Large language models (LLMs) are, for all their seeming intelligence, still prediction machines. But they are changing the way AI helps make decisions because they are changing the way human judgment gets into the machine.

Ask ChatGPT to rewrite a paragraph more clearly for a given audience, and it doesn't give you options or a lecture on grammar and rhetoric. It gives you a paragraph. That is impressive, but the real miracle is that ChatGPT can write the paragraph you want. A host of reward and risk considerations go into writing a paragraph. Is the writing honest (consistent with the facts), harmless (free of words that may offend), and helpful (achieving the purpose of the paragraph)? Just think about that last one. These models are trained on existing human writing. The paragraph produced is, at its core, generated by repeated application of a form of "autocomplete." When our phones autocomplete, they do a good but not perfect job. So how does ChatGPT produce writing that is better than the average person's? How does ChatGPT judge quality across all the content, good and bad, that it is trained on? More basically, why didn't it become the toxic cesspool that Microsoft's Tay chatbot became after one day on Twitter?

Some people, like Stephen Wolfram, believe that LLMs have uncovered fundamental rules of grammar. Yes, that makes the writing readable, but it certainly doesn't make it clear and compelling.

A 2022 paper from OpenAI researchers provides an important clue. The paper describes how a raw, unmodified LLM is taken and its outputs shown to real people. These people are asked to use their judgment to rank multiple alternative outputs to the same prompt. The ranking criteria are carefully specified (i.e., helpfulness, honesty, and harmlessness are prioritized). It turns out that with clear instructions and some training, different people can largely agree on such rankings.
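One practical detail of this ranking step: a single human ranking of several alternative outputs implicitly contains many pairwise preferences, which is the form the training signal typically takes. A minimal sketch (the function name is illustrative, not from the paper):

```python
from itertools import combinations

def ranking_to_pairs(responses_best_first):
    """Expand one human ranking of alternative responses into
    (preferred, rejected) pairs. combinations() preserves input
    order, so the first element of each pair is the better one."""
    return list(combinations(responses_best_first, 2))

pairs = ranking_to_pairs(["response A", "response C", "response B"])
# A ranking of 3 responses yields 3 pairs:
# [('response A', 'response C'), ('response A', 'response B'),
#  ('response C', 'response B')]
```

A ranking of k responses thus yields k·(k−1)/2 training pairs, which is part of why a few thousand ranked comparisons go a long way.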

Those rankings are then used to tweak, or "fine tune," the model. The model learns human judgment through positive and negative reinforcement: responses that rank high receive a positive adjustment, and those that don't receive a negative one. Remarkably, even for a model trained on billions of pages, a few thousand examples of human judgment, in the form of ranked responses, are enough for the AI to begin producing highly ranked results across the board. This happens even for prompts far removed from anything the evaluators ranked. In a sense, the human judgment of writing quality propagates throughout the model.
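The "positive boost / negative hit" intuition corresponds to a standard pairwise loss used when training a reward model on human preferences. A toy sketch of that loss (scalar scores here stand in for a real model's outputs):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def pairwise_loss(score_preferred, score_rejected):
    """Bradley-Terry style loss: small when the model scores the
    human-preferred response above the rejected one, large when
    it gets the order wrong. Minimizing this pushes preferred
    responses up and rejected ones down."""
    return -math.log(sigmoid(score_preferred - score_rejected))

# A model that agrees with the human ranking incurs little loss...
print(round(pairwise_loss(2.0, -1.0), 3))  # 0.049
# ...while one that disagrees incurs a large loss.
print(round(pairwise_loss(-1.0, 2.0), 3))  # 3.049
```

Gradient descent on this loss over many (preferred, rejected) pairs is what spreads the evaluators' judgment through the model's parameters.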

The evaluators are, in effect, reward function engineers. But unlike with a statistical model, whose workings may be opaque to most people, interacting with an LLM happens in plain language, so anyone can help teach the model judgment. In other words, anyone who can speak or type can be a reward function engineer. The surprising discovery behind ChatGPT is that with relatively little effort, reward function engineers were able to train an LLM to be helpful and safe. This is how OpenAI was able to launch a consumer-facing model that didn't suffer the flaws of its predecessors when released into the wild. This simple method of codifying human judgment into machines supercharges AI performance. The machine can now not only predict likely word sequences, but also use the reward function engineers' judgment to shape those sequences so they appeal to readers. Discovering an easy way for machines to absorb human judgment, the complement to AI prediction in determining the risks and rewards of a wide variety of situations, makes all the difference.

For many decisions, specialized reward function engineers are needed to deploy AI prediction machines at scale. Discovering this intuitive way of codifying human judgment into a machine, refining it through reinforcement learning from human feedback, could unlock many valuable AI applications where human judgment is hard to codify in advance but easy to apply once you see the output.
