The path to next-generation search

Google has announced a breakthrough in its effort to create an AI architecture that can handle millions of different tasks, including complex learning and reasoning. The new system is called the Pathways Language Model, or PaLM.

PaLM surpasses the current state of the art in AI and can even beat humans on some language and reasoning tests.

But the researchers also note that they cannot eliminate the limitations inherent in large-scale language models, which can inadvertently lead to negative ethical outcomes.

Background information

The next few sections provide background that clarifies what this algorithm is about.

Few-shot learning

Few-shot learning is considered the next stage of machine learning, one that moves beyond deep learning.

Google Brain researcher Hugo Larochelle (@hugo_larochelle) explained in a presentation titled Generalizing From Few Examples With Meta-Learning (video) that the problem with deep learning is that it requires collecting huge amounts of data, which in turn requires a significant amount of human labor.

He noted that deep learning is probably not the path to an AI that can solve many problems, because deep learning requires millions of training examples for each ability the AI is meant to learn.

Larochelle explains:

“…the idea is that we’re going to try to attack this problem very directly, this problem of few-shot learning, which is the problem of generalizing from small amounts of data.

…the main idea that I’ll present is that instead of trying to define what that learning algorithm is, and using our intuition as to which algorithm is the right one for doing few-shot learning, we actually try to learn that algorithm in an end-to-end way.

And that’s why we call it learning to learn, or as I like to call it, meta-learning.”

The goal of the few-shot approach is to approximate the way humans learn different things and can combine different kinds of knowledge to solve new problems they have never encountered before.

The advantage, then, is a machine that can draw on all the knowledge it has in order to solve new problems.

In the case of PaLM, one example of this capability is its ability to explain a joke it has never encountered before.
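To make the idea concrete, here is a minimal sketch of how few-shot prompting works in practice: instead of fine-tuning the model on thousands of labeled examples, a handful of worked examples are placed directly in the prompt and the model is asked to continue the pattern. The example questions and the `some_model.generate` call below are illustrative assumptions, not part of the PaLM paper.

```python
# A minimal sketch of few-shot prompting. The model is never retrained;
# it only sees a handful of worked examples inside the prompt itself.

FEW_SHOT_EXAMPLES = [
    ("I have 3 apples and buy 2 more. How many apples do I have?",
     "The answer is 5."),
    ("A week has 7 days. How many days are in 3 weeks?",
     "The answer is 21."),
]

def build_few_shot_prompt(new_question: str) -> str:
    """Concatenate a few solved examples, then the new question."""
    parts = [f"Q: {q}\nA: {a}" for q, a in FEW_SHOT_EXAMPLES]
    parts.append(f"Q: {new_question}\nA:")
    return "\n\n".join(parts)

prompt = build_few_shot_prompt(
    "Roger has 5 tennis balls. He buys 2 more cans of 3 balls each. "
    "How many tennis balls does he have now?"
)
print(prompt)
# completion = some_model.generate(prompt)  # hypothetical model call
```

The point of the sketch is that the "training" for the new task happens entirely inside the prompt, which is why few-shot learning needs so few task-specific examples.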

Pathways AI

In October 2021, Google published an article outlining the goals of a new AI architecture called Pathways.

Pathways represented a new chapter in the ongoing development of AI systems.

The usual approach has been to create algorithms that are trained to do one specific thing very well.

The Pathways approach is to create a single AI model that can solve all problems by learning how to solve them, thereby avoiding the less efficient approach of training thousands of algorithms to perform thousands of different tasks.

According to the Pathways document:

“Instead, we’d like to train one model that can not only handle many separate tasks, but also draw upon and combine its existing skills to learn new tasks faster and more effectively.

That way, what a model learns by training on one task – say, learning how aerial images can predict the elevation of a landscape – could help it learn another task – say, predicting how flood waters will flow through that terrain.”

Pathways laid out Google’s path forward for taking AI to the next level and closing the gap between machine learning and human learning.

Google’s latest model, the Pathways Language Model (PaLM), is that next step, and according to this new research paper, PaLM represents significant progress in AI.

What makes Google PaLM notable

PaLM scales up the few-shot learning process.

According to the research paper:

“Large language models have been shown to achieve remarkable performance across a variety of natural language tasks using few-shot learning, which drastically reduces the number of task-specific training examples needed to adapt the model to a particular application.

To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call the Pathways Language Model (PaLM).”

Many scientific papers describe algorithms that perform no better than the current state of the art, or that achieve only incremental improvements.

This is not the case with PaLM. The researchers claim significant improvements over the current best models, and even better-than-human results on some tests.

That level of success is what makes this new algorithm notable.

Researchers write:

“We demonstrate continued benefits of scaling by achieving state-of-the-art few-shot learning results on hundreds of language understanding and generation benchmarks.

On a number of these tasks, PaLM 540B achieves breakthrough performance, outperforming the fine-tuned state of the art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark.

A significant number of BIG-bench tasks showed discontinuous improvements from model scale, meaning that performance increased steeply as we scaled to our largest model.”

PaLM surpasses the state of the art on English natural language processing tasks, and that is what makes PaLM important and notable.

On a collaborative benchmark called BIG-bench, which consists of more than 150 tasks (related to reasoning, translation, and question answering), PaLM outperformed the prior state of the art, although there were areas where it fell short.

It is worth noting that humans still outperformed PaLM on 35% of the tasks, particularly mathematical ones (see Section 6.2, BIG-bench, of the research paper, page 17).

PaLM was better at translating from other languages into English than at translating from English into other languages. The researchers said this is a common problem that could be addressed by prioritizing more multilingual data.

Nevertheless, PaLM largely outperformed other language models and, in many cases, humans.

Ability to reason

Of particular note was its performance on arithmetic tasks and commonsense reasoning.

Example of an arithmetic task:

Question:
Roger has 5 tennis balls. He buys 2 more cans of tennis balls. Each can has 3 tennis balls. How many tennis balls does he have now?

Answer:
The answer is 11.
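As a quick sanity check on the worked example above, the arithmetic behind the model’s answer can be spelled out in a few lines of plain Python (this is ordinary arithmetic, nothing model-specific):

```python
# 5 existing balls plus 2 cans of 3 balls each.
existing_balls = 5
cans_bought = 2
balls_per_can = 3

total = existing_balls + cans_bought * balls_per_can
print(total)  # prints 11, matching the model's answer
```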

An example of commonsense reasoning:

Question:
Sean was in a rush to get home, but the light turned yellow and he was forced to do what?
Possible answers: (a) take time (b) dawdle (c) go slowly (d) ocean (e) slow down

Answer:
The answer is (e) slow down.

Generating explanations

Another remarkable example of how advanced PaLM is comes from tests of explanation generation.

Here is how the research paper explains it:

“The goal of this section is to showcase PaLM’s ability to generate explanatory language for prompts that require a complex combination of logical inference chaining, world knowledge, abstract language understanding, and lexicographic language understanding (such as puns).”

Explaining jokes

Here is an example of the language model explaining a joke:

Joke:
I was going to fly to visit my family on April 6th. My mom said, “Oh great, your step dad’s poetry reading is that night!”

So now I’m flying in on April 7th.

Model explanation:
The joke is that the speaker’s mother is trying to get them to go to their step dad’s poetry reading, but the speaker doesn’t want to go, so they are changing their flight to the day after the poetry reading.

Inference chaining

Here is an example of inference chaining, where multiple concepts and entities must be correctly combined and weighed to produce an answer.

Input:
Guido was writing a Python program. He keeps getting confused by variable scope and looking it up on StackOverflow.

Is it possible that the Guido in the prompt refers to Guido van Rossum?

Model output:
Guido van Rossum is the creator of Python. He would not need to look up variable scope on StackOverflow. The answer is no.

Here is another example:

Input:
Jennifer looked out her window and sees a really cool cloud below her. She unbuckles her seat belt and heads to the bathroom. Is Jennifer probably traveling more than 300 miles per hour relative to the Earth?

Model output:
300 miles per hour is about 480 km/h. This is about the speed of a commercial airplane. Clouds are usually below airplanes, so Jennifer is probably on an airplane.

The answer is yes.
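As a quick check on the unit conversion in the model’s output, 300 miles per hour can be converted to kilometres per hour in plain Python (using the standard factor of 1.609344 km per mile):

```python
# Convert the speed mentioned in the prompt to km/h.
KM_PER_MILE = 1.609344

speed_mph = 300
speed_kmh = speed_mph * KM_PER_MILE
print(round(speed_kmh))  # prints 483, consistent with "about 480 km/h"
```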

The next generation search engine?

The above examples of PaLM’s ability to perform complex reasoning demonstrate how a next-generation search engine might answer complex questions by drawing on knowledge from the Internet and other sources.

Achieving an AI architecture that can provide answers that reflect the world around us is one of the stated goals of Google Pathways, and PaLM is a step in that direction.

However, the authors of the study stress that PaLM is not the last word on AI and search. They make clear that PaLM is a first step toward the kind of next-generation search engine that Pathways makes possible.

Before moving on, there are two jargon terms that are important for understanding what PaLM is:

  • Modalities
  • Generalization

The word “modality” refers to how things are experienced or the form in which they exist, such as text that is read, images that are seen, or audio that is listened to.

The word “generalization,” in the context of machine learning, refers to the ability of a language model to solve tasks it has not previously been trained on.

The researchers noted:

“PaLM is just the first step in our vision towards building Pathways as the future of ML scaling at Google and beyond.

We believe PaLM demonstrates a strong foundation toward our ultimate goal of developing a large-scale, modularized system that will have broad generalization capabilities across multiple modalities.”

Real risks and ethical considerations

Something else notable about this research paper is the researchers’ warning about ethical considerations.

They note that large-scale language models trained on web data absorb many of the “toxic” stereotypes and social biases circulating on the Internet, and they state that PaLM is not immune to these undesirable influences.

The research paper cites a 2021 paper that explores how large-scale language models can contribute to the following harms:

  1. Discrimination, exclusion and toxicity
  2. Information hazards
  3. Misinformation harms
  4. Malicious uses
  5. Human-computer interaction harms
  6. Automation, access, and environmental harms

Finally, the researchers note that PaLM does reflect toxic social stereotypes, and they make clear that filtering out these biases is difficult.

PaLM researchers explain:

“Our analysis reveals that our training data, and consequently PaLM, do reflect various social stereotypes and toxicity associations around identity terms.

Removing these associations, however, is non-trivial… Future work should look into effectively dealing with such undesirable biases in data, and their influence on model behavior.

In the meantime, any real-world use of PaLM for downstream tasks should perform further contextualized fairness evaluations to assess the potential harms and introduce appropriate mitigation and protections.”

PaLM can be seen as a glimpse of what next-generation search may look like. PaLM makes remarkable claims of surpassing the state of the art, but the researchers also say there is still much work to be done, including finding ways to mitigate the harmful spread of misinformation, toxic stereotypes, and other undesirable outcomes.

Citations

Read Google’s AI blog post about PaLM:

Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance

Read Google’s research paper on PaLM:

PaLM: Scaling Language Modeling with Pathways (PDF)
