
Google’s new trillion-parameter AI language model is almost 6 times bigger than GPT-3

A trio of researchers from the Google Brain team recently unveiled the next big thing in AI language models: a massive one-trillion-parameter transformer system.

The next biggest model out there, as far as we’re aware, is OpenAI’s GPT-3, which uses a measly 175 billion parameters.

Background: Language models are capable of performing a variety of functions, but perhaps the most popular is the generation of novel text. For example, you can go here and talk to a “philosopher AI” language model that’ll attempt to answer any question you ask it (with many notable exceptions).

While these incredible AI models exist at the cutting edge of machine learning technology, it’s important to remember that they’re essentially just performing parlor tricks. These systems don’t understand language; they’re just fine-tuned to make it look like they do.

That’s where the number of parameters comes in – the more virtual knobs and dials you can twist and tune to achieve the desired outputs, the more fine-grained control you have over what that output is.
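To put those knob counts in perspective, here’s a back-of-the-envelope sketch. Each weight in a layer is one tunable parameter; the layer size below is a made-up toy value, while the headline model sizes are the ones quoted in this article.

```python
# Toy illustration: every weight in a dense layer is one "knob".
def dense_params(n_in, n_out):
    # weight matrix plus one bias per output unit
    return n_in * n_out + n_out

print(dense_params(1024, 1024))  # ~1M parameters in one modest layer

# Headline figures from the article:
gpt3 = 175_000_000_000        # GPT-3, 175 billion parameters
switch = 1_000_000_000_000    # Google's model, one trillion parameters
print(switch / gpt3)          # the "almost 6 times bigger" in the headline
```

The ratio works out to roughly 5.7, which is where the headline’s “almost 6 times” comes from.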

What Google‘s done: Put simply, the Brain team has figured out a way to make the model itself as simple as possible while squeezing in as much raw compute power as possible to make the increased number of parameters feasible. In other words, Google has a lot of money, and that means it can afford to use as much hardware compute as the AI model can conceivably harness.

In the team’s own words:

Switch Transformers are scalable and effective natural language learners. We simplify Mixture of Experts to produce an architecture that is easy to understand, stable to train and vastly more sample efficient than equivalently-sized dense models. We find that these models excel across a diverse set of natural language tasks and in different training regimes, including pre-training, fine-tuning and multi-task training. These advances make it possible to train models with hundreds of billions to a trillion parameters and which achieve substantial speedups relative to dense T5 baselines.
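The core trick behind that quote is routing: each token is sent to exactly one “expert” feed-forward block, so total parameters grow with the number of experts while per-token compute stays roughly constant. Here is a minimal NumPy sketch of that top-1 switch routing idea; all shapes, names, and values are illustrative assumptions, not the paper’s actual code.

```python
# Minimal sketch of top-1 "switch" routing (assumed toy shapes).
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, n_tokens = 8, 4, 5

# Each expert is its own feed-forward weight matrix. Total parameters
# scale with n_experts, but each token only touches one expert.
experts = rng.normal(size=(n_experts, d_model, d_model))
router = rng.normal(size=(d_model, n_experts))  # router weights
tokens = rng.normal(size=(n_tokens, d_model))

# The router picks exactly one expert per token (top-1), unlike
# classic Mixture of Experts, which blends the top-k experts.
logits = tokens @ router
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
choice = probs.argmax(axis=1)

out = np.empty_like(tokens)
for i, e in enumerate(choice):
    # Scaling by the router probability is how the router receives a
    # gradient signal in a real, differentiable implementation.
    out[i] = probs[i, e] * (tokens[i] @ experts[e])

print(out.shape)  # one output vector per token
```

Because only one expert runs per token, doubling the expert count doubles the parameter count without doubling the per-token math, which is how the team reaches a trillion parameters without a proportional compute bill.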

Quick take: It’s unclear exactly what this means or what Google intends to do with the techniques described in the pre-print paper. There’s more to this model than just one-upping OpenAI, but exactly how Google or its clients could use the new system is a bit muddy.

The big idea here is that enough brute force will lead to better compute-use techniques, which will in turn make it possible to do more with less compute. But the current reality is that these systems don’t tend to justify their existence when compared to greener, more useful technologies. It’s hard to pitch an AI system that can only be operated by trillion-dollar tech companies willing to ignore the massive carbon footprint a system this big creates.

Context: Google‘s pushed the limits of what AI can do for years, and this is no different. Taken by itself, the achievement appears to be the logical progression of what’s been happening in the field. But the timing is a bit suspect.

Published January 13, 2021 — 17:08 UTC
