AI's Smarts Now Come With a Big Price Tag

Calvin Qi, who works at a search startup called Glean, would love to use the latest artificial intelligence algorithms to improve his company's products.

Glean provides tools for searching through applications such as Gmail, Slack and Salesforce. Qi said new AI techniques for language analysis would help Glean’s clients find the right files or conversations much faster.

But training such a sophisticated AI algorithm costs millions of dollars. So Glean uses smaller, less capable AI models that can’t extract as much meaning from text.

“It’s hard for smaller places with smaller budgets to get the same level of results,” Qi said. The most powerful AI models are “out of the question,” he says.

AI has produced exciting breakthroughs over the past decade: programs that can beat people at complex games, drive on city streets under certain conditions, respond to spoken commands, and write coherent text based on short prompts. Writing in particular relies on recent advances in computers' ability to analyze and manipulate language.

These advances are largely the result of feeding algorithms more text as examples to learn from, and giving them more chips with which to digest it. And that costs money.

Consider OpenAI’s language model GPT-3, a large, mathematically simulated neural network that was fed reams of text scraped from the web. GPT-3 can find statistical patterns that predict, with striking coherence, which words should follow others. Out of the box, GPT-3 is significantly better than previous AI models at answering questions, summarizing text, and correcting grammatical errors. By one measure, it is 1,000 times more capable than its predecessor, GPT-2. But training GPT-3 cost, according to some estimates, about $5 million.

“If GPT-3 were accessible and cheap, it would totally supercharge our search engine,” Qi said. “That would be really, really powerful.”

The spiraling cost of training advanced AI is also a problem for established companies looking to build their AI capabilities.

Dan McCreary leads a team in a division of Optum, a health IT company, that uses language models to analyze transcripts of calls in order to identify high-risk patients or recommend referrals. He said even training a language model one-thousandth the size of GPT-3 can quickly consume the team's budget. Models need to be trained for specific tasks, and it can cost up to $50,000, paid to cloud computing companies to rent their computers and programs.

Cloud computing providers have little reason to reduce costs, McCreary said. “We cannot trust that cloud providers are working to lower the costs for us building our AI models,” he says. He is looking into buying specialized chips designed to speed up AI training.

Part of the reason AI has progressed so fast recently is that many academic labs and startups could download and use the latest ideas and techniques. Algorithms that led to breakthroughs in image processing, for example, originated in academic labs and were developed using off-the-shelf hardware and openly shared data sets.

Over time, however, it has become increasingly clear that progress in AI is tied to an exponential increase in the underlying computer power.

Big companies have, of course, always had the advantage in terms of budget, scale and reach. And large-scale computer power is a prerequisite in industries such as drug discovery.

Now, some are pushing to scale things up further still. Microsoft said this week that, with Nvidia, it has created a language model twice as large as GPT-3. Researchers in China say they have created a language model that is four times larger than that.

“The cost of AI training is skyrocketing,” said David Kanter, executive director of MLCommons, an organization that tracks the performance of chips designed for AI. He said the idea that bigger models can unlock valuable new capabilities can be seen in many areas of the technology industry. This may explain why Tesla is designing its own chips to train AI models for autonomous driving.

Some are concerned that the rising cost of tapping the latest and greatest technology could slow down innovation by reserving it for the largest companies and those that lease their tools.

“I think it does slow down innovation,” said Chris Manning, a Stanford professor who specializes in AI and language. “When we have only a handful of places where people can play with the innards of these models at that scale, that has to massively reduce the amount of creative exploration that happens.”
