Before OpenAI’s ChatGPT grabbed the world’s attention with its ability to generate compelling sentences, a small startup called Latitude was wowing consumers with AI Dungeon, a game that used artificial intelligence to create fantastical narratives based on players’ prompts.
But as AI Dungeon grew in popularity, Latitude CEO Nick Walton recalled, the cost of maintaining the text-based role-playing game began to skyrocket. AI Dungeon’s text-generation software was powered by OpenAI, the AI research lab backed by Microsoft. The more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.
To make matters worse, Walton discovered that content marketers were using AI Dungeon to generate promotional copy, a use of AI Dungeon his team never foresaw and one that added to the company’s AI bill.
At its peak in 2021, Walton estimates, Latitude was spending nearly $200,000 a month on OpenAI’s so-called generative AI software and on Amazon Web Services to serve the millions of user queries it needed to process each day.
“We joked that we have human employees and AI employees and spend about the same amount on each,” Walton said. “We spent hundreds of thousands of dollars a month on AI, and we weren’t a big startup, so it cost us a lot.”
By the end of 2021, Latitude had switched from OpenAI’s GPT software to cheaper but still capable language software from startup AI21 Labs, Walton said, adding that the company also incorporated open-source and free language models into the service to lower costs. Latitude’s generative AI bill is now under $100,000 a month, Walton said, and the startup charges players a monthly subscription for more advanced AI features to help offset the cost.
Latitude’s hefty AI bill highlights an uncomfortable truth behind the recent boom in generative AI technology: the cost of developing and maintaining the software can be extraordinarily high, both for the companies that build the large language models and other underlying technologies, commonly referred to as foundation models, and for those that use the AI to power their own software.
The high cost of machine learning is an uncomfortable reality in the industry, as venture capitalists eye companies that could potentially be worth trillions of dollars, and large corporations such as Microsoft, Meta, and Google use their considerable capital to develop a lead in the technology that smaller challengers can’t catch up to.
But if the high cost of computing leaves the margins for AI applications permanently smaller than earlier software-as-a-service margins, it could put a damper on the current boom.
The high cost of training and “inferring” (actually running) large language models at scale is a structural cost that differs from earlier computing booms. Even once the software is built or trained, running a large language model requires enormous computing power, because the model performs billions of calculations every time it responds to a prompt. By comparison, serving a web app or page requires far less computation.
These calculations also require specialized hardware. Traditional computer processors can run machine learning models, but they’re slow. Most training and inference now takes place on graphics processing units (GPUs), which were originally intended for 3D gaming but have become the standard for AI applications because they can perform many simple calculations simultaneously.
Nvidia makes most of the GPUs for the AI industry, and its flagship data center chip costs $10,000. Scientists who build these models often joke that they “melt GPUs.”
Training the models
Nvidia’s A100 processor. (Image: Nvidia)
Analysts and engineers estimate that the critical process of training a large language model such as GPT-3 costs more than $4 million. Training more sophisticated language models could climb into the “high-single-digit millions” of dollars, said Forrester analyst Rowan Curran, who specializes in AI and machine learning.
For example, Meta’s largest LLaMA model, released last month, was trained on 1.4 trillion tokens (750 words is about 1,000 tokens) using 2,048 Nvidia A100 GPUs, and the run took about 21 days, the company said when it released the model.
That works out to about 1 million GPU hours, which at dedicated AWS pricing would cost more than $2.4 million. And at 65 billion parameters, LLaMA is smaller than OpenAI’s current GPT models, such as GPT-3, which has 175 billion parameters.
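As a sanity check, the GPU-hour arithmetic behind those figures can be reproduced in a few lines of Python. The per-GPU-hour rate below is an assumed figure chosen for illustration, not a published AWS price:

```python
# Back-of-the-envelope cost of the LLaMA training run described above.
gpus = 2048              # Nvidia A100 GPUs reported by Meta
days = 21                # reported training duration
usd_per_gpu_hour = 2.40  # assumed dedicated-instance rate (illustrative)

gpu_hours = gpus * days * 24
cost = gpu_hours * usd_per_gpu_hour

print(f"{gpu_hours:,} GPU hours")  # about 1 million GPU hours
print(f"${cost:,.0f}")             # in the ballpark of the $2.4 million estimate
```

Under those assumptions the run comes to roughly a million GPU hours and about $2.5 million, consistent with the estimates above.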
Clement Delangue, CEO of AI startup Hugging Face, said the process of training the company’s Bloom large language model took more than two and a half months and required access to a supercomputer with “something like the equivalent of 500 GPUs.”
Organizations that build large language models must also be careful about how often they retrain the software to improve its abilities, because retraining is so expensive.
“It’s important to realize that these models are not trained all the time, every day,” Delangue said, noting that this is why some models, such as ChatGPT, lack knowledge of recent events. ChatGPT’s knowledge stops in 2021, he said.
“We are currently training version two of Bloom, and retraining will cost less than $10 million,” Delangue said. “So that’s something we don’t want to do every week.”
Inference, and who pays for it
Bing’s AI chat. (Image: Jordan Novet | CNBC)
To use a trained machine learning model to make predictions or generate text, engineers run the model in a process called “inference,” which can be much more costly than training, because a popular product requires millions of runs.
For a product as popular as ChatGPT, which investment firm UBS estimates reached 100 million monthly active users in January, Curran believes it may have cost OpenAI $40 million to process the millions of prompts people fed into the software that month.
When these tools are used billions of times a day, costs skyrocket. Financial analysts estimate that Microsoft’s Bing AI chatbot, which is powered by an OpenAI ChatGPT model, would need at least $4 billion of infrastructure to serve responses to all Bing users.
Latitude, for example, didn’t have to pay to train the underlying OpenAI language model it was accessing, but it did have to account for inference costs of about “0.5 cents per call” across “millions of requests per day,” according to a Latitude spokesperson.
“And I was being relatively conservative,” Curran said of his calculations.
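Per-call pricing at that scale adds up quickly. The daily call volume below is a hypothetical stand-in for the “millions of requests per day” the Latitude spokesperson described, used only to show the arithmetic:

```python
# Illustrative per-call inference-cost arithmetic.
cost_per_call = 0.005      # "0.5 cents per call"
calls_per_day = 1_000_000  # hypothetical volume; the article says "millions"

daily_cost = cost_per_call * calls_per_day
monthly_cost = daily_cost * 30

print(f"${daily_cost:,.0f} per day")      # $5,000 per day
print(f"${monthly_cost:,.0f} per month")  # $150,000 per month
```

Even at a single million calls a day, the half-cent rate produces a six-figure monthly bill, which is consistent with the spending Walton describes earlier in the article.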
To sow the seeds of the current AI boom, venture capitalists and tech giants have been investing billions of dollars in startups that specialize in generative AI technologies. Microsoft, for example, made a $10 billion investment in OpenAI, the overseer of GPT, according to media reports in January. Salesforce Ventures, the venture capital arm of Salesforce, recently launched a $250 million fund that caters to generative AI startups.
As investor Semil Shah of the VC firms Haystack and Lightspeed Venture Partners explained on Twitter, “VC dollars have moved from subsidizing taxi rides and burrito deliveries to LLMs and generative AI computing.”
Many entrepreneurs see risks in relying on potentially subsidized AI models that they don’t control and merely pay for on a per-use basis.
“When I talk to my AI friends at startup conferences, this is what I tell them: don’t rely solely on OpenAI, ChatGPT, or any other large language models,” said Suman Kanuganti, founder of Personal.ai, a chatbot currently in beta mode. “Because businesses shift, they are all owned by big tech companies, right? If they cut off access, you’re gone.”
Companies such as enterprise technology firm Conversica are exploring how they can use the technology through Microsoft’s Azure cloud service at its currently discounted price.
Conversica CEO Jim Kaskade declined to comment on how much the startup is paying, but he acknowledged that the subsidized cost is welcome as it explores how to use language models effectively.
“If they were really trying to make ends meet, they would have charged more,” Kaskade said.
How it could change
It is unclear whether AI computation will stay expensive as the industry develops. Companies making the foundation models, semiconductor manufacturers, and startups all see business opportunities in lowering the cost of running AI software.
Nvidia, which accounts for about 95% of the market for AI chips, continues to develop more powerful versions designed specifically for machine learning, but industry-wide gains in chip performance have slowed in recent years.
Still, Nvidia CEO Jensen Huang believes AI will be a million times more efficient within 10 years, thanks to improvements not just in chips, but in software and other computer parts.
“Moore’s Law, in its best days, would have delivered 100x in a decade,” Huang said on an earnings call last month. “By coming up with new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists and AI researchers on new models, across that entire span, we’ve made large language model processing a million times faster.”
Some startups are eyeing the high cost of AI as a business opportunity.
“No one has said: you should build something just for inference. What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system to save money on inference by doing more of the processing in the computer’s memory rather than on a GPU.
“Today, people are using GPUs, Nvidia GPUs, for most of their inference. The problem is that when workloads spike very quickly, as happened with ChatGPT, which reached a million users in five days, GPU capacity can’t keep up, because GPUs are built for training and for graphics acceleration,” he said.
Delangue, the Hugging Face CEO, believes many companies would be better served focusing on smaller, specific models that are cheaper to train and run than the large language models that attract the most attention.
Meanwhile, OpenAI announced last month that it is lowering the cost for companies to access its GPT models. It now charges one-fifth of a cent for about 750 words of output.
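That price makes for a rough comparison with the “0.5 cents per call” figure Latitude cited earlier. The assumptions below, that one call produces roughly 750 words of output and that traffic runs at a million calls a day, are illustrative, not reported figures:

```python
# Comparing OpenAI's announced price with the earlier half-cent-per-call cost.
new_price_per_call = 0.002  # one-fifth of a cent for ~750 words of output
old_price_per_call = 0.005  # the "0.5 cents per call" figure cited earlier
calls_per_day = 1_000_000   # hypothetical volume

new_daily = new_price_per_call * calls_per_day
old_daily = old_price_per_call * calls_per_day

print(f"new: ${new_daily:,.0f}/day vs. old: ${old_daily:,.0f}/day")
print(f"savings: {1 - new_price_per_call / old_price_per_call:.0%}")  # 60%
```

Under those assumptions, the new pricing would cut a comparable per-call bill by more than half, which helps explain Latitude’s interest in the change.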
OpenAI’s lower price has caught the attention of AI Dungeon maker Latitude.
“It’s no exaggeration to say that we are very excited about the big changes happening in the industry, and we are always evaluating how we can provide the best possible experience to our users,” a Latitude spokesperson said. “Latitude will continue to evaluate all AI models to ensure we deliver the best possible game.”
Watch: AI’s “iPhone moment” - separating ChatGPT hype from reality