Nvidia CEO Jensen Huang speaks at the press conference at MGM during CES 2018 in Las Vegas on January 7, 2018.
Mandel Ngan | AFP | Getty Images
Software that can write text and draw pictures that look like they were made by humans has sparked a gold rush in the tech industry.
Companies like Microsoft and Google are fighting to integrate cutting-edge AI into their search engines, while multibillion-dollar competitors such as OpenAI and Stable Diffusion race to release their software to the public.
Powering many of these applications is the Nvidia A100, a roughly $10,000 chip that has become one of the most important tools in the artificial intelligence industry.
The A100 is now the “workhorse” for artificial intelligence professionals, said Nathan Benaich, an investor who publishes a newsletter and report covering the AI industry, including a partial list of supercomputers using A100s. According to New Street Research, Nvidia holds 95% of the market for graphics processors that can be used for machine learning.
The A100 is ideally suited to the kind of machine learning models that power tools like ChatGPT, Bing AI, or Stable Diffusion. It can perform many simple calculations simultaneously, which is important for training and using neural network models.
The technology behind the A100 was originally used to render sophisticated 3D graphics in games. It is often called a graphics processor, or GPU, but these days Nvidia's A100 is geared toward machine learning tasks and runs in data centers rather than inside gaming PCs.
Large companies and start-ups working on software such as chatbots and image generators need hundreds or thousands of Nvidia chips, and either buy them outright or secure access to them through a cloud provider.
Hundreds of GPUs are required to train artificial intelligence models such as large language models. The chips must be powerful enough to quickly process terabytes of data and recognize patterns. After that, GPUs like the A100 are also needed for “inference,” i.e. using the model to generate text, make predictions, or identify objects in photos.
This means AI companies need access to many A100s. Some entrepreneurs in this space even see the number of A100s they have access to as a sign of progress.
“A year ago we had 32 A100s,” Stability AI CEO Emad Mostaque wrote on Twitter in January. “Dream big and stack moar GPUs kids. Brrr.” Stability AI, the company that helped develop Stable Diffusion, is reportedly valued at over $1 billion.
Stability AI now has access to over 5,400 A100 GPUs, according to one estimate from the State of AI report, which charts and tracks the companies and universities with the largest collections of A100 GPUs, although it does not include cloud providers, which do not publish their numbers.
Nvidia is on the AI train
Nvidia stands to benefit from the AI hype cycle. Wednesday's quarterly earnings report showed that overall sales fell 21%, but investors pushed the stock up about 14% on Thursday. This is largely because the company's AI chip business, which is reported as data center, grew 11% during the quarter to more than $3.6 billion in revenue, showing continued growth.
Nvidia’s stock is up 65% so far in 2023, outperforming the S&P 500 and other semiconductor stocks.
Nvidia CEO Jensen Huang couldn’t stop talking about AI during a conference call with analysts on Wednesday, suggesting the recent boom in artificial intelligence is at the heart of the company’s strategy.
“Activities around the AI infrastructure we built and inference using Hopper and Ampere to influence large language models have gone through the roof in the last 60 days,” Huang said. “Whatever our view is this year, there is no doubt that the last 60, 90 days have resulted in a pretty dramatic shift.”
Ampere is Nvidia’s codename for its A100 generation chips. Hopper is the codename for the new generation that includes the recently shipped H100.
More computers needed
Nvidia A100 processor
Compared with other kinds of software, such as serving web pages, which use processing power in occasional bursts lasting microseconds, machine learning tasks can take up an entire computer's processing power, sometimes for hours or days.
This means that companies with hit AI products often need to get more GPUs to handle peak periods or improve their models.
These GPUs aren't cheap. In addition to a single A100 on a card that can be inserted into existing servers, many data centers use systems that include eight A100 GPUs working in tandem.
This system, Nvidia's DGX A100, has an asking price of about $200,000, although it comes with the necessary chips. On Wednesday, Nvidia announced it would sell cloud access to DGX systems directly.
It's easy to see how the cost of A100s can add up.
For example, New Street Research estimates that the OpenAI-based ChatGPT model within Bing’s search may require eight GPUs to deliver responses to questions in less than a second.
At this rate, Microsoft would need more than 20,000 8-GPU servers just to roll out the model in Bing to everyone, suggesting the feature could cost Microsoft $4 billion in infrastructure spending.
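The $4 billion figure can be roughly reproduced from the numbers quoted above. The following is a back-of-the-envelope sketch, not New Street Research's actual model: it assumes each 8-GPU server costs about as much as a DGX A100 at its roughly $200,000 asking price, and it treats "more than 20,000" servers as exactly 20,000. The variable and function names are illustrative.

```python
# Rough check of the Bing infrastructure estimate using figures from the article.
SERVERS_NEEDED = 20_000          # "more than 20,000" 8-GPU servers, taken as a floor
GPUS_PER_SERVER = 8              # each server holds eight A100 GPUs
PRICE_PER_SERVER_USD = 200_000   # assumption: priced like a DGX A100 (about $200,000)


def fleet_cost(servers: int, price_per_server: int) -> int:
    """Total hardware cost in dollars for a fleet of identical servers."""
    return servers * price_per_server


total_gpus = SERVERS_NEEDED * GPUS_PER_SERVER
total_cost = fleet_cost(SERVERS_NEEDED, PRICE_PER_SERVER_USD)

print(f"GPUs required: {total_gpus:,}")
print(f"Estimated hardware cost: ${total_cost / 1e9:.1f} billion")
```

Under these assumptions the fleet comes to 160,000 GPUs, and the hardware alone lands at the $4 billion mark before power, networking, or staffing are counted.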
“If you're from Microsoft and you want to scale it on the scale of Bing, that's maybe $4 billion. If you want to scale on the scale of Google, which handles 8 or 9 billion queries every day, you actually need to spend $80 billion on DGXs,” said Antoine Chkaiban, a technology analyst at New Street Research. “The numbers we came up with are huge. But they simply reflect the fact that every user of such a large language model needs a large supercomputer while they are using it.”
The latest version of Stable Diffusion, an image generator, was trained on 256 A100 GPUs for a total of 200,000 compute hours, according to information posted online by Stability AI.
Mostaque, CEO of Stability AI, said on Twitter that training the model alone cost $600,000 at market prices, suggesting in a tweet exchange that the price was unusually cheap compared with rivals. This does not include the cost of “inference,” or deploying the model.
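The Stable Diffusion figures imply a per-hour price and a rough training duration. The sketch below is illustrative only. It assumes the 200,000 compute hours are GPU-hours and that all 256 GPUs ran concurrently for the whole job, neither of which the source states explicitly.

```python
# Implied unit cost and duration from the Stable Diffusion training figures.
GPUS = 256                    # A100 GPUs used for training
TOTAL_GPU_HOURS = 200_000     # assumption: "compute hours" means GPU-hours
TRAINING_COST_USD = 600_000   # market-price cost cited by Mostaque

# Dollars paid per GPU for each hour of use.
cost_per_gpu_hour = TRAINING_COST_USD / TOTAL_GPU_HOURS

# Elapsed time if every GPU ran in parallel for the entire job (assumption).
wall_clock_hours = TOTAL_GPU_HOURS / GPUS
wall_clock_days = wall_clock_hours / 24

print(f"Implied price: ${cost_per_gpu_hour:.2f} per GPU-hour")
print(f"Implied duration: {wall_clock_hours:.0f} hours (~{wall_clock_days:.1f} days)")
```

On those assumptions the training run works out to about $3 per GPU-hour and roughly a month of continuous use across the 256-GPU cluster.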
Nvidia CEO Huang said in an interview with CNBC’s Katie Tarasov that the company’s product is actually cheaper for the amount of computation this kind of model requires.
“We went from a $1 billion data center running CPUs down to a $100 million data center,” Huang said. “Today, if you put $100 million in the cloud and share it with 100 companies, it’s pretty much zero.”
Huang said Nvidia’s GPUs allow startups to train models at a much lower cost than using traditional computer processors.
“Now you can build something like a large language model like GPT for $10 million to $20 million,” Huang said. “It’s really, really affordable.”
Nvidia isn't the only company making GPUs for artificial intelligence. AMD and Intel have competing graphics processors, and big cloud companies like Google and Amazon are developing and deploying their own chips specifically designed for AI workloads.
Still, “AI hardware remains strongly consolidated to NVIDIA,” according to the State of AI compute report. As of December, over 21,000 open source AI papers said they used Nvidia chips.
Most researchers included in the State of AI Compute Index used the V100, Nvidia's chip that came out in 2017, but the A100 grew rapidly in 2022 to become the third most used Nvidia chip, just behind a consumer graphics chip originally intended for gaming.
The A100 also has the distinction of being one of the few chips that has been subject to export controls for national defense reasons. Last fall, Nvidia said in an SEC filing that the US government imposed a licensing requirement barring exports of the A100 and the H100 to China, Hong Kong and Russia.
“USG indicated that the new licensing requirements address the risk that covered products may be used or diverted for ‘military end uses’ or ‘military end users’ in China and Russia,” Nvidia said in its filing. Nvidia has previously said it adapted some of its chips for the Chinese market to comply with US export regulations.
The A100's fiercest competition may be its successor. The A100 was first introduced in 2020, an eternity ago in chip cycles. The H100, introduced in 2022, is starting to be produced in volume. In fact, Nvidia said Wednesday that in its quarter ending in January it recorded more revenue from H100 chips than from the A100, although the H100 is more expensive per unit.
According to Nvidia, the H100 is the first of its data center GPUs to be optimized for transformers, an increasingly important technique used by many of the latest and top AI applications. Nvidia said Wednesday that it wants to make AI training over 1 million percent faster. This could ultimately mean that AI companies won't need so many Nvidia chips.