• Cerebras Systems and Cirrascale Cloud Services have teamed up to offer a new service that allows users to train GPT-class models for a fraction of the cost of current providers, with as little as a few lines of code.
• The ability to generate software, images, and papers from simple text inputs makes generative artificial intelligence (AI) a potentially game-changing development in the history of technology.

Large language models (LLMs) are the talk of the artificial intelligence (AI) industry right now, but training them can be difficult and expensive: models with billions of parameters can take skilled engineers months of work to get running reliably and accurately.

With a new collaborative service from Cerebras Systems and Cirrascale Cloud Services, customers will be able to train GPT-class models far more affordably than with current providers, and with just a few lines of code.

“We believe that LLMs are under-hyped,” Andrew Feldman, CEO and co-founder of Cerebras Systems, said in a pre-briefing. “Within the next year, we will see a sweeping rise in the impact of LLMs in various parts of the economy.”

Also, generative AI may be one of the most significant technological advancements in recent history because it makes it possible to produce software, images and papers from simple text inputs.

Cerebras and AI content platform Jasper AI also recently announced a new agreement aimed at speeding up the deployment of generative AI and boosting its accuracy.

Unlocking new possibilities for research

Traditional cloud providers may struggle with LLMs because they cannot guarantee latency between large numbers of GPUs. According to Feldman, fluctuating latency creates difficult, time-consuming problems when distributing a sizable AI model among GPUs and leads to “large swings in time to train.”

With the new Cerebras AI Model Studio, hosted on the Cirrascale AI Innovation Cloud, users can now train generative pre-trained Transformer (GPT)-class models, such as GPT-J, GPT-3, and GPT-NeoX, on Cerebras Wafer-Scale Clusters. This includes the recently announced Andromeda AI supercomputer.

According to Feldman, users can select from cutting-edge GPT-class models with 1.3 billion to 175 billion parameters, complete training eight times more quickly than on an A100 and pay half as much as standard cloud providers.

For instance, the Cerebras AI Model Studio cuts the time to train GPT-J from scratch to just eight days, compared with an average of 64 days on a conventional cloud. Similarly, production costs on traditional clouds can reach USD 61,000 for GPUs alone, compared with USD 45,000 for the entire production run on Cerebras.
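
As a quick check on the figures quoted above, the arithmetic can be sketched in a few lines of Python. The variable names are illustrative assumptions; the numbers are the ones cited in the article, and the cloud figure covers GPUs alone, so the two costs are not strictly like-for-like.

```python
# Figures as quoted in the article; variable names are illustrative only.
cloud_days = 64              # average GPT-J training time on a conventional cloud
cerebras_days = 8            # GPT-J training time on the Cerebras AI Model Studio
cloud_gpu_cost_usd = 61_000  # traditional cloud, GPUs alone
cerebras_cost_usd = 45_000   # entire production run on Cerebras

speedup = cloud_days / cerebras_days
cost_ratio = cerebras_cost_usd / cloud_gpu_cost_usd
saving_pct = (1 - cost_ratio) * 100

print(f"Training speedup: {speedup:.0f}x")
print(f"Cerebras cost as a share of cloud GPU cost: {cost_ratio:.0%}")
print(f"Saving versus cloud GPU cost: {saving_pct:.0f}%")
```

On these numbers the speedup matches the eightfold figure Feldman cites, while the saving in this particular GPT-J example works out to roughly a quarter of the GPU-only cloud cost.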

The new tool offers push-button model scaling from one billion to 20 billion parameters, eliminating the need for distributed programming and DevOps. Models can also be trained with longer sequence lengths, opening up new research opportunities.

Improving the capabilities of AI

According to Jasper CEO Dave Rogenmoser, the young company (founded in 2021) will train its computationally expensive models on Cerebras’ Andromeda AI supercomputer in “a quarter of the time.”

“They want these models to become better, to self-optimize based on past usage data, based on performance,” he said.

In early testing on small workloads, Jasper found that Andromeda, which was announced last month at SC22, the international conference on high-performance computing, networking, storage and analytics, completed work that thousands of GPUs couldn’t handle.

The company anticipates a “dramatic advance in AI work,” including training GPT networks to fit AI outputs to every level of end-user complexity and granularity. According to Rogenmoser, this will make it quick and straightforward for Jasper to personalize content for different classes of customers.

One hundred thousand people use Jasper’s products to write copy for marketing materials, commercials, novels and more. By acting as “an AI co-pilot,” Rogenmoser said, the company ends “the tyranny of the blank page.”

This enables authors to concentrate on the essential aspects of their story, “not the mundane.”