Highlights:

  • MAI-1’s reported 500 billion parameters suggest that it could be positioned as a middle ground between GPT-3 and GPT-4.
  • It has been reported that Microsoft may power MAI-1 using training data and other resources from Inflection AI.

Microsoft Corp. is developing a large language model with over 500 billion parameters.

It is anticipated that the LLM, internally referred to as MAI-1, will launch as early as this month.

In mid-2020, OpenAI unveiled GPT-3, revealing that its first iteration contained 175 billion parameters. OpenAI hasn’t released precise figures for GPT-4 but has stated that the model is larger. According to some reports, Google LLC’s Gemini Ultra, which performs similarly to GPT-4, has 1.6 trillion parameters, while OpenAI’s flagship LLM is said to have 1.76 trillion.

MAI-1’s reported 500 billion parameters suggest that it could be positioned as a middle ground between GPT-3 and GPT-4. Such a configuration could allow the model to deliver high response accuracy while consuming substantially less electricity than OpenAI’s flagship LLM, which would lower Microsoft’s inference costs.
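As a rough illustration of why parameter count drives inference cost (the rule of thumb and the GPT-4 figure below are assumptions, not details from the reports): a decoder-only transformer performs roughly two floating-point operations per parameter for each generated token, so per-token compute scales with parameter count.

```python
# Back-of-envelope inference cost comparison (illustrative assumptions,
# not figures from the reports). Rule of thumb for decoder-only
# transformers: roughly 2 * N FLOPs per generated token for N parameters.
MAI_1_PARAMS = 500e9    # reported parameter count
GPT_4_PARAMS = 1.76e12  # reported, never confirmed by OpenAI

def flops_per_token(n_params: float) -> float:
    return 2 * n_params

ratio = flops_per_token(GPT_4_PARAMS) / flops_per_token(MAI_1_PARAMS)
print(f"MAI-1: {flops_per_token(MAI_1_PARAMS):.2e} FLOPs per token")
print(f"GPT-4: {flops_per_token(GPT_4_PARAMS):.2e} FLOPs per token")
print(f"GPT-4 needs ~{ratio:.1f}x the compute per generated token")
```

By this estimate, serving the smaller model would cost roughly a third as much compute per token, which is the economic argument the reports attribute to Microsoft.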

Mustafa Suleyman, co-founder of LLM developer Inflection AI Inc., is reportedly supervising the development of MAI-1. Suleyman and most of the startup’s staff joined Microsoft in March in a deal reportedly worth USD 625 million. The executive previously co-founded DeepMind, the AI research group that Google LLC acquired.

It has been reported that Microsoft may power MAI-1 using training data and other resources from Inflection AI. The model’s training dataset allegedly includes web content and text generated by GPT-4. According to reports, Microsoft is carrying out the training on a “large cluster of servers” equipped with graphics cards from Nvidia Corp.

According to the reports, Microsoft hasn’t yet decided how it will use MAI-1. If the model really has 500 billion parameters, it is too large to run on consumer devices. It follows that Microsoft will probably deploy MAI-1 in its data centers, where the LLM could be integrated into Azure and Bing services.
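A back-of-envelope estimate (assumed precision and GPU capacity, not figures from the reports) shows why a model of this size stays in the data center: at 16-bit precision the weights alone occupy about a terabyte, far beyond the roughly 24 GB of memory on a high-end consumer graphics card.

```python
# Rough memory footprint of the weights alone (illustrative; ignores
# activations, the KV cache, and runtime overhead, which add more).
params = 500e9                 # reported MAI-1 parameter count
weights_gb = params * 2 / 1e9  # 2 bytes per parameter at 16-bit precision

consumer_gpu_gb = 24           # assumed high-end consumer graphics card
print(f"Weights: ~{weights_gb:,.0f} GB")
print(f"~{weights_gb / consumer_gpu_gb:.0f} consumer GPUs just to hold the weights")
```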

If the model demonstrates enough potential by May 16, Microsoft is expected to introduce MAI-1 at its Build developer conference. This suggests that the company anticipates having a working prototype within a few weeks, if it doesn’t have one already.

The news that Microsoft is working on MAI-1 comes less than two weeks after the company released a small language model named Phi-3 Mini for public use. Phi-3 Mini has 3.8 billion parameters and, according to the company, can outperform LLMs more than ten times its size. It is the smallest member of an AI family that also includes two larger, somewhat more capable neural networks.
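Applying the same weight-storage arithmetic (again an illustrative estimate, not a figure from Microsoft) shows the contrast with MAI-1: Phi-3 Mini’s weights fit comfortably on consumer hardware.

```python
# The same weight-storage estimate applied to Phi-3 Mini (illustrative only).
params = 3.8e9
print(f"fp16 : ~{params * 2 / 1e9:.1f} GB")    # ~7.6 GB
print(f"4-bit: ~{params * 0.5 / 1e9:.1f} GB")  # ~1.9 GB, laptop territory
```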