• Mistral is adding three proprietary models to its LLM lineup, the most notable of which is Mistral Large.
  • Apart from offering the LLMs through an API, Mistral also plans to make them available through a new ChatGPT-like platform named Le Chat.

The well-funded artificial intelligence startup Mistral AI has unveiled three LLMs and a chatbot designed to rival OpenAI’s ChatGPT.

The company also outlined a fresh collaboration with Microsoft Corp., the primary investor in OpenAI. This partnership will grant Mistral’s engineers access to supercomputing infrastructure on Azure. Moreover, Microsoft is facilitating the availability of the startup’s models to users of its cloud platform.

Established in Paris in May, Mistral completed a USD 113 million seed funding round within four weeks of its launch. In December it raised an additional USD 415 million from a group led by Lightspeed Venture Partners and Andreessen Horowitz. Until now, the company’s product portfolio consisted of two open-source language models, with 46.7 billion and 7 billion parameters, respectively.

Mistral is broadening its collection of large language models (LLMs) with three proprietary models, led by Mistral Large. The model can produce text in English, French, Spanish, German, and Italian, generate software code, and solve mathematical problems. Notably, a user prompt can now span up to 32,000 tokens, where each token represents a unit of data a few letters or numbers long.

According to Mistral, the model is the second most sophisticated of its kind on the market, behind only GPT-4. Mistral Large lagged OpenAI’s premier model by less than 10% across a set of four LLM reasoning benchmarks. In a separate evaluation, it outperformed Llama 2 70B, an open-source GPT-4 alternative introduced by Meta Platforms Inc. last year, by a significant margin.

Developers access Mistral Large through an application programming interface (API). The API lets them set custom moderation policies for the model and integrate it with external applications. For instance, software teams can have Mistral Large draw on data from an external database to answer user queries.
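As an illustration of what such an API call involves, the sketch below builds a chat-completion request body for Mistral Large. The endpoint URL, model identifier, and field names follow the common chat-completions convention and are assumptions for illustration, not details confirmed by the announcement.

```python
import json

# Assumed endpoint and model name, shown only for illustration.
API_URL = "https://api.mistral.ai/v1/chat/completions"

def build_request(prompt: str, model: str = "mistral-large-latest") -> str:
    """Return a chat-completion request body as a JSON string."""
    payload = {
        "model": model,
        # The messages list carries the conversation; a single user
        # prompt may span up to 32,000 tokens per the announcement.
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return json.dumps(payload)

body = build_request("Summarize this support ticket in French.")
print(json.loads(body)["model"])  # mistral-large-latest
```

In practice this body would be sent as an HTTP POST to the API with an authorization header; the retrieval step described above (answering queries from an external database) would prepend the retrieved records to the prompt before calling the model.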

The API can optionally format the model’s output as JSON. JSON is a data format that eases the transfer of data between applications, so emitting an AI system’s output as JSON simplifies feeding it into a company’s custom applications and reduces the associated integration work.

Mistral Small, a second new LLM, is being released alongside Mistral Large. It is less expensive and has lower latency, but it is less capable of sophisticated reasoning. Even with the scaled-back feature set, it still promises to beat Mistral’s previous flagship LLM on a variety of reasoning tasks.

Customers will have multiple options to access the company’s most recent AI models. Mistral plans to make the LLMs available as part of a new ChatGPT-like service called Le Chat in addition to offering them via an API. Access to a third prototype model called Mistral Next—which the company claims is “designed to be brief and concise”—will also be available through Le Chat.

In addition, as part of the newly unveiled alliance with Microsoft, Mistral is providing its flagship LLMs via Azure. The cloud platform offers Mistral Large at launch, with the AI developer’s other proprietary models and its earlier open-source LLMs to follow.

There are other elements to the alliance, which is described as a multiyear partnership. Azure’s supercomputing infrastructure will let Mistral train new models and run inference workloads. The company and Microsoft will also collaborate to explore building customized LLM versions tailored to particular use cases.