• MosaicML helps corporate clients deploy large language models, such as those powering OpenAI LP’s AI chatbot ChatGPT, and diffusion models, such as those powering the AI image generator Stable Diffusion.
• A significant portion of current AI interest is focused on custom AI models trained on domain-specific data for fields such as coding, legal work, and healthcare.

Generative artificial intelligence provider MosaicML Inc. recently announced the launch of MosaicML Inference for businesses, a service that significantly reduces the cost for developers to scale and deploy AI models for their own applications.

MosaicML provides enterprise clients with the infrastructure required to deploy large language models at scale, such as those that power OpenAI LP’s AI chatbot ChatGPT, and diffusion models, such as the one that powers the AI image generator Stable Diffusion. With these new inference capabilities, the company now offers a comprehensive solution for both training and deploying generative AI.

CEO Naveen Rao told a leading media house that the company’s value proposition for enterprise customers is twofold: protecting customer data and reducing costs.

A significant portion of current AI interest is focused on custom AI models that rely on specific training sets for coding, healthcare, legal work, and other fields. Many of these industries are subject to strict compliance and data-control requirements, and generic models cannot satisfy enterprise customers’ demand for models that excel at a particular task.

Naveen Rao stated, “We provide tools that work in any cloud that enable customers to pretrain, fine-tune and serve models within their own private tenancy to enable and empower model ownership. If a customer trains a model, they can rest assured that they own all the iterations of it; that model is theirs. We claim no ownership of that.”

Using MosaicML’s Inference offering, clients can deploy AI models for text completion and text embedding at a cost four times lower than OpenAI’s comparable LLM offerings, and generate images at a cost 15 times lower than OpenAI’s DALL-E 2.

With the release of Inference, MosaicML provides access to several curated open-source LLMs, including Instructor-XL, Dolly, and GPT-NeoX, which developers can customize to meet their specific requirements. All models receive the same optimizations when deployed with Inference, allowing them to run at low cost.

Naveen Rao said, “These are open-source models, so by definition, customers can customize and fine-tune them with our tools and serve them with our tools. We can serve them either publicly with a public-facing URL or completely within the closed tenancy of a customer. So they can host private models for their own internal use if they want that. I think this is very important from a privacy standpoint.”

The company is also unveiling the MosaicML Foundational Model, which developers can build upon, drawing on the company’s engineering expertise and efficiency knowledge. One advantage of the new model is its large “context window,” the amount of initial text it can accept at once.

At over 64,000 tokens, or roughly 50,000 words, its context window is considerably larger than that of many other models on the market; GPT-4 tops out at 32,768 tokens, or approximately 25,000 words. To demonstrate the model’s capabilities, Rao fed it F. Scott Fitzgerald’s “The Great Gatsby” and asked it to write an epilogue.
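To put those context-window figures in perspective, here is a back-of-the-envelope sketch of whether a manuscript fits in a given token budget. The 0.75 words-per-token ratio is a common rule of thumb for English text, not MosaicML's or OpenAI's actual tokenizer; real token counts vary by tokenizer and text.

```python
def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    """Estimate token count from the whitespace-delimited word count,
    using a rough English-text heuristic of ~0.75 words per token."""
    word_count = len(text.split())
    return int(word_count / words_per_token)


def fits_in_context(text: str, context_window: int = 64_000) -> bool:
    """True if the estimated token count fits within the context window."""
    return estimate_tokens(text) <= context_window


# A stand-in for a ~48,000-word novel-length manuscript: it lands right
# around 64,000 estimated tokens, near the limit of a 64k-token window
# but well past GPT-4's 32,768-token maximum.
manuscript = "word " * 48_000
print(fits_in_context(manuscript))                          # 64k window
print(fits_in_context(manuscript, context_window=32_768))   # GPT-4 limit
```

With these assumptions, the manuscript fits the larger window but not the 32,768-token one, which is the gap the article's word counts illustrate.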

Organizations such as Replit, a browser-based integrated development environment, Twelve Labs, an AI-based video search company, and Stanford University have built on MosaicML’s platform to gain greater control over their models and to tailor domain-specific AIs to their requirements.