• Baseten offers a dashboard that allows developers to monitor an AI model’s infrastructure usage and relevant metrics like processing times for user requests.
• Companies can access Baseten’s AI inference platform as a managed cloud service or deploy it within their Amazon Web Services and Google Cloud environments.

Baseten Labs Inc., a startup that simplifies the deployment of artificial intelligence models for developers, recently announced that it has closed a USD 40 million funding round.

IVP and Spark Capital led the investment, with numerous existing backers also contributing. According to Forbes, the Series B round values Baseten at more than USD 200 million.

Inference, the task of running AI models in production, often demands considerable time and resources. Developers need to set up infrastructure that can absorb sudden surges in traffic to the model. They must also ensure that the model responds to user prompts quickly, avoid cloud cost overruns, and complete dozens of other technical tasks.

Baseten, headquartered in San Francisco, is dedicated to simplifying this process. Its platform automates many of the tasks involved in managing production AI environments, beginning with the initial deployment of a model.

Once developers train a new neural network, they must package it into a format compatible with their organization’s cloud infrastructure before it can be deployed. Baseten has created an open-source tool named Truss to expedite this process. The startup says that, with the tool, AI models can be deployed on its platform using just a few lines of code.

After a neural network has been deployed in production, Baseten’s autoscaling engine works to keep user requests processed promptly. An AI model may be abruptly inundated with requests, resulting in longer response times. When the autoscaling engine detects an increase in utilization, it creates replicas of the AI model to distribute the extra traffic.

The engine also automatically removes replicas once user activity returns to normal levels. The company claims its platform allows developers to implement a “scale to zero” approach, in which an AI workload shuts down entirely when inactive. This approach helps avoid unnecessary cloud costs.
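The scaling behavior described above can be illustrated with a minimal sketch. This is not Baseten’s implementation; the function name `desired_replicas` and the parameters `target_per_replica`, `min_replicas`, and `max_replicas` are hypothetical, and the sketch assumes that load is measured as the number of in-flight requests.

```python
import math


def desired_replicas(in_flight_requests, target_per_replica,
                     min_replicas=0, max_replicas=10):
    """Return how many model replicas should run for the current load.

    Hypothetical illustration only: all names and defaults here are
    assumptions, not part of any real autoscaler's API.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if in_flight_requests < 0:
        raise ValueError("in_flight_requests must not be negative")

    # Enough replicas that each one handles at most target_per_replica requests.
    needed = math.ceil(in_flight_requests / target_per_replica)

    # Clamp to the configured bounds. With min_replicas=0, an idle
    # workload drops to zero replicas ("scale to zero").
    return max(min_replicas, min(needed, max_replicas))
```

With a target of 4 requests per replica, 9 in-flight requests would call for 3 replicas, while an idle workload with `min_replicas=0` would call for none. Setting `min_replicas=1` instead keeps one replica running at all times, giving up the idle-time savings so that the first request after a quiet period does not have to wait for a replica to start.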

Baseten offers a dashboard that allows developers to monitor an AI model’s infrastructure usage and relevant metrics like processing times for user requests. A complementary observability tool simplifies the process of troubleshooting technical issues.

In a blog post, Chief Executive Officer Tuhin Srivastava wrote, “Our native workflows serve large models in production, so users don’t need to think about version management, roll-out, and observability. In 2023, we scaled inference loads hundreds of times over without a minute of downtime.”

Businesses can deploy Baseten’s platform in their own Amazon Web Services and Google Cloud environments or use it as a managed cloud service. According to Forbes, nearly twenty major organizations and tens of thousands of developers use the company’s platform. The company reportedly generates annual revenue in the “mid-single-digit millions.”

To extend its customer reach, the company aims to almost double its 25-person sales and marketing team by the end of the year. The new funding will also support product development. The company intends to introduce support for additional cloud platforms, develop tools to optimize the performance of customers’ AI models, and simplify the training process.