Highlights:

  • Lilac AI offers a paid cloud version of its tool with additional functionality.
  • Databricks will integrate Lilac AI’s software into its flagship data management and AI platform.

Databricks Inc. has acquired Lilac AI Inc., a startup that helps developers manage the text datasets used in artificial intelligence projects.

The companies did not disclose the financial terms when they announced the deal recently. Boston-based Lilac AI was founded by Daniel Smilkov and Nikhil Thorat, two former Google LLC engineers who helped develop TensorFlow.js. That is the component of TensorFlow, the search giant’s popular AI development tool, that lets developers build machine learning apps in JavaScript.

To create an AI model, software development teams must assemble and review large amounts of text. Before the model can be trained, developers first have to compile a set of documents. After training, they need to examine the AI’s outputs to check whether the text it produces meets accuracy standards.

Databricks Co-founder Matei Zaharia and other executives stated, “Exploring and understanding these datasets is critical for building quality GenAI apps. However, analyzing unstructured text data can become highly cumbersome and extremely difficult in the age of GenAI. Historically, this process has been marred by manual, labor-intensive methods that lack scalability.”

Lilac, an open-source tool created by Lilac AI, is designed to simplify that process. Databricks, Cohere Inc., and other companies in the AI software industry use it.

One of Lilac’s standout features is its “clustering capability,” which is powered by a built-in AI model. It can analyze the documents in a text dataset, group related documents together, and generate a description of each group. For instance, Lilac could determine that book summaries make up two-thirds of the items in an AI training dataset and arithmetic questions make up the remaining third.
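The grouping idea can be illustrated with a toy sketch. This is not Lilac’s code or API: Lilac relies on a built-in AI model, whereas the example below swaps in plain word counts, cosine similarity, and a greedy grouping rule. Every name in it (`cluster`, `describe`, the `threshold` value, the sample documents) is a hypothetical chosen for illustration.

```python
import math
import re
from collections import Counter

def words(text):
    """Lowercase alphabetic words; digits and punctuation are ignored."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster(docs, threshold=0.3):
    """Greedy grouping: a document joins the first cluster whose first
    member is similar enough, otherwise it starts a new cluster."""
    vectors = [Counter(words(d)) for d in docs]
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters

def describe(docs, members, top_n=2):
    """Crude group description: the most frequent words longer than 3 letters."""
    counts = Counter(w for i in members for w in words(docs[i]) if len(w) > 3)
    return [w for w, _ in counts.most_common(top_n)]

# Hypothetical sample data: four book summaries and two arithmetic questions.
docs = [
    "Summary of the book: a young wizard attends school",
    "Summary of the book: a detective solves a murder",
    "What is 12 plus 7?",
    "Summary of the book: a sailor hunts a whale",
    "What is 9 plus 30?",
    "Summary of the book: two friends cross a desert",
]

clusters = cluster(docs)
for members in clusters:
    share = len(members) / len(docs)
    print(describe(docs, members), f"{share:.0%} of the dataset")
```

On this sample the summaries and the arithmetic questions land in separate groups, mirroring the two-thirds and one-third split described above. A production tool would use learned embeddings rather than word counts.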

Developers can use the tool to identify portions of a training dataset that should be removed. If a software company is training an AI model to write book summaries, for example, the dataset probably does not need to contain math problems. Removing irrelevant items speeds up training and improves the accuracy of the AI’s responses.
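As a rough illustration of that pruning step, the sketch below drops records that look like arithmetic questions. The regular expression is a hypothetical heuristic invented for this example, not something Lilac provides, and it would misfire on text such as date ranges.

```python
import re

# Hypothetical heuristic: two numbers joined by an arithmetic symbol or word.
MATH_PATTERN = re.compile(
    r"\d+\s*(?:\+|-|\*|/|plus|minus|times)\s*\d+", re.IGNORECASE
)

def drop_math_problems(records):
    """Return the records that do not look like math problems,
    plus the number of records that were removed."""
    kept = [r for r in records if not MATH_PATTERN.search(r)]
    return kept, len(records) - len(kept)

# Hypothetical sample data.
records = [
    "Summary of the book: a sailor hunts a whale",
    "What is 12 plus 7?",
    "Compute 3 * 4",
    "Summary of the book: a detective solves a murder",
]

kept, removed = drop_math_problems(records)
print(f"kept {len(kept)} records, removed {removed}")
```

In practice the removal decision would more likely come from cluster labels than from a hand-written pattern.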

Lilac is useful for other tasks as well. It has a dashboard that lets users evaluate the effects of dataset updates by comparing individual records from a dataset with one another. It also enables developers to convert text into mathematical representations called embeddings, which make the text easier for AI models to interpret.
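To make the idea of an embedding concrete, here is a minimal sketch that turns text into a fixed-length list of numbers. It is a toy stand-in that assumes a simple word-hashing scheme; real embeddings come from trained models, and the function names and the `dims` parameter are hypothetical.

```python
import hashlib
import math

def embed(text, dims=16):
    """Toy embedding: hash each word into one of `dims` buckets,
    count the hits, then scale the vector to unit length."""
    vec = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def similarity(a, b):
    """Dot product of two unit-length vectors (their cosine similarity)."""
    return sum(x * y for x, y in zip(a, b))

first = embed("a sailor hunts a whale")
second = embed("a whale hunts a sailor")
print(len(first), round(similarity(first, second), 3))
```

Because this toy version only counts words, the two sentences above map to the same vector even though they mean different things. Learned embeddings are meant to capture that kind of difference.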

Lilac AI offers a paid cloud version of its tool with additional functionality. The company claims that an enhanced clustering feature can group one million records in under 20 minutes. The cloud edition also includes features that simplify editing large datasets.

Databricks will integrate Lilac AI’s software into its flagship data management and AI platform. The addition will enhance the technology the company obtained last June through its USD 1.3 billion acquisition of MosaicML Inc. MosaicML created multiple preconfigured language models and an AI development platform of the same name.