Mon. Dec 23rd, 2024
Deasie Wants To Rank And Filter Data To Increase Confidence

Deasie, a startup developing tools that give companies more control over their text-generating AI models, today announced that it has raised $2.9 million in a seed funding round with participation from Y Combinator, General Catalyst, RTP Global, Rebel Fund, and J12 Ventures.

Deasie founders Reece Griffiths, Mikko Peiponen, and Leo Platzer previously co-developed data governance tools at McKinsey. During their time there, they say, they observed “significant issues and opportunities” around enterprise data governance and saw how these issues could affect companies’ ability to deploy generative AI.

They’re not alone. A recent IDC survey found that 86% of more than 900 executives at large companies agree that more governance is needed to ensure the “quality and integrity” of AI-generated insights. Meanwhile, just 30% of survey respondents feel “very ready or ready” to take advantage of generative AI today.

In an effort to improve the reliability of generative AI models, especially large language models (LLMs) along the lines of OpenAI’s GPT-4, the Deasie team has built a product that connects to unstructured corporate data such as documents, reports, and emails and automatically categorizes it by both content and sensitivity.

For example, Deasie might automatically tag a report as “Personally Identifiable Information” or “Sensitive Information,” or indicate that it is the third version of the report. Alternatively, it might tag a spec sheet as “confidential” to flag that the sheet has restricted access. Deasie customers define the tags and labels to reflect their own approach to data classification and organization, Griffiths told TechCrunch via email, which “teaches” Deasie’s algorithms how to classify future data.

After Deasie auto-tags a document, the platform processes the resulting library of tags to evaluate the corresponding data in terms of overall relevance and importance. Then, based on this evaluation, it decides which data to “feed” to the text-generating model.
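To make the tag, evaluate, and filter flow described above concrete, here is a minimal Python sketch. It is not Deasie’s implementation, which the company has not published: the keyword-based tagging, the overlap-based relevance score, and every name in it (`TAG_KEYWORDS`, `BLOCKED_TAGS`, `auto_tag`, `relevance`, `select_for_model`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical customer-defined tags, each mapped to trigger keywords.
# A real system would likely learn these from labeled examples instead.
TAG_KEYWORDS = {
    "pii": ["passport", "date of birth", "home address"],
    "confidential": ["confidential", "internal only"],
    "pricing": ["price", "discount", "quote"],
}

# Hypothetical policy: documents carrying these tags never reach the model.
BLOCKED_TAGS = {"pii", "confidential"}


@dataclass
class Document:
    name: str
    text: str
    tags: set = field(default_factory=set)


def auto_tag(doc):
    """Attach every tag whose keywords appear in the document text."""
    lowered = doc.text.lower()
    doc.tags = {
        tag
        for tag, keywords in TAG_KEYWORDS.items()
        if any(keyword in lowered for keyword in keywords)
    }
    return doc


def relevance(doc, wanted_tags):
    """Fraction of the wanted tags that the document carries (0.0 to 1.0)."""
    if not wanted_tags:
        return 0.0
    return len(doc.tags & wanted_tags) / len(wanted_tags)


def select_for_model(docs, wanted_tags, min_relevance=0.5):
    """Tag each document, drop blocked ones, keep those relevant enough."""
    selected = []
    for doc in docs:
        auto_tag(doc)
        if doc.tags & BLOCKED_TAGS:
            continue  # sensitive content is filtered out before scoring
        if relevance(doc, wanted_tags) >= min_relevance:
            selected.append(doc)
    return selected


docs = [
    Document("price_list", "Q3 price and discount schedule"),
    Document("hr_file", "Employee passport and home address"),
    Document("memo", "Lunch menu for Friday"),
    Document("deal", "Confidential: price quote for client"),
]
selected_names = [d.name for d in select_for_model(docs, {"pricing"})]
```

In this toy run only `price_list` would be passed on: `hr_file` and `deal` carry blocked tags, and `memo` matches no wanted tag. The point of the sketch is the ordering, with sensitivity filtering applied before relevance scoring, rather than any particular scoring rule.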

“Companies have vast amounts of unstructured data, yet it receives little attention from a governance perspective,” Griffiths said. “The probability that a language model returns an answer that doesn’t make sense, or is exposed to sensitive information, increases with the amount of data. Deasie is an intelligent platform that filters through thousands of documents across the enterprise and ensures that the data fed into generative AI applications is relevant, high quality and safe to use.”

Deasie is certainly an interesting platform. The idea of limiting an LLM to vetted data is not a bad one, especially considering the consequences of letting outdated or contradictory information seep into an LLM. But I wonder how consistently Deasie’s algorithms classify data, and how often the platform makes mistakes when inferring the importance of a document.

Whatever Deasie is demonstrating, it must be answering at least some of these questions to companies’ satisfaction. Deasie, which has just three employees, has signed its first pilot agreement with a “multibillion-dollar” company in the United States and has a pipeline of more than 30 enterprise customers, including five Fortune 500 companies, according to Griffiths.

“Other products focus strictly on the ‘data safety’ aspect of LLM governance or the ‘data governance of structured data’ aspect,” says Deasie. “What did not exist was a good approach to measuring data quality and relevance of unstructured data … No one was directly solving the problem of matching any generative AI use case to the ‘best’ dataset possible. Deasie has developed a new approach in this area.”

In the coming months, Deasie plans to expand its engineering team and make “multiple hires,” with a focus on building features that differentiate it from competitors such as Unstructured.io, Scale AI, Collibra, and Alation.