Nucleus AI Emerges From Stealth With Agriculture Focus, Releases 22B Model



California-based Nucleus AI, a four-member startup with talent from Amazon and Samsung Research, came out of stealth today, announcing its first product: a 22-billion-parameter large language model (LLM).

Available under the open-source MIT license as well as a commercial license, the versatile model sits between the 13B and 34B segments and can be fine-tuned for different generative tasks and products. Nucleus says it outperforms models of comparable size and will ultimately help advance the company’s goal of using AI to transform agriculture.
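As a rough sketch of how an MIT-licensed checkpoint of this kind is typically consumed, the example below loads a model with the Hugging Face transformers library and generates text. The repository ID nucleus-ai/nucleus-22b is a hypothetical placeholder, not a confirmed release name.

```python
# Minimal sketch, assuming the 22B checkpoint is published in standard
# Hugging Face format; the repo ID below is a hypothetical placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "nucleus-ai/nucleus-22b"  # hypothetical model ID

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    device_map="auto",   # shard the 22B weights across available GPUs
    torch_dtype="auto",  # load in the checkpoint's native precision
)

prompt = "Crop rotation improves soil health because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```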

“We’re starting with a 22-billion-parameter model, which is a transformer model, and in about two weeks we’ll be releasing a state-of-the-art RetNet model that offers significant benefits in terms of cost and inference speed,” the company’s CEO, Gnandeep Moturi, told VentureBeat.

New Nucleus AI model

Nucleus began training the 22B model about three and a half months ago after receiving computing resources from early investors.


Leveraging existing research and the open-source community, the company pre-trained the LLM with a context length of 2,048 tokens on 1 trillion tokens of large-scale, deduplicated and cleaned data collected from the web, Wikipedia, Stack Exchange, arXiv, and code.
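The article doesn’t detail Nucleus’s training pipeline, but the 2,048-token context length implies a standard preprocessing step: packing cleaned, deduplicated documents into fixed-length training sequences. The sketch below shows that step in generic form; it is not Nucleus’s actual code, and the pack_into_sequences helper and tokenizer interface are assumptions for illustration.

```python
# Generic illustration of packing deduplicated documents into fixed
# 2,048-token training sequences; not Nucleus's actual pipeline.
from itertools import chain

CONTEXT_LENGTH = 2048  # context length cited in the article

def pack_into_sequences(documents, tokenizer):
    """Tokenize documents, concatenate them, and split the stream into
    fixed-length blocks suitable for causal LM pre-training."""
    token_stream = list(
        chain.from_iterable(
            tokenizer.encode(doc) + [tokenizer.eos_token_id]
            for doc in documents
        )
    )
    # Drop the trailing remainder that doesn't fill a full block.
    n_blocks = len(token_stream) // CONTEXT_LENGTH
    return [
        token_stream[i * CONTEXT_LENGTH : (i + 1) * CONTEXT_LENGTH]
        for i in range(n_blocks)
    ]
```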

This gave the model a comprehensive knowledge base, covering everything from general information to academic research and coding insights.

As a next step, Nucleus plans to release additional versions of the 22B model trained on 350 billion and 700 billion tokens, as well as two RetNet models (3 billion and 11 billion parameters) pre-trained on larger token counts, with a context length of 4,096 tokens.

These compact models combine the strengths of RNN and transformer neural network architectures and offer significant speed and cost benefits. According to Moturi, internal experiments have shown they run 15 times faster than comparable transformer models and require only a quarter of the GPU memory.
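RetNet (retentive network) is published research, and its central property is that generation can run in a recurrent mode carrying a fixed-size state rather than an ever-growing attention cache, which is where claimed memory and speed advantages come from. The toy sketch below illustrates a simplified single-head retention recurrence (decay only, positional rotations omitted); the dimensions and decay value are arbitrary, and this is not Nucleus’s implementation.

```python
# Toy single-head illustration of a RetNet-style recurrent retention
# update: the per-token state S has a fixed size (d_k x d_v), so memory
# does not grow with sequence length the way a transformer's KV cache
# does. Simplified for illustration; not Nucleus's code.
import numpy as np

d_k, d_v = 8, 8          # toy head dimensions
gamma = 0.97             # exponential decay factor

rng = np.random.default_rng(0)
S = np.zeros((d_k, d_v))  # fixed-size recurrent state

for step in range(16):            # pretend to decode 16 tokens
    q = rng.normal(size=d_k)      # query/key/value for the new token
    k = rng.normal(size=d_k)
    v = rng.normal(size=d_v)

    # Decay the state, add the new key-value outer product, then read
    # it out with the query: O(1) work and memory per generated token.
    S = gamma * S + np.outer(k, v)
    out = q @ S                   # retention output for this token

print("state shape stays", S.shape, "regardless of sequence length")
```

Because the state shape is constant, per-token decoding cost stays flat as the sequence grows, in contrast to a transformer whose key-value cache grows with every generated token.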

“So far, there is only research proving this can work. No one has actually built a model and released it to the public,” the CEO added.

Greater ambitions

These models will be available for enterprise applications, but Nucleus has ambitions that go beyond AI research.

Rather than building straight-up chatbots like other LLM companies such as OpenAI, Anthropic and Cohere, the startup plans to build an intelligent operating system for agriculture, leveraging AI to optimize supply and demand and reduce uncertainty for farmers, Moturi said.

“We have a marketplace idea where supply and demand are highly optimized for farmers, similar to what Uber does for taxi drivers,” he said.

This has the potential to solve multiple challenges for farmers, from climate change and lack of knowledge to optimizing supply and maintaining distribution.

“At this point, we’re not competing with anyone else’s algorithms. When we got access to compute, we were looking to build internal products for stepping into agriculture. But then we realized that we needed a language model as the core of the marketplace itself, and we started building it with contributions from the open-source community,” he added.

More details about the agriculture-centric OS and RetNet model will be announced later this month.
