Elon Musk’s AI startup X.ai has announced Grok-1.5, its latest generative AI model. Set to power social network X’s Grok chatbot in the not-too-distant future (“in the next few days,” according to a blog post), Grok-1.5 appears to be a visible upgrade over its predecessor, Grok-1, at least judging by the benchmark results and specifications published by X.ai.
According to X.ai, Grok-1.5 benefits from “improved reasoning,” especially for coding and math-related tasks. The model more than doubled Grok-1’s score on MATH, a popular math benchmark, and outperformed it by more than 10 percentage points on HumanEval, a test of programming language generation and problem-solving ability.
Of course, it is difficult to predict how these results will translate into real-world use. As we recently wrote, commonly used AI benchmarks, which measure things as esoteric as performance on graduate-level chemistry exam questions, do a poor job of capturing how the average person interacts with models today.
One improvement that should lead to tangible benefits is the amount of context that Grok-1.5 can take in compared to Grok-1.
Grok-1.5 can process contexts of up to 128,000 tokens, where a “token” refers to a bit of raw text (for example, the word “fantastic” split into “fan”, “tas”, and “tic”). Context, or context window, refers to the input data (in this case text) that the model considers before producing output (additional text). Models with small context windows tend to forget the contents of even very recent conversations, whereas models with larger contexts avoid this pitfall and, as an added benefit, better grasp the flow of the data they take in.
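To make the “forgetting” concrete, here is a minimal sketch of how a fixed context window might drop older conversation turns. It is an illustration only, not X.ai’s implementation: the `count_tokens` and `fit_to_context` helpers are hypothetical, and counting one token per whitespace-separated word is a crude stand-in for a real tokenizer, which splits text into sub-word pieces.

```python
from typing import List

CONTEXT_WINDOW = 128_000  # tokens, the figure cited for Grok-1.5


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_context(messages: List[str], max_tokens: int = CONTEXT_WINDOW) -> List[str]:
    """Keep the most recent messages whose combined token count fits in the window."""
    kept: List[str] = []
    used = 0
    # Walk backwards so the newest messages are retained first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # everything older than this point is "forgotten"
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


history = ["my name is Ada", "I like chess", "what is my name"]
print(fit_to_context(history, max_tokens=8))
```

With a window of only eight tokens, the oldest message (the one containing the name) no longer fits, so a model limited to that window could not answer the final question. With the default 128,000-token window, the whole history is kept.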
“[Grok-1.5 can] leverage information from fairly long documents,” X.ai wrote in the aforementioned blog post. “Additionally, this model can handle longer and more complex prompts while maintaining the ability to follow instructions as the context window grows.”
What has historically set X.ai’s Grok models apart from other generative AI models is that they respond to questions about topics that are typically off-limits to other models, such as conspiracies and controversial political ideas. The models also answer questions with what Musk has described as a “defiant attitude,” and will use overtly rude language when asked to do so.
It is unclear what changes, if any, Grok-1.5 brings to these areas. X.ai doesn’t mention them in its blog post.
According to X.ai, Grok-1.5 will soon be available to early testers on X, accompanied by “several new features.” Musk has previously hinted at summarizing threads and replies and suggesting content for posts. We’ll see if those arrive soon.
The announcement of Grok-1.5 came after X.ai open-sourced Grok-1, albeit without the code needed for fine-tuning or further training. More recently, Musk said that more X users, specifically those paying for X’s $8/month Premium plan, would be able to access the Grok chatbot, which was previously available only to X Premium+ customers (who pay $16/month).