Anthropic says its latest AI bot can beat Gemini and ChatGPT

Anthropic, the AI company started by several former OpenAI employees, says the new Claude 3 family of AI models performs as well as or better than leading models from Google and OpenAI. Unlike earlier versions, Claude 3 is also multimodal, able to understand text and photo inputs.

Anthropic says Claude 3 will answer more questions, understand longer instructions, and be more accurate. It can also take in more context, meaning it can process more information at once. The family has three models: Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus, with Opus being the largest and “most intelligent model.” Anthropic says Opus and Sonnet are now available on claude.ai and through its API; Haiku will be released soon. All three models can be deployed for chatbots, auto-completion, and data extraction tasks.

Previous versions of Claude refused to answer some prompts that were harmless, which the company writes “suggests a lack of contextual understanding.” The new models are less likely to refuse prompts that border on its safety guardrails. Meta is rumored to be planning a similar change for Llama 3 when it’s released.

Incorrect refusals on Claude 3 versus Claude 2.1.

Image: Anthropic

Anthropic claims Claude 3 models can give near-instant results even while parsing dense material like a research paper. A blog post says Haiku, the smallest version of Claude 3, is “the fastest and most cost-effective model on the market,” able to read a dense research paper complete with charts and graphs “in less than three seconds.”

Anthropic says Opus outperformed most models in several benchmarking tests. It showed better graduate-level reasoning than OpenAI’s GPT-4, scoring 50.4 percent on that test versus GPT-4’s 35.7 percent. Anthropic also says Opus did better at answering math questions, coding, and reasoning.

Claude 3 models compared to GPT-4, GPT-3.5, and Gemini 1.0 Ultra / Pro.

Image: Anthropic

The new models also improve significantly on the previous Claude 2.1 model. Sonnet, the mid-tier model, is twice as fast as Claude 2 and Claude 2.1, according to Anthropic. “It excels in tasks demanding rapid responses, like knowledge retrieval or sales automation,” Anthropic said.
