OpenAI pushes AI agent capabilities with new developer API



Developers using the Responses API can access the same models that power ChatGPT Search: GPT-4o search and GPT-4o mini search. These models can browse the web to answer questions and cite sources in their responses.

That's notable because OpenAI says the added web search ability dramatically improves the factual accuracy of its AI models. On OpenAI’s SimpleQA benchmark, which aims to measure confabulation rate, GPT-4o search scored 90 percent, while GPT-4o mini search achieved 88 percent, both substantially outperforming the larger GPT-4.5 model without search, which scored 63 percent.

Despite these improvements, the technology still has significant limitations. Aside from issues with CUA properly navigating websites, the improved search capability doesn’t completely solve the problem of AI confabulations, with GPT-4o search still making factual errors 10 percent of the time.

Alongside the Responses API, OpenAI released the open source Agents SDK, providing developers with free tools to integrate models with internal systems, implement safeguards, and monitor agent activities. This toolkit follows OpenAI’s earlier release of Swarm, a framework for orchestrating multiple agents.

These are still early days in the AI agent field, and things will likely improve rapidly. For the moment, however, the AI agent movement remains vulnerable to unrealistic claims. Earlier this week, users discovered that Chinese startup Butterfly Effect’s Manus AI agent platform failed to deliver on many of its promises, highlighting the persistent gap between promotional claims and practical functionality in this emerging technology category.
