OpenAI launches GPT-5.5: 12M context, tops the AA index, and rewrites the agent benchmark with 82.7% on Terminal-Bench

ChainNewsAbmedia

OpenAI officially released GPT-5.5 on April 23, positioning it as a flagship model for agentic work and enterprise knowledge processing and rolling it out simultaneously on ChatGPT and Codex. The official tagline calls it “our smartest model and the most intuitive to use,” and it leads the AA Intelligence Index with 60 points, 3 points ahead of both Claude Opus 4.7 and Gemini 3.1 Pro Preview.

Key data at a glance

Indicator | GPT-5.5 | Comparison (GPT-5.4 or same-tier competitors)
AA Intelligence Index | 60 | Claude Opus 4.7: 57; Gemini 3.1 Pro Preview: 57
Terminal-Bench 2.0 (command-line workflow) | 82.7% | GPT-5.4: 75.1%
Expert-SWE (OpenAI internal programming evaluation) | 73.1% | GPT-5.4: 68.5%
Context window | 12 million tokens | Dramatically increased; enough to handle an entire enterprise codebase or several hours of video
Price (per million tokens) | Input $5, output $30 | Double GPT-5.4's unit price; output token usage drops by about 40%, so net cost rises by about 20%

Positioning: Built for the “Agent Era”

OpenAI describes GPT-5.5 as a foundational model for agentic computing—able to understand complex goals, use tools, self-check its work results, and complete multi-step tasks without requiring humans to intervene at every step. According to a TechCrunch interview, President Greg Brockman characterizes this version as “a big step toward future computing, but only a step,” and emphasizes that it is “a faster, sharper reasoner than 5.4, using fewer tokens.”

Chief Scientist Jakub Pachocki noted, “We’re seeing very significant improvements in the short term”; Research Lead Mark Chen, meanwhile, emphasized that this release delivers “meaningful breakthroughs” in scientific and technical research workflows.

Availability and version tiers

GPT-5.5: Plus, Pro, Business, and Enterprise users can use it in ChatGPT and Codex

GPT-5.5 Pro: A more advanced reasoning version available in ChatGPT for Pro, Business, and Enterprise users

Codex integration: Also available in OpenAI’s coding agent tool, with stronger multi-file editing, command-line support, and test loops

Cybersecurity and defense rhetoric rises in parallel

In the TechCrunch interview, Mia Glaese, a member of the technical team, said that GPT-5.5’s cybersecurity capabilities will “have a major impact on how OpenAI deploys models into digital defense.” This rhetoric directly mirrors the recent controversy around Anthropic's Claude Mythos, a weapons-grade cybersecurity model; Altman previously criticized Anthropic’s “fear marketing” strategy on the Core Memory show. With GPT-5.5, OpenAI leans harder on the narrative of a deployable model that combines offense and defense, aiming to draw a clearer contrast with Anthropic’s stance of limiting access.

Pricing strategy changes

GPT-5.5’s price per million tokens doubles to $5 for input and $30 for output, making this the first generation in the GPT-5 series with a significant unit-price increase. OpenAI’s explanation is that the model's more efficient reasoning cuts output token usage by about 40%, so the typical bill for real tasks is about 20% higher than with GPT-5.4, not simply 2x. For enterprises, the question therefore shifts from “is the unit price worth it?” to “given the same prompt, can GPT-5.5 complete more complex tasks with a smaller total token count?”
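The arithmetic behind the "about 20%" figure can be checked with a short sketch. It is illustrative only: the function names and the `output_cost_share` parameter are hypothetical, and it assumes that input token counts stay the same while only output tokens shrink. Under those assumptions the 20% figure holds exactly only when the bill is dominated by output tokens; input-heavy workloads land closer to the full 2x.

```python
def output_only_ratio(price_multiplier: float, output_token_reduction: float) -> float:
    """New cost / old cost for the output portion of a bill.

    price_multiplier: how much the unit price rose (2.0 = doubled).
    output_token_reduction: fraction of output tokens saved (0.40 = 40% fewer).
    """
    return price_multiplier * (1.0 - output_token_reduction)


def blended_ratio(price_multiplier: float,
                  output_token_reduction: float,
                  output_cost_share: float) -> float:
    """New cost / old cost for a whole bill.

    output_cost_share (assumption, not from the article): the fraction of the
    OLD bill that was spent on output tokens. Input tokens are assumed unchanged.
    """
    input_part = 1.0 - output_cost_share
    output_part = output_cost_share * (1.0 - output_token_reduction)
    return price_multiplier * (input_part + output_part)


if __name__ == "__main__":
    # Figures from the article: unit price doubles, output tokens drop ~40%.
    print(f"{output_only_ratio(2.0, 0.40):.2f}")   # prints 1.20
    # Hypothetical workload where half of the old bill was input tokens.
    print(f"{blended_ratio(2.0, 0.40, 0.5):.2f}")  # prints 1.60
```

The second call shows why the answer depends on the workload mix: a task that sends long prompts and receives short answers benefits much less from the output-token savings.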

Signals for the industry

GPT-5.5 widens OpenAI’s lead on Terminal-Bench and on its internal SWE evaluation. These two benchmarks test command-line agent execution and real software engineering tasks respectively, which makes the scores a direct battleground between Codex and Claude Code. By opening a 12 million token context window at the same time, OpenAI adds pressure on two tracks at once: full-scale enterprise knowledge base processing and long-task agents. For Anthropic, Claude Opus 4.7 scores 57 on the AA index and trails by 3 points, which gives Claude Code users another reason to watch the progress of the next generation (Opus 4.8 or a new Claude).

This article, “OpenAI launches GPT-5.5: 12M context, tops the AA index, rewrites the agent benchmark with Terminal-Bench 82.7%,” first appeared on Chain News ABMedia.

