Gate News message, April 20: Top AI models excel at complex problems such as Olympiad mathematics but struggle with routine enterprise work, according to David Meyer of Databricks. Some models may fix an incorrect invoice number rather than flag it as an error, and coding tools such as Claude can also underperform on data engineering tasks.
The gap stems from fundamental differences between enterprise data and the public web text used to train large models. Enterprise data often features vague column labels, numerous blank fields, and codes stored as plain text. In one academic study, an AI model’s F1 score, which balances precision and recall, dropped from 0.94 on public data to 0.07 on enterprise data for a data engineering task. Additionally, large models tend to default to familiar patterns from training; some defaulted to Structured Query Language (SQL) even after receiving instructions and documentation for a company’s proprietary query language.
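For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall, so it falls sharply when either one is low. A minimal Python sketch follows; the counts are hypothetical and chosen only for illustration, and they are not taken from the study cited above.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    predicted = true_positives + false_positives  # everything the model flagged
    actual = true_positives + false_negatives     # everything it should have flagged
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only.
balanced = f1_score(true_positives=90, false_positives=10, false_negatives=10)
lopsided = f1_score(true_positives=5, false_positives=95, false_negatives=45)

print(f"balanced: {balanced:.2f}")
print(f"lopsided: {lopsided:.2f}")
```

In the second case precision is 0.05 and recall is 0.10, so F1 stays close to the weaker of the two rather than averaging them upward.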
Smaller open-source models tuned with reinforcement learning can handle specific jobs more efficiently, and at significantly lower training cost, than large general-purpose models. Databricks is building smaller AI agents for specific workflows, such as KARL, which uses reinforcement learning for multi-step reasoning over company documents. The industry is shifting from reliance on giant models to hybrid architectures in which small, efficient models handle routine volume and escalate only unclear or complex cases to larger, costlier systems.
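The escalation pattern in such a hybrid architecture can be sketched as a confidence-threshold router. This is an illustrative sketch only: the `Answer` type, the `route` function, the stub models and the 0.8 threshold are all hypothetical, and nothing here reflects how Databricks or KARL actually implement routing.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Answer:
    text: str
    confidence: float  # assumed to be a self-reported score in [0, 1]


Model = Callable[[str], Answer]


def route(task: str, small_model: Model, large_model: Model,
          threshold: float = 0.8) -> Tuple[str, Answer]:
    """Try the small model first; escalate only when it is not confident."""
    first = small_model(task)
    if first.confidence >= threshold:
        return "small", first
    return "large", large_model(task)


# Stub models standing in for real systems (hypothetical behavior).
def small_stub(task: str) -> Answer:
    routine = "invoice" in task.lower()
    return Answer(text=f"small: {task}", confidence=0.95 if routine else 0.3)


def large_stub(task: str) -> Answer:
    return Answer(text=f"large: {task}", confidence=0.9)


routine_tier, _ = route("Match invoice 1001 to its purchase order", small_stub, large_stub)
complex_tier, _ = route("Interpret an ambiguous contract clause", small_stub, large_stub)
print(routine_tier, complex_tier)
```

The threshold is the main design choice: a higher value sends more traffic to the costlier model, while a lower value keeps costs down but accepts more answers from the small model without review.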
Databricks recently acquired Quotient AI to help large enterprises run AI agents more reliably. Competition in the AI business now centers on running the full AI lifecycle, including feedback systems for tracking errors and continuously improving models over time, making evaluation and tuning tools increasingly valuable after deployment.