DeepSeek V4 Training Data Doubled to 33T, Triggering Instability That Delayed Release

AI Industry News

2026-04-24 03:21:29

Gate News, April 24: DeepSeek's V4 technical report reveals that V4-Flash and V4-Pro were pre-trained on 32T and 33T tokens respectively, roughly double the approximately 15T tokens used for V3. The report acknowledges "significant instability challenges" during training: loss spikes occurred repeatedly because of anomalies in the Mixture-of-Experts (MoE) layer, the routing mechanism itself exacerbated those anomalies, and a simple rollback could not resolve the issue.

DeepSeek adopted two fixes that are now used in actual training. Anticipatory Routing decouples routing index computation from backbone network updates; it triggers automatically only when a loss spike is detected and adds approximately 20% overhead. SwiGLU Clamping suppresses anomalies directly by clamping activation values to a fixed range. The report states that both approaches are effective but admits that "the underlying principles remain insufficiently understood."
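To make the clamping idea concrete, here is a minimal sketch of a SwiGLU activation with its inputs clamped to a fixed range. It is an illustration only, not DeepSeek's implementation: the function names, the choice to clamp both the gate and the up-projection values before the activation, and the `limit` of 10.0 are all assumptions, since the article does not say where the clamp is applied or what range is used.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def clamp(x: float, lo: float, hi: float) -> float:
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def swiglu_clamped(gate, up, limit=10.0):
    """SwiGLU with clamped inputs: silu(clamp(gate)) * clamp(up).

    `gate` and `up` are equal-length lists holding the two projected
    halves of a feed-forward layer. `limit` is a hypothetical bound
    chosen for illustration, not a value from the report.
    """
    out = []
    for g, u in zip(gate, up):
        g = clamp(g, -limit, limit)  # bound the gate pre-activation
        u = clamp(u, -limit, limit)  # bound the linear branch
        out.append(silu(g) * u)
    return out


# An outlier pair that would otherwise produce a huge activation
# is held below limit * limit.
print(swiglu_clamped([0.5, 1000.0], [2.0, 1000.0]))
```

With this kind of clamp, values inside the range pass through unchanged, while an outlier can contribute at most `limit * limit` in magnitude to the layer output, which is one plausible reading of "directly suppresses anomalies."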

Susan Zhang, a Google DeepMind researcher who previously worked at Meta AI and OpenAI, commented that the instability triggered by doubling training data "explains the delay." She described the two solutions as "band-aids" while acknowledging DeepSeek's technical transparency.
