Table of Contents
- What You Need to Know
- Key Questions Answered
- Core Findings
- Contradictions & Debates
- Deep Analysis
- Implications
- Future Outlook
- Unknowns & Open Questions
- Evidence Map
- References
What You Need to Know
China’s AI push is one of the most consequential tech-geopolitical contests of this century. The central finding is paradoxical: US export controls have both blocked China from frontier hardware and accelerated a domestic ecosystem that is less capable but increasingly viable. Huawei’s Ascend chips have made fast iterative progress (910A → 910B → 910C → 910D), but they’re still stuck with SMIC’s sub‑50% yields on 7nm DUV, a software stack 12 years behind CUDA [8], and critical HBM dependencies on foreign suppliers [14], [16].
The most important analytical insight: a training vs. inference asymmetry created by the architecture of export controls. The US has successfully targeted arithmetic performance (critical for training), giving it a ~4‑year lead and a ~3× cost advantage in training price‑performance [13]. But controls left memory bandwidth (critical for inference) essentially unrestricted, so the US has zero lead in inference hardware [13]. Since inference costs dominate real‑world AI deployment, China’s AI deployment ecosystem can stay highly competitive even with a meaningful training hardware deficit.
DeepSeek’s R1 model—reportedly built for $5.6–6 million using pre‑sanctions A100 chips [10], [12]—triggered a $600 billion single‑day loss in NVIDIA market cap [12]. The big unresolved question: is “good enough AI” at 50–60% of frontier performance, combined with open‑source models, cheaper inference, and massive domestic scale, enough to shift global economic gravity? The evidence suggests “yes” for commercial deployment and inference, but “not yet” for frontier model training through at least 2027.
Overall confidence: MEDIUM. Too many critical claims lack independent verification: no standardized MLPerf benchmarks exist for any chip discussed, and the yield, production-volume, and cost figures do not come from verified sources.
Key Questions Answered
Can Chinese AI materially shift global economic gravity away from the West?
Partially—through inference economics and open‑source diffusion, not frontier training supremacy. China is unlikely to match Western frontier training by 2030, but that may matter less than assumed. Evidence: Chinese chips already near parity in inference cost‑performance [13]; DeepSeek R1 achieves global adoption at a fraction of Western cost [10], [12]; cheap Chinese hardware + cheap/free models create a vertically integrated alternative stack [10], [12]; and the Global South is a massive addressable market where Chinese AI’s cost advantages are decisive [5], [10].
Can China build frontier AI without the West?
Not yet at frontier scale, but increasingly “good enough” for commercial and strategic purposes. The 910C achieves roughly 50–60% of H100 real‑world training performance and 60–70% inference [3], [4], [6], [8], [11], [14]. Over 90% of 130 notable Chinese language models (2017–2024) were trained on Western hardware [8]. However, software innovations and architectural efficiency are enabling competitive outputs despite inferior hardware [10], [12], [14].
Are sanctions slowing China or accelerating independence?
Both—but acceleration may be exceeding constraint. All sources agree controls have constrained access to EUV, advanced foundry nodes, and HBM [5], [6], [7], [8], [9], [11], [13], [14]. But they also forced Huawei to redesign chips for domestic fabrication [14], drove Chinese tech giants toward domestic alternatives [1], [2], [3], [4], [7], [8], catalyzed software efficiency innovations like DeepSeek’s architecture [10], [12], [14], and created a government‑backed procurement pipeline [15], [16]. NVIDIA’s share of the Chinese market dropped from 90% pre‑controls to ~50% as of May 2025 [8].
Is "good enough AI" economically sufficient?
For inference—increasingly yes. For frontier training—not yet. The Ascend 910C at $12,000–$28,000 vs. H100 at $25,000–$44,000 offers competitive price‑performance for inference [8], [14], [16]. DeepSeek’s $5.6–6 million development cost suggests software innovation can dramatically cut hardware requirements. However, training reliability remains a “critical weakness” of Chinese hardware [11], and no data exists on Chinese chips used for 1T+ parameter training.
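One rough way to see why these price ranges are "competitive" for inference: normalize each chip's price by its relative inference throughput, treating the H100 as 1.0 and the 910C as ~0.6 per the figures cited above. This is my back-of-envelope restatement of the sources' numbers, not a figure any source computes:

```python
# Price per unit of H100-equivalent inference throughput, using only the
# ranges cited above: 910C at $12,000-$28,000 and ~0.6x H100 inference
# speed, H100 at $25,000-$44,000 and 1.0x. Lower is better.
ascend_price = (12_000, 28_000)
h100_price = (25_000, 44_000)
ascend_rel_perf, h100_rel_perf = 0.6, 1.0

ascend_cost_per_unit = tuple(p / ascend_rel_perf for p in ascend_price)
h100_cost_per_unit = tuple(p / h100_rel_perf for p in h100_price)

print(ascend_cost_per_unit)  # (20000.0, ~46667) per throughput unit
print(h100_cost_per_unit)    # (25000.0, 44000.0)
# The ranges overlap heavily: whether the 910C actually wins on inference
# price-performance depends on where street prices fall within each range.
```

The overlap is the point: at the low end of its price range the 910C undercuts the H100 per unit of inference throughput; at the high end it does not.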
Core Findings
1. Semiconductor Independence: Huawei’s Ascend Program
Hardware Evolution
| Chip | Year | Process | FP16 TFLOPS | Memory | Power | Source(s) |
|---|---|---|---|---|---|---|
| Ascend 910A | 2019 | TSMC N7+ (7nm EUV) | ~256 | HBM2e | 310–350W | [14], [16] |
| Ascend 910B | 2023 | SMIC N+1 (7nm DUV) | ~320 | 64GB HBM2e, ~400 GB/s | ~400W | [6], [9], [14] |
| Ascend 910C | 2024 | SMIC N+2 (7nm DUV) | ~800 | 96GB HBM2e or 128GB HBM3 | 310–600W | [3], [4], [14], [15] |
| Ascend 910D | 2025 | SMIC 7nm | 1,200 (claimed) | HBM2e, ~800 GB/s | 350W | [3], [5] |
The 910C uses a dual‑die chiplet design on SMIC’s 7nm DUV, ~53 billion transistors [11], [14], [15], [16]. It’s a ~3× FP16 throughput improvement over the 910A within the same effective process generation—achieved entirely through architectural optimization, since SMIC’s DUV can’t match TSMC’s density [14], [16].
The Performance Gap
Most consistently cited: DeepSeek tests show the 910C delivers ~60% of H100 inference throughput on large language models [3], [4], [6], [8], [11]. Source 14 provides more granularity:
- Training: 910C ≈ 50–60% of H100 [14]
- Inference: 910C ≈ 60–70% of H100 [14]
- Memory bandwidth: 910C ≈ 54% of H100 (~1,800 GB/s vs. ~3,350 GB/s) [14]
- Interconnect bandwidth: HCCS ≈ 44% of NVLink (~400 GB/s vs. 900 GB/s) [14]
The interconnect gap is the most damaging deficit. Huawei’s HCCS delivers ~56 GB/s per link vs. NVLink’s 400–900 GB/s of aggregate bandwidth per GPU, and there’s no NVSwitch equivalent [9]. At the 8‑GPU node level, that is ~392 GB/s vs. 3,200–7,200 GB/s, an 8–18× disadvantage [9]. Distributed training suffers most.
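The 8-GPU arithmetic can be reproduced directly from the per-link figures; note that [9] appears to compare an Ascend per-chip aggregate against an NVIDIA node total, and this sketch simply mirrors that framing as a sanity check, not an independent measurement:

```python
# Reproducing the 8-GPU interconnect comparison attributed to [9].
hccs_per_link = 56            # GB/s per HCCS link
links_per_chip = 7            # full mesh among 8 chips: 7 links each
nvlink_per_gpu = (400, 900)   # GB/s aggregate NVLink per GPU (A800..H100)

ascend_per_chip = hccs_per_link * links_per_chip        # 392 GB/s
nvidia_node = tuple(bw * 8 for bw in nvlink_per_gpu)    # 3200-7200 GB/s

print(ascend_per_chip)                                   # 392
print(tuple(round(bw / ascend_per_chip, 1) for bw in nvidia_node))
# (8.2, 18.4) -> the 8-18x disadvantage quoted above
```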
The FLOP/s gap between 910C (~752–800 TFLOP/s FP16) and NVIDIA B200 (2,250 TFLOP/s FP16) is ~3× as of 2024 [8], down from an order of magnitude in 2018 [8]. Chinese memory bandwidth has improved ~24% per year (2017–2025) vs. ~13% for non‑Chinese chips [8].
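If those growth rates held, the bandwidth gap would close on a roughly knowable timescale. The sketch below compounds the cited rates from the 910C's ~54% bandwidth position; it assumes constant growth rates indefinitely, which is a strong assumption, so treat it as an illustration rather than a forecast:

```python
import math

# Illustrative compounding of the cited rates: Chinese memory bandwidth
# growing ~24%/yr vs ~13%/yr elsewhere [8], starting from the 910C at
# ~54% of H100 bandwidth [14]. Solve 0.54 * (1.24/1.13)**t = 1 for t.
start_ratio = 0.54
cn_growth, non_cn_growth = 1.24, 1.13

years_to_parity = math.log(1 / start_ratio) / math.log(cn_growth / non_cn_growth)
print(round(years_to_parity, 1))  # ~6.6 years at these constant rates
```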
Critical caveat: No independent, third‑party benchmark of any Ascend chip exists in these sources. All performance claims come from Huawei disclosures, leaks, or limited DeepSeek testing [1], [2], [3], [4], [5], [6], [7].
Source Disagreements on 910C Specs
The sources contradict one another directly:
| Metric | Source 14 | Source 15 | Source 4 |
|---|---|---|---|
| FP16 TFLOPS | ~800 | ~800 | 780 BF16 |
| Memory Capacity | 96GB HBM2e | 128GB HBM3 | Not specified |
| Memory Bandwidth | ~1,800 GB/s | ~3.2 TB/s | 3.2 TB/s |
| Power (TDP) | 600W | ~310W | ~310W |
| Price | $12,000–$18,000 | Not specified | ~$28,571 (derived) |
These could be sub‑variants or steppings. The 910B3 variant introduced HBM3e with 1.2 TB/s bandwidth [16]. The power discrepancy (310W vs 600W) is huge for data center economics and remains unresolved.
2. Export Controls: A Double‑Edged Sword
The evolution:
- Oct 2022: Dual thresholds: aggregate bidirectional transfer rate <600 GB/s AND aggregated processing performance (TOPS × operand bit width) <4,800, i.e. <300 FP16 TFLOPS [9]. NVIDIA created the A800 and H800 with throttled NVLink [9], legally bypassing the rules [13].
- Oct 2023: Closed the A800 loophole by halving the arithmetic ceiling to 50% of A100 levels and removing network bandwidth as a control variable [13]. Forced the H20, retaining only ~15% of H200 arithmetic power but similar memory and network bandwidth [13].
- Dec 2024: Set a 2 GB/s/mm² bandwidth density limit on standalone HBM exports but exempted co‑packaged HBM, allowing H20 exports to continue [13]. Two BIS documents totaled 210 pages [13].
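The "bit TOPS" metric above (total processing performance, TOPS multiplied by operand bit width) converts to familiar per-precision ceilings as follows; this is my restatement of the thresholds described in [9] and [13], not BIS's own wording:

```python
# Converting the Oct 2022 "aggregated processing" threshold of 4,800
# (TOPS x operand bit width) into per-precision ceilings.
TPP_LIMIT = 4_800

for bits in (8, 16):
    ceiling = TPP_LIMIT / bits
    print(f"{bits}-bit ceiling: {ceiling:.0f} TOPS/TFLOPS")
# 8-bit ceiling: 600 TOPS/TFLOPS
# 16-bit ceiling: 300 TOPS/TFLOPS  -> the <300 FP16 TFLOPS figure above
```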
The resultant asymmetry is the most analytically important finding: export controls have been highly effective at restricting arithmetic performance (training) but essentially ineffective at restricting memory bandwidth (inference) [13]. The US has a ~4‑year lead in training hardware but zero lead in inference hardware [13]. Since inference costs dominate real‑world AI spend, the economic impact of controls is far less severe than headline specs suggest.
NVIDIA’s China revenue: ~$17 billion (13% of total) in FY2025 [7]. JPMorgan estimates up to $16 billion lost in 2025 worst‑case [7]. NVIDIA already forecast a $5.5 billion hit from the H20 ban [3], [7]. On April 28, 2025, NVIDIA shares fell over 2% following reports that Huawei is entering final testing for the Ascend 910D [7].
3. Foundation Models and the DeepSeek Disruption
DeepSeek is the poster child for China’s software push. Founded late 2023 by hedge fund manager Liang Wenfeng [10], [12]:
- R1 reportedly cost $5.6–6 million to develop [10], [12]—but these figures are unverified and likely exclude prior R&D, salaries, data acquisition, and infrastructure.
- Built using pre‑sanctions NVIDIA A100 chips [12].
- Surpassed ChatGPT in app store downloads shortly after launch [10].
- Open‑source release [10], [12].
- Uses inference‑time computing (selectively activating model portions per query) to cut costs [12].
- Achieved #3 on Chatbot Arena leaderboard [1].
- Marc Andreessen called it “AI’s Sputnik moment” [10].
Market carnage: NVIDIA fell 17% (a $600B loss, the largest single‑day market‑capitalization loss in US stock market history), ASML fell 6%, Broadcom fell 17%, GE Vernova fell 21%, Vistra fell 28%, Nasdaq fell 3% [12].
Critical limitations: Both primary sources for DeepSeek claims are Bitrue cryptocurrency exchange blog posts—not tech research outlets. No benchmark data (MMLU, HumanEval, GSM8K) is provided [10], [12]. The $5.6–6M figure’s methodology is unspecified. Comparing a single training run to US companies spending “billions” is potentially misleading.
More credible corroboration: Tom’s Hardware reports DeepSeek provides native support for Huawei’s CANN kernels and CUDA‑to‑CANN conversion [11]. DeepSeek’s testing of the 910C positions it as a serious hardware-software integration player [11].
Open‑source as strategic weapon: DeepSeek’s open‑source release “democratizes access to advanced AI for smaller companies and developing nations” [10]. Combined with cheap Chinese hardware, this creates a vertically integrated alternative stack.
4. Software Ecosystem: The Hidden Moat
NVIDIA’s CUDA (2007) vs. Huawei’s CANN (2019)—a 12‑year head start [8]:
- CUDA: mature tools (TensorRT, cuDNN), deep PyTorch/TensorFlow integration [5], [9].
- CANN: bug‑prone, unstable, poorly documented [8]. Developers debug without community support; model optimization depends on Huawei and is slow [6].
- Fragmented stacks: MindSpore, PaddlePaddle, etc. [15].
Quantified impact: Over 90% of 130 notable Chinese language models (2017–2024) were trained on Western hardware [8]. The first model trained entirely on Chinese hardware (iFLYTEK Spark 3.5) came out only in January 2024 [8]. DeepSeek’s next model has reportedly been delayed due to the effort of porting to Ascend [15].
Bright spots: DeepSeek’s optimization of its V4 model for Ascend is “the most significant validation of the 910C ecosystem” [14]. CANN is now “production‑grade for Transformer workloads” [14]. As models converge on Transformers, NVIDIA’s ecosystem advantage may erode [11] (confidence 0.5).
5. Manufacturing, Supply Chain, and Production Scale
Yield Crisis
SMIC’s 7nm yields: reported <50% vs. TSMC ~90% [8]. Source 16 says N+2 yield improved from ~20% to 40–50% [16] (very low confidence). Source 3 says ~30% [3] (from a crypto blog). The real figure is unknown. No EUV access means SMIC is stuck with DUV [3], [5], [7], [8].
Production Volumes
| Metric | Figure | Source(s) | Confidence |
|---|---|---|---|
| 910B manufactured in 2024 | ~200,000–400,000 units | [8], [16] | Medium |
| NVIDIA GPUs delivered to China 2024 | ~1,000,000 | [8] | Medium |
| 2025 combined 910B+C target | 400,000–1,000,000 units | [7], [8], [16] | Low |
| 910C initial order | 70,000 chips (~$2B) | [1], [2], [4] | Medium |
| 910C production capacity | 26,000 wafers/month | [16] | Low |
Source 16 admits specs and shipment figures are “largely mysterious” and based on “unreliable information” [16]. The most consistent number is the 70,000‑unit initial order [1], [2], [4].
Supply Chain Dependencies
- HBM still from SK Hynix/Samsung, not domestic Changxin Memory [14], [16]. Both are under US pressure.
- Shadow supply chain: An estimated 2 million or more Ascend 910B logic dies may have been manufactured by TSMC for Huawei shell companies after controls took effect [8], a legally precarious arrangement.
- Samsung disruption risk: Samsung reportedly paused production of Baidu’s 4nm Kunlun chip designs [15].
- Localization rate: Claimed >90% [16], but the source may be promotional.
6. The Broader Domestic Chip Ecosystem
Beyond Huawei:
- Baidu Kunlun P800: 345 TFLOPS FP16, 30,000‑chip cluster, Qianfan‑VL models trained on Kunlun. Orders worth >1B yuan (~$139M) from China Mobile [15]. Baidu stock up 64% over the year.
- Alibaba T‑Head PPU: 96GB HBM, PCIe 5.0, pitched as H20 rival. A China Unicom data center runs 16,000 PPUs out of 22,000 chips [15].
- Cambricon MLU 590: 345 TFLOPS FP16, 7nm, FP8 support in 2023. Returned to profitability late 2024; share price jumped ~500% [15].
Key observation: Most offerings are “barely comparable to NVIDIA’s A100 from 2020” [15]. The 910C, at 50–60% of H100 training performance [14], marks the domestic ceiling.
7. System‑Level Scaling and the "Good Enough AI" Thesis
Chinese firms compensate with scale:
- Huawei Atlas 950 SuperPoD (2026 H2): 8,192 Ascend chips, 8 EFLOPS FP8, 1,152 TB total memory, 16.3 PB/s interconnect [6], [15]. Atlas 960 plans for 15,488 chips [15].
- CloudMatrix 384 (384 Ascend 910C chips) claimed stronger than NVIDIA GB200 NVL72 [16]—confidence very low (0.35).
- MoE architectures (like DeepSeek V3) require less all‑reduce communication, scaling better on constrained interconnects [14].
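A toy illustration of why MoE eases the interconnect constraint, using a generic top-k router (not DeepSeek's actual implementation): only k of E experts run per token, so expert compute, and the communication it generates, scales with k/E rather than with total model size.

```python
# Generic top-k expert routing (illustrative only, not DeepSeek's code).
# With E experts and k active per token, per-token expert work scales with
# k/E -- one reason MoE models tolerate weaker interconnects better than
# dense models of equal total parameter count.
NUM_EXPERTS, TOP_K = 8, 2

def route(scores):
    """Return the indices of the TOP_K highest-scoring experts."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:TOP_K]

gate_scores = [0.1, 0.7, 0.05, 0.3, 0.9, 0.2, 0.15, 0.4]  # one token's gates
active = route(gate_scores)
print(active)               # [4, 1] -> only experts 4 and 1 fire
print(TOP_K / NUM_EXPERTS)  # 0.25 -> 25% of expert capacity used per token
```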
The “good enough” thesis says China doesn’t need to match NVIDIA chip‑for‑chip. Supporting evidence: DeepSeek R1’s frontier results on constrained hardware [1]; 910B roughly matching A100 in FP16 TFLOPS (320 vs 312) [9]; 60% inference performance acceptable for many commercial deployments [3], [4], [11]; lower power potentially reducing TCO [4], [5].
Against: training performance gap is largest, and no source shows successful frontier‑scale training on Chinese chips [1]–[8]. CUDA “software tax” remains significant [2], [6], [8].
8. Energy Efficiency and Compute Economics
Power data is contradictory:
| Chip | Source 14 | Source 15 | Source 5 | Source 4 |
|---|---|---|---|---|
| Ascend 910B | Not specified | 400W | — | — |
| Ascend 910C | 600W | ~310W | — | ~310W |
| Ascend 910D | — | — | 350W | — |
| NVIDIA H100 | 700W | 700W | 700W | ~700W |
If the 910C draws 310W, its peak perf/W (800 TFLOPS / 310W ≈ 2.58 TFLOPS/W) would beat the H100’s (990 TFLOPS / 700W ≈ 1.41 TFLOPS/W), which sits uneasily with the ~60% real‑world inference claim. No source reconciles the discrepancy [3], [4], [14], [15].
The 910D’s claimed 1.2 PFLOPS at 350W [5] would be dramatically superior if true. Confidence in energy efficiency claims: LOW.
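Because the TDP itself is disputed, the efficiency conclusion flips depending on which figure is right. A quick sensitivity check, using only the peak-spec numbers cited in this section:

```python
# Peak-spec perf/W under the two disputed 910C TDP figures ([14] vs [15]/[4]).
h100 = 990 / 700          # ~1.41 TFLOPS/W
ascend_310w = 800 / 310   # ~2.58 TFLOPS/W if the 310W figure is right
ascend_600w = 800 / 600   # ~1.33 TFLOPS/W if the 600W figure is right

print(round(h100, 2), round(ascend_310w, 2), round(ascend_600w, 2))
# 1.41 2.58 1.33 -> at 310W the 910C leads on paper; at 600W it trails
# the H100. The unresolved TDP flips the efficiency conclusion entirely.
```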
9. Global South Influence and Market Strategy
Huawei targets non‑Western markets: China, Middle East, Russia, countries less aligned with US [5]. 910D reportedly 30–40% cheaper than comparable NVIDIA solutions [5]. Adoption barriers remain: political concerns and lack of trust [7]. Huawei’s market cap fell from ~$500B (2020) to ~$160B (2025) [5]; NVIDIA’s reached ~$2.2–2.6T [5].
No quantitative data on actual Global South adoption exists in these sources. Government procurement provides a domestic floor: China Mobile ordered >1B yuan of Kunlun chips [15]; China Unicom runs 16,000 Alibaba PPUs [15]. But telecom operators reportedly prefer mixing suppliers [15].
Contradictions & Debates
1. Ascend 910C Specifications: A Fundamentally Uncertain Product
The biggest unresolved contradictions:
- Memory: 96GB HBM2e at ~1,800 GB/s [14] vs. 128GB HBM3 at ~3.2 TB/s [15]
- Power: 600W [14] vs. ~310W [4], [15]
- Price: $12,000–$18,000 [14] vs. ~$25,000–$28,000 [1], [4], [16]
- Performance vs. H100: 50–60% training [14] vs. ~80% at FP16 [16]
Both sources 14 and 15 are rated “High” confidence in the chunk reports—this is a genuine disagreement.
2. Can Huawei Ever Catch NVIDIA?
- “Will likely never catch” [2] (editorial, low confidence)
- “910C matches/exceeds H100” [1] (Reddit, very low)
- “~60% of H100” [3], [4], [6], [8], [11] (most cited, low‑medium)
- “Gap narrowed from order of magnitude to ~3×” [8] (medium‑high)
- Jensen Huang: “serious competitor” [3] (strategic statement)
- “Most domestic offerings barely comparable to A100 from 2020” [15] (medium)
Synthesis: the gap has narrowed but is still significant. Most important nuance: gap is far smaller for inference than training.
3. Production Scale: 100,000 to 1.4 Million
Sources range from 100K+ to 1.4M Ascend units for 2025 [1], [4], [7], [8], [16]. Most grounded: 400K–1M combined 910B+C [8], [16]. The 1.4M figure comes from an anonymous Reddit post [1]. The most solid number is the 70,000‑unit initial order [1], [2], [4].
4. Are Export Controls Working?
For: Huawei forced from TSMC to SMIC [11]; 910C at 50–60% H100 training [14]; 90% of Chinese models on Western HW [8]; SMIC yields abysmal [8]; H20 caused buyer dissatisfaction [3].
Against: DeepSeek built a competitive model on pre‑sanctions HW [10], [12]; software innovations reduce HW dependency [10], [12], [14]; NVIDIA’s China share fell from 90% to ~50% [8]; $600B NVIDIA market cap loss [12]; shadow supply chains (2M+ TSMC dies [8]).
Assessment: Controls impose real costs on hardware but fail to prevent competitive AI development, especially in inference. They forced the creation of a domestic ecosystem that wouldn’t have existed otherwise.
5. Hardware Specs vs. Software Innovation
Source 9 emphasizes hardware specs. Sources 10 & 12 argue software can compensate. Source 14 notes MoE can offset interconnect constraints. This is an unresolved, consequential question.
Deep Analysis
The Training vs. Inference Asymmetry
The most important analytical finding [13]:
- Training needs massive arithmetic throughput. BIS controls have successfully restricted arithmetic, giving the US a ~3× cost advantage (a ~4‑year lead) in training HW.
- Inference depends mainly on memory bandwidth. BIS controls left memory bandwidth essentially unrestricted [13]. The H20, with only ~15% of H200’s arithmetic power, retains similar memory bandwidth [13].
Practical implication: China can serve (infer from) frontier models at near‑US‑parity costs. Its disadvantage is concentrated in training new frontier models. Since inference costs dominate total AI spend, the economic impact of controls is far less severe than headline gaps suggest.
This suggests a bifurcated strategy: ByteDance and Tencent use NVIDIA GPUs for training (when available) and Huawei Ascend for inference [8]. As inference demand grows relative to training, Chinese HW becomes progressively more competitive in the workload that matters most for economic deployment—without ever matching NVIDIA for training.
The "Sanctions Made China Stronger" Thesis
Evidence strongly supports a contrarian view. Before sanctions, Chinese firms bought NVIDIA chips at scale with no incentive for alternatives. Controls:
- Forced Huawei to redesign Ascend for SMIC’s DUV process [14].
- Created a domestic software ecosystem (MindSpore/CANN) [14], [15].
- Drove tech giants to domestic options—ByteDance ordered 100,000 Ascend 910B [3]; Baidu, ByteDance, Tencent collaborating on Ascend [4]; hybrid approaches [8].
- Created a government‑backed procurement pipeline with regulatory mandates [7], [15], [16].
- Catalyzed software innovations like DeepSeek [10], [12], [14].
- Forced NVIDIA to accept permanent market share loss—from 90% to ~50% [8].
Limits: Yield constraints (below 50% at SMIC [8]) and no EUV access mean sanctions still impose real costs.
Is Cheap Inference More Powerful Than Smarter Models?
DeepSeek suggests cost efficiency may matter more than raw capability for commercial apps. R1 achieved competitive results for $5.6–6M (unverified [10], [12]) while surpassing ChatGPT in downloads [10]. Open‑source release enables deployment on any compatible HW [10], [12]. Inference‑time computing (selectively activating model portions [12]) reduces per‑query costs. If this trajectory continues, China could deliver 80–90% of frontier capabilities at 10–20% of cost—a “good enough” proposition many commercial users would accept.
China’s Path to AI Dominance Without Frontier Silicon
Combining all evidence, China’s most viable path to influence by 2030 doesn’t require matching Western frontier HW:
- Dominate inference economics through cheaper chips (Ascend at 50–80% of NVIDIA pricing [14], [16]) and lower power.
- Open‑source model diffusion that achieves global adoption regardless of HW constraints [10], [12].
- Massive domestic scale (potentially 1M+ Ascend chips by 2026 [8], [16]) creating network effects and maturing the software ecosystem.
- Algorithmic innovation (MoE, inference‑time computing, distillation) reducing HW requirements [10], [12], [14].
- Global South infrastructure built on Chinese HW + models at price points NVIDIA can’t match [5], [10].
This path wouldn’t be technological supremacy in the traditional sense—but it could produce the default AI infrastructure for most of the world’s population.
Implications
For US Export Control Policy
The evidence suggests controls need fundamental redesign:
- The training/inference asymmetry [13] means current controls constrain training but not deployment. To limit deployment, target memory bandwidth.
- The December 2024 exemption of co‑packaged HBM [13] preserves H20 exports but leaves a vulnerability.
- Shadow supply chains (2M+ TSMC dies for Huawei [8]) represent enforcement failures.
- The “sanctions as accelerator” effect [5], [7], [14], [16] means diminishing marginal returns.
- Market share loss is permanent—NVIDIA’s 90%→50% [8] is unlikely to reverse even if controls are relaxed.
For Global AI Competition
- DeepSeek’s cost claims challenge the assumption that massive capex is required for frontier AI [10], [12] (though verification is needed).
- Two parallel ecosystems are emerging: US‑optimized for frontier training and high‑performance inference; Chinese for cost‑efficient inference and open‑source deployment.
- The “good enough” segment could be captured by Chinese HW and models, especially in price‑sensitive markets.
- NVIDIA faces a structural threat from Chinese software innovations reducing demand for frontier compute, not from Chinese HW matching its specs.
For NVIDIA and Western Semiconductor Firms
- The Chinese market—previously NVIDIA’s second‑largest—is being permanently lost to domestic alternatives [7], [8].
- The $16B worst‑case revenue loss [7] could accelerate lobbying for relaxed export controls.
- Jensen Huang’s acknowledgement of Huawei as a “serious competitor” [3] may reflect real concern or strategic positioning.
- Western chip companies face deflationary pressure as Chinese alternatives set cost benchmarks.
For the Global South
- Open‑source Chinese models offer accessible AI without Western licensing costs [10], [12].
- Chinese HW provides a sanctions‑resistant infrastructure option at 30–40% lower cost [5].
- The combination creates a vertically integrated alternative stack developing nations may prefer.
- No quantitative data on actual adoption—this is a critical unknown.
For Global Labor Markets and Deflation
Not directly covered in sources, but Chinese AI cost efficiency will likely:
- Put deflationary pressure on Western AI pricing.
- Accelerate automation across white‑collar and manufacturing sectors.
- Further enhance Chinese manufacturing competitiveness (though direct evidence on AI + manufacturing is absent).
- Compress wages in AI‑adjacent sectors globally.
Future Outlook
Scenario A — “Sanctioned but Self‑Sufficient” (15–25%)
By 2030, SMIC yields >60%, Huawei closes per‑chip gap to within 80% of latest NVIDIA, annual Ascend production >2M. HBM secured through domestic or continued Korean access. CANN matures to support frontier training. Chinese models rival Western ones on domestic HW. Global South defaults to Chinese infrastructure.
Key dependencies: Breakthroughs in yields, HBM localization, software ecosystem—all three must succeed.
Scenario B — “Fragmented AI World” (Base Case) (40–50%)
Two parallel ecosystems solidify by 2030: US stack (NVIDIA + OpenAI + hyperscalers) for frontier training; Chinese stack (Huawei + DeepSeek/Qwen/Ernie + domestic cloud) for cost‑efficient inference and open source. Global South adopts Chinese for price‑sensitive apps; Western enterprises stay on US. Training gap narrows to ~1.5–2× but persists. Hybrid approaches continue. CUDA advantage remains the most durable asymmetry.
This is the trajectory most consistent with current evidence.
Scenario C — “Chinese Cost Shock” (10–15%)
DeepSeek’s efficiency innovations prove generalizable. China delivers 80–90% of frontier capabilities at 10–20% cost by 2030. Western AI companies face severe deflation. NVIDIA’s revenue base erodes. The $500B US “Stargate” initiative [12] and projected $1T total US AI investment [12] prove overcommitment.
Key dependencies: DeepSeek’s cost claims must be replicable; sustained software innovation; cheap inference beats smarter models.
Scenario D — “Taiwan Crisis / Compute Supply Disruption” (5–10%)
A Taiwan crisis cuts off TSMC production. China’s domestic Ascend production, though inferior, becomes the only AI HW available at scale. The shadow supply chain is eliminated. Annual Ascend output of 400,000–1,000,000 units becomes strategically decisive. But China also loses its remaining NVIDIA HW and faces HBM disruption.
This is the extreme test of the “good enough AI” thesis.
Scenario E — “Open‑Source Dominance” (15–20%)
Chinese open‑source models become the global default for most commercial AI by 2030. Western proprietary models retain frontier edge but lose market share in the larger commercial inference market. Chinese firms monetize through cloud, HW sales, and ecosystem lock‑in.
Key dependencies: Chinese models must maintain competitive quality; government support for open source; Western response (e.g., Meta’s LLaMA).
Unknowns & Open Questions
- Independent benchmarks: No MLPerf or equivalent for any Ascend chip [1]–[16].
- True yield rates: 30% [3] vs. <50% [8] vs. 40–50% [16]—unknown.
- Ascend 910C true specs: Memory (96 vs. 128 GB), bandwidth (1.8 vs. 3.2 TB/s), power (310 vs. 600 W), price ($12–18K vs. $25–28K).
- Training vs. inference split at scale: No cluster‑scale training data for Chinese chips.
- Software ecosystem maturity: How many production workloads run on CANN/MindSpore? Developer size? CUDA porting penalty?
- DeepSeek’s actual benchmark performance against GPT‑4, Gemini, LLaMA [10], [12].
- What does the $5.6–6M figure include? Training compute only? Total R&D?
- How many A100 chips does DeepSeek possess? Thousands vs. hundreds changes interpretation.
- HBM supply chain resilience under escalating US pressure [14]—no quantification of risk or stockpiles.
- Gray market GPU flows—extent of unofficial NVIDIA hardware acquisition.
- Global South adoption data—no quantitative evidence.
- Real‑world cluster performance for 1,000+ Ascend clusters.
- Software ecosystem tipping point—when does Ascend/CANN become self‑sustaining?
- Next‑gen roadmap feasibility—Ascend 970 (2028) and Atlas 960 [15] are aspirational.
- Algorithmic efficiency offsets—how much can MoE, quantization, distillation compensate for HW?
Evidence Map
| Theme | Sources | Confidence | Key Gap |
|---|---|---|---|
| 910C ~60% H100 inference | [3], [4], [6], [8], [11] | Low‑Med | No independent benchmarks |
| HW gap narrowed to ~3× | [8], [14] | Med‑High | Directional consistency |
| SMIC yields <50% | [3], [8], [16] | Low‑Med | Single‑source, inconsistent |
| Software ecosystem 12 yrs behind CUDA | [5], [6], [8], [15] | High | Consensus |
| Export controls driving domestic adoption | [1]–[8], [13], [14] | High | Consensus across clusters |
| Training/inference asymmetry | [13] | High | Single rigorous source |
| DeepSeek cost claims ($5.6–6M) | [10], [12] | Low | Unverified, promotional |
| DeepSeek market impact ($600B loss) | [12] | Med | Widely reported but crypto source |
| Production volumes (400K–1M) | [7], [8], [16] | Low | Wide discrepancy |
| Shadow supply chain (2M+ TSMC dies) | [8] | Med | Single source |
| HBM from SK Hynix/Samsung | [14], [16] | Med | No risk quantification |
| Interconnect gap (8–18×) | [9] | High | Datasheet‑based |
| Power efficiency (310W vs 700W) | [4], [5], [15] | Low | Contradictory |
| Jensen Huang acknowledges Huawei | [3] | Med | Strategic statement |
| Domestic ecosystem (Baidu, Alibaba, Cambricon) | [15] | Med | Single source |
| Global South adoption | [5], [7], [10] | Low | Speculative only |
References
1. Huawei's Ascend 910C chip matches NVIDIA's H100 - https://reddit.com/r/deeplearning/comments/1ihecl0/huaweis_ascend_910c_chip_matches_nvidias_h100
2. Huawei vs NVIDIA: Chip Performance Comparison - https://asapdrew.com/p/huawei-vs-nvidia-chip-performance
3. Huawei vs NVIDIA: Ascend Chip Performance 2025 - https://bitrue.com/blog/huawei-vs-nvidia-ascend-chip-performance-2025
4. Huawei Ascend 910C vs Nvidia H100: A Big Leap Towards AI Independence | LinkedIn - https://linkedin.com/pulse/huawei-ascend-910c-vs-nvidia-h100-big-leap-towards-ai-qureshi-xi84e
5. Huawei's Ascend 910D: The Silent Challenger to Nvidia's AI Crown – A Deep Global Perspective (2025) - https://semiconductorsinsight.com/huawei-ascend-910d-vs-nvidia-h100
6. Comparing Ascend 910B and NVIDIA H100 - https://github.com/lzwjava/jekyll-ai-blog/blob/main/notes/2026-03-28-ascend-910b-vs-h100-en.md
7. Why Huawei's New AI Chip Isn't a Global Threat to Nvidia Yet - https://tecknexus.com/why-huaweis-new-ai-chip-isnt-a-global-threat-to-nvidia-yet
8. Why China Isn't About to Leap Ahead - https://epochai.substack.com/p/why-china-isnt-about-to-leap-ahead
9. GPU Performance Datasheets: NVIDIA & Huawei/HiSilicon - https://arthurchiao.art/blog/gpu-data-sheets
10. What is DeepSeek AI? - https://bitrue.com/blog/what-is-deepseek-ai
11. DeepSeek research suggests Huawei's Ascend 910C delivers 60% NVIDIA H100 inference performance - https://tomshardware.com/tech-industry/artificial-intelligence/deepseek-research-suggests-huaweis-ascend-910c-delivers-60-percent-nvidia-h100-inference-performance
12. DeepSeek AI: Chinese Innovation Cripples BTC, Nvidia Impact - https://bitrue.com/blog/deepseek-ai-chinese-innovation-cripples-btc-nvidia-impact
13. US export controls on China and their impact on AI - https://epoch.ai/gradient-updates/us-export-controls-china-ai
14. Huawei Ascend 910C - Awesome Agents AI Hardware Analysis - https://awesomeagents.ai/hardware/huawei-ascend-910c
15. Who Will Fill Nvidia's AI Chip Void? - https://recodechinaai.substack.com/p/who-will-fill-nvidias-ai-chip-void
16. A Brief Introduction to Huawei Ascend Cloud - https://medium.com/@huaweiclouddevelper/a-brief-introduction-to-huawei-ascend-cloud-cbef8f25bc34