On August 19, 2025, DeepSeek open-sourced its new-generation large model DeepSeek-V3.1 on the Hugging Face platform, and released the full version to global developers on August 21. The company positioned it as “the first step toward the Agent era,” significantly strengthening tool use and agent-task capabilities through post-training optimization, while also announcing API price adjustments and deep adaptation plans for domestic chips, a combination that drew wide attention across the industry.
DeepSeek, a company focused on low-cost, high-performance AI models, relies on large-scale GPU clusters for training and inference (for example, 2,048 NVIDIA H800 GPUs were used to train DeepSeek-V3). These clusters face a widening mismatch: GPU compute performance grows roughly 68% per year, while traditional storage and network bandwidth improve by only about 25% per year. The result is bandwidth starvation, with the gap projected to reach 47% by 2026.
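As a rough illustration, the back-of-the-envelope sketch below compounds those two growth rates year over year. The 2024 baseline and equal starting points are assumptions, but the gap lands near the cited 47% by 2026.

```python
# Compound 68%/yr compute growth against 25%/yr bandwidth growth.
# Baseline year and equal starting points are illustrative assumptions.
compute, bandwidth = 1.0, 1.0
for year in range(2024, 2028):
    gap = 1 - bandwidth / compute  # share of compute demand bandwidth can't cover
    print(f"{year}: compute={compute:.2f}x  bandwidth={bandwidth:.2f}x  gap={gap:.0%}")
    compute *= 1.68
    bandwidth *= 1.25
```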
With the rapid rise of large-scale AI models such as DeepSeek, global demand for computing power has reached unprecedented levels. Training massive models requires tens of thousands or even millions of GPUs working in parallel, where the efficiency of parameter synchronization and data transmission directly determines the speed and cost of training.
In this context, traditional 400G optical modules can no longer meet the high-bandwidth and low-latency requirements of large AI clusters. 800G optical modules have emerged as a core driver of network upgrades for computing infrastructure. By delivering higher data transmission rates, lower latency, and improved energy efficiency, 800G modules help overcome bandwidth bottlenecks in DeepSeek’s large-scale computing clusters.
Network Challenges Driven by the Compute Boom
In training DeepSeek-class models, GPUs must exchange massive volumes of data—such as gradient transfers and global parameter updates. When network bandwidth falls short, communication bottlenecks emerge, preventing GPUs from reaching their full performance potential.
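To get a feel for the scale, the sketch below estimates per-GPU traffic for a single ring all-reduce over a V3-sized gradient buffer. The parameter count, precision, and GPU count are illustrative assumptions, and real training shards and overlaps this traffic, so treat the result as an upper bound rather than a measured DeepSeek figure.

```python
# Per-GPU traffic for one ring all-reduce over the full gradient buffer.
# Illustrative assumptions: 671B FP16 gradients across 2048 GPUs.
params = 671e9
bytes_per_grad = 2                 # FP16
n_gpus = 2048
grad_bytes = params * bytes_per_grad
# Ring all-reduce: each GPU sends and receives 2*(N-1)/N of the buffer.
per_gpu_bytes = 2 * (n_gpus - 1) / n_gpus * grad_bytes
for gbps in (400, 800):
    secs = per_gpu_bytes * 8 / (gbps * 1e9)
    print(f"{gbps} Gb/s link: {per_gpu_bytes / 1e9:.0f} GB per GPU, ~{secs:.0f} s")
```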
By delivering higher data throughput, lower latency, and improved energy efficiency, 800G optical modules address these bottlenecks head-on, enabling DeepSeek-scale compute clusters to operate at maximum efficiency.
As DeepSeek continues to scale its AI models and infrastructure, the network layer faces unprecedented pressure. The key bandwidth bottlenecks can be summarized in three areas:
Next-generation models such as DeepSeek-V3 leverage Mixture of Experts (MoE) architectures, which demand massive all-to-all communication among experts. While today’s InfiniBand (IB) networks deliver up to 400Gbps, this capacity falls short as cluster performance expands from 2 ExaFLOPS to 10 ExaFLOPS. Traditional copper cables and legacy optical modules simply cannot handle the explosive traffic, resulting in lower training efficiency (with Model FLOPs Utilization, MFU, dropping to just 43.7%).
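MFU is simply achieved model FLOPs per second divided by the cluster's aggregate peak. The sketch below shows the calculation with illustrative inputs (V3-like 37B active parameters, an assumed cluster-wide throughput, H800-class peak compute) chosen to land near the 43.7% cited above.

```python
# MFU = achieved model FLOPs/s divided by aggregate peak FLOPs/s.
# ~6 FLOPs per parameter per token covers forward + backward passes.
active_params = 37e9     # MoE: only activated parameters count (V3-like)
tokens_per_sec = 4.0e6   # cluster-wide training throughput (assumed)
n_gpus = 2048
peak_flops = 989e12      # per-GPU BF16 dense peak (H800-class)

achieved = 6 * active_params * tokens_per_sec
mfu = achieved / (n_gpus * peak_flops)
print(f"MFU = {mfu:.1%}")  # ~43.8% with these assumed inputs
```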
Training workflows span preprocessing, forward and backward propagation, and checkpointing. Storage latency and bandwidth constraints can quickly magnify bottlenecks. DeepSeek is adopting PCIe 5.0/6.0 and CXL 3.0 memory pooling to narrow the storage–compute gap, but network infrastructure upgrades remain essential to keep pace with computation growth.
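Checkpointing makes the storage pressure concrete: a full optimizer state must be flushed while GPUs sit idle. The sketch below estimates that stall under an assumed Adam-style state layout and a few storage bandwidths; the 1 TB/s case corresponds to the photonic storage fabric discussed later in this article.

```python
# GPU idle time per checkpoint = optimizer state size / write bandwidth.
# State layout is an assumption: FP16 weights + FP32 master + 2 Adam moments.
params = 671e9
bytes_per_param = 2 + 4 + 4 + 4
state_bytes = params * bytes_per_param          # ~9.4 TB
for bw_gbs in (25, 100, 1000):                  # aggregate write bandwidth, GB/s
    print(f"{bw_gbs:>4} GB/s -> checkpoint stalls ~{state_bytes / (bw_gbs * 1e9):.0f} s")
```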
With cost-efficient models such as DeepSeek-V3 and R1, AI is rapidly moving toward the edge, generating exponentially more data. In industrial and factory deployments, optical communication nodes may increase 3–5x, requiring significantly higher bandwidth to enable the rollout of micro data centers.
Ultra-high bandwidth transmission:
800G optical modules deliver 800Gbps, double the rate of 400G, enabling massive parallel communication between GPUs in AI clusters. In NVIDIA HPC clusters, for example, 800G modules ensure low latency and high throughput, supporting DeepSeek's Multi-Plane network topology (a two-layer Fat-Tree that costs roughly half as much as a traditional Fat-Tree) and reducing cluster-level networking overhead. This lets InfiniBand or Ethernet links run at full utilization and optimizes MoE communication (e.g., the DeepEP communication library already saturates 400Gbps IB and can move to 800G in the future).
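The benefit of doubling the line rate shows up directly in MoE all-to-all dispatch time, which scales as bytes sent over link rate. The sketch below uses V3-like dimensions (hidden size 7168, top-8 routing) with an assumed per-GPU token count.

```python
# MoE all-to-all dispatch time over one NIC: bytes sent / link rate.
# Token count is assumed; hidden size and top-k are V3-like, for illustration.
tokens = 8192            # tokens resident on one GPU per micro-batch (assumed)
hidden = 7168            # hidden dimension
topk = 8                 # experts each token is routed to
bytes_per_elem = 2       # FP16 activations
dispatch_bytes = tokens * hidden * topk * bytes_per_elem   # ~0.94 GB
for gbps in (400, 800):
    ms = dispatch_bytes * 8 / (gbps * 1e9) * 1e3
    print(f"{gbps} Gb/s: dispatch ≈ {ms:.1f} ms")          # 800G halves the time
```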
Low power consumption with silicon photonics integration:
Leveraging silicon photonics (SiPh) and co-packaged optics (CPO), 800G modules reduce energy consumption by up to 50%, achieving around 5 pJ per bit, and support high-density deployment. DeepSeek’s future configuration (targeted for 2028) plans to use optical links to connect GPUs with photonic SSD fabrics, enabling over 1 TB/s of aggregate storage throughput and eliminating the “storage wall.”
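Energy per bit converts directly into link power: at 5 pJ/bit and 800 Gb/s, each link draws about 4 W. The quick sketch below also scales that to a cluster; the module count is an illustrative assumption.

```python
# Link power = energy per bit * line rate.
pj_per_bit = 5                      # figure cited above
rate_bps = 800e9
watts = pj_per_bit * 1e-12 * rate_bps
modules = 10_000                    # illustrative cluster-wide module count
print(f"{watts:.0f} W per link, ~{watts * modules / 1e3:.0f} kW for {modules} modules")
```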
Low-latency optimization:
In DeepSeek’s inference systems, prefill and decode stages adopt EP/DP parallel strategies. 800G modules help hide communication latency (e.g., via dual-batch overlapping techniques), boosting generation speed by 1.8×. This is critical for real-time AI applications such as intelligent customer service, which is seeing quarterly usage growth of over 270%.
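The 1.8× figure is what overlap arithmetic predicts: with dual-batch overlapping, one batch computes while the other communicates, so step time approaches max(compute, comm) rather than their sum. The millisecond values below are illustrative assumptions.

```python
# Dual-batch overlap: step time -> max(compute, comm) instead of compute + comm.
compute_ms, comm_ms = 10.0, 8.0     # illustrative per-step times
serial = compute_ms + comm_ms
overlapped = max(compute_ms, comm_ms)
print(f"speedup ≈ {serial / overlapped:.1f}x")  # 1.8x with these numbers
```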
Higher bandwidth for improved interconnect efficiency:
800G optical modules deliver 800Gbps per port, enabling high-speed connections between GPUs, between GPUs and switches, and between switches. This significantly reduces training latency. Compared with deploying multiple 400G modules, 800G solutions greatly simplify cabling and reduce the number of required ports.
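The port math is straightforward: for a fixed aggregate bandwidth, 800G halves the port and fiber count relative to 400G. The sketch below assumes a 51.2 Tb/s switch generation as the target; that capacity figure is illustrative.

```python
# Ports needed to provision a target aggregate bandwidth.
target_tbps = 51.2                  # illustrative switch ASIC capacity
for rate_gbps in (400, 800):
    ports = int(target_tbps * 1000 / rate_gbps)
    print(f"{rate_gbps}G: {ports} ports")      # 128 vs. 64
```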
Support for large-scale AI network architectures:
In high-performance interconnect topologies such as Fat-Tree and Dragonfly+, 800G modules boost bidirectional network bandwidth, prevent congestion, and ensure the efficient training of large-scale DeepSeek models.
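For intuition on why link rate matters in these topologies, the sketch below applies the classic three-tier k-ary fat-tree formulas: a non-blocking fat-tree provides full bisection, so doubling the link rate doubles bisection bandwidth. The switch radix is an illustrative assumption.

```python
# Non-blocking k-ary fat-tree: k^3/4 hosts, full bisection at line rate.
k = 64                              # switch radix (illustrative)
hosts = k**3 // 4
for gbps in (400, 800):
    bisection_tbps = hosts / 2 * gbps / 1000
    print(f"{gbps}G links: {hosts} hosts, bisection ≈ {bisection_tbps:,.0f} Tb/s")
```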
Reducing energy consumption and overall costs:
A single 800G optical module outperforms two 400G modules in terms of power consumption and port utilization, which can reduce the number of optical fibers and switch ports while optimizing the total cost of ownership (TCO) for data centers.
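A simple power comparison illustrates the TCO point; the per-module wattages below are assumptions in the typical range for pluggable optics, not vendor specifications.

```python
# Module count and power for the same aggregate bandwidth.
w_400, w_800 = 9.0, 14.0            # assumed per-module power draw (W)
aggregate_tbps = 102.4              # bandwidth to provision (illustrative)
n_400 = int(aggregate_tbps * 1000 / 400)
n_800 = int(aggregate_tbps * 1000 / 800)
print(f"400G: {n_400} modules, {n_400 * w_400 / 1e3:.2f} kW")
print(f"800G: {n_800} modules, {n_800 * w_800 / 1e3:.2f} kW")
```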
Reducing latency and power consumption:
By adopting silicon photonics and co-packaged optics (CPO), 800G modules cut energy consumption by up to 50% while supporting high-density deployment. This is crucial for DeepSeek's MoE models, where hiding communication latency improves inference speed by about 1.8×.
Future-oriented evolution node:
800G serves as an important transition from 400G to 1.6T optical interconnect evolution. From 2025 to 2026, 800G optical modules are expected to become the mainstream choice for AI and cloud computing data centers, laying the foundation for future higher-speed optical interconnects.
Typical deployment scenarios include the following.
AI/HPC Cluster Interconnect: 800G SR8/SR4 modules carry high-speed GPU traffic within and between racks.
Data Center Interconnect (DCI): 800G FR4/LR4 modules cover medium- and long-reach links across campuses and between data centers.
Liquid Cooling Deployment: Some 800G OSFP/QSFP-DD modules support liquid-cooled designs, addressing the heat dissipation challenges posed by high power density in DeepSeek clusters.
The explosion in AI computing power driven by DeepSeek has imposed far higher requirements on network bandwidth, latency, and energy efficiency. As the core interconnect component, 800G optical modules not only clear communication bottlenecks but also raise the overall compute utilization of large-scale clusters, making them the natural upgrade path for AI data centers. Looking ahead, as 1.6T and faster optical modules emerge, 800G will remain a key transition technology supporting next-generation AI infrastructure.