In the digital era, data centers have become core assets for enterprises and organizations, carrying the tasks of processing, storing, and analyzing massive volumes of data. To support the build-out of AI networks, Cisco's N9K series combines the latest AI technology with innovative hardware design, meeting today's network requirements while laying a solid foundation for future growth. The platform supports port speeds from 25G to 400G and can scale to 800G ports, ensuring the network can absorb traffic growth over the next several years. Through CloudScale and Silicon One technology, Cisco's AI network products deliver more intelligent traffic management and automated operations and maintenance, bringing users an unprecedented network experience.
With the rise of artificial intelligence (AI) and machine learning (ML), data center network architecture and operations are undergoing a profound change. Cisco, a leader in network technology, is applying the latest AI technology in its data center portfolio to redefine the intelligence and automation level of the network.
Data transmission efficiency and stability are crucial in AI networks. Cisco ensures the integrity and reliability of data transmission by using lossless network technology.
- RoCEv2 (RDMA over Converged Ethernet v2): allows remote direct memory access over Ethernet, reducing CPU load and improving data transmission efficiency.
- PFC (Priority Flow Control): pauses traffic on a per-priority basis so that lossless traffic classes are not dropped during congestion.
- ECN (Explicit Congestion Notification): marks packets when queues begin to build, notifying the sender to slow down before packet loss occurs (see the sketch after this list).
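To make the lossless behavior concrete, here is a minimal, hypothetical Python sketch of ECN-style congestion signaling: a switch queue marks packets once its depth crosses a threshold, and the sender reduces its window when an acknowledgment echoes the mark. The class names, thresholds, and rate-adjustment rule are illustrative assumptions, not Cisco's implementation.

```python
from dataclasses import dataclass
from collections import deque

# Illustrative thresholds -- not Cisco defaults.
ECN_MARK_THRESHOLD = 20      # queue depth (packets) at which marking begins
QUEUE_CAPACITY = 64          # packets the buffer holds before tail drop

@dataclass
class Packet:
    flow_id: int
    ecn_capable: bool = True
    ce_marked: bool = False   # Congestion Experienced bit

class EcnQueue:
    """Toy switch egress queue that marks packets instead of dropping them."""
    def __init__(self):
        self.q = deque()

    def enqueue(self, pkt: Packet) -> bool:
        if len(self.q) >= QUEUE_CAPACITY:
            return False                      # buffer full: tail drop
        if len(self.q) >= ECN_MARK_THRESHOLD and pkt.ecn_capable:
            pkt.ce_marked = True              # signal congestion without loss
        self.q.append(pkt)
        return True

class Sender:
    """Toy sender that halves its window when the receiver echoes a CE mark."""
    def __init__(self):
        self.window = 32.0

    def on_ack(self, ce_echoed: bool):
        if ce_echoed:
            self.window = max(1.0, self.window / 2)   # back off before loss
        else:
            self.window += 1.0                        # probe for bandwidth
```

The key point the sketch illustrates is that ECN acts before the buffer overflows, which is what keeps an RoCEv2 fabric effectively lossless.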
Through non-blocking, high-performance network technology, Cisco builds a powerful data exchange platform that provides the underlying foundation for the AI network:
- 400G/800G Clos: High-bandwidth Clos network architecture supports large-scale data exchange and meets the high-bandwidth requirements of data centers.
- UDF ECMP: hashing on user-defined packet fields to distribute traffic intelligently across equal-cost paths and avoid congestion.
- DLB (Dynamic Load Balancing): adjusts path selection according to real-time network conditions and optimizes resource allocation (see the sketch after this list).
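As a rough illustration of the difference between static hash-based ECMP and dynamic load balancing, the sketch below picks an uplink either by hashing a flow's 5-tuple or by choosing the currently least-loaded path. The function names and the load metric are assumptions for illustration only, not the switch's actual algorithm.

```python
import hashlib
from typing import Sequence

def ecmp_pick(five_tuple: tuple, n_paths: int) -> int:
    """Static ECMP: hash the flow's 5-tuple so every packet of a flow
    follows the same equal-cost path."""
    digest = hashlib.sha256(repr(five_tuple).encode()).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

def dlb_pick(path_load: Sequence[float]) -> int:
    """Dynamic load balancing: choose the currently least-loaded path,
    reacting to real-time link utilization instead of a fixed hash."""
    return min(range(len(path_load)), key=lambda i: path_load[i])

if __name__ == "__main__":
    flow = ("10.0.0.1", "10.0.0.2", 17, 49152, 4791)   # src, dst, proto, sport, dport
    print("ECMP path:", ecmp_pick(flow, 4))
    print("DLB path:", dlb_pick([0.70, 0.20, 0.55, 0.90]))
```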
To improve the adaptive capability of the AI network and prevent congestion from degrading data center performance, Cisco manages congestion through the following intelligent technologies:
- Smart Buffering: dynamically adjusts buffer allocation to adapt to changing traffic conditions.
- AFD (Approximate Fair Drop): a queue management mechanism that, during congestion, drops packets from flows exceeding their fair share, preserving network fairness and efficiency (a simplified sketch follows this list).
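The idea behind AFD can be sketched as follows: estimate each flow's arrival rate, compare it to a fair share of the congested queue, and drop packets from flows that exceed their share with a probability proportional to the overshoot. The parameters and drop rule below are simplified assumptions, not the N9K's actual implementation.

```python
import random
from collections import defaultdict

class ApproxFairDrop:
    """Simplified AFD-style dropper: flows above their fair share are
    dropped probabilistically; well-behaved flows pass untouched."""
    def __init__(self, fair_rate_bps: float):
        self.fair_rate_bps = fair_rate_bps     # assumed per-flow fair share
        self.flow_rate = defaultdict(float)    # measured arrival rate per flow

    def update_rate(self, flow_id: str, rate_bps: float):
        self.flow_rate[flow_id] = rate_bps

    def admit(self, flow_id: str) -> bool:
        rate = self.flow_rate.get(flow_id, 0.0)
        if rate <= self.fair_rate_bps:
            return True                        # under fair share: never drop
        # Drop probability grows with how far the flow exceeds its fair share.
        drop_p = 1.0 - self.fair_rate_bps / rate
        return random.random() >= drop_p

afd = ApproxFairDrop(fair_rate_bps=1e9)        # hypothetical 1 Gbps fair share
afd.update_rate("elephant", 4e9)
afd.update_rate("mouse", 2e8)
print("mouse admitted:", afd.admit("mouse"))         # always True
print("elephant admitted:", afd.admit("elephant"))   # dropped ~75% of the time
```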
At the same time, Cisco greatly simplifies AI network management through intelligent, automated solutions that provide deep manageability and visibility.
- Hardware Telemetry: monitors the status of network devices in real time and provides the data foundation for network management (see the sketch after this list).
- NDI (Nexus Dashboard Insights) and NDFC (Nexus Dashboard Fabric Controller): deliver deep analysis of network traffic and intelligent, centralized control of the fabric.
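As an illustration of how streamed hardware telemetry might be consumed, the sketch below parses hypothetical JSON records (queue depth, ECN marks, drop counters) and flags congested interfaces. The record format, field names, and thresholds are assumptions made for this example; they do not reflect the actual telemetry or NDI/NDFC data models.

```python
import json

# Hypothetical telemetry records as a switch might stream them to a collector.
RAW_RECORDS = """
{"node": "leaf-101", "interface": "Ethernet1/1", "queue_depth_bytes": 1800000, "ecn_marks": 4200, "drops": 0}
{"node": "leaf-101", "interface": "Ethernet1/2", "queue_depth_bytes": 12000, "ecn_marks": 0, "drops": 0}
"""

QUEUE_ALERT_BYTES = 1_000_000    # illustrative alert threshold

def analyze(lines: str):
    """Flag interfaces whose queues are deep or that are already dropping."""
    for line in lines.strip().splitlines():
        rec = json.loads(line)
        congested = rec["queue_depth_bytes"] > QUEUE_ALERT_BYTES or rec["drops"] > 0
        if congested:
            print(f"ALERT {rec['node']} {rec['interface']}: "
                  f"queue={rec['queue_depth_bytes']}B ecn_marks={rec['ecn_marks']}")

analyze(RAW_RECORDS)
```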
N9K is the industry's first platform to support RoCEv2 over VXLAN EVPN, enabling data center networks to achieve more efficient data transmission and lower latency. By implementing RoCEv2 on the overlay network, the N9K platform helps enterprises protect their AI/ML and enterprise network investments and allocate resources optimally.
RoCEv2 is an extension of RoCEv1, which can only communicate within a single Layer 2 domain (VLAN). By introducing IP and UDP headers, RoCEv2 packets can be routed across L2 and L3 networks and forwarded by IP routers. The UDP header, with its well-known destination port 4791, serves as a stateless encapsulation layer for the RDMA transport protocol over IP.
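The header layering described above can be sketched by hand with Python's struct module: an RDMA payload with a placeholder Base Transport Header is wrapped in UDP (destination port 4791, the IANA-assigned RoCEv2 port), then IP and Ethernet; for the VXLAN EVPN overlay case, the inner frame is additionally wrapped in a VXLAN/UDP(4789)/IP outer header. The addresses, VNI, and simplified headers (no checksums or options, outer Ethernet omitted) are illustrative only.

```python
import struct

def eth_header(dst_mac: bytes, src_mac: bytes, ethertype: int = 0x0800) -> bytes:
    return dst_mac + src_mac + struct.pack("!H", ethertype)

def ipv4_header(src: str, dst: str, payload_len: int, proto: int = 17) -> bytes:
    # Minimal IPv4 header: version/IHL, TOS, total length, id, flags/frag,
    # TTL, protocol, checksum (left as 0), source and destination addresses.
    ip = lambda a: bytes(int(x) for x in a.split("."))
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + payload_len,
                       0, 0, 64, proto, 0, ip(src), ip(dst))

def udp_header(sport: int, dport: int, payload_len: int) -> bytes:
    return struct.pack("!HHHH", sport, dport, 8 + payload_len, 0)

ROCEV2_UDP_PORT = 4791   # IANA-assigned destination port for RoCEv2
VXLAN_UDP_PORT = 4789    # IANA-assigned destination port for VXLAN

# Placeholder for the InfiniBand Base Transport Header (12 bytes) plus payload.
bth_and_payload = b"\x00" * 12 + b"rdma-payload"

# Inner frame: Ethernet / IP / UDP(4791) / BTH+payload -- routable across L3.
inner_frame = (eth_header(b"\x00\x11\x22\x33\x44\x55", b"\x66\x77\x88\x99\xaa\xbb")
               + ipv4_header("192.0.2.1", "192.0.2.2", 8 + len(bth_and_payload))
               + udp_header(49152, ROCEV2_UDP_PORT, len(bth_and_payload))
               + bth_and_payload)

# VXLAN EVPN overlay: outer IP / UDP(4789) / VXLAN (flags + VNI) / inner frame.
vni = 10010
vxlan = struct.pack("!I", 0x08 << 24) + struct.pack("!I", vni << 8)
overlay = (ipv4_header("10.1.1.1", "10.2.2.2", 8 + len(vxlan) + len(inner_frame))
           + udp_header(55555, VXLAN_UDP_PORT, len(vxlan) + len(inner_frame))
           + vxlan + inner_frame)

print(f"inner RoCEv2 frame: {len(inner_frame)} bytes, overlay packet: {len(overlay)} bytes")
```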
Its core features include:
Data Center Bridging (DCB)
RoCEv2 leverages Data Center Bridging (DCB) technology, a quality of service (QoS) architecture for Ethernet networks. DCB ensures the priority and bandwidth allocation of network traffic through the 802.1Qaz standard.
DCBX protocol
The Data Center Bridging Exchange (DCBX) protocol allows switches and network adapters to automatically negotiate DCB parameters such as Priority Flow Control (PFC, 802.1Qbb). Combined with Explicit Congestion Notification (ECN, RFC 3168), this optimizes network performance and avoids congestion.
Enhanced Transmission Selection (ETS)
ETS is defined in the 802.1Qaz standard and allows more fine-grained bandwidth allocation, ensuring that traffic of different priorities receives appropriate network resources.
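A simplified way to picture ETS-style bandwidth allocation: each traffic class gets a weight, guaranteed bandwidth is divided in proportion to those weights, and share that a class does not use is redistributed to classes that still have demand. The class names, weights, and demands below are illustrative assumptions, not a standards-accurate scheduler.

```python
def ets_allocate(link_bps: float, weights: dict, demand_bps: dict) -> dict:
    """Divide link bandwidth among traffic classes in proportion to ETS-style
    weights, redistributing unused share to classes that still have demand."""
    alloc = {c: 0.0 for c in weights}
    remaining = link_bps
    active = {c for c in weights if demand_bps.get(c, 0.0) > 0}
    while active and remaining > 1e-6:
        total_w = sum(weights[c] for c in active)
        satisfied, spent = set(), 0.0
        for c in active:
            share = remaining * weights[c] / total_w
            take = min(share, demand_bps[c] - alloc[c])
            alloc[c] += take
            spent += take
            if alloc[c] >= demand_bps[c] - 1e-6:
                satisfied.add(c)
        remaining -= spent
        if not satisfied:
            break                      # every active class used its full share
        active -= satisfied
    return alloc

# Hypothetical classes on a 400G link: RoCE storage, RoCE compute, best-effort.
print(ets_allocate(400e9,
                   {"roce_storage": 50, "roce_compute": 30, "best_effort": 20},
                   {"roce_storage": 120e9, "roce_compute": 250e9, "best_effort": 200e9}))
```

In this example the storage class needs less than its weighted share, so the leftover bandwidth is split between the compute and best-effort classes in proportion to their weights.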
RoCEv2 is an Ethernet-based remote direct memory access (RDMA) technology that enables efficient data transfer within the data center. Compared with the traditional TCP/IP protocol, RoCEv2 delivers higher performance and lower latency by offloading data movement from the CPU. It is ideal for applications requiring low latency and high throughput, such as high-performance computing (HPC), storage area networks (SAN), and virtualized environments, supporting them by reducing transmission delay and improving network reliability, and it plays a key role in building efficient, reliable AI network infrastructure.