The Fast Lane - High-Speed Networks, Congestion Control, and QoS Demystified
Welcome to the express lane of computer networking! As our digital demands have skyrocketed, so has the need for networks that can handle vast amounts of data quickly and efficiently. From foundational technologies that paved the way for faster communication to sophisticated mechanisms for managing traffic and ensuring quality of service, the world of high-speed networks is both fascinating and crucial.
In this extensive post, we'll journey through various high-speed WAN and LAN technologies, delve into the challenges of network congestion and how it's managed, explore how TCP performs in such environments, and finally, unravel architectures designed to provide differentiated Quality of Service (QoS) for diverse applications. Buckle up!
Foundations of High-Speed Wide Area Networking
Before today's multi-gigabit internet connections became commonplace, several key technologies laid the groundwork for high-speed data transmission over wide areas.
Frame Relay Networks
Frame Relay was a popular WAN technology in the 1990s, offering a more efficient alternative to older X.25 packet-switching networks.
- Overview: It's a packet-switching telecommunication service designed for cost-efficient data transmission for intermittent traffic between LANs and between endpoints in a WAN.
- Virtual Circuits: Frame Relay operates using virtual circuits (VCs), which are logical connections between two endpoints.
- Permanent Virtual Circuits (PVCs): Pre-configured by the service provider, offering a constant connection.
- Switched Virtual Circuits (SVCs): Established on demand, similar to a phone call.
- Frames: Data is transmitted in variable-length packets called frames. Error checking (but not correction) was done at the endpoints rather than by the network, simplifying the network's job and improving speed.
- It provided higher performance and lower delay than X.25 by streamlining protocol processing at the nodes.
Asynchronous Transfer Mode (ATM)
ATM was a highly ambitious technology designed to be a single, unified network for voice, video, and data, offering high speeds and QoS guarantees.
- ATM Protocol Architecture: ATM has its own layered model:
- Physical Layer: Defines the physical medium and transmission characteristics.
- ATM Layer: Responsible for cell relay, routing using Virtual Path Identifiers (VPIs) and Virtual Channel Identifiers (VCIs), and multiplexing/demultiplexing cells.
- ATM Adaptation Layer (AAL): Bridges the gap between the services provided by the ATM layer and the needs of higher-layer protocols (like IP or voice). It segments higher-layer data into ATM cells at the sender and reassembles them at the receiver. Different AAL types (AAL1, AAL2, AAL3/4, AAL5) were defined for different types of traffic.
- ATM Logical Connection: Connections in ATM are identified by a VPI and a VCI in the cell header. A Virtual Path (VP) can contain multiple Virtual Channels (VCs). This two-tiered addressing provided flexibility in managing network resources.
- ATM Cell: A key feature of ATM is its use of small, fixed-size packets called cells. Each cell is 53 bytes long: 5 bytes for the header (containing VPI/VCI, payload type, cell loss priority, etc.) and 48 bytes for the payload. The fixed size was chosen to enable fast hardware switching and to minimize queuing delays for real-time traffic (a header-parsing sketch follows this list).
- ATM Service Categories: ATM defined several service categories to support different QoS requirements:
- Constant Bit Rate (CBR): For applications needing a fixed, guaranteed bandwidth and low delay (e.g., uncompressed voice, circuit emulation).
- Variable Bit Rate (VBR):
- Real-Time VBR (rt-VBR): For applications with variable bit rates but strict timing requirements (e.g., compressed video conferencing).
- Non-Real-Time VBR (nrt-VBR): For applications with bursty traffic but no strict timing (e.g., some data transfers).
- Available Bit Rate (ABR): For applications that can adapt their transmission rate based on network feedback (e.g., data applications that can tolerate delays). The network provides feedback to sources on the available bandwidth.
- Unspecified Bit Rate (UBR): A best-effort service with no guarantees, for non-critical applications.
- ATM Adaptation Layer (AAL): As mentioned, this layer adapts higher-level PDUs to the 48-byte payload of ATM cells.
- AAL1: For CBR services, connection-oriented.
- AAL2: For rt-VBR services like compressed voice/video.
- AAL3/4: For VBR data services, connection-oriented or connectionless (merged from two original proposals); its complexity limited adoption.
- AAL5: For VBR data services, simpler and more efficient than AAL3/4, widely used for IP over ATM.
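To make the cell format concrete, here is a minimal Python sketch that unpacks the 5-byte UNI header (at the NNI, the GFC bits are absorbed into a wider VPI field); the example values are illustrative:

```python
def parse_atm_header(cell: bytes) -> dict:
    """Extract the header fields from a 53-byte ATM cell (UNI format)."""
    assert len(cell) == 53, "ATM cells are always exactly 53 bytes"
    h = int.from_bytes(cell[:5], "big")  # the 40 header bits as one integer
    return {
        "gfc": (h >> 36) & 0xF,     # Generic Flow Control
        "vpi": (h >> 28) & 0xFF,    # Virtual Path Identifier
        "vci": (h >> 12) & 0xFFFF,  # Virtual Channel Identifier
        "pt":  (h >> 9) & 0x7,      # Payload Type
        "clp": (h >> 8) & 0x1,      # Cell Loss Priority (1 = discard first)
        "hec": h & 0xFF,            # Header Error Control (CRC-8 of bytes 1-4)
        "payload": cell[5:],        # the fixed 48-byte payload
    }

# Example: pack a header with VPI 1, VCI 5 by hand and parse it back.
h = (1 << 28) | (5 << 12)
cell = h.to_bytes(5, "big") + bytes(48)
fields = parse_atm_header(cell)
print(fields["vpi"], fields["vci"])  # 1 5
```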
While ATM didn't become the single unifying network as envisioned (Ethernet's evolution and IP's flexibility largely took over), many of its concepts influenced modern networking, especially in QoS and traffic management.
High-Speed Local Area Networks (LANs)
While WAN technologies were evolving, LANs also saw dramatic speed increases, primarily driven by Ethernet.
Fast Ethernet
- Increased Ethernet speed from 10 Mbps to 100 Mbps.
- Maintained the same CSMA/CD mechanism and frame format as 10 Mbps Ethernet for compatibility.
- Introduced auto-negotiation, allowing devices to automatically select the highest common speed (10/100 Mbps) and duplex mode (half/full).
- Primarily used with twisted-pair copper cabling (e.g., 100BASE-TX) and fiber optics.
Gigabit Ethernet
- Increased Ethernet speed to 1 Gbps (1000 Mbps).
- While it can use CSMA/CD in half-duplex mode (especially with hubs, though rare), Gigabit Ethernet is almost exclusively used in full-duplex mode with switches, making CSMA/CD irrelevant.
- Supports both copper (e.g., 1000BASE-T over Cat 5e/6) and fiber optic cabling (e.g., 1000BASE-SX, 1000BASE-LX) for longer distances.
- Introduced features like jumbo frames (payloads larger than the standard 1500-byte Ethernet MTU) to improve efficiency for large data transfers.
Further evolutions include 10 Gigabit Ethernet, 40 GbE, 100 GbE, and beyond, constantly pushing the speed limits for LANs and data centers.
Fibre Channel
- A high-speed network technology (supporting speeds from 1 Gbps to 128 Gbps and higher) primarily used for Storage Area Networks (SANs).
- Provides in-order, lossless delivery of raw block data.
- Supports multiple topologies: point-to-point, arbitrated loop, and switched fabric (most common).
- Uses its own protocol stack (FC-0 to FC-4) and can carry various upper-layer protocols, including SCSI (for storage commands), IP, and FICON (for mainframe connectivity).
- Known for its reliability and performance in connecting servers to shared storage devices.
Taming the Rush: Queuing, Congestion, and Traffic Management
As network speeds increase, so does the potential for bottlenecks and congestion if traffic isn't managed properly.
Queuing Analysis and Queuing Models
- Queuing theory is the mathematical study of waiting lines (or queues). In networks, queues form at routers and switches when packets arrive faster than they can be processed or transmitted.
- Single Server Queues: A basic model (M/M/1 in Kendall's notation) assumes arrivals following a Poisson process, exponential service times, and a single server (e.g., an output link of a router). Analysis of such models predicts average queue length, waiting time, and packet loss probability; a small calculator for these metrics follows this list.
- Understanding queuing behavior is fundamental to designing buffers and managing network performance.
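The standard closed-form M/M/1 results make a handy back-of-the-envelope calculator. A minimal Python sketch (rates in packets per second; the formulas require the arrival rate to stay below the service rate):

```python
def mm1_metrics(lam: float, mu: float) -> dict:
    """Closed-form M/M/1 results: lam = arrival rate, mu = service rate."""
    assert lam < mu, "the queue is unstable unless arrival rate < service rate"
    rho = lam / mu  # utilization of the server (the output link)
    return {
        "utilization": rho,
        "avg_pkts_in_system": rho / (1 - rho),
        "avg_pkts_in_queue": rho ** 2 / (1 - rho),
        "avg_time_in_system_s": 1 / (mu - lam),
        "avg_wait_in_queue_s": rho / (mu - lam),
    }

# Example: a link serving 1000 pkt/s with 800 pkt/s arriving -> 80% load,
# an average of 4 packets in the system, and 5 ms average time in system.
print(mm1_metrics(800, 1000))
```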
Effects of Congestion
When too much traffic is aimed at a network link or node beyond its capacity, congestion occurs, leading to:
- Increased Delay: Packets spend more time waiting in queues.
- Increased Jitter: Variation in packet delay, detrimental for real-time audio/video.
- Packet Loss: Router/switch buffers overflow, and newly arriving packets are dropped. This can trigger retransmissions by higher-level protocols (like TCP), potentially exacerbating congestion if not handled carefully.
- Reduced Throughput: As more packets are lost and retransmitted, the effective goodput (useful data transfer rate) can decrease significantly.
Congestion Control
Mechanisms to prevent, detect, and alleviate congestion.
- Open-Loop Congestion Control (Prevention): Policies at admission time to prevent congestion from happening in the first place (e.g., admission control, traffic shaping).
- Closed-Loop Congestion Control (Reaction): Uses feedback from the network to dynamically adjust sending rates.
- Explicit Feedback: Routers explicitly signal congestion to end-systems (e.g., ICMP Source Quench, ECN bits in IP, RM cells in ATM ABR).
- Implicit Feedback: End-systems infer congestion from events like packet loss (detected via timeouts or duplicate ACKs) or increased RTT. TCP congestion control is a prime example.
Traffic Management
Involves techniques to control the amount and characteristics of traffic entering the network or flowing through it.
- Traffic Shaping (e.g., Leaky Bucket, Token Bucket): Smooths out bursty traffic by regulating the rate at which packets are sent into the network, often by buffering excess packets (a token-bucket sketch follows this list).
- Traffic Policing: Monitors traffic for conformance to a pre-negotiated traffic contract. Non-conforming traffic might be dropped, marked with lower priority, or shaped.
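To make the token-bucket idea concrete, here is a minimal Python sketch; the rate and burst values are illustrative, and a real shaper would queue non-conforming packets rather than simply reject them:

```python
import time

class TokenBucket:
    """Tokens accrue at `rate` bytes/s up to a depth of `burst` bytes."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate              # sustained rate, bytes per second
        self.burst = burst            # bucket depth = maximum burst, bytes
        self.tokens = burst           # start with a full bucket
        self.last = time.monotonic()

    def allow(self, packet_size: int) -> bool:
        """True if the packet conforms; tokens are consumed on success."""
        now = time.monotonic()
        # Refill for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_size:
            self.tokens -= packet_size
            return True
        return False  # non-conforming: drop (policing) or delay (shaping)

# Example: 1 Mbps (125,000 B/s) with a 10 kB burst allowance.
tb = TokenBucket(rate=125_000, burst=10_000)
print(tb.allow(1500))  # True while the burst allowance lasts
```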
Congestion Control in Packet Switching Networks
General strategies include:
- End-to-end Congestion Control: No explicit support from the network; end systems infer and react to congestion (e.g., TCP).
- Network-assisted Congestion Control: Routers provide explicit feedback to end systems about congestion (e.g., ECN, ATM ABR's explicit rate).
Frame Relay Congestion Control
Frame Relay networks had mechanisms to indicate congestion, carried in header bits that are decoded in the sketch after this list:
- Forward Explicit Congestion Notification (FECN): A bit in the frame header set by a congested switch, sent "downstream" towards the destination. The destination DTE could then inform its source.
- Backward Explicit Congestion Notification (BECN): A bit in the frame header set by a congested switch, sent "upstream" towards the source. The source DTE could then reduce its sending rate.
- Discard Eligibility (DE) bit: Frames could be marked with DE=1 if they were considered lower priority (e.g., traffic exceeding a committed information rate). Congested switches would discard DE-marked frames first.
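For concreteness, here is a minimal Python sketch that decodes the two-byte Q.922 address field carrying the DLCI and these three bits; the example frame values are illustrative:

```python
def parse_fr_address(b0: int, b1: int) -> dict:
    """Decode the default 2-byte Frame Relay (Q.922) address field."""
    return {
        "dlci": ((b0 >> 2) << 4) | (b1 >> 4),  # 10-bit virtual-circuit id
        "fecn": (b1 >> 3) & 1,  # congestion seen in the frame's direction
        "becn": (b1 >> 2) & 1,  # congestion on the reverse path
        "de":   (b1 >> 1) & 1,  # Discard Eligibility: drop this frame first
    }

# Example: DLCI 100 with BECN set, telling the source to slow down.
print(parse_fr_address(0b00011001, 0b01000101))
```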
TCP Performance and Congestion Control in High-Speed Environments
TCP's mechanisms were originally designed for networks quite different from today's high-speed, potentially high-latency "long fat networks" (LFNs) and from specialized networks like ATM.
Retransmission Timer Management
- Exponential RTO Backoff: As covered in previous blogs, when TCP retransmits a segment due to a timeout, it doubles the Retransmission Timeout (RTO) value for subsequent retransmissions of that same segment. This helps avoid further overwhelming a congested network.
- Karn’s Algorithm: Addresses ambiguity in RTT measurements for retransmitted segments.
- Do not update RTT estimates using acknowledgments for retransmitted segments (because you don't know if the ACK is for the original transmission or a retransmission).
- When a timeout occurs, use the backed-off RTO. Only after a segment is successfully acknowledged without retransmission should normal RTT estimation and RTO calculation resume. Both rules appear in the sketch below.
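A minimal Python sketch of both rules working together, using the RFC 6298 smoothing constants; the 60-second cap is a common implementation choice rather than part of the algorithm:

```python
class RtoEstimator:
    def __init__(self):
        self.srtt = None    # smoothed round-trip time
        self.rttvar = None  # round-trip time variance estimate
        self.rto = 1.0      # seconds; initial value per RFC 6298

    def on_ack(self, rtt_sample: float, was_retransmitted: bool) -> None:
        if was_retransmitted:
            return  # Karn's rule: the sample is ambiguous, so discard it
        if self.srtt is None:  # first valid sample
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt_sample)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt_sample
        self.rto = max(1.0, self.srtt + 4 * self.rttvar)

    def on_timeout(self) -> None:
        self.rto = min(self.rto * 2, 60.0)  # exponential backoff, capped
```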
Window Management
TCP uses a sliding window for flow control (the receiver's advertised window) and congestion control (the congestion window, cwnd). The amount of unacknowledged data TCP can have in flight is min(cwnd, advertised window). Effective window management, especially for cwnd (slow start, congestion avoidance, fast retransmit/recovery), is crucial for performance; a sketch of the classic dynamics follows.
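A minimal Python sketch of the classic Reno-style dynamics, with all quantities in MSS units for simplicity:

```python
class CongestionWindow:
    def __init__(self):
        self.cwnd = 1.0       # congestion window, in segments (MSS units)
        self.ssthresh = 64.0  # slow-start threshold

    def effective_window(self, rwnd: float) -> float:
        # At most min(cwnd, receiver's advertised window) may be in flight.
        return min(self.cwnd, rwnd)

    def on_ack(self) -> None:
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start: doubles every RTT
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance: ~+1 per RTT

    def on_timeout(self) -> None:
        self.ssthresh = max(self.cwnd / 2, 2.0)  # multiplicative decrease
        self.cwnd = 1.0                          # restart from slow start
```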
Performance of TCP over ATM
Running TCP/IP over ATM presented challenges:
- Cell Loss: ATM networks could drop individual cells. Losing a single cell from a segmented IP packet corrupted the entire packet, forcing TCP to retransmit the whole segment even though most of its cells arrived intact.
- Segmentation Overhead: The AAL5 segmentation and reassembly process added overhead.
- Fairness: TCP's congestion control, designed for shared packet networks, didn't always interact well with ATM's own traffic management, especially for ABR service, where explicit rate feedback was available but not directly used by standard TCP.
- Solutions like Explicit Congestion Notification (ECN) and specific TCP modifications were explored to improve performance.
Traffic and Congestion Control in ATM
ATM had a sophisticated framework for traffic management to support its diverse service categories.
- Requirements & Attributes: Defined by traffic descriptors (e.g., Peak Cell Rate - PCR, Sustainable Cell Rate - SCR, Maximum Burst Size - MBS) and QoS parameters (e.g., Cell Loss Ratio - CLR, Cell Transfer Delay - CTD, Cell Delay Variation - CDV).
- Traffic Management Framework: Included connection admission control (CAC), usage parameter control (UPC)/network parameter control (NPC), priority control, and congestion control.
- Traffic Control (UPC/NPC): Policing functions at the User-Network Interface (UNI) and the Network-Network Interface (NNI), respectively, to ensure traffic conforms to negotiated parameters.
- ABR (Available Bit Rate) Traffic Management: Designed for data traffic that can tolerate delays and adjust its sending rate.
- ABR Rate Control: A closed-loop, rate-based flow control mechanism. The network provides feedback to the source about the currently available bandwidth.
- RM (Resource Management) Cells: Special cells sent by the source interspersed with data cells. Switches along the path can modify fields in these RM cells (e.g., Explicit Rate - ER, Congestion Indication - CI, No Increase - NI) to indicate available capacity or congestion. When the RM cell returns to the source, the source adjusts its Allowed Cell Rate (ACR); the source-side adjustment is sketched after this list.
- ABR Capacity Allocations: Switches use various algorithms to fairly allocate available bandwidth among ABR connections.
- GFR (Guaranteed Frame Rate) Traffic Management: A service category that aimed to provide a minimum frame rate guarantee for frame-based traffic (like IP packets) over ATM, with less complexity than VBR.
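A hedged Python sketch of the source-side reaction to a returned RM cell; the actual TM 4.0 source behavior rules are considerably more detailed, and the RIF/RDF defaults here are only illustrative:

```python
def adjust_acr(acr, pcr, mcr, er, ci, ni, rif=1/16, rdf=1/16):
    """Update the Allowed Cell Rate (cells/s) from RM-cell feedback."""
    if ci:            # Congestion Indication: multiplicative decrease
        acr -= acr * rdf
    elif not ni:      # No-Increase clear: additive increase is allowed
        acr += rif * pcr
    acr = min(acr, er, pcr)  # honor the Explicit Rate and the Peak Cell Rate
    return max(acr, mcr)     # never fall below the Minimum Cell Rate

# Example: a congested switch set CI=1, so the source backs off.
print(adjust_acr(acr=10_000, pcr=50_000, mcr=1_000, er=40_000, ci=1, ni=0))
```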
Architectures for Quality of Service (QoS)
Traditional IP networks offer a "best-effort" service. QoS architectures aim to provide different levels of service to different traffic flows or classes, enabling better support for applications with specific performance requirements (e.g., voice, video).
Integrated Services Architecture (IntServ)
- Approach: A fine-grained, flow-based architecture that provides explicit QoS guarantees to individual application sessions or flows. It requires applications to request specific QoS levels and for network routers to reserve resources for those flows.
- Components:
- Admission Control: Routers decide if a new flow's QoS request can be met without impacting existing guarantees.
- Resource Reservation Protocol (RSVP): See below.
- Packet Scheduler: Manages how packets from different flows are queued and transmitted to meet their QoS promises (e.g., using WFQ).
- Routing Protocol: Standard routing protocols to find paths.
- Services:
- Guaranteed Service: Provides firm bounds on end-to-end datagram delay and assured bandwidth.
- Controlled-Load Service: Aims to provide performance similar to that of an unloaded network, with very low probability of packet loss or excessive delay.
- Challenges: Scalability is a major issue, as routers need to maintain per-flow state and perform per-flow processing.
Queuing Disciplines for QoS
How routers schedule packets from their queues significantly impacts QoS.
- FQ (Fair Queuing): Aims to provide fair allocation of bandwidth among active flows by maintaining separate queues for each flow and servicing them in a round-robin fashion, often approximating bit-by-bit round robin.
- PS (Priority Scheduling/Queuing): Packets are classified into priority classes. High-priority queues are always served before low-priority queues. Can lead to starvation for low-priority traffic.
- BRFQ (Bit-by-bit Round Robin Fair Queuing): A theoretical ideal for FQ where queues are serviced one bit at a time from each active flow.
- GPS (Generalized Processor Sharing): The theoretical ideal for fair bandwidth sharing, where each flow i with weight w_i is guaranteed a service rate of at least R × w_i / (the sum of the weights of all active flows), where R is the link capacity.
- WFQ (Weighted Fair Queuing): A practical approximation of GPS. It assigns weights to different flows (or classes of traffic) and schedules packets to give each flow its weighted fair share of the bandwidth, while also aiming to provide good delay characteristics for low-volume flows. A simplified scheduler sketch follows this list.
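A simplified Python sketch of the WFQ idea: each arriving packet is stamped with a virtual finish time F = max(V, flow's last F) + size/weight, and packets are served in increasing F order. Real WFQ tracks virtual time against the GPS reference precisely; here V simply advances to the finish time of each packet served:

```python
import heapq

class WeightedFairQueue:
    def __init__(self, weights: dict):
        self.weights = weights                   # flow id -> weight
        self.finish = {f: 0.0 for f in weights}  # last finish time per flow
        self.vtime = 0.0  # coarse virtual-time approximation
        self.heap = []    # (finish_time, seq, flow, size)
        self.seq = 0      # tie-breaker so heap entries stay comparable

    def enqueue(self, flow: str, size: int) -> None:
        f = max(self.vtime, self.finish[flow]) + size / self.weights[flow]
        self.finish[flow] = f
        heapq.heappush(self.heap, (f, self.seq, flow, size))
        self.seq += 1

    def dequeue(self):
        f, _, flow, size = heapq.heappop(self.heap)
        self.vtime = f  # advance virtual time (the simplification)
        return flow, size

# Example: flow "a" has twice flow "b"'s weight, so it is served ~2x as often.
wfq = WeightedFairQueue({"a": 2, "b": 1})
for _ in range(3):
    wfq.enqueue("a", 1500)
    wfq.enqueue("b", 1500)
while wfq.heap:
    print(wfq.dequeue())
```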
Random Early Detection (RED)
- An Active Queue Management (AQM) algorithm designed for congestion avoidance.
- Instead of waiting for a queue to become completely full before dropping packets (tail drop), RED monitors the average queue size.
- As the average queue size increases, RED starts randomly dropping incoming packets with increasing probability, even before the queue is full (see the drop-decision sketch after this list).
- This early notification (via packet drops) signals TCP senders to reduce their congestion windows, helping to prevent severe congestion and global synchronization of TCP back-offs. Variations like WRED (Weighted RED) can drop packets based on IP precedence or DSCP values.
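A minimal Python sketch of the drop decision; the thresholds and averaging weight are illustrative, and refinements from the original paper (the packet-count correction, "gentle" RED) are omitted:

```python
import random

class Red:
    def __init__(self, min_th=5, max_th=15, max_p=0.1, wq=0.002):
        self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
        self.wq = wq   # EWMA weight for the average queue size
        self.avg = 0.0

    def should_drop(self, queue_len: int) -> bool:
        """Called on each arrival with the instantaneous queue length."""
        self.avg = (1 - self.wq) * self.avg + self.wq * queue_len
        if self.avg < self.min_th:
            return False  # no congestion: always accept
        if self.avg >= self.max_th:
            return True   # sustained congestion: always drop
        # Between the thresholds the drop probability ramps up linearly.
        p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
        return random.random() < p
```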
Differentiated Services (DiffServ / DS)
- Approach: A more scalable, coarse-grained approach to QoS compared to IntServ. It classifies packets into a small number of service classes at the network edge.
- Core routers then apply a Per-Hop Behavior (PHB) to packets based on their class, indicated by the Differentiated Services Code Point (DSCP) value in the IP header (formerly the TOS byte); a marking sketch follows this list.
- PHBs: Examples include Expedited Forwarding (EF) PHB (for low-loss, low-latency, low-jitter, assured bandwidth service – like a virtual leased line) and Assured Forwarding (AF) PHB (provides different levels of forwarding assurances for IP packets in several AF classes, with different drop precedences within each class).
- Advantages: More scalable than IntServ because core routers don't need to maintain per-flow state. Complexity is pushed to edge routers (ingress/egress classification, marking, shaping, policing).
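Applications can ask for a class by marking their own traffic. A minimal Python sketch, assuming a Linux-like socket API (Windows generally ignores IP_TOS) and an illustrative destination; whether the network honors the mark is entirely an operator-policy decision:

```python
import socket

EF_DSCP = 46  # Expedited Forwarding

# The DSCP occupies the upper 6 bits of the old TOS byte, hence the shift.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, EF_DSCP << 2)
sock.sendto(b"voice payload", ("192.0.2.1", 5004))  # illustrative addr/port
```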
RSVP (Resource Reservation Protocol)
RSVP is a signaling protocol used primarily with the IntServ architecture to request and reserve network resources for specific QoS levels for a particular data flow.
- Goals & Characteristics:
- Designed to support resource reservations for both unicast and multicast applications.
- Receiver-initiated: The receiver of a data flow typically initiates and maintains the resource reservation.
- Soft State: Reservations in routers are temporary and must be periodically refreshed by RSVP messages. If refreshes stop, the reservation times out and is removed, which makes the protocol robust to routing changes (see the sketch after this list).
- Simplex: Reserves resources for a unidirectional data flow.
- Data Flow & Operations:
- Path Messages: Sent by the sender downstream along the data path. They store "path state" in routers, including information about the path and resources available.
- Resv Messages: Sent by receivers upstream towards the sender, following the reverse path stored by Path messages. Resv messages make the actual resource reservations in the routers. Routers along the path can accept, reject, or modify the reservation request.
- Protocol Mechanisms: Uses message types such as Path, Resv, PathErr, ResvErr, PathTear, and ResvTear. Defines filter specs (to identify the data packets to which a reservation applies) and flow specs (to define the QoS desired).
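A minimal Python sketch of the soft-state idea, with an illustrative 90-second lifetime (real RSVP derives its timeouts from a roughly 30-second refresh period):

```python
import time

class SoftStateTable:
    def __init__(self, lifetime: float = 90.0):
        self.lifetime = lifetime  # seconds an entry survives unrefreshed
        self.entries = {}         # (session, sender) -> expiry timestamp

    def refresh(self, session, sender) -> None:
        """Called on every Path/Resv message: (re)arm the entry's timer."""
        self.entries[(session, sender)] = time.monotonic() + self.lifetime

    def sweep(self) -> None:
        """Periodic cleanup: unrefreshed reservations simply evaporate."""
        now = time.monotonic()
        for key in [k for k, t in self.entries.items() if t < now]:
            del self.entries[key]
```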
Evolving High-Speed Networking: MPLS
Multiprotocol Label Switching (MPLS) is a routing technique in telecommunications networks that directs data from one node to the next based on short path labels rather than long network addresses, avoiding complex lookups in a routing table at core routers.
- Operations:
- Labels: Short, fixed-length labels are assigned to packets.
- Label Swapping: MPLS-capable routers (Label Switch Routers - LSRs) make forwarding decisions based solely on these labels. When a packet arrives, the LSR looks up the incoming label, swaps it for an outgoing label, and forwards the packet to the next hop (a minimal forwarding-table sketch appears after this list).
- Label Switched Paths (LSPs): The path a labeled packet follows through an MPLS network is called an LSP. LSPs are established by a signaling protocol (like LDP or RSVP-TE) or by configuration.
- Forwarding Equivalence Class (FEC): A group of IP packets that are forwarded in the same manner (e.g., over the same LSP, with the same QoS treatment). Packets are assigned to an FEC at the ingress LSR.
- Label Stacking: MPLS allows for multiple labels to be "stacked" on a packet. This enables hierarchical LSPs and supports applications like MPLS VPNs (where one label might identify the VPN and another the specific route within the VPN).
- Benefits:
- Traffic Engineering: Allows explicit routing of traffic flows to optimize network resource utilization and meet specific performance requirements.
- VPNs: Simplifies the creation and management of Layer 2 and Layer 3 VPNs.
- QoS: Can be used to provide differentiated QoS by mapping different traffic classes to different LSPs with specific service characteristics.
- Protocol Independence: Can carry various types of traffic (IP, Ethernet, ATM).
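A minimal Python sketch of the label-swapping step from the Operations list above; the table contents are illustrative, since a real LSR populates its forwarding table via LDP or RSVP-TE:

```python
LFIB = {
    # incoming label: (outgoing label or None to pop, outgoing interface)
    100: (200, "eth1"),
    101: (None, "eth2"),  # None = pop (e.g., penultimate hop popping)
}

def forward(in_label: int) -> tuple:
    """Return (outgoing interface, label operation) for an arriving label."""
    out_label, iface = LFIB[in_label]
    op = "pop" if out_label is None else f"swap -> {out_label}"
    return iface, op

print(forward(100))  # ('eth1', 'swap -> 200')
```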
Key Takeaways
The quest for speed and reliability in networks has led to a remarkable evolution of technologies and techniques:
- Legacy WAN technologies like Frame Relay and ATM introduced concepts of virtual circuits, cell relay, and sophisticated traffic management with service categories (CBR, VBR, ABR).
- High-Speed LANs, dominated by Fast Ethernet and Gigabit Ethernet (and beyond), provide the backbone for local communication, while Fibre Channel excels in storage area networks.
- Understanding queuing effects and managing congestion are paramount. Techniques range from TCP's own congestion control (with enhancements like Karn's Algorithm) to network-assisted mechanisms in Frame Relay and ATM (ABR with RM cells).
- Quality of Service (QoS) architectures like IntServ (with RSVP for signaling and queuing disciplines like WFQ and RED for packet handling) and the more scalable DiffServ aim to provide predictable network performance for diverse applications.
- Multiprotocol Label Switching (MPLS) offers a powerful way to engineer traffic flows, create VPNs, and support QoS by using label-based forwarding.
These diverse elements work together, or have influenced successor technologies, to create the high-performance, feature-rich networks we rely on today.