2D Graphics Deep Dive - Drawing, Transforming, and Viewing Your Digital World

May 26, 2023 — #Computer Graphics #CS Basics

Hey there, pixel pioneers and digital dreamers! In our previous explorations, we've laid down the groundwork of Computer Graphics, understanding what it is, where it's used, and the basics of how images appear on our screens. Now, it's time to roll up our sleeves and get into the nitty-gritty of creating and manipulating graphics in a two-dimensional space.

Ever wondered how your computer flawlessly draws a perfectly straight line or a smooth circle on a screen made of square pixels? Or how game characters zoom, spin, and morph with such fluidity? And how does a program decide which part of a larger scene to display in its window? Today, we're tackling exactly these questions! We'll journey through 2D output primitives (the basic shapes), geometric transformations (moving and changing those shapes), and the 2D viewing pipeline (how we frame our scene). Get ready for some fascinating algorithms and a little bit of (fun!) math!

Part 1: Drawing the Unseen - 2D Output Primitives

Output primitives are the fundamental geometric structures we can instruct a graphics system to draw. In 2D, these are the dots, lines, curves, and filled areas that form the visual basis of everything else. Since our screens are grids of discrete pixels, drawing even a simple line isn't as straightforward as it sounds!

Lines - The Straight Story

A line is defined by two endpoints. The challenge is to determine which pixels on the screen should be illuminated to best approximate this ideal straight line.

Digital Differential Analyzer (DDA) Algorithm: This is an intuitive algorithm that uses the line equation $y = mx + c$ . Given two endpoints $(x_1, y_1)$ and $(x_2, y_2)$ , we calculate the slope $m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{\Delta y}{\Delta x}$ .
- If $|m| \le 1$ , we increment $x$ by 1 and calculate $y_{k+1} = y_k + m$ .
- If $|m| > 1$ , we increment $y$ by 1 and calculate $x_{k+1} = x_k + \frac{1}{m}$ .
The calculated $x$ or $y$ values are real numbers and need to be rounded to the nearest integer to pick the pixel.
- Pros: Simple to understand.
- Cons: Uses floating-point arithmetic (slower) and rounding, which can accumulate errors.
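To make the stepping rule concrete, here is a minimal Python sketch of DDA. The function name `dda_line` and the choice to return a list of pixels (instead of drawing them) are assumptions made for illustration. Stepping along whichever axis has the larger change covers both the $|m| \le 1$ and $|m| > 1$ cases in one loop.

```python
import math

def dda_line(x1, y1, x2, y2):
    """Return the integer pixels approximating the line (x1, y1) -> (x2, y2)."""
    dx, dy = x2 - x1, y2 - y1
    # Step along the axis with the larger change, so the other axis
    # advances by at most one pixel per step.
    steps = max(abs(dx), abs(dy))
    if steps == 0:
        return [(x1, y1)]
    x_inc, y_inc = dx / steps, dy / steps  # one is +-1, the other is m or 1/m
    x, y = float(x1), float(y1)
    pixels = []
    for _ in range(steps + 1):
        # Round half up; Python's built-in round() rounds halves to even.
        pixels.append((math.floor(x + 0.5), math.floor(y + 0.5)))
        x += x_inc
        y += y_inc
    return pixels

print(dda_line(0, 0, 4, 2))  # [(0, 0), (1, 1), (2, 1), (3, 2), (4, 2)]
```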
Bresenham's Line Algorithm: A highly efficient and popular algorithm that uses only integer arithmetic! It determines the closest pixel to the ideal line by examining a "decision parameter." For a line with slope $0 < m < 1$ : Start with $(x_0, y_0)$ . At each step $x_k$ , we decide whether $y_{k+1}$ should be $y_k$ or $y_k + 1$ . The decision parameter $p_k$ is initialized as $p_0 = 2\Delta y - \Delta x$ . At each step $k$ :
- If $p_k < 0$ , the next point is $(x_k + 1, y_k)$ , and $p_{k+1} = p_k + 2\Delta y$ .
- Else, the next point is $(x_k + 1, y_k + 1)$ , and $p_{k+1} = p_k + 2\Delta y - 2\Delta x$ .
This process repeats $\Delta x$ times. Similar logic applies to other slopes by considering symmetry.
- Pros: Fast, uses only integer addition, subtraction, and bit-shifting (for multiplication by 2). Accurate.
- Cons: A bit more complex to derive initially.
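The update rules above translate almost line for line into code. This is a sketch only for the case described in the text (slope between 0 and 1, drawn left to right); the name `bresenham_line` is an assumption, and other slopes would need the symmetry handling mentioned above.

```python
def bresenham_line(x0, y0, x1, y1):
    """Integer-only line rasterization for 0 <= slope <= 1 and x0 <= x1."""
    dx, dy = x1 - x0, y1 - y0
    if not (0 <= dy <= dx):
        raise ValueError("this sketch only handles slopes in [0, 1] with x0 <= x1")
    p = 2 * dy - dx              # initial decision parameter p_0
    x, y = x0, y0
    pixels = [(x, y)]
    for _ in range(dx):          # one step per unit of x
        x += 1
        if p < 0:
            p += 2 * dy          # stay on the same row
        else:
            y += 1               # move up one row
            p += 2 * dy - 2 * dx
        pixels.append((x, y))
    return pixels

print(bresenham_line(0, 0, 4, 2))  # [(0, 0), (1, 1), (2, 1), (3, 2), (4, 2)]
```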

Circles - Round and Round We Go

A circle is defined by its center $(x_c, y_c)$ and radius $R$ . The equation is $(x - x_c)^2 + (y - y_c)^2 = R^2$ . We can exploit the circle's eight-way symmetry: if we calculate a point $(x,y)$ in one octant, we can easily find 7 other points.

Midpoint Circle Algorithm (similar to Bresenham's): This algorithm also uses a decision parameter to select the closer pixel to the ideal circle. We only need to calculate pixels for one octant (e.g., from $x=0$ to $x=y$ ). Consider a circle centered at the origin. Start at $(0, R)$ . At each step, we increment $x$ and decide whether $y$ should remain the same or decrease. The decision parameter $p_k$ is initialized as $p_0 = \frac{5}{4} - R$ ; when $R$ is an integer this is rounded to $p_0 = 1 - R$ , so that every later update stays in integer arithmetic. At each step $k$ , starting with $(x_0, y_0) = (0,R)$ :

- Plot $(x_k, y_k)$ and its symmetric points.
- If $p_k < 0$ , the next point is $(x_k + 1, y_k)$ , and $p_{k+1} = p_k + 2x_{k+1} + 1$ .
- Else, the next point is $(x_k + 1, y_k - 1)$ , and $p_{k+1} = p_k + 2x_{k+1} + 1 - 2y_{k+1}$ .

Repeat until $x \ge y$ .
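Here is a small Python sketch of those steps, using the integer start value $p_0 = 1 - R$ . The function names `midpoint_circle_octant` and `midpoint_circle` are made up for this example; the first computes one octant for a circle at the origin, the second mirrors it eight ways and shifts it to the center.

```python
def midpoint_circle_octant(r):
    """Points of the octant from (0, r) to the line x = y, circle at origin."""
    x, y = 0, r
    p = 1 - r                      # integer form of the decision parameter
    points = []
    while x <= y:
        points.append((x, y))
        x += 1
        if p < 0:
            p += 2 * x + 1         # y stays the same
        else:
            y -= 1                 # step inward
            p += 2 * x + 1 - 2 * y
    return points

def midpoint_circle(xc, yc, r):
    """Full circle as a set of pixels, using eight-way symmetry."""
    pixels = set()
    for x, y in midpoint_circle_octant(r):
        for px, py in ((x, y), (y, x)):
            for sx in (1, -1):
                for sy in (1, -1):
                    pixels.add((xc + sx * px, yc + sy * py))
    return pixels

print(midpoint_circle_octant(10))
# [(0, 10), (1, 10), (2, 10), (3, 10), (4, 9), (5, 9), (6, 8), (7, 7)]
```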

Ellipses - The Elegant Squish

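An ellipse is defined by its center $(x_c, y_c)$ and two radii, $r_x$ (the semi-axis along x) and $r_y$ (the semi-axis along y); the larger of the two is the semi-major axis and the smaller is the semi-minor axis. The equation for an ellipse centered at the origin is $\frac{x^2}{r_x^2} + \frac{y^2}{r_y^2} = 1$ . Ellipses have four-way symmetry, so only one quadrant needs to be computed. The Midpoint Ellipse Algorithm is similar to the Midpoint Circle Algorithm but a bit more complex because the slope of the curve changes along the quadrant: the quadrant is split into two regions at the point where the slope is $-1$ , and we step in $x$ where the curve is flatter than that and in $y$ where it is steeper.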
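The text does not spell out the update formulas, so the sketch below follows the commonly taught textbook formulation of the midpoint ellipse algorithm; treat the exact decision-parameter expressions as an assumption rather than the only possible form. The names `midpoint_ellipse_quadrant` and `midpoint_ellipse` are hypothetical.

```python
def midpoint_ellipse_quadrant(rx, ry):
    """First-quadrant points of an origin-centered ellipse, from (0, ry) to (rx, 0)."""
    rx2, ry2 = rx * rx, ry * ry
    x, y = 0, ry
    px, py = 0, 2 * rx2 * y        # running values of 2*ry^2*x and 2*rx^2*y
    points = [(x, y)]

    # Region 1: the curve is flatter than slope -1, so step in x.
    p = ry2 - rx2 * ry + 0.25 * rx2
    while px < py:
        x += 1
        px += 2 * ry2
        if p < 0:
            p += ry2 + px
        else:
            y -= 1
            py -= 2 * rx2
            p += ry2 + px - py
        points.append((x, y))

    # Region 2: the curve is steeper than slope -1, so step in y.
    p = ry2 * (x + 0.5) ** 2 + rx2 * (y - 1) ** 2 - rx2 * ry2
    while y > 0:
        y -= 1
        py -= 2 * rx2
        if p > 0:
            p += rx2 - py
        else:
            x += 1
            px += 2 * ry2
            p += rx2 - py + px
        points.append((x, y))
    return points

def midpoint_ellipse(xc, yc, rx, ry):
    """Full ellipse as a set of pixels, using four-way symmetry."""
    return {(xc + sx * x, yc + sy * y)
            for x, y in midpoint_ellipse_quadrant(rx, ry)
            for sx in (1, -1) for sy in (1, -1)}
```

For $r_x = 8$ and $r_y = 6$ the quadrant walk switches from region 1 to region 2 after the point $(7, 3)$ and ends at $(8, 0)$ .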

Filling It In - Area Primitives

Drawing outlines is great, but often we want to fill shapes with color or patterns.

Scan-Line Polygon Fill Algorithm: This is a very common algorithm for filling polygons.
- For each scan line (horizontal pixel row) that crosses the polygon:
  1. Find the intersections of the scan line with the polygon edges.
  2. Sort these intersections by their x-coordinates.
  3. Fill the pixels between pairs of intersections (e.g., 1st to 2nd, 3rd to 4th, etc.), using parity rules.
- Efficiency is improved using an "edge table" (ET) and an "active edge list" (AEL).
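As a rough illustration of the three steps, here is a deliberately simple Python sketch. It recomputes intersections on every scan line instead of maintaining an edge table and active edge list, and it assumes integer vertex coordinates. The name `scanline_fill` and the half-open rule used for vertices (an edge counts on its lower end but not its upper end) are choices made for this example.

```python
import math

def scanline_fill(vertices):
    """Return the set of pixels inside a polygon given as [(x, y), ...]."""
    n = len(vertices)
    y_lo = min(y for _, y in vertices)
    y_hi = max(y for _, y in vertices)
    filled = set()
    for y in range(y_lo, y_hi + 1):
        # Step 1: intersections of this scan line with every polygon edge.
        xs = []
        for i in range(n):
            (x0, y0), (x1, y1) = vertices[i], vertices[(i + 1) % n]
            if y0 == y1:
                continue                       # horizontal edges add no crossing
            if min(y0, y1) <= y < max(y0, y1):  # half-open: avoids double counting
                t = (y - y0) / (y1 - y0)
                xs.append(x0 + t * (x1 - x0))
        # Step 2: sort by x.
        xs.sort()
        # Step 3: fill between pairs (1st-2nd, 3rd-4th, ...).
        for left, right in zip(xs[0::2], xs[1::2]):
            for x in range(math.ceil(left), math.floor(right) + 1):
                filled.add((x, y))
    return filled

triangle = [(0, 0), (4, 0), (0, 4)]
print(len(scanline_fill(triangle)))  # 14
```

Because of the half-open rule, the topmost scan line of a shape is not filled in this sketch; production rasterizers use similar conventions so that adjacent polygons do not paint the same pixel twice.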
Boundary-Fill Algorithm: Starts from an interior point $(x,y)$ inside the polygon.
- If the current pixel $(x,y)$ is not the boundary color and not already filled:
  1. Fill the current pixel with the desired fill color.
  2. Recursively call boundary-fill for its neighbors (4-connected or 8-connected).
- Pros: Simple concept.
- Cons: Recursive, can lead to stack overflow for large areas. Requires a single boundary color.
Flood-Fill Algorithm: Similar to boundary-fill, but instead of checking for a boundary color, it checks if the current pixel is of a specific interior "old" color.
- If the current pixel $(x,y)$ is the "old" color:
  1. Change its color to the new "fill" color.
  2. Recursively call flood-fill for its neighbors.
- Useful for changing a region of a single color to another.
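Both seed fills can share one loop. The sketch below replaces recursion with an explicit stack, which sidesteps the stack-overflow problem mentioned above. The image is assumed to be a list of rows (`grid[y][x]`), and the names `boundary_fill`, `flood_fill`, and `_seed_fill` are hypothetical.

```python
def _seed_fill(grid, x, y, should_fill, new_color, connectivity=4):
    """Shared loop: paint every reachable pixel for which should_fill is true."""
    offsets = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if connectivity == 8:
        offsets += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    stack = [(x, y)]                      # explicit stack instead of recursion
    while stack:
        cx, cy = stack.pop()
        if not (0 <= cy < len(grid) and 0 <= cx < len(grid[0])):
            continue                      # off the image
        if not should_fill(grid[cy][cx]):
            continue
        grid[cy][cx] = new_color
        stack.extend((cx + dx, cy + dy) for dx, dy in offsets)

def boundary_fill(grid, x, y, fill_color, boundary_color, connectivity=4):
    """Fill outward from (x, y) until the boundary color is reached."""
    _seed_fill(grid, x, y,
               lambda c: c != boundary_color and c != fill_color,
               fill_color, connectivity)

def flood_fill(grid, x, y, new_color, connectivity=4):
    """Recolor the connected region that shares the seed pixel's color."""
    old_color = grid[y][x]
    if old_color == new_color:
        return                            # nothing to do; avoids an endless loop
    _seed_fill(grid, x, y, lambda c: c == old_color, new_color, connectivity)

image = [list(row) for row in ["#####", "#...#", "#.#.#", "#...#", "#####"]]
boundary_fill(image, 1, 1, "*", "#")
print("\n".join("".join(row) for row in image))
```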

Part 2: Shape Shifters - 2D Geometric Transformations

Once we can draw shapes, we want to move, resize, and rotate them! These operations are called geometric transformations. They change the coordinates describing an object.

The Basic Trio: Translation, Scaling, Rotation

Translation: Moving an object without changing its shape or orientation. If a point $P=(x,y)$ is translated by $(t_x, t_y)$ to $P'=(x',y')$ , then $x' = x + t_x$ and $y' = y + t_y$ . In matrix form (using homogeneous coordinates, which we'll see soon): $P' = T \cdot P \implies \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
Scaling: Changing the size of an object. If a point $P=(x,y)$ is scaled by factors $(s_x, s_y)$ relative to the origin to $P'=(x',y')$ , then $x' = x \cdot s_x$ and $y' = y \cdot s_y$ . Matrix form: $P' = S \cdot P \implies \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} s_x & 0 & 0 \\ 0 & s_y & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$ . Note: Scaling is done with respect to the origin. To scale about an arbitrary point $(x_f, y_f)$ , you translate the point to the origin, scale, then translate back.
Rotation: Rotating an object around a point (usually the origin) by an angle $\theta$ . If a point $P=(x,y)$ is rotated counter-clockwise by $\theta$ about the origin to $P'=(x',y')$ , then $x' = x \cos\theta - y \sin\theta$ and $y' = x \sin\theta + y \cos\theta$ . Matrix form: $P' = R \cdot P \implies \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$ . Note: To rotate about an arbitrary pivot point $(x_r, y_r)$ , you translate the pivot to the origin, rotate, then translate back.
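The three matrices are easy to try out in plain Python, with nested lists standing in for $3 \times 3$ matrices. The helper names (`translation`, `scaling`, `rotation`, `apply`) are assumptions for this sketch, not part of any library.

```python
import math

def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def rotation(theta):
    """Counter-clockwise rotation about the origin; theta in radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def apply(m, point):
    """Multiply a 3x3 matrix by the homogeneous column (x, y, 1)."""
    col = (point[0], point[1], 1)
    xh, yh, w = (sum(m[r][k] * col[k] for k in range(3)) for r in range(3))
    return (xh / w, yh / w)

print(apply(translation(3, -2), (1, 1)))  # (4.0, -1.0)
print(apply(scaling(2, 3), (1, 1)))       # (2.0, 3.0)
x, y = apply(rotation(math.pi / 2), (1, 0))
print(round(x, 9), round(y, 9))           # 0.0 1.0
```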

More Transformations: Reflections and Shears

Reflection: Produces a mirror image of an object.
- Reflection about the x-axis: $x' = x$ , $y' = -y$ , with matrix $R_x = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
- Reflection about the y-axis: $x' = -x$ , $y' = y$ , with matrix $R_y = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
Shear: Slants an object by shifting each point along one axis in proportion to its coordinate on the other axis.
- X-shear (relative to x-axis): $x' = x + sh_x \cdot y$ , $y' = y$ , with matrix $Sh_x = \begin{bmatrix} 1 & sh_x & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
- Y-shear (relative to y-axis): $x' = x$ , $y' = y + sh_y \cdot x$ , with matrix $Sh_y = \begin{bmatrix} 1 & 0 & 0 \\ sh_y & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
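
A quick worked example, with values chosen purely for illustration: apply an X-shear with $sh_x = 2$ to the unit square with corners $(0,0)$ , $(1,0)$ , $(1,1)$ , $(0,1)$ . Using $x' = x + 2y$ and $y' = y$ , the bottom corners (where $y = 0$ ) stay put, while the top corners slide right: $(1,1) \to (3,1)$ and $(0,1) \to (2,1)$ . The square becomes a parallelogram of the same height. Reflecting the sheared corner $(3,1)$ about the x-axis then gives $\begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ -1 \\ 1 \end{bmatrix}$ , i.e. the point $(3,-1)$ .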

Homogeneous Coordinates: The Unifying Hero!

You might have noticed the $3 \times 3$ matrices and points represented as $\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$ . These are homogeneous coordinates. In 2D, we represent a point $(x,y)$ as $(x_h, y_h, W)$ with $W \neq 0$ , where $x = x_h/W$ and $y = y_h/W$ ; for example, $(6, 4, 2)$ and $(3, 2, 1)$ both describe the point $(3, 2)$ . For convenience, we usually use $W=1$ , so $(x,y)$ becomes $(x,y,1)$ . Why bother? Homogeneous coordinates allow us to represent all affine transformations (translation, scaling, rotation, shear) as matrix multiplications. Without them, translation would be an addition rather than a multiplication, making it awkward to combine with other transformations.

Composite Transformations: Power Combos!

Often, we want to apply a sequence of transformations. For example, rotate an object and then move it. This is done by multiplying the transformation matrices. If we apply $M_1$ then $M_2$ to a point $P$ , the new point $P'$ is $P' = M_2 \cdot (M_1 \cdot P) = (M_2 \cdot M_1) \cdot P = M_{composite} \cdot P$ . Note that the matrix applied first sits rightmost in the product. Crucially, the order of matrix multiplication matters! $M_1 \cdot M_2$ is generally NOT the same as $M_2 \cdot M_1$ . For example, translating an object and then rotating it about the origin gives a different result than rotating first and then translating, because in the first case the rotation also swings the translated position around the origin.
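A short Python sketch makes both points tangible: building a composite (here, rotation about a pivot as translate, rotate, translate back) and seeing that order changes the result. All helper names (`matmul`, `translation`, `rotation`, `apply`, `rotation_about`) are assumptions for this example.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def apply(m, point):
    """Transform (x, y) by an affine 3x3 matrix (last row 0 0 1)."""
    x, y = point
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

def rotation_about(theta, px, py):
    """Rotate about pivot (px, py): the rightmost matrix is applied first."""
    return matmul(translation(px, py),
                  matmul(rotation(theta), translation(-px, -py)))

T, R = translation(5, 0), rotation(math.pi / 2)
rotate_then_translate = matmul(T, R)   # R is applied first
translate_then_rotate = matmul(R, T)   # T is applied first

print([round(v, 9) for v in apply(rotate_then_translate, (1, 0))])  # [5.0, 1.0]
print([round(v, 9) for v in apply(translate_then_rotate, (1, 0))])  # [0.0, 6.0]
print([round(v, 9) for v in apply(rotation_about(math.pi / 2, 1, 1), (2, 1))])  # [1.0, 2.0]
```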

Part 3: Framing Your Masterpiece - 2D Viewing

Now that we can draw and transform objects, we need to decide what part of our (potentially infinite) 2D world we want to display and where on the screen it should appear. This is the 2D viewing pipeline.

Window to Viewport Transformation: The Digital Lens

World Coordinates: The coordinate system used to define your objects and scene (e.g., meters, feet, or just abstract units).
Window (or Clipping Window): A rectangular region in world coordinates that defines what you want to see. It's like the frame of your camera. Defined by $(W_{x_{min}}, W_{y_{min}})$ and $(W_{x_{max}}, W_{y_{max}})$ .
Device Coordinates (or Screen Coordinates): The coordinate system of the display device (e.g., pixels on your screen).
Viewport: A rectangular region in device coordinates where the contents of the window will be displayed. Defined by $(V_{x_{min}}, V_{y_{min}})$ and $(V_{x_{max}}, V_{y_{max}})$ .

The window-to-viewport transformation maps the contents of the window to the viewport. For a point $(xw, yw)$ in the window, its corresponding point $(xv, yv)$ in the viewport is $xv = (xw - W_{x_{min}}) \frac{V_{x_{max}} - V_{x_{min}}}{W_{x_{max}} - W_{x_{min}}} + V_{x_{min}}$ and $yv = (yw - W_{y_{min}}) \frac{V_{y_{max}} - V_{y_{min}}}{W_{y_{max}} - W_{y_{min}}} + V_{y_{min}}$ . This transformation combines a scaling and a translation to fit the window contents into the viewport. The two axes are scaled independently, so shapes are stretched whenever the window and viewport aspect ratios differ; to preserve proportions, use the smaller of the two scale factors on both axes and center the result in the viewport.
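The mapping is only a few lines of code. In this sketch the function name `window_to_viewport` and the `(x_min, y_min, x_max, y_max)` tuple layout are assumptions made for illustration; it implements the two formulas directly and does not correct for differing aspect ratios.

```python
def window_to_viewport(xw, yw, window, viewport):
    """Map a world point inside `window` to device coordinates in `viewport`.

    Both rectangles are (x_min, y_min, x_max, y_max) tuples.
    """
    wx_min, wy_min, wx_max, wy_max = window
    vx_min, vy_min, vx_max, vy_max = viewport
    sx = (vx_max - vx_min) / (wx_max - wx_min)   # horizontal scale factor
    sy = (vy_max - vy_min) / (wy_max - wy_min)   # vertical scale factor
    return (vx_min + (xw - wx_min) * sx,
            vy_min + (yw - wy_min) * sy)

window = (0, 0, 10, 10)
viewport = (100, 100, 300, 200)
print(window_to_viewport(5, 5, window, viewport))  # (200.0, 150.0)
```

Note that many display systems put the origin at the top-left with y growing downward, in which case the y mapping would also need to be flipped.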

Clipping: Cutting Out the Excess

Objects or parts of objects that lie outside the window should not be displayed. This process is called clipping.

Line Clipping: Cohen-Sutherland Algorithm. A popular algorithm for clipping lines against a rectangular window.
- Assign a 4-bit region code (outcode) to each endpoint of the line. Each bit corresponds to one of the four boundaries of the window (Top, Bottom, Right, Left - TBRL).
  - Bit 1 (Top): 1 if $y > W_{y_{max}}$ , else 0
  - Bit 2 (Bottom): 1 if $y < W_{y_{min}}$ , else 0
  - Bit 3 (Right): 1 if $x > W_{x_{max}}$ , else 0
  - Bit 4 (Left): 1 if $x < W_{x_{min}}$ , else 0
- Trivial Accept: If both endpoints have an outcode of 0000 (both are inside), the line is entirely visible.
- Trivial Reject: If the bitwise AND of the outcodes is not 0000, the line is entirely outside (e.g., both points are to the left of the window).
- Clipping Required: Otherwise, the line might intersect the window. Calculate intersection points with window boundaries. One endpoint is chosen (usually one that is outside). Its outcode tells which boundary it crosses. Calculate the intersection point $(x_i, y_i)$ with that boundary. Replace the outside endpoint with the intersection point. Repeat the process.
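The whole procedure fits in a short Python sketch. The constant and function names here (`TOP`, `outcode`, `cohen_sutherland`, and so on) are assumptions for illustration; the bit values follow the TBRL order above, with Top as the most significant bit.

```python
TOP, BOTTOM, RIGHT, LEFT = 8, 4, 2, 1   # bits in TBRL order

def outcode(x, y, xmin, ymin, xmax, ymax):
    code = 0
    if y > ymax:
        code |= TOP
    elif y < ymin:
        code |= BOTTOM
    if x > xmax:
        code |= RIGHT
    elif x < xmin:
        code |= LEFT
    return code

def cohen_sutherland(x0, y0, x1, y1, xmin, ymin, xmax, ymax):
    """Return the clipped segment (x0, y0, x1, y1), or None if fully outside."""
    c0 = outcode(x0, y0, xmin, ymin, xmax, ymax)
    c1 = outcode(x1, y1, xmin, ymin, xmax, ymax)
    while True:
        if c0 == 0 and c1 == 0:
            return (x0, y0, x1, y1)          # trivial accept
        if c0 & c1:
            return None                      # trivial reject
        c = c0 or c1                         # pick an endpoint that is outside
        if c & TOP:
            x, y = x0 + (x1 - x0) * (ymax - y0) / (y1 - y0), ymax
        elif c & BOTTOM:
            x, y = x0 + (x1 - x0) * (ymin - y0) / (y1 - y0), ymin
        elif c & RIGHT:
            x, y = xmax, y0 + (y1 - y0) * (xmax - x0) / (x1 - x0)
        else:  # LEFT
            x, y = xmin, y0 + (y1 - y0) * (xmin - x0) / (x1 - x0)
        # Replace the outside endpoint with the intersection and re-test.
        if c == c0:
            x0, y0 = x, y
            c0 = outcode(x0, y0, xmin, ymin, xmax, ymax)
        else:
            x1, y1 = x, y
            c1 = outcode(x1, y1, xmin, ymin, xmax, ymax)

print(cohen_sutherland(-5, 5, 15, 5, 0, 0, 10, 10))   # (0, 5.0, 10, 5.0)
print(cohen_sutherland(-5, 8, 2, 15, 0, 0, 10, 10))   # None
```

The second call shows a line that passes neither trivial test at first, yet turns out to miss the window once its left end has been clipped.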
Polygon Clipping: Sutherland-Hodgman Algorithm. Clips a polygon against each edge of a convex clip window, one edge at a time.
- The algorithm processes the polygon's vertices sequentially against a single clip boundary (e.g., left boundary).
- For each edge of the polygon (from vertex $S$ to vertex $P$ ):
  1. If $S$ and $P$ are inside: Output $P$ .
  2. If $S$ is inside and $P$ is outside: Output the intersection $I$ of $SP$ with the boundary.
  3. If $S$ and $P$ are outside: Output nothing.
  4. If $S$ is outside and $P$ is inside: Output the intersection $I$ of $SP$ with the boundary, then output $P$ .
- The output list of vertices from clipping against one boundary becomes the input for the next boundary. After clipping against all four boundaries, the resulting polygon is the clipped polygon.
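- Note: The clip window must be convex. The polygon being clipped may be concave, but the output is always a single vertex list, so a concave polygon that should split into separate pieces can come out with extra edges running along the window boundary that join those pieces.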
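Here is a compact Python sketch of the four cases for an axis-aligned rectangular window. The name `clip_polygon` and its nested helpers are assumptions for this example; the inner loop handles one boundary, and its output feeds the next.

```python
def clip_polygon(polygon, xmin, ymin, xmax, ymax):
    """Clip a polygon [(x, y), ...] against an axis-aligned rectangle."""

    def clip_against(points, inside, intersect):
        output = []
        for i, p in enumerate(points):
            s = points[i - 1]                 # previous vertex (wraps around)
            if inside(p):
                if not inside(s):
                    output.append(intersect(s, p))   # outside -> inside: I, then P
                output.append(p)                     # inside -> inside: P
            elif inside(s):
                output.append(intersect(s, p))       # inside -> outside: I
            # outside -> outside: output nothing
        return output

    def cross_x(x_edge):
        def intersect(s, p):
            t = (x_edge - s[0]) / (p[0] - s[0])
            return (x_edge, s[1] + t * (p[1] - s[1]))
        return intersect

    def cross_y(y_edge):
        def intersect(s, p):
            t = (y_edge - s[1]) / (p[1] - s[1])
            return (s[0] + t * (p[0] - s[0]), y_edge)
        return intersect

    boundaries = [
        (lambda p: p[0] >= xmin, cross_x(xmin)),   # left
        (lambda p: p[0] <= xmax, cross_x(xmax)),   # right
        (lambda p: p[1] >= ymin, cross_y(ymin)),   # bottom
        (lambda p: p[1] <= ymax, cross_y(ymax)),   # top
    ]
    points = list(polygon)
    for inside, intersect in boundaries:
        if not points:
            break                                  # nothing left to clip
        points = clip_against(points, inside, intersect)
    return points

triangle = [(2, 2), (18, 2), (2, 10)]
print(clip_polygon(triangle, 0, 0, 10, 12))
# [(2, 2), (10, 2.0), (10, 6.0), (2, 10)]
```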

A Glimpse into Projections: Adding Depth (Conceptually)

While projections are a cornerstone of 3D graphics, the fundamental idea of mapping from a higher dimension to a lower dimension can be introduced here. In 2D viewing, we are essentially projecting our 2D world coordinates onto a 2D device (the screen).

Parallel Projection: Imagine projection rays that are all parallel to each other. This keeps parallel lines parallel and preserves relative sizes, regardless of distance from the view plane. Orthographic projection is the common case in which the rays are also perpendicular to the view plane (e.g., looking straight at the xy-plane). This is implicitly what we do in most 2D graphics.
Perspective Projection (Conceptual Link to 3D): In 3D, this is how we achieve realism where objects farther away appear smaller. While not directly applied in the same way for pure 2D-to-2D mapping, understanding that viewing can involve "projection" helps bridge the gap to 3D concepts.

Phew! That Was a 2D Marathon!

We've journeyed through the core of 2D graphics: drawing the basic building blocks, transforming them in various ways, and finally, figuring out how to frame and display them. These concepts are not just theoretical; they are the engine behind countless applications, from simple drawing programs and user interfaces to complex 2D games and animations.

Understanding these fundamentals gives you a powerful lens through which to see and appreciate the digital world around you.