Into the Third Dimension - 3D Graphics, Transformations, and Seeing the Unseen

May 27, 2023 — #Computer Graphics #CS Basics

Welcome back, intrepid explorers of the digital frontier! Having mastered the fundamentals of 2D graphics, from drawing lines and circles to transforming and clipping them, it's time to take a monumental leap: into the captivating realm of Three-Dimensional (3D) Graphics.

Ever been awestruck by the lifelike characters in an animated movie, the sprawling landscapes in a video game, or the intricate designs of a 3D-printed object? That's the power of 3D graphics! But how do we represent these complex shapes in a computer? How do we move and sculpt them in virtual space? And critically, when we look at a 3D scene, how does the computer know which surfaces are visible and which are hidden behind others?

Today, we're embarking on an exciting expedition to answer these questions. We'll cover 3D Geometric Transformations, delve into sophisticated ways of representing 3D shapes (including elegant curves, complex surfaces, and infinitely detailed fractals), and finally, unravel the secrets of Visible Surface Detection. Fasten your seatbelts; the Z-axis awaits!

Part 1: Sculpting in Space - 3D Geometric Transformations

Just like in 2D, we need ways to translate, scale, and rotate our 3D objects. The principles are similar, but now we're operating in three dimensions $(x, y, z)$ . We'll use homogeneous coordinates again, but this time with $4 \times 4$ matrices, representing a point $P=(x,y,z)$ as $(x,y,z,1)$ .

Translation: Moving an object in 3D space. To translate a point $P=(x,y,z)$ by $(t_x, t_y, t_z)$ to $P'=(x',y',z')$ : $x' = x + t_x$ $y' = y + t_y$ $z' = z + t_z$ The translation matrix $T$ is: $T(t_x, t_y, t_z) = \begin{bmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix}$ So, $P' = T \cdot P$ .
Scaling: Changing the size of an object relative to the origin. To scale a point $P=(x,y,z)$ by factors $(s_x, s_y, s_z)$ to $P'=(x',y',z')$ : $x' = x \cdot s_x$ $y' = y \cdot s_y$ $z' = z \cdot s_z$ The scaling matrix $S$ is: $S(s_x, s_y, s_z) = \begin{bmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$ To scale about an arbitrary point, translate to the origin, scale, then translate back.
Rotation in Space: This is more complex than 2D rotation because we can rotate around any arbitrary axis. The basic rotations are around the coordinate axes:
- Rotation about the Z-axis by angle $\theta$ : (Similar to 2D rotation in the xy-plane) $R_z(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
- Rotation about the X-axis by angle $\theta$ : $R_x(\theta) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
- Rotation about the Y-axis by angle $\theta$ : $R_y(\theta) = \begin{bmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$
Rotation around an arbitrary axis is more involved, typically requiring aligning the axis with a coordinate axis, performing the rotation, and then reversing the alignment. Composite transformations (matrix multiplications) are key here too!

Part 2: Building Blocks of 3D Worlds - Representing Shapes

Creating realistic and complex 3D objects requires sophisticated representation methods beyond simple polygons (though polygons, especially triangles, form the basis of many 3D models, known as meshes).

Smooth Curves and Surfaces: Beyond Straight Lines

Splines: A spline is a piecewise polynomial (or other basis function) curve that provides smoothness and control. Instead of one high-degree polynomial to define a complex curve (which can be unstable and hard to control), splines use multiple lower-degree segments joined smoothly.
- Interpolating Splines: Pass through all their control points.
- Approximating Splines: Are influenced by their control points but don't necessarily pass through them (e.g., Bezier, B-Splines). This often gives better curve shape.
Hermite Interpolation (Hermite Curves): A cubic spline segment defined by two endpoints and the tangent vectors (derivatives) at these endpoints. Given endpoints $P_0, P_1$ and tangent vectors $P'_0, P'_1$ , the curve $P(u)$ for $0 \le u \le 1$ is: $P(u) = (2u^3 - 3u^2 + 1)P_0 + (-2u^3 + 3u^2)P_1 + (u^3 - 2u^2 + u)P'_0 + (u^3 - u^2)P'_1$ The terms in parentheses are the Hermite basis (or blending) functions.
Bezier Curves and Surfaces: Extremely popular in computer graphics and design (e.g., in Adobe Illustrator, CAD software).
- Bezier Curve: Defined by a set of control points. The curve is "pulled" towards these points but generally only passes through the first and last. The degree of the curve is one less than the number of control points. A Bezier curve with $n+1$ control points $P_0, P_1, \ldots, P_n$ is given by: $P(u) = \sum_{i=0}^{n} P_i B_{i,n}(u) \quad \text{for } 0 \le u \le 1$ where $B_{i,n}(u)$ are the Bernstein polynomials: $B_{i,n}(u) = \binom{n}{i} u^i (1-u)^{n-i}$ For example, a cubic Bezier curve (4 control points $P_0, P_1, P_2, P_3$ ): $P(u) = (1-u)^3 P_0 + 3u(1-u)^2 P_1 + 3u^2(1-u) P_2 + u^3 P_3$
- Bezier Surfaces: An extension of Bezier curves to 3D surfaces. They are defined by a grid of control points. A Bezier surface patch is formed by taking the Cartesian product of Bezier blending functions: $P(u,v) = \sum_{i=0}^{n} \sum_{j=0}^{m} P_{i,j} B_{i,n}(u) B_{j,m}(v)$

Fractals: The Beauty of Infinite Complexity ♾️

Fractals are fascinating mathematical sets that exhibit self-similarity – meaning they look similar at different scales. They are generated by repeating a simple process iteratively.

Fractal Generation: Typically involves an iterative function system (IFS) or a recurrence relation.
Classification:
- Self-similar fractals: Strictly or statistically scale-invariant (e.g., Koch snowflake).
- Iterated Function Systems (IFS): Generated by applying a set of contractive affine transformations repeatedly.
- Escape-time fractals: Defined by a recurrence relation at each point in a space (e.g., a complex plane). The behavior of the sequence (bounded or escapes to infinity) determines if the point belongs to the set.
Fractal Dimension (Hausdorff Dimension): A way to measure the "roughness" or "complexity" of a fractal. It can be a non-integer value, which is mind-bending! For a self-similar object that can be broken into $N$ copies of itself, each scaled by a factor $S$ , the fractal dimension $D$ is: $D = \frac{\log N}{\log (1/S)}$
Famous Examples:
- Julia Set: For a complex function $f_c(z) = z^2 + c$ (where $c$ is a fixed complex number), the Julia set is the set of complex numbers $z_0$ for which the sequence $z_{n+1} = f_c(z_n)$ remains bounded. The shape of the Julia set changes dramatically with $c$ .
- Mandelbrot Set: The set of complex numbers $c$ for which the Julia set $J(f_c)$ is connected. Equivalently, it's the set of $c$ values for which the sequence $z_{n+1} = z_n^2 + c$ starting with $z_0 = 0$ remains bounded. It's one of the most famous and beautiful fractals. To generate it, iterate $z_{n+1} = z_n^2 + c$ . If $|z_n|$ exceeds a certain bound (usually 2) after a number of iterations, $c$ is outside the set. The color of points outside the set often depends on how quickly they escape.

Part 3: Peeling Back the Layers - Visible Surface Detection (Hidden Surface Removal)

In a 3D scene, many objects or parts of objects might be hidden from view by other objects closer to the camera. Visible Surface Detection (VSD), also known as Hidden Surface Removal (HSR), algorithms determine which surfaces are visible and should be rendered. This is crucial for realism.

Back-Face Detection (Back-Face Culling): For solid objects represented by polygon meshes, polygons facing away from the viewer are "back faces" and are usually invisible.
- We can compute the normal vector to a polygon. If the camera is at $(V_x, V_y, V_z)$ and a point on the polygon is $(P_x, P_y, P_z)$ , and the normal vector is $N=(N_x, N_y, N_z)$ .
- The polygon is a back face if the dot product of the view vector $V = (P_x-V_x, P_y-V_y, P_z-V_z)$ and the normal $N$ is positive (assuming normals point outwards): $V \cdot N > 0$ . (Or $V \cdot N < 0$ if normals point inwards or view vector is defined from camera to point).
- This is an object-space algorithm. It doesn't handle objects obscuring each other, but it can quickly eliminate about half the polygons in a scene of closed objects.
Depth Buffer Method (Z-Buffering): An image-space algorithm and one of the most common HSR methods.
- A depth buffer (or z-buffer) is used, which is a 2D array with the same dimensions as the screen, storing a depth value (z-value) for each pixel.
- Initialize all depth buffer values to a maximum depth (representing the farthest possible distance). Initialize the frame buffer (color buffer) to the background color.
- For each polygon in the scene:
  - For each pixel $(x,y)$ covered by the polygon, calculate its depth $z$ .
  - If $z$ is less than the depth stored in depth_buffer[x][y] (i.e., this part of the polygon is closer to the viewer):
    - Update depth_buffer[x][y] = z.
    - Update frame_buffer[x][y] to the color of the polygon at that pixel.
- Pros: Simple to implement, handles intersecting polygons correctly, works well with hardware acceleration.
- Cons: Requires extra memory for the depth buffer. Can have precision issues (z-fighting).
Depth Sorting Method (Painter's Algorithm): An object-space algorithm that works like an oil painter: paint distant objects first, then paint closer objects on top of them.
1. Sort all polygons in the scene by their farthest z-coordinate from the viewer.
2. Render the polygons in order, from back to front (farthest to nearest).
- Pros: Conceptually simple.
- Cons: Sorting can be computationally expensive. Fails for certain complex cases like intersecting polygons or cyclically overlapping polygons, requiring polygons to be split.

Part 4: Let There Be Light! - Illumination Models and Surface Rendering

Simply drawing 3D shapes isn't enough; to make them look realistic, we need to simulate how light interacts with their surfaces. This is where illumination models and surface rendering techniques come into play.

Basic Illumination Models (Local Illumination)

These models consider direct light from light sources and the properties of the object's surface, but not global effects like reflections from other objects or soft shadows. The total illumination at a point on a surface is often a sum of three components:

Ambient Light: This represents indirect light that's been scattered around the scene so much that it seems to come from all directions equally. It ensures that even parts of an object not directly lit by a light source are not completely black. $I_{ambient} = k_a I_a$ where $k_a$ is the surface's ambient reflection coefficient (how much ambient light it reflects, $0 \le k_a \le 1$ ) and $I_a$ is the intensity of the ambient light.
Diffuse Reflection (Lambertian Reflection): This simulates light reflecting equally in all directions from a matte (non-shiny) surface. The amount of reflected light depends on the angle between the surface normal ( $N$ ) and the direction to the light source ( $L$ ). $I_{diffuse} = k_d I_p (N \cdot L)$ where $k_d$ is the surface's diffuse reflection coefficient, $I_p$ is the intensity of the point light source, $N$ is the normalized surface normal vector, and $L$ is the normalized vector pointing from the surface point to the light source. The dot product $(N \cdot L)$ is $\cos\theta$ , where $\theta$ is the angle between $N$ and $L$ . If $(N \cdot L) < 0$ , the surface is facing away from the light, so this term is 0.
Specular Reflection (Highlights): This simulates the bright highlights seen on shiny surfaces. The reflection is strongest when the viewing direction ( $V$ ) is aligned with the reflection direction ( $R$ ) of the light. The reflection vector $R$ can be calculated as $R = 2(N \cdot L)N - L$ . $I_{specular} = k_s I_p (R \cdot V)^{n_{shiny}}$ where $k_s$ is the surface's specular reflection coefficient, $V$ is the normalized vector pointing from the surface point to the viewer, and $n_{shiny}$ is the shininess exponent (or specular exponent), which controls the size and sharpness of the highlight (higher values mean smaller, sharper highlights). If $(R \cdot V) < 0$ , there's no specular highlight.

The total intensity for a single light source is $I = I_{ambient} + I_{diffuse} + I_{specular}$ . This calculation is done for each color component (e.g., Red, Green, Blue).

Surface Rendering (Shading) Methods

Once we have an illumination model, we need a way to apply it to render the surfaces of our objects (typically polygon meshes).

Flat Shading: The simplest and fastest method. The illumination model is applied once for each polygon (e.g., at its centroid or one of its vertices) to calculate a single color. The entire polygon is then filled with this color.
- Pros: Very fast.
- Cons: Produces a faceted, blocky appearance, especially for surfaces that are supposed to be smooth. Mach band effect (visual artifacts at shared edges) can be prominent.
Gouraud Shading (Intensity Interpolation): Provides a smoother appearance than flat shading.
1. Calculate the surface normal at each vertex of the polygon (often by averaging the normals of adjacent polygons).
2. Apply the illumination model at each vertex to calculate the light intensity (color) at that vertex.
3. Linearly interpolate these intensities along the edges of the polygon.
4. Linearly interpolate the intensities across each scan line that fills the polygon.
- Pros: Smoother than flat shading, reduces faceted look.
- Cons: Can miss specular highlights if they fall in the middle of a polygon (since calculations are only at vertices). Mach bands can still be visible.
Phong Shading (Normal-Vector Interpolation): Produces the most realistic smooth shading of these three methods.
1. Calculate the surface normal at each vertex of the polygon.
2. Linearly interpolate the vertex normals along the edges of the polygon.
3. Linearly interpolate these normals across each scan line.
4. For each pixel along the scan line, apply the full illumination model using the interpolated normal vector for that pixel.
- Pros: Produces very smooth surfaces and accurately renders specular highlights that may occur in the interior of polygons.
- Cons: Computationally more expensive than Gouraud shading because the full illumination model is applied per pixel.

Part 5: Painting the Digital World - Color Models

Color is a fundamental aspect of how we perceive graphics. Computers need a way to represent and manipulate colors numerically. Color models provide a systematic way to do this.

RGB (Red, Green, Blue): An additive color model. Colors are created by adding different intensities of Red, Green, and Blue light. This is the primary model used for displays like monitors and phone screens.
- Each color component (R, G, B) is typically represented by a value, often from 0 to 255 (for 8-bit color depth per channel).
- $(0,0,0)$ represents Black (no light).
- $(255,255,255)$ represents White (full intensity of all three).
- $(255,0,0)$ is pure Red, $(0,255,0)$ is pure Green, $(0,0,255)$ is pure Blue.
- Yellow is $(255,255,0)$ (Red + Green).
- Cyan is $(0,255,255)$ (Green + Blue).
- Magenta is $(255,0,255)$ (Red + Blue).
CMY(K) (Cyan, Magenta, Yellow, Key/Black): A subtractive color model. Colors are created by subtracting (absorbing) certain wavelengths of light from white light, using inks or pigments. This is primarily used in printing.
- Cyan absorbs Red. Magenta absorbs Green. Yellow absorbs Blue.
- Theoretically, mixing C, M, and Y in full intensity should produce Black. In practice, it often results in a muddy dark brown.
- Therefore, Black (K) ink is added for true blacks and to save on colored inks. This is the CMYK model.
- Conversion from RGB to CMY (assuming values are normalized 0-1): $C = 1 - R$ $M = 1 - G$ $Y = 1 - B$
HSV (Hue, Saturation, Value) / HSL (Hue, Saturation, Lightness): These models represent color in a way that is often more intuitive for humans to understand and manipulate, as they align better with how we perceive color attributes.
- Hue (H): The pure color (e.g., red, green, blue, yellow, orange). Represented as an angle around a color wheel (0° to 360°). 0° is typically Red.
- Saturation (S): The "purity" or intensity of the color. How much gray is mixed with the hue. Ranges from 0 (gray) to 1 (or 100%) (fully saturated, pure color).
- Value (V) / Brightness: The brightness of the color. Ranges from 0 (black) to 1 (or 100%) (the brightest version of the color).
- Lightness (L) (in HSL): Similar to Value, but 0 is black, 1 is white, and 0.5 is the "purest" color.
- These models are useful for color pickers in software, allowing users to select a hue and then adjust its saturation and brightness/lightness.

The 3D Adventure Continues!

And there you have it – a whirlwind tour of 3D transformations, the art of representing complex 3D shapes, the crucial techniques for figuring out what we actually see, how light brings scenes to life, and how we define the colors that paint our digital worlds! These concepts are the bedrock of modern animation, gaming, virtual reality, and so much more.

The journey into 3D graphics is vast and filled with even more fascinating topics like advanced rendering techniques (ray tracing, radiosity), texturing, and animation, which we'll explore in future posts!