Graduate Program KB

Clean Architecture

Component Principles

  • Component principles are a layer of abstraction above SOLID principles
    • SOLID principles tell us how to arrange bricks into rooms
    • Component principles tell us how to arrange rooms into buildings

Chapter 12 - Components

  • Components are the units of deployment: the smallest entities that can be deployed as part of a system
    • Depending on the language, they are aggregations of binary or source files (.jar, .dll, etc.)
    • Components can be combined into an executable, bundled into an archive file, or loaded dynamically as a plugin
  • Well-designed components should be independently deployable and, therefore, independently developable
    • Reduces dependencies and coupling between components
    • Allows parallel development within a team
    • Allows components to be tested in isolation

A Brief History of Components

  • Early programmers had to manually control the memory location and layout of their programs, because programs were not relocatable
  • Example: a PDP-8 program
    • The PDP-8 is a 12-bit minicomputer
    • GETSTR is a subroutine that saves keyboard input into a buffer (the 0 at its label reserves the word where the return address is stored)
    • *200 is the origin statement, instructing the assembler to generate code starting at memory address 200₈ (octal)
  • Nowadays, memory location is abstracted away from the programmer
  • Accessing library functions in the past:
    • Libraries were maintained as source code along with the application code, meaning all code was compiled together
    • Slow devices with limited memory made compiling the whole source code difficult
      • The compiler had to make multiple passes over the source code, resulting in long compile times
      • This became worse as programs and libraries grew
  • The solution was to separate the library source code from the application source code
    • The library functions were precompiled and loaded at a known, fixed address (2000₈, octal)
    • This separation enabled function libraries to be reused without recompiling them every time the application code was compiled
      • Problems arose when the application code grew too large for its allotted space and had to be split across more address segments
      • The library could also keep growing, which kept shifting the boundaries of the address segments that the application code occupied

Relocatability

  • Relocatable binaries are compiled programs that can be loaded at different memory addresses
    • This solved the problems of memory fragmentation and fixed memory addresses
  • Smart loaders load these programs and adjust their memory references accordingly
    • The binary contains flags that tell the loader which memory references need adjusting when the binary is loaded into memory
      • The loader uses relative (offset) addresses to determine the final location of each reference
      • Example:
        • Expected memory address of binary: 0x1000
        • Loader decides to load it at: 0x5000
        • Loader will adjust all memory references in the program by adding this difference: 0x5000 - 0x1000 = 0x4000
        • Address for funcA() expected to be 0x1000, updated address will be 0x5000
        • Address for funcB() expected to be 0x1200, updated address will be 0x5200
    • Removes the need to predetermine memory addresses at compile time, which could create conflicts between libraries / programs
    • The loader can adapt the memory addresses during the loading process
  • The compiler also emits the names of functions as metadata
    • Calling a library function emits that name as an external reference
    • Defining a library function emits that name as an external definition
    • The smart loader can link each reference to its definition once it knows where the definition has been loaded
    • Example:
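      // utils.h
      int add(int a, int b); // Declaration only: tells the compiler add() exists elsewhere

      // main.c
      #include "utils.h"

      int main(void) {
          int result = add(1, 2); // Emits add as an external reference

          return result == 3 ? 0 : 1; // Exit code 0 if the linked add() returned 3
      }

      // utils.c
      #include "utils.h"

      int add(int a, int b) { // Emits add as an external definition
          return a + b;
      }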
      
      • After main.c and utils.c are compiled, the linker checks the object files for any unresolved external references
      • Here, main.c has an external reference to add() and utils.c defines add(), so the call is linked to the actual definition
      • The resulting binary file contains metadata telling the loader where functions and other data go in memory and which references to adjust
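
The address adjustment from the 0x1000 / 0x5000 example earlier in this section can be sketched in C. This is a minimal illustration, not how any real loader or object format works: the ToyImage struct, its fields, and the relocate() and example_image() functions are all hypothetical names invented for these notes. The relocs array plays the role of the "flags" that mark which words hold addresses.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_WORDS 4

/* Hypothetical toy binary: each word is either plain data or an address. */
typedef struct {
    uint32_t assumed_base;       /* load address the references currently assume */
    uint32_t words[TOY_WORDS];   /* program contents */
    size_t   reloc_count;        /* number of entries used in relocs */
    size_t   relocs[TOY_WORDS];  /* indices of the words flagged as addresses */
} ToyImage;

/* Shift every flagged word by (actual_base - assumed_base); plain data is untouched. */
void relocate(ToyImage *img, uint32_t actual_base) {
    uint32_t delta = actual_base - img->assumed_base; /* e.g. 0x5000 - 0x1000 = 0x4000 */
    for (size_t i = 0; i < img->reloc_count; i++) {
        img->words[img->relocs[i]] += delta;
    }
    img->assumed_base = actual_base;
}

/* Image matching the notes: funcA at 0x1000, funcB at 0x1200, plus two data words. */
ToyImage example_image(void) {
    ToyImage img = {
        .assumed_base = 0x1000,
        .words = { 0x1000, 42, 0x1200, 7 },
        .reloc_count = 2,
        .relocs = { 0, 2 },
    };
    return img;
}
```

Calling relocate(&img, 0x5000) on the image returned by example_image() moves the two flagged words to 0x5000 and 0x5200, while the data words 42 and 7 stay as they are, because only flagged words are treated as addresses.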

Linkers

  • Early linking loaders enabled programmers to divide programs into smaller, separately compilable segments
    • These segments could then be combined into a single executable
    • This made programming more modular and efficient, especially for smaller programs / libraries
    • However, linking at load time became too slow as code size outpaced the technology of the time
      • Large libraries meant many external references to resolve
      • Storage devices such as magnetic tapes and disks were slow
  • Linking and loading were therefore separated into two phases
    • A linker is a separate application that links all object files and resolves external references to produce a single linked, relocatable file
      • The relocatable file already contains all the linked code, and it can still be loaded at different memory addresses depending on what is available at runtime
      • Linking remained the slow part of the process, but doing it ahead of time meant the program could be loaded quickly at any time
    • A relocating loader loads the linked relocatable file into memory very quickly
    • In this two-step approach:
      • The linker runs only once
      • The loader simply loads the relocatable file into memory, adjusting only the memory addresses so that the program runs correctly at its actual location
  • Murphy's law of program size
    • "Programs will grow to fill all available compile and link time"
    • Even as tools improve, programmers keep pushing the boundaries, so compile and link times grow long again
  • In the 1980s, hardware improved rapidly, bringing more computing power and cheaper, faster memory and storage
    • Link times dropped to seconds
    • The two-phase approach could be dropped in favor of linking at load time again
    • The component plugin architecture was born: developers could create modular, reusable pieces of code that are dynamically linked to programs at runtime
      • Adding a .jar file to an application without needing to recompile it (e.g. adding a mod to Minecraft)
      • Loading DLLs at runtime to extend an application's functionality without modifying the core application
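
The link step described in this section, where each external reference is matched by name to an external definition before the program is loaded, might look roughly like the C sketch below. It is an assumption-laden toy: ExternalDef, ExternalRef and resolve() are hypothetical names made up for these notes, and real linkers work on object-file symbol tables rather than arrays of structs.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a function name and the offset where it is defined. */
typedef struct {
    const char *name;
    unsigned    offset;
} ExternalDef;

/* Hypothetical record: a call site waiting for the address of a named function. */
typedef struct {
    const char *name;
    unsigned   *slot;   /* where the resolved offset gets written */
} ExternalRef;

/* Match each reference to a definition by name; return how many stay unresolved. */
size_t resolve(ExternalRef *refs, size_t nrefs,
               const ExternalDef *defs, size_t ndefs) {
    size_t unresolved = 0;
    for (size_t i = 0; i < nrefs; i++) {
        int found = 0;
        for (size_t j = 0; j < ndefs; j++) {
            if (strcmp(refs[i].name, defs[j].name) == 0) {
                *refs[i].slot = defs[j].offset; /* patch the call site */
                found = 1;
                break;
            }
        }
        if (!found) {
            unresolved++; /* a real linker would report an undefined symbol here */
        }
    }
    return unresolved;
}
```

Because this name matching is done once by the linker, the loader is left with only the cheap address adjustment, which is the point of the two-step approach.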

Conclusion

  • Dynamically linked files, which can be plugged together at runtime, are the core of modern component-based architectures
    • They can be independently developed, tested and deployed
    • They can be loaded and linked quickly at runtime without affecting the core application
  • Component plugin architectures are now common practice: new functionality is "plugged in" without recompiling or redeploying the entire system
    • Browser extensions
    • VSCode extensions
    • Minecraft mods
  • Systems are now much easier to scale and maintain because the technological bottlenecks of earlier decades have been overcome
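
The plugin idea summarized above can be illustrated with a small C sketch in which the host knows only an interface and never the concrete plugin. Everything here is hypothetical and invented for these notes (Plugin, Host, host_register(), host_run(), the "doubler" plugin). To stay self-contained, the plugin is registered in-process; in a real system the host would typically obtain it from a dynamically loaded library instead.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PLUGINS 8

/* Hypothetical plugin contract: the only thing the host knows about a plugin. */
typedef struct {
    const char *name;
    int (*run)(int input);
} Plugin;

/* Hypothetical host: keeps a table of whatever plugins were registered. */
typedef struct {
    const Plugin *plugins[MAX_PLUGINS];
    size_t        count;
} Host;

/* Register a plugin; returns 0 on success, -1 if the table is full. */
int host_register(Host *host, const Plugin *plugin) {
    if (host->count >= MAX_PLUGINS) {
        return -1;
    }
    host->plugins[host->count++] = plugin;
    return 0;
}

/* Run the named plugin, storing its result in *out; returns -1 if it is not registered. */
int host_run(const Host *host, const char *name, int input, int *out) {
    for (size_t i = 0; i < host->count; i++) {
        if (strcmp(host->plugins[i]->name, name) == 0) {
            *out = host->plugins[i]->run(input);
            return 0;
        }
    }
    return -1;
}

/* Example plugin, written without touching the host code above. */
int double_it(int x) { return x * 2; }
const Plugin doubler_plugin = { "doubler", double_it };
```

The host code never names double_it directly, so a new plugin can be added by writing another Plugin value and registering it, without changing or recompiling the host logic.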