Clean Architecture
Component Principles
- Component principles are a layer of abstraction above SOLID principles
- SOLID principles tell us how to arrange bricks into rooms
- Component principles tell us how to arrange rooms into buildings
Chapter 12 - Components
- Components are the units of deployment: the smallest entities that can be deployed as part of a system
- They can be binary or source file aggregations depending on the language (.jar, .dll, etc.)
- Components can be combined into an executable, an archive file or dynamically loaded in as a plugin
- Well-designed components should be independently deployable and, therefore, independently developable
- Reduces dependencies and coupling between components
- Parallel development in a team
- Test components in isolation
A Brief History of Components
- Early programmers had to manually control the memory location and layout of their programs as they were not relocatable
- Ex. PDP-8 program
- PDP-8 is a 12-bit minicomputer
- GETSTR is a subroutine (declared with a leading 0, where the JMS instruction stores the return address) that reads keyboard input into a buffer
- *200 is the origin statement, instructing the assembler to generate code to be loaded at memory address 200₈ (octal)
- Nowadays, memory location is abstracted away from the programmer
- Accessing library functions in the past:
- Libraries were maintained as source code along with the application code, meaning all code was compiled together
- Slow devices with limited memory meant compiling the whole source code was difficult
- Had to perform multiple passes over the source code, resulting in long compile times
- Becomes worse as program or library sizes increased
- The solution was to separate the library and application source codes
- The library functions are precompiled and loaded at a known, fixed address (2000₈ in octal)
- This separation enabled the reuse of function libraries without having to recompile them every time the application code was compiled
- Issue occurs when the application code grows too large and it needs to be split into more address segments
- The library functions could also continue to grow, constantly shifting the boundaries of the address segments that the application code occupies
Relocatability
- Relocatable binaries are compiled programs that can be loaded at different memory addresses
- Solution to address memory fragmentation and fixed memory addresses
- A smart loader is responsible for loading these programs and adjusting their memory references accordingly
- The binary contains flags, indicating to the loader which memory references need adjustments when it's loaded into memory
- Uses relative / offset addresses to determine final location of references
- Example:
- Expected memory address of binary: 0x1000
- Loader decides to load it at: 0x5000
- Loader will adjust all memory references in the program by adding this difference: 0x5000 - 0x1000 = 0x4000
- Address for funcA() expected to be 0x1000, updated address will be 0x5000
- Address for funcB() expected to be 0x1200, updated address will be 0x5200
- Removes the need to predetermine memory addresses at compile time, which could create conflicts between libraries / programs
- The loader adapts the memory addresses during the loading process, as sketched in the code below
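- A minimal sketch of this fixup step in C (the Relocation record and relocate() function below are illustrative, not a real object-file format):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical relocation record: each entry marks an offset within the
 * loaded image that holds an absolute address needing adjustment. */
typedef struct {
    size_t offset;   /* where in the image the flagged address is stored */
} Relocation;

/* Adjust every flagged reference by the difference between the base address
 * the binary was compiled for and the address it was actually loaded at. */
void relocate(uint8_t *image, const Relocation *relocs, size_t count,
              uintptr_t expected_base, uintptr_t actual_base)
{
    uintptr_t delta = actual_base - expected_base;   /* 0x5000 - 0x1000 = 0x4000 */
    for (size_t i = 0; i < count; i++) {
        uintptr_t *ref = (uintptr_t *)(image + relocs[i].offset);
        *ref += delta;   /* e.g. funcA: 0x1000 -> 0x5000, funcB: 0x1200 -> 0x5200 */
    }
}
```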
- Compiler also emits the names of functions as metadata
- Calling a library function would emit that name as an external reference
- Defining a library function would emit that name as an external definition
- The smart loader can link the reference to the definition once the code has been loaded
- Example:
```c
// main.c
#include "utils.h"

int main() {
    int result = add(1, 2);  // marks add() as an external reference
    return 0;
}
```

```c
// utils.c
int add(int a, int b) {  // marks add() as an external definition
    return a + b;
}
```
- After main.c and utils.c are compiled, the linker checks object files for any unresolved external references
- Here, main.c has an external reference to add() and utils.c defines add(), so the linker connects the call to the actual definition
- The resulting binary file contains metadata about where the loader should load different functions and other data in memory
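- A toy model of that resolution step in C (the Definition / Reference structures and the addresses below are made up for illustration; real object files carry this information in symbol and relocation tables):

```c
#include <stdio.h>
#include <string.h>

/* Each object file contributes definitions (name -> address) and
 * unresolved references (name -> call site that needs to be patched). */
typedef struct { const char *name; unsigned long address;   } Definition;
typedef struct { const char *name; unsigned long call_site; } Reference;

int main(void) {
    Definition defs[] = { { "add", 0x2000 } };  /* from utils.o */
    Reference  refs[] = { { "add", 0x1010 } };  /* from main.o  */

    /* The linker matches every unresolved reference to a matching definition. */
    for (size_t r = 0; r < sizeof refs / sizeof refs[0]; r++) {
        for (size_t d = 0; d < sizeof defs / sizeof defs[0]; d++) {
            if (strcmp(refs[r].name, defs[d].name) == 0) {
                printf("resolved %s: call at 0x%lx -> definition at 0x%lx\n",
                       refs[r].name, refs[r].call_site, defs[d].address);
            }
        }
    }
    return 0;
}
```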
Linkers
- Early linking loaders enabled programmers to compile programs into smaller, manageable segments
- These segments can be combined into a single executable
- Made programming more modular and efficient, especially for smaller programs / libraries
- However, the process of linking at load time became too slow as code size outpaced the technology at the time
- Lots of big libraries, need to resolve lots of external references
- Slow storage devices, using magnetic tapes and disks
- The phases of linking and loading were separated into two steps
- A linker is a separate application that handles linking all object files and resolving external references to produce a single relocatable executable file
- The relocatable file already contains all of the linked code and can still be loaded at different memory addresses, depending on what's available at runtime
- Linking was the slow part of the process, but doing it once up front enabled fast loading at any time
- A relocating loader loads the linked relocatable file into memory very quickly
- In this two step approach:
- Linker only runs once
- Loader simply loads relocatable file into memory, only adjusting memory addresses to ensure the program runs in the correct location
- Murphy's law of program size
- "Programs will grow to fill all available compile and link time"
- Highlights that even as tools continue to improve, programmers will push program sizes to the limit, so compilation and linking times remain long
- In the 1980s, technology accelerated which led to improved computing power and memory efficiency
- Linking times reduced to seconds
- Can scrap the two phase approach and go back to linking at load time
- The concept of a component plugin architecture emerged: developers could now create modular, reusable pieces of code that are dynamically linked into programs at runtime (see the sketch after this list)
- Adding a .jar file to a project without needing to recompile it (ex. adding a mod to Minecraft)
- Loading DLLs at runtime to extend VSCode functionality without modifying the core application
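- A minimal sketch of runtime plugin loading on a POSIX system using dlopen/dlsym; the plugin file name plugin.so and the entry-point name plugin_init are hypothetical (link with -ldl on Linux):

```c
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    /* Load a plugin shared library at runtime, without recompiling the host. */
    void *handle = dlopen("./plugin.so", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }

    /* Look up the plugin's entry point by name: an external reference
     * resolved at runtime instead of at link time. */
    void (*plugin_init)(void) = (void (*)(void))dlsym(handle, "plugin_init");
    if (plugin_init)
        plugin_init();   /* run the dynamically loaded code */

    dlclose(handle);     /* unload the plugin when done */
    return 0;
}
```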
Conclusion
- Dynamically linked files are the core of modern component-based architectures
- They can be independently developed, tested and deployed
- They can be loaded and linked at runtime quickly without affecting the core application
- Nowadays, component plugin architectures are common practice: new functionality is "plugged in" without recompiling or redeploying the entire system
- Browser extensions
- VSCode extensions
- Minecraft mods
- Systems can be scaled and maintained much more easily now that the technological bottlenecks of that era have been overcome