Graduate Program KB

Clean Code

Chapter 13

Concurrency

  • A decoupling strategy: separates what gets done from when it gets done

    • Can improve both the throughput and the structure of an application
    • Can make systems easier to understand and offers a way to separate concerns
  • Examples where concurrency is useful

    • A system that handles one user at a time and requires one second per user
      • As the number of users increases, so does the response time
    • A system that interprets large data sets but only provides a solution after processing all of them
      • Each data set could be processed in parallel, providing partial solutions as each one completes

Myths and Misconceptions

  • Common myths and misconceptions:
    • Concurrency always improves performance
      • It can sometimes improve performance, but only when there is a lot of wait time that can be shared between multiple threads or processors
    • Design doesn't change when writing concurrent programs
      • A concurrent algorithm can differ substantially from an algorithm designed for a single-threaded system
    • Understanding concurrency issues is not important when working with a container such as a Web or EJB container
      • You still need to know how to prevent concurrent updates to shared resources and how to avoid deadlock
  • More balanced statements about concurrency:
    • Concurrency incurs some overhead, both in performance and in additional code to write
    • Correct concurrency is complex, even for simple problems
    • Concurrency bugs aren't usually repeatable, so they are often dismissed as one-off glitches rather than real defects
    • Concurrency often requires a fundamental change in design strategy

Challenges

  • Consider the snippet:

    public class X {
        private int lastIdUsed;

        public int getNextId() {
            return ++lastIdUsed; // not atomic: a read, an increment, then a write
        }
    }
    
  • Suppose an instance of X is created with lastIdUsed set to 42 and shared between two threads. If each thread calls getNextId() once, there are three possible outcomes:

    • Thread 1 gets 43, thread 2 gets 44, lastIdUsed is 44
    • Thread 1 gets 44, thread 2 gets 43, lastIdUsed is 44
    • Thread 1 gets 43, thread 2 gets 43, lastIdUsed is 43
  • The unpredictable results arise from the many possible execution paths the two threads can take through that one line of code

    • This happens because both threads access and modify a shared resource (lastIdUsed) at the same time, without synchronisation
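  • One possible fix (a sketch, not the chapter's own code; SafeX is a hypothetical name): make the increment a single atomic operation, for example with AtomicInteger:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a thread-safe variant of X: AtomicInteger turns the
// read-increment-write sequence into one atomic operation, so two
// threads can no longer observe the same value of lastIdUsed.
public class SafeX {
    private final AtomicInteger lastIdUsed;

    public SafeX(int startValue) {
        lastIdUsed = new AtomicInteger(startValue);
    }

    public int getNextId() {
        return lastIdUsed.incrementAndGet();
    }
}
```

    With this version, two threads starting from 42 always receive 43 and 44 in some order, and lastIdUsed always ends at 44.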

Concurrency Defense Principles

  • Single Responsibility Principle

    • Concurrency design is complex enough that it should be separated from the rest of the code
    • Has its own life cycle of development, change and tuning
    • Has its own set of challenges
    • Concurrency code is miswritten often enough that it deserves to be isolated from other code
  • Limit the scope of data

    • Threads modifying the same resource cause unexpected behaviour, so restrict access via critical sections and keep the scope of shared data small
    • The more places shared data gets updated:
      • The easier it is to forget to protect one of them with a critical section
      • The more effort it takes to ensure everything is guarded effectively
      • The harder it is to determine the source of failures
  • Use copies of data

    • Avoid sharing data in the first place: each thread operates on its own copy, and results are merged on a single thread once all threads finish
  • Threads should be as independent as possible

    • Write threaded code so that each thread exists in its own world, sharing no data with any other thread
    • This removes the need for synchronisation altogether
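  • The copy-and-merge idea above can be sketched as follows (CopyAndMerge is a hypothetical helper, not from the chapter): each task receives its own copy of a slice of the input, runs without sharing anything, and the partial results are merged back on the calling thread, so the data itself needs no locking:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CopyAndMerge {
    // Sums the list by splitting it into chunks; every task owns a
    // private copy of its chunk, so no shared mutable state exists.
    public static long sum(List<Integer> data, int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            int chunk = (data.size() + tasks - 1) / tasks;
            for (int i = 0; i < data.size(); i += chunk) {
                // Copy the slice so the task owns its data outright.
                List<Integer> copy = new ArrayList<>(
                        data.subList(i, Math.min(i + chunk, data.size())));
                partials.add(pool.submit(() -> {
                    long s = 0;
                    for (int n : copy) s += n;
                    return s;
                }));
            }
            long total = 0; // merge the partial results on this single thread
            for (Future<Long> f : partials) total += f.get();
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

    The extra copies cost memory, but in exchange no synchronisation is needed while the tasks run.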

Execution Models

  • Terminology

    • Bound resources: Resources of fixed size
    • Mutual exclusion: Only one thread can access a shared resource at a time
    • Starvation: A thread or group of threads cannot proceed because it waits forever, or for an excessively long time, for the resources it needs
    • Deadlock: Two or more threads wait for each other to finish, each holding a resource the other needs
    • Livelock: Threads in lockstep, each trying to do work but unable to make progress for a long time
  • Producer-Consumer

    • One or more producer threads create work and place it in a buffer or queue
    • One or more consumer threads take work from the queue and complete it
    • The queue is a bound resource: producers wait for free space in the queue before writing, and consumers wait until the queue is non-empty before reading
  • Readers-Writers

    • Writers wait until there are no active readers before performing an update
    • A continuous stream of readers can starve the writers
    • Conversely, if there are frequent writers and they are given priority, throughput suffers
    • The challenge is to balance the needs of both and avoid starvation
  • Dining philosophers

    • There are many threads competing for relatively few resources
    • Threads compete for the resources, causing some of them to wait for a resource to become available again
    • Can experience deadlock, livelock and degraded throughput
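  • The producer-consumer model can be sketched with Java's BlockingQueue, whose bounded put/take operations provide exactly the waiting described above (the class name and sizes here are illustrative):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumer {
    // One producer fills a bounded queue; the calling thread consumes.
    public static int consumeAll(int items) throws InterruptedException {
        // The queue is the bound resource: capacity 5.
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(5);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= items; i++) {
                    queue.put(i); // blocks while the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        int consumed = 0;
        for (int i = 0; i < items; i++) {
            queue.take(); // blocks while the queue is empty
            consumed++;
        }
        producer.join();
        return consumed;
    }
}
```

    The queue itself enforces the coordination, so neither side needs explicit wait/notify code.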

Dependencies between synchronised methods

  • Avoid using more than one method on a shared object

  • For situations using more than one method:

    • Client-based locking: The client locks the server before calling the first method and holds the lock until after the last method call
    • Server-based locking: Within the server, create one method that locks the server, calls all the methods, then unlocks; the client calls that single method
    • Adapted server: Create an intermediary that performs the locking, for cases where the original server cannot be changed
  • Locks create overhead and delay, keep critical sections as minimal as possible
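  • A sketch of server-based locking, loosely modelled on the kind of iterator example often used to illustrate this problem (the class name is illustrative): composing hasNext() and next() from a client is unsafe because another thread can run between the two calls, so the server offers one synchronized method that does both:

```java
public class IntegerIterator {
    private int nextValue = 0;
    private final int limit;

    public IntegerIterator(int limit) {
        this.limit = limit;
    }

    // Unsafe when composed by clients: hasNext() then next() is two
    // separate critical sections, and another thread can run between them.
    public synchronized boolean hasNext() { return nextValue < limit; }
    public synchronized int next()        { return nextValue++; }

    // Server-based locking: one method, one critical section.
    // (synchronized is reentrant, so the nested calls are safe.)
    public synchronized Integer getNextOrNull() {
        return hasNext() ? next() : null;
    }
}
```

    Clients now make a single call instead of holding their own lock across two, keeping the critical section inside the server and as small as possible.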

Extra notes

  • Think about shut-down code early and get it working early; clean shut-down is difficult to achieve when threads deadlock before receiving the shut-down signal
  • Write tests to expose potential problems then run with different program and system configurations
    • Treat spurious failures as candidate threading issues
    • Get non-threaded code passing first
    • Make threaded code pluggable and tunable
    • Run with more threads than processors to encourage task swapping
    • Run on different platforms
    • Try to force failures by instrumenting the code (e.g. with added calls to sleep or yield)