Graduate Program KB

Chapter 13 - Concurrency

  • Concurrency is a decoupling strategy
    • Decouples what gets done from when it gets done
  • Structure of the application
    • Looks more like many little collaborating computers - separation of concerns
  • Response time
    • Handle more data at the same time
  • Myths and misconceptions
    • Concurrency always improves performance
      • Only when there is a lot of wait time that can be shared between multiple threads
    • Design does not change when writing concurrent programs
    • Decoupling what from when has a large effect on the structure of the system
    • Understanding concurrency issues is not important when working with a container such as a Web or EJB container
      • Need to know what it is doing and how to guard against concurrent updates and deadlock
  • A more balanced view
    • Concurrency incurs some overhead
      • Performance as well as additional code
    • Correct concurrency is complex
    • Concurrency bugs aren't usually repeatable
      • Can get ignored as one-offs
    • Concurrency often requires a fundamental change in design strategy

Challenges

  • The order in which code executes across threads matters, but is arbitrary unless synchronized
    • Many different possible execution paths for the same code (see the sketch below)
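
A minimal sketch of why ordering matters, using a hypothetical RaceConditionDemo class: two threads increment an unsynchronized counter, and because counter++ is a read-modify-write sequence, some interleavings lose updates.

```java
public class RaceConditionDemo {
    private static int counter = 0; // shared, unguarded state

    public static void main(String[] args) throws InterruptedException {
        Runnable increment = () -> {
            for (int i = 0; i < 100_000; i++) {
                counter++; // read-modify-write: not atomic
            }
        };
        Thread t1 = new Thread(increment);
        Thread t2 = new Thread(increment);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Expected 200000, but interleaved updates usually lose some increments.
        System.out.println("counter = " + counter);
    }
}
```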

Concurrency defense principles

  • Single responsibility principle
    • Concurrency related code has its own life cycle of development
    • Has its own challenges, which are different from and often more difficult than those of non-concurrency-related code
    • Keep concurrent code separate from other code
  • Limit the scope of data
    • Two threads modifying the same field of a shared object can interfere with each other
    • Can use a critical section in the code that accesses the shared object (see the sketch after this list)
    • Important to restrict the number of critical sections
    • The more places shared data can get updated:
      • More likely to forget to protect one of those places
      • Duplication of effort to make sure everything is guarded
      • More difficult to determine the source of failures
  • Use copies of data
    • Copy objects and treat as read only
    • Copy objects, collect results from multiple threads, merge results in a single thread
    • The savings from avoiding locking can outweigh the cost of the extra copying
  • Threads should be as independent as possible
    • Best if each thread can live in its own world sharing data with no other thread
    • Each thread processes one request, with data taken from an unshared source and stored in local variables
    • Attempt to partition data into independent subsets that can be operated on separately
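
A minimal sketch of limiting data scope and handing out copies, using a hypothetical SharedResults class: the shared list is private, updates go through a single synchronized critical section, and readers get a defensive copy.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedResults {
    private final List<String> results = new ArrayList<>();

    // The only place the shared data is modified.
    public synchronized void add(String result) {
        results.add(result);
    }

    // Readers work on a copy, so they never see a half-finished update.
    public synchronized List<String> snapshot() {
        return new ArrayList<>(results);
    }
}
```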

Know your library

  • Use provided thread safe collections
  • Use executor framework for executing unrelated tasks
  • Use nonblocking solutions where possible
  • Know what classes are and are not thread safe
    • java.util.concurrent
  • ReentrantLock
    • Reentrant: the thread that already holds the lock can acquire it again without blocking (see the sketch after this list)
    • A normal (non-reentrant) lock
      • Does not remember which thread acquired it
      • If a thread attempts to acquire a lock it already holds, it will block
      • e.g. With a normal lock, if we acquire a lock and then run a callback that requires the same lock, we will deadlock
  • Semaphore
    • Like a lock with a count of permits
    • Limits the number of threads that can use the resource at the same time
    • With more than one permit, multiple readers/consumers can hold the resource concurrently
  • CountDownLatch
    • A synchronization aid initialized with a count
    • The count must reach zero before waiting threads resume
    • e.g. want to wait until N threads are finished with the resource or N operations have been performed
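
A minimal combined sketch of these three java.util.concurrent classes (field names, permit counts, and latch counts are illustrative): a ReentrantLock being re-acquired by its owner, a Semaphore bounding concurrent access, and a CountDownLatch waiting for workers to finish.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;
import java.util.concurrent.locks.ReentrantLock;

public class LibraryExamples {
    private final ReentrantLock lock = new ReentrantLock();
    private final Semaphore connections = new Semaphore(3);   // at most 3 concurrent users
    private final CountDownLatch done = new CountDownLatch(2); // wait for 2 workers

    // Reentrant: the same thread can re-acquire the lock it already holds.
    public void outer() {
        lock.lock();
        try {
            inner(); // would deadlock with a non-reentrant lock
        } finally {
            lock.unlock();
        }
    }

    private void inner() {
        lock.lock();
        try {
            // ... work while holding the lock
        } finally {
            lock.unlock();
        }
    }

    public void useConnection() throws InterruptedException {
        connections.acquire(); // blocks once all 3 permits are taken
        try {
            // ... use the bounded resource
        } finally {
            connections.release();
        }
    }

    public void worker() {
        // ... do work, then signal completion
        done.countDown();
    }

    public void waitForWorkers() throws InterruptedException {
        done.await(); // resumes once the count reaches zero
    }
}
```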

Know your execution models

  • Some definitions
    • Bound resources
      • Resources of a fixed size or number used in a concurrent environment
      • e.g. databases, fixed-size read/write buffers
    • Mutual exclusion
      • Only one thread can access the resource at a time
    • Starvation
      • Threads are prevented from accessing a resource for a long period of time
    • Deadlock
      • A cyclical dependency: multiple threads each waiting on a resource held by another
    • Livelock
      • Similar to deadlock, except threads are switching state instead of being blocked
      • i.e. grabbing and releasing the same resource without making progress
  • Execution models
    • Producer-Consumer
      • Producer threads create work and add to a queue
      • Consumer threads take work from the queue (see the sketch after this list)
    • Readers-Writers
      • Shared resource
      • Read by reader threads
      • Written to by writer threads
      • Need to coordinate so that readers don't read something in the middle of being updated
    • Dining Philosophers
      • A number of philosophers sitting around a table with a bowl of spaghetti
      • Fork between each philosopher
      • Philosophers think and eat (not at the same time)
      • In order to eat philosopher must pick up fork on both sides
      • If neighbour is using a fork, must wait
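
A minimal sketch of the producer-consumer model using java.util.concurrent's BlockingQueue (class and item names are illustrative): the producer blocks when the bounded queue is full, and the consumer blocks when it is empty.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ProducerConsumerDemo {
    public static void main(String[] args) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);

        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    queue.put("work-" + i); // blocks when the queue is full
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                while (true) { // runs until interrupted
                    String item = queue.take(); // blocks when the queue is empty
                    System.out.println("processed " + item);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
    }
}
```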

Dependencies between synchronized methods

  • A synchronized method locks the object, so no other synchronized method on that object can run until it completes
  • Dependencies between synchronized methods can cause issues
    • Synchronizes individual methods, not transactions
    • Two threads may each read the same value and then update it based on what they read, not knowing the other thread is also modifying it
      • By the time the second thread writes, the value is no longer what it read; it has been modified by the other thread
  • Avoid using more than one method on a shared object
  • Solutions when more are needed
    • Client-based locking: the client locks the server before performing all operations and unlocks after (see the sketch below)
    • Server-based locking: a new method on the server locks, performs all the operations, and unlocks
    • Adapted server: an intermediary that performs the locking; server-based locking for when the server can't be changed
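
A minimal sketch contrasting the two approaches, using a hypothetical IntegerIterator class: hasNext() and next() are individually synchronized, but the check-then-act sequence still needs a single lock held across both calls.

```java
public class IntegerIterator {
    private int nextValue = 0;

    public synchronized boolean hasNext() { return nextValue < 100; }
    public synchronized int next()        { return nextValue++; }

    // Server-based locking: one method that holds the lock across
    // the whole check-then-act sequence.
    public synchronized Integer nextOrNull() {
        return hasNext() ? next() : null;
    }
}

class Client {
    // Client-based locking: the caller holds the server's lock across
    // both calls so no other thread can slip in between them.
    Integer takeNext(IntegerIterator iterator) {
        synchronized (iterator) {
            return iterator.hasNext() ? iterator.next() : null;
        }
    }
}
```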

Keep synchronized sections small

  • Locks are expensive; they create delays and add overhead, so guard only the code that really touches shared state (see the sketch below)
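
A minimal sketch of narrowing a critical section, using a hypothetical ReportBuilder class: the slow formatting uses no shared state, so only the append to the shared buffer is synchronized.

```java
public class ReportBuilder {
    private final StringBuilder sharedLog = new StringBuilder();

    public void record(String event) {
        String line = format(event); // slow, but uses no shared state

        synchronized (sharedLog) {   // small critical section
            sharedLog.append(line).append('\n');
        }
    }

    private String format(String event) {
        return System.currentTimeMillis() + " " + event;
    }
}
```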

Writing correct shutdown code is hard

  • Program might not shutdown because it's deadlocked
  • Order threads shutdown may matter
  • e.g. Producer/consumer: if the producer shuts down first, the consumer may block forever waiting for input that will never come
  • Think about shutdown early and get it working early (see the sketch below)
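
A minimal sketch of one common shutdown pattern using java.util.concurrent's ExecutorService (pool size and timeout are arbitrary): stop accepting new work, give running tasks a bounded amount of time, then interrupt whatever is still blocked.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // ... submit tasks ...

        pool.shutdown();                      // no new tasks accepted
        if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
            pool.shutdownNow();               // interrupt workers still blocked
        }
    }
}
```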

Testing threaded code

  • Don't ignore a failure as a one-off
  • Get nonthreaded code working first
    • Don't try to debug threading and nonthreading bugs at the same time
  • Make your threaded code pluggable
    • So you can run it in various configurations
    • Variable number of threads
    • Tests that run at different speeds (in each thread)
    • Run for a number of iterations
  • Make your threaded code tunable
    • Consider allowing the number of threads to change while the program is running
    • Possibly automatically based on throughput and system utilization
  • Run on different platforms
    • Different OSes have different schedulers
  • Instrument code to try to force failures
    • Only a few pathways out of the many thousands of possible pathways may cause a bug
    • Probability that a failing path is taken can be low
    • Can instrument code to force it to run in different orderings
      • Two options
        1. Hand-coded
          • Add calls to wait, sleep, yield, etc.
          • Have to manually find the places to do this
          • Don't want it in the production build
          • Might not find flaws
        2. Automated
          • Use an aspect that randomly selects between doing nothing, sleeping, or yielding
          • Running the tests many times with this random jiggling gives a higher chance of finding bugs (see the sketch below)
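
A minimal sketch of a hand-coded jiggle point (the class name ThreadJigglePoint and the choice of three behaviours are assumptions): calls to jiggle() are sprinkled through the threaded code under test, and in a production build the method body can be swapped for a no-op.

```java
import java.util.Random;

public class ThreadJigglePoint {
    private static final Random random = new Random();

    public static void jiggle() {
        switch (random.nextInt(3)) {
            case 0:
                break;                  // do nothing
            case 1:
                Thread.yield();         // give up the CPU
                break;
            case 2:
                try {
                    Thread.sleep(1);    // force a likely context switch
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                break;
        }
    }
}
```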

Other resources