Chapter 13 - Concurrency
- Concurrency is a decoupling strategy
- Decouples what gets done from when it gets done
- Structure of the application
- Looks more like many little collaborating computers - separation of concerns
- Response time
- Handle more data at the same time
- Myths and misconceptions
- Concurrency always improves performance
- Only when there is a lot of wait time that can be shared between multiple threads
- Design does not change when writing concurrent programs
- Decoupling of what and when has a large effect on the structure of the system
- Understanding concurrency issues is not important when working with a container such as a Web or EJB container
- Need to know what it is doing and how to guard against concurrent updates and deadlock
- More balanced
- Concurrency incurs some overhead
- Performance as well as additional code
- Correct concurrency is complex
- Concurrency bugs aren't usually repeatable
- Can get ignored as one-offs
- Concurrency often requires a fundamental change in design strategy
Challenges
- Order that code executes matters and is arbitrary unless synchronized
- Different possible execution paths
Concurrency defense principles
- Single responsibility principle
- Concurrency related code has its own life cycle of development
- Has its own challenges, which are different from and often more difficult than those of non-concurrency-related code
- Keep concurrent code separate from other code
- Limit the scope of data
- Two threads modifying the same field of a shared object can interfere with each other
- Can use a critical section in the code which uses the shared object
- Important to restrict the number of critical sections
- The more places shared data can get updated, the easier it is to forget to protect one of them
- Effort is duplicated making sure everything is guarded
- Difficult to determine source of failures
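The notes above can be sketched in code (Python rather than the book's Java; `Counter` is a hypothetical class): every update to the shared value goes through one small, guarded critical section, so there is only one place to protect.

```python
import threading

class Counter:
    """Shared mutable state with a single, small critical section."""
    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:      # the only place the shared value is updated
            self._value += 1

    def value(self):
        with self._lock:
            return self._value

counter = Counter()

def work():
    for _ in range(10_000):
        counter.increment()

threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value())  # 40000: no updates lost
```

Because the critical section is confined to one method, there is no risk of forgetting to guard some other update site.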
- Use copies of data
- Copy objects and treat as read only
- Copy objects, collect results from multiple threads, merge results in a single thread
- Avoiding locking can be faster than the copy
- Threads should be as independent as possible
- Best if each thread can live in its own world sharing data with no other thread
- Each thread processes one request, with data from an unshared source, stored in local variables
- Attempt to partition data into independent subsets that can be operated on separately
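A minimal sketch of the copy-and-merge idea above (Python; the partitioning scheme is illustrative): each thread operates on its own independent slice and keeps its result local, and a single thread merges at the end, so no locks are needed anywhere.

```python
from concurrent.futures import ThreadPoolExecutor

data = list(range(1, 101))
# Partition into independent subsets: no thread shares data with another
parts = [data[i::4] for i in range(4)]

def subtotal(part):
    # Operates only on its own slice and keeps its result local
    return sum(part)

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(subtotal, parts))

total = sum(results)  # merge the per-thread results in a single thread
print(total)  # 5050
```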
Know your library
- Use provided thread safe collections
- Use executor framework for executing unrelated tasks
- Use nonblocking solutions where possible
- Know what classes are and are not thread safe
- ReentrantLock
- Reentrant: the thread that already holds the lock can acquire it again without blocking
- Normal lock is not reentrant
- Does not remember which thread acquired it
- If a thread attempts to acquire a lock it already holds, it will block
- e.g. with a normal lock, acquiring the lock and then running a callback that needs the same lock deadlocks
- Semaphore
- Lock with a count of permits for threads acquiring a resource
- Limits the number of threads that can use the resource at once
- Multiple readers/consumers can wait on a resource and all be signaled at the same time
- CountDownLatch
- Lock with a count
- Needs to count down to zero before waiting threads will resume
- e.g. want to wait until N threads are finished with the resource or N operations have been performed
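These notes name Java's `java.util.concurrent` classes; Python's `threading` module has close analogues (`RLock` for `ReentrantLock`, `Semaphore`), though no built-in `CountDownLatch`. A sketch under those assumptions, including a hypothetical `CountDownLatch` class built on a `Condition`:

```python
import threading

# Reentrancy: threading.RLock (Python's analogue of Java's ReentrantLock)
# lets the thread that holds it acquire it again; a plain threading.Lock
# would block forever on the second acquire
rlock = threading.RLock()
with rlock:
    reacquired = rlock.acquire(blocking=False)  # same thread: succeeds
    if reacquired:
        rlock.release()
print(reacquired)  # True

# Python has no built-in CountDownLatch; a minimal sketch using a Condition
class CountDownLatch:
    def __init__(self, count):
        self._count = count
        self._cond = threading.Condition()

    def count_down(self):
        with self._cond:
            if self._count > 0:
                self._count -= 1
                if self._count == 0:
                    self._cond.notify_all()

    def wait(self):
        with self._cond:
            self._cond.wait_for(lambda: self._count == 0)

permits = threading.Semaphore(2)   # at most 2 threads use the resource at once
latch = CountDownLatch(5)          # main thread resumes after 5 count-downs
done = []

def worker(i):
    with permits:                  # blocks while both permits are taken
        done.append(i)             # stand-in for using the bounded resource
    latch.count_down()

for i in range(5):
    threading.Thread(target=worker, args=(i,)).start()
latch.wait()                       # returns only when the count reaches zero
print(sorted(done))  # [0, 1, 2, 3, 4]
```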
Know your execution models
- Some definitions
- Bound resources
- Resources of a fixed size or number used in a concurrent environment
- e.g. databases, fixed size read write buffers
- Mutual exclusion
- Only one thread can access resource at a time
- Starvation
- Threads are prohibited from accessing a resource for a long period of time
- Deadlock
- Cyclical dependency: multiple threads each waiting on a resource held by another
- Livelock
- Similar to deadlock, except threads keep switching state instead of blocking
- i.e. grabbing and releasing the same resources without making progress
- Execution models
- Producer-Consumer
- Producer threads create work and add to a queue
- Consumer threads take work from the queue
- Readers-Writers
- Shared resource
- Read by reader threads
- Written to by writer threads
- Need to coordinate so that readers don't read something in the middle of being updated
- Dining Philosophers
- A number of philosophers sitting around a table with a bowl of spaghetti
- Fork between each philosopher
- Philosophers think and eat (not at the same time)
- In order to eat philosopher must pick up fork on both sides
- If neighbour is using a fork, must wait
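The producer-consumer model above can be sketched with Python's `queue.Queue` (a thread-safe bounded queue, analogous to Java's `BlockingQueue`; the sentinel-based shutdown is one common convention, not the only one):

```python
import queue
import threading

work = queue.Queue(maxsize=8)   # bounded queue: producers block when it is full
results = []
SENTINEL = None                 # signals the consumer to stop

def producer():
    for item in range(10):
        work.put(item)          # blocks if the queue is full
    work.put(SENTINEL)

def consumer():
    while True:
        item = work.get()       # blocks until work is available
        if item is SENTINEL:
            break
        results.append(item * 2)

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
print(results)  # [0, 2, 4, ..., 18]: every item processed exactly once
```

The bounded queue coordinates the two sides: the producer waits when the consumer falls behind, and the consumer waits when there is no work.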
Dependencies between synchronized methods
- Synchronized methods will lock the object and not allow other synchronized methods to run until complete
- Dependencies between synchronized methods can cause issues
- Synchronizes individual methods, not transactions
- Two threads may each read the same value and then update it based on what they read, unaware of the other thread's modification
- By the time the second thread writes, the value is no longer what it read; the other thread has already changed it
- Avoid using more than one method on a shared object
- Solutions when more are needed
- Client based locking: Client locks before performing all operations and unlocks after
- Server based locking: New method on server which locks, performs all operations and unlocks
- Adapted server: Intermediary that performs locking, server based locking when the server can't be changed
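A sketch of server-based locking (Python; `Account` and its methods are hypothetical, and an explicit `RLock` stands in for Java's `synchronized`): calling `balance()` and then `withdraw()` from a client is a check-then-act race, so the server provides one method that holds the lock across the whole transaction.

```python
import threading

class Account:
    """Hypothetical shared object; each method is individually guarded,
    like Java's synchronized methods."""
    def __init__(self, balance):
        self._balance = balance
        self._lock = threading.RLock()   # reentrant, so methods can nest

    def balance(self):
        with self._lock:
            return self._balance

    def withdraw(self, amount):
        with self._lock:
            self._balance -= amount

    # Server-based locking: one method holds the lock across the whole
    # check-then-act transaction, so no other thread can interleave
    # between the balance check and the withdrawal
    def withdraw_if_covered(self, amount):
        with self._lock:
            if self.balance() >= amount:
                self.withdraw(amount)
                return True
            return False

account = Account(100)
first = account.withdraw_if_covered(100)
second = account.withdraw_if_covered(100)
print(first, second, account.balance())  # True False 0
```

Client-based locking would instead require every caller to acquire the lock before the check and release it after the update, duplicating the guard at every call site.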
Keep synchronized sections small
- Locks are expensive, create delays and add overhead
Writing correct shutdown code is hard
- Program might not shutdown because it's deadlocked
- Order threads shutdown may matter
- e.g. producer/consumer: if the producer shuts down first, the consumer may block forever, waiting for input that will never come
- Think about shutdown early and get it working early
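One common shutdown pattern (a sketch in Python; a shared `threading.Event` used as a stop flag, with timed waits so workers never block indefinitely on a signal that may never come):

```python
import threading

stop = threading.Event()

def worker():
    while not stop.is_set():
        # Wait with a timeout instead of blocking forever, so a shutdown
        # request is noticed promptly even if no work ever arrives
        stop.wait(0.01)

threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

stop.set()                      # request shutdown
for t in threads:
    t.join(timeout=1)           # bounded join: don't hang on a stuck thread
print(all(not t.is_alive() for t in threads))  # True: clean shutdown
```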
Testing threaded code
- Don't ignore a failure as a one-off
- Get nonthreaded code working first
- Don't try to debug threading and nonthreading bugs at the same time
- Make your threaded code pluggable
- So you can run it in various configurations
- Variable number of threads
- Tests that run at different speeds (in each thread)
- Run for a number of iterations
- Make your threaded code tunable
- Consider allowing the number of threads to change while the program is running
- Possibly automatically based on throughput and system utilization
- Run on different platforms
- Different OS have different schedulers
- Instrument code to try to force failures
- Only a few pathways out of the many thousands of possible pathways may cause a bug
- Probability that a failing path is taken can be low
- Can instrument code to force it to run in different orderings
- Two options
- Hand-coded
- Add calls to wait, sleep, etc...
- Have to manually find the places to do this
- Don't want in production build
- Might not find flaws
- Automated
- Use an aspect that randomly selects between doing nothing, sleeping, or yielding
- If you run the tests a lot of times with random jiggling you have a higher chance of finding bugs
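A sketch of hand-coded jiggling (Python; `jiggle` and `BrokenCounter` are hypothetical names): a jiggle point placed between the read and the write of an unguarded update widens the race window, so lost updates that would almost never appear in normal runs show up quickly.

```python
import random
import threading
import time

def jiggle(enabled=True):
    # "Jiggle point": randomly do nothing, yield, or sleep briefly so that
    # different thread interleavings get exercised on different runs;
    # disable (or compile out) in production builds
    if enabled:
        choice = random.randrange(3)
        if choice == 1:
            time.sleep(0)        # yield to another runnable thread
        elif choice == 2:
            time.sleep(0.0005)   # short sleep

class BrokenCounter:
    """Deliberately unguarded read-modify-write; jiggling widens the race."""
    def __init__(self):
        self.value = 0

    def increment(self):
        current = self.value
        jiggle()                 # force other orderings between read and write
        self.value = current + 1

counter = BrokenCounter()

def hammer():
    for _ in range(200):
        counter.increment()

threads = [threading.Thread(target=hammer) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Updates can be lost, so the total can only be at most 800; with jiggling,
# a shortfall shows up on most runs instead of once in thousands
print(counter.value)
```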
Other resources