Graduate Program KB

Functional programming

Functional programing predates programming itself. Based on lambda calculus invented by Alonzo Church in the 1930s

Squares of integers

Procedural code with java

public class Squint {
  public static void main(String args[]) {
    for (int i=0; i<25; i++)
      System.out.println(i*i);
  }
}

Functional code with clojure

(println (take 25 (map (fn [x] (* x x)) (range))))
(println                   ; print
  (take 25                 ; the first 25
    (map (fn [x] (* x x))  ; squares
      (range)              ; of integers
    )
  )
)

The java program uses a mutable variable - i. The clojure program initializes a variable x, but it is never modified. Variables in function languages do not vary - functional programming imposes discipline on assignment.

Immutability and architecture

All race conditions, deadlock conditions, and concurrent update problems are due to mutable variables. You can't have a race condition or a concurrent update problem if no variable is ever updated. You can't have deadlocks without mutable locks - I think this statement is a bit of a cop-out, using the lock itself and ignoring the use, better to point out that you don't need a lock with immutable state.


All the problems we face in concurrent applications are due to mutable state. If you want to make sure the systems you design are robust in the presence of multiple threads and processors you should ask if immutability is practical.


Immutability is always practical if you have infinite storage and processor speed.

Segregation of mutability

One of the most common compromises is to separate the application into immutable and mutable components. The immutable components operate in a purely functional way, with no mutability. We can use some kind of transactional memory to protect th mutable variables from concurrent updates and race conditions.


Transactional memory protects memory similar to a database with a transaction or retry based scheme. This is commonly implemented in software and called software transactional memory (STM).

(def counter (atom 0))
(swap! counter inc)

Note that it's fairly common in schemes for functions which mutate state to end with an !


counter is an atomic variable. atom = indivisible (ignore protons, neutrons, electrons, quarks, etc... they're not real), specifically what we mean here is that an operation on the variable is indivisible - nothing can happen to it halfway through.


swap! takes an atomic and a function and updates the atomic using the function. First it checks the value of the counter and passes it to inc, then when inc finishes it locks the value and checks if it has been modified. If counter has changed it will fail and retry. Otherwise, it will update it with the new value. This is called compare and swap - we swap the old value with the new. Checking the value before applying ensures it was not modified by a another thread while it was unlocked.


The value of counter can only be updated under conditions enforced by swap!. STM enforces discipline on locking - can we call it a paradigm? Unlike locking STM can be composed, after the computation finishes you just need to validate a composition of the dependencies. One of the major drawbacks is that if one of the dependencies has been modified by another thread we need to rerun the transaction.


Note that smaller values like 32bit integers often have built in CAS instructions in the hardware, which do not require locking, and in fact are used to implement mutexes (e.g. use CAS to store the thread number into an integer which specifies who has access to modify).


While atomics are adequate for simple applications, there can still be problems with concurrent updates to multiple atomics as well as deadlocks. In these cases more elaborate facilities can be used.


Architects would be wise to push as much processing as possible into immutable components.

Event sourcing

With growing computing power and memory we need mutable state less. Instead of a banking application updating balances for each of it's customers it could store a list of transactions and just recompute the current balance. We now need more storage, since we need to store every transaction and not just the result, and more computing power for all the recomputation. This might be a bad idea for a banking application - but maybe we don't need the application to run on growing amounts of data forever.


I want to point out that while, in general, recomputation requires more computation power, caching results leads to memory access and can bog down cache lines. Recomputing values can actually be faster in some situations where the computation is 100x faster than the cost of the memory access.


Event sourcing: store the transactions and not the state - when the state is required then reapply all the transactions. Can take checkpoints of the state periodically and only reapply transactions since the last checkpoint - see similarity with Jibility's event streams.


Without deleting or updating your application becomes CR instead of CRUD. Since there is no updating or deletion there are no concurrent update issues. With enough storage and processing power the application can remove all mutation and become entirely functional.


This is roughly how source control works.

Conclusion

  • Structured programming is discipline on direct transfer of control
  • Object orientation is discipline on indirect transfer of control
  • Functional programming is discipline on variable assignment

Each paradigm takes something away from us - what we've learnt over the last half-century is what not to do.