Clean Code
Chapter 14
Successive Refinement
- This chapter is a case study of refining a module Args for parsing command-line arguments
- How to use Args:
- Construct an instance of Args with input arguments and a format string (schema)
- Query the instance for the values of the arguments with methods such as getBoolean, getInteger, getString
public static void main(String[] args) { try { Args arg = new Args("l,p#,d*", args); boolean logging = arg.getBoolean('l'); int port = arg.getInt('p'); String directory = arg.getString('d'); executeApplication(logging, port, directory); } catch (ArgsException e) { System.out.printf("Argument error: %s\n", e.errorMessage()); } }
Args Implementation
-
Listing 14-2 to 14-7 is the implementation of the Args class
- Args.java
- ArgumentMarshaler.java (interface)
- Abstract methods: void set(Iterator<String> currentArgument) throws ArgsException
- BooleanArgumentMarshaler.java (concrete)
- StringArgumentMarshaler.java (concrete)
- IntegerArgumentMarshaler.java (concrete)
- DoubleArgumentMarshaler.java (concrete)
- StringArrayArgumentMarshaler.java (concrete)
- ArgsException.java
-
Instance variables
- Map<Character, ArgumentMarshaler> marshalers
- Set<Character> argsFound>
- ListIterator<String> currentArgument
-
Constructor
- Inputs: schema, args
- Initialises marshalers and argsFound
- Invokes parseSchema and parseArgumentStrings
-
Methods
- parseSchema: Splits format string (schema) with comma delimiter, calls parseSchemaElement for each element
- parseSchemaElement: Adds <id, concrete marshaler> to marshalers based on schema
- Schema consists of id (index 0) and argument type (remaining substring)
- validateSchemaElementId: Used in parseSchemaElement to check if id is a letter
- parseArgumentStrings: Iterates over list of arguments, calls parseArgumentCharacters with remaining characters of argument if it starts with "-"
- parseArgumentCharacters: Given the remaining characters of an argument, iterate over each character and call parseArgumentCharacter
- parseArgumentCharacter: Retrieve marshaler from marshalers with character key, add character to argsFound set and invoke set on the marshaler with currentArgument (listIterator)
- If marshaller doesn't exist or there's an error after invoking set, throw error
- has: Checks if argsFound contains a specified argument
- nextArgument: Gets next argument of currentArgument
- getBoolean, getString, getInt, getDouble, getStringArray: Get argument from specific marshalers
-
Takeaways
- Close association of functions (dependant private functions are located closely)
- Small functions, formatted, descriptive names
- Easy to add new ArgumentMarshaler subclass if needed
How Did I Do This?
- To write clean code, you must first write dirty code then clean it
- Similar to how you create drafts and subsequent drafts before achieving the final version
- This is the process of successive refinement
- Don't leave the program is an unrefined state after the program is working
Args: The Rough Draft
-
Listing 14-8 is the rough draft of Args.java
- Many instance variables, low cohesion
- Lots of responsibilities (argument parsing and error handling)
- Some resemblance of organisation (variable names, association of related functions, formatting)
-
Listing 14-9 shows an earlier version
- Less instance variabes as a result of dealing with booleans only
- Relatively simple to understand
- But hard to scale, just adding two more argument types in Listing 14-8 created a larger mess
-
Listing 14-10 shows the version that handles two types, booleans and strings but increased the code an unreasonable amount
- Suggests adding more argument types such as integers would cause even more mess down the line, should consider refactoring
So I Stopped
- Stop implementing to reconsider approach, otherwise the final product is an unmaintainable mess (an unrefined state)
- Begin refactoring, should be a core step after passing each test case
- In the case for Args, each argument type required new code in three locations
- Each argument required a way to parse its schema element to select the HashMap for that type (parse)
- Each argument needed to be parsed in the command-line strings and converted to its proper type (set)
- Each argument needed a get method to retrieve the argument as its true type (get)
- Different argument types require similar methods, create concrete classes which depend on a common interface (ArgumentMarshaler)
- In the case for Args, each argument type required new code in three locations
On Incrementalism
-
Principle which emphasises making small, iterative changes to code rather than attempt large-scale overhauls all at once
- Making large "improvements" can cause regressions, apply TDD instead
-
With the TDD approach:
- Code running all times to see live changes
- Not allowed to make breaking changes to a system (no regressions)
- Create a test suite which specifies the behaviour of the program, where each subsequent test makes very small changes at a time
- Make the SIMPLEST modification possible
-
Building ArgumentMarshaler
private class ArgumentMarshaler { private boolean booleanValue = false; private String stringValue; public void setBoolean(boolean value) { booleanValue = value; } public boolean getBoolean() { return booleanValue; } // Private concrete classes for boolean, string, integers, etc. }
-
Applying incremental changes
- Created "skeleton" of ArgmentMarshaler and concrete classes
- In Args, booleanArgs (hash map) doesn't store a boolean value anymore, it stores an instance of ArgumentMarshaler
- parse, set and get have regressed because data type that booleanArgs stores has changed
private Map<Character, Boolean> booleanArgs = new HashMap<Character, Boolean>();
private Map<Character, ArgumentMarshaler> booleanArgs = new HashMap<Character, ArgumentMarshaler>(); private void parseBooleanSchemaElement(char elementId) { booleanArgs.put(elementId, new BooleanArgumentMarshaler()); } private void setBooleanArg(char argChar, boolean value) { booleanArgs.get(argChar).setBoolean(value); } public boolean getBoolean(char arg) { return falseIfNull(booleanArgs.get(arg).getBoolean()); }
- Remove falseIfNull, need to perform null check on ArgumentMarshaler, not for the boolean anymore
- Tests still fail but no new error
- Move null check
public boolean getBoolean(char arg) { Args.ArgumentMarshaler am = booleanArgs.get(arg); return am != null && am.getBoolean(); }
String Arguments
-
Similar implementation to adding booleans, just adjusting parse, set and get to account for the string data type
// Args private Map<Character, ArgumentMarshaler> stringArgs = new HashMap<Character, ArgumentMarshaler>(); private void parseStringSchemaElement(char elementId) { stringArgs.put(elementId, new StringArgumentMarshaler()); } private void setStringArg(char argChar) throws ArgsException { currentArgument++; try { stringArgs.get(argChar).setString(args[currentArgument]); } catch (ArrayIndexOutOfBoundsException e) { valid = false; errorArgumentId = argChar; errorCode = ErrorCode.MISSING_STRING; throw new ArgsException(); } } public boolean getString(char arg) { Args.ArgumentMarshaler am = stringArgs.get(arg); return am == null ? "" : am.getString(); }
private class ArgumentMarshaler { private boolean booleanValue = false; private String stringValue; public void setBoolean(boolean value) { booleanValue = value; } public boolean getBoolean() { return booleanValue; } public void setString(String s) { stringValue = s; } public String getString() { return stringValue == null ? "" : stringValue; } }
-
Issue to address: All the implementation is in the ArgumentMarshaler base class instead of the derivatives
- All the common behaviour added will be pushed down to subclasses after all behaviour is implemented in ArgumentMarshaler
- Keeps tests running while gradually refactoring
-
The implementation for integers is analogous to string and boolean in terms of functions you need to add
Subclasses
-
Complete one implementation at a time, starting with BooleanArgumentMarshaler
-
ArgumentMarshaler as an abstract class:
- Move setBoolean into BooleanArgumentMarshaler
- Create an abstract set method which is required to be implemented by all concrete classes
- In setBooleanArg, replace call to setBoolean with set which runs the concrete implementation of set in BooleanArgumentMarshaler
- Remove setBoolean in ArgumentMarshaler
private abstract class ArgumentMarshaler { protected boolean booleanValue = false; private String stringValue; private int integerValue; public void setBoolean(boolean value) { booleanValue = value; } public boolean getBoolean() { return booleanValue; } public void setString(String s) { stringValue = s; } public String getString() { return stringValue == null ? "" : stringValue; } public void setInteger(int i) { integerValue = i; } public int getInteger() { return integerValue; } public abstract void set(String s); }
private class BooleanArgumentMarshaler extends ArgumentMarshaler { public void set(String s) { booleanValue = true; } }
// Args private void setBooleanArg(char argChar, boolean value) { booleanArgs.get(argChar).set("true"); }
-
Same methodology with getBoolean
- Create an abstract method which the subclasses implement
- Replace with get for getBoolean in Args
- Delete getBoolean after
-
Apply the same process for strings and integers
-
In Args, there are three instance variables for storing each type of arguments
- They all store instances of ArgumentMarshaler, just merge to create a more generic data structure
- Follow TDD, create a new map alongside the three and implement changes to methods using it one at a time
private Map<Character, ArgumentMarshaler> booleanArgs = new HashMap<Character, ArgumentMarshaler>(); private Map<Character, ArgumentMarshaler> stringArgs = new HashMap<Character, ArgumentMarshaler>(); private Map<Character, ArgumentMarshaler> intArgs = new HashMap<Character, ArgumentMarshaler>(); private Map<Character, ArgumentMarshaler> marshalers = new HashMap<Character, ArgumentMarshaler>(); private void parseBooleanSchemaElement(char elementId) { ArgumentMarshaler m = new BooleanArgumentMarshaler(); booleanArgs.put(elementId, m); marshalers.put(elementId, m); } private void parseIntegerSchemaElement(char elementId) { ArgumentMarshaler m = new IntegerArgumentMarshaler(); intArgs.put(elementId, m); marshalers.put(elementId, m); } private void parseStringSchemaElement(char elementId) { ArgumentMarshaler m = new StringArgumentMarshaler(); stringArgs.put(elementId, m); marshalers.put(elementId, m); }
-
Changes to functions using get
- Lots of duplicated logic and calls to get
private boolean setArgument(char argChar) throws ArgsException { if (isBooleanArg(argChar)) setBooleanArg(argChar); else if (isStringArg(argChar)) setStringArg(argChar); else if (isIntArg(argChar)) setIntArg(argChar); else return false; return true; } private boolean isBooleanArg(char argChar) { ArgumentMarshaler m = marshalers.get(argChar); return m instanceof BooleanArgumentMarshaler; } private boolean isIntArg(char argChar) { ArgumentMarshaler m = marshalers.get(argChar); return m instanceof IntegerArgumentMarshaler; } private boolean isStringArg(char argChar) { ArgumentMarshaler m = marshalers.get(argChar); return m instanceof StringArgumentMarshaler; }
- Can inline the is functions then delete them
private boolean setArgument(char argChar) throws ArgsException { ArgumentMarshaler m = marshalers.get(argChar); if (m instanceof BooleanArgumentMarshaler) setBooleanArg(argChar); else if (m instanceof StringArgumentMarshaler) setStringArg(argChar); else if (m instanceof IntegerArgumentMarshaler) setIntArg(argChar); else return false; return true; }
-
Changes to functions using set
- Signature of setBooleanArg, etc. have changed to take an ArgumentMarshaler instead of character
- Add exception management
private boolean setArgument(char argChar) throws ArgsException { ArgumentMarshaler m = marshalers.get(argChar); try { if (m instanceof BooleanArgumentMarshaler) setBooleanArg(m); else if (m instanceof StringArgumentMarshaler) setStringArg(m); else if (m instanceof IntegerArgumentMarshaler) setIntArg(m); else return false; } catch (ArgsException e) { valid = false; errorArgumentId = argChar; throw e; } return true; } private void setBooleanArg(ArgumentMarshaler m) { try { m.set("true"); // Was: booleanArgs.get(argChar).set("true"); } catch (ArgsException e) { } } // Same with Strings and Integers
-
Change getBoolean, etc. functions to use the generic map with type casting error handling
- With all tests passing and all functions no longer depending on specific types of maps
- The three instance variables and the placeholder code can be deleted
- parseBooleanSchemaElement, etc. become a single function, might as well just inline it
private void parseSchemaElement(String element) throws ParseException { char elementId = element.charAt(0); String elementTail = element.substring(1); validateSchemaElementId(elementId); if (isBooleanSchemaElement(elementTail)) marshalers.put(elementId, new BooleanArgumentMarshaler()); else if (isStringSchemaElement(elementTail)) marshalers.put(elementId, new StringArgumentMarshaler()); else if (isIntegerSchemaElement(elementTail)) { marshalers.put(elementId, new IntegerArgumentMarshaler()); } else { throw new ParseException(String.format("Argument: %c has invalid format: %s.", elementId, elementTail), 0); } }
-
Listing 14-12 shows Args.java after the first refactoring
- Still lots of instance variables
- Removed some duplication (is, parse) but still lots of set functions
- Issues to address: Type-case in setArgument and lots of error processing
-
Removing the control structure in setArgument
- Only want a single call to set, involves moving specific set methods to their subclasses
- Edge case: setIntArg specifically uses two instance variables, args and currentArgs
- This means passing two arguments instead of one when moving setIntArg to the subclass
- The solution is to "wrap" it into a list and pass an iterator down to the set function
- Turn String[] args to List<String> argsList
public class Args { private Iterator<String> currentArgument; // Was: private int currentArgument; private List<String> argsList; public Args(String schema, String[] args) throws ParseException { this.schema = schema; argsList = Arrays.asList(args); valid = parse(); } }
- The removal of instance variable args, new type of currentArgument and addition of argsList will cause small regressions with simple fixes
- Ex. args.length replaced with argsList.size()
- Ex. for (currentArgument = 0; currentArgument < args.length; currentArgument++) replaced with for (currentArgument = argsList.iterator(); currentArgument.hasNext();)
- Start eliminating the if-else chain
- Invert error condition
private boolean setArgument(char argChar) throws ArgsException { ArgumentMarshaler m = marshalers.get(argChar); if (m == null) return false; try { if (m instanceof BooleanArgumentMarshaler) setBooleanArg(m, currentArgument); else if (m instanceof StringArgumentMarshaler) setStringArg(m, currentArgument); else if (m instanceof IntegerArgumentMarshaler) setIntArg(m, currentArgument); } catch (ArgsException e) { valid = false; errorArgumentId = argChar; throw e; } return true; } private void setBooleanArg(ArgumentMarshaler m, Iterator<String> currentArgument) { m.set("true"); } // Same with Strings and Integers
- Note that exception handling is removed from setBooleanArg despite adding it in recently, might seem redundant but is essential for progressing in small steps
- Need a set function in ArgumentMarshaler so it can be called directly without worrying about types
- Get rid of the old void set(String) method and replace with void set(Iterator<String> currentArgument)
- Re-implement the new concrete set functions in the subclasses
private boolean setArgument(char argChar) throws ArgsException { ArgumentMarshaler m = marshalers.get(argChar); if (m == null) return false; try { if (m instanceof BooleanArgumentMarshaler) m.set(currentArgument); else if (m instanceof StringArgumentMarshaler) m.set(currentArgument); else if (m instanceof IntegerArgumentMarshaler) m.set(currentArgument); } catch (ArgsException e) { valid = false; errorArgumentId = argChar; throw e; } return true; }
- Turn ArgumentMarshaler into an interface from an abstract class
- No common code among implementers
- Just simply define a contract defining the required methods
-
Easy to add support for new argument types
- Write test case first
- Add condition for detecting new argument type in parseSchemaElement
- Write the implementation for the new argument type subclass
- Write test cases for errors and implement error processing
Error handling
-
Args has multiple responsibilities
- Parsing arguments
- Error handling
-
Merge all exceptions into a single ArgsException class
- Reduces instance variables in Args, increasing cohesion
- Miscellaneous error support code is isolated to another module
- Throwing only ArgsException centalises the handling of all errors, making them easier to manage
- Overall, the Args class becomes cleaner by moving alot of the error handling code into the ArgsException class
public class ArgsException extends Exception { private char errorArgumentId = '\0'; private String errorParameter = "TILT"; private ErrorCode errorCode = ErrorCode.OK; public ArgsException() {} public ArgsException(String message) { super(message); } public enum ErrorCode { OK, MISSING_STRING, MISSING_INTEGER, INVALID_INTEGER, UNEXPECTED_ARGUMENT, MISSING_DOUBLE, INVALID_DOUBLE } }
Conclusion
- Code that merely works is not good enough
- It should be refactored, applying clean coding principles to improve maintainability
- Bad code can halt production
- Scheduling time to redefine requirements, improve code and enforce strong coding standards amongst team members is worth the time
- Maintain clean code from the very beginning of the project
- Apply TDD, making the smallest changes
- Breaking up dependencies can be a lot more difficult later without performing a refactor step between test cases