
 
Managed Chaos
Naresh Jain's Random Thoughts on Software Development and Adventure Sports
     

ProTest’s System Metaphor

Saturday, March 7th, 2009

On the ProTest project, we are using the Election as our System Metaphor to identify the key objects and their interactions, i.e. to explain the logical design.

ProTest is a library that prioritizes your tests so that you get the fastest feedback by executing first the tests that are most likely to fail. We use different strategies, such as a Dependency strategy, which orders tests based on the dependencies of recently changed classes. If class ‘A’ was changed, then it makes sense to first run all the tests that depend on class ‘A’. We plan to add other strategies, such as a Last failed test strategy, which will put the tests that failed in the last run first, and others based on cyclomatic complexity, test coverage and so on.
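As a rough illustration of the Dependency strategy described above, here is a minimal Python sketch. It is not ProTest's code: the `TEST_DEPENDENCIES` map, the test names and the `prioritize_by_dependency` function are all hypothetical, and it simply assumes we already know which classes each test depends on.

```python
# Hypothetical data: which production classes each test depends on.
TEST_DEPENDENCIES = {
    "OrderTest": {"Order", "Customer"},
    "InvoiceTest": {"Invoice", "Order"},
    "CustomerTest": {"Customer"},
    "ReportTest": {"Report"},
}


def prioritize_by_dependency(test_dependencies, changed_classes):
    """Order tests so that those depending on changed classes run first."""

    def overlap(test_name):
        # Number of changed classes this test depends on.
        return len(test_dependencies[test_name] & changed_classes)

    # sorted() is stable, so tests with equal overlap keep their original order.
    return sorted(test_dependencies, key=overlap, reverse=True)


ordered = prioritize_by_dependency(TEST_DEPENDENCIES, {"Order"})
print(ordered)
```

With `Order` as the only changed class, the two tests that depend on it move to the front and the rest keep their original order.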

The Election Metaphor

All the tests that need to be executed are candidates standing for election (trying to get executed first). Each strategy is a voter, who votes for candidates. (We had to slightly change this metaphor. In our case, a voter can vote for multiple candidates.) Once all the voters have cast their votes, we do a rank aggregation to determine the winners and hence come up with a prioritized list of tests. We plan to further enhance the metaphor by giving each voter a different weight. Basically, some voters are more powerful than others.
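To make the metaphor concrete, here is a small Python sketch of an election with weighted voters. It assumes the simplest possible aggregation, a weighted vote tally, which may well differ from the rank aggregation ProTest actually uses; `run_election`, the two voter functions and the test names are made up for illustration.

```python
from collections import defaultdict


def run_election(candidates, voters):
    """Return candidates ordered by total weighted votes, highest first.

    `voters` is a list of (weight, vote) pairs, where `vote` is a function
    that takes the candidates and returns the ones it votes for.
    """
    tally = defaultdict(float)
    for weight, vote in voters:
        for candidate in vote(candidates):
            tally[candidate] += weight
    # Stable sort: candidates with equal totals keep their original order.
    return sorted(candidates, key=lambda c: tally[c], reverse=True)


# Hypothetical voters standing in for two strategies.
def dependency_voter(candidates):
    return [c for c in candidates if c in {"OrderTest", "InvoiceTest"}]


def last_failed_voter(candidates):
    return [c for c in candidates if c in {"InvoiceTest", "ReportTest"}]


tests = ["CustomerTest", "OrderTest", "InvoiceTest", "ReportTest"]
ranking = run_election(tests, [(1.0, dependency_voter), (2.0, last_failed_voter)])
print(ranking)
```

Here the last-failed voter carries twice the weight of the dependency voter, so a test that only it voted for still outranks a test that only the dependency voter picked.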

Recently I was explaining the project to a team from Bolivia and this metaphor really helped. I wonder if this metaphor would make sense to the Chinese. 😉

Another Project Rescue Report

Monday, February 9th, 2009

Some time back, I spent one week helping a project (a server written in Java) clear its Technical Debt. The code base is tiny because it leverages a lot of an existing server framework to do its job. This server handles extremely high volumes of data and requests and is a very important part of our server infrastructure. Here are some results:

Project Size

Before, Production Code:

  • Package = 1
  • Classes = 4
  • Methods = 15 (average 3.75/class)
  • LOC = 172 (average 11.47/method and 43/class)
  • Average Cyclomatic Complexity/Method = 3.27

Before, Test Code:

  • Package = 0
  • Classes = 0
  • Methods = 0
  • LOC = 0

After, Production Code:

  • Package = 4
  • Classes = 13
  • Methods = 68 (average 5.23/class)
  • LOC = 394 (average 5.79/method and 30.31/class)
  • Average Cyclomatic Complexity/Method = 1.58

After, Test Code:

  • Package = 6
  • Classes = 11
  • Methods = 90
  • LOC = 458

Code Coverage

Before (Old Code Coverage Report):

  • Line Coverage: 0%
  • Block Coverage: 0%

After (New Code Coverage Report):

  • Line Coverage: 96%
  • Block Coverage: 97%

Cyclomatic Complexity

  • Before: Cyclomatic Complexity report before Refactoring
  • After: Cyclomatic Complexity report after Refactoring

Obvious Dead Code

Before, following public methods:

  • class DatabaseLayer: releasePool()

Total: 1 method in 1 class

After, following public methods:

  • class DFService: overloaded constructor

Total: 1 method in 1 class

Note: This method is required by the tests.

Automation

Version Control Usage

Before:

  • Average Commits Per Day = 0
  • Average # of Files Changed Per Commit = 12

After:

  • Average Commits Per Day = 7
  • Average # of Files Changed Per Commit = 4

Coding Convention Violation

  • Before: 96
  • After: 0

Another similar report.

Project Rescue Report

Monday, February 2nd, 2009

Recently I spent 2 Weeks helping a project clear its Technical Debt. Here are some results:

Project Size

Before, Production Code:

  • Package = 7
  • Classes = 23
  • Methods = 104 (average 4.52/class)
  • LOC = 912 (average 8.77/method and 39.65/class)
  • Average Cyclomatic Complexity/Method = 2.04

Before, Test Code:

  • Package = 1
  • Classes = 10
  • Methods = 92
  • LOC = 410

After, Production Code:

  • Package = 4
  • Classes = 20
  • Methods = 89 (average 4.45/class)
  • LOC = 627 (average 7.04/method and 31.35/class)
  • Average Cyclomatic Complexity/Method = 1.79

After, Test Code:

  • Package = 4
  • Classes = 18
  • Methods = 120
  • LOC = 771

Code Coverage

Before (Coverage report before Refactoring):

  • Line Coverage: 46%
  • Block Coverage: 43%

After (Coverage report after refactoring):

  • Line Coverage: 94%
  • Block Coverage: 96%

Cyclomatic Complexity

  • Before: Cyclomatic Complexity report before Refactoring
  • After: Cyclomatic Complexity report after Refactoring

Obvious Dead Code

Before, following public methods:

  • class CryptoUtils: String getSHA1HashOfString(String), String encryptString(String), String decryptString(String)
  • class DbLogger: writeToTable(String, String)
  • class DebugUtils: String convertListToString(java.util.List), String convertStrArrayToString(String)
  • class FileSystem: int getNumLinesInFile(String)

Total: 7 methods in 4 classes

After, following public methods:

  • class BackgroundDBWriter: stop()

Total: 1 method in 1 class

Note: This method is required by the tests.

Automation

Version Control Usage

Before:

  • Average Commits Per Day = 1
  • Average # of Files Changed Per Commit = 2

After:

  • Average Commits Per Day = 4
  • Average # of Files Changed Per Commit = 9

Note: Since we are heavily refactoring, lots of files are touched for each commit. But the frequency of commit is fairly high to ensure we are not taking big leaps.

Coding Convention Violation

  • Before: 976
  • After: 0

Something interesting to watch for is how the production code becomes crisper (fewer packages, classes and LOC) and how the amount of test code becomes greater than the production code.

Another similar report.

What is Simple Design?

Monday, February 2nd, 2009

Simple is a very subjective word. But is Simple Design equally subjective?

Following is what dictionary.com has to say about the word “Simple”:

  • easy to understand, deal with, use, etc.: a simple matter; simple tools.
  • not elaborate or artificial; plain: a simple style.
  • not ornate or luxurious; unadorned: a simple gown.
  • unaffected; unassuming; modest: a simple manner.
  • not complicated: a simple design.
  • not complex or compound; single.
  • occurring or considered alone; mere; bare: the simple truth; a simple fact.
  • free of deceit or guile; sincere; unconditional: a frank, simple answer.
  • common or ordinary: a simple soldier.
  • not grand or sophisticated; unpretentious: a simple way of life.
  • humble or lowly: simple folk.
  • inconsequential or rudimentary.

It turns out that some of these adjectives define the characteristics of a Simple design very well:

  • easy to understand, deal with: communicates its intent.
  • is clear or has clarity
  • not elaborate or artificial; plain: crisp and concise
  • helps you maintain clear focus
  • is unambiguous
  • not ornate or luxurious; unadorned: minimalistic; least possible components (classes and methods).
  • unaffected; unassuming; modest: does not have unanticipated side-effects.
  • not complicated: avoids unnecessary conditional logic.
  • not complex or compound; single: just does one thing and does it well.
  • occurring or considered alone; mere; bare: to the point.
  • free of deceit or guile; sincere; unconditional: abstracts implementation from intent, but does not deceive someone by concealing or misrepresenting the actual concept.
  • common or ordinary: built on standard patterns which are well understood.
  • not grand or sophisticated; unpretentious: fulfills today’s needs without unnecessary bells and whistles (over-engineering).
  • humble or lowly.
  • inconsequential or rudimentary: does not draw your attention to unnecessary details; achieves good abstractions

What is Simple Design?

A design that allows you to keep moving forward with the least amount of resistance. It’s like traveling light: low up-front investment, and not much to slow you down when you want to change. It’s like clay in the hands of an artist. Simple is a direction (dynamic), not a location (static). To achieve this:

  • Do the Simplest thing that could possibly work. In this context the “doing” is very important; just thinking will not help.
  • YAGNI – You Aren’t Gonna Need It. Don’t design for something that isn’t needed today. Think about the future, but test, code and design for today’s needs. Don’t design for future’s complexity that may not happen or change.
  • The use of Design Patterns contributes to the simplicity by using standard constructs and approaches that have been observed over many years of software development.
  • Code Smells have a wealth of knowledge on symptoms of rotting design. Being aware of them is very important for every programmer.
  • Similarly, the Unix Programming Philosophy and good Object-Oriented design principles will guide the code to be simple and maintainable.
  • Simple Design and Test Driven Development (TDD) go hand in hand. Since the code is written to make the test pass, it tends to be more focused and much simpler. Check out: Smells in Test that indicate Design problems.

You know you have achieved a Simple Design when (the official scoop):

  • The System Works: all the tests are passing.
  • Communicates Well: expresses every idea that we need to express.
  • Contains no duplication: says everything Once-And-Only-Once and follows Don’t Repeat Yourself (DRY) principle.
  • Has no superfluous parts: is concise. Has the least possible number of classes and methods without violating the first three guidelines.

I would like to add a 5th guideline here. If any developer on your team cannot draw (explain) the design in a couple of minutes, there is scope for simplification.

In my experience design is a very involved activity. Every now and then, one needs to make trade-off decisions. Some of the guiding principles I use while designing (listed below) do tend to compete and force me to make a balanced trade-off decision. Sometimes I make the wrong decision, but Refactoring gives me another chance to set it right.

  • Lessons learnt from The Art of Unix Programming
    • Modularity: Write simple parts connected by clean interfaces
    • Clarity: Clarity is better than cleverness.
    • Composition: Design programs to be connected to other programs.
    • Separation: Separate policy from mechanism; separate interfaces from engines
    • Simplicity: Design for simplicity; add complexity only where you must
    • Parsimony: Write a big program only when it is clear by demonstration that nothing else will do
    • Transparency: Design for visibility to make inspection and debugging easier
    • Robustness: Robustness is the child of transparency and simplicity
    • Representation: Fold knowledge into data so program logic can be stupid and robust
    • Least Surprise: In interface design, always do the least surprising thing
    • Silence: When a program has nothing surprising to say, it should say nothing
    • Repair: When you must fail, fail noisily and as soon as possible
    • Economy: Programmer time is expensive; conserve it in preference to machine time
    • Generation: Avoid hand-hacking; write programs to write programs when you can
    • Optimization: Prototype before polishing. Get it working before you optimize it
    • Diversity: Distrust all claims for “one true way”
  • Bob Martin’s OO Design Principles: SOLID
    • Single Responsibility Principle (SRP): There should never be more than one reason for a class to change.
    • Open Closed Principle (OCP): A module should be open for extension but closed for modification
    • Liskov Substitution Principle (LSP): Subclasses should be substitutable for their base classes (Design by Contract)
    • Interface Segregation Principle (ISP): Many client-specific interfaces are better than one general-purpose interface (Narrow Interface)
    • Dependency Inversion Principle (DIP): Depend upon abstractions. Do not depend upon concretions. Abstractions live longer than details.
  • OAOO – Once and only once: Mercilessly kill duplication, whether it’s code duplication or conceptual duplication. It all gets in the way sooner or later.
  • DRY – Don’t Repeat yourself: Every piece of knowledge must have a single, unambiguous, authoritative representation within a system. DRY is similar to OAOO, but DRY applies to effort as well, not just code.
  • Tell Don’t Ask: As the caller, you should not be making decisions based on the state of the called object which then results in you changing the state of some other object. The logic you are implementing is probably the called object’s responsibility, not yours. For you to make decisions outside the object violates its encapsulation.
  • The Law of Demeter: Any method of an object should only call methods belonging to:
    • itself
    • any composite objects
    • any parameters that were passed in to the method
    • any objects it created
  • Triangulate: When you are not sure what the correct abstraction should be, instead of pulling out an abstraction upfront, you get the second case to work by duplicating and modifying a small piece of code. Once you have both the solutions working, find the “generic” form and create an abstraction.
  • Influence from Functional Programming:
    • Separate Query from Modifier: always separate methods which have side-effects from those which don’t. If possible make the method signature express that. And if you really want to spice-up things a bit, try having side-effect free methods and classes as much as possible.
    • Prefer immutable objects over objects whose state changes after construction. Better for concurrency and better for freely passing them around.
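The last two points, Separate Query from Modifier and immutable objects, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Money` and `Account` classes are hypothetical, and a frozen dataclass is just one way to get an immutable value object.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Money:
    """Immutable value object: fields cannot be reassigned after construction."""

    amount: int
    currency: str

    def plus(self, other: "Money") -> "Money":
        # Query: returns a new value and leaves both operands untouched.
        if other.currency != self.currency:
            raise ValueError("currency mismatch")
        return Money(self.amount + other.amount, self.currency)


class Account:
    def __init__(self, balance: Money) -> None:
        self._balance = balance

    def balance(self) -> Money:
        # Query: no side effects, safe to call any number of times.
        return self._balance

    def deposit(self, money: Money) -> None:
        # Modifier: changes state and returns nothing, so the signature
        # itself tells the caller that this is not a query.
        self._balance = self._balance.plus(money)


account = Account(Money(100, "INR"))
account.deposit(Money(50, "INR"))
print(account.balance())

try:
    account.balance().amount = 0  # attempt to mutate a frozen value
except FrozenInstanceError:
    print("Money is immutable")
```

Because `Money` cannot change after construction, `balance()` can hand out the internal object freely without worrying that a caller will modify the account behind its back.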

Also don’t forget:

Smells in Test that indicate Design problems

Sunday, February 1st, 2009

At the Simple Design and Testing Conference in 2007, we had an interesting discussion on “what are my tests telling me about my design?”

Following are some of the conclusions from the discussion:

  • Too many test cases per method: may indicate that the method is doing too much. We discussed the fact that complex business logic algorithms, with lots of special cases, often appear to be atomic and indivisible, and thus only testable as a unit. But there is often a way to break them down into smaller pieces. Also, sometimes one needs to ask whether all those special cases are really required now, or whether we are speculating.
  • Poorly factored edge cases: this is the case where there are many variations of input tested, when a few carefully-chosen edge cases would suffice. We discussed how this sometimes emerges when the algorithm under test has too many special cases, and the same result could be arrived at with a more general algorithm.
  • Increasing access privilege of members (methods or instance variables) to protected or public only for testing purposes: sometimes indicates that you are coupling your tests too much with the code. Sometimes it indicates that the private member has enough behavior that it needs to be tested. In that case, maybe you should consider pulling it out as a separate object.
  • Too much setup/teardown: indicates strong coupling in the class under test.
  • Mocks returning mocks: indicates that the method under test has too many collaborators.
  • Poorly-named tests: sometimes means that the naming and/or design of the classes under test isn’t sufficiently thought-out.
  • Lots of duplication in tests: sometimes indicates that the production code should be providing a way to avoid some of that duplication.
  • Extensive inheritance in test fixtures: indicates that your design might rely heavily on inheritance instead of composition.
  • Double dots in the test code: indicates that the code violates the law of Demeter. In some cases it might be better to hide the delegate.
  • Changing one thing breaks many tests: may just indicate bad factoring of tests, but can also indicate excess dependencies in the code.
  • Dynamic stubs (stubs with conditional behavior): indicates lack of control over the collaborator that is being stubbed out. This sometimes indicates that the behavior is not distributed well amongst the classes.
  • Too many dependencies that have to be included in the test context: indicates tight coupling in the design
  • Random test failures when running them in parallel: indicates that the code is not thread safe and has side-effects that are not factored correctly.
  • Tests run slowly: indicates that your unit tests might be hitting external systems like the network, database or filesystem. This usually indicates that the class under test might have multiple responsibilities. One should be able to stub out external dependencies.
  • Temporal coupling – tests break when run in a different order: may just be a test smell; may be coupling in the code under test.
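Two of the smells above, mocks returning mocks and double dots, often show up together. The Python sketch below uses the standard `unittest.mock` module to show how the test setup shrinks once the delegate is hidden. The `Customer` class, its `pay` method and the two `charge` functions are hypothetical names chosen for the example.

```python
from unittest.mock import Mock


class Customer:
    def __init__(self, wallet):
        self._wallet = wallet

    def pay(self, amount):
        # Hiding the delegate: callers tell the customer to pay and never
        # reach through to the wallet themselves.
        self._wallet.withdraw(amount)


def charge_with_double_dots(customer, amount):
    # Smelly: reaches through the customer to its wallet.
    customer.get_wallet().withdraw(amount)


def charge(customer, amount):
    customer.pay(amount)


# Testing the double-dot version needs a mock that returns a mock.
wallet = Mock()
smelly_customer = Mock()
smelly_customer.get_wallet.return_value = wallet
charge_with_double_dots(smelly_customer, 10)
wallet.withdraw.assert_called_once_with(10)

# Testing the version with the delegate hidden needs a single mock.
tidy_customer = Mock()
charge(tidy_customer, 10)
tidy_customer.pay.assert_called_once_with(10)

# The real Customer can be tested on its own with one mock collaborator.
real_wallet = Mock()
Customer(real_wallet).pay(5)
real_wallet.withdraw.assert_called_once_with(5)
```

The extra line of setup in the first test is small here, but it grows with every additional dot in the chain, which is what makes the test a useful early warning about the design.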

Based on this, it’s very apparent that tests do influence your design. If done well, this will surely result in a Simple, Elegant Design.

Programming in the 21st Century

Sunday, February 1st, 2009

Programming is “the action or process of writing computer programs”.

Programming by definition encompasses analysis, design, coding, testing, debugging, profiling and a whole lot of other activities. Beware: Coding is NOT Programming. Depending on which school of thought you belong to, you will define the relationship and boundaries between these various activities.

For Example:

  • In a waterfall world, each activity is a phase and you want a clear sign-off between phases. Also, these phases are sequential by nature with very limited or no feedback. Hence you are expected to have the full design in place before you can code. Else, what do you code?
  • In RUP (so-called Iterative and Incremental model) even though it follows a spiral model with some feedback cycle every 3 months or so, one is expected to have the overall architecture of the project and a documented design (in UML notation) of the subset of use cases planned for the current spiral ready before the construction (coding) phase.
  • In the unconventional model (where we don’t have process & tool servants and team members can do what they think is most appropriate in the given context), we fail to understand these sequential, rigid processes. We have burnt our fingers way too many times trying to retrofit ourselves into these sequential, well-defined process boundaries guarded by process police. So we have given up the hope that we’ll ever be as smart as the rest of the “coding community” and have chosen a different route.

So how do we design systems then?

  • Some of us start with a test (not all, but just one) to understand/clarify what we are trying to build.
  • Others might write some prototype code (read it as throw-away code) to understand what needs to be built.
  • Some teams also start by building a paper prototype (low-fidelity prototype) of what they plan to build and jump straight to the keyboard to validate their thought process (at least once every few hours).
  • Yet some others use plain old index cards to model the system and start writing a test to put their thoughts in an assertive medium.

This is just the tip of the iceberg. There are a million ways people program systems. We seem to use a lot of these practices in conjunction (because they are not mutually exclusive practices and can actually be done in parallel).

People who are successful in this model have recognized that they are dealing with a complex adaptive system (CAS) and not a complicated system, where you can define rigid boundaries and be successful. In a CAS, there are multiple ways to do something and if someone makes a claim that you always have to do X before Y, we can sense the desire of putting rigid constraints which by nature are fragile. This is the same reason why there is no such thing called Best Practices in our dictionary. Instead we keep an eye on emerging patterns. If we want to see a particular pattern impact the system, we introduce attractors. But if we don’t want a pattern to impact our system we disrupt that pattern. (rip-off from Dave Snowden, creator of the Cynefin model and leading personality in Knowledge Management Community)

The open source community in general is yet another classic example which fits into the unconventional category. I’ve never been on an open source project where we had a design phase. People live and breathe evolutionary design. At best you might have a simple wiki defining some guidelines.

Anyway, I’m not saying that upfront design is bad. All I’m saying is, don’t tell me that one always has to design first. In CAS, you tend to “Probe-Sense-Respond” and not “Analyze-Respond”. In software development “Action precedes Clarity”, almost always.

Source Code is the Design

Sunday, February 1st, 2009

Following a DesignFest @ Directi, there has been a lot of discussion on “what is Design?”, “should one design before they code?” and so on.

In this blog I’ll be mostly focusing on object design. Other kinds of design, such as those related to Usability, are not the focus here. However, I believe it is very important to incorporate various other design elements into your object design to achieve Conceptual Integrity.

I strongly recommend you read the following before you continue reading this blog:

So where does this whole notion of design followed by coding come from?

On the C2 Wiki there is a really nice response to this question:

I propose that the real issue is that design is not really a beneficial activity in software development, and to say “The Source Code Is The Design” is trying to use semantics to gloss over the issue.

I feel this is an important distinction if the goal is to remove the “design” stage from the software development process. Rather than being afraid of being accused of “not doing design”, we need to turn the debate around to be “Why should we do design?”

For some tasks (in other industries like Manufacturing), it may be much more cost effective to create a design and evaluate the design before building the actual product. For software, this is not the case. For years, software has struggled to come up with something to use for “design.” We had flow charts, PDL, Data Flow Diagrams, prose descriptions, and now UML. With software, however, it takes as much time to create the design as the actual software; the design is more difficult to validate than the actual software; and the simplifying assumptions made in the design are often the critical issues to evaluate in the software. For these reasons, it simply is not cost effective to design, iteratively correct the design, then write the software (and then iteratively correct the software). It is better to start with the software and iteratively correct it.

I believe it is time to explicitly state the long held secret of software, we do not need to do design; design is ineffective and costly.

I would say that mandating that you need to design first does not seem right. It’s also important to understand that:

UML diagrams and other associated paperwork (if any) are documentation about the design. A documentation can have different views as appropriate to shed some light (if any) on various aspects of the object described. However, the source code has a privileged status: it is not just “documentation about”, it is the object itself.

If documentation is important why not use source code as that document? Of course this means programmers will now have to write self-documenting code which embraces Simple Design.

Conceptual Integrity

Sunday, February 1st, 2009

Conceptual Integrity is the principle that anywhere you look in your system, you can tell that the design is part of the same overall design. This includes low-level issues such as formatting and identifier naming, but also issues such as how modules and classes are designed, etc.

Imagine a Library where:

  • A few classes threw an exception
  • A few other classes returned an integer error code
  • And the rest had void methods, after which you had to call HasErrors() and GetErrors()

This lack of consistency can lead to poor communication and complex client code. Having a consistent style of design is the very essence of Conceptual Integrity.
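To see what the imagined library does to client code, here is a small Python sketch. Everything in it is invented for illustration (`ParseError`, the three parsers and the two `load_*` functions); it only contrasts three mixed error conventions with a single consistent one.

```python
class ParseError(Exception):
    """Single, library-wide way of reporting a failure (hypothetical)."""


# Inconsistent library: three parts, three error conventions.
def parse_header(text):
    if not text:
        raise ParseError("empty header")  # style 1: exception
    return text.strip()


def parse_body(text):
    return -1 if not text else 0  # style 2: integer error code


class FooterParser:
    def __init__(self):
        self._errors = []

    def parse(self, text):  # style 3: void method plus error accessors
        if not text:
            self._errors.append("empty footer")

    def has_errors(self):
        return bool(self._errors)

    def get_errors(self):
        return list(self._errors)


def load_inconsistent(header, body, footer):
    """Client code must remember a different check for every call."""
    errors = []
    try:
        parse_header(header)
    except ParseError as exc:
        errors.append(str(exc))
    if parse_body(body) != 0:
        errors.append("empty body")
    parser = FooterParser()
    parser.parse(footer)
    if parser.has_errors():
        errors.extend(parser.get_errors())
    return errors


# Consistent alternative: every part reports failure the same way.
def require(text, part):
    if not text:
        raise ParseError(f"empty {part}")
    return text.strip()


def load_consistent(header, body, footer):
    errors = []
    for part, text in (("header", header), ("body", body), ("footer", footer)):
        try:
            require(text, part)
        except ParseError as exc:
            errors.append(str(exc))
    return errors


print(load_inconsistent("", "", ""))
print(load_consistent("", "", ""))
```

Both functions collect the same errors, but the consistent version needs only one kind of check, so a new part can be added without the client learning a new convention.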

While developers focus a lot on High Cohesion and Low Coupling, they seem to underestimate the importance of Conceptual Integrity. Sometimes, despite high cohesion and low coupling, the system might not have conceptual integrity. This is because the overall style, themes and mood do not tie it all together.

For example, according to the Pragmatic Programmer, in computer languages, Smalltalk has conceptual integrity, so does Ruby, so does C. C++ doesn’t: it tries to be too many things at once, so you get an awkward marriage of concepts that don’t really fit together well.

Coding is NOT Programming

Sunday, February 1st, 2009

I see people using the terms Coding and Programming interchangeably.

Coding is the act of converting something in one form to another.

Think of it as coding and decoding. Typically you take a design and convert (code) it into source code, which is a very waterfallish thought process. (I’m not saying it’s good or bad.)

Whereas,

Programming is “the action or process of writing computer programs”.

Programming by definition encompasses analysis, design, coding, testing, debugging, profiling and a whole lot of other activities. Depending on which school of thought you belong to, you will define the relationship and boundaries between these various activities.

For this very reason Pair Coding is not economical. A lot of times teams try to introduce Pair Programming in a waterfall culture (analysis followed by design followed by coding) and it does not work.

Software Evolves: An Analogy

Friday, January 30th, 2009

Joel pointed out this blog which has a JavaScript app that builds an image by starting with a clean slate. Gradually it tries to add different types and sizes of polygons of different colors. Joel pointed out that the way this app builds the image reminds him of how we build software. You start with a vague idea of what you want to build and gradually as you start developing the product clarity emerges. In the process you might try various different things letting the software product evolve.

    Licensed under
Creative Commons License