Managed Chaos
Naresh Jain's Random Thoughts on Software Development and Adventure Sports
     

Refactoring Teaser 1: Take 1

Tuesday, July 14th, 2009

Last week I posted a small code snippet for refactoring under the heading Refactoring Teaser.

In this post I'll show, step by step, how I would refactor this mud ball.

First and foremost, I cleaned up the tests so that they communicate the intent. Also notice that I've renamed the test class from StringUtilTest, a name that could mean anything and everything, to ContentTest.

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class ContentTest {
    private Content helloWorldJava = new Content("Hello World Java");
    private Content helloWorld = new Content("Hello World!");
 
    @Test
    public void ignoreContentSmallerThan3Words() {
        assertEquals("", helloWorld.toString());
    }
 
    @Test
    public void buildOneTwoAndThreeWordPhrasesFromContent() {
        assertEquals("'Hello', 'World', 'Java', 'Hello World', 'World Java', 'Hello World Java'", helloWorldJava.toPhrases(6));
    }
 
    @Test
    public void numberOfOutputPhrasesAreConfigurable() {
        assertEquals("'Hello'", helloWorldJava.toPhrases(1));
        assertEquals("'Hello', 'World', 'Java', 'Hello World'", helloWorldJava.toPhrases(4));
    }
 
    @Test
    public void returnsAllPhrasesUptoTheNumberSpecified() {
        assertEquals("'Hello', 'World', 'Java', 'Hello World', 'World Java', 'Hello World Java'", helloWorldJava.toPhrases(10));
    }
}
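The first test passes because of how the content is tokenized: the production code splits on the pattern `\\p{Z}|\\p{P}` (Unicode separators or punctuation), so "Hello World!" yields only two words. A minimal sketch of that behaviour follows; the class name `TokenizingSketch` and the method `tokensOf` are hypothetical and not part of the original code.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, for illustration only.
final class TokenizingSketch {
    // Same pattern as in the post: any Unicode separator or punctuation character.
    private static final Pattern ON_WHITESPACES = Pattern.compile("\\p{Z}|\\p{P}");

    static String[] tokensOf(final String content) {
        return ON_WHITESPACES.split(content);
    }

    public static void main(final String[] args) {
        // "!" is a delimiter, and split() drops the trailing empty string.
        System.out.println(Arrays.toString(tokensOf("Hello World!")));      // [Hello, World]
        System.out.println(Arrays.toString(tokensOf("Hello World Java")));  // [Hello, World, Java]
        // Two adjacent delimiters (", ") leave an empty token in between.
        System.out.println(Arrays.toString(tokensOf("Hello, World Java"))); // [Hello, , World, Java]
    }
}
```

The last line is worth keeping in mind: because every single delimiter character splits, content with a comma followed by a space produces an empty token that would also be counted as a word.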

Next, I created a class called Content to replace StringUtil. Content is a first-class domain object. Also notice: no more side-effect-heavy static methods.

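import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Note: join(...) used below is not defined in this listing; presumably it is a statically
// imported helper such as Apache Commons Lang's StringUtils.join (an assumption).
public class Content {
    private static final String BLANK_OUTPUT = "";
    private static final String SPACE = " ";
    private static final String DELIMITER = "', '";
    private static final String SINGLE_QUOTE = "'";
    private static final int MIN_NO_WORDS = 2;
    private static final Pattern ON_WHITESPACES = Pattern.compile("\\p{Z}|\\p{P}");
    private List<String> phrases = new ArrayList<String>();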
 
    public Content(final String content) {
        String[] tokens = ON_WHITESPACES.split(content);
        if (tokens.length > MIN_NO_WORDS) {
            buildAllPhrasesUptoThreeWordsFrom(tokens);
        }
    }
 
    @Override
    public String toString() {
        return toPhrases(Integer.MAX_VALUE);
    }
 
    public String toPhrases(final int userRequestedSize) {
        if (phrases.isEmpty()) {
            return BLANK_OUTPUT;
        }
        List<String> requiredPhrases = phrases.subList(0, numberOfPhrasesRequired(userRequestedSize));
        return withInQuotes(join(requiredPhrases, DELIMITER));
    }
 
    private String withInQuotes(final String phrases) {
        return SINGLE_QUOTE + phrases + SINGLE_QUOTE;
    }
 
    private int numberOfPhrasesRequired(final int userRequestedSize) {
        return userRequestedSize > phrases.size() ? phrases.size() : userRequestedSize;
    }
 
    private void buildAllPhrasesUptoThreeWordsFrom(final String[] words) {
        buildSingleWordPhrases(words);
        buildDoubleWordPhrases(words);
        buildTripleWordPhrases(words);
    }
 
    private void buildSingleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length; ++i) {
            phrases.add(words[i]);
        }
    }
 
    private void buildDoubleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 1; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1]);
        }
    }
 
    private void buildTripleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 2; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1] + SPACE + words[i + 2]);
        }
    }
}
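To make the truncate-and-quote behaviour of toPhrases easier to try out, here is a small stand-alone sketch. It assumes Java 8 or later and uses `String.join` in place of the `join(...)` helper, whose source is not shown in the post; the class name `ToPhrasesSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper mirroring Content.toPhrases, for illustration only.
final class ToPhrasesSketch {
    static String toPhrases(final List<String> phrases, final int userRequestedSize) {
        if (phrases.isEmpty()) {
            return "";
        }
        // Never ask subList for more phrases than exist.
        int size = Math.min(userRequestedSize, phrases.size());
        // Joining on "', '" and wrapping in single quotes quotes every phrase.
        return "'" + String.join("', '", phrases.subList(0, size)) + "'";
    }

    public static void main(final String[] args) {
        List<String> phrases = Arrays.asList("Hello", "World", "Java", "Hello World");
        System.out.println(toPhrases(phrases, 1));  // 'Hello'
        System.out.println(toPhrases(phrases, 10)); // 'Hello', 'World', 'Java', 'Hello World'
    }
}
```

In this sketch a requested size of 0 returns `''` rather than an empty string, and a negative size makes subList throw; the tests in the post do not cover either case.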

This was a big step forward, but not good enough. Next I focused on the following code:

    private void buildAllPhrasesUptoThreeWordsFrom(final String[] words) {
        buildSingleWordPhrases(words);
        buildDoubleWordPhrases(words);
        buildTripleWordPhrases(words);
    }
 
    private void buildSingleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length; ++i) {
            phrases.add(words[i]);
        }
    }
 
    private void buildDoubleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 1; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1]);
        }
    }
 
    private void buildTripleWordPhrases(final String[] words) {
        for (int i = 0; i < words.length - 2; ++i) {
            phrases.add(words[i] + SPACE + words[i + 1] + SPACE + words[i + 2]);
        }
    }

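The above code violates the Open-Closed Principle: supporting four-word phrases would mean adding yet another method. It also smells of duplication. I created a somewhat generic method to kill the duplication (ONE_WORD, TWO_WORDS and THREE_WORDS are int constants, presumably 1, 2 and 3).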

    private void buildAllPhrasesUptoThreeWordsFrom(final String[] fromWords) {
        buildPhrasesOf(ONE_WORD, fromWords);
        buildPhrasesOf(TWO_WORDS, fromWords);
        buildPhrasesOf(THREE_WORDS, fromWords);
    }
 
    private void buildPhrasesOf(final int phraseLength, final String[] tokens) {
        for (int i = 0; i <= tokens.length - phraseLength; ++i) {
            String phrase = phraseAt(i, tokens, phraseLength);
            phrases.add(phrase);
        }
    }
 
    private String phraseAt(final int currentIndex, final String[] tokens, final int phraseLength) {
        StringBuilder phrase = new StringBuilder(tokens[currentIndex]);
        for (int i = 1; i < phraseLength; i++) {
            phrase.append(SPACE + tokens[currentIndex + i]);
        }
        return phrase.toString();
    }
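The single loop bound `i <= tokens.length - phraseLength` covers all three original loops: for lengths 1, 2 and 3 it is equivalent to `i < length`, `i < length - 1` and `i < length - 2`. The following stand-alone sketch shows this in isolation; the class `PhraseBuilderSketch` is hypothetical, and its method returns a list instead of adding to a field so that it can be run on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone version of buildPhrasesOf/phraseAt, for illustration only.
final class PhraseBuilderSketch {
    private static final String SPACE = " ";

    // All phrases made of exactly phraseLength consecutive tokens, in order.
    static List<String> phrasesOf(final int phraseLength, final String[] tokens) {
        List<String> phrases = new ArrayList<String>();
        for (int i = 0; i <= tokens.length - phraseLength; ++i) {
            phrases.add(phraseAt(i, tokens, phraseLength));
        }
        return phrases;
    }

    private static String phraseAt(final int index, final String[] tokens, final int phraseLength) {
        StringBuilder phrase = new StringBuilder(tokens[index]);
        for (int i = 1; i < phraseLength; i++) {
            phrase.append(SPACE).append(tokens[index + i]);
        }
        return phrase.toString();
    }

    public static void main(final String[] args) {
        String[] tokens = {"Hello", "World", "Java"};
        System.out.println(phrasesOf(1, tokens)); // [Hello, World, Java]
        System.out.println(phrasesOf(2, tokens)); // [Hello World, World Java]
        System.out.println(phrasesOf(3, tokens)); // [Hello World Java]
        System.out.println(phrasesOf(4, tokens)); // []
    }
}
```

A phrase length larger than the number of tokens simply yields no phrases, so no extra guard is needed.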

Now I had a feeling that my Content class was doing too much and also suffered from the primitive obsession code smell. It looked like a concept (a class) was dying to be called out, so I created Words as an inner class.

    private class Words {
        private String[] tokens;
        private static final String SPACE = " ";
 
        Words(final String content) {
            tokens = ON_WHITESPACES.split(content);
        }
 
        boolean has(final int minNoWords) {
            return tokens.length > minNoWords;
        }
 
        List<String> phrasesOf(final int length) {
            List<String> phrases = new ArrayList<String>();
            for (int i = 0; i <= tokens.length - length; ++i) {
                String phrase = phraseAt(i, length);
                phrases.add(phrase);
            }
            return phrases;
        }
 
        private String phraseAt(final int index, final int length) {
            StringBuilder phrase = new StringBuilder(tokens[index]);
            for (int i = 1; i < length; i++) {
                phrase.append(SPACE + tokens[index + i]);
            }
            return phrase.toString();
        }
    }

In the constructor of the Content class we instantiate a Words object as follows:

    public Content(final String content) {
        Words words = new Words(content);
        if (words.has(MIN_NO_WORDS)) {
            phrases.addAll(words.phrasesOf(ONE_WORD));
            phrases.addAll(words.phrasesOf(TWO_WORDS));
            phrases.addAll(words.phrasesOf(THREE_WORDS));
        }
    }

Even though this code communicates well, there is duplication and noise that can be removed without compromising on the communication.

     phrases.addAll(words.phrasesOf(ONE_WORD, TWO_WORDS, THREE_WORDS));
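The one-liner above implies that phrasesOf accepts several lengths at once. The post does not show that version of Words, so the following is only a sketch of one way it might look, assuming a varargs parameter. `WordsSketch` is a hypothetical top-level stand-in for the inner class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical variant of the Words class; the varargs signature is an assumption.
final class WordsSketch {
    private static final Pattern ON_WHITESPACES = Pattern.compile("\\p{Z}|\\p{P}");
    private static final String SPACE = " ";
    private final String[] tokens;

    WordsSketch(final String content) {
        tokens = ON_WHITESPACES.split(content);
    }

    // Collects the phrases for each requested length, in the order the lengths are given.
    List<String> phrasesOf(final int... lengths) {
        List<String> phrases = new ArrayList<String>();
        for (int length : lengths) {
            for (int i = 0; i <= tokens.length - length; ++i) {
                phrases.add(phraseAt(i, length));
            }
        }
        return phrases;
    }

    private String phraseAt(final int index, final int length) {
        StringBuilder phrase = new StringBuilder(tokens[index]);
        for (int i = 1; i < length; i++) {
            phrase.append(SPACE).append(tokens[index + i]);
        }
        return phrase.toString();
    }

    public static void main(final String[] args) {
        System.out.println(new WordsSketch("Hello World Java").phrasesOf(1, 2, 3));
        // [Hello, World, Java, Hello World, World Java, Hello World Java]
    }
}
```

With varargs, a single-length call such as `phrasesOf(2)` still works, so existing callers would not need to change.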

There are a few more versions after this, but I think this should give you an idea of the direction I'm heading in.

Refactoring Teaser Part 1

Wednesday, July 8th, 2009

How would you refactor the following code? (This code is in Java, but you can refactor it using any language of your choice.)

The following test explains the functionality of the production code:

public class StringUtilTest {
  @Test
  public void testSplit() {
    assertEquals("'Hello', 'World', 'Java', 'Hello World', 'World Java', 'Hello World Java'", StringUtil.split("Hello World Java", 6));
    assertEquals("'Hello', 'World', 'Java', 'Hello World', 'World Java', 'Hello World Java'", StringUtil.split("Hello World Java", 10));
    assertEquals("'Hello', 'World', 'Java', 'Hello World'", StringUtil.split("Hello World Java", 4));
    assertEquals("'Hello'", StringUtil.split("Hello World Java", 1));
  }
}

The general use case is that, for a given string (content), users might want to split the same string several times to get different numbers of keywords in the output.

public final class StringUtil {
  private static final Pattern REGEX_TO_SPLIT_ALONG_WHITESPACES = Pattern.compile("\\p{Z}|\\p{P}");
 
  public static String split(final String content, final int number) {
    String listOfKeywords = "";
    int count = 0;
    String[] tokens = REGEX_TO_SPLIT_ALONG_WHITESPACES.split(content);
    List<String> strings = Arrays.asList(tokens);
    List<String> allStrings = singleDoubleTripleWords(strings);
    int size = allStrings.size();
    for (String phrase : allStrings) {
      if (count == number) {
        break;
      }
      listOfKeywords += "'" + phrase + "'";
      if (++count < size && count < number) {
        listOfKeywords += ", ";
      }
    }
    return listOfKeywords;
  }
 
  private static List<String> singleDoubleTripleWords(final List<String> strings) {
    List<String> allStrings = new ArrayList<String>();
    int numWords = strings.size();
 
    if (hasEnoughWords(numWords) == false) {
      return allStrings;
    }
 
    // Extracting single words. Total size of words == numWords
 
    // Extracting single-word phrases.
    for (int i = 0; i < numWords; ++i) {
      allStrings.add(strings.get(i));
    }
 
    // Extracting double-word phrases
    for (int i = 0; i < numWords - 1; ++i) {
      allStrings.add(strings.get(i) + " " + strings.get(i + 1));
    }
 
    // Extracting triple-word phrases
    for (int i = 0; i < numWords - 2; ++i) {
      allStrings.add(strings.get(i) + " " + strings.get(i + 1) + " " + strings.get(i + 2));
    }
    return allStrings;
  }
 
  private static boolean hasEnoughWords(final int numWords) {
    if (numWords < 3) {
      return false;
    }
    return true;
  }
}
Licensed under a Creative Commons License