Tuesday, February 21, 2012

Coding for reuse versus coding for specific use

One day, a C# developer came to me asked me to troubleshoot their application, which involves some data transformation code. In the course of troubleshooting this particular application's runtime issue, I saw this particular code implementation:

private string[] SplitTextwithSpecialCharacter(string lineFromGUI, int splitLength) 
{

    List<string> lines = new List<string>();
    int start = 0;


    while (start < lineFromGUI.Length)
    {
        string temp = string.Empty;
      
        //if length is longer it will throw exception so size the last segment appropriately
        string line = lineFromGUI.Substring(start, Math.Min(splitLength, lineFromGUI.Length - start));
        
        //temp = line.Replace(SpecialCharacter, string.Empty);

        temp = line.Replace("^", ""); //New 6/30
        if (temp.Length > 0) 
        {
            //lines.Add(line.Replace(SpecialCharacter, string.Empty));
            lines.Add(temp);
        }
        //lines.Add(line);
        start += splitLength;
    }
    return lines.ToArray();
}

When I saw this code, I inwardly cringed due to the very narrow scope in the code implementation to solve a seemingly generic problem. This code screams for refactoring to me. From the commented out section of the code, I could see that one of the developers tried to implement some reuse before reverting back to the specific use.

I see a couple of issues with this particular implementation. The first issue being that this method is trying to do two very different things; one is trying to split the text into lines and the second is removing all the caret characters from the text. In my opinion, these 2 different operations should be separated out into 2 different methods. The second issue is that the caret character removal function should be generalized such that you should be able to remove any character or perhaps even a set of characters beyond just the caret symbol. In the specific implementation, what would happen if they needed code to remove an asterisk instead of caret? What would happen if they need to remove more than one special character? Do they add additional methods? Do they modify the existing method? If they modify the existing method to take parameters, then they would have to change all the existing client code that calls this method. So it would seem to be easier for future maintenance to go ahead and make these 2 functions reusable.

Here's how I would refactor this particular method into some utilities namespace and use extension methods to add capability to the string object. Here's how that code would look like:

public static class Extensions
{
    public static string FilterOut(this string line, ISet<char> filter)
    {

        if (line == null ||  filter == null || filter.Count == 0) return line;


        return new String(line.Where(c => !filter.Contains(c)).ToArray());
    }


    public static IList<string> SplitIntoLines(this string data, int splitLength) 
    {

        // Check preconditions
        if (data == null) throw new NullReferenceException();

        if (splitLength <= 0) throw new IndexOutOfRangeException("Must be greater than 0!");

        IList<string> lines = new List<string>();

        while (data.Length > splitLength)
        {
            string line = data.Substring(0,splitLength);
            data = data.Substring(splitLength, data.Length-splitLength);
            lines.Add(line);
        }
        lines.Add(data);
        return lines;
    }
}

With this extension method, you can now use this utility method as follows:

int linesize = 30;
ISet<char> filter = new HashSet<char> { '^', '*', '&', '$' };
string[] results = data.SplitIntoLines(linesize)
                       .Select(line => line.FilterOut(filter))
                       .ToArray();

No comments: