One of my friends asked me for help with his code. He had a simple error with IndexOutOfBound exception, it was quite simple but I had other suggestion to his code. Hopefully this post will help other “starting” programmers by providing good tips and some ideas relating the code and refactorization.
Here is the code. This method takes string as a parameter and returns another string. It is not obvious but this function is supposed to change Polish characters to other chars in order to clean up provided parameter.
For input “żenił Jak może budować” this function should return “zenil Jak moze budowac”
As you can see we get rid of “ż ć ł”.
There is a very simple error inside this code that gives IndexOutOfBound Exception for some cases. Code works when lengths of strings in array , array2 and parameter string “str” are the same. It is clearly a problem linked with two for loops and their stop conditions.
Fix in the Code
Fix is simple and involves changes to the stop conditions inside loops. Outer loops should be dependent on length from array being changed. Inner loops is iterating through elements inside arrays that are defining char – replacer relationship.
After those fixes function works fine and does the required job.
After fixing the bug let’s talk about quality of the code, conventions and refactorization.
First of all I don’t like the name of the function - “replace_pl” , this one doesn’t fit into standard camel case convention in .Net, also this name gives us little information about purpose of the function. Only clear message is that it replaces something, we don’t know what kind of thing. I changed the name to ReplacePolishChars, a lot better and quite descriptive.
Then I changed the parameter name and name of the object that will be returned to ”inputString” and “outputString” . With names containing input and output, I can easily track variables that are visible to the user of the method.
Internal variables array and array2, that’s also a bad idea. From their names we don’t know what’s their usage and purpose. We need to analyze code in order to get the general idea what are they. I changed their names to “polishChars” and “replacers” now we know (I don’t like the idea of two strings here but I will get back to this later ) .
This code can be refactored further. Instead of “==” operator we could use .Equals or Compare. For loops could be replaced with enumarator and foreach.
I mentioned earlier that I don’t like the idea with two strings containing data about chars being replaced and their replacers. With string you have to remember correct order, little change can cause painful and “hidden” bugs. It is also hard to track which pair of char and its replacer.
It is better to use Dictionary with key equal to char being replaced and value equal to his replacer. With this solution we can delete one for loop and just check for each char in “inputString” if there is a key for it, and if it is just replace current char with value from the Dictionary.
With dictionary we can easily add new chars to be replaced, also it is a lot easier to track which char is replacing another one
Remember about the standards and conventions. Name your methods and variables accordingly retain nice descriptive style without using the comments. Commenting should be only used in special cases. Try to find different approaches and it is always good idea to code review your creations . Fresh pair of eyes is always helpful.