Invisible character when parsing DateTime

Today we faced a crazy situation. One of our applications has a piece of code that parses a date and time value entered into a text box. Being lazy/smart developers, we normally copy and paste the sample value from the help file to test it. The crazy part was that when we pasted the value, .NET could not parse it and threw an exception saying the string was not in a correct format. But if we typed exactly the same value by hand, it worked! Aha, let me paste both values into Notepad and compare them character by character. OK, they look exactly the same.

To cut a long story short, after spending some time on the issue we noticed that there was an invisible character at the beginning of the value copied from the help file. We quickly created a console app to get the character code of that character. And yes, it is a valid Unicode character with code 8203 (hex 200B), called “Zero Width Space (U+200B)”.
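A console app along these lines is enough to expose the hidden character. This is only a minimal sketch, not our original tool: the class name `CharDump`, the helper `Describe`, and the sample value are made up for illustration.

```csharp
using System;

static class CharDump
{
    // Hypothetical helper: renders every UTF-16 code unit of the string
    // as "U+XXXX (decimal)", so invisible characters become visible.
    public static string Describe(string value)
    {
        var parts = new string[value.Length];
        for (int i = 0; i < value.Length; i++)
        {
            int code = value[i];
            parts[i] = string.Format("U+{0:X4} ({0})", code);
        }
        return string.Join(" ", parts);
    }

    static void Main()
    {
        // Assumed sample: a value that starts with a Zero Width Space.
        string pasted = "\u200B2012";
        Console.WriteLine(Describe(pasted));
        // prints: U+200B (8203) U+0032 (50) U+0030 (48) U+0031 (49) U+0032 (50)
    }
}
```

Running the same helper on the typed value and on the pasted value makes the difference obvious, even though both look identical in Notepad.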

Having found the problem, we could easily fix our help files by removing this character from them.

Here is a small piece of code that removes the Zero Width Space character from a string.

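   // Requires: using System; using System.Text.RegularExpressions;
   // 'content' is the string to clean (for example, the help file text).
   string pattern = @"\u200B";   // the regex engine reads \u200B as U+200B
   Regex regex = new Regex(pattern);
   var matches = regex.Matches(content);
   Console.WriteLine("Found {0} Matches", matches.Count);
   content = regex.Replace(content, "");   // strip every Zero Width Space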
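If fixing the help files is not enough, the same cleanup can be applied to the text box value just before parsing. The following is a sketch under assumptions, not the code from our application: the class `DateInput`, the method `TryParseClean`, and the `yyyy-MM-dd HH:mm` format are hypothetical, and it uses `string.Replace` instead of a regex because only one fixed character has to go.

```csharp
using System;
using System.Globalization;

static class DateInput
{
    // Hypothetical format; substitute whatever your text box expects.
    const string Format = "yyyy-MM-dd HH:mm";

    // Removes every Zero Width Space (U+200B), then parses the result.
    public static bool TryParseClean(string input, out DateTime result)
    {
        string cleaned = (input ?? string.Empty).Replace("\u200B", string.Empty);
        return DateTime.TryParseExact(
            cleaned, Format, CultureInfo.InvariantCulture,
            DateTimeStyles.None, out result);
    }

    static void Main()
    {
        // Assumed sample: what a paste from the help file might contain.
        string pasted = "\u200B2012-03-14 14:56";

        DateTime parsed;
        if (TryParseClean(pasted, out parsed))
            Console.WriteLine("Parsed: {0:yyyy-MM-dd HH:mm}", parsed);
        else
            Console.WriteLine("Still not a valid date");
    }
}
```

Note that `Trim()` is not a substitute here: on .NET Framework 4 and later it does not remove U+200B, because that character is not classified as white space.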

Hope this saves you some time in the future.

Written by vahid

Wednesday, March 14, 2012 at 2:56 PM
