Thursday, June 23, 2011

Demise of the Multiple-Length String of "a"s

Ok, well I've discovered an important thing about .NET String objects.  They suck, for the following two reasons:

-They are immutable, meaning every alteration you make to a string produces a NEW COPY of the string in memory.  So if you append 10 items to a string in a loop, you will have created 10 progressively longer copies of the string, all sitting in memory until the garbage collector cleans them up! (In that specific case, use a StringBuilder object instead; it's much more efficient.)
-They are always Unicode (UTF-16 internally), which is a good thing if you're dealing with web text or multiple languages.  Not so good for testing crypto code or passing certain types of passwords.  Use SecureStrings for passwords, or just deal with raw byte arrays.  For example, if I build a string of single-byte 'a's, in memory that becomes a string of DOUBLE-byte 'a's.  Boom!  Just that fast I've doubled my memory usage for that one string!
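To make both points concrete, here is a minimal sketch (not code from my test rig; the class and method names are made up for illustration).  It builds a run of 'a's with a StringBuilder, then asks .NET how many bytes the same text takes as ASCII versus as UTF-16, which is the encoding strings use in memory.

```csharp
using System;
using System.Text;

static class StringCostDemo
{
    // Hypothetical helper: builds a run of 'a' characters in one growing
    // buffer instead of allocating a new, longer string on every append.
    public static string BuildRun(int count)
    {
        var sb = new StringBuilder(count);
        for (int i = 0; i < count; i++)
        {
            sb.Append('a');
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string run = BuildRun(1000);

        // One byte per 'a' when encoded as ASCII.
        Console.WriteLine(Encoding.ASCII.GetByteCount(run));

        // Two bytes per 'a' as UTF-16 (Encoding.Unicode), matching the
        // in-memory size of the characters in a .NET string.
        Console.WriteLine(Encoding.Unicode.GetByteCount(run));
    }
}
```

The second number comes out at twice the first, which is exactly the doubling described above.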

These two reasons drove my testing rig for Skein into using 4GB of memory (that's GIGABYTES, with a G) every time it hit the larger strings.  That was the reason I created a "low memory" test, which uses fewer and shorter strings to run line-item tests that are still full-bodied without over-burdening memory.

So, what have I learned?  Don't use strings!  At least not for this purpose.
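As a rough sketch of what "don't use strings" can look like for this kind of test, the input can be built directly as a byte array and handed straight to the hash.  The names below are hypothetical, and SHA256 is only a stand-in here because Skein does not ship with the .NET Framework; my actual rig is not shown.

```csharp
using System;
using System.Security.Cryptography;

static class ByteInputDemo
{
    // Hypothetical helper: fills a buffer with ASCII 'a' (0x61),
    // one byte per character, with no string ever created.
    public static byte[] MakeRun(int count)
    {
        var data = new byte[count];
        for (int i = 0; i < data.Length; i++)
        {
            data[i] = 0x61;
        }
        return data;
    }

    public static void Main()
    {
        byte[] input = MakeRun(1000000);

        // SHA256 is a stand-in for the hash under test.
        using (SHA256 sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(input);
            Console.WriteLine(BitConverter.ToString(digest));
        }
    }
}
```

A million 'a's built this way costs about a million bytes, with no doubling and no pile of intermediate copies.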

Also, I learned that if my documentation says I do something, I BETTER BE FUCKING WELL DOING THAT THING!  :(  I had listed my encryption functions as using a specific type of padding, when in reality I was doing a completely different type of padding!  Not completely incompatible, but now that I've fixed it, this is a breaking change.

Unfortunately, this also means it could break my GoogleAuthCLONE: if the old DLL is replaced with the new DLL, any old accounts might get blown away.  Such is development....
