Wednesday, July 23, 2008

Murdered by Numbers

"JavaScript has a single number type. Internally, it is represented as 64-bit floating point, the same as Java's double. Unlike most other programming languages, there is no separate integer type, so 1 and 1.0 are the same value. This is significant convenience because problems of overflow in short integers are completely avoided, and all you need to know about a number is that it is a number. A large class of numeric type errors is avoided."

Douglas Crockford, JavaScript: The Good Parts



When I was twenty-something, I liked watching The Young Indiana Jones Chronicles. Besides the regular stuff that your average Indy fan loves, I was particularly fond of Indy's apparently insatiable appetite for traveling and learning different languages. So while I was contemplating what it would take for me to follow in his footsteps, I figured I'd better start with learning foreign languages. By that time I had already got my English certificate and I had a few years of learning French under my belt that had at least provided me with some means to court women (with little success, regrettably). So I started learning Italian, which everyone said was easy to pick up, especially if you had already mastered some other foreign language. That turned out to be true, and I managed to get a degree in Italian after two years of intensive studies. What made that period frustrating (and occasionally funny) however, were the times that I would inadvertently mix all three languages in the same sentence, creating my own version of Esperanto. Due to the similarities among them, I might be trying to speak Italian, but use an English noun while constructing the past tense of a verb in French. Sometimes I would even be oblivious to my mistake until someone else pointed it out to me. It all seemed very natural as I was doing it.

Lately I've been having déjà vu, as I find myself coding in Java, JavaScript and C++, often in the same day. More than once I've tried to initialize a Java object using a JavaScript object literal. Sadly, the compiler was not very accommodating. While writing a GWT-based front-end, I often transmitted POJOs without being aware that a long value was silently converted to a JavaScript Number, which essentially amounts to a Java double. Reading David Flanagan's "Rhino" book had already left me with the impression that JavaScript was rather flawed for not having separate types for integers, bytes, chars and all the other goodies that languages like C/C++ and Java spoil us with. But after getting a copy of Douglas Crockford's recent, highly opinionated "Good Parts" book, his argument resonated with me: "A large class of numeric type errors is avoided." It's Richard Gabriel's "Worse Is Better" principle, or alternatively "Less Is More", in new clothes. Recent events made sure that the message was permanently bolted into my brain.
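
As an aside, the long-to-Number issue is easy to demonstrate: a JavaScript Number is a 64-bit double with a 53-bit significand, so a Java long beyond 2^53 can't survive the round trip intact. The snippet below is mine, not taken from that front-end.

public class LongVsNumber {
    public static void main(String[] args) {
        long id = (1L << 53) + 1;      // 9007199254740993
        double asNumber = (double) id; // roughly what ends up in a JavaScript Number
        System.out.println(id);              // 9007199254740993
        System.out.println((long) asNumber); // 9007199254740992 -- off by one
    }
}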

I've been writing a desktop application in Java that communicates with various Bluetooth and USB devices. The communication protocol consists of terminal-like commands and responses that originate in the Java application and travel through a thin JNI layer down to the C/C++ device driver, and then to the actual device. The protocol documentation describes in... er, broad terms, the sequences of bytes that constitute the various requests and responses. Suffice it to say that my system administrator's experience in sniffing and deciphering network packets proved invaluable.

Sending a command was easy (or so I thought): fill a byte[] array with the right numbers and flush it through the JNI layer. There we get the jbyteArray and put it inside an unsigned char array, which is later sent through the device driver to the actual device. When receiving responses, the sequence is reversed. It all seemed to work fine for quite some time, until suddenly I discovered that one particular command caused the device to misbehave. I couldn't be sure whether the device was faulty or my code was buggy, but since I had zero chance of proving the former, I focused on investigating the latter. A couple of days of debugging later I was still at square one, since as far as I could tell the command reached the device unscathed. Logic says that if a fine command reaches a fine device, then one would be entitled to a fine response. Since I wasn't getting it, I began questioning my assumptions.
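
To make the setup concrete, the Java end of that round trip looked roughly like the sketch below. The class, method and library names are made up for illustration; the real driver API was different, and the C/C++ half lives behind the native declaration.

public class DeviceChannel {
    static {
        System.loadLibrary("devicedriver"); // hypothetical JNI library name
    }

    // Implemented in the JNI layer: copies the incoming jbyteArray into an
    // unsigned char buffer, hands it to the device driver, and wraps the
    // driver's response bytes back into a jbyteArray.
    public native byte[] sendCommand(byte[] command);

    public static void main(String[] args) {
        byte[] command = { 0x13, 0x17, 0x19, 0x26 }; // one of the simple sequences below
        byte[] response = new DeviceChannel().sendCommand(command);
        System.out.println(java.util.Arrays.toString(response));
    }
}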

I resisted the urge to blame the device, since I couldn't prove it conclusively, and started blaming the command. There was definitely something fishy about it, and to be honest, I had a bad feeling all along. The other commands were simple sequences, like:

0x14, 0x18, 0x1a, 0x3e

or

0x13, 0x17, 0x19, 0x26

This particular one however, was icky:

0x11, 0x15, 0x17, 0xf0

If you can't see the ickiness (and I won't blame you), let me help you.

Java's byte type is a signed 8-bit integer. C++'s unsigned char type is an unsigned 8-bit integer (at least in 32-bit Windows). Therefore we can represent values from -128 to 127 in Java and values from 0 to 255 in C++. So, if you have a value between 128 and 255, ickiness ensues. 0xf0 is, you guessed it, between 128 and 255. It is 240 to be precise, if Windows Calculator is to be trusted.
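
A tiny, self-contained check of the Java side of that claim (nothing device-specific about it):

public class ByteRange {
    public static void main(String[] args) {
        // byte b = 0xf0; // does not compile: 240 is outside the byte range
        System.out.println(Byte.MIN_VALUE + " to " + Byte.MAX_VALUE); // -128 to 127
        System.out.println(0xf0); // 240
    }
}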

Now, of course I am not that dumb. I knew that you can't assign 0xf0 to a byte in Java, so I had already made the conversion. You see, what is actually transmitted is a sequence of 8 bits. If you get the sequence right, it will reach its destination no matter what. When you convert 0xf0 to bits you get 11110000. The first bit is the sign bit, which is causing all the trouble. If it were zero instead, we would be dealing with 01110000, or 0x70, or 112 if you're into decimal numbers.

So that's what I had done. I'd constructed the negative version of 112 and used that to fill my command buffer:

0x11, 0x15, 0x17, -112

Looking at it made me feel a bit uneasy, without being able to explain why. I used to think it was the mixed hex and decimal numbers. Yeah, I'm weird like that. However, the zillionth time I reread the definition of Java's byte type, a lightbulb lit up over my head. I actually paid attention to the words in front of me: "The byte data type is an 8-bit signed two's complement integer".

Sure, it's 8 bits wide and yes it's signed, using the most common representation for negative binary numbers, two's complement. What's new here? I know how to negate in two's complem...

Whoa! Wait a minute. What I just described above isn't how you negate a number in two's complement. It's actually how you do it in the sign-and-magnitude variant. In two's complement you invert the bits and add one to the result:

11110000 -> 00001111 -> 00010000 or 16, or 0x10

Yep, 16, not 112. So the proper command sequence becomes

0x11, 0x15, 0x17, -16

and, yes, the device seems quite happy with that. As happy as devices get, that is.
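
A couple of lines of plain Java (written after the fact, not part of the original code) confirm the arithmetic: casting 0xf0 to a byte yields -16, while the -112 I had been sending maps back to 0x90, not 0xf0.

public class TwosComplementCheck {
    public static void main(String[] args) {
        byte right = (byte) 0xf0; // the value the protocol actually wants
        byte wrong = -112;        // the value I had been sending
        System.out.println(right);                             // -16
        System.out.println(Integer.toHexString(right & 0xff)); // f0
        System.out.println(Integer.toHexString(wrong & 0xff)); // 90 -- not f0
    }
}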

So, basically, I wasted many, many hours and suffered innumerable hair-pulling episodes by falling prey to a numeric type error. The kind Doug Crockford alludes to in his book. Now, don't get me wrong, I like mucking with bits as much as the next guy. But had I been living solely in JavaScript-land, with its single number type, I'd feel less tired and probably not hate my work as much. Though, granted, I might be needing a haircut right now.

1 comment:

Anonymous said...

Common in functional languages.

Here's a quote from Joe Armstrong's book:

"Erlang users arbitrary-sized integers for performing integer arithmetic. In Erlang, integer arithmetic is exact, so you don't have to worry about arithmetic overflows or not being able to represent an integer in a certain word size"
