This turns out to be a codeset issue: unicode() returns different (perfectly valid) results for the same words depending on which codeset it is given.
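A minimal sketch of the effect, not the code touched by this change. It assumes Python 3, where raw.decode(codeset) plays the role of unicode(raw, codeset); RAW and decode_with are hypothetical names used only for illustration.

```python
# Illustration only: Python 3 spelling of unicode(raw, codeset).
RAW = b"caf\xc3\xa9"  # the UTF-8 bytes for "café"


def decode_with(raw, codeset):
    """Decode raw bytes using the named codeset."""
    return raw.decode(codeset)


for codeset in ("utf-8", "latin-1", "mac_roman"):
    # Every call succeeds without error, but only the codeset that
    # matches how the bytes were written yields the intended text.
    print(codeset, repr(decode_with(RAW, codeset)))
```

No exception is raised for the wrong codeset, which is why a mismatch shows up as differing strings rather than as a decoding failure.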
There's also a problem with using locale.getpreferredencoding() on OS X: it returns "mac roman" pretty much regardless of the environment's locale settings. That is never correct for recent versions of OS X, so this change also stops calling it on Macs.
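One way the Mac special case could look, as a rough sketch rather than the actual patch. The function name preferred_codeset and the UTF-8 fallback are assumptions; this message does not say which codeset is used on Macs instead.

```python
import locale
import sys


def preferred_codeset(platform=sys.platform):
    """Pick a codeset for decoding, skipping the locale query on Macs."""
    if platform == "darwin":
        # Assumption: fall back to UTF-8 instead of asking the locale
        # module, whose answer is reported above as unreliable here.
        return "utf-8"
    # Elsewhere, trust the locale settings from the environment.
    return locale.getpreferredencoding()


print(preferred_codeset())
```

Taking the platform as a parameter is only a convenience so the Mac branch can be exercised on other systems.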
svn: r17860