Back in 1996 I remember reading Russian web pages written in “translit” – i.e. transliterated into the Roman alphabet. Dealing with Cyrillic was so complicated that people sometimes just didn’t bother, while the more advanced sites offered visitors the option of reading the content either in “translit” or in any of the four common encoding schemes (the official ISO one, which eventually died, the unofficial KOI-8, which is probably the most popular today, a Windows one and an MS-DOS one). At the time, I remember seeing serious discussions of whether Russian would move to the Roman alphabet permanently because of the Internet. A friend of mine actually read a whole book in translit at one point. The situation improved slowly, and I remember switching to IE5 in 1999 specifically because it finally handled Russian properly. Today, most browsers I've come across seem to support viewing most scripts out of the box. (Though I did have to install extra fonts manually to view Mandarin in Firefox on Ubuntu.)

If only the damn web programmers kept up their end of the bargain! Every time I venture onto the Portuguese Internets, I end up having to face crap like this:

Why? Because of course it is too much work to specify in your page what encoding you are using. Now, the interesting thing is that the browsers seem to enable this evil practice by covering up for it. They've gotten clever about guessing the encoding as long as you stick with English and one other language, so most people reading Brazilian websites probably don’t see this problem. However, since I read Russian more often than Portuguese, my browser seems to assume that any page without a specified encoding is in Russian, and all accented Portuguese letters are turned into Cyrillic. When will this end? Will people finally get around to declaring the encoding, or will this only be resolved when the browsers are clever enough to actually identify the language?
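For the curious, here is a rough Python sketch of what I think is going on. It assumes the page was saved as ISO-8859-1 and that the browser falls back to Windows-1251; I don't actually know which pair of encodings is involved on any given page, and the function names `misread` and `is_cyrillic` are made up for this illustration.

```python
def misread(text: str, actual: str = "latin-1", assumed: str = "cp1251") -> str:
    """Encode text the way the page author saved it, then decode it
    the way a browser with the wrong guess would.

    Both encoding names are assumptions made for this illustration.
    """
    return text.encode(actual).decode(assumed)


def is_cyrillic(ch: str) -> bool:
    """True if the character falls in the basic Cyrillic Unicode block."""
    return "\u0400" <= ch <= "\u04ff"


if __name__ == "__main__":
    sample = "ação"  # Portuguese for "action"
    garbled = misread(sample)
    # The plain ASCII letters survive; the accented ones come out Cyrillic.
    print(sample, "->", garbled)
    print([ch for ch in garbled if is_cyrillic(ch)])
```

The bytes never change. Only the label the browser attaches to them does, which is why a single line in the page declaring the charset would make the guessing unnecessary.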

(Speaking of language woes, after spending some time converting freewisdom to Django I discovered that while Django can handle Unicode when using SQLite, it does not do so when used with MySQL, at least in the configuration used on DreamHost. So, all the Portuguese on freewisdom is currently screwed up. I hope to bring it back, but I am not sure if this is even possible without moving to different hosting.)