Mozilla bug: The history database is endian-dependent.

Introduction

This page demonstrates the endianness problems in Mozilla's history database.

What's Wrong

The bug is that Mozilla records and displays website titles in an endian-dependent fashion. A user who only ever uses one platform will never notice. If the user switches between big-endian and little-endian machines, their history file will end up with a mix of big-endian and little-endian titles, and will not display properly on either platform.

The bug is particularly noticeable when typing URLs into the location bar, because the autocompletion window that pops up will be full of seemingly random Chinese characters where web site titles should be. For example, compare these two screenshots, one taken on a big-endian machine, the other on a little-endian machine:
screenshot of autocompletion pop-up menu
screenshot of pop-up autocompletion menu
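
The Chinese characters are not really random. The sketch below assumes the titles are 16-bit Unicode (UTF-16) text, and shows what happens when such a title is written in one byte order and read back in the other. The helper name misread_title is made up for this illustration and is not part of Mozilla.

```python
def misread_title(title: str,
                  written_as: str = "utf-16-le",
                  read_as: str = "utf-16-be") -> str:
    """Encode a title in one byte order, then decode it in the other.

    Hypothetical helper: it only imitates a reader that assumes the
    wrong byte order for 16-bit text.
    """
    return title.encode(written_as).decode(read_as)


garbled = misread_title("BBC News")

# Each ASCII letter 0x00NN becomes the code point 0xNN00. For the letters
# A-Z and a-z that lands in the range U+4100..U+7A00, which is made up of
# CJK ideographs.
print([f"U+{ord(ch):04X}" for ch in garbled])
```

Swapping the bytes a second time restores the original text, which is why each machine can still read the titles it wrote itself.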

Details

I ran Mozilla on Linux on a little-endian Intel PC, and visited a number of sites. Mozilla stored these sites in my history.
I then quit, and moved over to a big-endian Sun machine running Solaris. On this machine I visited the BBC's World News site. Mozilla stored that site in my history as well.

When I view my history on the big-endian Sun, I see gibberish titles for the sites that the Intel machine visited, and correct titles for the sites that the Sun machine visited:
screenshot, big-endian platform

If I go back to the little-endian PC, the titles stored on the little-endian machine are readable, but the titles stored on the big-endian machine are not.
screenshot, little-endian platform

(http://web.mit.edu/ is my home page, which is the reason that it is readable in each of these screenshots.)

What should have happened

Mozilla should either store titles in an endian-independent fashion, or use the Unicode Byte Order Mark (U+FEFF) to label (and detect) the endianness of each title string.
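
As a rough illustration of the Byte Order Mark option, here is a minimal sketch in Python. It is not Mozilla's code, and the function names encode_title and decode_title are invented for this example. It assumes titles are UTF-16, and that strings without a mark are treated as little-endian.

```python
import codecs


def encode_title(title: str, byteorder: str = "little") -> bytes:
    """Store a title as UTF-16 in the given byte order, prefixed with a BOM."""
    if byteorder == "little":
        return codecs.BOM_UTF16_LE + title.encode("utf-16-le")
    return codecs.BOM_UTF16_BE + title.encode("utf-16-be")


def decode_title(raw: bytes, default: str = "utf-16-le") -> str:
    """Read a stored title, using the BOM (if any) to choose the byte order."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[2:].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[2:].decode("utf-16-be")
    # No BOM: fall back to an assumed default byte order.
    return raw.decode(default)


# A title written on either kind of machine reads back the same everywhere.
print(decode_title(encode_title("BBC News | WORLD", "big")))
print(decode_title(encode_title("BBC News | WORLD", "little")))
```

The endian-independent option is simpler still: always write one fixed byte order, regardless of the machine, so that no label is needed.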

Any other information?

The history.dat file is nearly impenetrable to me. If you're interested, though, here are the first 30 lines of my history.dat file. I can provide the whole file if it would be helpful.

// <!-- <mdb:mork:z v="1.4"/> -->
< <(a=c)> // (f=iso-8859-1)
  (80=ns:history:db:row:scope:history:all)
  (81=ns:history:db:table:kind:history)(82=URL)(83=Referrer)
  (84=LastVisitDate)(85=FirstVisitDate)(86=VisitCount)(87=Name)
  (88=Hostname)>
 
<(B3=http://web.mit.edu/)(6189B=1004255174550000)(81=991086087059000)
  (B4=web.mit.edu)(B5
    =M$00a$00s$00s$00a$00c$00h$00u$00s$00e$00t$00t$00s$00 $00I$00n$00s$00t$00i\
$00t$00u$00t$00e$00 $00o$00f$00 $00T$00e$00c$00h$00n$00o$00l$00o$00g$00y$00)
  (6189C=257)(103=http://web.mit.edu/jmorzins/www/hotlist.html)(60DA1
    =1004237645513000)(EA=991087596540000)(104
    =B$00o$00o$00k$00m$00a$00r$00k$00s$00 $00f$00o$00r$00 $00J$00a$00c$00o$00b\
$00 $00M$00o$00r$00z$00i$00n$00s$00k$00i$00)(5BC57=170)(30E
    =http://news.bbc.co.uk/hi/english/world/default.stm)(5F8FF
    =1004207097547000)(281=991088681881000)(410=news.bbc.co.uk)(411
    =B$00B$00C$00 $00N$00e$00w$00s$00 $00|$00 $00W$00O$00R$00L$00D$00)
  (5F900=229)(433=http://dailynews.yahoo.com/headlines/ts/)(38D85
    =1002256812840000)(412=991088892584000)(58C=dailynews.yahoo.com)
  (30B8F=T$00o$00p$00 $00S$00t$00o$00r$00i$00e$00s$00)(30672=144)(5F6
    =http://www.ucomics.com/stonesoup/viewss.htm)(6111D=1004245946860000)
  (5F7=www.ucomics.com)(5F8
    =W$00e$00l$00c$00o$00m$00e$00 $00t$00o$00 $00u$00C$00o$00m$00i$00c$00s$00 \
$00W$00e$00b$00 $00S$00i$00t$00e$00 $00f$00e$00a$00t$00u$00r$00i$00n$00g$00 $00\
S$00t$00o$00n$00e$00 $00S$00o$00u$00p$00 $00-$00-$00 $00T$00h$00e$00 $00B$00e$00\
s$00t$00 $00C$00o$00m$00i$00c$00 $00S$00i$00t$00e$00 $00I$00n$00 $00T$00h$00e$00\
 $00U$00n$00i$00v$00e$00r$00s$00e$00!$00)(5B8A1=148)(5F9
    =http://www.ucomics.com/calvinandhobbes/viewch.htm)(611F5
    =1004245953890000)(5FA
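
For anyone who wants to experiment with the excerpt above, here is a small Python sketch that turns one of the title values back into text. It rests on two assumptions about the file format: that $XX is an escape for the byte with hex value XX, and that a backslash at the end of a line continues the value on the next line. The function name mork_value_to_bytes is invented for this example, and other escapes the format may use are not handled.

```python
def mork_value_to_bytes(value: str) -> bytes:
    """Convert a history.dat cell value to raw bytes (simplified sketch)."""
    # Assumption: backslash + newline is a line continuation.
    value = value.replace("\\\n", "")
    out = bytearray()
    i = 0
    while i < len(value):
        if value[i] == "$" and i + 3 <= len(value):
            # Assumption: "$" followed by two hex digits encodes one byte.
            out.append(int(value[i + 1:i + 3], 16))
            i += 3
        else:
            # Everything else is taken as a literal single-byte character.
            out.append(ord(value[i]))
            i += 1
    return bytes(out)


# The title cell of the news.bbc.co.uk row, copied from the excerpt.
cell = ("B$00B$00C$00 $00N$00e$00w$00s$00 $00|$00 "
        "$00W$00O$00R$00L$00D$00")
raw = mork_value_to_bytes(cell)

print(raw.decode("utf-16-le"))  # low byte first
print(raw.decode("utf-16-be"))  # same bytes read in the other order
```

Decoded low byte first, the bytes come out as the page title; decoded in the other order, the same bytes come out as ideographs.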



Update:

July 2003: This bug is fixed. Check Bugzilla for details.