A warning to others [dive into mark]

Thursday, November 21, 2002

A warning to others

When Hixie speaks, I damn well listen. To wit:

This site is valid XHTML 1.1 again. I upgraded to XHTML 1.1 once before, then downgraded when I discovered unresolvable conflicts with Bobby. (Specifically, Bobby only recognized the lang attribute as identifying a page’s primary language, not XHTML 1.1’s xml:lang attribute, thus denying me my (in my opinion) rightful AAA rating, thus leading to all sorts of nasty flame letters from less than enlightened individuals who used Bobby’s failure report as a stepping stone to dismissing everything I’ve ever said about accessibility. I kid you not.)

I reported the problem to the Bobby developers ages ago, but (unbeknownst to me) they were bankrupt and in the process of getting bought out, so they never fixed it. Happily, the new owners have fixed this and several other lingering bugs, and I have once again (and hopefully for the last time) upgraded to XHTML 1.1.

This site is now served up with a MIME type of application/xhtml+xml to Mozilla and its ilk (Netscape, Chimera, Phoenix, Galeon, GhostZilla…), and text/html to legacy browsers like Lynx, Links, OmniWeb, iCab, Konqueror, Netscape 4, Opera, every version of Internet Explorer on every platform, and pretty much every other browser in the world other than Mozilla. (Here is a list of browsers and how they cope with various MIME types.) You should not see any difference, unless I screw up and miss an end tag or something, in which case Mozilla will display XML debugging information instead of the intended page. This is a tad hostile, but I did bring it upon myself; if I chose to give Mozilla my XHTML 1.1 markup as text/html, it would be its usual forgiving self.

Originally I had tried to serve up application/xhtml+xml by default, and make exceptions for known legacy browsers. Unfortunately, for reasons that are still not clear to me, the detection routine missed a very small percentage of Internet Explorer users, which meant they were unable to read my site at all (IE helpfully offers to download the page, since it knows nothing of application/xhtml+xml). So it’s back to text/html by default, and application/xhtml+xml for specific browsers that I know can handle it.
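For illustration, here is a minimal Python sketch of the "text/html by default, application/xhtml+xml for browsers known to handle it" approach. The post does not say how the detection actually works, so everything here is an assumption: the function name `choose_content_type`, the `XHTML_CAPABLE_MARKERS` allowlist, and the idea of matching on the `Gecko/` token in the User-Agent string are all hypothetical.

```python
# Hypothetical allowlist of User-Agent substrings for browsers assumed to
# handle application/xhtml+xml. "Gecko/" (with the slash) is meant to match
# Mozilla-family build stamps while skipping strings that merely say
# "like Gecko".
XHTML_CAPABLE_MARKERS = ("Gecko/",)


def choose_content_type(user_agent):
    """Return text/html unless the User-Agent matches the allowlist."""
    ua = user_agent or ""
    if any(marker in ua for marker in XHTML_CAPABLE_MARKERS):
        return "application/xhtml+xml"
    # Safe default: an unknown or missing User-Agent gets text/html,
    # so a detection miss never leaves a reader with a download prompt.
    return "text/html"
```

The point of defaulting to text/html is that a detection failure degrades gracefully: the worst case is a capable browser receiving the forgiving MIME type, rather than an incapable one being unable to read the page at all.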

Here’s the rub: as Ian pointed out in email, it is actually invalid according to the specification to send XHTML 1.1 with a MIME type of text/html. (It is discouraged with XHTML 1.0, but invalid with XHTML 1.1. The SHOULD NOT got upgraded to a MUST NOT.) This also seems a tad hostile, and it puts me in the uncomfortable position of intentionally breaking the specification so I don’t completely lose the 80% of my readers who are still browsing with what are, by this definition, legacy browsers. Combined with Mozilla’s user-hostile behavior when encountering an invalid page delivered as application/xhtml+xml (whatever happened to the founding principle of the Internet: be conservative in what you do, be liberal in what you accept from others and all that), I do not know why any sane individual (other than an alpha male with a weblog) would subject themselves to XHTML 1.1 (or any future version).

I should also point out, just in case anyone is still tempted to upgrade to XHTML 1.1, that my permalinks no longer work in Netscape 4. It has nothing to do with MIME types; it’s just a general XHTML 1.1 problem. (New XHTML version 1.1! Now more hostile than ever!) Netscape 4 only recognizes anchors that use the name attribute, which no longer exists in XHTML 1.1. It exists in XHTML 1.0 but is deprecated in favor of the id attribute, which is why you see many XHTML 1.0 templates use both. In XHTML 1.1, name is gone completely. Again, SHOULD NOT got upgraded to MUST NOT. Someday, I’ll upgrade myself from SHOULD NOT chase after bleeding edge technologies that don’t solve real world problems to MUST NOT chase after bleeding edge technologies that don’t solve real world problems. But not today. Maybe after I turn 30. Until then, my only hope is that I may serve as a warning to others.

I cut my templates down to the minimum amount of markup possible while keeping the same visual layout. Several div elements were deemed unnecessary and, on closer examination and much to my surprise, turned out truly to be unnecessary. Ditto empty paragraphs (<p></p>), and a few other pieces of markup weirdness that were put in at one point or another to work around Bobby bugs, or legacy browser bugs, or cruft that had simply built up over time.

As an example, I used to have <div id="logo"> around my site name. Previous designs actually used this for CSS positioning and such, but this design does not, so out it goes. Whoosh, 23 bytes. Also, I was using empty anchor tags such as <a id="this_post"></a> for permalinks, when in fact this can be put directly on the post’s title element, such as <h3 id="this_post">Post title</h3>. Whoosh, 9 bytes.

I acknowledge that my definition of cruft may be slightly different than yours.

Auxiliary files, such as my RSS feed and my FOAF profile, are sent with MIME types that match the link tags by which I point to them. (I couldn’t bring myself to tell Ian that application/rss+xml isn’t a registered MIME type. There was an attempt to register it last year, but it expired without being approved. So serving up my RSS feed as application/rss+xml probably isn’t actually much better than serving it up as text/xml, text/plain, or application/surrey-with-the-fringe-on-top. Sssh, the entire universe of RSS autodiscovery is based on unapproved MIME types. Don’t tell anyone.)

I doubt that anyone actually cares about MIME types except Ian (and, apparently, me). They are next to useless if your client doesn’t support content negotiation, which (as far as I know) none of the current crop of desktop news aggregators do. Which is just as well, since I know of only one site which supports it for its RSS feed, and even it has a fallback address that always returns the RSS feed regardless of whether the client claims to understand it or not.
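For the curious, server-side content negotiation amounts to reading the client's Accept header and picking the best match among the types the server can offer. The following is a simplified Python sketch, not a full implementation of the HTTP rules: the function names `parse_accept` and `negotiate` are made up for this example, and media-type parameters other than `q` are ignored.

```python
def parse_accept(header):
    """Map each media range in an Accept header to its q-value (default 1.0)."""
    prefs = {}
    for part in (header or "").split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_range = pieces[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_range] = q
    return prefs


def negotiate(accept_header, offered, default):
    """Pick the offered type the client prefers most, or default if none match."""
    prefs = parse_accept(accept_header)
    best, best_q = default, 0.0
    for media_type in offered:
        wildcard = media_type.split("/")[0] + "/*"
        # Exact match first, then type/*, then */*.
        q = prefs.get(media_type, prefs.get(wildcard, prefs.get("*/*", 0.0)))
        if q > best_q:
            best, best_q = media_type, q
    return best
```

A client that sends no Accept header at all falls through to `default`, which plays the same role as the fallback address mentioned above: the feed still gets served whether or not the client claims to understand it.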

I am reminded at this juncture of a poignant quote from the movie Clueless, which I am convinced will eventually be recognized as the greatest masterpiece of 1990s mainstream cinema. Josh: Do you know what you’re talking about? Cher: No. Why, do I sound like I do?

I was hoping to also announce that I was now gzip-compressing all my pages, but I haven’t heard back from my system administrator yet (and I can’t fake it with PHP headers, since I’m not using PHP). What, what’s that you say? Why, it’s mod_gzip, of course, the answer to all your bandwidth problems. Marked-up text (such as HTML, or RSS, or even RDF) generally compresses down to 1/3 of its original size. Web browsers have supported this transparently for years. Hixie’s Natural Log sends pages gzip-compressed to any client that claims (via the Accept-Encoding request header, which CGI scripts see as HTTP_ACCEPT_ENCODING) to be able to handle it. It sure would be nice if more sites used it (check whether yours does), and even nicer if desktop news aggregators supported it. It’s not hard; here’s a 7-line Python function that correctly reads both normal and gzip-compressed web pages. That would include RSS feeds.
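To show roughly what client-side gzip support involves, here is a small sketch using only the Python standard library. It is not the function linked above, just an illustration of the same idea; the names `decode_body` and `fetch` are invented for this example.

```python
import gzip
import urllib.request


def decode_body(body, content_encoding):
    """Return the page bytes, un-gzipping them if the server compressed them."""
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(body)
    return body


def fetch(url):
    """Fetch a URL, advertising gzip support and handling either kind of response."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        encoding = response.headers.get("Content-Encoding", "")
        return decode_body(response.read(), encoding)
```

The client announces that it can handle gzip, then checks the Content-Encoding response header to see whether the server took it up on the offer. Servers that ignore the request header simply send the page uncompressed, and `decode_body` passes it through untouched.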

This stuff is out there. It’s all out there, just waiting to be learned and re-learned by every new generation of developers who, like teenagers, believe no one has ever had their problems before.

Those that tremble as if they were mad

