General Question


HTML or XHTML?

Asked by MedivhX (129points) February 1st, 2008

I want to start learning one of these two markup languages, but first I’d like to ask you people: what’s the difference between them? I heard XHTML is more strict (and that kind of fits me).


18 Answers

Vincentt's avatar

XHTML is more strict and more modern. Actually, it’s just HTML but to be interpreted as XML instead of SGML, which indeed basically means that it’s more strict (gotta love acronyms, don’t you? :). The only people who’d want to write HTML are people who have never used XHTML and don’t feel like going through the trouble of shaking off their bad habits.

Also, because XHTML is more strict, it can be parsed faster by web browsers.

Bottom line: you’ll want to go with XHTML.

cwilbur's avatar

There’s very little practical difference between XHTML and HTML aside from quirks of syntax. The two principal differences that you may care about are that XHTML is also valid XML, which means that tools that validate XML also work on XHTML, and that MSIE doesn’t handle XHTML well.

Do you care if XHTML is also valid XML? Well, if you’re trying to generate it with automated tools, or parse it with automated tools, and you’re familiar with an XML parser or generator, it’s a decent benefit. If you’re writing it by hand, it doesn’t matter nearly as much; it’s no harder to get HTML to be valid than it is to get XHTML to be valid.

Do you care if MSIE has problems with XHTML? Well, if the most popular browser worldwide interprets your code by treating it as poorly-formed HTML, you aren’t getting much of the benefit of XHTML.

(For information on Microsoft’s support of XHTML, see http://en.wikipedia.org/wiki/XHTML, section “Adoption”: “For example, Internet Explorer by Microsoft (MSIE) has had XML parsing capabilities since version 5.0 in 1999, but even in mid-2007, the current version (IE7) still does not support XHTML documents served as XML; it only renders them correctly when they are served as HTML and are authored in accordance with the HTML compatibility guidelines.”)

@Vincentt: the XHTML spec is no more strict than the HTML spec; it’s just that browsers have been far more lenient than they should have been in expecting HTML conformance. Because using XHTML indicates to the browser that the document being served will be valid, browsers essentially have permission to stop parsing and rendering at the first error in the XHTML, instead of trying to make the best sense they can out of nonsense HTML markup. If you write valid HTML 4.01 Strict, and use a doctype declaration to indicate that, XHTML parsing is no more strict and no faster than HTML parsing.

My advice would be to go with HTML, because thanks to the lack of support from Microsoft XHTML is largely a dead end—there’s almost no point in using XHTML over valid HTML—and the web world is starting to focus on HTML 5. And that said, once you understand HTML, XHTML just has a few quirks of syntax to learn.

glial's avatar

Go with XHTML and CSS. IE7 renders XHTML and CSS much better than any earlier version did, and all other modern browsers render them well too. IE6 will cause a few anxious moments though.

This page is XHTML 1.0 Strict.

Serving XHTML as XML does not seem like a big issue. I personally have never had the need to do so.

Most modern sites and modern developers are using XHTML and CSS.

cwilbur's avatar

@glial: cite, please, showing that MSIE renders XHTML as XHTML at all? And cite, please, if you’re going to make a claim about “most modern sites and modern developers”?

http://blogs.msdn.com/ie/archive/2005/09/15/467901.aspx – MSIE blogger points out that in MSIE7 XHTML is understood only if it’s “backwards compatible,” because correct XHTML support would be difficult to graft onto the Trident parsing engine.

It’s not just the issue of what headers to use; it’s that when you write XHTML, MSIE, even in version 7, interprets it as HTML. So you get zero benefit from the extra effort you put into XHTML.

http://dig.csail.mit.edu/breadcrumbs/node/166 – Tim Berners-Lee, explaining the rationale for the W3C to largely abandon work on XHTML in favor of HTML5.

If you have comparable citations supporting your claims about MSIE and about “most modern sites and modern developers,” I’d love to see them.

felipelavinz's avatar

In response to the question: I agree with cwilbur that there’s little practical difference between XHTML and HTML, but I would say that XHTML is somewhat more strict than HTML, at least in one aspect: the elements should be correctly nested, which implies documents have to be well formed as in XML. There are other differences between HTML 4 and XHTML 1 which you can see on that link.

On the other hand, XHTML as XML is a big issue; it would actually deserve an entire article on the subject (which, by the way, I wrote on my blog [in Spanish, but you could check the links that I used as sources to get an idea]). In two lines, though, there are some things that should be clarified:

· XHTML 1.0 may be served as text/html, but should be served as application/xhtml+xml (source). If you serve it as application/xhtml+xml, your document has to be well formed, otherwise the browser will just display an error message.

· If you serve XHTML as text/html, you wouldn’t get the benefits of XML parsing, such as faster rendering or integration of other XML languages (for instance, MathML), but instead it will be parsed as HTML… and I think that in most browsers that just means “making some sense out of tag soup”.
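To make the two points above concrete, here is a minimal sketch in Python of how a server might pick between the two MIME types. The function name `choose_content_type` is hypothetical (it is not from any library), and the Accept-header handling is deliberately simplified: q-values are ignored, so treat it as an illustration rather than a real negotiation routine.

```python
def choose_content_type(accept_header: str) -> str:
    """Pick the MIME type for an XHTML 1.0 document (simplified sketch)."""
    # Split the Accept header into media types, dropping parameters such as q-values.
    offered = [part.split(";")[0].strip().lower() for part in accept_header.split(",")]
    # Only send the XML MIME type to clients that explicitly list it.
    if "application/xhtml+xml" in offered:
        return "application/xhtml+xml"
    # Everyone else gets text/html and will parse the page as HTML.
    return "text/html"
```

A browser that lists `application/xhtml+xml` in its Accept header would get the XML MIME type; a browser that does not list it would fall back to `text/html`.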

All in all, XHTML just doesn’t make much sense given the very strict XML parser, and it’s practically senseless to serve it as text/html.

HTML 5 is not my personal favorite (I think it lacks the clear focus of XHTML and they’re deviating from the minimization of elements/attributes which IMHO would constitute a better markup language), but at least it’s realistic in the sense that there are millions of web sites that just don’t have good code, and that there should also be some way for browsers to display a document even when the code isn’t perfect. In other words: backwards compatibility.

And, by the way, you can still use CSS with HTML, in fact, there’s absolutely no reason not to.

Of course, even though this is a somewhat lengthy response, it is not even close to being “definitive”, since I haven’t even considered other factors such as Quirks/Standards mode, and many others.

soundedfury's avatar

@cwilbur – XHTML is widely used by developers within the standards community, which could be viewed as being more “modern” than coding without regard to standards (certainly chronologically speaking). A few examples – ESPN, Facebook, MTV, Pownce, Twitter, Fluther, Digg and Newsvine. It will render faster on all non-MS browsers and will degrade gracefully in IE6 & 7.

The fact is that XHTML is the most “modern” of the choices. Yes, HTML 5 will supplant that, but we’re still looking at 16+ months before we have browsers that are accurately parsing HTML 5. So, at this point, it could be argued that XHTML is the most current and modern choice.

@MedivhX – I find that XHTML 1.0 Strict is significantly leaner and does not lend itself to poor coding, which is why I switched to preferring it over HTML several years ago. However, I still use HTML 4 Transitional often in my day job, since I’m coding for legacy applications or often working within a CMS that require more time and energy to convert to XHTML.

cwilbur's avatar

@felipelavinz: in valid HTML, elements need to be correctly nested as well.

@soundedfury: I’m not denying that XHTML is widely used; I’m denying that there’s any advantage to it if you’re not already using XML tools—it’s just as easy to validate HTML as it is to validate XHTML—and I’m asking for citations from people who want to assert something as true about “most” developers.

Further, if you want to claim that valid XHTML renders faster than valid HTML, I want to see citations and benchmarks.

“It’s better because it’s newer” is a common logical fallacy. If the only reason you can come up with for using XHTML is because it’s newer, well, that’s not a very good reason; and coding to standards can mean coding to HTML standards just as easily as coding to XHTML standards. You seem to be assuming that coding in HTML necessarily means tag soup and coding in XHTML necessarily means producing valid code, and a trivial look at the Internet should be enough to demonstrate that that just isn’t so.

soundedfury's avatar

@cwilbur – You challenged the assumption that “most modern sites and modern developers” use XHTML, and I provided you with examples of recently designed sites that use XHTML. It’s true that most websites do not use XHTML, but I think the assumption that most recently-launched/redesigned sites use XHTML is justifiable – and certainly that most well-known web developer/designers are preferring XHTML.

I’m looking to see where I said that one is better than the other. I said that XHTML is faster on some browsers, but that is certainly different from better. It seems that you are the one equating the term “modern” with “better because it’s newer.” Do I have specific benchmarks available? No, but the differences in syntax alone would ensure that XHTML is faster than HTML, even if it is trivial and situational. I don’t think speed of rendering is a reason to switch, either way.

The only reason I choose to use XHTML when available is that it is slimmer markup that forces you to correctly separate content from presentation. I can achieve the same separation in HTML, but HTML does not enforce it. Since I know that others in my office will also be working in the same code at a later date, and since we require that all markup validate before it is pushed live, it helps me to curtail behavior that could pose a problem.

But, again, I have never stated that one is better than the other. I’m giving reasons for why people choose one over the other.

glial's avatar

My point being, not to get any more panties in a wad, that if you are starting off, you should go with XHTML and CSS rather than legacy HTML embedding style in the markup, in order to separate style from content.

I believe soundedfury got it, sorry you didn’t cwilbur.

To me your explanation made it sound like XHTML won’t render in IE7 which, I thought, may have confused the person who asked this question.

cwilbur's avatar

@glial: I understood your point; I merely disagree with it, and I think it’s an incorrect conclusion based on wrong information.

You’re associating tag soup code with HTML, and you seem to believe that separating style from content is a necessary feature of XHTML. Both are wrong.

I concur that if you’re starting out, you should go with separating content and markup, and you should use correctly-formed markup. But these are not necessarily aspects of XHTML; it’s just as easy, or as hard, to write good HTML as to write good XHTML. That said, I believe (for reasons stated above) that XHTML offers no benefits over HTML and is, in fact, a dead end.

Vincentt's avatar

@cwilbur – XHTML is definitely more strict than HTML. For example, in HTML, elements do not necessarily have to be closed, e.g. <img src="bla.png"> would be correct, whereas in XHTML it’d have to be <img src="bla.png" /> or <img src="bla.png"></img>.
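A rough way to see the difference for yourself, sketched with Python’s standard library. The helper names `is_well_formed_xml` and `html_start_tags` are made up for this example. Note that `html.parser` is only a lenient tokenizer, not an HTML validator, so this shows what an XML parser rejects, not what the HTML spec allows.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


def is_well_formed_xml(markup: str) -> bool:
    """True if an XML parser accepts the markup, False on any parse error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML tokenizer encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Return the start tags seen when the markup is read as HTML."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags
```

With these, `is_well_formed_xml('<p><img src="bla.png"></p>')` is false because the img element is never closed, while both closed forms pass; `html_start_tags` reads all three variants without complaint.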

Also, it definitely isn’t true that IE6 cannot display XHTML properly – I always code in XHTML and (with a little bit of work) it displays just fine in IE6. Though it is true that it won’t display if you set the content-type as you should, like felipelavinz said.

And even if XHTML provides no clear advantage in IE6 (and 7?), if it does so for the other ~23% of the market then that’s already a huge advantage. Plus, it doesn’t really do any harm, either.

cwilbur's avatar

@Vincentt: Requiring an explicit close tag for tags that cannot have content, such as img and br tags, is not “more strict”; it’s merely consistency with XML, which requires all tags to be closed. The HTML spec is perfectly clear on which tags require closing and which do not.

And IE6 and IE7 interpret XHTML as HTML with odd characters; there’s no advantage to it.

felipelavinz's avatar

@cwilbur: “in valid HTML, elements need to be correctly nested as well”: are they? I was just looking at the XHTML 1.0 spec, and it says that well-formedness is a new concept introduced by [XML], and that even though incorrectly nested HTML won’t validate, and is even illegal in SGML, it’s widely tolerated in existing browsers.

@soundedfury: as cwilbur said, I would like to see some benchmarks on XHTML parsing. As I said before, unless XHTML is served as application/xhtml+xml (which is hardly ever), it will be parsed as HTML; and HTML should be parsed as SGML, but it actually isn’t; it’s just parsed as tag soup even if it’s valid code.

All in all, I would say this: there’s no real “advantage” in XHTML itself; the only pro I can think of is that it establishes a more rigorous way of thinking and writing code, but not even this is inherent to the language; it is only an attribution made by many, and one that might be traced to the standards advocates.

cwilbur's avatar

@felipelavinz: in valid HTML, elements need to be correctly nested. Many browsers tolerate invalid HTML, sure, but that doesn’t make it valid—just as, when MSIE accepts invalid XHTML because it’s parsing it as HTML tag soup, that doesn’t make that XHTML valid either.

And I concur: the rigor that comes with XHTML is not the result of any change in the language, but the existence of browsers that simply die on the first error they find instead of treating the XHTML as tag soup. It’s just an implementation detail, and not inherent to the language.
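The “die on the first error” behavior can be sketched with Python’s standard XML parser, which stands in here for a browser’s XML parser (an assumption for illustration only; browsers use their own engines). The helper name `first_xml_error` is hypothetical.

```python
import xml.etree.ElementTree as ET


def first_xml_error(markup: str):
    """Return the (line, column) of the first well-formedness error, or None."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        # The parser gives up at the first problem and reports where it was.
        return err.position
    return None
```

Overlapping markup such as `<p><b><i>text</b></i></p>` yields a position, while the correctly nested `<p><b><i>text</i></b></p>` yields `None`. A tag-soup HTML parser would try to render both.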

Vincentt's avatar

@cwilbur – when some tags in SGML can be either left open or be closed, while in XML they always need to be closed, that fits my definition of “more strict”.

Being XML, XHTML can easily be integrated with other technologies, such as easy conversion into a PDF document. It is also easier to read for e.g. screenreaders, which improves accessibility.

Furthermore, as there is no clear disadvantage to using XHTML, while there already exist several advantages with the road paved to more advantages, XHTML is the way to go.

segdeha's avatar

I once gave a presentation that consisted of 2 sock puppets—one a euro-designer named Lars, or something, complete with chunky dark glasses and spikey blonde hair and the other a ponytailed, old-school webmaster—debating the relative merits of XHTML versus HTML. Which obviously qualifies me to comment on this topic.

For me, it boils down to the following:

1. Internet Explorer can’t handle the mime type application/xhtml+xml
2. For XHTML to gain the benefits of XML, it needs to be served as application/xhtml+xml
3. HTML 4.01 Strict is a perfectly legitimate standard with a well-known, published doctype
4. HTML is served as text/html, which is understood by Internet Exploder

I prefer to serve my content with both a correct doctype and a correct mime type. Therefore, my only option is HTML.

MedivhX's avatar

Thanks everyone for answering. :) I think I’ll start with HTML, and then update my knowledge to XHTML.

Vincentt's avatar

@MedivhX – oh, then you’ll want to follow W3Schools’s tutorials (upper left-hand corner). They have an HTML tutorial and an XHTML tutorial basically explaining what you need to know to write XHTML if you already know HTML. In any case, learning the update doesn’t take more than, say, ten minutes, so that’ll work out :)
