Anybody here know how to use pandoc to output XHTML?


carlosbcg
02-21-2013, 12:54 AM
I am creating an EPUB3 which requires "serialized HTML5" (I think that is XHTML5).

Anyway, I use a program called pandoc to output the HTML (which I want to be XHTML5).

The command line I presently use is:

pandoc --strict some.md -o some.html

In other words I go from an .md file containing markdown to a file containing...well...what seems like regular HTML.

Now if that all makes sense...therein lies my dilemma.

How do I output XHTML and NOT regular ol HTML using pandoc?

I've Googled and Googled yesterday and today and can't find a thing on this.

I know that the differences between HTML and XHTML5 are pretty minor really (for purposes of P tags and such) but still...I prefer to have a program like pandoc spit out properly formed XHTML instead of going through by hand to convert HTML to XHTML.

If pandoc doesn't cut it as far as outputting XHTML, does anybody know of any other wonderful program like pandoc that will?

Anybody?

Carlos

PS. Hmm...I wonder what is with the huge space the forum puts between the word "Code" and the code? Oops...the huge space went away after I edited and added this PS. Hmm...

dgatwood
02-21-2013, 03:08 AM
Try running it through a validator. It will probably "just work". Most software that outputs HTML does so using an XML-compatible form, or very nearly so. The likely exceptions can usually be fixed with a simple regular expression or other substitution, e.g.

sed 's/<hr>/<hr \/>/g' file.html > newfile.html


or, in English, replace <hr> with <hr />. That's just about the only difference you're likely to run into. That and possibly the need to add </link> closing tags if the files include any CSS or </meta> tags if the files include any meta tags.
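
The same substitution can be stretched to cover the other usual suspects in one pass. What follows is only a rough sketch, not anything pandoc ships: the function name close_void_tags is made up for illustration, the list of elements (hr, br, img, meta, link) is an assumption about what your files contain, and it writes <link ... /> in self-closing form instead of adding a separate </link>, which is equivalent as far as XML is concerned. It works line by line with a regular expression, so a tag whose attributes wrap across lines will be missed.

```shell
#!/bin/sh
# Sketch: rewrite unclosed void elements in self-closing form.
# Tags already written as <br/> or <br /> are left untouched.
# This is a regex pass, not a real HTML parser; it only sees tags
# that start and end on the same line.
close_void_tags() {
    # \1 is the element name, \2 is the attribute text (empty if none).
    # The attribute text must not end in "/", which is how tags that
    # are already self-closed get skipped.
    sed -E 's#<(hr|br|img|meta|link)( [^>]*[^/>])?>#<\1\2 />#g'
}

# Typical use, with the file names from the example above:
#   close_void_tags < file.html > newfile.html
printf '%s\n' '<hr><br /><img src="a.png"><meta charset="utf-8">' | close_void_tags
# prints: <hr /><br /><img src="a.png" /><meta charset="utf-8" />
```

Run a validator over the result afterwards anyway; a substitution like this only fixes the cases it knows about.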

Ah. According to Pandoc's documentation, its html output mode is actually xhtml 1.0. Weird. So just add -t html and you should be good.

carlosbcg
02-21-2013, 03:24 AM
dgatwood wrote: "Ah. According to Pandoc's documentation, its html output mode is actually xhtml 1.0. Weird. So just add -t html and you should be good."

I saw that little snippet of conversion goodness a couple of hours after I posted my thread here but thanks for pointing that out.

One can also apparently use -t html5 to output HTML5 (-t selects the output format; -o only names the output file).

Or is it also an XHTML variety? I'll have to check the output code on that I guess.
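
One way to check without reading the markup by eye is to feed the output to an XML parser and see whether it complains. The sketch below assumes pandoc and xmllint (from libxml2) are both installed, and reuses some.md from the command above. The helper name check_xml and the output file names are made up for illustration. The writer is picked with -t, and -s asks for a complete standalone document. Note that this applies to the pandoc of this thread; on pandoc 2.x and later, as far as I know, html is an alias for html5 and the XHTML 1.0 writer is called html4.

```shell
#!/bin/sh
# Sketch: render the same Markdown with both writers, then test each
# result for XML well-formedness. Assumes pandoc and xmllint exist.
check_xml() {
    # xmllint --noout parses the file and prints nothing on success;
    # a non-zero exit status means the file is not well-formed XML.
    if xmllint --noout "$1" 2>/dev/null; then
        echo "$1: well-formed XML"
    else
        echo "$1: NOT well-formed XML"
    fi
}

# Only run the conversion when both tools and the input file are present.
if command -v pandoc >/dev/null 2>&1 \
   && command -v xmllint >/dev/null 2>&1 \
   && [ -f some.md ]; then
    pandoc -s -t html  some.md -o some-html.html
    pandoc -s -t html5 some.md -o some-html5.html
    check_xml some-html.html
    check_xml some-html5.html
fi
```

This only tests well-formedness (every tag closed, attributes quoted and so on), which is the part that matters for the HTML versus XHTML question. It says nothing about whether the document is valid EPUB3 content.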

That is what can make all this so confusing: all these companies and software will sometimes use these terms interchangeably when they actually mean different things. People get sloppy and imprecise with the terminology, and one ends up in a mess of confusion.

Oh well.

Carlos

dgatwood
02-21-2013, 09:57 PM
No idea. HTML5 can be written as XML (that's XHTML5) or in old-style HTML syntax.