I have been working quite hard lately with an interesting XHTML rendering Java library called "Flying Saucer" (I will post more about it in the near future).
This library takes a valid XHTML string and can convert it into a PDF document (and other formats).
What makes this library special is its ability to format the output using CSS 2 and CSS 3 statements. I had no idea how many things CSS 2 and 3 are capable of...
My only worry was the source document, because it must be a perfectly valid, well-formed XML document. Depending on how the string to be parsed is generated, respecting the strict XML rules (every tag closed, elements properly nested) can be difficult.
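To give an idea of how strict those rules are, here is a minimal sketch in Java that only checks whether a string parses as XML. It uses the JDK's own XML parser, not Flying Saucer; the class name WellFormedCheck and the two sample strings are my own assumptions, chosen just for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, not part of Flying Saucer or Railo.
public class WellFormedCheck {

    // Returns true when the markup parses as well-formed XML, false otherwise.
    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Replace the default handler so parse errors are not printed to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (Exception e) {
            // Any parse failure means the string is not usable as XML input.
            return false;
        }
    }

    public static void main(String[] args) {
        // Typical "tag soup": the <li> elements are never closed.
        String bad = "<ul><li>item<li>item</ul>";
        // The same list with every element closed.
        String good = "<ul><li>item</li><li>item</li></ul>";

        System.out.println(isWellFormed(bad));  // prints false
        System.out.println(isWellFormed(good)); // prints true
    }
}
```

A browser would happily display the first list, but an XML parser rejects it, and that is exactly the kind of input I was worried about.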
I then discovered that Railo ships an htmlParse() function that can "repair" your bad HTML and give you back nicely formed XHTML.
Look at this example.
This is a very bad piece of HTML, saved into a variable and passed to htmlParse():
<cfsavecontent variable="html">
Text
<h3>Title
<ul>
<li>item<li>
item
</ul>
<li>More text<img src="" ></li>
</cfsavecontent>
<cfset xml=htmlParse(html)>
Here is the output generated by htmlParse():
<?xml version="1.0" encoding="UTF-8"?>
<html>
<body>Text
<br />
<h3>Title</h3>
<ul>
<li>item</li>
<li>item</li>
</ul>
<ul>
<li>More text <img src=""/></li>
</ul>
</body>
</html>
As you can see, this is a perfectly valid XHTML string.
I am now sure that I can give my XHTML rendering library the XHTML format it needs, without worrying about how bad the source is.
Thanks to Railo for this amazing feature.