Discussion:
ResultSetFormatter outputs illegal XML
Rick Liao
2017-03-01 03:46:38 UTC
Hello,

I am using ResultSetFormatter to write the contents of a ResultSet to
an XML file. The data in my ResultSet contains a character that is illegal
in XML, \u0010. The formatter still wrote the character to the XML file,
resulting in an incorrect XML file.

Can I configure the formatter to ignore the illegal character? Or do I have
to clean the data before giving it to the formatter?

Code:
ResultSetFormatter.outputAsXML(outputStream, sparqlResultSet);

Thanks for your time!
Rick
Andy Seaborne
2017-03-01 09:07:29 UTC
Rick,

The data needs to be cleaned or treated as XML 1.1.

XML 1.0 does not have a way to encode or escape such a character. There
is no way to write the results legally in XML 1.0.

XML 1.1 does allow it, but only written as a character reference (&#x10;).

See XML rule [2] (Char), which defines the only characters allowed in an
XML document. Rule [66] (CharRef) requires &# character references to
resolve to characters matching rule [2].
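
If cleaning the data is the route taken, rule [2] can be checked directly in code. The following is only a minimal sketch, not part of Jena: the class name XmlCleaner and both method names are made up for illustration, and it assumes that silently dropping the offending characters is acceptable for the data in question.

```java
// Hypothetical helper, not a Jena API: filters a string down to the
// characters permitted by the XML 1.0 Char production (rule [2]).
final class XmlCleaner {

    // True if the code point matches rule [2] of XML 1.0:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every code point outside rule [2] dropped.
    // Unpaired surrogates fall outside the allowed ranges and are dropped too.
    static String stripInvalidXml10(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(XmlCleaner::isXml10Char)
             .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        String dirty = "a\u0010b";
        System.out.println(stripInvalidXml10(dirty)); // prints "ab"
    }
}
```

A filter like this would have to be applied to the literal values before the results reach the XML writer; how that is wired into a particular application is left open here. Replacing the character with a visible placeholder instead of dropping it may be preferable when the loss needs to be noticeable.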

Other formats like SPARQL Results in JSON allow the character.

Andy