Friday, 11 December 2015

Problem and solution: "Content not allowed in prolog" when loading XML as string, even without extra characters

I was parsing XML files in a unit test, but was doing it incorrectly.

I was loading them from the classpath, then wanted to read each one into a string and compare them using XMLUnit.


WRONG WAY

InputStream devStream = getClass().getClassLoader().getResourceAsStream("api-response--dev.xml");

String devXml = devStream.toString();


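I was getting "Content not allowed in prolog" when I tried to parse this as XML, even though there was no BOM or extra characters in the source XML file.

Turns out I was doing it incorrectly. Calling toString() on an InputStream does not read the stream at all. InputStream does not override Object.toString(), so you get the class name and a hash code (something like java.io.BufferedInputStream@1b2c3d) instead of the file contents. The parser finds that text where it expects the XML declaration or the root element, hence the error.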
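A quick way to see what toString() really returns on a stream is the small check below. It is only a sketch: the class name StreamToStringDemo and the helper describe are made up for illustration, and a ByteArrayInputStream stands in for the classpath resource.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StreamToStringDemo {
    // Hypothetical helper: wraps the XML text in a stream and returns
    // whatever toString() yields for that stream (not the XML itself).
    static String describe(String xml) {
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        return in.toString();
    }

    public static void main(String[] args) {
        String result = describe("<?xml version=\"1.0\"?><root/>");
        // Prints the class name followed by '@' and a hash code.
        System.out.println(result);
        // Prints false: the string does not start with the XML content.
        System.out.println(result.startsWith("<"));
    }
}
```

Handing a string like that to an XML parser means the first character it sees is not "<", which is what the "prolog" complaint is about.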


RIGHT WAY


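// CharStreams comes from Guava (com.google.common.io); StandardCharsets from java.nio.charset.
// Passing the charset explicitly avoids depending on the platform default encoding.
InputStream devStream = getClass().getClassLoader().getResourceAsStream("api-response--dev.xml");

String devXml = CharStreams.toString(new InputStreamReader(devStream, StandardCharsets.UTF_8));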
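If Guava is not on the classpath, the same thing can be done with the JDK alone. The following is a minimal sketch, not the code from my test: the class name ReadXmlAsString and the method readUtf8 are hypothetical, and it assumes the file is UTF-8 encoded.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class ReadXmlAsString {
    // Hypothetical helper: reads the whole stream as UTF-8 text and closes it.
    static String readUtf8(InputStream in) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            // read() returns -1 once the end of the stream is reached.
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for getResourceAsStream("api-response--dev.xml").
        InputStream in = new ByteArrayInputStream(
                "<root>ok</root>".getBytes(StandardCharsets.UTF_8));
        System.out.println(readUtf8(in));
    }
}
```

In the test, the stream returned by getResourceAsStream would be passed to readUtf8 in place of the in-memory stream used here.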