Bilbao Arechabala, Sonia
2015-01-30 13:14:55 UTC
Hi all,
I found a bug in class com.itextpdf.text.pdf.parser.TextRenderInfo of version 5.5.4.
When I called PdfTextExtractor.getTextFromPage(reader, pageNumber), some of the pages threw an IndexOutOfBoundsException.
The cause is that getCharCode can receive an empty string: the byte array then has length 0, so b[b.length - 1] reads index -1. I had to add a check in method getCharCode of class TextRenderInfo so that the string is never empty.
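private int getCharCode(String string) {
    try {
        // Added check: an empty string would give a zero-length byte array below
        if (string.isEmpty()) {
            string = " "; // Bug solved
        }
        byte[] b = string.getBytes("UTF-16BE");
        int value = 0;
        // Fold all bytes but the last into the value, shifting left by 8 bits each time
        for (int i = 0; i < b.length - 1; i++) {
            value += b[i] & 0xff;
            value <<= 8;
        }
        // Add the last byte without shifting
        value += b[b.length - 1] & 0xff;
        return value;
    } catch (UnsupportedEncodingException e) {
        // UTF-16BE is a charset every Java platform must support, so this should not happen
    }
    return 0;
}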
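To make the failure easy to reproduce outside of iText, here is a minimal standalone sketch. The class name CharCodeDemo and the method names unpatched and patched are hypothetical and are not part of the library; the bodies only copy the byte-folding logic shown above, with and without the empty-string check.

```java
import java.nio.charset.StandardCharsets;

public class CharCodeDemo {

    // Hypothetical copy of the logic without the check.
    static int unpatched(String string) {
        byte[] b = string.getBytes(StandardCharsets.UTF_16BE);
        int value = 0;
        for (int i = 0; i < b.length - 1; i++) {
            value += b[i] & 0xff;
            value <<= 8;
        }
        // For an empty string b.length is 0, so this reads b[-1] and throws.
        value += b[b.length - 1] & 0xff;
        return value;
    }

    // Hypothetical copy of the logic with the check from this post.
    static int patched(String string) {
        if (string.isEmpty()) {
            string = " "; // same substitution as in the patch above
        }
        return unpatched(string);
    }

    public static void main(String[] args) {
        System.out.println(patched("A")); // 65
        System.out.println(patched(""));  // 32, the code of the substituted space
        try {
            unpatched("");
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unpatched version failed on empty string");
        }
    }
}
```

Note that with this check an empty string is reported as the character code of a space (32). Whether that is the right value for the extraction logic is an open question; returning 0 for an empty byte array would be another option.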
Hope this helps.
Regards,
Sonia