You're making progress!
What you are still missing is the processing of /Filter. When /Filter is /DCTDecode, you can leave the data "unfiltered" and treat it as a JPEG, thanks to a special case in PDF. For all other values of /Filter, however, you need to DECODE the data and then treat the result as the image data. You may then need something such as JAI to convert that data into a usable image format.
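To make the branching concrete, here is a rough sketch using only the JDK. FilterSketch and its methods are hypothetical names, not iText API. It assumes /Filter is a single name (it can also be an array of filters), handles only /DCTDecode and /FlateDecode, and ignores any /DecodeParms predictor, so treat it as an illustration rather than a complete decoder.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper (not part of iText): picks a decoding path from the /Filter name. */
final class FilterSketch {

    /** Returns the stream bytes in the form the next processing step expects. */
    static byte[] decode(String filter, byte[] raw) {
        if (filter == null) {
            return raw;              // no /Filter entry: the bytes are already raw samples
        }
        switch (filter) {
            case "/DCTDecode":
                return raw;          // special case: the bytes are already JPEG data
            case "/FlateDecode":
                return inflate(raw); // zlib/deflate data: inflate to get the raw samples
            default:
                throw new UnsupportedOperationException("Filter not handled in this sketch: " + filter);
        }
    }

    /** Inflates zlib-wrapped data; no predictor handling. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Truncated or unsupported Flate data");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not valid Flate data", e);
        } finally {
            inflater.end();
        }
    }

    /** Only used to build sample input for the demo below. */
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            out.write(buffer, 0, deflater.deflate(buffer));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] samples = {10, 20, 30, 40, 50, 60};   // two made-up RGB pixels
        byte[] decoded = decode("/FlateDecode", deflate(samples));
        System.out.println(decoded.length + " bytes after decoding");
    }
}
```

In iText itself, PdfReader.getStreamBytes (as opposed to getStreamBytesRaw) applies the filters that iText knows how to decode, so it may save you the manual inflate step for streams that are not /DCTDecode.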
Leonard
-----Original Message-----
From: ChristinaD [mailto:***@rediffmail.com]
Sent: Thursday, December 03, 2009 4:49 AM
To: itext-***@lists.sourceforge.net
Subject: Re: [iText-questions] Extract PDF embedded images using iText
Hi Leo,
Below is my PdfStream dictionary:
{/Filter=/DCTDecode, /Type=/XObject, /Length=52803, /BitsPerComponent=8, /Height=375, /ColorSpace=/DeviceRGB, /Subtype=/Image, /Width=500}
byte[] imagedata= PdfReader.getStreamBytesRaw((PRStream) stream);
int width = Integer.parseInt((stream.get(PdfName.WIDTH)).toString());
int height = Integer.parseInt((stream.get(PdfName.HEIGHT)).toString());
int bpc = Integer.parseInt((stream.get(PdfName.BITSPERCOMPONENT)).toString());
int components = 3;
Image img = Image.getInstance(width, height, components, bpc, imagedata);
Problems I am facing:
* img.getDpiX() returns 0 dpi.
* When I add this image to a document, it does not look like the original:
img.scalePercent(20.0f);
document.add(img);
* The code below works fine for the JPEG images embedded in the PDF document:
Image img = Image.getInstance(imagedata);
Post by Leonard Rosenthol
PDF images are NOT in a standard format - they are "arrays of color
values" in a specific colorspace with a certain number of bits per
component and potentially processed with one or more "filters". Details
are described in ISO 32000-1.
As such, you will need to extract the image stream into some "image
processing library" that knows what to do with the various structures and
then can also, possibly, save them out to various image formats. JAI is
probably a good place to look.
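As a small illustration of what "arrays of color values" means in practice, the sketch below turns already-decoded samples into a standard image using only the JDK (java.awt.image and javax.imageio) rather than JAI. RawSamplesSketch and its methods are hypothetical names. It assumes the simplest case, /ColorSpace /DeviceRGB with /BitsPerComponent 8, where each pixel is three bytes in R, G, B order; other colorspaces and bit depths need different unpacking.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

/** Hypothetical helper: wraps decoded 8-bit DeviceRGB samples in a BufferedImage. */
final class RawSamplesSketch {

    /** samples holds width * height pixels, three bytes each, row by row. */
    static BufferedImage fromRgb8(byte[] samples, int width, int height) {
        if (samples.length < width * height * 3) {
            throw new IllegalArgumentException("Not enough sample data for " + width + "x" + height);
        }
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        int i = 0;
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int r = samples[i++] & 0xFF;   // mask to undo Java's signed bytes
                int g = samples[i++] & 0xFF;
                int b = samples[i++] & 0xFF;
                image.setRGB(x, y, (r << 16) | (g << 8) | b);
            }
        }
        return image;
    }

    /** Encodes the image as PNG, a format ordinary viewers understand. */
    static byte[] toPng(BufferedImage image) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            if (!ImageIO.write(image, "png", out)) {
                throw new IOException("No PNG writer available");
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two made-up pixels: one red, one blue.
        byte[] samples = {(byte) 255, 0, 0, 0, 0, (byte) 255};
        byte[] png = toPng(fromRgb8(samples, 2, 1));
        System.out.println("PNG size in bytes: " + png.length);
    }
}
```

The width, height and bits per component come from the image dictionary (/Width, /Height, /BitsPerComponent); the samples must already have had any filters removed.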
-----Original Message-----
Sent: Sunday, November 29, 2009 9:12 AM
Subject: [iText-questions] Extract PDF embedded images using iText
Hi All, I am trying to extract images from a PDF document using the iText library.
I am able to create an Image instance only for JPEG images (*.jpg, *.jpeg, *.jpe):
**** Image imageObject = Image.getInstance(image); ****
It does not work for images in other formats embedded in the PDF document.
public void extractImagesInfo() {
    try {
        PdfReader chartReader = new PdfReader("MyPdf.pdf");
        for (int i = 0; i < chartReader.getXrefSize(); i++) {
            PdfObject pdfobj = chartReader.getPdfObject(i);
            if (pdfobj != null && pdfobj.isStream()) {
                PdfStream stream = (PdfStream) pdfobj;
                PdfObject pdfsubtype = stream.get(PdfName.SUBTYPE);
                //System.out.println("Stream subType: " + pdfsubtype);
                if (pdfsubtype != null && pdfsubtype.toString().equals(PdfName.IMAGE.toString())) {
                    byte[] image = PdfReader.getStreamBytesRaw((PRStream) stream);
                    Image imageObject = Image.getInstance(image);
                    System.out.println("Resolution" + imageObject.getDpiX());
                    System.out.println("Height" + imageObject.getHeight());
                    System.out.println("Width" + imageObject.getWidth());
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}
--
http://old.nabble.com/Extract-PDF-embedded-images-using-iText-tp26562385p26562385.html
Sent from the iText - General mailing list archive at Nabble.com.
------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
trial. Simplify your report design, integration and deployment - and focus on
what you do best, core application coding. Discover what's new with
Crystal Reports now. http://p.sf.net/sfu/bobj-july
_______________________________________________
iText-questions mailing list
https://lists.sourceforge.net/lists/listinfo/itext-questions
Buy the iText book: http://www.1t3xt.com/docs/book.php
http://www.1t3xt.info/examples/
http://1t3xt.info/tutorials/keywords/
--
View this message in context: http://old.nabble.com/Extract-PDF-embedded-images-using-iText-tp26562385p26623379.html
Sent from the iText - General mailing list archive at Nabble.com.
------------------------------------------------------------------------------
Join us December 9, 2009 for the Red Hat Virtual Experience,
a free event focused on virtualization and cloud computing.
Attend in-depth sessions from your desk. Your couch. Anywhere.
http://p.sf.net/sfu/redhat-sfdev2dev