May 24, 2012
I just haven't had time to investigate the possibility, but we routinely grab a federal document from a website and only care about including its second page with our own document. Is there an easy, automated way to extract just that page from the PDF and place it into a container or super container? Thanks.
May 25, 2012
Hi Stephen,

Page extraction is very straightforward using iText's PdfCopy class.

    /**
     * iText_ExtractSinglePage ( fm_pathToSrc ; fm_pathToDest ; fm_getPageNum )
     * by clem 2010-11-04
     * Extracts one page from a PDF document.
     *
     * === Parameters ===
     * fm_pathToSrc: path to pdf input file.
     * fm_pathToDest: path to pdf output file.
     * fm_getPageNum: the page number to be extracted.
     **/
    import com.itextpdf.text.Document
    import com.itextpdf.text.DocumentException
    import com.itextpdf.text.pdf.PdfCopy
    import com.itextpdf.text.pdf.PdfReader

    try {
        def reader = new PdfReader(fm_pathToSrc)
        def document = new Document()
        def copy = new PdfCopy(document, new FileOutputStream(fm_pathToDest))
        document.open()
        copy.addPage copy.getImportedPage(reader, fm_getPageNum.toInteger())
        document.close()
        return true
    } catch (IOException ioe) {
        return "ERROR: $ioe.message"
    } catch (DocumentException de) {
        return "ERROR: $de.message"
    }
May 25, 2012
Stephen,

Same thing, but using the PdfSmartCopy class:

    // PDFextractPage2 ( fm_fileIn ; fm_fileOut ; fm_num )
    // 11_09_12 JR
    // v1.4A
    import com.itextpdf.text.Document
    import com.itextpdf.text.pdf.PdfSmartCopy
    import com.itextpdf.text.pdf.PdfReader

    document = new Document()
    copy = new PdfSmartCopy(document, new FileOutputStream(fm_fileOut))
    try {
        reader = new PdfReader(fm_fileIn)
    } catch (Exception e) {
        if (e.toString().contains('BadPassword')) {
            return 'PASSWORD ERROR'
        } //end if
        // Any other failure leaves reader unset, so stop here
        return 'ERROR'
    } //end try
    document.open()
    try {
        copy.addPage(copy.getImportedPage(reader, fm_num.toInteger()))
        document.close()
    } catch (e) {
        //return e
        return 'ERROR'
    } //end try
    return true

PdfSmartCopy has the same functionality as PdfCopy, but when resources (such as fonts, images, ...) are encountered, a reference to these resources is saved in a cache so that they can be reused. This requires more memory but reduces the file size of the resulting PDF document. The effect is greater on multi-page PDF files, but it is a better class to use regularly.
May 25, 2012
Hi John,

PdfSmartCopy is of course a better class to use if you concatenate PDF documents containing duplicate resources, and it has very little impact on splitting PDF documents. No?
May 25, 2012
Author
I tried both versions; however, the 'Smart' one threw up an error, and the dialog and the subsequent "more info" were empty. The first one seems to work.
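The PdfSmartCopy script above returns a bare 'ERROR' and discards the exception, which makes a failure like this hard to diagnose. A minimal sketch of a variant that reports the exception class and message is shown here. The function name extractPage and its parameter names are hypothetical, and it assumes the iText 5 jar (the com.itextpdf packages used earlier in the thread) is already loaded; it has not been tested inside ScriptMaster.

```groovy
// Sketch only: extractPage and its parameter names are hypothetical.
// Assumes the iText 5 jar (com.itextpdf) is on the classpath.
import com.itextpdf.text.Document
import com.itextpdf.text.pdf.PdfReader
import com.itextpdf.text.pdf.PdfSmartCopy

def extractPage(String pathIn, String pathOut, int pageNum) {
    def reader = null
    try {
        // Open the source first so a bad path or password fails
        // before any output file is created.
        reader = new PdfReader(pathIn)
        int total = reader.getNumberOfPages()
        if (pageNum < 1 || pageNum > total) {
            return "ERROR: page ${pageNum} is out of range (1..${total})"
        }
        def document = new Document()
        def copy = new PdfSmartCopy(document, new FileOutputStream(pathOut))
        document.open()
        copy.addPage(copy.getImportedPage(reader, pageNum))
        document.close()
        return true
    } catch (Exception e) {
        // Report the exception class and message instead of a bare 'ERROR'.
        return "ERROR: ${e.getClass().simpleName}: ${e.message}"
    } finally {
        if (reader != null) reader.close()
    }
}

// In ScriptMaster the fm_ parameters would be passed straight in, e.g.:
// return extractPage(fm_fileIn, fm_fileOut, fm_num.toInteger())
```

Opening the reader before the output stream also avoids leaving an empty output file behind when the source cannot be read, and the range check turns an invalid page number into a readable message.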