
ScriptMaster erroring on processing scanned PDFs


This topic is 3431 days old. Please don't post here. Open a new topic instead.


We have a ScriptMaster (SM) process that combines PDFs into one. Most of these PDFs were created by printing or saving to PDF from a word processor or the like, but some of them are scans of printed pages. When our process encounters these scanned pages it often (always?) errors and causes a failure.

 

Has anyone else encountered this and, more importantly, does anyone have any suggestions on how to get SM to process these scanned PDFs properly?

 

More info on our process is available if it might help to diagnose the problem...

 

Thanks,

mark

Link to comment
Share on other sites

I can't post a "bad" file because of privacy concerns; these are parts of applications for a major grant competition. I will get the code we are using and post it, though. We can see that the files that always fail were scanned on Konica/Minolta scanners. Attached is the Acrobat file info.

 

Mark

 

post-92156-0-27510500-1417612874_thumb.j

Link to comment
Share on other sites

What about scanning any old piece of printed paper on that scanner, then? It's what's inside the file that is likely to be causing the problem, not what's in the file info...

 

If you open the failing file in Adobe Reader first and then try to concatenate, does it still fail?

Link to comment
Share on other sites

The "bad" scans are not coming from us... they are being uploaded into our system by applicants from all over. We have no control over those source files.

 

I can test a bad file to see if some pre-processing would fix the problem, but doing that routinely would not be possible in our system (at least not manually), as it processes somewhere in the neighborhood of 20,000 pages per competition (per year).


We are using this to do the PDF merging:

RegisterGroovy( "mergePDFs( files ; pdfOut )" ; "import java.io.FileOutputStream;¶
import java.util.ArrayList;¶
import java.util.List;¶
import com.lowagie.text.pdf.*;¶
import com.lowagie.text.*;¶
String[] inFiles = files.split(\"\\n\");¶
int fileIndex = 0;¶
int pageOffset = 0;¶
Document document = null;¶
PdfCopy copy = null;¶
ArrayList bookmarks = new ArrayList();¶
while (fileIndex < inFiles.length) {¶
    // Create a reader for the next document¶
    PdfReader reader = new PdfReader(new RandomAccessFileOrArray(inFiles[fileIndex]), null);¶
    reader.consolidateNamedDestinations();¶
    // Retrieve the total number of pages¶
    int numberOfPages = reader.getNumberOfPages();¶
    // Create the master document¶
    if (fileIndex == 0) {¶
        // step 1: Create the document-object¶
        document = new Document(reader.getPageSizeWithRotation(1));¶
        // step 2: Create the copy that listens to the document¶
        copy = new PdfCopy(document, new FileOutputStream(pdfOut));¶
        // step 3: Open the document¶
        document.open();¶
        // cache bookmarks from the first file¶
        ArrayList temp = SimpleBookmark.getBookmark(reader);¶
        if( temp != null ) bookmarks.addAll(temp);¶
    } else {¶
        // cache bookmarks from subsequent files and adjust the page number references¶
        ArrayList tmp = SimpleBookmark.getBookmark(reader);¶
        if( tmp != null ) {¶
            SimpleBookmark.shiftPageNumbers(tmp, pageOffset, null);¶
            bookmarks.addAll(tmp);¶
        }¶
    }¶
    // step 4: Add content¶
    PdfImportedPage page;¶
    for (int i = 0; i < numberOfPages; ) {¶
        ++i;¶
        page = copy.getImportedPage(reader, i);¶
        copy.addPage(page);¶
    }¶
    // update counters¶
    pageOffset += numberOfPages;¶
    fileIndex++;¶
}¶
// add cached bookmarks¶
if(bookmarks.size() > 0) copy.setOutlines(bookmarks);¶
// close document¶
document.close();¶
return true;" )
Link to comment
Share on other sites

OK, it's code that is more than 6 years old, written against a version that is long past being supported; this in itself might be enough to cause your method to fail.

 

Read some stuff here about why the latest version is a better deal: http://itextpdf.com/salesfaq

There will be some news soon about licensing for FileMaker users... watch this space.

 

I would start by removing the bookmarks code and just adding the pages to see what happens.

Also, take a PDF which breaks your code and add it just one more page at a time (+1, +2, +3, etc.) to see if there is a specific page which causes the problem.
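
The page-at-a-time test could be automated roughly like this. It is only a sketch in plain Java: `FirstFailingPage`, `find` and the `mergesUpTo` callback are made-up names, and the callback is a stand-in for whatever actually attempts the merge of pages 1..n (for example a cut-down version of your merge routine without the bookmark handling). Nothing here calls iText itself.

```java
import java.util.function.IntPredicate;

public class FirstFailingPage {

    /**
     * Tries pages 1..1, then 1..2, then 1..3 and so on.
     * mergesUpTo is a hypothetical callback: it should attempt the merge
     * using pages 1..lastPage and return true on success.
     * Returns the 1-based number of the first page whose addition makes
     * the attempt fail, or -1 if every attempt succeeds.
     */
    public static int find(int totalPages, IntPredicate mergesUpTo) {
        for (int lastPage = 1; lastPage <= totalPages; lastPage++) {
            boolean ok;
            try {
                ok = mergesUpTo.test(lastPage);
            } catch (RuntimeException e) {
                // Treat an exception from the merge attempt as a failure.
                ok = false;
            }
            if (!ok) {
                return lastPage;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for a real merge attempt: pretend page 7 of a
        // 10-page scan is the one that breaks the merge.
        int bad = find(10, lastPage -> lastPage < 7);
        System.out.println("First failing page: " + bad);
    }
}
```

If the result is always page 1 for the Konica/Minolta scans, the problem is more likely in how the file as a whole is read than in any one page.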

Link to comment
Share on other sites
