
Splitting PDFs


This topic is 4858 days old. Please don't post here. Open a new topic instead.


There are AppleScript methods to convert PDFs to text, if you can go that route. I've also messed about testing PDF splitters, and it's a mixed bag. There are command line tools, some Java, some Python. But the older one I had does not work anymore in Snow Leopard, because of the move to 64-bit; nor did I want to mess about changing Python to run in 32-bit mode. I ain't that kind of programmer. If anyone here knows a web link to a good tool of any kind for splitting PDFs, please share.

There is a built-in Automator action to render each page as an image. But each file it creates is bigger than the original multi-page file, and the text is no longer selectable. That's just nuts. Next I tried the freeware app Skim. It creates very nice files, pretty small. The only downside is that it has to open the multi-page file, so the screen flashes a bit. But it is fast, especially as you're only doing small files. It can do 6 pages in a second.

I'm writing the files to a fixed folder on the Desktop, but you could change that to wherever. I'm just naming the files "1_page.pdf", "2_page.pdf", etc., but you could do more if you want. You can see where I did that in the script. Be sure to leave the parentheses around it (or it will get split into a list). I'm also choosing the file to split, so it could be anywhere.
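If you want fancier names, here is a minimal sketch of one way to build them. The handler name padded_name and its digits parameter are my own invention for illustration (they are not part of the script in this post); it just pads the page number with zeros so the files sort in page order.

```applescript
-- Hypothetical helper (not in the original script): returns a zero-padded
-- name such as "003_page.pdf" for page 3 with 3 digits.
on padded_name(page_number, digits)
	set num_text to page_number as text
	repeat while (length of num_text) is less than digits
		set num_text to "0" & num_text
	end repeat
	return (num_text & "_page.pdf")
end padded_name

set DT to path to desktop as text
set dest_folder to (DT & "PDFs_Split:")
-- Parenthesized the same way as the file_path line in the script
set file_path to (dest_folder & padded_name(3, 3))
```

If you call it from inside the tell application "Skim" block, write my padded_name(counter, 3), the same way the script calls my write_to_file.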

I just added a line to tell app "System Events" (built-in background app for many things) to do the "write" commands, which are in a subroutine. FileMaker does not like those commands within its own tell block (which is implied if you run a FileMaker Perform AppleScript script step); someone else has to do it. You can also use "Finder", but that's not best practice (on long operations especially, it ties up the Finder).

P.S. The AppleScript will run as is in any FileMaker file, in a Perform AppleScript (Native) script step. You may want to read FileMaker fields for names of folders/files, move/delete files, etc.
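As a rough sketch of reading a folder name from a field: the field name "PDF_Folder" is hypothetical, I'm assuming the application is named "FileMaker Pro" on your machine, and I'm assuming the field sits on the current layout and holds a colon-delimited path ending in ":". Adjust all of that to your own file.

```applescript
-- Assumptions: app name "FileMaker Pro"; a text field "PDF_Folder" (hypothetical)
-- on the current layout, holding something like "Macintosh HD:Users:me:Desktop:PDFs_Split:"
tell application "FileMaker Pro"
	set dest_folder to cellValue of cell "PDF_Folder" of current record
end tell
```

Inside a Perform AppleScript step the FileMaker tell block is implied, so the explicit tell lines should only be needed when running from Script Editor.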

set DT to path to desktop as text
-- the "PDFs_Split" folder must already exist on the Desktop
set dest_folder to (DT & "PDFs_Split:")
set file_to_split to choose file of type ("com.adobe.pdf") with prompt "Choose a PDF file to split"
-- leaving the original as is; but you could do whatever with it (afterwards; if splitting works :-)

tell application "Skim"
	activate
	open file_to_split
	tell document 1
		set pages_ to get pages
		set counter to 0
		repeat with p in pages_
			set counter to counter + 1
			set page_data to grab page counter
			set file_path to (dest_folder & counter & "_page.pdf")
			-- save the data to a file
			my write_to_file(file_path, page_data)
		end repeat
		close
	end tell
end tell

-- Write each page to its file path
-- Return a boolean value (false/true) (not using it, yet - FEJ)
on write_to_file(filepath, content)
	tell application "System Events" -- needed for FileMaker (doesn't like "write" commands)
		try
			set fileobj to open for access filepath with write permission
			-- empty the file first, so a rerun does not leave old bytes at the end
			set eof of fileobj to 0
			write content to fileobj
			close access fileobj
		on error errmsg number errnum
			try
				close access fileobj
			end try
			-- return errmsg
			return false
		end try
	end tell
	return true
end write_to_file
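
The script above assumes the "PDFs_Split" folder is already on the Desktop; the write will fail if it is not there. One possible way to create it first, sketched here with a shell call rather than anything from the original script:

```applescript
set DT to path to desktop as text
set dest_folder to (DT & "PDFs_Split:")
-- mkdir -p creates the folder if it is missing and does nothing if it exists
do shell script "mkdir -p " & quoted form of POSIX path of dest_folder
```

These lines would go near the top, before the tell application "Skim" block.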


I doubt you want to hear this, but I'd be looking to get rid of the PDFs completely. What are they? Can the data be gathered in a better form?

What's the best way to get an OCR read without a PDF, if a "better form" means easy automated data collection?
