PDF command line tools

Discussion in 'Mac Apps and Mac App Store' started by Oligarch, Apr 6, 2012.

  1. Oligarch macrumors newbie

    Nov 10, 2008
    I wrote three simple, single-purpose command line utilities to manipulate PDF documents:

    pdfcat - concatenate (join, fuse) PDF documents
    pdfcrop - crop (adjust margins of) PDF documents
    pdfsplit - split (extract pages from) PDF documents

    They can be downloaded here. Each comes with its own man page. There is also a small shell script that moves the tools and man pages to the appropriate directories, but running it is optional. Read the Readme file.

    The idea is to have something lightweight (the tarball is less than 60 kB in size) and panic-proof that works immediately after download, without any dependencies, compilation, installation, setup or learning curve. The two tools that take arguments beyond the obvious file names (pdfcrop and pdfsplit) understand plain English, and the order of the arguments is optimized for interactive use: the file to crop or split, the crop size or page ranges, and redirection of the standard output to a new file. For basic use, there are no options to memorize.

    The tools work just as well in scripts, of course, and each can be used as a filter in a pipe.

    For those less familiar with the Terminal, these are the keys one uses to navigate man pages: space bar, d, e (down one page, half a page, one line), b, u, y (up one page, half a page, one line), g, G (start, end), and /, n (search, next), q (quit). Surprisingly efficient. Factored differently: space bar, b (down, up one page), d, u (down, up half a page), e, y (down, up one line).

    To quickly check the result of your command line PDF manipulations in Preview, use OS X's "open" command: $ open file.pdf (the dollar sign is the bash prompt).

    As a MacRumors member it seemed natural to post here first, but if you know other places where people would find the PDF tools, please come forward. Also, any comments, questions, ideas, complaints or other feedback you might have will be greatly appreciated; use the e-mail address at the bottom of the man pages and Readme file, or post here to have your voice heard by (and tap the wisdom of) thousands of people instead of just me.
  2. miles01110 macrumors Core


    Jul 24, 2006
    The Ivory Tower (I'm not coming down)
  3. Oligarch thread starter macrumors newbie

    Nov 10, 2008
    Thanks, I know that. The commands are meant for when you are already busy in Terminal, for when there are many documents or documents with many pages to process, and for scripts. I find them convenient.

    P.S.: I forgot in my original post:

    HELP *** Testers needed !!! *** HELP

    If you are running Leopard, Snow Leopard, Lion, or Mountain Lion, especially on 64-bit Intel hardware, have 5 minutes to spare and an idle PDF lying around, it would be great if you could try at least one of the tools and report back that it works.
  4. superwoman macrumors regular

    Apr 25, 2005
    I admire your efforts, but how are these tools better than pdftk, which you can compile stand-alone or install via MacPorts?
  5. Oligarch thread starter macrumors newbie

    Nov 10, 2008
    Thank you for asking, superwoman. pdftk is a great package and far more comprehensive, except for cropping, which I think it can't do. I would say that it plays in another league. Here is a comparison:

    pdftk is multi-platform and comes with its own PDF library.
    These tools are OS X only and use the system PDF library (the same as Preview).

    pdftk is a 16 MB download.
    These tools are a 55 kB (!) download.

    pdftk must be compiled (requiring the developer tools, and time) or installed, and installs its libraries.
    These tools work out of the box, on Intel & PowerPC, and don't install or modify anything.

    pdftk is rather strict with regard to option syntax.
    These tools are extremely lenient and parse plain English as well as various option idioms.

    pdftk rolls concatenation and splitting into one with its smart "cat" operator.
    These tools make a clean distinction between the two, resulting in simpler syntax at the expense of flexibility.

    pdftk can't crop. (Please correct me if I am wrong.)
    These tools can, and have all the units and paper sizes I could find on Wikipedia built-in.

    pdftk can perform sophisticated operations (rotate pages, fill in forms, add watermarks, encrypt documents, ...).
    These tools can't; they do just the basics (split, combine, and crop).
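
    As a rough illustration of what built-in units and paper sizes involve: PDF page geometry is expressed in points (1 point = 1/72 inch), so a crop tool has to convert whatever unit or paper name the user types into points. The Python sketch below is not the tools' actual code; the table contents and the function names (to_points, paper_size_in_points) are hypothetical and chosen only for this example.

```python
# Conversion factors from common units to PDF points (1 pt = 1/72 inch).
POINTS_PER_UNIT = {
    "pt": 1.0,
    "in": 72.0,
    "cm": 72.0 / 2.54,
    "mm": 72.0 / 25.4,
}

# A few standard paper sizes as (width, height, unit).
# Illustrative subset only; the real tools reportedly know many more.
PAPER_SIZES = {
    "a4": (210, 297, "mm"),
    "a5": (148, 210, "mm"),
    "letter": (8.5, 11, "in"),
    "legal": (8.5, 14, "in"),
}


def to_points(value, unit):
    """Convert a length in the given unit to PDF points."""
    return value * POINTS_PER_UNIT[unit.lower()]


def paper_size_in_points(name):
    """Return (width, height) in points for a named paper size."""
    width, height, unit = PAPER_SIZES[name.lower()]
    return to_points(width, unit), to_points(height, unit)
```

    With these tables, US Letter comes out as 612 by 792 points and A4 as roughly 595 by 842 points.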

    When I needed this kind of basic PDF functionality three years ago, I found pdftk at once overkill and lacking, discovered the neat OS X Cocoa APIs, and wrote these tools, purely for myself. Since they turned out well, I thought it a shame to have them sit idle on my hard disk while they could be helping others, so I recently took the time to polish them up, write the man pages, and package everything for distribution.

    Some of the above points are totally moot on today's hardware, e.g., download sizes or load times and memory consumption of libraries, but I believe my emphasis on convenience and ease of use has merit; people tend to go looking for tools like these when they are operating in panic mode, with a deadline looming and 1000 other things to do that they hadn't thought about. In such cases, unless both the dev tools and the right package manager are already installed, compilation from source is not an option. The pdftk home page does offer pre-compiled builds for Snow Leopard and Panther, but I have no experience with them.

    To summarize: these tools are OS X native and "just work", instantly, have a straightforward syntax, but are limited to basic operations.
  6. DanShockley macrumors newbie

    Sep 24, 2008
    Appropriate for specific uses

    I just used these for an in-house situation at work where I wanted to be able to easily run some kind of split/join command on a user's computer without having to install a package like pdftk on-the-fly. Installing these two little binaries in a temporary directory was much simpler and less likely to run into issues.

    Obviously, I'd use something like pdftk for situations where that much flexibility and that many features are needed, but these work great for what I need.
  7. Oligarch thread starter macrumors newbie

    Nov 10, 2008
    Following a user request, I added a command line tool to burst multipage PDF documents into single pages. From the man page:

         pdfburst -- burst (split) PDF documents into single pages
         pdfburst file [path]
         The pdfburst utility bursts (splits) the PDF document file into single
         pages which it writes to path, appended by an underscore character and
         zero-padded page numbers.
         If file is a single dash (-), the PDF document is read from the standard
         input.
         If path is omitted, the base name (last path component) of file is used
         and the single page files are created in the current working directory.
         If path ends with a slash (/), it designates a directory and the single
         page files are named with just the page number.
         Missing directories along path are created.
    The complete set of four command line tools (pdfburst, pdfcat, pdfcrop, pdfsplit) can be downloaded here. For more information, read this thread or see the Readme file and man pages included with the package.
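
    The naming rules in the man page excerpt above can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not pdfburst's implementation. Three details are assumptions that the excerpt does not state: that the output files get a ".pdf" extension, that a trailing ".pdf" is dropped from the base name when path is omitted, and that the padding width equals the number of digits in the page count. The function name burst_names is hypothetical, and the sketch only computes names; it does not create files or missing directories.

```python
import os


def burst_names(file, page_count, path=None):
    """Return hypothetical output names for bursting `file` into pages."""
    if path is None:
        # Path omitted: use the base name (last path component) of file,
        # which places the pages in the current working directory.
        # Assumption: a trailing ".pdf" is dropped from the base name.
        path = os.path.splitext(os.path.basename(file))[0]

    # Assumption: pad page numbers to the digit count of the last page.
    width = len(str(page_count))

    names = []
    for page in range(1, page_count + 1):
        number = str(page).zfill(width)
        if path.endswith("/"):
            # Path designates a directory: name is just the page number.
            names.append(f"{path}{number}.pdf")
        else:
            # Otherwise: path, an underscore, then the padded page number.
            names.append(f"{path}_{number}.pdf")
    return names
```

    Under these assumptions, bursting a 12-page report.pdf into "out/" would yield out/01.pdf through out/12.pdf, while omitting path would yield report_01.pdf through report_12.pdf in the current directory.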
  8. plexxer macrumors newbie

    Jun 15, 2013
    Good work! Unfortunately, 10.8.4 broke them :/


    Thanks for the work you did in releasing these tools! They made quick work of some daily PDF processing jobs I run. However, it seems that the recently released 10.8.4 update changed the core libraries, and now pdfsplit fails with:

    2013-06-15 08:51:20.596 pdfsplit[21234:707] -[__NSCFNumber annotations]: unrecognized selector sent to instance 0x144410
    2013-06-15 08:51:20.651 pdfsplit[21234:707] *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[__NSCFNumber annotations]: unrecognized selector sent to instance 0x144410'
    *** Call stack at first throw:
    0 CoreFoundation 0x95851e8b __raiseError + 219
    1 libobjc.A.dylib 0x944ac52e objc_exception_throw + 230
    2 CoreFoundation 0x95855afd -[NSObject(NSObject) doesNotRecognizeSelector:] + 253
    3 CoreFoundation 0x9579de87 ___forwarding___ + 487
    4 CoreFoundation 0x9579dc32 _CF_forwarding_prep_0 + 50
    5 PDFKit 0x90207227 -[PDFDocument removePageAtIndex:] + 360
    6 pdfsplit 0x0000366f pdfsplit + 9839
    7 pdfsplit 0x00001d9a pdfsplit + 3482
    8 pdfsplit 0x00001cc1 pdfsplit + 3265
    Trace/BPT trap: 5

    I hope an update would be quick and easy. Again, thanks for everything!
  9. coherent macrumors newbie

    Aug 19, 2010
    There are some free tools for manipulating PDF files on the command line. The coherent pdf tools are now free for non-commercial use:

