the Call class aka the DSL

elmatou edited this page Jun 18, 2011 · 3 revisions

The lowest level of this wrapper is the Call class. Each instance of this class prepares the command line from a hash of parameters, runs the command in the shell while passing data from the inputs, and retrieves the output or the error.

Note

In most cases you won't use this class to run pdftk commands; you should prefer the abstraction layers, such as Wrapper, Form or Metadata. This section is mostly for development purposes, for instance if you want to build an amazing new abstraction layer.

Let's start

Running a command is really easy. First create an instance; it will locate your pdftk binary (except on Windows for now).

call = Call.new(dsl_hash)

Here, dsl_hash holds the default options for this Call instance. One of them can be the path to the pdftk binary (useful on Windows, or if you have several versions of pdftk and want to use a specific one).

Then run a command; it returns the output file or nil.

output = call.pdftk(dsl_hash)

OK, that is not very complicated, but what is this dsl_hash parameter in these statements?

That's the right question! From here, we will dig into the DSL (domain-specific language) of this gem.

The DSL

As you know, command-line programs often take a bunch of arguments in a very specific syntax, and pdftk is no exception. To call pdftk in an easy way, we built a DSL. It follows a hash pattern with four keys: :input, :operation, :output and :options. An additional key, :path, can be set to specify the path to the pdftk binary.

{:input => some_input, :operation => some_operation, :output => some_output, :options => some_options}
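To make the shape concrete, here is a minimal sketch of a complete hash describing a merge of two files. The file names and the `unknown_dsl_keys` helper are assumptions made for illustration only; they are not part of the gem.

```ruby
# Keys described above; :path is the optional extra key.
DSL_KEYS = [:input, :operation, :output, :options, :path].freeze

# Hypothetical helper (not part of the gem): lists keys the DSL does not know.
def unknown_dsl_keys(dsl_hash)
  dsl_hash.keys - DSL_KEYS
end

dsl_hash = {
  :input     => {'a.pdf' => 'foo', 'b.pdf' => nil},
  :operation => {:cat => [{:pdf => 'a.pdf', :start => 1, :end => 'end'},
                          {:pdf => 'b.pdf', :start => 12, :end => 16}]},
  :output    => 'merged.pdf',
  :options   => {:owner_pw => 'bar'}
}

p unknown_dsl_keys(dsl_hash)                        # every key is known here
p unknown_dsl_keys(dsl_hash.merge(:outptu => 'x'))  # a misspelled key is reported
```

Each of the four keys is described in its own section below.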

:input

It represents your input PDF(s), and there are several ways to reach them. It can be a simple string giving the full path to the PDF file:

:input => 'path/to/file.pdf'

In this case the file is not yet open, so you may need to give its password (here we pass 'foo' as the password).

:input => {'path/to/file.pdf' => 'foo'}

It can also be a File, StringIO or Tempfile Ruby object:

:input => File.new
OR
:input => StringIO.new
OR
:input => Tempfile.new

Here you don't need any password, as we pass the data stream directly from stdin.

Some operations (such as :cat or :shuffle) accept several input files. In this case, give a hash with the files as keys:

:input => {'a.pdf' => 'foo', 'b.pdf' => nil}

If a file doesn't need a password, just pass nil.
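On the pdftk command line, several inputs are usually declared with handles (A=a.pdf B=b.pdf) and their passwords follow the input_pw keyword. The sketch below shows one way an :input value could be translated into those arguments. The `input_arguments` helper is hypothetical and is not the gem's implementation; shell escaping and stream inputs are left out.

```ruby
# Hypothetical sketch (not the gem's code): turn an :input value into the
# "A=a.pdf B=b.pdf input_pw A=foo" part of a pdftk command line.
def input_arguments(input)
  files = input.is_a?(Hash) ? input : {input => nil}
  handles = ('A'..'Z').first(files.size)   # this sketch handles up to 26 files

  declarations = []
  passwords = []
  files.each_with_index do |(path, password), index|
    declarations << "#{handles[index]}=#{path}"
    passwords << "#{handles[index]}=#{password}" if password
  end

  args = declarations
  args += ['input_pw'] + passwords unless passwords.empty?
  args
end

p input_arguments('path/to/file.pdf')
# => ["A=path/to/file.pdf"]
p input_arguments({'a.pdf' => 'foo', 'b.pdf' => nil})
# => ["A=a.pdf", "B=b.pdf", "input_pw", "A=foo"]
```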

:operation

Now we want to choose one of the operations allowed by pdftk and apply it to the input(s) we set. The general syntax is:

:operation => {:some_operation => operation_argument}
OR 
:operation => 'some_operation'

If you do not need to give any argument to the operation, just use the second form (note that both strings and symbols are allowed).

All operations supported by pdftk should be supported here. For now, some are not fully supported (contributions are very welcome):

nil => nil                   # no operation.
:cat => [Hash/Range]         # *.pdf wildcards not supported for now, also blank options for full file cat not supported (must pass :pdf inputs)
:shuffle => [Hash/Range]     # *.pdf wildcards not supported for now, also blank options for full file cat not supported (must pass :pdf inputs)
:burst => nil
:generate_fdf => nil
:fill_form => String || File || StringIO || Tempfile
:background => String || File || StringIO || Tempfile
:multibackground => String || File || StringIO || Tempfile
:stamp => String || File || StringIO || Tempfile
:multistamp => String || File || StringIO || Tempfile
:dump_data => nil
:dump_data_utf8 => nil
:dump_data_fields => nil
:dump_data_fields_utf8 => nil
:update_info => String || File || StringIO || Tempfile
:update_info_utf8 => String || File || StringIO || Tempfile
:attach_files => [String, String, ...]            # to_page is not supported for now
:unpack_files => nil

Here keys are operations supported by pdftk (and the gem), and values are expected arguments.

  • nil means no argument is expected (you should use the second form).

  • String || File || StringIO || Tempfile means any of these input objects is expected (as for :input).

  • [...] means an array of something is expected.

  • [Hash/Range] is an array of ranges written as hashes (?!). Better check an example!

[
{:start => 1, :end => 'end', :pdf => 'a.pdf'},
{:pdf => 'b.pdf', :start => 12, :end => 16, :orientation => 'E', :pages => 'even'}
]

Don't forget to provide the same filenames in the :input part (I know it is boring, but the wrapper makes it easier).
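To see why the filenames must match, here is a rough sketch of how such range hashes could be rendered as pdftk page ranges, which take the form handle, pages, qualifier, rotation. The `page_range` helper and the handle mapping are hypothetical and are not the gem's code; the single-letter rotation follows the example above, and the exact spelling accepted depends on your pdftk version.

```ruby
# Hypothetical sketch (not the gem's code): render a range hash as a pdftk
# page range such as "A1-end", given a filename => handle mapping.
def page_range(range, handles)
  handle = handles.fetch(range[:pdf])   # KeyError if the file is not an input
  pages = range[:start].to_s
  pages += "-#{range[:end]}" if range[:end]
  "#{handle}#{pages}#{range[:pages]}#{range[:orientation]}"
end

handles = {'a.pdf' => 'A', 'b.pdf' => 'B'}
ranges = [
  {:start => 1, :end => 'end', :pdf => 'a.pdf'},
  {:pdf => 'b.pdf', :start => 12, :end => 16, :orientation => 'E', :pages => 'even'}
]

p ranges.map { |range| page_range(range, handles) }
# => ["A1-end", "B12-16evenE"]
```

A range naming a file that is missing from the handle mapping fails immediately with a KeyError in this sketch, which mirrors the need to list the same filenames in :input.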

As we respect the pdftk terms, you should read the pdftk man page (just a bit improved) for more information.

As inputs can be file paths or stdin data streams, take care to have only one input data stream in a single command; otherwise a MultipleInputStream exception will be raised.
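The constraint comes from the fact that a process has a single stdin. A minimal sketch of such a check follows; the `check_single_stream!` helper is hypothetical, and the exception class is a local stand-in defined so that the example runs on its own, not the gem's actual class.

```ruby
require 'stringio'

# Stand-in for the gem's exception, defined here so the sketch runs alone.
class MultipleInputStream < ArgumentError; end

# Hypothetical check: anything that is neither nil nor a String path is
# treated as a data stream that would have to go through stdin.
def check_single_stream!(*candidates)
  streams = candidates.reject { |candidate| candidate.nil? || candidate.is_a?(String) }
  if streams.size > 1
    raise MultipleInputStream, "#{streams.size} streams given, only one can use stdin"
  end
  streams.first
end

# One path and one stream: fine, the stream is returned.
check_single_stream!('a.pdf', StringIO.new('fdf data'))

# Two streams: the check raises.
begin
  check_single_stream!(StringIO.new('pdf data'), StringIO.new('fdf data'))
rescue MultipleInputStream => error
  puts error.message
end
```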

:output

It can be any of NilClass || String || File || StringIO || Tempfile. If no output is specified (or if it is set to nil), the result is routed to stdout and returned by the pdftk method. Here we do not set any password; keep reading to understand.

:output => nil
:output => 'path/to/target.pdf'
:output => StringIO.new

:options

Options are given as a hash containing one or several of the possibilities below:

:owner_pw => String
:user_pw => String
:encrypt  => :'40bit' || :'128bit'
:flatten  => true || false
:compress  => true || false
:keep_id  => :first || :final
:drop_xfa  => true || false
:allow  => ['Printing', 'DegradedPrinting', 'ModifyContents', 'Assembly', 'CopyContents', 'ScreenReaders', 'ModifyAnnotations', 'FillIn', 'AllFeatures']

For example:

:options => {:owner_pw => 'foo'}
:options => {:owner_pw => 'foo', :encrypt  => :'128bit', :keep_id  => :first}
:options => {:owner_pw => 'foo', :encrypt  => :'128bit', :keep_id  => :first, :allow  => ['Printing', 'DegradedPrinting', 'ModifyContents']}
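These keys follow the output options of the pdftk command line (owner_pw, encrypt_128bit, keep_first_id, allow and so on). As a rough sketch of the mapping, the hypothetical `option_arguments` helper below builds the corresponding arguments. It is not the gem's implementation; the argument order and the handling of false values are assumptions.

```ruby
# Hypothetical sketch (not the gem's code): translate an :options hash into
# pdftk output flags. False or missing values simply emit nothing here.
def option_arguments(options)
  args = []
  args += ['owner_pw', options[:owner_pw]] if options[:owner_pw]
  args += ['user_pw', options[:user_pw]] if options[:user_pw]
  args << "encrypt_#{options[:encrypt]}" if options[:encrypt]
  args << 'flatten' if options[:flatten]
  args << 'compress' if options[:compress]
  args << "keep_#{options[:keep_id]}_id" if options[:keep_id]
  args << 'drop_xfa' if options[:drop_xfa]
  args += ['allow'] + options[:allow] if options[:allow]
  args
end

options = {:owner_pw => 'foo', :encrypt => :'128bit', :keep_id => :first,
           :allow => ['Printing', 'DegradedPrinting', 'ModifyContents']}

puts option_arguments(options).join(' ')
# prints: owner_pw foo encrypt_128bit keep_first_id allow Printing DegradedPrinting ModifyContents
```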

Gimme full examples!

You can check the specs for more examples; here is a copy/paste of one of them:

@pdftk.set_cmd(:input => {'a.pdf' => 'foo', 'b.pdf' => 'bar', 'c.pdf' => nil}, :operation => {:fill_form => @stringio}, :output => @file_object, :options => {:flatten => false, :owner_pw => 'bar', :user_pw => 'baz', :encrypt => :'40bit'})

Anything else?

Yes, some instance methods could be useful.

As not all versions of pdftk support all the features, we added three methods: pdftk_version, xfdf_support? and utf8_support?