Sejda shell interface tutorial

Requirements

The shell interface requires a Java runtime, version 1.8 or higher. It works on Linux, Windows and Mac.
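
To check the Java requirement before going further, something like the following sketch may help. The helper name java_major is hypothetical (it is not part of Sejda), and the parsing assumes the usual `java -version` output, where the version string appears in double quotes.

```shell
# Sketch: extract the major Java version from a version string.
# java_major is a hypothetical helper name, not part of Sejda.
java_major() {
  case "$1" in
    1.*) v=${1#1.}; echo "${v%%.*}" ;;   # legacy scheme: 1.8.0_101 -> 8
    *)   echo "${1%%[.+-]*}" ;;          # newer scheme: 17.0.2 -> 17
  esac
}

# Only query the runtime if a java binary is on the PATH.
if command -v java >/dev/null 2>&1; then
  # java -version prints to stderr; the version is the first quoted field.
  version=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  major=$(java_major "$version")
  if [ "${major:-0}" -ge 8 ]; then
    echo "Java $version is recent enough"
  else
    echo "Java $version is too old, 1.8 or higher is required"
  fi
else
  echo "No java executable found on the PATH"
fi
```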

Running the shell interface

After you have downloaded the sejda-console binaries and extracted the files, you should have the following directory structure:

./lib
./lib/[...].jar
./bin
./bin/sejda-console.bat
./bin/sejda-console
./etc
./etc/logback.xml

On Linux and Mac, you might first have to grant execute permission to the launcher script:

chmod +x bin/sejda-console

To run the console, execute the launcher in the bin folder that matches your platform: sejda-console.bat on Windows, sejda-console on Linux and Mac.

./bin/sejda-console

Detailed instructions can be found in this video tutorial.

Available commands

To list the available commands, run the console without any parameters or with the -h option.

> ./bin/sejda-console
Configuring Sejda 2.9.0
Loading Sejda configuration form default sejda.xml
Starting execution with arguments: ''
Java version: '1.8.0_101'
Sejda Console

Basic commands:

 addbackpages         Takes one or more pages from a PDF document and adds them to
                      one or more PDF documents after each 'n' pages.

 alternatemix         Given two PDF documents, creates a single output PDF
                      document taking pages alternatively from the two input.
                      Pages can be taken in straight or reverse order and using a
                      configurable step (number of pages before the process switch
                      from a document to the other).

 combinereorder       Combines multiple PDF documents reordering the pages if
                      required.

 compress             Compress PDF by optimizing images inside, reducing their
                      dpi, size and/or quality.

 crop                 Given a PDF document and a set of rectangular boxes, creates
                      an output PDF document where pages are cropped according to
                      the input rectangular boxes. Input boxes are set as cropbox
                      on the resulting document pages (see PDF 32000-1:2008,
                      chapter 7.7.3.3, Table 30). Resulting document will have a
                      number of pages that is the the number of pages of the
                      original document multiplied by the number of rectangular
                      boxes.

 decrypt              Given a collection of encrypted PDF documents and their
                      owner password, creates a decrypted version of each of them.

 encrypt              Given a collection of PDF documents, applies the selected
                      permission using the selected encryption algorithm and the
                      provided owner and user password.

 extractbybookmarks   Extracts chapters to separate documents based on the
                      bookmarks in the outline at a given level (optionally
                      filtered by a given regex).

 extractpages         Extracts pages from a PDF document creating a new one
                      containing only the selected pages. Page selection can be
                      done using a predefined set of pages (odd, even) or as a set
                      of ranges (from page x to y).

 extracttext          Given a collection of PDF documents, creates a collection of
                      text files containing text extracted from them.

 extracttextbypages   Extracts text from a single PDF document creating a
                      collection of text files each containing text extracted from
                      a single page.

 merge                Given a collection of PDF documents, creates a single output
                      PDF document composed by the selected pages of each input
                      document taken in the given order.

 nup                  Composes multiple PDF pages (4, 8, 16, 32) per sheet.

 pdftojpeg            Converts a PDF document to multiple JPEG images (one image
                      per page).

 pdftomultipletiff    Converts a PDF document to multiple TIFF images (one image
                      per page).

 pdftosingletiff      Converts a PDF document to a single TIFF image (TIFF format
                      supports multiple images written to a single file).

 portfolio            Creates a portfolio/collection of attachments.

 rotate               Apply page rotation to a collection of PDF documents.
                      Rotation can be applied to a specified set of pages or to a
                      predefined set (all, even pages, odd pages).

 scale                Scales pages or pages content of multiple PDF documents.

 setheaderfooter      Adds a header or a footer to a PDF document or part of it.

 setmetadata          Sets metadata (title, author, subject, keywords) to an input
                      PDF document.

 setpagelabels        Given a collection of PDF documents, applies the selected
                      page labels as defined in the PDF 32000-1:2008, chapter
                      12.4.2.

 setpagetransitions   Given a PDF document, applies the selected pages transitions
                      (to use the document as a slide show presentation) as
                      defined in the PDF 32000-1:2008, chapter 12.4.4.

 setviewerpreferences Given a collection of PDF documents, applies the selected
                      viewer preferences.

 simplesplit          Splits a given PDF document at a predefined set of page
                      numbers (all, odd pages, even pages).

 splitbybookmarks     Splits a given PDF document before each page that is a
                      destination in the document outline (bookmarks) at the
                      specified level (optionally matching a provided regular
                      expression).

 splitbyevery         Splits a given PDF document every 'n' pages creating
                      documents of 'n' pages each.

 splitbypages         Splits a given PDF document after each one of the selected
                      page numbers.

 splitbysize          Splits a given PDF document in files of the selected size
                      (roughly).

 splitbytext          Splits a PDF document by text content, extracting separate
                      documents when specific text changes from page to page.

 splitdownthemiddle   Splits document pages in two, reordering pages if necessary.

 unpack               Unpacks all the attachments of a given collection of PDF
                      documents.

 watermark            Stamps a watermark image on multiple PDF documents.

Use "sejda-console <command> -h" for help regarding a specific command

Command specific options

To get more details about a specific command, run sejda-console -h <command>:

> ./bin/sejda-console -h decrypt
09:39:48.310 Given a collection of encrypted pdf documents and their owner password, creates a decrypted version of each of them.

Example usage: sejda-console decrypt -f /tmp/file1.pdf:secret123 -o /tmp -p decrypted_

Usage: sejda-console decrypt options
	[--compressed] : compress output file (optional)
	--existingOutput -j value : policy to use when an output file with the same name already exists. {overwrite, skip, fail}. Default is 'fail' (optional)
	--files -f value... : pdf files to operate on: a list of existing pdf files (EX. -f /tmp/file1.pdf or -f /tmp/password_protected_file2.pdf:secret123) (required)
	[--help -h] : prints usage information. Can be used to detail options for a command '-h command' (optional)
	--output -o value : output directory (required)
	--outputPrefix -p value : prefix for the output files name (optional)
	--pdfVersion -v value : pdf version of the output document/s {1.2, 1.3, 1.4, 1.5, 1.6 or 1.7}. Default is 1.6. (optional)
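
The options above can be combined in a small wrapper. This is a minimal sketch, not an official script: decrypt_one, SEJDA and DRY_RUN are hypothetical names introduced here, and only flags listed in the usage output above (-f, -o, -p, -j) are used.

```shell
# Path to the launcher; override by setting SEJDA before sourcing this file.
SEJDA="${SEJDA:-./bin/sejda-console}"

# decrypt_one <input.pdf> <owner password> <output directory>
# Decrypts one file, prefixing the output name with "decrypted_" and
# overwriting an existing output file instead of failing.
decrypt_one() {
  # Rebuild the positional parameters as the full command line.
  set -- "$SEJDA" decrypt -f "$1:$2" -o "$3" -p decrypted_ -j overwrite
  if [ -n "${DRY_RUN:-}" ]; then
    # Dry run: print the command instead of executing it.
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

With DRY_RUN=1 set, calling decrypt_one /tmp/file1.pdf secret123 /tmp only prints the command line, which makes it easy to check the arguments before running the real decryption.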

Configuring custom logging

By default, sejda-console logs to STDOUT at level INFO.

Logging settings can be changed by editing the etc/logback.xml file.

For example, to turn off logging completely, comment out the STDOUT appender:

<root level="INFO">
	<!-- <appender-ref ref="STDOUT" /> -->
	<!-- <appender-ref ref="FILE" /> -->
</root>

The logback.xml file also contains a commented-out example of how to log to a file instead of STDOUT.
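
If you would rather not edit logback.xml, shell redirection is an alternative, assuming the default configuration that logs to STDOUT. The sketch below uses a hypothetical helper named run_logged. Note that it captures everything the command prints (including stderr), not only the log lines.

```shell
# run_logged <log file> <command> [arguments...]
# Runs the command and appends its stdout and stderr to the log file.
# The exit status is that of the command itself.
run_logged() {
  log_file=$1
  shift
  "$@" >> "$log_file" 2>&1
}
```

For example, run_logged sejda.log ./bin/sejda-console decrypt -f /tmp/file1.pdf:secret123 -o /tmp would append the console output of that run to sejda.log.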

Configuring memory arguments

By default, the shell interface is configured to use up to 1024 MB of memory (the Java -Xmx maximum heap size). If you encounter an OutOfMemoryError, you can raise this limit above the provided default.

On Linux and Mac, edit bin/sejda-console and increase the value of this option:

-Xmx1024M

On Windows, edit bin/sejda-console.bat and increase the value of the same option:

-Xmx1024M
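
On Linux and Mac the edit can also be scripted. The following is a sketch under the assumption that the launcher contains a literal -Xmx option as shown above. The helper name set_max_heap is hypothetical, and the original launcher is kept as a .bak copy.

```shell
# set_max_heap <launcher script> <new size, e.g. 2048M>
# Replaces the first -Xmx value on each line of the launcher script.
set_max_heap() {
  launcher=$1
  size=$2
  # Keep a backup, then rewrite the original file in place so that
  # its execute permission is preserved.
  cp -p "$launcher" "$launcher.bak" &&
    sed "s/-Xmx[0-9][0-9]*[A-Za-z]*/-Xmx$size/" "$launcher.bak" > "$launcher"
}
```

For example, set_max_heap bin/sejda-console 2048M would raise the limit to 2048 MB and leave the previous version in bin/sejda-console.bak.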