PDF to Text Converter

This adapter reads a PDF document (including password-protected documents), extracts the text content of a specific page range or of the entire document, and outputs the extracted character strings as an XML or text document in a freely selectable character encoding.

Properties

Operation

Determines which operation the adapter executes

Possible values: Extract: Extract text from the input PDF document.

Parameters

Adapter

Main class of the adapter (do not change!)

Possible values: en.softproject.integration.adapter.pdf.PDF2Text: Main class (default)

password

Password (for a protected PDF document)

Possible values: Any string

startPage

Number of the first page from which text is extracted

Possible values:

  • Any positive integer or 0

  • 0: Start from the first page (default)

endPage

Number of the last page up to which text is extracted

Possible values:

  • Any positive integer or 0

  • 0: Extract text up to the last page (default)
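
How startPage and endPage interact can be sketched as follows. This is a minimal illustration in Python, not the adapter's implementation: the helper name resolve_page_range, the 1-based page numbering, and the clamping of endPage to the document length are all assumptions made for the example.

```python
def resolve_page_range(start_page: int, end_page: int, page_count: int) -> range:
    """Return the 1-based page numbers to extract (hypothetical helper)."""
    if start_page < 0 or end_page < 0:
        raise ValueError("startPage and endPage must be 0 or a positive integer")

    # 0 means "start from the first page" (documented default).
    first = 1 if start_page == 0 else start_page

    # 0 means "extract to the last page" (documented default).
    # Clamping a larger endPage to the document length is an assumption.
    last = page_count if end_page == 0 else min(end_page, page_count)

    if first > last:
        raise ValueError("startPage lies beyond endPage or beyond the document")

    return range(first, last + 1)
```

With both parameters left at 0, the sketch covers the entire document; setting only startPage extracts from that page to the end.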

encoding

Character encoding of the result document

Possible values: Any valid character encoding (e.g. UTF-8).

force

Attempt to extract text even from invalid PDF pages

Possible values:

  • yes: Process invalid PDF pages

  • no: Ignore invalid PDF pages (default)

toXML

Output text contents in an XML document

Possible values:

  • yes: Output XML document

  • no: Output text document (default)
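
The parameter values and defaults listed above can be summarized in a small validation sketch. It is written in Python purely for illustration and does not reflect how the adapter processes its configuration. The function name normalize_parameters, the empty default for password, and the UTF-8 default for encoding are assumptions (the text above documents no default for either). Python's codec registry is used here as a stand-in for the encoding check; the names the adapter accepts may differ.

```python
import codecs

# Defaults taken from the parameter descriptions where documented.
# "password" and "encoding" have no documented default; the values
# below are illustrative assumptions.
DEFAULTS = {
    "password": "",
    "startPage": "0",
    "endPage": "0",
    "encoding": "UTF-8",
    "force": "no",
    "toXML": "no",
}


def normalize_parameters(params: dict) -> dict:
    """Merge user-supplied values with defaults and validate them (hypothetical helper)."""
    merged = {**DEFAULTS, **params}

    # Page numbers must be 0 or a positive integer.
    for name in ("startPage", "endPage"):
        value = int(merged[name])
        if value < 0:
            raise ValueError(f"{name} must be 0 or a positive integer")
        merged[name] = value

    # Switches accept only "yes" or "no".
    for name in ("force", "toXML"):
        if merged[name] not in ("yes", "no"):
            raise ValueError(f"{name} must be 'yes' or 'no'")
        merged[name] = merged[name] == "yes"

    # Raises LookupError if the encoding name is unknown to Python.
    codecs.lookup(merged["encoding"])
    return merged
```

For example, normalize_parameters({"endPage": "5", "toXML": "yes"}) keeps startPage at its default of 0 and turns the two switches into booleans.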

Status values

  • -1: The operation was executed successfully.

  • 1: The operation failed due to a technical error.