PDF to Text Converter | X4Documentation

This adapter reads a PDF document (including password-protected documents), extracts the text content of a specific page range or of the entire document, and outputs the extracted strings as an XML or text document in a freely selectable character encoding.

Properties

Operation

Describes which operation the adapter performs.

Possible values: `Extract`: Extracts text from the input PDF document

Parameters

password	Password for a protected PDF document. Possible values: any string.
startPage	Number of the first page from which text is extracted. Possible values: any positive integer, or `0` to start from the first page (default).
endPage	Number of the last page up to which text is extracted. Possible values: any positive integer, or `0` to extract up to the last page (default).
encoding	Character encoding of the result document. Possible values: any valid encoding name (e.g. `UTF-8`).
force	Whether to also attempt text extraction on invalid PDF pages. Possible values: `yes` to process invalid PDF pages, `no` to ignore invalid PDF pages (default).
toXML	Whether to output the text content as an XML document. Possible values: `yes` to output an XML document, `no` to output a text document (default).
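
The way `startPage` and `endPage` combine, with `0` standing for the default, can be illustrated with a short Python sketch. This is not the adapter's implementation. The function name `resolve_page_range` and the `page_count` argument are hypothetical, and two behaviors are assumptions that the parameter descriptions do not state: an `endPage` beyond the end of the document is clamped to the last page, and a `startPage` greater than `endPage` yields an empty range.

```python
def resolve_page_range(start_page: int, end_page: int, page_count: int) -> range:
    """Map startPage/endPage (0 = default) to a concrete 1-based page range.

    Hypothetical helper for illustration only; not part of the adapter.
    """
    if start_page < 0 or end_page < 0:
        raise ValueError("startPage and endPage must be 0 or a positive integer")

    # 0 means "start from the first page" / "extract up to the last page".
    first = 1 if start_page == 0 else start_page
    last = page_count if end_page == 0 else end_page

    # Assumption: an endPage past the end of the document is clamped.
    last = min(last, page_count)

    # Assumption: if first > last, the resulting range is simply empty.
    return range(first, last + 1)


if __name__ == "__main__":
    # Defaults on a 5-page document select every page.
    print(list(resolve_page_range(0, 0, 5)))
    # Explicit bounds select pages 2 to 3 inclusive.
    print(list(resolve_page_range(2, 3, 5)))
```

Under these assumptions, leaving both parameters at `0` selects the entire document, which matches the documented defaults.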

Status values

`1`	The operation was successful.
`-1`	The operation failed due to a technical error.