This adapter reads a PDF document (password-protected documents are supported), extracts the text content of a specific page range or of the entire document, and outputs the extracted text as an XML or plain-text document in a freely selectable character encoding.
Properties
| Operation | Determines which operation the adapter executes. Possible values: |
Parameters
| Adapter | Main class of the adapter (do not change!). Possible values: |
| password | Password for a protected PDF document. Possible values: any string |
| startPage | Number of the first page from which text is extracted. Possible values: |
| endPage | Number of the last page up to which text is extracted. Possible values: |
| encoding | Character encoding of the result document. Possible values: any valid character encoding (e.g. …) |
| force | Try to extract text even from invalid PDF pages. Possible values: |
| toXML | Output the text content as an XML document. Possible values: |
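The interplay of startPage, endPage, encoding and toXML can be sketched in a few lines of Python. This is only an illustration, not the adapter's implementation: the function `render_pages`, the 1-based inclusive page range, and the XML element names `document` and `page` are all assumptions, and the page texts are taken as already extracted.

```python
import xml.etree.ElementTree as ET


def render_pages(page_texts, start_page=1, end_page=None,
                 encoding="UTF-8", to_xml=True):
    """Serialize an inclusive, 1-based page range as XML or plain text.

    Hypothetical helper: names, defaults and XML layout are assumptions.
    """
    if end_page is None:
        end_page = len(page_texts)  # default: up to the last page
    if not 1 <= start_page <= end_page <= len(page_texts):
        raise ValueError("page range lies outside the document")

    # Convert the 1-based inclusive range to a Python slice.
    selected = page_texts[start_page - 1:end_page]

    if not to_xml:
        # Plain-text output: one page after another, in the chosen encoding.
        return "\n".join(selected).encode(encoding)

    root = ET.Element("document")  # assumed element name
    for number, text in enumerate(selected, start=start_page):
        page = ET.SubElement(root, "page", number=str(number))
        page.text = text
    # Returns bytes in the requested encoding, with an XML declaration.
    return ET.tostring(root, encoding=encoding, xml_declaration=True)
```

Calling `render_pages(["first", "second", "third"], 2, 3, "ISO-8859-1", to_xml=False)` would return the second and third page as ISO-8859-1 bytes; with `to_xml=True` the same range is wrapped in one `page` element per page.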
Status values
| | The operation was executed successfully. |
| | The operation failed due to a technical error. |
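How the force parameter might relate to the two outcomes above can be sketched as follows. This is one plausible reading, not documented behavior: the function `extract_with_status`, the callback `extract_page`, the use of `ValueError` to signal an invalid page, and the labels `"success"` and `"error"` are all placeholders rather than the adapter's actual names.

```python
def extract_with_status(pages, extract_page, force=False):
    """Extract text page by page and report a status label.

    Hypothetical sketch: without force, the first invalid page aborts the
    operation; with force, the page is kept as an empty string and the
    extraction continues.
    """
    texts = []
    for page in pages:
        try:
            texts.append(extract_page(page))
        except ValueError:
            if not force:
                return "error", []  # placeholder status label
            texts.append("")  # force: tolerate the invalid page
    return "success", texts  # placeholder status label
```

A caller would branch on the returned label, for example writing the result document only when the label is `"success"`.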