PDF Miner



Author: Jerry Chae


Description

This plugin lets your bot search for a text string in a PDF document and returns the coordinates (x/y location) of each occurrence of that string on the page.



Need help?

Technical contact: tech@argos-labs.com





Required Input

  • A PDF file (please note that an “image PDF”, i.e. one without a text layer, does not work.)
  • A text string to look for


Output/Return Value

  • A CSV will be returned.


The CSV headers are as follows (case-sensitive):


pageid, width, height, x0, y0, x1, y1, text


Headers

      - pageid    page number, starting at 1 (integer)
      - width     width of the page in pixels
      - height    height of the page in pixels
      - x0        top-left corner of the match, in pixels from the left edge of the page
      - y0        top-left corner of the match, in pixels from the top edge of the page
      - x1        bottom-right corner of the match, in pixels from the left edge of the page
      - y1        bottom-right corner of the match, in pixels from the top edge of the page
      - text      the target text string in full
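
Outside the bot, the returned CSV can also be post-processed with a short script. The following is a minimal sketch, not part of the plugin: it assumes the output is a comma-separated file with the header row listed above, the sample rows are made up for illustration, and the helper name `parse_matches` is hypothetical.

```python
import csv
import io

# Hypothetical sample of the plugin's CSV output; the values are invented
# for illustration and do not come from a real PDF.
SAMPLE_CSV = """pageid,width,height,x0,y0,x1,y1,text
1,612,792,72,100,180,114,Invoice Number
2,612,792,72,300,180,314,Invoice Number
"""


def parse_matches(csv_text):
    """Turn the CSV text into a list of dicts with numeric box geometry."""
    matches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Header names are case-sensitive, so they are used exactly as documented.
        x0, y0 = float(row["x0"]), float(row["y0"])
        x1, y1 = float(row["x1"]), float(row["y1"])
        matches.append({
            "page": int(row["pageid"]),
            "box_width": x1 - x0,    # horizontal extent of the match
            "box_height": y1 - y0,   # vertical extent of the match
            "center": ((x0 + x1) / 2, (y0 + y1) / 2),
            "text": row["text"],
        })
    return matches


matches = parse_matches(SAMPLE_CSV)
for m in matches:
    print(m["page"], m["center"], m["text"])
```

Each row describes one occurrence of the search string, so a string found on several pages produces several rows.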


To extract data from the result, use the standard STU variable format, i.e. {{groupname.text(index)}}


How to set parameters








