PDF Reader(Gemini)-26.123.1630

icon.png

 

PDF Gemini(GPT)

 

Author   Wanjin Choi(truewan@vivans.net)

 

Primary Features

This plugin integrates some of the key features of the Gemini API, which is provided through Google AI Studio (https://aistudio.google.com/api-keys?hl=en).

It uses the Gemini API to return key information from the PDF, based on the prompt entered by the user.

 

The functions included in this plugin are:

  • PDF Text Extraction (Text Mode, default)

    • Reads text-based PDFs using a PDF parser and extracts raw text content.

  • PDF OCR / Vision Extraction (Vision Mode)

    • Handles scanned or noisy/dirty documents via OCR.

  • Prompt-based Information Structuring

    • Uses a user prompt to determine what fields to extract.
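
The two extraction modes can be pictured as two different request bodies sent to the Gemini API. The following is only a minimal sketch, not the plugin's actual implementation: the function name `build_request_body` is hypothetical, and it assumes the public Gemini REST `generateContent` request shape, where text mode sends text already extracted by a PDF parser and vision mode sends the PDF itself as base64-encoded inline data.

```python
import base64

# Assumption: the model named in this document, called through the public REST endpoint.
MODEL = "gemini-2.0-flash"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent"
)


def build_request_body(prompt, *, pdf_text=None, pdf_bytes=None):
    """Build a generateContent body for text mode (pdf_text) or vision mode (pdf_bytes).

    Hypothetical helper for illustration; the plugin may build its request differently.
    """
    if (pdf_text is None) == (pdf_bytes is None):
        raise ValueError("pass exactly one of pdf_text or pdf_bytes")
    if pdf_text is not None:
        # Text mode: raw text extracted by a PDF parser goes next to the prompt.
        parts = [{"text": prompt}, {"text": pdf_text}]
    else:
        # Vision mode: the PDF file itself is attached inline, base64-encoded.
        encoded = base64.b64encode(pdf_bytes).decode("ascii")
        parts = [
            {"text": prompt},
            {"inline_data": {"mime_type": "application/pdf", "data": encoded}},
        ]
    return {"contents": [{"parts": parts}]}
```

In this sketch the prompt always comes first, so the same prompt can be reused unchanged when switching between the two modes.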

Need help?

For technical support, contact tech@argos-labs.com



You can search all operations.

Input (Required)

  • API Key

  • PDF document (with path)

  • Prompt

Advanced Input (Optional)

  • Text mode (default option): best for clean digital PDFs with selectable text

  • Vision mode: handles scanned or noisy/dirty documents via OCR

Model Used

  • gemini-2.0-flash

Return Value

  • Value in CSV format
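
Because the return value is CSV text, a downstream step can load it with a standard CSV parser. The sketch below is illustrative only: it assumes the returned CSV starts with a header row, and the column names and sample values are hypothetical, not real plugin output.

```python
import csv
import io


def parse_csv_result(csv_text):
    """Parse CSV text with a header row into a list of dictionaries."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample in the spirit of an invoice prompt; not actual plugin output.
sample = "item_name,quantity,supply_amount\nWidget,2,2000\nBolt,10,500\n"
rows = parse_csv_result(sample)
for row in rows:
    print(row["item_name"], row["quantity"], row["supply_amount"])
```

All values come back as strings, so numeric columns need an explicit conversion such as `int(row["quantity"])` before any arithmetic.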

Return Code

  • 0 Success

  • 1 misc. errors
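
The return code convention above (0 for success, 1 for any other error) can be sketched as a thin wrapper around the extraction step. The name `run_with_return_code` and the `task` callable are hypothetical and stand in for whatever the plugin does internally.

```python
import sys


def run_with_return_code(task):
    """Run task() and map the outcome to 0 (success) or 1 (miscellaneous error)."""
    try:
        # On success, write the result (for example, CSV text) to standard output.
        sys.stdout.write(task())
        return 0
    except Exception as exc:
        # Any failure is reported on standard error and collapsed into code 1.
        print(f"Error: {exc}", file=sys.stderr)
        return 1
```

A caller would typically stop the scenario or branch to error handling whenever the returned code is not 0.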



Parameter Setting Sample

gemini.png

Example Result

Prompt: Please provide the item name, item details, quantity, retail price, supply unit price, and supply amount for each item.

Result:

gemini_result.png