LLM - ChatGPT and Gemini to "see" images and then take action

Good morning!

Is it possible to connect the ChatGPT or Gemini API with Google Drive to:

  1. Retrieve images from a specific folder.
  2. Analyze their content using AI.
  3. Determine their proper sequence based on relevance.
  4. Place them in a Google Docs file in the correct order using AI.

The key point here (I think): the action shouldn't be a pre-made action but a prompted action.

P.S.: Every file will be slightly different, so I am not able to create one template.
E.g.: The LLM reads the images based on my prompt and, after working out the order, moves and saves the images in Google Docs according to the organization described in my prompt.
 

ArshilAhmad

Moderator
Staff member
Hi @Guilherme Oliveira,

Currently, it's not possible to analyze images using the OpenAI or Gemini action steps in Pabbly Connect. Please try using the "Google Cloud Vision: Detect Text in Images" action step to extract the text from each image, and then pass that text to an OpenAI or Gemini action step. Let us know the results after you've tried this.

[Attachment: 1739293762561.png]
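
For anyone trying the text-extraction route, here is a rough Python sketch of the ordering step that would sit between the OCR output and the model's reply. It is not a Pabbly feature or anything from the steps above: the function names (`build_ordering_prompt`, `parse_order`, `order_images`) and the `ask_model` callable are hypothetical placeholders, and `ask_model` stands in for whatever OpenAI or Gemini call you actually use. It assumes you already have the extracted text for each image.

```python
import json
from typing import Callable


def build_ordering_prompt(ocr_text: dict[str, str], instruction: str) -> str:
    """Combine the user's own instruction with the text extracted from each image."""
    lines = [instruction.strip(), "", "Images (file name: extracted text):"]
    for name, text in ocr_text.items():
        lines.append(f"- {name}: {text.strip()}")
    lines.append("")
    lines.append("Reply with only a JSON array of the file names in the desired order.")
    return "\n".join(lines)


def parse_order(reply: str, known_names: list[str]) -> list[str]:
    """Read the JSON array from the model's reply.

    Unknown or repeated names are ignored, and any file the model left out
    is appended at the end so that no image is silently dropped.
    """
    start, end = reply.find("["), reply.rfind("]")
    if start == -1 or end <= start:
        return list(known_names)  # no array found: keep the original order
    try:
        proposed = json.loads(reply[start:end + 1])
    except json.JSONDecodeError:
        return list(known_names)
    ordered = []
    for name in proposed:
        if name in known_names and name not in ordered:
            ordered.append(name)
    ordered += [name for name in known_names if name not in ordered]
    return ordered


def order_images(
    ocr_text: dict[str, str],
    instruction: str,
    ask_model: Callable[[str], str],  # hypothetical: wraps your OpenAI/Gemini call
) -> list[str]:
    """Ask the model for an order and return a cleaned-up list of file names."""
    prompt = build_ordering_prompt(ocr_text, instruction)
    return parse_order(ask_model(prompt), list(ocr_text))
```

Because the instruction is passed in as plain text, the ordering stays "prompted" rather than fixed by a template. Keep in mind that this only sees whatever text the OCR step extracted, so images with no text in them would give the model nothing to work with.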


 

Fagun Shah

Well-known member
This is not possible with Pabbly-type automation software.

It may be possible with some Google Drive plugins or extensions, or with Apps Script.
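
If you go the scripting route, the last step (placing the images into a Google Doc in a chosen order) could look roughly like the Python sketch below. It only builds the request list for the Google Docs API `documents.batchUpdate` method, using `insertInlineImage` and `insertText` requests that append to the end of the document body. The helper name `build_image_requests` is hypothetical, and the sketch assumes you already have the image URLs in the final order and that each URL is reachable by Google's servers.

```python
def build_image_requests(image_uris: list[str]) -> list[dict]:
    """Build Docs API batchUpdate requests that append images in the given order.

    Each image is appended at the end of the document body, followed by a
    newline so that the next image starts in its own paragraph.
    """
    requests = []
    for uri in image_uris:
        requests.append(
            {"insertInlineImage": {"uri": uri, "endOfSegmentLocation": {}}}
        )
        requests.append(
            {"insertText": {"text": "\n", "endOfSegmentLocation": {}}}
        )
    return requests
```

With the `google-api-python-client` library, a list like this would typically be sent as `service.documents().batchUpdate(documentId=doc_id, body={"requests": requests}).execute()`, where `service` and `doc_id` are your own authorized Docs client and document ID. Appending to the end keeps the images in list order; inserting each one at a fixed index near the top would reverse them.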
 