Gemini AI

Description

This module calls the Google Gemini API for text generation and image analysis. It supports two operation modes:

Chat mode (chat): Sends a text prompt to the model and receives a generated response. It is a simple call without history or memory.
Image mode (image): Sends a text prompt along with a base64-encoded image for the model to analyze. Useful for image description, OCR, visual classification, etc.

The module:

Obtains the credentials (apiKey) from the credentials system.
Validates that prompt is present and, in image mode, that image_base64 is also provided.
Builds the payload with text parts and optionally the image.
Sends the request to the Gemini REST API.
Returns the complete model response.
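
The validation and payload steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the module's actual source: the function name build_payload and the error messages are hypothetical, and the image part uses inlineData with mimeType image/jpeg as described in the Notes section.

```python
from typing import Any, Dict, List, Optional


def build_payload(prompt: str, mode: str = "chat",
                  image_base64: Optional[str] = None) -> Dict[str, Any]:
    """Validate inputs and build a generateContent body (hypothetical helper)."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if mode not in ("chat", "image"):
        raise ValueError(f"unsupported mode: {mode}")

    # Every request carries the text prompt as its first part.
    parts: List[Dict[str, Any]] = [{"text": prompt}]

    if mode == "image":
        if not image_base64:
            raise ValueError("image_base64 is required in image mode")
        # The image travels inline, base64-encoded, next to the text part.
        parts.append(
            {"inlineData": {"mimeType": "image/jpeg", "data": image_base64}}
        )

    return {"contents": [{"parts": parts}]}
```

In chat mode the result contains a single text part. In image mode a second part with the inline image is appended.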

Unlike the agentChat module, this module does not maintain history or memory between calls. It is ideal for one-off generation or analysis tasks.

Configuration

Parameters (Chat Mode)

Parameter	Type	Required	Description
credentials_id	credentials	Yes	Google Gemini API credentials (apiKey).
prompt	textarea	Yes	Instructions or question for the model. Supports {{variable}} variables.
model	select	No	Gemini model to use. Options: gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash. Default: gemini-2.0-flash.

Additional Parameters (Image Mode)

Parameter	Type	Required	Description
image_base64	text	Yes	Base64-encoded image or {{image}} variable.
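
For reference, one way to produce a value for image_base64 outside the platform is Python's standard base64 module. The helper names below are hypothetical. Since the module sends the image as image/jpeg, this sketch assumes the input is a JPEG file.

```python
import base64
from pathlib import Path


def encode_image_bytes(raw: bytes) -> str:
    """Return the base64 representation of raw image bytes as ASCII text."""
    return base64.b64encode(raw).decode("ascii")


def encode_image_file(path: str) -> str:
    """Read an image file (assumed JPEG) and return it base64-encoded."""
    return encode_image_bytes(Path(path).read_bytes())
```

The returned string can be passed directly as image_base64 or stored in the variable that the parameter references.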

Credentials

credentials_id is required. It must reference a credentials object containing apiKey, a Google Gemini API key obtainable from https://aistudio.google.com/app/apikey.

Output

{
  "nextModule": "siguiente-nodo",
  "data": {
    "candidates": [
      {
        "content": {
          "parts": [
            { "text": "Response generated by the model" }
          ],
          "role": "model"
        },
        "finishReason": "STOP"
      }
    ],
    "usageMetadata": {
      "promptTokenCount": 10,
      "candidatesTokenCount": 50,
      "totalTokenCount": 60
    }
  }
}
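
Because the response is returned unparsed, a downstream step usually has to dig the generated text out of candidates. A minimal sketch, assuming the output shape shown above; extract_text is a hypothetical helper name, not part of the module.

```python
from typing import Any, Dict


def extract_text(output: Dict[str, Any]) -> str:
    """Join the text parts of the first candidate; return '' if there are none."""
    candidates = output.get("data", {}).get("candidates", [])
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    # A candidate may contain several parts; keep only those that carry text.
    return "".join(part.get("text", "") for part in parts)
```

Returning an empty string instead of raising keeps the helper safe to call on responses that contain no candidates.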

Usage Example

Basic case - Text

{
  "credentials_id": "credencial-gemini",
  "mode": "chat",
  "model": "gemini-2.0-flash",
  "prompt": "Explain the advantages of automation in 3 points"
}

Basic case - Image

{
  "credentials_id": "credencial-gemini",
  "mode": "image",
  "model": "gemini-2.0-flash",
  "prompt": "Describe what is in this image",
  "image_base64": "{{imagen_capturada}}"
}

API Used

Google Gemini API: POST https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent
Authentication via X-goog-api-key header.
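
The endpoint and header above can be exercised with Python's standard library. This is a sketch under stated assumptions rather than the module's implementation: build_request and API_BASE are hypothetical names, and the payload is assumed to be a generateContent body with a contents list.

```python
import json
import urllib.request
from typing import Any, Dict

# Base URL taken from the endpoint documented above.
API_BASE = "https://generativelanguage.googleapis.com/v1beta/models"


def build_request(api_key: str, model: str,
                  payload: Dict[str, Any]) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to generateContent."""
    url = f"{API_BASE}/{model}:generateContent"
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The API key is sent in a header, not in the URL.
            "X-goog-api-key": api_key,
        },
    )
```

The prepared request can then be sent with urllib.request.urlopen and the response body decoded with json.load. Keeping construction separate from sending makes the request easy to inspect without network access.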

Notes

This module makes stateless calls. For conversations with history, use the agentChat module.
Image mode sends the image as inlineData with mimeType image/jpeg.
The response is returned as-is from the Gemini API, without additional parsing.
Available models are: gemini-2.0-flash (fast and economical), gemini-1.5-pro (more capable), gemini-1.5-flash (balance between speed and capability).
Common errors: invalid apiKey, image too large for the context, model unavailable.
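
When one of these errors occurs, the HTTP response body normally explains the cause. The sketch below assumes the usual Google API error envelope, a JSON object with an error field containing code, message and status. describe_error is a hypothetical helper, and how the module itself surfaces errors is not specified here.

```python
import json
from typing import Optional


def describe_error(status: int, body: str) -> str:
    """Build a readable message from an HTTP status code and error body."""
    message: Optional[str] = None
    try:
        parsed = json.loads(body)
        # Assumed envelope: {"error": {"code": ..., "message": ..., "status": ...}}
        if isinstance(parsed, dict) and isinstance(parsed.get("error"), dict):
            message = parsed["error"].get("message")
    except json.JSONDecodeError:
        # Non-JSON bodies (for example proxy error pages) carry no usable details.
        pass
    return f"Gemini API error {status}: {message or 'no details provided'}"
```

The helper deliberately avoids mapping status codes to specific causes, since that mapping is not documented in this section.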

Related Modules

agentChat - Conversational AI Agent (with history and memory)
openaiAccess - OpenAI API (alternative using OpenAI models)
openaiImages - OpenAI Generate Image (image generation)