Workflow Engine - General Operation
1. What is a Workflow
A workflow is a sequence of connected nodes that process data in an orderly fashion. Each node performs a specific action (HTTP call, logical decision, data transformation, SQL query, email sending, etc.) and passes the result to the next node in the chain.
The execution flow is directed: data enters through an initial node (trigger), is transformed and enriched as it passes through each node, and finally reaches a terminal node or stops due to a specific condition.
Workflows allow automating business processes without the need to write code, connecting predefined modules through a visual editor interface.
2. Engine Architecture
The engine uses a queue-based execution model. The general process is as follows:
Entry Point
The main function is:

```javascript
processWorkflowV1(workflowName, workflowId, queryParams, webHook)
```

Parameters:
- `workflowName`: Identifier name of the workflow to execute.
- `workflowId`: Unique identifier of the execution. If not provided, it is generated automatically.
- `queryParams`: Initial parameters injected as input data.
- `webHook`: HTTP response object (when the workflow is started by webhook).
Initialization Sequence
- Loading the definition: the `workflows` table is queried to obtain the `workflow_json` column, which contains the entire workflow structure (nodes, connections, configurations).
- WorkflowId generation: a unique identifier (UUID) is generated for this particular execution, allowing each execution to be tracked independently.
- Loading memory variables: the `memorystorage` table is queried to obtain the persistent variables associated with the workflow.
- Execution context initialization: a context object is built that includes the `nodeLabels` mapping (readable labels for each node), the step history (`steps`), memory variables, and the `workflowId`.
3. Data Flow Between Nodes
Each module (node) receives three arguments and returns an object with the result:
Module Input
```javascript
module(data, config, context)
```

- data: The output of the previous node. This is the main data that the current node must process. In the case of the first node, it corresponds to the initial trigger data.
- config: The node configuration. Contains the parameters defined by the user in the visual editor (for example, the URL of an HTTP request, the SQL query, the conditions of a decision, etc.).
- context: The execution context. Includes:
  - `workflowId`: Unique identifier for this execution.
  - `steps`: Array with the accumulated execution history (results from previous nodes).
  - `nodeLabels`: Mapping of node identifiers to their readable labels.
  - `memoryVariables`: Persistent variables loaded from `memorystorage`.
Module Output
```javascript
{
  nextModule: "next_node_id", // or array, or null
  data: { ... },              // output data
  error: null,                // error object if failed
  _meta_: { ... }             // optional metadata
}
```

- nextModule: Determines which node executes next.
- data: The output data that becomes the input of the next node.
- error: Error object in case of failure (null if successful).
- _meta_: Optional module metadata (execution time, flags, etc.).
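As an illustration of this contract, here is a toy module in the `(data, config, context)` shape. The transformation itself is invented for the example; only the input/output shape and the `EMPTY_INPUT_DATA` error type come from this document.

```javascript
// A toy "uppercase" module following the (data, config, context) contract.
function uppercaseModule(data, config, context) {
  if (data == null) {
    // Matches the EMPTY_INPUT_DATA error type documented in section 7.
    return { nextModule: null, data: null, error: { type: "EMPTY_INPUT_DATA" } };
  }
  return {
    nextModule: config.nextModule || null, // where the engine should go next
    data: { text: String(data.text).toUpperCase() },
    error: null,
    _meta_: { workflowId: context.workflowId },
  };
}

const result = uppercaseModule(
  { text: "hello" },
  { nextModule: "end_5" },
  { workflowId: "abc-123", steps: [], nodeLabels: {}, memoryVariables: {} }
);
console.log(result.data.text); // "HELLO"
```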
4. Main Execution Loop
The engine uses a FIFO (First In, First Out) queue called `moduleQueue` to manage the execution order of nodes.
Loop Algorithm
```
1. Initialize moduleQueue with the start node
2. While moduleQueue is not empty:
   a. Take the next element from the queue
   b. Resolve dynamic variables in the node configuration
   c. Load the module code corresponding to the node type
   d. Execute the module with (data, config, context)
   e. Store the result in the workflowstates table
   f. Add the result to the context steps array
   g. Determine the next module(s)
   h. Insert the next module(s) into the queue
3. End of execution
```

Dynamic Variable Resolution
Before executing each node, the engine resolves the dynamic variables present in the configuration. These variables allow referencing data from previous nodes or memory variables. The typical format is `{{variableName}}` or references to previous steps like `{{steps.previous_node.field}}`.
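A minimal sketch of such a resolver, assuming a regex-based substitution (the `resolveVariables` name and the exact lookup rules are illustrative, not the engine's actual implementation):

```javascript
// Replace {{...}} placeholders in a config string using the execution context.
// Supports {{steps.<nodeId>.<field>}} and plain {{variableName}} lookups.
function resolveVariables(template, context) {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, path) => {
    const parts = path.split(".");
    let value;
    if (parts[0] === "steps") {
      // Look up the referenced step by node id, then drill into its data.
      const step = context.steps.find((s) => s.nodeId === parts[1]);
      value = step ? step.data[parts[2]] : undefined;
    } else {
      // Plain names are resolved against memory variables.
      value = context.memoryVariables[path];
    }
    return value !== undefined ? String(value) : match; // leave unresolved as-is
  });
}

const context = {
  steps: [{ nodeId: "http_2", data: { userId: 42 }, status: "success" }],
  memoryVariables: { apiKey: "secret" },
};
console.log(resolveVariables("/users/{{steps.http_2.userId}}", context)); // "/users/42"
```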
State Storage
After each node execution, the result is persisted in the `workflowstates` table with information such as:
- Workflow and node identifier
- Execution status (success, error)
- Input and output data
- Response time in milliseconds
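Putting the loop together, a heavily simplified end-to-end sketch is shown below. The module registry, the in-memory `workflowstates` array, and the `runWorkflow` function are all illustrative stand-ins for the real engine and database; dynamic variable resolution is omitted.

```javascript
// Minimal module registry (stand-in for real module loading, step c).
const modules = {
  start: (data, config) => ({ nextModule: config.nextModule, data, error: null }),
  end: () => ({ nextModule: null, data: null, error: null }),
};

const workflowstates = []; // in-memory stand-in for the workflowstates table

function runWorkflow(definition, initialData, context) {
  const moduleQueue = [{ nodeId: definition.start, data: initialData }];
  while (moduleQueue.length > 0) {
    const { nodeId, data } = moduleQueue.shift();     // a. take next element
    const node = definition.modules[nodeId];
    const config = node.config;                       // b. (resolution omitted)
    const module = modules[node.type];                // c. load module code
    const result = module(data, config, context);     // d. execute
    workflowstates.push({                             // e. persist state
      workflow_id: context.workflowId,
      node_id: nodeId,
      status: result.error ? "error" : "success",
      data: result.data,
    });
    context.steps.push({ nodeId, data: result.data }); // f. history
    const next = result.nextModule;                    // g./h. enqueue next
    for (const n of [].concat(next || [])) {
      moduleQueue.push({ nodeId: n, data: result.data });
    }
  }
  return context;
}

const definition = {
  start: "start_1",
  modules: {
    start_1: { type: "start", config: { nextModule: "end_2" } },
    end_2: { type: "end", config: {} },
  },
};
const finalCtx = runWorkflow(definition, { hello: true }, { workflowId: "abc", steps: [] });
console.log(finalCtx.steps.map((s) => s.nodeId)); // [ 'start_1', 'end_2' ]
```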
5. Next Node Resolution
The `nextModule` value returned by each node determines how the flow continues:
Linear Flow (nextModule is a string)
The node returns a single identifier. Execution continues sequentially to the indicated node.

```javascript
nextModule: "http_2"
```

Parallel Branches (nextModule is an array)
The node returns multiple identifiers. Each one is added to the queue and executed in order.

```javascript
nextModule: ["branch_a_1", "branch_b_1"]
```

Decision Nodes (truePath / falsePath)
Decision-type nodes evaluate a condition and return the corresponding path:
- If the condition is true: follows `truePath`
- If the condition is false: follows `falsePath`

```javascript
nextModule: condition ? config.truePath : config.falsePath
```

End of Branch (nextModule is null)
When `nextModule` is `null`, the current branch ends. If there are other elements in the queue, execution continues with them.

```javascript
nextModule: null
```

Workflow Stop (_stopflowprocess)
When a node returns the `_stopflowprocess` flag, the workflow execution stops completely and the HTTP response is returned to the client. This flag is used for webhook-initiated workflows that need to respond immediately.
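Putting these cases together, next-node resolution can be sketched as a single enqueue step (the `enqueueNext` helper is an assumption for illustration, not the engine's actual code):

```javascript
// Decide what goes into the FIFO queue based on a module's result.
function enqueueNext(moduleQueue, result) {
  if (result._stopflowprocess) {
    moduleQueue.length = 0;             // workflow stop: empty the queue
    return moduleQueue;
  }
  const next = result.nextModule;
  if (next == null) return moduleQueue; // end of branch: nothing to enqueue
  if (Array.isArray(next)) {
    moduleQueue.push(...next);          // parallel branches, in order
  } else {
    moduleQueue.push(next);             // linear flow
  }
  return moduleQueue;
}

console.log(enqueueNext([], { nextModule: ["branch_a_1", "branch_b_1"] }));
// [ 'branch_a_1', 'branch_b_1' ]
console.log(enqueueNext(["x"], { _stopflowprocess: true }));
// []
```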
6. Workflow JSON Structure
The workflow definition is stored as JSON in the `workflow_json` column of the `workflows` table. The general structure is as follows:
```json
{
  "id": "123",
  "name": "my_workflow",
  "start": "start_1",
  "active_schedule": true,
  "schedule": "0 9 * * *",
  "modules": {
    "start_1": {
      "type": "start",
      "config": { "label": "Start", "nextModule": "http_2" }
    },
    "http_2": {
      "type": "http",
      "config": {
        "label": "Query API",
        "method": "GET",
        "url": "https://api.example.com/data",
        "headers": {},
        "nextModule": "decision_3"
      }
    },
    "decision_3": {
      "type": "decision",
      "config": {
        "label": "Check result",
        "conditions": [],
        "truePath": "mail_4",
        "falsePath": "end_5"
      }
    },
    "mail_4": {
      "type": "sendmail",
      "config": {
        "label": "Send notification",
        "to": "admin@example.com",
        "subject": "Result",
        "nextModule": "end_5"
      }
    },
    "end_5": {
      "type": "end",
      "config": { "label": "End" }
    }
  }
}
```

Main Fields
| Field | Description |
|---|---|
| `id` | Unique workflow identifier |
| `name` | Workflow name (used as reference in URL or scheduling) |
| `start` | Initial node identifier |
| `active_schedule` | Whether the workflow has active scheduling (true/false) |
| `schedule` | Cron expression for scheduled execution |
| `modules` | Object with all workflow nodes, indexed by their identifier |
Module Fields
| Field | Description |
|---|---|
| `type` | Module type (start, http, decision, sendmail, sql, etc.) |
| `config` | Module-specific configuration |
| `config.label` | Visible node label in the editor |
| `config.nextModule` | Next node to execute |
7. Error Handling
The engine implements several mechanisms to manage errors during execution.
Automatic Retries
By default, each node has up to 3 retries in case of failure. The engine retries the node execution before marking it as definitively failed.
continueOnError Flag
When a node has the `continueOnError` flag enabled, the workflow execution continues to the next node even if the current node has failed. This is useful for non-critical nodes where a failure should not stop the entire process.
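Taken together, retries and `continueOnError` can be sketched like this. The retry limit of 3 comes from the text above; the `runWithRetries` helper and its exact structure are assumptions (real modules are likely asynchronous, but the sketch uses synchronous calls for brevity):

```javascript
// Execute a module with up to maxRetries attempts; on final failure,
// honor the continueOnError flag from the node configuration.
function runWithRetries(module, data, config, context, maxRetries = 3) {
  let lastError = null;
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return module(data, config, context); // success: return immediately
    } catch (err) {
      lastError = err; // failed attempt: retry until attempts run out
    }
  }
  if (config.continueOnError) {
    // Non-critical node: record the error but let the flow continue.
    return { nextModule: config.nextModule || null, data, error: lastError };
  }
  throw lastError; // critical node: propagate the failure
}

// A module that always fails, on a node marked continueOnError.
const failing = () => { throw new Error("boom"); };
const r = runWithRetries(failing, { a: 1 }, { continueOnError: true, nextModule: "n2" }, {});
console.log(r.nextModule); // "n2"
```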
Error Types
| Error Type | Description |
|---|---|
| `EMPTY_INPUT_DATA` | The node received empty or null input data |
| `EMPTY_RETURN_DATA` | The node did not return output data |
| `UNKNOWN_ERROR` | Unclassified or unexpected error |
Error Logging
All errors are recorded in the `workflowstates` table along with:
- The error type
- The descriptive message
- The input data that caused the failure
- The timestamp of the occurrence
This allows diagnosing problems by reviewing the execution history of each node.
8. Tracking and Traces
The engine provides two tracking mechanisms to monitor and debug executions.
workflowstates Table
Records the execution of each individual node. Each row contains:
| Field | Description |
|---|---|
| `workflow_id` | Unique execution identifier |
| `node_id` | Identifier of the executed node |
| `status` | Execution status (success, error) |
| `data` | Node output data |
| `response_time_ms` | Node execution time in milliseconds |
| `created_at` | Date and time of execution |
workflow_traces Table
Stores detailed traces with different severity levels:
| Level | Usage |
|---|---|
| `TRACE` | Low-level information, detailed execution flow |
| `INFO` | General informational events |
| `WARN` | Warnings that do not stop execution |
| `ERROR` | Errors that affect node or workflow execution |
| `DEBUG` | Debugging information for development |
Steps Array
During execution, the context maintains a `steps` array that accumulates the result of each executed node. This array is available to all subsequent nodes, allowing access to data from any previous node in the flow.
```javascript
context.steps = [
  { nodeId: "start_1", data: { ... }, status: "success" },
  { nodeId: "http_2", data: { ... }, status: "success" },
  // ... more steps
]
```

9. Special Flow Control Flags
The engine recognizes several special flags that alter the normal execution behavior.
_stopflowprocess
Completely stops the workflow execution and returns an HTTP response to the client. It is primarily used in webhook-initiated workflows where an immediate response needs to be sent.
```javascript
{
  nextModule: null,
  data: { result: "ok" },
  _stopflowprocess: true
}
```

When this flag is active, the execution queue is emptied and the node data is sent as an HTTP response.
_respondata
Allows sending a response to the HTTP client without stopping the workflow execution. The workflow continues processing in the background while the client has already received their response.
```javascript
{
  nextModule: "next_node",
  data: { ... },
  _respondata: { message: "Received, processing..." }
}
```

_iterating
Indicates that the current node is in batch iteration mode. The engine executes the same node (or subset of nodes) multiple times, once for each element in a data array.
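A rough sketch of what iteration mode amounts to: run the same module once per array element and accumulate one entry per iteration. The `iterateBatch` helper name is an assumption; only the `_batchSteps` shape follows the documentation.

```javascript
// Run one module once per element of a data array, collecting results
// in the _batchSteps shape that is later handed to the merge node.
function iterateBatch(module, items, config, context) {
  const _batchSteps = items.map((item, iteration) => ({
    iteration,
    data: module(item, config, context).data,
  }));
  return { _batchSteps };
}

// Toy module that doubles a value (invented for the example).
const double = (data) => ({ nextModule: null, data: { value: data.value * 2 }, error: null });
const out = iterateBatch(double, [{ value: 1 }, { value: 2 }], {}, {});
console.log(out._batchSteps.length); // 2
```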
_batchSteps
Contains the accumulated results of each iteration when using batch iteration mode. When all iterations are complete, this array is passed to the merge node to consolidate the results.
```javascript
{
  _batchSteps: [
    { iteration: 0, data: { ... } },
    { iteration: 1, data: { ... } },
    // ... one entry for each iterated element
  ]
}
```

10. Flow Diagram
The following ASCII diagram shows a typical example of a workflow execution with a decision:
```
         +----------+
         |  START   |
         +----+-----+
              |
              v
         +----------+
         |  Node 1  |
         |  (HTTP)  |
         +----+-----+
              |
              v
         +-----------+
        /  Decision   \
       /  (condition)  \
       +-------+--------+
          |         |
     true |         | false
          v         v
  +-----------+  +----------+
  |  Node 2   |  |  Node 3  |
  | (SendMail)|  | (Logger) |
  +-----+-----+  +----+-----+
        |             |
        v             v
  +----------------------+
  |        MERGE         |
  +----------+-----------+
             |
             v
        +----------+
        |   END    |
        +----------+
```

Flow Description
- START: The trigger initiates execution and provides the initial data.
- Node 1 (HTTP): Makes an HTTP request to an external API and passes the response to the next node.
- Decision: Evaluates a condition on the received data. If true, continues down the left branch (true); if false, down the right branch (false).
- Node 2 (SendMail): Sends an email with the obtained data.
- Node 3 (Logger): Logs the data for auditing.
- MERGE: Consolidates the parallel branches into a single flow.
- END: Finishes the workflow execution.