AI Workflow Automation Patterns using MindsDB's Jobs


The concept of scheduling jobs to automate tasks is familiar from operating systems: Linux offers cron jobs and Windows provides the Task Scheduler. Similarly, MindsDB, as an AI workflow automation platform, enables users to schedule jobs.

With MindsDB, you can schedule jobs to automate the entire AI workflow in your production environment. In this article, you will learn about the most common use case patterns.

Automate AI Workflows with MindsDB

MindsDB provides the CREATE JOB statement that creates and deploys a job. Users can automate virtually any AI and data-related tasks. By setting such automation in place, you can easily account for dynamic data and use it as part of your feature or application.

Let’s go over the custom SQL syntax provided by MindsDB to schedule a job.

CREATE JOB job_name (
   <mindsdb_sql_query_1>;
   <mindsdb_sql_query_2>
)
START <date>
END <date>
EVERY [number] <period>;

First, the job name is defined, followed by one or more SQL statements that will be executed by the job. Finally, the schedule is set: you can define optional start and end dates and the periodicity of the job.
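As a concrete illustration, here is a minimal job that copies fresh rows into a table once an hour; the integration and table names (my_integration, fresh_data, raw_data) are placeholders, not part of the syntax:

CREATE JOB copy_fresh_data (
   INSERT INTO my_integration.fresh_data (
      SELECT * FROM my_integration.raw_data
   )
)
START '2024-01-01 00:00:00'
END '2024-12-31 00:00:00'
EVERY 1 hour;

If the START date is omitted, the job starts right away; if the END date is omitted, the job keeps running on schedule until it is removed with DROP JOB.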

Check out more examples in the MindsDB documentation.

Common Use Case Patterns

Dynamic Data

It is very common to have dynamic data that is updated every day or even every minute. In such cases, it is recommended that you regularly retrain your AI/ML model with the new data to improve its accuracy and performance. With MindsDB, you can create a job to do it for you.

The AI workflow includes the following:

  1. Retrain or fine-tune your AI/ML model with the updated data.

  2. Query for predictions and save them into a data table using INSERT INTO, UPDATE, or CREATE TABLE commands.

You can automate this AI workflow by scheduling a job.

CREATE JOB retrain_model_and_save_predictions (

   RETRAIN my_model
   FROM data_source (SELECT * FROM training_data)
   USING
      join_learn_process = true;

   INSERT INTO my_integration.my_table (
      SELECT m.predicted_value
      FROM my_model AS m
      JOIN data_source.training_data AS d
   )
)
EVERY 2 days;

This job will run once every two days. Every time it is executed, it will use the new training data to retrain the model. Since the retraining process is time-consuming, the join_learn_process parameter is used to ensure that the INSERT INTO command will not execute until the retraining process is completed. Once the model is retrained, it is queried for updated predictions that are saved into a data table.
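After creating a job, you can verify that it runs as expected. In MindsDB, the jobs table of the current project and the log database expose job metadata and run history; the exact queries below assume the current MindsDB syntax:

SELECT * FROM jobs
WHERE name = 'retrain_model_and_save_predictions';

SELECT * FROM log.jobs_history
WHERE name = 'retrain_model_and_save_predictions';

When the job is no longer needed, DROP JOB retrain_model_and_save_predictions; removes it.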

Data Pipeline

A data pipeline is a set of data processing elements connected in series, where the output of one element is the input to the next. MindsDB enables you to create and automate an AI data pipeline that fetches data from a data source, processes it with one or more AI models to get the desired output, and saves the output to a data source. As mentioned before, a data source may be any relational or non-relational database, data warehouse, or application. Find out more about available data source integrations in the MindsDB documentation.

The AI workflow includes the following:

  1. Get input data from a data source.

  2. Use an AI model, known as an AI table, to process input data and make predictions.

  3. Save predictions into a data source or set up a notification system for your email or Slack.

You can automate this AI workflow by scheduling a job.

CREATE JOB save_predictions (

   CREATE TABLE my_integration.`result_{{START_DATETIME}}` (
      SELECT m.predicted_value
      FROM input_data_table AS d
      JOIN ai_table AS m
   )
)
EVERY 7 days;

The inner SELECT statement gets the input data from input_data_table and joins it with ai_table. The predictions are saved into a newly created data table. Please note that my_integration is a data source that must be connected to MindsDB with a user having write access. A table is created in this data source every time the job executes. The {{START_DATETIME}} variable ensures table name uniqueness as it is replaced with the current timestamp of the job run.
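Besides {{START_DATETIME}}, MindsDB jobs also expose a {{PREVIOUS_START_DATETIME}} variable, which is replaced with the start time of the previous run. This lets each run process only the rows that arrived since the last execution. A sketch, with placeholder table and column names (input_data_table, created_at):

CREATE JOB save_new_predictions (
   INSERT INTO my_integration.results (
      SELECT m.predicted_value
      FROM input_data_table AS d
      JOIN ai_table AS m
      WHERE d.created_at > '{{PREVIOUS_START_DATETIME}}'
   )
)
EVERY 1 hour;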

Instead of creating a table to save predictions, you could set up a notification system. By connecting your email, Slack, or any other application integration that MindsDB supports, you can send predictions straight to your inbox or channel.
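For example, with MindsDB's Slack integration connected (here under the assumed name my_slack, whose channels table accepts channel and text columns), a job could post predictions to a channel; the table and model names are placeholders:

CREATE JOB notify_slack (
   INSERT INTO my_slack.channels (channel, text)
      SELECT 'alerts' AS channel, m.predicted_value AS text
      FROM input_data_table AS d
      JOIN ai_table AS m
)
EVERY 1 day;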

This flow enables you to create various applications, including alert systems and chatbots. Here are sample AI workflows:

  1. Get input data from Twitter, Gmail, or any other app or data source integrated with MindsDB.

  2. Use an AI model to create desired output for alert systems or responses for chatbots.

  3. Send the AI model’s output back to Twitter, Gmail, Slack, or any other app or data source integrated with MindsDB.
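The chatbot flow above can be sketched as a single job. The example below assumes MindsDB's Twitter integration connected as my_twitter, whose tweets table exposes id, text, and in_reply_to_tweet_id columns, and a hypothetical reply_model; check the integration docs for the exact column names before using it:

CREATE JOB reply_to_tweets (
   INSERT INTO my_twitter.tweets (in_reply_to_tweet_id, text)
      SELECT t.id AS in_reply_to_tweet_id, m.response AS text
      FROM my_twitter.tweets AS t
      JOIN reply_model AS m
      WHERE t.created_at > '{{PREVIOUS_START_DATETIME}}'
)
EVERY 1 hour;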

Thanks to its numerous app and data source integrations, MindsDB lets you mix and match components to create and automate your own custom AI workflows.

Develop and Automate AI Workflows with Ease

MindsDB enables users to develop and automate AI workflows easily. You can integrate data from over 100 data sources, including popular databases, like PostgreSQL, MySQL, MS SQL Server, and MongoDB, and applications, such as Slack, Twitter, Shopify, YouTube, and more. Moreover, you can choose from over 15 ML frameworks, including OpenAI, Nixtla, LangChain, LlamaIndex, and more, to create and deploy AI models with a single command.

Take a hands-on approach to exploring MindsDB by installing it locally via pip or Docker. If you plan to use MindsDB for production systems, consider MindsDB Starter, which provides managed instances, ensuring greater security and scalability for your projects.

Whichever option you choose, MindsDB provides flexibility and ensures a smooth experience for all your AI projects.