MCP Tools

Tools exposed by the OpenHEXA MCP server

OpenHEXA v0.0.1 · Protocol 2025-03-26 · 25 tools

list_connections · 1 param

List connections (external data sources) configured in a workspace. Returns each connection's name, slug, type (S3, GCS, POSTGRESQL, DHIS2, IASO, CUSTOM), and fields. Field values are hidden so that they cannot be used directly. Connections are passed as parameters when running pipelines: use the connection slug as the parameter value.

Parameters

workspace_slug (string, required)

list_datasets · 3 params

List datasets in a workspace. Returns dataset summaries. Use get_dataset with the dataset slug to get full details including versions and files.

Parameters

workspace_slug (string, required)
page (integer, optional)
per_page (integer, optional)
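The page and per_page parameters recur across the list_* tools. A minimal sketch of walking all pages follows. It assumes a hypothetical fetch_page callable that wraps the tool call and returns one page of items as a list, that page numbering starts at 1, and that a page shorter than per_page marks the end. None of these are stated in this reference.

```python
from typing import Callable, Iterator


def iter_pages(
    fetch_page: Callable[[int, int], list],
    per_page: int = 10,
    first_page: int = 1,  # assumption: the first page is numbered 1
) -> Iterator:
    """Yield items page by page until a short or empty page is returned."""
    page = first_page
    while True:
        items = fetch_page(page, per_page)
        yield from items
        # A page with fewer items than requested is taken as the last one.
        if len(items) < per_page:
            break
        page += 1
```

If the server reports a total page count in its response, stopping on that value would be more direct than the short-page heuristic used here.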

get_dataset · 4 params

Get full details of a dataset: metadata, permissions, all versions with their files, and the latest version's file list. Use a file 'id' from the response with preview_dataset_file to see sample data. Use the dataset 'id' with create_dataset_version to add a new version.

Parameters

workspace_slug (string, required)
dataset_slug (string, required)
versions_page (integer, optional)
versions_per_page (integer, optional)

preview_dataset_file · 1 param

Preview the content of a dataset file by its ID (from get_dataset's file list). Returns a sample of the data for tabular files (CSV, Parquet, etc.), file properties, and metadata. The sample status can be PROCESSING (still generating), FINISHED (sample ready), or FAILED.

Parameters

file_id (string, required)
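Since a sample can still be PROCESSING when first requested, a caller will usually want to poll. The sketch below assumes a hypothetical fetch_status callable that calls preview_dataset_file and extracts the status string from the response. Where that field sits in the response is not specified in this reference.

```python
import time
from typing import Callable


def wait_for_sample(
    fetch_status: Callable[[], str],
    max_attempts: int = 10,
    delay_seconds: float = 2.0,
) -> str:
    """Poll until the sample status leaves PROCESSING or attempts run out.

    fetch_status is a hypothetical wrapper around preview_dataset_file
    that returns the sample status string.
    """
    status = "PROCESSING"
    for attempt in range(max_attempts):
        status = fetch_status()
        if status in ("FINISHED", "FAILED"):
            return status
        # Sleep only between attempts, not after the last one.
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    return status
```

A return value of PROCESSING means the attempts ran out before the sample was ready, so the caller can decide whether to retry later.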

create_dataset · 4 params

Create a new dataset in a workspace with an initial version (v1) containing the provided files. The files_json parameter is a JSON array of {uri, contentType, content} objects, e.g. '[{"uri": "data.csv", "contentType": "text/csv", "content": "a,b\n1,2"}]'. Use create_dataset_version to add more versions later.

Parameters

workspace_slug (string, required)
name (string, required)
files_json (string, required)
description (string, optional)
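Because files_json is a string containing JSON rather than a nested object, it is easy to get the escaping wrong by hand (for example the newline inside the CSV content). One way to build it is to serialize a Python list with json.dumps. The helper name, workspace slug, and dataset name below are illustrative assumptions.

```python
import json


def build_files_json(files: list[dict]) -> str:
    """Serialize dataset files to the JSON array string expected by files_json."""
    required = ("uri", "contentType", "content")
    for entry in files:
        missing = [key for key in required if key not in entry]
        if missing:
            raise ValueError(f"file entry is missing keys: {missing}")
    return json.dumps(files)


files_json = build_files_json(
    [{"uri": "data.csv", "contentType": "text/csv", "content": "a,b\n1,2"}]
)

# Arguments for a create_dataset call; slug and name are hypothetical.
arguments = {
    "workspace_slug": "my-workspace",
    "name": "Example dataset",
    "files_json": files_json,
}
```

The same string format applies to the optional files_json parameter of create_dataset_version.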

create_dataset_version · 4 params

Create a new version of a dataset with optional inline files. Requires the dataset ID (from get_dataset or create_dataset) and a version name (e.g. 'v1', '2024-01'). Optionally provide a changelog describing what changed. To include files, provide files_json as a JSON array of {uri, contentType, content} objects, e.g. '[{"uri": "data.csv", "contentType": "text/csv", "content": "a,b\n1,2"}]'.

Parameters

dataset_id (string, required)
name (string, required)
changelog (string, optional)
files_json (string, optional)

list_files · 4 params

List files and directories in a workspace bucket. Use prefix to browse subdirectories (e.g. 'data/'). Returns file name, path, type (file/directory), size, and last update. Use read_file with the file path to read text content.

Parameters

workspace_slug (string, required)
prefix (string, optional)
page (integer, optional)
per_page (integer, optional)

read_file · 4 params

Read the content of a text file from a workspace bucket. Only works for UTF-8 text files up to 1 MB (includes .py, .csv, .json, .ipynb, .sql, .txt, etc.). Use list_files first to check file size and path. For Jupyter notebooks (.ipynb), the content is JSON that you can parse to read/modify cells. Use start_line and end_line (1-indexed, inclusive) to read a specific range of lines instead of the entire file.

Parameters

workspace_slug (string, required)
file_path (string, required)
start_line (integer, optional)
end_line (integer, optional)
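The 1-indexed, inclusive convention for start_line and end_line differs from Python's 0-indexed, end-exclusive slices. The local sketch below illustrates that convention only. It is not the server's implementation, and select_lines is a hypothetical helper name.

```python
def select_lines(text: str, start_line: int, end_line: int) -> str:
    """Return lines start_line..end_line, counting from 1 and including both ends."""
    if start_line < 1 or end_line < start_line:
        raise ValueError("expected 1 <= start_line <= end_line")
    lines = text.splitlines()
    # Shift the start by one for 0-indexing; the exclusive slice end
    # already matches the inclusive end_line.
    return "\n".join(lines[start_line - 1 : end_line])


sample = "alpha\nbeta\ngamma\ndelta"
selected = select_lines(sample, 2, 3)  # "beta" and "gamma"
```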

write_file · 3 params

Write text content to a new file in a workspace bucket. Fails if the file already exists. Maximum 1 MB. For Jupyter notebooks, provide valid .ipynb JSON content. Requires createObject permission on the workspace.

Parameters

workspace_slug (string, required)
file_path (string, required)
content (string, required)
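A pre-flight size check can avoid a rejected write. The sketch below assumes the 1 MB limit is measured on the UTF-8 encoded bytes and reads "1 MB" as 10**6 bytes, the stricter of the two common readings. The reference does not state either point.

```python
MAX_BYTES = 1_000_000  # assumption: "1 MB" taken as 10**6 bytes


def fits_write_limit(content: str, limit: int = MAX_BYTES) -> bool:
    """Check the UTF-8 encoded size, which can exceed the character count."""
    return len(content.encode("utf-8")) <= limit
```

Counting encoded bytes rather than characters matters for non-ASCII text, where one character can take several bytes.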

get_help · 1 param

Call this tool when you are stuck, unsure what to do next, or need guidance on OpenHEXA. Provide a reason describing why you need help (e.g. 'unsure which tool to use', 'pipeline failed', 'cannot find dataset').

Parameters

reason (string, optional)

list_pipelines · 3 params

List pipelines in a workspace. Returns pipeline summaries (id, code, name, description). Use get_pipeline with the pipeline code to get full details including source code and run history.

Parameters

workspace_slug (string, required)
page (integer, optional)
per_page (integer, optional)

get_pipeline · 4 params

Get full details of a pipeline: metadata, schedule, permissions, current version source code with all files, parameters, and recent runs. Use the returned 'id' field when calling run_pipeline or update_pipeline. Use a run 'id' from the runs list with get_pipeline_run to inspect outputs and logs.

Parameters

workspace_slug (string, required)
pipeline_code (string, required)
runs_page (integer, optional)
runs_per_page (integer, optional)

get_pipeline_run · 1 param

Get detailed information about a specific pipeline run. Returns status, configuration used, messages (warnings/errors), outputs (files, database tables), and execution logs. Use this after run_pipeline to check results, or to inspect any run from get_pipeline's runs list.

Parameters

run_id (string, required)

run_pipeline · 2 params

Run a pipeline. Requires the pipeline UUID (from get_pipeline's 'id' field) and a JSON config string mapping parameter codes to values. Check the pipeline's parameters with get_pipeline first to see required parameters and their types. Example config: '{"param1": "value1", "param2": 42}'. Returns the created run's ID — use get_pipeline_run to monitor progress and get results.

Parameters

pipeline_id (string, required)
config (string, optional)
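The config parameter is a JSON string, not an object, so the mapping has to be serialized before the call. A minimal sketch follows. The parameter codes, connection slug, and UUID are placeholders, and real parameter codes come from get_pipeline.

```python
import json

# Hypothetical parameter codes; check get_pipeline for the real ones.
config = {
    "source_connection": "my-dhis2",  # a connection slug, as from list_connections
    "limit": 42,
}

arguments = {
    "pipeline_id": "00000000-0000-0000-0000-000000000000",  # placeholder UUID
    "config": json.dumps(config),
}
```

Serializing with json.dumps keeps numbers as numbers and strings as strings, matching the parameter types declared by the pipeline.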

update_pipeline · 4 params

Update a pipeline's properties. Provide the pipeline UUID (from get_pipeline's 'id' field) and any fields to change. For schedule, use a CRON expression (minute hour day-of-month month day-of-week), e.g. '0 6 * * 1' for Mondays at 6AM, '0 */2 * * *' for every 2 hours. Pass schedule='none' to disable scheduling. Only provided non-empty fields are updated.

Parameters

pipeline_id (string, required)
name (string, optional)
description (string, optional)
schedule (string, optional)
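To make the five-field order concrete, the sketch below splits a CRON expression into named fields. It only checks the field count and does not validate ranges or step syntax. split_cron is a hypothetical helper, and the special value 'none' should be handled before calling it.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron(expression: str) -> dict:
    """Map a five-field CRON expression to named fields (no range validation)."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


monday_6am = split_cron("0 6 * * 1")
every_2_hours = split_cron("0 */2 * * *")
```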

create_pipeline · 5 params

Create a new pipeline in the current workspace. Optionally upload Python source code as the first version (v1). Always provide a meaningful description summarizing what the pipeline does. If the pipeline has no clear purpose or is blank, use "" as the description. Only name, description, and functional_type are supported at creation time. If source_code is omitted, the pipeline is created without any version. The source_code must follow this structure:

    from openhexa.sdk import current_run, pipeline


    @pipeline("Simple ETL")
    def simple_etl():
        count = task_1()
        task_2(count)


    @simple_etl.task
    def task_1():
        current_run.log_info("In task 1...")
        return 42


    @simple_etl.task
    def task_2(count):
        current_run.log_info(f"In task 2... count is {count}")


    if __name__ == "__main__":
        simple_etl()

Parameters

workspace_slug (string, required)
name (string, required)
description (string, optional)
functional_type (string, optional)
source_code (string, optional)

create_pipeline_version · 5 params

Upload a new version of an existing pipeline. Requires the workspace slug, the pipeline code (from get_pipeline), and the Python source code for the new version. Optionally provide a version name and description. The version number is auto-incremented. The source_code must follow the OpenHEXA SDK structure (use @pipeline and @task decorators). Use get_pipeline first to read the current source code, then modify it and pass it here. Returns the created version details including id, version number, and parsed parameters.

Parameters

workspace_slug (string, required)
pipeline_code (string, required)
source_code (string, required)
name (string, optional)
description (string, optional)

list_pipeline_templates · 3 params

List available pipeline templates. Optionally filter by search query. Templates are reusable pipeline blueprints. Workflow: list_pipeline_templates -> get_pipeline_template (to review code) -> create_pipeline_from_template (to instantiate in a workspace).

Parameters

search (string, optional)
page (integer, optional)
per_page (integer, optional)

get_pipeline_template · 1 param

Get full details of a pipeline template including its description, config, version history, and the current version's source code and parameters. Use the currentVersion.id as the template_version_id when calling create_pipeline_from_template.

Parameters

template_code (string, required)

create_pipeline_from_template · 2 params

Create a new pipeline in a workspace from a template version. Use get_pipeline_template first to find the template_version_id (the currentVersion.id). The new pipeline will have the template's code, parameters, and configuration pre-configured.

Parameters

workspace_slug (string, required)
template_version_id (string, required)
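The hand-off from get_pipeline_template to create_pipeline_from_template can be sketched as below. It assumes the template response has been parsed into a dict in which currentVersion.id appears as a nested key. The helper name and sample values are hypothetical.

```python
def template_to_pipeline_args(template: dict, workspace_slug: str) -> dict:
    """Build create_pipeline_from_template arguments from a template response."""
    version = template.get("currentVersion")
    if not version or not version.get("id"):
        raise ValueError("template has no current version to instantiate")
    return {
        "workspace_slug": workspace_slug,
        "template_version_id": version["id"],
    }


# Hypothetical, trimmed-down template response.
template = {"code": "example-template", "currentVersion": {"id": "version-123"}}
arguments = template_to_pipeline_args(template, "my-workspace")
```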

list_static_webapps · 3 params

List static web apps in a workspace. The returned URL can be used to access each webapp in a browser.

Parameters

workspace_slug (string, required)
page (integer, optional)
per_page (integer, optional)

create_static_webapp · 4 params

Create a static web app in a workspace. Provide files_json as a JSON array of {path, content} objects, e.g. '[{"path": "index.html", "content": "<html>...</html>"}, {"path": "style.css", "content": "body { ... }"}]'. An index.html file is required at minimum. Returns the webapp URL to access it in a browser.

Parameters

workspace_slug (string, required)
name (string, required)
files_json (string, required)
description (string, optional)
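For web apps, files_json uses {path, content} objects rather than the {uri, contentType, content} shape used for datasets. A small sketch that builds the string from a path-to-content mapping and enforces the index.html requirement locally follows. The helper name and file contents are illustrative.

```python
import json


def build_webapp_files_json(files: dict[str, str]) -> str:
    """Turn a {path: content} mapping into the files_json array string."""
    if "index.html" not in files:
        raise ValueError("an index.html file is required")
    return json.dumps(
        [{"path": path, "content": content} for path, content in files.items()]
    )


webapp_files_json = build_webapp_files_json(
    {
        "index.html": "<html><body>Hello</body></html>",
        "style.css": "body { margin: 0; }",
    }
)
```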

update_static_webapp · 4 params

Update an existing static web app. Provide the webapp UUID (from list_static_webapps) and any fields to change. To update files, provide files_json as a JSON array of {path, content} objects — this replaces all files. Only provided non-empty fields are updated.

Parameters

webapp_id (string, required)
files_json (string, optional)
name (string, optional)
description (string, optional)
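Since only non-empty fields are applied, a caller can drop unset values before building the request so the arguments reflect exactly what will change. The helper below is a hypothetical sketch of that idea, and the UUID is a placeholder.

```python
def build_update_args(webapp_id: str, **fields) -> dict:
    """Keep only fields that carry a value, alongside the required webapp_id."""
    arguments = {"webapp_id": webapp_id}
    for key, value in fields.items():
        # None and empty strings would not be applied, so leave them out.
        if value not in (None, ""):
            arguments[key] = value
    return arguments


update_args = build_update_args(
    "00000000-0000-0000-0000-000000000000",  # placeholder UUID
    name="Renamed app",
    description="",
    files_json=None,
)
```

Note that when files_json is included it replaces all files, so it should carry the complete file set rather than only the changed files.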

list_workspaces · 3 params

List workspaces accessible to the current user. Optionally filter by name using the query parameter. This is typically the first tool to call to discover available workspaces before accessing pipelines, datasets, or files.

Parameters

query (string, optional)
page (integer, optional)
per_page (integer, optional)

get_workspace · 1 param

Get details of a specific workspace by its slug. Returns workspace metadata and permissions. Use list_workspaces first if you don't know the slug.

Parameters

slug (string, required)