Arcaflow Getting Started Guide
Running Workflows
An Arcaflow workflow is a definition of steps structured together to perform complex actions. Workflows are defined as machine-readable YAML and therefore can be version-controlled and shared easily to run in different environments. A workflow is a way of encapsulating and sharing expertise and ensuring reproducible results.
The requirements for running a workflow are simple. You just need the Arcaflow engine binary, a workflow definition file, and, typically, an input file. You can also provide a config file, which allows for setting workflow defaults, such as log levels. The contents of the input and configuration files, like the workflow file, are YAML. Finally, you need an appropriate container platform, such as Podman, Docker, or Kubernetes, as the target of the workflow execution.
Note
The default container platform for the Arcaflow engine is Podman. To use another platform, a configuration file is required.
A repository of example workflows is available for reference and practice. Let’s try running the basic example.
First we will clone the example workflows repository:
git clone https://github.com/arcalot/arcaflow-workflows.git
Then we will run the workflow, setting the workflow directory as the context, and defining the workflow, configuration, and input files to use:
arcaflow --context arcaflow-workflows/basic-examples/basic/ \
--workflow workflow.yaml --config config.yaml --input input.yaml
Arcaflow will display logs, the level of detail of which is determined by the configuration file, and then will return the machine-readable output of the workflow in YAML format:
output_data:
  example:
    message: Hello, Arcalot!
output_id: success
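For scripted runs, the same invocation can be assembled programmatically. The following is a minimal sketch, assuming the arcaflow binary is on your PATH and that the engine writes its result to standard output. The helper names build_arcaflow_command and run_workflow are hypothetical, and only the flags shown above are used.

```python
import subprocess
from typing import List, Optional


def build_arcaflow_command(
    context: str,
    workflow: str = "workflow.yaml",
    config: Optional[str] = None,
    input_file: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for an Arcaflow run (hypothetical helper)."""
    cmd = ["arcaflow", "--context", context, "--workflow", workflow]
    if config is not None:
        cmd += ["--config", config]
    if input_file is not None:
        cmd += ["--input", input_file]
    return cmd


def run_workflow(cmd: List[str]) -> str:
    """Run the engine and return whatever it wrote to standard output.

    Raises subprocess.CalledProcessError on a non-zero exit status.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    command = build_arcaflow_command(
        "arcaflow-workflows/basic-examples/basic/",
        config="config.yaml",
        input_file="input.yaml",
    )
    # Print the command instead of running it, so the sketch works
    # even where the engine is not installed.
    print(" ".join(command))
```

Keeping the command construction separate from its execution makes the argument list easy to inspect or log before anything is run.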
It’s that simple! And the basics of running a workflow are the same, whether it’s this single-step hello-world example:
flowchart LR
%% Success path
steps.example.deploy-->steps.example.starting
steps.example.running-->steps.example.outputs
steps.example.starting-->steps.example.running
steps.example.starting-->steps.example.starting.started
steps.example.disabled-->steps.example.disabled.output
steps.example.outputs-->steps.example.outputs.success
steps.example.enabling-->steps.example.enabling.resolved
steps.example.enabling-->steps.example.starting
steps.example.enabling-->steps.example.disabled
steps.example.outputs.success-->outputs.success
input-->steps.example.starting
steps.example.cancelled-->steps.example.outputs
… or a much more complex workflow like this stress-ng plus PCP data collection example:
%% Mermaid markdown workflow
flowchart LR
%% Success path
steps.pcp.enabling-->steps.pcp.disabled
steps.pcp.enabling-->steps.pcp.enabling.resolved
steps.pcp.enabling-->steps.pcp.starting
steps.stressng.disabled-->steps.stressng.disabled.output
steps.stressng.cancelled-->steps.stressng.outputs
steps.pre_wait.cancelled-->steps.pre_wait.outputs
steps.pcp.outputs.success-->outputs.success
steps.pcp.disabled-->steps.pcp.disabled.output
steps.uuidgen.outputs.success-->outputs.success
steps.uuidgen.outputs-->steps.uuidgen.outputs.success
steps.pre_wait.running-->steps.pre_wait.outputs
steps.pcp.starting-->steps.pcp.starting.started
steps.pcp.starting-->steps.pcp.running
steps.pcp.starting.started-->steps.pre_wait.starting
steps.pcp.running-->steps.pcp.outputs
steps.uuidgen.starting-->steps.uuidgen.starting.started
steps.uuidgen.starting-->steps.uuidgen.running
steps.pre_wait.disabled-->steps.pre_wait.disabled.output
steps.stressng.deploy-->steps.stressng.starting
steps.pcp.outputs-->steps.pcp.outputs.success
steps.stressng.outputs-->steps.stressng.outputs.success
steps.stressng.outputs-->steps.post_wait.starting
steps.post_wait.cancelled-->steps.post_wait.outputs
steps.stressng.outputs.success-->outputs.success
steps.pre_wait.enabling-->steps.pre_wait.enabling.resolved
steps.pre_wait.enabling-->steps.pre_wait.starting
steps.pre_wait.enabling-->steps.pre_wait.disabled
steps.post_wait.outputs-->steps.post_wait.outputs.success
steps.post_wait.outputs-->steps.pcp.cancelled
steps.uuidgen.disabled-->steps.uuidgen.disabled.output
steps.uuidgen.cancelled-->steps.uuidgen.outputs
steps.pcp.cancelled-->steps.pcp.outputs
steps.stressng.enabling-->steps.stressng.starting
steps.stressng.enabling-->steps.stressng.disabled
steps.stressng.enabling-->steps.stressng.enabling.resolved
steps.stressng.running-->steps.stressng.outputs
steps.post_wait.starting-->steps.post_wait.starting.started
steps.post_wait.starting-->steps.post_wait.running
steps.uuidgen.deploy-->steps.uuidgen.starting
steps.uuidgen.running-->steps.uuidgen.outputs
steps.post_wait.deploy-->steps.post_wait.starting
steps.stressng.starting-->steps.stressng.starting.started
steps.stressng.starting-->steps.stressng.running
steps.pcp.deploy-->steps.pcp.starting
steps.post_wait.disabled-->steps.post_wait.disabled.output
steps.post_wait.running-->steps.post_wait.outputs
steps.pre_wait.outputs-->steps.pre_wait.outputs.success
steps.pre_wait.outputs-->steps.stressng.starting
steps.uuidgen.enabling-->steps.uuidgen.enabling.resolved
steps.uuidgen.enabling-->steps.uuidgen.starting
steps.uuidgen.enabling-->steps.uuidgen.disabled
steps.pre_wait.starting-->steps.pre_wait.starting.started
steps.pre_wait.starting-->steps.pre_wait.running
steps.post_wait.enabling-->steps.post_wait.enabling.resolved
steps.post_wait.enabling-->steps.post_wait.starting
steps.post_wait.enabling-->steps.post_wait.disabled
steps.pre_wait.deploy-->steps.pre_wait.starting
input-->steps.stressng.starting
input-->steps.pcp.starting
Learn more about running workflows »
Writing Workflows
As a workflow author, you determine the steps of the workflow, how data will pass between the steps, what input is required from the workflow user, and what output will be returned. It is possible to build very complex workflows with data translations, sub-workflows, parallelism and serialization, and multiple output paths.
Let’s start with something simple. Our workflow will collect a nickname input from the user and will pass that input to an example “Hello world!” step. The workflow will also run a UUID generation step in parallel to the example step, and it will return both the UUID and a greeting.
In the first part of the workflow.yaml file we will define the workflow compatibility version and the input schema for the workflow. In the input schema, we are expecting only a single input called nickname with a type of string.
version: v0.2.0
input:
  root: RootObject
  objects:
    RootObject:
      id: RootObject
      properties:
        nickname: #<<== Input key name
          display:
            description: Just a name
            name: Name
          required: true
          type:
            type_id: string #<<== Input value type
...
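To illustrate what this schema asks of the workflow user, here is a minimal sketch of the two checks it implies: the nickname key must be present, and its value must be a string. This is not the engine's validation code. The INPUT_PROPERTIES layout and the validate_input helper are assumptions made for illustration.

```python
from typing import Any, Dict, List

# The nickname property from the schema above, reduced to the two fields
# this sketch checks. The dict layout is an illustration only.
INPUT_PROPERTIES: Dict[str, Dict[str, Any]] = {
    "nickname": {"required": True, "type_id": "string"},
}

# Maps schema type IDs to Python types (only "string" is needed here).
PYTHON_TYPES = {"string": str}


def validate_input(data: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input passes."""
    problems: List[str] = []
    for key, spec in INPUT_PROPERTIES.items():
        if key not in data:
            if spec["required"]:
                problems.append(f"missing required field: {key}")
            continue
        expected = PYTHON_TYPES[spec["type_id"]]
        if not isinstance(data[key], expected):
            problems.append(f"{key} must be of type {spec['type_id']}")
    return problems


print(validate_input({"nickname": "Arcalot"}))
print(validate_input({}))
```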
Next we will define the steps of the workflow. The steps are to be deployed as container images, where the src field defines the image and tag. The arcaflow-plugin-utilities plugin has multiple steps available, so we indicate with the step: uuid field which step we want to run. The arcaflow-plugin-example plugin has only one step, so the step field is not required. The uuidgen step requires no input, so we pass an empty object {} to it. The example plugin requires an input object of name with _type and nick fields. We statically set the value of the _type field in the input object of the step, and then we use the Arcaflow expression language to reference the workflow input value for nickname as the input to the plugin’s nick field.
...
steps:
  uuidgen: #<<== Step name
    plugin:
      deployment_type: image
      src: quay.io/arcalot/arcaflow-plugin-utilities:0.6.0 #<<== Container image
    step: uuid #<<== Specific plugin step
    input: {} #<<== Step does not require input
  example: #<<== Step name
    plugin:
      deployment_type: image
      src: quay.io/arcalot/arcaflow-plugin-example:0.5.0 #<<== Container image
    input:
      name:
        _type: nickname #<<== Statically-defined input
        nick: !expr $.input.nickname #<<== Referenced workflow input
...
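Conceptually, an expression such as $.input.nickname is a path into the workflow's data. The sketch below resolves simple dotted paths against nested dictionaries to show how the step input ends up populated. It is a deliberate simplification and not the Arcaflow expression language itself. The resolve_path helper is hypothetical.

```python
from typing import Any, Dict


def resolve_path(expression: str, data: Dict[str, Any]) -> Any:
    """Resolve a dotted path such as "$.input.nickname" against nested dicts."""
    if not expression.startswith("$."):
        raise ValueError(f"expected a path starting with '$.': {expression}")
    current: Any = data
    for key in expression[2:].split("."):
        if not isinstance(current, dict) or key not in current:
            raise KeyError(f"{key!r} not found while resolving {expression}")
        current = current[key]
    return current


# Workflow input as supplied by the user.
workflow_data = {"input": {"nickname": "Arcalot"}}

# The example step's input: one static value and one resolved reference.
step_input = {
    "name": {
        "_type": "nickname",
        "nick": resolve_path("$.input.nickname", workflow_data),
    }
}
print(step_input)
```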
Finally we define the outputs that we expect when the workflow succeeds. The workflow will run until all of the items referenced in the success sub-object become available or until a step that one of the items depends on fails. Once all of the items referenced in the success sub-object become available, the engine will terminate any steps which have not yet completed, and it will return the success sub-object. If a required step fails, then the workflow will fail.
Tip
It is possible to define multiple sub-objects for outputs with different dependencies. In this case, the output sub-object that has its dependencies satisfied first will be the one that returns and ends the workflow. See the documentation for more info.
...
outputs:
  success:
    uuid: !expr $.steps.uuidgen.outputs.success.uuid
    example: !expr $.steps.example.outputs.success.message
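The rule that the first output with all of its dependencies satisfied ends the workflow can be sketched as a small simulation. This is only an illustration of the rule as described here, not the engine's scheduler. The first_satisfied_output helper and the partial output used in the example are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def first_satisfied_output(
    outputs: Dict[str, Set[str]],
    completed_in_order: Iterable[str],
) -> Optional[str]:
    """Return the ID of the first output whose dependencies are all available.

    outputs maps an output ID to the set of step outputs it references.
    completed_in_order lists step outputs in the order they become available.
    """
    available: Set[str] = set()
    for item in completed_in_order:
        available.add(item)
        for output_id, dependencies in outputs.items():
            # An output is ready once every item it references is available.
            if dependencies <= available:
                return output_id
    return None


outputs = {
    "success": {
        "steps.uuidgen.outputs.success",
        "steps.example.outputs.success",
    },
    # Hypothetical second output that only needs the UUID step.
    "partial": {"steps.uuidgen.outputs.success"},
}
print(first_satisfied_output(outputs, ["steps.uuidgen.outputs.success"]))
```

With both outputs defined, the hypothetical partial output is returned as soon as the UUID step finishes, because its single dependency is satisfied before the success output's two.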
Our final workflow looks like this:
version: v0.2.0
input:
  root: RootObject
  objects:
    RootObject:
      id: RootObject
      properties:
        nickname:
          display:
            description: Just a name
            name: Name
          required: true
          type:
            type_id: string
steps:
  uuidgen:
    plugin:
      deployment_type: image
      src: quay.io/arcalot/arcaflow-plugin-utilities:0.6.0
    step: uuid
    input: {}
  example:
    plugin:
      deployment_type: image
      src: quay.io/arcalot/arcaflow-plugin-example:0.5.0
    input:
      name:
        _type: nickname
        nick: !expr $.input.nickname
outputs:
  success:
    uuid: !expr $.steps.uuidgen.outputs.success.uuid
    example: !expr $.steps.example.outputs.success.message
We will create an input file to satisfy the input schema of the workflow:
nickname: Arcalot
We will also create a configuration file, setting the container deployer to Podman and the log levels to error:
log:
  level: error
logged_outputs:
  error:
    level: error
deployers:
  image:
    deployer_name: podman
And now we can run our new workflow:
Tip
The default workflow file is workflow.yaml, so we don’t need to specify it here explicitly.
arcaflow --config config.yaml --input input.yaml
output_data:
  example: Hello, Arcalot!
  uuid: b98909c2-4a25-4cc1-8222-3290b0621129
output_id: success
Learn more about workflow concepts »
Learn more about writing workflows »
Did you know?
Arcaflow provides Mermaid markdown in the workflow debug output that allows you to quickly visualize the workflow in a graphic format. You can grab the Mermaid graph you see in the output and put it into a Mermaid editor.
flowchart LR
steps.uuidgen.enabling-->steps.uuidgen.starting
steps.uuidgen.enabling-->steps.uuidgen.disabled
steps.uuidgen.enabling-->steps.uuidgen.enabling.resolved
steps.uuidgen.outputs-->steps.uuidgen.outputs.success
steps.example.cancelled-->steps.example.outputs
steps.example.outputs.success-->outputs.success
input-->steps.example.starting
steps.uuidgen.disabled-->steps.uuidgen.disabled.output
steps.uuidgen.outputs.success-->outputs.success
steps.example.disabled-->steps.example.disabled.output
steps.example.running-->steps.example.outputs
steps.example.enabling-->steps.example.enabling.resolved
steps.example.enabling-->steps.example.starting
steps.example.enabling-->steps.example.disabled
steps.example.starting-->steps.example.starting.started
steps.example.starting-->steps.example.running
steps.example.deploy-->steps.example.starting
steps.uuidgen.deploy-->steps.uuidgen.starting
steps.example.outputs-->steps.example.outputs.success
steps.uuidgen.cancelled-->steps.uuidgen.outputs
steps.uuidgen.running-->steps.uuidgen.outputs
steps.uuidgen.starting-->steps.uuidgen.starting.started
steps.uuidgen.starting-->steps.uuidgen.running
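Because each line of the graph is a plain source-->target edge, the debug output is also easy to inspect with a script. The following sketch parses such lines into an adjacency list and looks up which nodes feed a given node. The parse_edges and predecessors helpers are hypothetical, and the sample is a subset of the edges shown above.

```python
from collections import defaultdict
from typing import Dict, List


def parse_edges(mermaid: str) -> Dict[str, List[str]]:
    """Build an adjacency list from 'source-->target' lines."""
    graph: Dict[str, List[str]] = defaultdict(list)
    for line in mermaid.splitlines():
        line = line.strip()
        # Skip the flowchart header, comments, and blank lines.
        if "-->" not in line or line.startswith("%%"):
            continue
        source, target = line.split("-->", 1)
        graph[source.strip()].append(target.strip())
    return dict(graph)


def predecessors(graph: Dict[str, List[str]], node: str) -> List[str]:
    """Return the sorted list of nodes with an edge pointing at node."""
    return sorted(src for src, targets in graph.items() if node in targets)


SAMPLE = """
flowchart LR
steps.uuidgen.outputs-->steps.uuidgen.outputs.success
steps.uuidgen.outputs.success-->outputs.success
steps.example.outputs-->steps.example.outputs.success
steps.example.outputs.success-->outputs.success
input-->steps.example.starting
"""

graph = parse_edges(SAMPLE)
print(predecessors(graph, "outputs.success"))
```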
Running Plugins
Workflow steps are run via plugins, which are delivered as containers. The Arcalot community maintains an ever-growing list of official plugins, which are version-controlled and hosted in our Quay.io repository.
Plugins are designed to run independent of an Arcaflow workflow. All plugins have schema definitions for their inputs and outputs, and they perform data validation against those schemas when run. Plugins also have one or more steps, and when there are multiple steps we always need to specify which step we want to run.
Tip
Plugin steps are the fundamental building blocks for workflows.
Let’s take a look at the schema for the example plugin. Passing the --schema parameter to the plugin will return the complete schema in YAML format.
podman run --rm quay.io/arcalot/arcaflow-plugin-example --schema
docker run --rm quay.io/arcalot/arcaflow-plugin-example --schema
Here we see the example plugin has one step called hello-world, which has schemas for both its inputs and outputs.
steps:
  hello-world:
    display:
      description: Says hello :)
      name: Hello world!
    id: hello-world
    input:
      objects:
        FullName:
          id: FullName
          properties:
            first_name:
              display:
                name: First name
              examples:
              - '"Arca"'
              required: true
              type:
                min: 1
                pattern: ^[a-zA-Z]+$
                type_id: string
            last_name:
              display:
                name: Last name
              examples:
              - '"Lot"'
              required: true
              type:
                min: 1
                pattern: ^[a-zA-Z]+$
                type_id: string
        InputParams:
          id: InputParams
          properties:
            name:
              display:
                description: Who do we say hello to?
                name: Name
              examples:
              - '{"_type": "fullname", "first_name": "Arca", "last_name": "Lot"}'
              - '{"_type": "nickname", "nick": "Arcalot"}'
              required: true
              type:
                discriminator_field_name: _type
                type_id: one_of_string
                types:
                  fullname:
                    display:
                      name: Full name
                    id: FullName
                    type_id: ref
                  nickname:
                    display:
                      name: Nick
                    id: Nickname
                    type_id: ref
        Nickname:
          id: Nickname
          properties:
            nick:
              display:
                name: Nickname
              examples:
              - '"Arcalot"'
              required: true
              type:
                min: 1
                pattern: ^[a-zA-Z]+$
                type_id: string
      root: InputParams
    outputs:
      error:
        error: false
        schema:
          objects:
            ErrorOutput:
              id: ErrorOutput
              properties:
                error:
                  display: {}
                  required: true
                  type:
                    type_id: string
          root: ErrorOutput
      success:
        error: false
        schema:
          objects:
            SuccessOutput:
              id: SuccessOutput
              properties:
                message:
                  display: {}
                  required: true
                  type:
                    type_id: string
          root: SuccessOutput
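The name input in this schema is a discriminated union: the _type field selects whether the FullName or the Nickname object applies, and each string field must be non-empty and match the pattern ^[a-zA-Z]+$. The sketch below mirrors those rules in plain Python to show how an input would be checked. It is an illustration based on the schema above, not the plugin SDK's validator. The VARIANTS table and the validate_name helper are hypothetical.

```python
import re
from typing import Any, Dict, List

# Same pattern as in the schema: one or more ASCII letters.
NAME_PATTERN = re.compile(r"^[a-zA-Z]+$")

# Required string fields for each value of the _type discriminator,
# taken from the FullName and Nickname objects in the schema above.
VARIANTS: Dict[str, List[str]] = {
    "fullname": ["first_name", "last_name"],
    "nickname": ["nick"],
}


def validate_name(name: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the input passes."""
    variant = name.get("_type")
    if variant not in VARIANTS:
        return [f"unknown _type: {variant!r}"]
    problems: List[str] = []
    for field in VARIANTS[variant]:
        value = name.get(field)
        if not isinstance(value, str) or len(value) < 1:
            problems.append(f"{field} must be a non-empty string")
        elif NAME_PATTERN.fullmatch(value) is None:
            problems.append(f"{field} must contain only letters")
    return problems


print(validate_name({"_type": "nickname", "nick": "Arcalot"}))
print(validate_name({"_type": "nickname", "nick": "Arca Lot"}))
```

The second call reports a problem because the space in the value does not match the letters-only pattern.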
The plugin schema can also be returned in JSON format, in which case you must specify whether to return the input or output schema.
podman run --rm quay.io/arcalot/arcaflow-plugin-example --json-schema output
docker run --rm quay.io/arcalot/arcaflow-plugin-example --json-schema output
{
  "$id": "hello-world",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Hello world! outputs",
  "description": "Says hello :)",
  "oneof": [
    {
      "output_id": {
        "type": "string",
        "const": "success"
      },
      "output_data": {
        "type": "object",
        "properties": {
          "message": {
            "type": "string"
          }
        },
        "required": [
          "message"
        ],
        "additionalProperties": false,
        "dependentRequired": {}
      }
    },
    {
      "output_id": {
        "type": "string",
        "const": "error"
      },
      "output_data": {
        "type": "object",
        "properties": {
          "error": {
            "type": "string"
          }
        },
        "required": [
          "error"
        ],
        "additionalProperties": false,
        "dependentRequired": {}
      }
    }
  ],
  "$defs": {
    "SuccessOutput": {
      "type": "object",
      "properties": {
        "message": {
          "type": "string"
        }
      },
      "required": [
        "message"
      ],
      "additionalProperties": false,
      "dependentRequired": {}
    },
    "ErrorOutput": {
      "type": "object",
      "properties": {
        "error": {
          "type": "string"
        }
      },
      "required": [
        "error"
      ],
      "additionalProperties": false,
      "dependentRequired": {}
    }
  }
}
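One practical use of the JSON form is to discover which output IDs a step can return. The sketch below reads them from the oneof list, using the key names exactly as they appear in the output above. The output_ids helper is hypothetical, and the sample is a trimmed copy of that output.

```python
import json
from typing import List


def output_ids(schema_json: str) -> List[str]:
    """List the output IDs declared in a plugin's JSON output schema."""
    schema = json.loads(schema_json)
    # Each entry of "oneof" pins its output_id to a constant string.
    return [entry["output_id"]["const"] for entry in schema["oneof"]]


SAMPLE = """
{
  "$id": "hello-world",
  "oneof": [
    {"output_id": {"type": "string", "const": "success"}},
    {"output_id": {"type": "string", "const": "error"}}
  ]
}
"""

print(output_ids(SAMPLE))
```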
A plugin takes its input as a file, but because it runs as a container, it looks for the input file in the context of the container. This means you either need to bind-mount the input file to the container, or, as in this example, pipe the input value to the plugin’s file input.
name:
  _type: nickname
  nick: Arcalot
Note
In order to pass the input to the container via redirection or pipe, you must pass the -i, --interactive parameter.
cat input.yaml | podman run -i --rm quay.io/arcalot/arcaflow-plugin-example -f -
podman run --rm -v ${PWD}/input.yaml:/input.yaml:z quay.io/arcalot/arcaflow-plugin-example -f /input.yaml
cat input.yaml | docker run -i --rm quay.io/arcalot/arcaflow-plugin-example -f -
docker run --rm -v ${PWD}/input.yaml:/input.yaml:z quay.io/arcalot/arcaflow-plugin-example -f /input.yaml
output_id: success
output_data:
  message: Hello, Arcalot!
debug_logs: ''
Now let’s generate a UUID with the utilities plugin. This plugin has multiple steps, so we need to specify which step to run. The step also requires no input, so we pass it an empty object.
Note
An input object is always required, even if a plugin step does not require input parameters.
echo '{}' | podman run -i --rm quay.io/arcalot/arcaflow-plugin-utilities -s uuid -f -
echo '{}' | docker run -i --rm quay.io/arcalot/arcaflow-plugin-utilities -s uuid -f -
output_id: success
output_data:
  uuid: bb08484f-6263-4317-9162-be2ae846b438
debug_logs: ''
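The piped invocations above can also be driven from a script. The following is a minimal sketch, assuming Podman or Docker is installed locally. The plugin_command and run_plugin helpers are hypothetical, and they use only the container and plugin flags that appear in the commands above.

```python
import subprocess
from typing import List, Optional


def plugin_command(
    runtime: str, image: str, step: Optional[str] = None
) -> List[str]:
    """Build a container command that reads the plugin input from stdin."""
    # -i keeps stdin open so the input can be piped into the container.
    cmd = [runtime, "run", "-i", "--rm", image]
    if step is not None:
        cmd += ["-s", step]
    # "-f -" tells the plugin to read its input file from stdin.
    cmd += ["-f", "-"]
    return cmd


def run_plugin(cmd: List[str], input_text: str) -> str:
    """Pipe input_text to the plugin and return its standard output."""
    result = subprocess.run(
        cmd, input=input_text, capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    command = plugin_command(
        "podman", "quay.io/arcalot/arcaflow-plugin-utilities", step="uuid"
    )
    # Print rather than execute, so the sketch runs without a container
    # platform; call run_plugin(command, "{}") to execute it.
    print(" ".join(command))
```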
Learn more about plugin schemas »
Writing Plugins
Of course you may have specific needs and want to author your own plugins. To aid with this, we provide SDKs in popular languages. Let’s create a simple hello-world plugin using the Python SDK. We’ll publish the code here, and you can find the details in the Python plugin guide.
#!/usr/local/bin/python3

from dataclasses import dataclass
import sys
from arcaflow_plugin_sdk import plugin


@dataclass
class InputParams:
    name: str


@dataclass
class SuccessOutput:
    message: str


@plugin.step(
    id="hello-world",
    name="Hello world!",
    description="Says hello :)",
    outputs={"success": SuccessOutput},
)
def hello_world(params: InputParams):
    return "success", SuccessOutput(f"Hello, {params.name}")


if __name__ == "__main__":
    sys.exit(
        plugin.run(
            plugin.build_schema(
                hello_world,
            )
        )
    )
Learn more about writing Python plugins »
Next, let’s create a Dockerfile and build a container image:
FROM quay.io/arcalot/arcaflow-plugin-baseimage-python-osbase
ADD plugin.py /
RUN python -m pip install arcaflow_plugin_sdk
ENTRYPOINT ["python", "/plugin.py"]
CMD []
podman build -t example-plugin .
docker build -t example-plugin .
And finally we can run our new plugin.
echo "name: Arca Lot" | podman run -i --rm example-plugin -f -
echo "name: Arca Lot" | docker run -i --rm example-plugin -f -
output_id: success
output_data:
  message: Hello, Arca Lot
debug_logs: ''
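The output document mirrors what the step function returns: the first element of the returned tuple becomes output_id, and the fields of the dataclass become output_data. The sketch below reproduces that mapping with the standard library only, so it runs without the SDK installed. The to_output_document helper is hypothetical and is not how the SDK serializes results internally.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Tuple


@dataclass
class InputParams:
    name: str


@dataclass
class SuccessOutput:
    message: str


def hello_world(params: InputParams) -> Tuple[str, SuccessOutput]:
    """Same body as the plugin step, without the SDK decorator."""
    return "success", SuccessOutput(f"Hello, {params.name}")


def to_output_document(output_id: str, output: Any) -> Dict[str, Any]:
    """Arrange a step result in the shape of the printed output above."""
    return {
        "output_id": output_id,
        "output_data": asdict(output),
        "debug_logs": "",
    }


document = to_output_document(*hello_world(InputParams(name="Arca Lot")))
print(document)
```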
Learn more about packaging plugins »
Next steps
Congratulations, you are now an Arcaflow user! Here are some things you can do next to start working with plugins and workflows:
- See our repositories of community-supported plugins »
- Get our latest plugin container builds from quay.io »
- Experiment with more advanced example workflows »
Keep learning
Hungry for more? Keep digging into our docs: