Basic usage
kedro mermaid generate
Full example
kedro mermaid generate \
--from-inputs=raw_data \
--to-outputs=report \
--nodes=clean_data,train_model,report_results \
--tags=model,reporting \
--namespaces=data,model \
--pipeline=data_science \
--set-node-attr pattern="(?P<category>\w+)__(?P<node>\w+)" \
--set-node-attr params.shape=circle \
--set-graph-attr declaration="flowchart TB" \
--set-graph-attr config.layout=elk \
--set-edge-attr params.arrow='---'
What each option does:
- --from-inputs=raw_data: start from the `raw_data` dataset
- --to-outputs=report: end at the `report` dataset
- --nodes=clean_data,train_model,report_results: include only these nodes
- --tags=model,reporting: include only nodes tagged with `model` or `reporting`
- --namespaces=data,model: include only nodes in the `data` or `model` namespaces
- --pipeline=data_science: target the `data_science` pipeline
- --set-node-attr pattern=...: group nodes by the prefix before `__`
- --set-node-attr params.shape=circle: draw nodes as circles
- --set-graph-attr declaration="flowchart TB": lay the diagram out top-to-bottom
- --set-graph-attr config.layout=elk: use the ELK layout engine
- --set-edge-attr params.arrow='---': draw edges with Mermaid's `---` link, a plain line without an arrowhead
Filter the diagram
Based on datasets, nodes, tags, or namespaces
kedro mermaid generate \
--tags=model,reporting \
--from-inputs=raw_data \
--pipeline=data_science
Only show the nodes in a specific format
For instance, if your nodes are named like data__load_data, data__clean_data, model__train_model, and model__evaluate_model, you can use:
kedro mermaid generate --set-node-attr pattern="(?P<category>\w+)__(?P<node>\w+)"
This groups nodes into subgraphs by the category prefix and labels each one with the text captured by the node group:
- data__load_data: appears in the data subgraph with the label Load Data
- data__clean_data: appears in the data subgraph with the label Clean Data
- model__train_model: appears in the model subgraph with the label Train Model
- model__evaluate_model: appears in the model subgraph with the label Evaluate Model
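To check a pattern before passing it to the CLI, the grouping described above can be reproduced with Python's standard `re` module. This is a minimal sketch, not the plugin's implementation: `group_nodes` and `to_label` are hypothetical helpers, and the title-casing rule is an assumption inferred from the labels listed above.

```python
import re
from collections import defaultdict

# Same pattern as passed to --set-node-attr pattern=...
PATTERN = re.compile(r"(?P<category>\w+)__(?P<node>\w+)")


def to_label(raw: str) -> str:
    """Assumed label rule: underscores become spaces, words are title-cased."""
    return raw.replace("_", " ").title()


def group_nodes(names):
    """Hypothetical helper: map each category to the labels of its nodes."""
    groups = defaultdict(list)
    for name in names:
        match = PATTERN.match(name)
        if match is None:
            continue  # names that do not follow the pattern are skipped here
        groups[match.group("category")].append(to_label(match.group("node")))
    return dict(groups)


names = [
    "data__load_data",
    "data__clean_data",
    "model__train_model",
    "model__evaluate_model",
]
grouped = group_nodes(names)
print(grouped)
```

Because `\w` also matches underscores, the split happens at the double underscore only after the regex engine backtracks, so a name containing several `__` separators would put everything before the last one into the category.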
Limit the granularity of the diagram
If your node names follow a hierarchical pattern such as level1.level2.level3.level4, you can write a regex that keeps the top-level segment as the category and only the next two segments as the label, dropping the deeper levels:
kedro mermaid generate --set-node-attr pattern="(?P<category>[^.]+)(?:[.](?P<node>[^.]+)){2}"
Breakdown of the regex:
- (?P<category>[^.]+): Captures the first segment before the first dot as the category.
- (?:[.](?P<node>[^.]+)){2}: Non-capturing group that matches a dot followed by a segment, repeated twice to capture the next two segments as node.
This pattern will group nodes into subgraphs based on the top-level namespace and label them with the next two segments, effectively reducing the diagram's complexity while retaining meaningful structure. For example:
- data.load.clean.transform: appears in the data subgraph with the label Load Clean
- data.load.clean.aggregate: merged with data.load.clean.transform, since both share the same category and the same node label
- model.train.evaluate.report: appears in the model subgraph with the label Train Evaluate
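The merging behavior can be explored with the standard `re` module as well. The sketch below is not the plugin's code and assumes the pattern is applied from the start of the name (`re.match`). One caveat: in the standard `re` module a repeated named group keeps only its last repetition, so `node` on its own would hold `clean` for the first example. The hypothetical `collapse` helper therefore builds the label from the whole matched prefix; how the plugin assembles its label is not shown here.

```python
import re

# Same pattern as passed to --set-node-attr pattern=...
PATTERN = re.compile(r"(?P<category>[^.]+)(?:[.](?P<node>[^.]+)){2}")


def collapse(names):
    """Hypothetical helper: bucket names by (category, label of next two segments)."""
    merged = {}
    for name in names:
        match = PATTERN.match(name)
        if match is None:
            continue  # fewer than three dot-separated segments
        # match.group(0) is the matched prefix, e.g. "data.load.clean"
        category, *rest = match.group(0).split(".")
        label = " ".join(segment.title() for segment in rest)
        merged.setdefault((category, label), []).append(name)
    return merged


names = [
    "data.load.clean.transform",
    "data.load.clean.aggregate",
    "model.train.evaluate.report",
]
merged = collapse(names)
for (category, label), members in merged.items():
    print(category, "|", label, "|", members)
```

Names that end up in the same bucket are the ones that would be drawn as a single node under this reading of the pattern.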
Style the diagram
Change the layout
kedro mermaid generate --set-graph-attr declaration="flowchart TB"
Use the ELK layout engine
kedro mermaid generate --set-graph-attr config.layout=elk
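In Mermaid itself, a diagram's direction comes from its declaration line (`flowchart TB` is top-to-bottom) and the layout engine can be selected in a YAML frontmatter block. The sketch below shows one way dotted `key=value` attributes such as `config.layout=elk` could map onto that header. It is an illustration only: `parse_attrs` and `render_header` are hypothetical helpers, the `flowchart LR` fallback is an assumed default, and the plugin may emit its configuration differently.

```python
def parse_attrs(pairs):
    """Turn ["config.layout=elk", ...] into a nested dict keyed by the dotted path."""
    attrs = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        *parents, leaf = key.split(".")
        target = attrs
        for part in parents:
            target = target.setdefault(part, {})
        target[leaf] = value
    return attrs


def render_header(attrs):
    """Render a Mermaid header: optional frontmatter config, then the declaration."""
    lines = []
    config = attrs.get("config")
    if config:
        lines.append("---")
        lines.append("config:")
        for key, value in config.items():
            lines.append(f"  {key}: {value}")
        lines.append("---")
    # "flowchart LR" is a hypothetical fallback, not a documented default
    lines.append(attrs.get("declaration", "flowchart LR"))
    return "\n".join(lines)


graph_attrs = parse_attrs(["declaration=flowchart TB", "config.layout=elk"])
header = render_header(graph_attrs)
print(header)
```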