Data-Flow Diagram

FlowStrider takes a Data-Flow Diagram (DFD) of the software system under analysis as input. The DFD representation in FlowStrider follows the STRIDE threat modeling framework, originally developed by Microsoft.

What is a Data-Flow Diagram?

A DFD provides a high-level view of how data moves through a system:

  • where it is processed,

  • where it is stored,

  • and where it enters or leaves the system.

A DFD typically consists of five main elements: processes, data stores, external entities, data flows, and trust boundaries.

Figure: DFD elements visualization

Tip

For a step-by-step tutorial and detailed explanations of each element, see Microsoft Learn – Create a threat model using foundational DFD elements and the Wikipedia article on Data-flow diagrams.

FlowStrider’s Data-Flow Diagram Format

General Structure

To be processed by FlowStrider, a DFD must be provided as a JSON file in FlowStrider’s Data-Flow Diagram format. This format borrows terminology from graph theory:

  • Nodes represent processes, data stores, or external entities.

  • Edges represent data flows (connections) between nodes.

  • Clusters group nodes together, typically to define trust boundaries.

The mapping to STRIDE’s DFD elements is preserved through tags on nodes (explained further below).

The core component of this format is the data class DataflowDiagram, which contains the three element types listed above. The figure below illustrates the format’s overall structure using a simple example, alongside the corresponding DFD visualization.

Figure: JSON structure

Inside the diagram, the objects "nodes": {}, "edges": {}, and "clusters": {} hold the respective entities. The array tags lists the rule sets that FlowStrider should apply when analyzing this diagram. Detailed field descriptions are provided in the tables further below.
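As a rough illustration of the structure described above, the following Python sketch builds a minimal diagram as a dictionary and serializes it to JSON. The field names follow the field reference further below. The identifiers (`example_dfd`, `user`, `web_app`, and so on) are made up, and using each element's ID as its dictionary key is an assumption of this example. The example file in test/resources remains the authoritative reference.

```python
import json

# Minimal diagram: one external entity sending data to one process
# that sits inside a trust boundary. All IDs and names are illustrative.
diagram = {
    "id": "example_dfd",
    "name": "Example DFD",
    "tags": ["stride"],  # rule sets FlowStrider should apply
    "nodes": {
        "user": {
            "id": "user",
            "name": "User",
            "tags": ["STRIDE:Interactor"],
            "attributes": {},
        },
        "web_app": {
            "id": "web_app",
            "name": "Web Application",
            "tags": ["STRIDE:Process"],
            "attributes": {},
        },
    },
    "edges": {
        "request": {
            "id": "request",
            "source_id": "user",
            "sink_id": "web_app",
            "name": "Request",
            "tags": ["STRIDE:Dataflow"],
            "attributes": {},
        },
    },
    "clusters": {
        "internal": {
            "id": "internal",
            "node_ids": ["web_app"],
            "name": "Internal Network",
            "tags": ["STRIDE:TrustBoundary"],
            "attributes": {},
        },
    },
    "attributes": {},
}

# Serialize to a JSON string that could be written to a file.
dfd_json = json.dumps(diagram, indent=2)
print(dfd_json)
```

The resulting string can be written to a `.json` file and compared against the examples shipped with FlowStrider before it is used as input.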

Representing DFD Elements

Each of the main DFD elements can be represented in the format as shown below:

Figure: DFD element visualization and JSON structure

Note: When creating nodes, edges, or clusters, always assign the correct element type as a tag.
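Forgetting a type tag is easy to overlook in a hand-written file. The sketch below shows one way to check for it before running an analysis. The tag names are taken from the field reference in this document; the function `missing_type_tags` is a hypothetical helper and not part of FlowStrider.

```python
# Type tags as listed in the field reference of this document.
NODE_TYPE_TAGS = {"STRIDE:DataStore", "STRIDE:Process", "STRIDE:Interactor"}
EDGE_TYPE_TAGS = {"STRIDE:Dataflow"}
CLUSTER_TYPE_TAGS = {"STRIDE:TrustBoundary"}


def missing_type_tags(diagram: dict) -> list[str]:
    """Return the IDs of all elements that carry no valid type tag."""
    checks = [
        ("nodes", NODE_TYPE_TAGS),
        ("edges", EDGE_TYPE_TAGS),
        ("clusters", CLUSTER_TYPE_TAGS),
    ]
    missing = []
    for section, allowed in checks:
        for element_id, element in diagram.get(section, {}).items():
            # An element passes if at least one of its tags is a type tag.
            if not allowed & set(element.get("tags", [])):
                missing.append(element_id)
    return missing


example = {
    "nodes": {
        "db": {"id": "db", "tags": ["STRIDE:DataStore"]},
        "api": {"id": "api", "tags": []},  # type tag forgotten
    },
    "edges": {"e1": {"id": "e1", "tags": ["STRIDE:Dataflow"]}},
    "clusters": {},
}
print(missing_type_tags(example))  # ['api']
```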

Adding Metadata to Elements

All three element types (nodes, edges, and clusters) include a generic attributes dictionary for additional metadata. The next figure shows examples of each element with possible values.

Figure: DFD elements with context information

The supported metadata values are described in the Supported Metadata table below.
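To make the attributes dictionary more concrete, here is a small Python sketch of a data store node and a data-flow edge carrying metadata. The attribute names and values come from the Supported Metadata table; the elements themselves (`user_db`, `login_request`, `web_app`) are invented, and writing boolean values as JSON `true`/`false` is an assumption of this example.

```python
import json

# A data store node with metadata fields from the Supported Metadata table.
node = {
    "id": "user_db",
    "name": "User Database",
    "tags": ["STRIDE:DataStore"],
    "attributes": {
        "auth_req": True,
        "auth_factors": ["PIN", "OTP"],
        "encryption_method": "AES_256",
        "stores_credentials": True,
    },
}

# A data-flow edge with transport-related metadata.
edge = {
    "id": "login_request",
    "source_id": "web_app",
    "sink_id": "user_db",
    "name": "Login Request",
    "tags": ["STRIDE:Dataflow"],
    "attributes": {
        "transport_protocol": "TLS 1.3",
        "handles_confidential_data": True,
    },
}

# json.dumps turns Python's True/False into JSON's true/false.
print(json.dumps({"node": node, "edge": edge}, indent=2))
```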

Custom Prioritization

Both nodes and clusters can include an additional field called severity_multiplier. This field takes a floating-point number that scales the severity of threats found at that element:

  • Default: if not set, the multiplier is 1.0.

  • Use higher values to increase the priority of specific elements (e.g., critical assets or highly exposed components).

The prioritization influences the sorting of findings in the final report.
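The exact scoring formula is not described here, so the following Python sketch only illustrates the general idea of a multiplier affecting the order of findings. The `Finding` class, the numeric base severities, and the rule "priority = base severity × multiplier" are all assumptions made for this example, not FlowStrider's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """Hypothetical finding; not FlowStrider's actual report model."""
    title: str
    base_severity: float  # assumed numeric severity of the threat
    element_id: str       # node or cluster the finding belongs to


# severity_multiplier per element; elements not listed use the default 1.0.
multipliers = {"payment_db": 2.0}

findings = [
    Finding("Unencrypted data store", 5.0, "payment_db"),
    Finding("Missing input validation", 7.0, "web_app"),
    Finding("No authentication required", 6.0, "cache"),
]


def priority(finding: Finding) -> float:
    # Assumption: priority = base severity * multiplier of the affected element.
    return finding.base_severity * multipliers.get(finding.element_id, 1.0)


# Highest priority first, as one would expect in a report.
ranked = sorted(findings, key=priority, reverse=True)
for f in ranked:
    print(f"{priority(f):5.1f}  {f.title}")
```

Under these assumptions, the finding on `payment_db` moves to the top of the list even though its base severity is the lowest of the three.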

Examples

  • The complete JSON example can be found in test/resources/example_readme.json.

  • Additional examples are available in the same folder.

Tip

For more background on the underlying JSON format itself, see Wikipedia – JSON or MDN – Working with JSON.

Detailed Field Reference

The following section provides a complete list of all fields in the relevant data classes, along with their descriptions.

Data-flow diagram

| Field | Description | Type |
| --- | --- | --- |
| id | A unique identifier for the diagram. | str |
| nodes | The nodes in the diagram. These can represent processes, external entities, or data stores. | Dict[str, Node] |
| edges | The edges in the diagram. These represent data flows between nodes. | Dict[str, Edge] |
| clusters | The clusters in the diagram. These contain nodes and represent trust boundaries. | Dict[str, Cluster] |
| name | The name of the diagram. | str |
| tags | A set of tags specifying the rule sets to apply: ‘stride’, ‘bsi_rules’, ‘linddun_rules’. | Set[str] |
| attributes | Metadata about the data-flow diagram. This information is not used in the current version. | Dict[str, Any] |

Node

| Field | Description | Type |
| --- | --- | --- |
| id | A unique identifier for the node. | str |
| name | The name of the node. | str |
| tags | A set of tags specifying the type of the node (data store, process, or external entity): ‘STRIDE:DataStore’, ‘STRIDE:Process’, ‘STRIDE:Interactor’. | Set[str] |
| attributes | A dictionary containing metadata about the node (see Supported Metadata). | Dict[str, Any] |
| severity_multiplier | Multiplier for the severity of threats found at this node. | float |

Edge

| Field | Description | Type |
| --- | --- | --- |
| id | A unique identifier for the edge. | str |
| source_id | ID of the source node. | str |
| sink_id | ID of the sink node. | str |
| name | Name of the edge. | str |
| tags | A set of tags specifying the type of the edge (data flow): ‘STRIDE:Dataflow’. | Set[str] |
| attributes | A dictionary containing metadata about the edge (see Supported Metadata). | Dict[str, Any] |

Cluster

| Field | Description | Type |
| --- | --- | --- |
| id | A unique identifier for the cluster. | str |
| node_ids | IDs of the nodes in the cluster. | Set[str] |
| name | Name of the cluster. | str |
| tags | A set of tags specifying the type of the cluster: ‘STRIDE:TrustBoundary’. | Set[str] |
| attributes | A dictionary containing metadata about the cluster. Currently, no additional metadata is used here. | Dict[str, Any] |
| severity_multiplier | Multiplier for the severity of threats found in this cluster. | float |
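The field reference above can be read as four data classes. The Python sketch below mirrors them with dataclasses and adds a small consistency check for node references. It is an illustration based only on the tables in this document, not FlowStrider's actual source code. The default values (other than the documented default of 1.0 for severity_multiplier) and the helper `dangling_references` are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class Node:
    id: str
    name: str = ""
    tags: Set[str] = field(default_factory=set)
    attributes: Dict[str, Any] = field(default_factory=dict)
    severity_multiplier: float = 1.0


@dataclass
class Edge:
    id: str
    source_id: str
    sink_id: str
    name: str = ""
    tags: Set[str] = field(default_factory=set)
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Cluster:
    id: str
    node_ids: Set[str] = field(default_factory=set)
    name: str = ""
    tags: Set[str] = field(default_factory=set)
    attributes: Dict[str, Any] = field(default_factory=dict)
    severity_multiplier: float = 1.0


@dataclass
class DataflowDiagram:
    id: str
    nodes: Dict[str, Node] = field(default_factory=dict)
    edges: Dict[str, Edge] = field(default_factory=dict)
    clusters: Dict[str, Cluster] = field(default_factory=dict)
    name: str = ""
    tags: Set[str] = field(default_factory=set)
    attributes: Dict[str, Any] = field(default_factory=dict)


def dangling_references(dfd: DataflowDiagram) -> List[str]:
    """List references to node IDs that do not exist in the diagram."""
    problems = []
    for edge in dfd.edges.values():
        for end in (edge.source_id, edge.sink_id):
            if end not in dfd.nodes:
                problems.append(f"edge {edge.id}: unknown node {end}")
    for cluster in dfd.clusters.values():
        for node_id in sorted(cluster.node_ids - set(dfd.nodes)):
            problems.append(f"cluster {cluster.id}: unknown node {node_id}")
    return problems


dfd = DataflowDiagram(
    id="example_dfd",
    nodes={"user": Node("user", tags={"STRIDE:Interactor"})},
    edges={"e1": Edge("e1", source_id="user", sink_id="api")},
)
print(dangling_references(dfd))  # ['edge e1: unknown node api']
```

A check like this catches typos in source_id, sink_id, or node_ids early, before the diagram is handed to an analysis.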

Supported Metadata

| Field | Description | Applicable To | Allowed Values | Corresponding Rule Sets |
| --- | --- | --- | --- | --- |
| auth_factors | The factors needed for authentication, if authentication is required. Examples: [‘PIN’, ‘Chip Card’, ‘OTP’] or [‘Digital Certificate’, ‘Biometric Data’]. | Node: DataStore, Node: Process | PIN, OTP, Biometric Data, Digital Certificate, Chip Card, Security Token | bsi_rules |
| auth_protocol | Which authentication protocol is used to ensure integrity. Examples: ‘DH_CHAP’ or ‘FCPAP’. | Node: DataStore | DH_CHAP, FCAP, FCPAP | bsi_rules |
| auth_req | Whether any form of authentication is required to access the entity. | Node: DataStore, Node: Process | True, False | bsi_rules |
| encryption_method | Which encryption method is used to encrypt the data. Examples: ‘AES_128’ or ‘AES_256’. | Node: DataStore | AES_128, AES_192, AES_256 | bsi_rules |
| given_permissions | Actions the actor is privileged to perform. | Node: Process, Node: Interactor | | bsi_rules |
| handles_confidential_data | Whether the entity handles confidential data. | Node: DataStore, Node: Process, Edge: Dataflow | True, False | bsi_rules, linddun_rules |
| handles_logs | Whether the data store handles protocol logging data. | Node: DataStore | True, False | bsi_rules |
| hash_function | Which hash function is used to store hashed data. Examples: ‘SHA3_256’ or ‘SHA_512_256’. | Node: DataStore | SHA3_256, SHA3_384, SHA3_512, SHA_256, SHA_384, SHA_512, SHA_512_256 | bsi_rules |
| http_content_security_policy | Whether the HTTP header ‘Content-Security-Policy’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| http_strict_transport_security | Whether the HTTP header ‘Strict-Transport-Security’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| http_content_type | Whether the HTTP header ‘Content-Type’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| http_x_content_options | Whether the HTTP header ‘X-Content-Type-Options’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| http_cache_control | Whether the HTTP header ‘Cache-Control’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| is_san_fabric | Whether the entity is part of the fabric layer of a storage area network. | Node: DataStore | True, False | bsi_rules |
| input_data | All types of handled data. Example: [‘Session IDs’, ‘User Requests’]. | Node: Process | | bsi_rules |
| input_validation | Whether the input data is validated. | Node: Process | True, False | bsi_rules |
| integrity_check | The specific integrity check in use (such as a checksum), if any. Examples: ‘check sum’ or ‘digital certificate’. | Edge: Dataflow | check sum, digital certificate, ECDSA | bsi_rules |
| proxy | Whether the data flow is routed through a TLS proxy. | Edge: Dataflow | True, False | bsi_rules |
| req_permissions | Which privileges are required to interact with the process. | Node: Process, Node: Interactor | | bsi_rules |
| signature_scheme | The specific signature scheme in use, if any. Examples: ‘RSA’, ‘ECDSA’, or ‘LMS’. | Edge: Dataflow, Node: DataStore | DSA, ECDSA, ECGDSA, ECKDSA, LMS, RSA, XMSS | bsi_rules |
| stores_credentials | Whether the data store stores login credentials or other authentication data. | Node: DataStore | True, False | bsi_rules |
| transport_protocol | Which transport protocol the data flow uses. | Edge: Dataflow | HTTPS, TLS 1.2, TLS 1.3 | bsi_rules |
| data_collection_informed | Whether this data subject is informed in detail about which data is collected and how, what is done with the collected data, and with whom it is shared. | Node: Interactor | True, False | linddun_rules |
| data_lifecycle_policy_exists | Whether a policy is defined for the lifecycle management of data, including principles for creation, storage, sharing, usage, archival, and destruction of data. | DataflowDiagram | True, False | linddun_rules |
| data_retention_minimized | Whether the application stores data only for the time frame necessary for the core functionality. For example, no email addresses are stored for users who have already unsubscribed from a newsletter. | Node: DataStore | True, False | linddun_rules |
| data_sharing_minimized | Whether the application shares data only with services and external parties that need it for the functionality of the system. | Node: Process, Node: DataStore | True, False | linddun_rules |
| discloses_responses | Whether the entity discloses the existence of information through status messages when a query was wrong or not authenticated. Example: returning a ‘wrong password’ error message, which reveals that the account exists. | Node: Process, Node: DataStore | True, False | linddun_rules |
| handles_personal_data | Whether the entity handles personal data. | Node: Process, Node: DataStore | True, False | linddun_rules |
| handles_user_data | Whether this entity stores or handles any data from users, such as messages, texts, files, or full user accounts. | Node: Process, Node: DataStore | True, False | linddun_rules |
| is_private_network | Whether the inner workings of the trust boundary are private and the network acts like a black box that sends and receives data via dedicated interfaces. Internal communication channels would be completely hidden from outside viewers. | TrustBoundary | True, False | linddun_rules |
| is_user | Whether this interactor node represents human users. | Node: Interactor | True, False | linddun_rules |
| leaves_usage_traces | Whether the application leaves any traces showing that a user has used it, such as log files, traces of temporary files, or changes in data size. | Node: Process, Node: DataStore | True, False | linddun_rules |
| logs_access | Whether the process logs access by users. | Node: Process | True, False | linddun_rules |
| logs_receipt | Whether the process logs the receipt of messages. | Node: Process | True, False | linddun_rules |
| only_necessary_data_analyzed | Whether the application keeps the analysis of its data to a strictly necessary level and does not enrich data more than the core functionality requires. | Node: Process, Node: DataStore | True, False | linddun_rules |
| only_necessary_data_collected | Whether the application collects only data that is strictly necessary for its core functionality. This includes limiting the amount or size of collected data and not collecting data that is more fine-grained than necessary. | Node: Process, Node: DataStore | True, False | linddun_rules |
| own_data_access | Whether data subjects have access to their own personal data. | Node: Interactor | True, False | linddun_rules |
| own_data_modification | Whether data subjects can correct or delete their own personal data. | Node: Interactor | True, False | linddun_rules |
| personal_data_preferences | Whether this data subject is given the option to set preferences regarding the collection, handling, and sharing of their personal data. | Node: Interactor | True, False | linddun_rules |
| privacy_regulation_compliance | Whether the system is compliant with the privacy regulations of the jurisdictions it is used in. | DataflowDiagram | True, False | linddun_rules |
| privacy_standards_compliance | Whether the system is compliant with (industry-specific) privacy standards and best practices. | DataflowDiagram | True, False | linddun_rules |
| security_standards_compliance | Whether the system is compliant with (industry-specific) security standards and best practices. | DataflowDiagram | True, False | linddun_rules |
| stores_signed_data | Whether the data store stores data that has been digitally signed by the uploader. | Node: DataStore | True, False | linddun_rules |
| stores_user_associated_metadata | Whether the data store stores metadata, hidden data, or specific patterns that could relate to specific users. | Node: DataStore | True, False | linddun_rules |
| transmits_signed_data | Whether the data flow transmits data that has been digitally signed by the sender. Example: signed emails. | Edge: Dataflow | True, False | linddun_rules |
| transmits_unique_user_id | Whether an identifier that uniquely corresponds to one user is transmitted on this data flow. Examples: IP address, email address, unique pseudonyms. | Edge: Dataflow | True, False | linddun_rules |
| transmits_user_data | Whether this data flow transmits any data directly from users, such as messages, texts, or files. | Edge: Dataflow | True, False | linddun_rules |
| transmits_user_identity | Whether this data flow transmits the clear identity of users, such as the full name. | Edge: Dataflow | True, False | linddun_rules |
| transmits_user_properties | Whether this data flow transmits any properties that depend on the user, such as OS, browser, screen size, or language. | Edge: Dataflow | True, False | linddun_rules |
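Since several fields accept only a fixed set of values, it can help to check an attributes dictionary against the table before running an analysis. The following Python sketch does this for a small excerpt of the table above. The helper `invalid_attributes` is hypothetical, and whether FlowStrider itself rejects or ignores unexpected values is not stated in this document.

```python
from typing import Any, Dict, List

# Excerpt of the Supported Metadata table: attribute name -> allowed values.
# Tuples are used so that unhashable values (e.g. lists) can be tested safely.
ALLOWED_VALUES = {
    "auth_protocol": ("DH_CHAP", "FCAP", "FCPAP"),
    "encryption_method": ("AES_128", "AES_192", "AES_256"),
    "transport_protocol": ("HTTPS", "TLS 1.2", "TLS 1.3"),
    "auth_req": (True, False),
    "handles_confidential_data": (True, False),
}


def invalid_attributes(attributes: Dict[str, Any]) -> List[str]:
    """Return messages for attributes whose value is not in the excerpt."""
    messages = []
    for key, value in attributes.items():
        allowed = ALLOWED_VALUES.get(key)
        # Attributes outside the excerpt are skipped, not reported.
        if allowed is not None and value not in allowed:
            messages.append(f"{key}: unexpected value {value!r}")
    return messages


print(invalid_attributes({"encryption_method": "AES_512", "auth_req": True}))
# ["encryption_method: unexpected value 'AES_512'"]
```

Extending `ALLOWED_VALUES` with the remaining rows of the table would turn this into a fuller pre-check.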