Data-Flow Diagram
FlowStrider takes a Data-Flow Diagram (DFD) of the software system under analysis as input. The DFD representation in FlowStrider follows the STRIDE threat modeling framework, originally developed by Microsoft.
What is a Data-Flow Diagram?
A DFD provides a high-level view of how data moves through a system:
where it is processed,
where it is stored,
and where it enters or leaves the system.
A DFD typically consists of five main elements: processes, data stores, external entities, data flows, and trust boundaries.
Tip
For a step-by-step tutorial and detailed explanations of each element, see Microsoft Learn – Create a threat model using foundational DFD elements and the Wikipedia article on Data-flow diagrams.
FlowStrider’s Data-Flow Diagram Format
General Structure
To be processed by FlowStrider, a DFD must be provided as a JSON file in FlowStrider’s Data-Flow Diagram format. This format borrows terminology from graph theory:
Nodes represent processes, data stores, or external entities.
Edges represent data flows (connections) between nodes.
Clusters group nodes together, typically to define trust boundaries.
The mapping to STRIDE’s DFD elements is preserved through tags on nodes (explained further below).
The core component of this format is the data class DataflowDiagram, which contains the three element types listed above.
The figure below illustrates the format’s overall structure using a simple example, alongside the corresponding DFD visualization.
Inside the diagram, the objects "nodes": {}, "edges": {}, and "clusters": {} hold the respective entities.
The array tags lists the rule sets that FlowStrider should apply when analyzing this diagram.
Detailed field descriptions are provided in the tables further below.
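As a rough illustration of the structure described above, the following Python sketch assembles an empty diagram as a dictionary and serializes it to JSON. The containers nodes, edges, clusters, and tags are taken from the description above; the key names id and name, and all concrete values, are assumptions made for illustration and should be checked against the example files shipped with FlowStrider.

```python
import json

# Minimal diagram skeleton. "nodes", "edges", "clusters", and "tags" are the
# containers named in the text; "id" and "name" are ASSUMED key names.
diagram = {
    "id": "example_diagram",          # assumed key, hypothetical value
    "name": "Example Diagram",        # assumed key, hypothetical value
    "nodes": {},                      # processes, data stores, external entities
    "edges": {},                      # data flows between nodes
    "clusters": {},                   # trust boundaries grouping nodes
    "tags": ["stride", "bsi_rules"],  # rule sets to apply
}

# Serialize to the JSON text that would be written to a file.
diagram_json = json.dumps(diagram, indent=2)
print(diagram_json)
```

Writing diagram_json to a file would give a candidate input file, provided the assumed key names match the actual format.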
Representing DFD Elements
Each of the main DFD elements can be represented in the format as shown below:
Note: When creating nodes, edges, or clusters, always assign the correct element type as a tag.
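The tagging rule in the note can be sketched with a few small Python helpers that always attach an element-type tag. The tag strings are the ones listed in the field reference on this page; the helper functions and the key names id, name, source_id, sink_id, and node_ids are hypothetical and only illustrate the idea.

```python
# Element-type tags as listed in the field reference on this page.
NODE_TAGS = {"STRIDE:Process", "STRIDE:DataStore", "STRIDE:Interactor"}
EDGE_TAG = "STRIDE:Dataflow"
CLUSTER_TAG = "STRIDE:TrustBoundary"


def make_node(node_id, name, type_tag):
    """Build a node dict; key names are assumptions, not a confirmed schema."""
    if type_tag not in NODE_TAGS:
        raise ValueError(f"unknown node type tag: {type_tag}")
    return {"id": node_id, "name": name, "tags": [type_tag], "attributes": {}}


def make_edge(edge_id, name, source_id, sink_id):
    """Build a data-flow edge from a source node to a sink node."""
    return {
        "id": edge_id,
        "name": name,
        "source_id": source_id,  # assumed key name
        "sink_id": sink_id,      # assumed key name
        "tags": [EDGE_TAG],
        "attributes": {},
    }


def make_cluster(cluster_id, name, node_ids):
    """Build a trust-boundary cluster that groups the given node IDs."""
    return {
        "id": cluster_id,
        "name": name,
        "node_ids": list(node_ids),  # assumed key name
        "tags": [CLUSTER_TAG],
        "attributes": {},
    }


# Hypothetical example: a user talks to a web server inside a trust boundary.
user = make_node("user", "User", "STRIDE:Interactor")
web = make_node("web", "Web Server", "STRIDE:Process")
request = make_edge("request", "HTTP Request", user["id"], web["id"])
boundary = make_cluster("internal", "Internal Network", [web["id"]])
```

Routing element creation through helpers like these makes it harder to forget the type tag, which the rule sets rely on to recognize each element.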
Adding Metadata to Elements
All three element types (nodes, edges, and clusters) include a generic attributes dictionary for additional metadata.
The next figure shows examples of each element with possible values.
The supported metadata values are described in the Supported Metadata table below.
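Because many metadata entries only accept a fixed set of values, it can help to check an attributes dictionary before running an analysis. The sketch below is one possible way to do this in Python. The allowed values are copied from the Supported Metadata table; the attribute key names encryption_method and transport_protocol are hypothetical placeholders, and the check itself is not part of FlowStrider.

```python
# Allowed values copied from the Supported Metadata table on this page.
# The key names are HYPOTHETICAL placeholders, not confirmed attribute keys.
ALLOWED_VALUES = {
    "encryption_method": {"AES_128", "AES_192", "AES_256"},
    "transport_protocol": {"HTTPS", "TLS 1.2", "TLS 1.3"},
}


def invalid_attributes(attributes):
    """Return the sorted keys whose value is outside the allowed set.

    Keys without an entry in ALLOWED_VALUES are ignored.
    """
    return sorted(
        key
        for key, value in attributes.items()
        if key in ALLOWED_VALUES and value not in ALLOWED_VALUES[key]
    )


# Hypothetical attribute dictionaries for a data store and a data flow.
datastore_attributes = {"encryption_method": "AES_256"}
dataflow_attributes = {"transport_protocol": "TLS 1.1"}

print(invalid_attributes(datastore_attributes))
print(invalid_attributes(dataflow_attributes))
```

Here the data store passes the check, while the data flow is reported because TLS 1.1 is not among the listed transport protocols.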
Custom Prioritization
Both nodes and clusters can include an additional field called severity_multiplier.
This field takes a floating-point number used to customize prioritization:
Default: if not set, the multiplier is 1.0.
Use higher values to increase the priority of specific elements (e.g., critical assets or highly exposed components).
The prioritization influences the sorting of findings in the final report.
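To make the effect of the multiplier concrete, the following Python sketch ranks a few findings by a weighted severity. How FlowStrider combines the multiplier with a finding's severity internally is not described here; the sketch assumes a simple product, and the finding tuples, element IDs, and base severities are invented for illustration.

```python
def effective_severity(base_severity, element):
    """Scale a base severity by the element's multiplier (default 1.0)."""
    return base_severity * element.get("severity_multiplier", 1.0)


# Hypothetical elements: only the database sets a multiplier.
elements = {
    "db": {"severity_multiplier": 2.0},
    "web": {},  # no multiplier set, so 1.0 applies
}

# Hypothetical findings as (element ID, threat name, base severity).
findings = [
    ("web", "Threat A", 5.0),
    ("db", "Threat B", 3.0),
]

# Sort so that the highest weighted severity comes first.
ranked = sorted(
    findings,
    key=lambda finding: effective_severity(finding[2], elements[finding[0]]),
    reverse=True,
)

for element_id, threat, base in ranked:
    print(element_id, threat, effective_severity(base, elements[element_id]))
```

Under this assumption the database finding moves ahead of the web server finding, even though its base severity is lower.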
Examples
The complete JSON example can be found in test/resources/example_readme.json.
Additional examples are available in the same folder.
Tip
For more background on the underlying JSON format itself, see Wikipedia – JSON or MDN – Working with JSON.
Detailed Field Reference
The following section provides a complete list of all fields in the relevant data classes, along with their descriptions.
Data-flow diagram
| Field | Description | Type |
|---|---|---|
| | A unique identifier for the diagram. | |
| nodes | The nodes in the diagram. These can represent processes, external entities, or data stores. | |
| edges | The edges in the diagram. These represent data flows between nodes. | |
| clusters | The clusters in the diagram. These contain nodes and represent trust boundaries. | |
| | The name of the diagram. | |
| tags | A set of tags specifying the rule set to use [‘stride’, ‘bsi_rules’, ‘linddun_rules’]. | |
| | Metadata about the data flow diagram. This information is not used in the current version. | |
Node
| Field | Description | Type |
|---|---|---|
| | A unique identifier for the node. | |
| | The name of the node. | |
| tags | A set of tags used to specify the type of the node: datastore, process, or external entity [‘STRIDE:DataStore’, ‘STRIDE:Process’, ‘STRIDE:Interactor’]. | |
| attributes | A dictionary containing metadata about the node (see supported metadata). | |
| severity_multiplier | Multiplier for the severity of threats found at this node. | |
Edge
| Field | Description | Type |
|---|---|---|
| | A unique identifier of the edge. | |
| | ID of the source node. | |
| | ID of the sink node. | |
| | Name of the edge. | |
| tags | A set of tags used to specify the type of the edge: data flow [‘STRIDE:Dataflow’]. | |
| attributes | A dictionary containing metadata about the edge (see supported metadata). | |
Cluster
| Field | Description | Type |
|---|---|---|
| | A unique identifier of the cluster. | |
| | IDs of the nodes in the cluster. | |
| | Name of the cluster. | |
| tags | A set of tags used to specify the type of the cluster [‘STRIDE:TrustBoundary’]. | |
| attributes | A dictionary containing metadata about the cluster. Currently, no additional metadata is used here. | |
| severity_multiplier | Multiplier for the severity of threats found in this cluster. | |
Supported Metadata
| Field | Description | Applicable To | Allowed Values | Corresponding Rule Sets |
|---|---|---|---|---|
| | If authentication is required, which factors are needed for authentication. Examples: [‘PIN’, ‘Chip Card’, ‘OTP’] or [‘Digital Certificate’, ‘Biometric Data’]. | Node: DataStore, Node: Process | PIN, OTP, Biometric Data, Digital Certificate, Chip Card, Security Token | bsi_rules |
| | Which authentication protocol is used to ensure integrity. Examples: ‘DH_CHAP’ or ‘FCPAP’. | Node: DataStore | DH_CHAP, FCAP, FCPAP | bsi_rules |
| | Whether any form of authentication is required to access the entity. | Node: DataStore, Node: Process | True, False | bsi_rules |
| | Which encryption method is used to encrypt the data. Examples: ‘AES_128’ or ‘AES_256’. | Node: DataStore | AES_128, AES_192, AES_256 | bsi_rules |
| | Actions the actor is privileged to perform. | Node: Process, Node: Interactor | | bsi_rules |
| | Whether the entity handles confidential data. | Node: DataStore, Node: Process, Edge: Dataflow | True, False | bsi_rules, linddun_rules |
| | Whether the DataStore handles protocol logging data. | Node: DataStore | True, False | bsi_rules |
| | Which function is used to store hashed data. Examples: ‘SHA3_256’ or ‘SHA_512_256’. | Node: DataStore | SHA3_256, SHA3_384, SHA3_512, SHA_256, SHA_384, SHA_512, SHA_512_256 | bsi_rules |
| | Whether the HTTP header ‘Content Security Policy’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| | Whether the HTTP header ‘Strict Transport Security’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| | Whether the HTTP header ‘Content Type’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| | Whether the HTTP header ‘X Content Options’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| | Whether the HTTP header ‘Cache Control’ is set appropriately and as restrictively as possible. | Edge: Dataflow | True, False | bsi_rules |
| | Whether the entity is part of the fabric layer of a storage area network. | Node: DataStore | True, False | bsi_rules |
| | All types of handled data. Example: [‘Session IDs’, ‘User Requests’]. | Node: Process | | bsi_rules |
| | Whether the input data is validated. | Node: Process | True, False | bsi_rules |
| | If an integrity check (such as a check sum) is used, this should note the specific check. Examples: ‘check sum’ or ‘digital certificate’. | Edge: Dataflow | check sum, digital certificate, ECDSA | bsi_rules |
| | Whether the dataflow is routed through a TLS proxy. | Edge: Dataflow | True, False | bsi_rules |
| | Which privileges are required to interact with the process. | Node: Process, Node: Interactor | | bsi_rules |
| | If a signature scheme is used, this should note the specific scheme. Examples: ‘RSA’, ‘ECDSA’, or ‘LMS’. | Edge: Dataflow, Node: DataStore | DSA, ECDSA, ECGDSA, ECKDSA, LMS, RSA, XMSS | bsi_rules |
| | Whether the DataStore stores login credentials or other authentication data. | Node: DataStore | True, False | bsi_rules |
| | Which transport protocol the dataflow uses. | Edge: Dataflow | HTTPS, TLS 1.2, TLS 1.3 | bsi_rules |
| | Whether this data subject is informed in detail about which data is collected and how, what is done with the collected data, and with whom it is shared. | Node: Interactor | True, False | linddun_rules |
| | Whether a policy is defined that covers the lifecycle management of data, including principles for creation, storage, sharing, usage, archival, and destruction of data. | DataflowDiagram | True, False | linddun_rules |
| | Whether the application stores data only for the time frame necessary for the core functionality. For example, no mail addresses are stored for users who have already unsubscribed from a newsletter. | Node: DataStore | True, False | linddun_rules |
| | Whether the application shares data only with services and external parties that need it for the functionality of the system. | Node: Process, Node: DataStore | True, False | linddun_rules |
| | Whether the entity discloses the existence of information through status messages when a query was wrong or not authenticated. For example, returning a ‘wrong password’ error message reveals the existence of the account. | Node: Process, Node: DataStore | True, False | linddun_rules |
| | Whether the entity handles personal data. | Node: Process, Node: DataStore | True, False | linddun_rules |
| | Whether this entity stores or handles any data from users, such as messages, texts, files, or full user accounts. | Node: Process, Node: DataStore | True, False | linddun_rules |
| | Whether the inner workings of the trust boundary are private and the network acts like a black box that sends and receives data via dedicated interfaces. Internal communication channels would be completely hidden from outside viewers. | TrustBoundary | True, False | linddun_rules |
| | Whether this interactor node represents human users. | Node: Interactor | True, False | linddun_rules |
| | Whether the application leaves any traces that a user has used it, such as log files, traces of temporary files, or changes in data size. | Node: Process, Node: DataStore | True, False | linddun_rules |
| | Whether the process logs access by users. | Node: Process | True, False | linddun_rules |
| | Whether the process logs the receipt of messages. | Node: Process | True, False | linddun_rules |
| | Whether the application keeps the analysis of its data to a strictly necessary level and does not enrich data more than the core functionality requires. | Node: Process, Node: DataStore | True, False | linddun_rules |
| | Whether the application collects only data that is strictly necessary for its core functionality. This includes limiting the amount or size of collected data and not collecting data that is more fine-grained than necessary. | Node: Process, Node: DataStore | True, False | linddun_rules |
| | Whether data subjects have access to their own personal data. | Node: Interactor | True, False | linddun_rules |
| | Whether data subjects have the ability to correct or delete their own personal data. | Node: Interactor | True, False | linddun_rules |
| | Whether this data subject is given the option to set their preferences regarding the collection, handling, and sharing of their personal data. | Node: Interactor | True, False | linddun_rules |
| | Whether this entity is compliant with the privacy regulations of the jurisdictions the system is used in. | DataflowDiagram | True, False | linddun_rules |
| | Whether the system is compliant with (industry-specific) privacy standards and best practices. | DataflowDiagram | True, False | linddun_rules |
| | Whether the system is compliant with (industry-specific) security standards and best practices. | DataflowDiagram | True, False | linddun_rules |
| | Whether the data store stores data that has been digitally signed by the uploader. | Node: DataStore | True, False | linddun_rules |
| | Whether the data store stores metadata, hidden data, or specific patterns that could relate to specific users. | Node: DataStore | True, False | linddun_rules |
| | Whether the dataflow transmits data that has been digitally signed by the sender. Example: signed emails. | Edge: Dataflow | True, False | linddun_rules |
| | Whether an identifier that uniquely corresponds to one user is transmitted on this dataflow. Examples: IP address, email address, unique pseudonyms. | Edge: Dataflow | True, False | linddun_rules |
| | Whether this dataflow transmits any data directly from users, such as messages, texts, or files. | Edge: Dataflow | True, False | linddun_rules |
| | Whether this dataflow transmits the clear identity of users, such as the full name. | Edge: Dataflow | True, False | linddun_rules |
| | Whether this dataflow transmits any properties that depend on the user, such as OS, browser, screen size, or language. | Edge: Dataflow | True, False | linddun_rules |