While the concept of attack graphs has been discussed, one thing that is lacking is a standard definition for an attack graph. This blog hopes to resolve that by presenting a new standard: the Cyber Attack Graph Schema (CAGS) 1.0.
1. All property names must be lowercase.
2. Nodes must have the following properties:
   1. "class": may be "actor", "event", "condition", or "attribute".
   2. "cpt": must be a JSON string in the format defined at http://infosecanalytics.blogspot.com/2013/03/conditional-probability-tables-in-json.html
   3. "start": the time the node is created. Time should be in ISO 8601 combined date and time format (e.g. 2013-03-14T16:57Z).
   4. "id": assigned by the database.
3. Nodes must have property "label".
4. The "label" property of nodes of "class" "event", "condition", or "actor" will contain a string holding a narrative describing the actor, event, or condition.
5. The "label" property of nodes of "class" "attribute" must contain a JSON formatted string with a single "{'type':'value'}" pair. Type is the type/name of the attribute and value is its value.
6. Nodes of any class MAY have property "comments" providing additional narrative on the node.
7. Nodes of any class MAY have property "finish" providing a finish time for the node. Time should be in ISO 8601 combined date and time format (e.g. 2013-03-14T16:57Z).
8. Edges must have the following properties:
   1. "source": the id of the source node.
   2. "target": the id of the target node.
   3. "id": id assigned by the database.
   4. "relationship":
      1. Value of "influence" if "source" property "class" is "attribute" and "target" property "class" is "event" or "condition".
      2. Value of "leads to" if "source" property "class" is "event", "threat", or "condition" and "target" property "class" is "actor", "event", or "condition".
      3. Value of "described by" if "source" property "class" is "event", "condition", or "actor" and "target" property "class" is "attribute".
      4. Value of "described by" if both "source" and "target" property "class" are "attribute".
   5. "directed": value of "True".
9. Edges may have a property "confidence" with an integer value from 0 to 100 representing the percent confidence.
10. Edges must be directed.
11. Nodes and edges may have additional properties; however, they will not be validated and may be ignored by the attack graph.
12. Nodes and edges missing values may still be accepted if the value can be filled in.
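To make the rules above concrete, here is a minimal sketch of a validator for nodes and edges held as plain Python dicts. The function names (`validate_node`, `validate_edge`, `expected_relationship`) are hypothetical and not part of CAGS. The sketch assumes the attribute "label" is valid JSON with double quotes, does not inspect the contents of "cpt", and applies the "leads to" rule only to "event" and "condition" sources, since "threat" is not one of the four node classes.

```python
import json
from datetime import datetime

NODE_CLASSES = {"actor", "event", "condition", "attribute"}
REQUIRED_NODE_PROPS = {"class", "cpt", "start", "id", "label"}
REQUIRED_EDGE_PROPS = {"source", "target", "id", "relationship", "directed"}


def is_iso8601(value):
    """Return True if value parses as an ISO 8601 combined date and time."""
    try:
        # Older Pythons reject a trailing "Z" in fromisoformat(), so map it to +00:00.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
    except (AttributeError, ValueError):
        return False
    return "T" in value


def validate_node(node):
    """Return a list of problems found in a node dict (empty means it passed)."""
    problems = [f"property name not lowercase: {k}" for k in node if k != k.lower()]
    problems += [f"missing property: {k}" for k in sorted(REQUIRED_NODE_PROPS - set(node))]
    if node.get("class") not in NODE_CLASSES:
        problems.append("class must be one of " + ", ".join(sorted(NODE_CLASSES)))
    for key in ("start", "finish"):
        if key in node and not is_iso8601(node[key]):
            problems.append(f"{key} is not an ISO 8601 date and time")
    if node.get("class") == "attribute" and "label" in node:
        try:
            pair = json.loads(node["label"])
        except (TypeError, ValueError):
            pair = None
        if not isinstance(pair, dict) or len(pair) != 1:
            problems.append("attribute label must be a JSON string with one type/value pair")
    return problems


def expected_relationship(source_class, target_class):
    """Return the relationship the schema calls for, or None if no rule applies."""
    if target_class == "attribute":
        return "described by"  # any node class may be described by an attribute
    if source_class == "attribute" and target_class in {"event", "condition"}:
        return "influence"
    # Assumption: "leads to" is applied to event and condition sources only.
    if source_class in {"event", "condition"} and target_class in {"actor", "event", "condition"}:
        return "leads to"
    return None


def validate_edge(edge, nodes_by_id):
    """Return a list of problems found in an edge dict (empty means it passed)."""
    problems = [f"missing property: {k}" for k in sorted(REQUIRED_EDGE_PROPS - set(edge))]
    if edge.get("directed") != "True":
        problems.append('directed must be "True"')
    conf = edge.get("confidence")
    if conf is not None and not (isinstance(conf, int) and 0 <= conf <= 100):
        problems.append("confidence must be an integer from 0 to 100")
    src = nodes_by_id.get(edge.get("source"))
    dst = nodes_by_id.get(edge.get("target"))
    if src and dst:
        want = expected_relationship(src.get("class"), dst.get("class"))
        if want is not None and edge.get("relationship") != want:
            problems.append(f'relationship should be "{want}"')
    return problems


# Example data; the "cpt" value is a placeholder, not the real CPT format.
event = {"class": "event", "cpt": "{}", "start": "2013-03-14T16:57Z",
         "id": 1, "label": "Phishing email opened"}
attr = {"class": "attribute", "cpt": "{}", "start": "2013-03-14T16:57Z",
        "id": 2, "label": json.dumps({"ip": "8.8.8.8"})}
edge = {"source": 1, "target": 2, "id": 3, "relationship": "described by",
        "directed": "True", "confidence": 80}

print(validate_node(event), validate_node(attr), validate_edge(edge, {1: event, 2: attr}))
```

Returning a list of problems rather than raising on the first one fits rule 12: a caller can decide which missing values it is able to fill in before rejecting the node or edge.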
Consider replacing spaces with underscores (i.e. "described by" becomes "described_by".)

Consider replacing "start" with "start_time" as start is ambiguous in some Cypher queries.
Consider describing attributes as {"class":"attribute", "attribute":<type>, <type>:<value>} rather than just {"class":"attribute", <type>:<value>} to improve ease of querying the graph.
Consider requiring edges to have a start_time.

Consider describing attributes as {"class":"attribute", "attribute type":<type>, "type value":<value>}. This would improve querying the graph directly for a value and for the type.

'label' is reserved in some graph databases. Consider using the class value in place of label and indexing all class values on all nodes.

The cpt requirement will be removed in the next version.

Graph IDs should be a URI of the form <prefix>:?class=<node class>&<node class>=<key>&<key>=<value>, so class:attribute, attribute = ip, ip = 8.8.8.8 at mybiz would be mybiz:?class=attribute&attribute=ip&ip=8.8.8.8
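A minimal sketch of building such a node URI, using the mybiz example from the comment above. The function name `node_uri` and its parameters are hypothetical, and percent-encoding the values with `urlencode` is an assumption that the comment does not address.

```python
from urllib.parse import urlencode


def node_uri(prefix, node_class, key, value):
    """Build a node URI as <prefix>:?class=<class>&<class>=<key>&<key>=<value>."""
    # A list of pairs (rather than a dict) keeps the parameters in chain order.
    query = urlencode([("class", node_class), (node_class, key), (key, value)])
    return f"{prefix}:?{query}"


print(node_uri("mybiz", "attribute", "ip", "8.8.8.8"))
```

Because the URI is derived only from the node's class, key, and value, two clients that describe the same attribute arrive at the same ID without coordinating.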

To allow efficient storage, it may be necessary to express {class:<class>, <class>:<key>, <key>:<value>, ...} with explicit columns of {class:<class>, key:<key>, value:<value>}. The advantage is that nodes can be indexed on class, key, and value. The limitation is that the a:b, b:c, c:d, d:etc. chain is limited in length.
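As a rough illustration of that trade-off, the sketch below flattens a chained node into fixed class/key/value columns and indexes it on all three. The chained input shape (class=attribute, attribute=ip, ip=8.8.8.8) is assumed from the URI example earlier in the thread, and `to_row` is a hypothetical helper name.

```python
def to_row(node):
    """Flatten {"class": c, c: k, k: v} into {"class": c, "key": k, "value": v}."""
    node_class = node["class"]
    key = node[node_class]
    # Only the first two links of the chain fit in the fixed columns;
    # anything chained after the value is dropped, which is the stated limitation.
    return {"class": node_class, "key": key, "value": node[key]}


chained = {"class": "attribute", "attribute": "ip", "ip": "8.8.8.8"}
row = to_row(chained)

# With fixed columns, a single index over (class, key, value) covers every node.
index = {(row["class"], row["key"], row["value"]): row}
print(row)
```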

Consider making edge URIs derived from their source, relationship, destination triple.

In documentation, may want to correlate source, relationship, destination to subject, predicate, object.
Edge URIs should be as follows: "<prefix>:?source=<hash>&destination=<hash>&relationship=<relationship>". (This is necessary as the source and destination are URIs in and of themselves.) Each hash should be an MD5 hash of the source or destination URI in the URL namespace.

If there is a chain from the relationship such as <relationship>=<value>, those should then be added: "&<relationship>=<value>...".
Finally, if an origin exists, the origin should be added: "&origin=<origin>".
For example:
"mybiz:?source=dd28255d-9ebe-3df7-9384-73e257baf7d1&destination=091f2eb4-ba1b-39ae-a677-82c77f1ef530&relationship=described_by&described_by=nameserver&origin=farsight"
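A minimal sketch of assembling an edge URI this way. It assumes that "MD5 hash in URL namespace" means a version 3 UUID computed with Python's `uuid.uuid3` and `uuid.NAMESPACE_URL`, which matches the shape of the hashes in the example but is not confirmed by the comment. The names `uri_hash` and `edge_uri`, and the two node URIs in the usage lines, are hypothetical.

```python
import uuid
from urllib.parse import urlencode


def uri_hash(uri):
    """Hash a node URI as a version 3 (MD5-based) UUID in the URL namespace."""
    return str(uuid.uuid3(uuid.NAMESPACE_URL, uri))


def edge_uri(prefix, source_uri, destination_uri, relationship, chain=(), origin=None):
    """Build an edge URI from the source, relationship, destination triple."""
    params = [
        ("source", uri_hash(source_uri)),
        ("destination", uri_hash(destination_uri)),
        ("relationship", relationship),
    ]
    params.extend(chain)  # e.g. [("described_by", "nameserver")]
    if origin is not None:
        params.append(("origin", origin))
    return f"{prefix}:?{urlencode(params)}"


# Illustrative node URIs; these are made up for the example.
src = "mybiz:?class=attribute&attribute=domain&domain=example.com"
dst = "mybiz:?class=attribute&attribute=domain&domain=ns1.example.com"
uri = edge_uri("mybiz", src, dst, "described_by",
               chain=[("described_by", "nameserver")], origin="farsight")
print(uri)
```

Hashing the source and destination keeps their own "?", "&", and "=" characters out of the edge URI's query string, which is the problem the comment points to.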
Consider allowing edges to have sub-relationships such as: .

Consider allowing edges to have an origin to explain the enrichment they came from. e.g. .
The URI should be stored as an attribute to the node or edge with a key of 'uri' and should be used as the node and edge id whenever possible.

Need to consider how to handle the difference between "no relationship found" and "creation of relationship not attempted".

Prefixes should not be required on URIs within a graph. The reasoning being that if the nodes/edges are within a graph, the prefix is implicit.

The case exists where we may wish to suggest that knowledge about a node resides in another graph. While adding the prefix to the node would indicate that, it also allows for two nodes of the same key:value to exist in the same graph. Moreso, a key:value node such as could be used to suggest an algorithm should query another graph for the information.
This does not preclude having a prefix on a node in a graph (with the absence of a prefix implying that the location of the graph represents the prefix). However, such a prefix would require a means of translating a prefix to a fully qualified location, which does not currently exist in the schema.
This does not preclude including a prefix (or the fully qualified URI) in a client subgraph to help distinguish between nodes from different locations. However, it will still suffer from the same issue of potential duplicate nodes. It is more advisable that prefixes only be kept for edges. The client may choose how to keep the mapping between prefix and full location.