GraphBLAS Tensors
GraphBLAS an API for Sparse Linear Algebra is at the core of FalkorDB,
It provides key elements for representing and traversing in memory graphs.
Specifically Sparse Matrices and a set of operations e.g. matrix multiplication, addition and transpose.
Although up front GraphBLAS is an API specification we’ve been using Professor Tim David’s highly performant implementation of the API, SuiteSparse:GraphBLAS which covers the entire API with additional extensions.
The API is extremely flexible and allows for advanced usages e.g. defining your custom semirings, binary & unary operations along with user defined data types and it all works nicely.
Graph Representation
To represent a directed graph using a matrix one could assign each node a unique incremental ID (starting with ID 0) assuming node i has an outgoing edge to node j
we could mark this connection in boolean matrix M[i, j] = True. In case edges are also associated with IDs (as FalkorDB does) M is a UINT64_T matrix and M[i, j] = Edge_ID.
Things get a bit more complex with FalkorDB as we follow the property graph model which allows for multi labeled nodes and different edges can be associated with different relationship types.
Therefore a set of different matrices are used to capture the entire graph topology. But for now let’s keep things simple and focus on a single M matrix.
Multi Edge
Consider a graph which models communication between people
e.g. Arthur sent a text to Katy or Robert phoned Jill.
(Arthur)-[Texted]->(Katy) (Robert)-[Phoned]->(Jill)
This is all well, but how should we capture a graph where multiple calls or texts had been sent between the same two entities? e.g. Arther texted Katy on multiple occasions.
As we all know a Boolean matrix entry can only represent one out of three distinct values: True, False or Empty.
Tensor
To address multiple connections between two entities we’ve extended the set of matrices supported by FalkorDB to include a tensor, now to be as accurate as I can a Tensor can have a variable number of dimensions: A one dimension tensor is a Vector, Two dimension tensor is a 2D matrix and a three dimension tensor the one used in FalkorDB can be thought of as a 2D matrix with vectors as its entries, I like to think of it as a 3D matrix.
FalkorDB treats tensors as an extension to its regular 2D sparse matrices and can consider a tensor as a simple sparse matrix, e.g. when checking if there’s a path connecting node I to node J.
Consider the following scenario starting with an empty graph.
Let’s create two entities:
Arthur and Katy, using Cypher (FalkorDB’s supported query language) we can create the two by issuing this query:
CREATE (:Person {name:'Arthur'}), (:Person {name:'Katy'})
Now let’s create the first edge noting a text was sent by Arthur to Katy
MATCH (Arthur:Person {name:'Arthur'}), (Katy:Person {name:'Katy'})
CREATE (Arthur)-[:Text]->(Katy)
At this point a new tensor `T` is created within our database capturing all edges of type Text and so T[0, 1] = 0 is set assuming Arthur’s and Katy’s internal IDs are 0 and 1 and the only edge in the graph is assigned internal ID 0.
Note up to this point no vectors had been created, the tensor T contains only scalars instead of vector(s), this is done to reduce memory overhead as vectors require additional allocations.
Now it’s time to introduce a second text message, we can just reissue the previous query.
By creating multiple edges of the same relationship-type between the same set of nodes T[0, 1] is not longer a scalar but a vector
T[0, 1] = V
Where V[0] True and V[1] = True
Assuming the two edges had been assigned internal IDs 0 and 1.
Similarly if we delete connections to a point where only a single edge connects Arthur and Katy the vector V would be freed and its only remaining entry would become a scalar entry.
Following our example graph
MATCH (Arthur:Person {name:'Arthur'})-[e]->(Katy:Person {name:'Katy'})
WITH e LIMIT 1
DELETE e
We’ll be left with T[0, 1] = 1 where 1 is the remaining edge internal ID.
Tensors and Sparse matrices
As mentioned earlier tensors within FalkorDB can be treated as regular 2D matrices, the GraphBLAS type for a matrix is GrB_Matrix and so to demonstrate the flexibility of this API consider how one can find out which nodes labeled as Admin has an outgoing edge of relationship-type Login ?
MATCH (a:Admin)-[Login]->(s:System) RETURN a
Assuming our graph describes Login connections between different types of users: Admin, Owner and Contributor to nodes of type System
I should mention that FalkorDB maintains a boolean sparse matrix for each label, L[i, i] = True if node with internal ID i is labeled as L.
To answer the above question all we need to compute is:
Expression: Admin * Login * System
- Admin – being a label matrix noting all nodes in the graph labeled as Admin.
- Login – a relationship tensor capturing all connections of type Login.
- System – being a label matrix noting all nodes in the graph labeled as System.
To compute the expression we’re using a semiring (GxB_ANY_PAIR_BOOL) which doesn’t consider the actual values within the multiplied matrices, but only cares for their patterns, whether the entry i, j is present or not.
This allows for mixing different types of matrices when performing multiplication in addition to speeding the overall computation time.
Conclusion
GraphBAS is a powerful API and I would dare saying it’s the missing piece in the BLAS API enabling operating on sparse vectors and matrices with the additional flexibility of working with a variety of native matrix types in addition to user defined types and custom semirings.
FalkorDB had been using it at its core for both clear expressibility of different graph operations in the language of linear algebra but mostly due to its high performance and low memory footprint. Enabling us fast graph query processing which yields low latency and high throughput.
In case you’ve found this post interesting consider following our work on Github or try out FalkorDB.