Network Diffusion Geometry





Many natural and artificial systems, from human brain to wireless sensors, are tightly interconnected and form networks. Interactions in systems with complex organization often leads to collective phenomena -- such as social agreement, power grid synchronization or consensus in distributed systems of machines -- which favor the temporary emergence of special groups of units, being people, sensors or neurons, each playing a crucial functional role. We propose a unifying approach, based on a special type of geometry developed for machine learning, named diffusion geometry, to understand and predict functional organization in systems with arbitrary complex structure. Diffusion geometry is defined by means of diffusion distance, originally introduced to build diffusion maps of complex data, to reduce their dimensionality by integrating local similarities at different scales, thus giving a global description of the data-set.
In the case of complex networks, it can be related to important dynamical processes, such as random walks and navigability, opinion dynamics like the DeGroot model and metastable synchronization in the Kuramoto model.
The main idea is that the backbone of a system alters the way a piece of information (e.g. a meme in a social network or data exchanged by machines) propagate between two endpoints. We exploit this unique behavior to build a geometric picture of the system, where clusters can be easily identified.
The framework can be used for a variety of application, such as building functional maps of the human brain and more effective representations of groups in structured populations. Other potential applications include machine learning and systems biology.

Network diffusion geometry is a special type of geometry induced by network-driven processes, based on a distance which is part of the class of kinematic distances. This framework provides an effective geometry of a system’s function, which cannot be obtained by purely topological latent geometry.

Geometry induced by spreading dynamics and universal temporal distance.
(a) Evolution of a simulated disease spreading originating in Hong Kong: red symbols encode the prevalence. Top panels show the evolution in the latent space; bottom panels show the same dynamics in the natural space. When the spreading dynamics is depicted by exploiting the induced geometry, complex spatial patterns are mapped to homogeneous wave fronts propagating at constant effective speed.
(b,c) Relationship between epidemic arrival times (Ta) and the two distances for the spreading of influenza A virus subtype H1N1. The relation is nonlinear when geographic distance Dg is used (part b) whereas it is nicely reproduced by a straight line when effective distance Deff is considered (part c).
(d,e) | A signal travelling from vertex j to the rest of the system exhibits different spreading patterns (part d), captured by the universal temporal distance L(j → i), impacting different nodes (such as nodes 1 and 2) in different ways (part e) depending on the type of dynamics, described by the parameter θ: distance limited (θ = 0; top); degree limited (θ > 0; middle); or composite (θ < 0; bottom) Aij indicates the corresponding entries of the adjacency matrix; xi(t) indicates the ith entry of the state vector at time t.
(f–h) The homogeneous propagation of concentric wave fronts emerge from the analysis of a broad spectrum of synthetic and empirical systems. R and R indicate models capturing regulatory dynamics, M captures ecological interactions, E captures epidemic spreading, N captures neuronal activation and P captures population dynamics.

Figure from Network geometry, Nature Reviews Physics volume 3, 114 (2021), readapted from Science 342, 1337 (2013) and Hens et al, Nature Physics 15, 403 (2019).

Mathematical formulation of network diffusion geometry


(A) A Girvan-Newman benchmark network of N=128 oscillators. The mesoscale structure is organized into four clusters.

(B) Phases of identical oscillators reported on a polar coordinate system with unitary radius for the original system (outer ring) and with smaller radius (inner ring) for one realization of its configuration model (which preserves the connectivity distribution of the original data and remove other correlations). The oscillators have initial phases uniformly distributed at time t=0 (left panel). They are free to interact each other, according to the underlying topology, and drive the systems to collective synchronization at t=20 (right panel). The original system reaches a metastable state -- with intra-cluster units synchronized to a common phase, with small fluctuations around a reference value -- whereas the configuration model -- where the mesoscale structure has been destroyed -- quickly reaches the global attractor.

(C) Opinion-formation dynamics of agents in the DeGroot model of decentralized consensus. Each line represents the evolution of an opinion. In the metastable state local consensus is firstly achieved within clusters (see the inset) and later evolves into a collective opinion.


Functional mesoscale organization in an Erdos-Renyi network. (A) An Erdos-Renyi network is not expected to show peculiar mesoscale functional organization (B) when compared to its configuration model (C). Here, diffusion distance matrices are shown in both cases, with color encoding the diffusion distance. In (D), the average diffusion distance in the network of super-units is used to find the most persistent mesoscale in the original network (solid line) and to evaluate its difference from the configuration model (dashed line). The two curves collapse on each other across all scales, correctly suggesting that the identified mesoscale is compatible with its random expectation. Networks with 128 nodes and link probability 0.076 have been considered in this case.

Functional mesoscale organization in a Girvan-Newman network. In this case there is a strong topological mesoscale structure with 4 clusters and the functional mesoscale consists of the same clusters, providing evidence that functional organization might correspond to structural organization. The difference between the original network and its random expectation is evident in (D). Networks with 128 nodes have been considered in this case.


Functional clusters in Macaque visuo-tactile cortex. Diffusion geometry analysis of the anatomical connectivity (335 visual, 85 sensorimotor and 43 heteromodal) from 30 visual cortical areas and 15 sensorimotor areas in the Macaque monkey clearly reveals the regions corresponding to the two areas, while unraveling more detailed functional clusters which are persistent across time. (A) Average diffusion distance matrix of the empirical network, (B) of its configuration model and (C) network of functional super-units which is most persistent across time and significantly different from random expectation.


Novel information from diffusion geometry. The network representation of the Macaque visuo-tactile cortex is shown in (A). Node's shape encodes anatomical information (circles for sensorimotor areas, squares for visual ones). From left to right, nodes' and groups' color encode the clusters identified from anatomical information only, spin-glass community detection, diffusion geometry functional clusters and configurational clusters (i.e., obtained from a representative random realization of the empirical network, while preserving the underlying degree distribution). (B) The similarity among the identified clusters is quantified by normalized mutual information (left) and variation of information (right).

We exploit diffusion distance to embed the nodes of the network into a metric space (e.g., by using multidimensional scaling). The geometry of this space might help to gain new insights on structure and dynamics of the system:




References:

Generalization to multilayer systems


Recently, the above framework has been generalized to the real of multilayer networks and distinct types of random walk dynamics.

Different multilayer networks and their supraadjacency matrix representation (a) an edge-colored multigraph, with layers corresponding to colors and no interlayer connections; (b) a multiplex network where the replicas in the different layers are interconnected sequentially, a type of intertwining or diagonal coupling, and (c) the most general interconnected case, where inter-layer connections are not restricted to replicas (exogenous interactions). Latin letters denote nodes, while Greek letters are used for layers. D is the intensity of the intertwining between state nodes. C describes the general inter-layer connectivity.


A synthetic two-layers network with unitary diagonal coupling equal to 1, for all i, and its functional characterization. (a) Isolated nodes and different components have been highlighted in the two layers. (b) The supra-adjacency matrix of the multiplex. (c) the NxL diffusion supra-distance matrices for three different random walk dynamics and t = 1, 2, 5. The PrRW is evaluated for r = 0.5.

Average diffusion distance on two-layers multiplexes with different topologies, for fixed values of global average edge overlap (Barabasi-Albert and Watts-Strogatz) and partition overlap (Girvan-Newman) between layers. Dendrograms on the right-hand side of each distance matrix represents the corresponding hierarchical clustering to highlight the meso-scale organization of the system, with color encoding the planted node assignment in each layer.


Frobenius norm of the supra-distance matrices of a synthetic two-layers network, w.r.t. a diffusive random walk. As the fraction of inter-layer links grows we move from two disconnected multiplexes to a fully inter-connected multi-layer whit all connections across layers. The heatmaps are four representatives of the different regimes: (a) uncoupled layers; (b) a single-inter-link between the replicas of a random nodes coupling the two layers, partially interconnected multiplex; [1/N] fully interconnected multiplex (all state nodes corresponding to the same physical node are interconnected); (c) multilayer regime consisting of an interconnected multiplex with the addition of cross-links between state nodes of distinct physical nodes.

References: