Many natural and artificial systems, from human brain to wireless sensors, are tightly interconnected and form networks. Interactions in systems with complex organization often leads to collective phenomena -- such as social agreement, power grid synchronization or consensus in distributed systems of machines -- which favor the temporary emergence of special groups of units, being people, sensors or neurons, each playing a crucial functional role.
We propose a unifying approach, based on a special type of geometry developed for machine learning, named diffusion geometry, to understand and predict functional organization in systems with arbitrary complex structure. Diffusion geometry is defined by means of diffusion distance, originally introduced to build diffusion maps of complex data, to reduce their dimensionality by integrating local similarities at different scales, thus giving a global description of the data-set.
In the case of complex networks, it can be related to important dynamical processes, such as random walks and navigability, opinion dynamics like the DeGroot model and metastable synchronization in the Kuramoto model.
The main idea is that the backbone of a system alters the way a piece of information (e.g. a meme in a social network or data exchanged by machines) propagate between two endpoints. We exploit this unique behavior to build a geometric picture of the system, where clusters can be easily identified.
The framework can be used for a variety of application, such as building functional maps of the human brain and more effective representations of groups in structured populations. Other potential applications include machine learning and systems biology.
Mathematical formulation of network diffusion geometry
(A) A Girvan-Newman benchmark network of N=128 oscillators. The mesoscale structure is organized into four clusters.
(B) Phases of identical oscillators reported on a polar coordinate system with unitary radius for the original system (outer ring) and with smaller radius (inner ring) for one realization of its configuration model (which preserves the connectivity distribution of the original data and remove other correlations). The oscillators have initial phases uniformly distributed at time t=0 (left panel). They are free to interact each other, according to the underlying topology, and drive the systems to collective synchronization at t=20 (right panel). The original system reaches a metastable state -- with intra-cluster units synchronized to a common phase, with small fluctuations around a reference value -- whereas the configuration model -- where the mesoscale structure has been destroyed -- quickly reaches the global attractor.
(C) Opinion-formation dynamics of agents in the DeGroot model of decentralized consensus. Each line represents the evolution of an opinion. In the metastable state local consensus is firstly achieved within clusters (see the inset) and later evolves into a collective opinion.
Functional mesoscale organization in an Erdos-Renyi network. (A) An Erdos-Renyi network is not expected to show peculiar mesoscale functional organization (B) when compared to its configuration model (C). Here, diffusion distance matrices are shown in both cases, with color encoding the diffusion distance. In (D), the average diffusion distance in the network of super-units is used to find the most persistent mesoscale in the original network (solid line) and to evaluate its difference from the configuration model (dashed line). The two curves collapse on each other across all scales, correctly suggesting that the identified mesoscale is compatible with its random expectation. Networks with 128 nodes and link probability 0.076 have been considered in this case.
Functional mesoscale organization in a Girvan-Newman network. In this case there is a strong topological mesoscale structure with 4 clusters and the functional mesoscale consists of the same clusters, providing evidence that functional organization might correspond to structural organization. The difference between the original network and its random expectation is evident in (D). Networks with 128 nodes have been considered in this case.
Functional clusters in Macaque visuo-tactile cortex. Diffusion geometry analysis of the anatomical connectivity (335 visual, 85 sensorimotor and 43 heteromodal) from 30 visual cortical areas and 15 sensorimotor areas in the Macaque monkey clearly reveals the regions corresponding to the two
areas, while unraveling more detailed functional clusters which are persistent across time. (A) Average diffusion distance matrix of the empirical network, (B) of its configuration model and (C) network of functional super-units which is most persistent across time and significantly different from random expectation.
Novel information from diffusion geometry. The network representation of the Macaque visuo-tactile cortex is shown in (A). Node's shape encodes anatomical information (circles for sensorimotor areas, squares for visual ones). From left to right, nodes' and groups' color encode the clusters identified from anatomical information only, spin-glass community detection, diffusion geometry functional clusters and configurational clusters (i.e., obtained from a representative random realization of the empirical network, while preserving the underlying degree distribution). (B) The similarity among the identified clusters is quantified by normalized mutual information (left) and variation of information (right).
We exploit diffusion distance to embed the nodes of the network into a metric space (e.g., by using multidimensional scaling). The geometry of this space might help to gain new insights on structure and dynamics of the system:
References:
M. De Domenico. Phys. Rev. Lett. 118, 168301 (2017) [On the cover]
Generalization to multilayer systems
Recently, the above framework has been generalized to the real of multilayer networks and distinct types of random walk dynamics.
Different multilayer networks and their supraadjacency matrix representation (a) an edge-colored multigraph, with layers corresponding to colors and no interlayer connections; (b) a multiplex network where the replicas in the different layers are interconnected sequentially, a type of intertwining or diagonal coupling, and (c) the most general
interconnected case, where inter-layer connections are not restricted to replicas (exogenous interactions). Latin letters denote nodes, while Greek letters are used for layers. D is the intensity of the intertwining between state nodes. C describes the general inter-layer connectivity.
A synthetic two-layers network with unitary diagonal coupling equal to 1, for all i, and its functional characterization. (a) Isolated nodes and different components have been highlighted in the two layers. (b) The supra-adjacency matrix of the multiplex. (c) the NxL diffusion supra-distance matrices for three different random walk dynamics and t = 1, 2, 5. The PrRW is evaluated for r = 0.5.
Average diffusion distance on two-layers multiplexes with different topologies, for fixed values of global average edge overlap (Barabasi-Albert and Watts-Strogatz) and partition overlap (Girvan-Newman) between layers. Dendrograms on the right-hand side of each distance matrix represents the corresponding hierarchical clustering to highlight the meso-scale
organization of the system, with color encoding the planted node assignment in each layer.
Frobenius norm of the supra-distance matrices of a synthetic two-layers network, w.r.t. a diffusive random walk. As the fraction of inter-layer links grows we move from two disconnected multiplexes to a fully inter-connected multi-layer whit all connections across layers. The heatmaps are four representatives of the different regimes: (a) uncoupled layers; (b) a single-inter-link between the replicas of a random nodes coupling the two layers, partially interconnected multiplex; [1/N] fully interconnected multiplex (all state nodes corresponding to the same physical node are interconnected); (c) multilayer regime consisting of an interconnected multiplex with the addition of cross-links between state nodes of
distinct physical nodes.
References:
G. Bertagnolli, M. De Domenico. arxiv:2006.13032 (2020) [Read]