Hyperbolic Geometry in Maps and Networks $ \def\Real{\mathbb{R}} \def\X{\mathbb{X}} \def\S{\mathbb{S}} \def\A{\mathcal{A}} \def\C{\mathcal{C}} \def\P{\mathbb{P}} \def\views{\mathbb{G}} \def\Ob{S} \def\Lu{L} \def\Ll{\ell} \def\e{\varepsilon} \def\homt{\gamma} \def\dl{\mathbf{x}} \def\Real{\mathbb{R}} \def\Int{\mathbb Z} \def\vol{\mathtt{vol}} \def\chains{\mathtt{Ch}} \def\one{\mathbb{1}} \def\heis{\mathcal H} \def\hsp{\mathbb{H}} $


Hyperbolic Geometry of Maps and Networks




yuliy baryshnikov



ECE and Mathematics UIUC

Plan


  • Hyperbolic Spaces: a reminder
  • Hyperbolic Space at Your Fingertips: Geodesics of the Smartphone Maps
  • Navigating the Wires: Hyperbolic Spaces and the Internet

hyperbolic spaces

Let us briefly review some basic properties of the hyperbolic spaces.

The standard models are the unit ball with the metric $$ \frac{dx^2}{(1-\sum x_i^2)^2}, $$ or the half-space, with the metric $$ x_1^{-2}{dx^2}. $$

Hyperbolic geometry differs in many ways from Euclidean geometry (although close to it at small distances - hyperbolic spaces have a scale).

In the half-space model, there are several obvious symmetries: horizontal shifts, and the scaling about a point at the absolute.

In either of the models, - disk or halfspace - the geodesics are the circular arcs or segments (rays) orthogonal to the boundary (easy to prove using Noether theorem and built-in invariances).

The sum of angles in a triangle is less than $\pi$; the circumference of a circle is more than $2\pi r$. Triangles formed by parallel lines are called ideal, and have all the same area, - all planar ideal polygons with a given number of sides do.


The triangles are thin: the union of the tubes around the edges of width proportional to the characteristic length cover the triangle. Much of the properties (especially, the large-scale properties) of the hyperbolic spaces can be derived from this property.


Large scale properties are surviving the quasi-isometries: we say that mapping $f:X\to Y$ is a quasi-isometry if for some constants $A,C>0$, $$ A^{-1}d_X(x,x')-C\leq d_Y(f(x),f(x'))\leq A d_X(x,x')+C. $$

Hyperbolic Space in Google Maps

Navigation of the Google maps in iPad (or any smartphone) is not the same as navigating the Earth. We are moving from views to views.

Assume the Earth to be flat (not unreasonable simplification). Then the space of views $\views$ can be identified with $\Real^2\times\Real_+$:

What is the optimal way to navigate the space of views?

We introduce the finger distance $d_\views$, reflecting how much effort one needs to move between the views.

Obviously, the finger metric is invariant with respect to the horizontal shifts and rescalings.

We consider $\Real^2\times\Real_+$ also as a hyperbolic space with the standard metric $d_\hsp$.

The fact that both metrics are invariant with respect to the same transitive group of symmetries immediately implies

Corollary: The space of views $(\views, d_\views)$ is quasi-isometric to the standard hyperbolic 3-space $\hsp^3$.

By Morse lemma, the geodesics of the hyperbolic space stay within the bounded distance from the geodesics of the finger metric.

In other words, the intuitive navigation algorithm:

	NAVIGATE(VIEW_START, VIEW_GOAL)
	view=VIEW_START;
	while VIEW_GOAL not in VIEW
		        zoom out;
	end while;
	center(VIEW_GOAL);
	until VIEW==VIEW_GOAL;
		        zoom in;
	end until;
	    
is (up to a multiplicative constant) not only optimal - a quasi-geodesic in the finger metric - but is forced on the users by the geometry of the viewspace.

Perhaps not too surprising, the hyperbolic geometry in the space of map views defined by the finger metric is (up to a constant) the best possible if one tries to minimize the distance between (most) of the view pairs of some reasonable compact subset of viewspace (for example, all views about some scale: Google maps allows the scales over 5 orders of magnitude).

Steps done by single motion connect finitely many views: being restricted by agility, image resolution etc. Thus the total number of steps needed is at least $\Omega(\log{V})$, where $V$ is the number of views available at the given resolution level.

large-scale networks and their geometries

Large networks fascinate both for their inherent complexity and for the glimpses of simple universal laws they seem to illustrate.

One of the recent - and very attractive ideas - is to view large networks as geometries, i.e. as path metric spaces.

Rescaling allows to smooth out unwanted local noise; often using self-similarity or other symmetries leads to nice, simple models.

The "real life" network we consider here is the Internet, or, somewhat more precisely, the network of the Autonomous System Nodes. It is a large-scale network consisting of large players, like ISPs, Universities, large companies and various unknown entities.

There about $2.5\times 10^4$ (self-registered) nodes, interacting using BGP.

One research thread in the theory of large networks addresses the presumed hyperbolicity of these networks.

The inquiry has two facets:

  • what does it mean, to look hyperbolic, and
  • what thus defined hyperbolicity implies.

We will ignore here the second question, albeit it has many significant implications. Thus, one can show that in Gromov hyperbolic networks, under suitable demand structure a core emerges: there exists a small ball traversed by most of the traffic.

So, what does it mean, for a network, or large finite metric space, to "look hyperbolic"?

$\def\hyp{{\mathcal{H}}}$

There are two broad proposals, how to answer this question (when a network "looks hyperbolic?"):

  • One can introduce some operational measure of hyperbolicity, or negative curvature, for example Gromov-Rips hyperbolicity (the thin triangle condition), or some geometry comparison property (CAT(0)-geometry).
  • Alternatively, one can assume that the network, with its intrinsic (path- or hop-metric) is close, say in Hausdorff sense, to a standard hyperbolic space: for example is sampled from a large ball in the space $\hyp$, under an appropriate measure.


One of the key assumptions here is that the sample is drawn from a manifold. This is a portentous assumption: samples from manifolds are easier to analyze than those from singular sets.

There is, of course, some potential for cheating here: the hyperbolic spaces look too tree-like...

Let us concentrate on the second approach to the modeling of the large networks. This is the approach based on viewing the networks as "morally" sampled from a hyperbolic space, and in particular, on assuming that the sample is drawn from a manifold.
This idea was advanced in a series of papers, where it was used to explain (observed?) heavy tailed distributions of the degrees of the nodes, to sketch a universal routing protocol, and infer common underlying laws in the structure of the Internet and the Universe.

(adopted from Boguna, Papadoupulos, Krioukov, Nature Comm., 2010)

manifolds and their invariants

Given some collection of points $X$, sampled from a manifold (or, perhaps some space with controlled singularities), what information can be recovered about the space?

We assume that the proximity data derived from the distances on the manifold translate into the proximity data on the point cloud $X$, turning it into a network, i.e. a finite metric space with the distances given by hop counts (or some weighted version).

Let's address the simplest question: how to estimate the dimension of a manifold? Specifically, if we model the Internet as sampled from a hyperbolic space $\hsp^d$, what is the dimension $d$ of this hyperbolic space?

It is not hard to see that the answer might be scale-dependent.

The classical definitions of the dimensions (covering dimension) are somewhat hard to test.

One feasible approach is to look at the volume growth: in $\Real^d$, the volume of the balls of size $R$ grows as $R^d$. This idea has been exploited in many approaches to understanding the dimension of the internet. (Correlation analysis, growth,...)

There are some issues with this approach, however: the definition using the volume growth depends on the metric used.

Consider an archetypal example, the Heisenberg group. It is the group $\heis$ of matrices $$ \left(\begin{array}{ccc} 1 & k & m \\ 0 & 1 & l \\ 0 & 0 & 1\\ \end{array}\right) $$ so that $\heis\cong\Int^3$ as a set. But the balls of radius $R$ are of volume $\approx R^4$.

Perhaps the most reasonable way to measure the dimension of a manifold is via homology.

Recall that the (reduced) homology group of a sphere $S^k$ are $$ \bar{H}_l(S^k,\Int)=\left\{\begin{array}{lll}\Int, &\mathrm{if}& k=l,\\ 0 &\mathrm{otherwise.}&\\ \end{array}\right. $$

This allows one to recover the dimension of a manifold $M^m$ via checking the homologies $$ \bar{H}_*(M, M-\{x\}): $$ it should be of rank $1$ precisely in dimension $m$ and nowhere else.

Note that the dimension is a function of a point - a separate theorem asserts that it is a (locally) constant function.

One can also recognize the boundary points (in a manifold with boundary or, more generally, with corners): if $x\in\partial X$, $$ \bar{H}_*(M,M-x)\cong 0 $$ in all dimensions.

homology from data

Point clouds or networks are somewhat hard to geometerize. One can use some embedding techniques (Eigenmap, Isomap, LLE, pick one or two), or rely on purely topological tools.

The latter path is preferable, being intrinsic. There is a widely used way to construct a simplicial complex from a network: the Rips-Vietoris (clique) complex.

Definition: Let $Y$ be a metric space. The Rips complex $R(\epsilon, Y)$ contains the simplex $\{v_o, v_1,\ldots, v_k\}, v_i\in Y$ if all the pairwise distances between the vertices $v_i, v_j$ are at most $\epsilon$.

How well this simplicial complex approximates the topology of the manifold? In the best possible way:

Approximation Theorem: (Hausmann-Latschev)

For a Riemannian manifold $X$, there exists $\epsilon_o$ such that for any finite subset $Y\subset X$ the Rips complex $R(\epsilon, Y)$, $\epsilon\leq \epsilon_o$, is homotopy equivalent to $X$, if the Hausdorff distance between $Y$ and $X$ is at most $\delta(\epsilon)$ from $X$.

Typical generating mechanism for a finite subset close to a Riemannian manifold $M$ is a random sample from $M$. To recover the dimension of the manifold at a point $x$, we need to have a dense enough sample in a small ball around $x$. One can talk about statistics of the dimension.
Theorem: Let $Y$ be an iid sample of size $N\to\infty$ from the Lebesgue measure on a compact Riemannian manifold $M$ of dimension $d$. Let $Y_o(\epsilon)=Y\cap \partial_\epsilon M$ be the part of the sample at distance at least $\epsilon$ from $\partial M$. Then for $\epsilon$ small enough, the fraction of the points $y\in Y$ for which $$ \bar{H}_\cdot (R(\epsilon,Y), R(\epsilon,Y-B(y,k\epsilon)) $$ is not equal to $\bar{H}_\cdot (S^d)$ tends to $0$ as $N\to\infty$. (Here $k$ is a constant depending only on the dimension $d$ and the sectional curvatures of $M$.)

This results tell us what to expect from what to expect from local dimensions of the Rips complexes of the samples from the hyperbolic spaces $H^d$ (or any other manifold):

  • If one computes the homologies of $$ \bar{H}_\cdot (R(\epsilon,Y), R(\epsilon,Y-B(y,k\epsilon)), $$ then
  • the points "in the bulk" will have reduced homologies of a $d$-sphere;
  • the points near the boundary will have trivial homologies;
  • there will be some number of pathologies (points with neither trivial homologies nor those of $S^d$), but their fraction will be small.

Experiments

We start with the samples from a hyperbolic plane (or rather a large ball $B_R\subset \hsp^3$):

Easier to look at the homologies:

The plots show the cumulative ranks of the local relative homologies (1- and 2-dimensional) as a function of the distance of the node to the center of $B_R$. One can see that the behavior in accordance with the predictions (the boundary is large).

Now, let's look at the ASN. Here is the impressionistic view:

Some graphs:

Ranks of homologies of the nodes ordered by their centrality (larger is closer to the core). Definitely glimpses of a structure, but not of a sample from a manifold.

What these computations tell us:


  • It seems quite clear that the ASN networks are not sampled from the hyperbolic space - at least not from a regular enough density.
  • It does not mean that the hyperbolic spaces cannot be used to embed networks - but they definitely are bad as a template for the underlying geometry.
  • On the other hand, the local homology is a brand new characteristic of a node - we should investigate those.



Thank You!