Degree sequence

A degree sequence is the list of the number of connections — the degrees — of each node in a graph, ordered from highest to lowest or in the order of the nodes themselves. It is the most compressed summary of a network's local connectivity structure: every other property of the graph, from its global diameter to its clustering coefficient, is constrained by what the degree sequence permits. In network science, the degree sequence is not merely a descriptive statistic but a fundamental object of study, because it reveals what a network could be before any other structure is imposed.

The study of degree sequences is older than network science. In graph theory, a sequence of non-negative integers is called graphic if there exists at least one simple graph whose nodes have exactly those degrees. The Erdős–Gallai theorem and the Havel–Hakimi algorithm provide necessary and sufficient conditions for a sequence to be graphic, and construct such a graph when one exists. These classical results establish a deep fact: not every list of numbers is a realizable network, and the constraints on realizability are themselves a form of structural information.

Degree Sequence in Network Models

In the Erdős-Rényi model, all nodes are statistically equivalent, and the degree sequence is approximately a Poisson distribution with mean equal to the expected degree. The sequence is tightly concentrated: most nodes have similar degrees, and extreme deviations are exponentially rare. This homogeneity is the defining signature of randomness without structure.

Real networks almost never exhibit Poisson degree sequences. Social networks, the internet, biological interaction networks, and citation networks all have heavy-tailed degree sequences in which a small number of nodes have vastly more connections than the average. These sequences are often approximated by power laws or Pareto distributions, giving rise to the scale-free property. The degree sequence is where the deviation from randomness is most visible.

The Configuration model was designed precisely to decouple the degree sequence from other structural properties. By generating random graphs that match any specified degree sequence, the configuration model reveals which network features are consequences of the degree distribution alone and which require additional mechanisms. For example, the high clustering of social networks is not explained by their degree sequence; the configuration model with the same degrees produces lower clustering. This makes the degree sequence both a constraint and a clue: it tells you what cannot explain the structure you observe.

The Information Content of Degree Sequences

A degree sequence contains less information than the full adjacency matrix of a graph, but it is not information-poor. The degree sequence determines the number of edges, the average degree, the maximum degree, and the heterogeneity of connectivity. It constrains the possible existence of paths, the vulnerability to targeted attack, and the epidemic threshold for disease spread. What it does not determine — and this is crucial — is the specific pattern of connections: which high-degree node connects to which low-degree node, whether edges cluster into triangles, and whether the graph divides into communities.

The gap between degree sequence and full structure is the domain of assortativity, clustering coefficient, and community detection. A network with a given degree sequence can be assortative (high-degree nodes connect to high-degree nodes) or disassortative (high-degree nodes connect to low-degree nodes). The degree sequence permits both; the actual wiring determines which occurs. This is why the configuration model is so powerful: it shows you the baseline, and deviations from the baseline are where the science lives.

The degree sequence is the network's genotype: it specifies what is possible, not what is actual. Two networks with identical degree sequences can differ radically in function — one a robust communication infrastructure, the other a fragile cascade waiting to happen. The degree sequence is necessary but never sufficient. Network scientists who stop at the degree distribution have read the genome but not the organism.