Generative Data Intelligence

Math That Lets You Think Locally but Act Globally | Quanta Magazine

Date:

Introduction

In math, as in life, small choices can have big consequences. This is especially true in graph theory, a field that studies networks of objects and the connections between them. Here’s a little puzzle to help you see why.

Given six dots, your goal is to connect them to each other with line segments so that there’s always a path between any pair of dots, with no path exceeding two line segments in length. Stop scrolling for a moment and try to work out a solution.

If you solved it, I bet you have something that looks like this:

(click here to reveal the answer)

Notice that this indeed satisfies the requirements of the puzzle. There’s a path between any two dots — what graph theorists call vertices — and no path is longer than two line segments, or edges. (Note: In the puzzle and throughout the column, paths are not allowed to use the same edge more than once.) Your solution might look slightly different, but you’ll end up with the same essential structure, which is easier to see if you move the vertices around a bit.

This “star” structure in graph theory has one central vertex that connects to each of a group of other vertices by a single edge, with no edges between any other vertices. This star isn’t just a solution to our puzzle; it’s the only solution. (You’ll get a chance to prove this in the exercises at the end of the column.)

This puzzle illustrates how local restrictions, like forbidding paths of length 3 or longer, can sometimes force global structures, like the star, to emerge as a result. Leveraging relationships like these can be a powerful tool for understanding graphs and networks, especially when you are looking for certain important structures.

One such structure is a “clique” — a set of vertices where every vertex is directly connected to every other vertex by an edge. Cliques are important because they identify areas of maximum connectivity and dependence. For example, in a network of airline routes, a clique represents a group of cities all connected to each other by direct flights in both directions. This is a source of strength in the network, since you can reach any city in a single flight, and if a route is canceled, you can still connect to any destination with relative ease.

By contrast, an “anti-clique” is a set of vertices where none are directly connected to any others. On a flight map, this represents a group of cities with no direct flights between them. You may still be able to get from point A to point B in an anti-clique, but not directly. You’ll have to travel outside the group first, so getting there will cost you: in time, money or, more generally, efficiency. In a way, anti-cliques identify areas of maximum independence in a network, and for this reason they are also known as independent sets. (You might also see these sets referred to as “cocliques” or “stable sets” elsewhere.)

Finding large cliques or large independent sets, or even just guaranteeing they exist, turns out to be an important part of analyzing graphs and networks. And that’s where the Erdős-Hajnal conjecture comes in. This says that if you forbid certain local structures in graphs, then certain global structures — in particular, relatively large cliques or relatively large independent sets — are unavoidable. It’s one of many open questions attributed to Paul Erdős, the famous mathematical nomad who traveled the world sharing coffee and conjectures with thousands of collaborators. When the Erdős-Hajnal conjecture holds, it provides information that can be used by scientists in fields as diverse as biology, logistics and computer science to draw even stronger conclusions about the global structures of their networks.

We’ve actually the seen Erdős-Hajnal conjecture in action already. In our original puzzle the star shape was unavoidable, and as it turns out, that shape guarantees a large independent set. The five vertices connected to the center of the star have no other connections to each other. You can see this independent set by ignoring the central vertex and the edges that connect to it.

Notice that the existence of this independent set alerts us to a vulnerability in our network. If this were a flight map, and that central city became inaccessible, no one would be able to fly anywhere.

In our puzzle, forbidding the local structure of paths of length 3 or longer guarantees an independent set of size 5. However, this puzzle relies on something that the Erdős-Hajnal conjecture does not, namely that there exists a path between any two vertices. This means that the graph must be “connected,” and this isn’t part of the Erdős-Hajnal conjecture. Is a large independent set unavoidable if such a graph isn’t necessarily connected?

To see if we can avoid a large independent set in our graph, let’s think like a mathematician and start by considering extreme cases. If we aren’t required to connect everything, what if we don’t connect anything?

Adding no edges at all actually gives us the largest independent set possible, all six vertices. In fact, any vertex that doesn’t connect to an edge can be added to any existing independent set to make it bigger, so to get smaller independent sets we probably want every vertex to have at least one edge. What about something like this?

This graph is in two pieces, and you can get an independent set of size 4 by selecting the two vertices on the ends of each piece. Notice how none of the four chosen vertices are connected by an edge to any of the others, thus forming an independent set.

What about a graph like this?

This graph is in three disconnected pieces, and you can get an independent set of size 3 by selecting one vertex from each piece.

This turns out to be the smallest independent set we can achieve under the given conditions. In other words, in a graph with six vertices, forbidding paths of length 3 guarantees an independent set that is at least half the size of the original graph, which is pretty big when it comes to graphs.

There’s actually something more general going on. All the graphs we explored have one important thing in common: They are all collections of stars!

The graph on the left is two stars of three vertices each, and the graph on the right is three stars of two vertices each. Even the graph with no edges can be thought of as six stars of one vertex each. The rule that forbids paths of length 3 forces the graph to be a collection of stars, and this is true whether you start with six vertices or 600. So when it comes to finding independent sets, the only question is how many disconnected stars you end up with.

In general, if you end up with a lot of stars, it’s easy to get a large independent set. Since the stars aren’t connected to each other, you can just choose one vertex from each star, so this guarantees an independent set at least as large as the number of stars. In the example above on the right, you can take one vertex from each of the three stars of size 2 and produce an independent set of size 3.

On the other hand, if you end up with only a few stars, the stars themselves must be big enough to account for all the vertices in the original graph, and we’ve already seen how to get a relatively large independent set from a star — just take everything but the central vertex. For example, if a graph consists of only two stars, then one of the stars is guaranteed to contain at least half the vertices of the original graph, and this guarantees an independent set roughly half the size of the original graph. We can make an even bigger independent set in our example by combining the independent sets from the disconnected stars. (How small could the biggest possible independent set be? See the exercises for more about this.)

In general, for a graph that is just a collection of disconnected stars, a large independent set is unavoidable. Large stars produce large independent sets, but small stars mean lots of stars, which also produce large independent sets. This approach doesn’t just work for our simple example. It also suggests how to handle a more complicated problem.

Suppose you have a graph with n vertices, and you impose the following local restriction: No three vertices can all be connected to each other. This would be like designing a flight map with a particular goal of minimizing locally redundant routes. If you can get between A and B and between B and C, then you aren’t allowed to create a separate, direct route between A and C.

In other words, there can be no cliques of size 3. In geometry terms, the graph is “triangle-free.”

Must a triangle-free graph have a relatively large independent set? The answer is yes, and as before, the secret lies in the stars. We can show that a triangle-free graph with n vertices must have an independent set that is at least roughly $latex sqrt{n}$ in size, which is relatively big by graph theory standards. Let’s walk through the argument using the following triangle-free graph as an example.

Start by picking any vertex in the graph and considering all of its neighbors — the vertices connected to it by an edge.

None of these neighbors of your chosen vertex are connected to each other — because that would make a triangle and this is a triangle-free graph — so each vertex and its set of neighbors essentially forms a star.

Even though this star is connected to other vertices in the graph, we can still use its properties to guarantee a large independent set. The key is to form a star and then remove it and all the edges that connect to it.

Now find a star in the remaining graph and remove it, too.

In this example we are now left with two single vertices — one-vertex stars themselves — and we form an independent set of size 4 by adding the central vertices from the other two stars we found. Since our original graph has nine vertices and $latex sqrt{9}=3$, our independent set of size 4 fits our conjecture.

This argument works in general for any triangle-free graph. The key is to just keep finding and removing stars and the edges that connect to them until you’ve accounted for all the vertices. Once you’ve done that, just count the number of stars.

Let’s say you end up with k stars. If $latex k > sqrt{n}$, then you can form an independent set of size k by taking the central vertex from each star. This works because no two central vertices from any two stars could be connected to each other, as that would have made them neighbors in the original graph.

If $latex k < sqrt{n}$, then this smaller number of stars guarantees that at least one of the stars must be relatively big. In fact, it must have at least $latex sqrt{n}$ vertices. Why? Because all n vertices of the graph must be found among the k stars. If all of the k stars had fewer than $latex sqrt{n}$ vertices each, then the total number of vertices in the graph would be less than $latex k times sqrt{n}$. But since $latex k < sqrt{n}$, this means that the total number of vertices in the graph would have to be less than $latex sqrt{n} times sqrt{n} = n$. Since we know the graph has n vertices, the assumption that the stars are small leaves some vertices unaccounted for, which means at least one of the stars must have at least $latex sqrt{n}$ vertices. And a star of size $latex sqrt{n}$ guarantees an independent set with a size of at least $latex sqrt{n} -1$, which is roughly $latex sqrt{n}$. (Graph theorists are generally interested in how big a subgraph is relative to the original graph size, so the $latex sqrt{n}$ is much more important than the $latex – 1$.)

As with our original example, this result about triangle-free graphs is related to the Erdős-Hajnal conjecture. If a graph is triangle-free, then it can’t have a clique larger than size 2, since a clique of size 3 or more would require a triangle. Forbidding triangles means forbidding large cliques, and this forces large independent sets to emerge, just as the Erdős-Hajnal conjecture predicts.

Mathematicians recently proved much more. The Erdős-Hajnal conjecture has been proved in all cases where the forbidden subgraph consists of four or fewer vertices (like a square or a path of length 4). And in 2021 a group of mathematicians proved that if a graph contains no pentagons — that is, a loop that connects five vertices — then an unusually large clique or an unusually large independent set must exist as a consequence. This surprised some of the mathematicians who proved it, as they expected the Erdős-Hajnal conjecture to be false for pentagons. It was another surprisingly powerful mathematical result that came from thinking locally to act globally.

Introduction

Introduction

1. Suppose the biggest independent set you can find in a graph has size 1. What can you say about the graph?

Click for Answer 1:

This means that every possible edge exists in the graph. In this case the graph itself is a clique. We call such graphs “complete graphs.”

Introduction

2. Explain why it’s impossible to get anything other than a star in a connected graph with no paths of length 3 or longer.

Click for Answer 2:

If there’s only one vertex, it’s a star and we’re done. If there are two vertices, they must be connected, making a two-vertex star. If there are three vertices, then there are three possible edges that could exist between them. Given the conditions, the three vertices must form a path of length 2, like this:

This is because if either of these edges were missing, then the graph wouldn’t be connected, but if you added in the third edge, you’d make a triangle, which would give you a path of length 3.

If there are any other vertices, where could they connect? It would have to be the middle vertex and the middle vertex only. Connecting to the vertex on either end would immediately create a path of length 3. Thus the result will be a group of vertices all connected by a single edge to a central vertex. In other words, a star.

Introduction

3. Suppose a graph with n vertices has no paths of length 3. How big an independent set is guaranteed in such a graph?

Click for Answer 3:

If n is even, $latex frac{n}{2}$; if n is odd, $latex frac{n+1}{2}$. In other words, the “ceiling” of $latex frac{n}{2}$, written $latex lceil{frac{n}{2}}rceil$.

We already know that such a graph must be a collection of disconnected stars, and since the stars are disconnected, we can always form an independent set by combining the independent sets of each individual star. We also know that each star can contribute all but one of its vertices to an independent set by just dropping the center. In particular, if a star has size m > 1, then it can contribute m − 1 of its vertices to an independent set. The strategy is to make m − 1 as small as possible for every star, and to do that, you make as many two-vertex stars as you can.

If n is even, you end up with a bunch of pairs like this:

And you form an independent set by taking one vertex from each, yielding an independent set equal in size to the number of stars: $latex frac{n}{2}$.

If n is odd, you’ll have one vertex left over when you pair the vertices up.

Here, the independent set will be one vertex from each of the $latex frac{n-1}{2}$ pairs plus the leftover vertex for an independent set of size $latex frac{n-1}{2} + 1 = frac{n+1}{2}$.

spot_img

Latest Intelligence

spot_img

Chat with us

Hi there! How can I help you?