2. Essential Graph Algorithms#

These problems cover BFS, DFS, and their applications: shortest paths, cycle detection, topological ordering, and connectivity.

2.1. The input is a connected, undirected graph $G$. For any two vertices $u, v$, let $d(u, v)$ denote the distance from $u$ to $v$ (i.e., the length of the shortest $u$-$v$ path). The diameter $D$ of $G$ is the maximum distance in $G$, i.e. $D = \max_{u,v} d(u,v)$. Describe an $O(mn)$-time algorithm that returns $D$, briefly explain why it’s correct, and analyze its running time.

Solution

The brute-force solution is enough: run BFS from every vertex and keep track of the farthest distance.

ALG(G):
  diam = 0
  for each vertex u:
    d = BFS(G, u)
    diam = max(diam, max(d))
  return diam

When we run BFS from $u$, we compute the distance from $u$ to every other vertex. The farthest distance in that run is $\max(d)$, and taking the maximum over all starting vertices yields the diameter. Because $G$ is connected, BFS from each vertex takes $O(m+n) = O(m)$ time. There are $n$ BFS runs, so the total running time is $n \cdot O(m) = O(mn)$.

2.2. The input is a DAG $G$. Our goal is to return a pair of vertices $(s, t)$ such that $s$ cannot reach $t$ and $t$ cannot reach $s$, or return nothing if no such pair exists. Describe an $O(m+n)$-time algorithm for this problem, explain why it’s correct, and analyze its running time.

Solution

Compute a topological ordering and then look for a consecutive pair with no edge between them.

ALG(G):
  T = topological ordering of V
  for i = 1, ..., n-1:
    if (T[i], T[i+1]) not in E:
      return (T[i], T[i+1])

If the algorithm returns $(T[i], T[i+1])$, then no path can exist from $T[i]$ to $T[i+1]$ because every edge from $T[i]$ goes to a later vertex in the topological order, and there is no direct edge to the immediately following vertex. Likewise, there can be no path from $T[i+1]$ back to $T[i]$ because $T[i]$ appears earlier in the ordering. If every consecutive pair has an edge, then every vertex can reach every later vertex by following the chain of edges along the topological order.

Topological sorting takes $O(m+n)$ time, and scanning the $n-1$ consecutive pairs also takes $O(m+n)$ time if we check adjacency in a single pass over the outgoing edges. The total running time is $O(m+n)$.

2.3. The input is an undirected graph $G$. Our goal is to return an array $c$ of integers such that for every two vertices $u$ and $v$, $u$ and $v$ are in the same connected component if and only if $c[u] = c[v]$. Describe an $O(m+n)$-time algorithm for this problem and analyze its running time.

Solution

Use BFS to label each component with a unique component number.

ALG(G):
  c, cc_num = [0]*n, 1
  for each vertex u:
    if c[u] = 0:
      BFS(G, u)  # when processing any vertex v, set c[v] = cc_num
      cc_num += 1
  return c

Each time we start BFS from an unlabeled vertex $u$, we assign the current component number to every vertex reachable from $u$. Those vertices are exactly the connected component containing $u$. Since each edge is explored at most twice across all BFS runs and each vertex is visited at most once, the total running time is $O(m+n)$.

2.4. The input is a connected, undirected graph $G$. We say that $G$ is bipartite if there exists an array $c$ such that $c[u] \in \{0, 1\}$ for every $u \in V$, and for every edge $\{u, v\} \in E$, $c[u] \neq c[v]$. Describe an $O(m+n)$-time algorithm that determines if $G$ is bipartite (in which case it should return a valid $c$; otherwise, it should return nothing), explain why it’s correct, and analyze its running time.

Solution

Use BFS to color the graph with two colors and detect any conflict.

ALG(G):
  c = [∞]*n; c[1] = 0
  Q = queue(1)
  while |Q| >= 1:
    u = dequeue from Q
    for v in G[u]:
      if c[v] = ∞:
        c[v] = 1 - c[u]
        add v to Q
      else if c[v] = c[u]:
        return
  return c

If the algorithm returns nothing, then it found an edge $\{u, v\}$ with $c[v] = c[u]$, which means $G$ is not bipartite. If it returns $c$, then every edge was checked and every endpoint pair was assigned different colors, so $c$ is a valid bipartite coloring. Each edge is examined at most twice and each vertex is enqueued at most once, so the running time is $O(m+n)$.