1. Array Algorithms#

These problems cover fundamental array techniques: scanning, two-pointer methods, and partitioning.

1.1. The input is an array $A$ of length $n+1$; every element of ${1, 2, \ldots, n}$ is in $A$ and one element appears in $A$ twice. Describe an $O(n)$-time algorithm that returns the repeated element, explain why it’s correct, and analyze its running time.

Solution

We give two solutions.

ALG1(A):
  return sum(A) - n(n+1)/2

Let $k$ denote the repeated element in $A$. Then the sum of $A$ is exactly $1 + 2 + \cdots + n + k = n(n+1)/2 + k$. Thus, by computing the sum of $A$ and subtracting $n(n+1)/2$, we obtain $k$.

Calculating the sum of $A$ requires a single $O(n)$-time loop, so the total running time is $O(n)$.

The second solution scans $A$ and keeps track of the seen elements by using a binary array $B$:

ALG2(A):
  B = [0]*n  # array of length n, every element is 0
  for i = 1, ..., n+1:
    if B[A[i]] = 1:
      return A[i]
    B[A[i]] = 1

The array $B$ acts as a hash table whose keys are elements in $A$, and $B[k] = 1$ indicates that the algorithm has already seen $k$ in $A$. The second time we encounter an element $k$ in $A$, the algorithm checks that $B[k] = 1$ and returns $k$, as desired.

Creating an array of size $n$ takes $O(n)$ time. The algorithm then makes $O(n)$ iterations, and each iteration takes $O(1)$ time, so the total running time is $O(n)$.

1.2. The input is an array $A$ of $n$ distinct integers sorted in increasing order. Our goal is to return an array $B$ that contains the squares of the elements of $A$, and $B$ is sorted in non-decreasing order. Describe an $O(n)$-time algorithm for this problem, briefly explain why it’s correct, and analyze its running time. Example: If $A = [-1, 0, 1, 2]$, the algorithm should return $B = [0, 1, 1, 4]$.

Solution

(This is LeetCode 977.) We use a two-pointer approach: start with $(i, j) = (1, n)$. If $A[i]^2 \geq A[j]^2$, write $A[i]^2$ in $B$ and increment $i$; otherwise, write $A[j]^2$ in $B$ and decrement $j$. Use a third variable $k$, initially $n$, to track the current index in $B$. Repeat until $i = j$:

ALG(A):
  i, j, k = 1, n, n
  B = [0]*n
  while i <= j:
    if A[i]^2 >= A[j]^2:
      B[k] = A[i]^2
      i += 1
    else:
      B[k] = A[j]^2
      j -= 1
    k -= 1
  return B

Since $A$ is sorted, the last element of $B$ is either $A[1]^2$ or $A[n]^2$. If $A[1]^2 \geq A[n]^2$, then again since $A$ is sorted, the second-to-last element of $B$ is either $A[2]^2$ or $A[n]^2$. This logic continues until $i = j$.

The running time analysis is very similar to that of the two-pointer algorithm for Two Sum. The number of iterations is $O(n)$ because $j-i$ is initially $n-1$ and decreases by $1$ in each iteration loop until $j - i = 0$. Each iteration takes $O(1)$ time, so the total running time is $n \cdot O(1) = O(n)$.

1.3. The input is $(A, B)$, where $A$ is an array of $m$ distinct, positive integers, $B$ is an array of $n$ distinct, positive integers, and $m \leq n$. Our goal is to return a list $C$ (in any order) of the intersection of $A$ and $B$. Describe an $O(n\log m)$-time algorithm for this problem and analyze its running time.

Solution

Sort $A$. Then, for each element $b$ in $B$, binary search for $b$ in $A$. If $b$ is in $A$, append $b$ to a list $C$ (initially empty).

ALG(A, B):
  sort A
  C = [empty list]
  for b in B:
    if b in A:  # use binary search
      add b to C
  return C

Sorting $A$ takes $O(m\log m)$ time. The algorithm then makes $n$ iterations, and each iteration takes $O(\log m)$ time. Thus, the total running time is $O(m\log m) + n \cdot O(\log m) = O(n\log m)$.

1.4. The input is an array $A$ of $n$ positive integers. Our goal is to return indices $(i, j)$ such that $i < j$ and $(j-i) \cdot \min(A[i],A[j])$ is maximized. Describe an $O(n)$-time algorithm for this problem, briefly explain why it’s correct, and analyze its running time.

Solution

(This is LeetCode 11.) We use a two-pointer approach similar to Two Sum: start with $(i, j) = (1, n)$. If $A[i] \leq A[j]$, then increment $i$; otherwise, decrement $j$. Along the way, keep track of the maximum value of $(j-i) \cdot \min(A[i], A[j])$.

ALG(A):
  i, best_i, j, best_j, m = 1, 1, n, n, -inf
  while i < j:
    val = (j-i)*min(A[i], A[j])
    if val > m:
      m, best_i, best_j = val, i, j
    if A[i] <= A[j]:
      i += 1
    else:
      j -= 1
  return best_i, best_j

The analysis is similar to that of the two-pointer algorithm for Two Sum. We are not checking every pair $(i,j)$, but that is not necessary: if $A[i] \leq A[j]$, then for this particular $i$, there is no better solution $(i,j’)$ for any $j’ < j$. Thus, by incrementing $i$, we cannot miss the optimal solution.

The algorithm makes $O(n)$ iterations because $j-i$ decreases by $1$ in each one, and each iteration takes $O(1)$ time, so the total running time is $O(n)$.

1.5. The input is an array $A$ of $n$ distinct integers. Our goal is to return the minimum difference among every pair of elements in $A$. Describe an $O(n\log n)$-time algorithm for this problem, explain why it’s correct, and analyze its running time. Example: If $A = [5, 1, 7, 10]$, ALG should return $2$. (ALG should always return a positive integer.)

Solution

Sort $A$, then traverse $A$ and keep track of the minimum difference between consecutive elements:

ALG(A):
  sort A  # Merge Sort
  diff = infinity
  for i = 2, ..., n:
    diff = min(diff, A[i] - A[i-1])
  return diff

The algorithm is correct because after we sort, the minimum difference must be a difference of two consecutive elements. More formally, if $A$ is sorted then $A[i]-A[i-1] < A[i]-A[j]$ for all $j < i-1$, so it suffices to just check the differences between consecutive elements.

Merge Sort takes $O(n\log n)$ time. The algorithm then makes $n-1$ iterations, and each takes $O(1)$ time, so the total running time is $O(n\log n) + O(n) = O(n\log n)$.