*Bounty: 50*

*Bounty: 50*

Given a DAG $G = (V, E)$, let $T subseteq V$ be a set of nodes of $V$ that is computed via the following process. Assuming the nodes of $G$ are sorted in topological order, $v_1, dots, v_n$. We process the nodes in this topological order. Suppose we are processing vertex $v_i$ in the order. Let $r(v_i)$ be the set of ancestors of $v_i$. We add $v_i$ and all ancestors of $v_i$ into $T$ if and only if the number of ancestors of $v_i$ (including $v_i$ itself) that are not in $T$ is at least $frac{|r(v_i)| + 1}{f}$ where $f$ is some constant $f > 1$ provided as part of the input. If a vertex is added to $T$, then all its ancestors are also added into $T$. Any algorithm that computes $T$ must satisfy the following invariant: any vertex $v_i$ added into $T$ must have at least $frac{|r(v_i)| + 1}{f}$ of its ancestors (including itself) not in $T$ when it was added, and any vertex $v_i$ not added into $T$ must have $< frac{|r(v_i)| + 1}{f}$ of its ancestors (including itself) not in $T$.

**Problem:** Provide an algorithm that runs in $tilde{O}(m + n)$ time (meaning ignoring polylogarithmic factors) that provides a valid set $T$ given an input graph $G = (V, E)$ and the processing order of the vertices is the toposort order. Approximation (meaning constant error on the $frac{r(v_i)+1}{f}$ condition) and Las Vegas algorithms are welcome also. You may preprocess the graph but the preprocessing time must also be $tilde{O}(m + n)$.

**Trivial Solution in More Time** We can do this trivially in $O(nm)$ time. For each $v_i$, we find via BFS its set of ancestors and check whether each one is in $T$. From this we can compute the exact fraction that is not in $T$ and either put $v_i$ and $r(v_i)$ in $T$ or not.

**Example Problem:** Suppose you have the following trivial DAG (below) and $f= 2$.

a -> b -> c

Then, $T = {a, b}$. $a$ is processed first and $T = emptyset$ at first so $a$ is added into $T$. Then, $b$ is processed after $a$ and is added to $T$ since $1/2$ of the vertices in ${a, b}$ are not in $T$. Finally, $c$ is processed and not added to $T$ since it has at most $1/3$ of the vertices in ${a, b, c}$ are not in $T$. Note that in this example, a trivial $O(n)$ algorithm for any line is to keep a counter $c$ of the *index* of the last element added to $T$ (since adding the element at $c$ also means adding all its ancestors) and add the $i$-th element of the line to $T$ if $frac{i-c}{i} geq frac{1}{f}$. However, this problem is much less trivial for general graphs.

Has anyone seen a problem similar to this in the literature or algorithms which solve similar problems?