# #StackBounty: #ds.algorithms #graph-theory #graph-algorithms #directed-acyclic-graph Finding nodes with enough unique ancestors

### Bounty: 50

Given a DAG $$G = (V, E)$$, let $$T \subseteq V$$ be a set of nodes of $$V$$ computed via the following process. Assume the nodes of $$G$$ are sorted in topological order, $$v_1, \dots, v_n$$, and that we process the nodes in this order. Suppose we are processing vertex $$v_i$$. Let $$r(v_i)$$ be the set of ancestors of $$v_i$$. We add $$v_i$$ and all ancestors of $$v_i$$ into $$T$$ if and only if the number of ancestors of $$v_i$$ (including $$v_i$$ itself) that are not in $$T$$ is at least $$\frac{|r(v_i)| + 1}{f}$$, where $$f > 1$$ is a constant provided as part of the input. If a vertex is added to $$T$$, then all its ancestors are also added to $$T$$. Any algorithm that computes $$T$$ must satisfy the following invariant: any vertex $$v_i$$ added into $$T$$ must have had at least $$\frac{|r(v_i)| + 1}{f}$$ of its ancestors (including itself) not in $$T$$ when it was added, and any vertex $$v_i$$ not added into $$T$$ must have had fewer than $$\frac{|r(v_i)| + 1}{f}$$ of its ancestors (including itself) not in $$T$$.

**Problem:** Provide an algorithm that runs in $$\tilde{O}(m + n)$$ time (i.e., ignoring polylogarithmic factors) and produces a valid set $$T$$ given an input graph $$G = (V, E)$$, where the processing order of the vertices is the topological-sort order. Approximation algorithms (meaning constant-factor error on the $$\frac{|r(v_i)| + 1}{f}$$ condition) and Las Vegas algorithms are welcome as well. You may preprocess the graph, but the preprocessing time must also be $$\tilde{O}(m + n)$$.

**Trivial Solution in More Time:** We can do this trivially in $$O(nm)$$ time. For each $$v_i$$, we find its set of ancestors via BFS and check whether each one is in $$T$$. From this we compute the exact fraction not in $$T$$ and either put $$v_i$$ and $$r(v_i)$$ into $$T$$ or not.
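The trivial $$O(nm)$$ approach can be sketched as follows. This is a minimal illustration, not the sought $$\tilde{O}(m + n)$$ solution; it assumes vertices are labeled $$0, \dots, n-1$$ in topological order and that the edge list and function name are placeholders of my choosing.

```python
from collections import defaultdict, deque

def compute_T_trivial(n, edges, f):
    """O(nm) sketch: for each vertex in topological order, collect its
    ancestors by reverse BFS, then apply the threshold test.
    Assumes vertices 0..n-1 are already in topological order."""
    parents = defaultdict(list)  # parents[v] = all u with an edge u -> v
    for u, v in edges:
        parents[v].append(u)
    T = set()
    for v in range(n):
        # Reverse BFS: seen ends up as r(v) together with v itself.
        seen = {v}
        queue = deque([v])
        while queue:
            x = queue.popleft()
            for p in parents[x]:
                if p not in seen:
                    seen.add(p)
                    queue.append(p)
        # len(seen) = |r(v)| + 1; count how many are still outside T.
        not_in_T = sum(1 for x in seen if x not in T)
        if not_in_T >= len(seen) / f:
            T |= seen  # add v and all of its ancestors
    return T
```

On the example path below (`a -> b -> c` as vertices `0 -> 1 -> 2` with $$f = 2$$), this returns $$\{0, 1\}$$, matching $$T = \{a, b\}$$.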

**Example Problem:** Suppose you have the following trivial DAG (below) and $$f = 2$$.

a -> b -> c

Then, $$T = \{a, b\}$$. $$a$$ is processed first, and $$T = \emptyset$$ at that point, so $$a$$ is added to $$T$$. Then $$b$$ is processed after $$a$$ and is added to $$T$$, since $$1/2$$ of the vertices in $$\{a, b\}$$ are not in $$T$$. Finally, $$c$$ is processed and not added to $$T$$, since only $$1/3$$ of the vertices in $$\{a, b, c\}$$ are not in $$T$$. Note that in this example, a trivial $$O(n)$$ algorithm for any line (path) graph is to keep a counter $$c$$ recording the index of the last element added to $$T$$ (since adding the element at index $$c$$ also means adding all its ancestors) and add the $$i$$-th element of the line to $$T$$ if $$\frac{i-c}{i} \geq \frac{1}{f}$$. However, the problem is much less trivial for general graphs.
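The $$O(n)$$ counter idea for a path graph can be sketched as below; the function name is my own placeholder, and vertices are taken to be $$v_1, \dots, v_n$$ along the line.

```python
def compute_T_line(n, f):
    """O(n) sketch for a path v_1 -> v_2 -> ... -> v_n.
    c holds the (1-based) index of the last vertex added to T; adding
    v_c implies v_1..v_c are all in T, so exactly i - c of v_i's
    ancestors (including v_i itself) lie outside T."""
    c = 0
    for i in range(1, n + 1):
        if (i - c) / i >= 1 / f:  # threshold test (i - c) >= i / f
            c = i  # add v_i, and hence v_1..v_i, to T
    return c  # T = {v_1, ..., v_c}
```

With $$n = 3$$ and $$f = 2$$ this returns $$c = 2$$, i.e. $$T = \{v_1, v_2\}$$, agreeing with the worked example above.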

Has anyone seen a similar problem in the literature, or algorithms that solve related problems?

