#StackBounty: #cc.complexity-theory #graph-theory #sat #counting-complexity #planar-graphs Disproving $\oplus$ETH by reducing $\oplus k…

Bounty: 100

This question and its answer discuss reducing CNF-SAT with $n$ variables and $m$ clauses to a problem on a planar graph $G=(V,E)$ with $|V|$ as small as possible. It is said that the best known reduction has $|V| = m^2$, and that a better reduction with $|V| \in o(m^2)$ would refute ETH.


There is a reduction from $\oplus k$-SAT with $n$ variables and $m$ clauses to $\oplus$VERTEX COVER where the output graph $G=(V,E)$ is planar and has $|V| = 51(k+1)nm$. Such a reduction clearly meets the $|V| \in o(m^2)$ requirement when $k$ is a constant and $m$ is superlinear in $n$.
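
To spell out the size comparison, here is the ratio worked through, assuming $k$ is a fixed constant and that "superlinear" is taken to mean $n \in o(m)$ (this reading is an assumption):

$$\frac{|V|}{m^2} = \frac{51(k+1)\,nm}{m^2} = 51(k+1)\,\frac{n}{m} \longrightarrow 0 \quad \text{as } m \to \infty,$$

so $|V| \in o(m^2)$ under these assumptions, and only under them: if $m \in O(n)$ the ratio stays bounded away from $0$.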


Question
Can the line of reasoning from the linked question be applied here to refute $\oplus$ETH, or am I missing some important detail?


Get this bounty!!!

#StackBounty: #ds.algorithms #graph-theory #pr.probability Complexity of finding the most likely edge

Bounty: 50

Consider a connected, unweighted, undirected graph $G$. Let $m$ be the number of edges and $n$ be the number of nodes.

Now consider the following random process. First sample a uniformly random spanning tree of $G$ and then pick an edge from this spanning tree uniformly at random. Our process returns the edge.

There is a probability distribution on edges implied by this process. https://math.stackexchange.com/a/3781031/678546 points out that if $T$ is a uniformly sampled spanning tree then

$$P(e \in T) = \mathscr{R}(e_- \leftrightarrow e_+)$$

where $e = \{e_-, e_+\}$ and $\mathscr{R}(a \leftrightarrow b)$ is the effective resistance between $a$ and $b$ when each edge is given resistance $1$.

Marcus M goes on to give a complexity of $O(mn^3)$ for computing the probabilities for every edge. This is much too slow to run in practice for all but the smallest graphs.

If I only want to find an edge with the maximal probability, is there a faster algorithm? How about if I am happy with an approximation?
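
For concreteness, here is a minimal sketch of the exact computation via effective resistances. It is not a faster algorithm, just a baseline: it grounds one node, inverts the reduced Laplacian once (about $O(n^3)$ arithmetic operations), and reads off $\mathscr{R}(u \leftrightarrow v) = M_{uu} + M_{vv} - 2M_{uv}$ for every edge, where $M$ is that inverse padded with zeros for the grounded node. The function name, the integer node labels `0..n-1`, and the use of exact fractions are assumptions made for the example. Since every spanning tree has $n-1$ edges, the process returns $e$ with probability $P(e \in T)/(n-1)$, so the most likely edge is the one with the largest effective resistance.

```python
from fractions import Fraction


def spanning_tree_edge_probabilities(n, edges):
    """Return {edge: P(edge in uniform spanning tree)} for a connected graph.

    Nodes are assumed to be labelled 0..n-1 (an assumption of this sketch).
    """
    # Build the graph Laplacian with unit edge resistances.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1

    # Ground node n-1: drop its row and column, then invert by Gauss-Jordan
    # elimination on the augmented matrix [reduced Laplacian | identity].
    k = n - 1
    aug = [lap[i][:k] + [Fraction(int(i == j)) for j in range(k)]
           for i in range(k)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = 1 / aug[col][col]
        aug[col] = [x * scale for x in aug[col]]
        for r in range(k):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]

    def m(i, j):
        # Entries involving the grounded node are zero.
        return aug[i][k + j] if i < k and j < k else Fraction(0)

    # Effective resistance of each edge equals P(edge in T).
    return {(u, v): m(u, u) + m(v, v) - 2 * m(u, v) for u, v in edges}


# A triangle with a pendant edge: the bridge (2, 3) is in every spanning tree.
probs = spanning_tree_edge_probabilities(4, [(0, 1), (1, 2), (0, 2), (2, 3)])
most_likely = max(probs, key=probs.get)  # the bridge (2, 3)
```

The probabilities sum to $n-1$, which is a cheap sanity check on any implementation.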


Get this bounty!!!

#StackBounty: #clustering #graph-theory When to use graph clustering (by constructing a graph from raw data) vs conventional clustering…

Bounty: 50

This is a conceptual question. Say I have some tabular data and a known similarity function I want to use to compare records in this data. Records correspond to members of a MileageProgram, for example, and columns hold categorical features of each member (Name, Membership tier, Country of Origin, City of residence, Color of hair, etc…). I could approach this in two ways (there may be more, but I’m interested in comparing these two for the moment):

Approach 1:
One-hot encode the categorical variables (or find another way to embed/encode them). Use the known distance measure/function to calculate pairwise distances among the N members in my dataset. Then perform clustering using whatever makes sense for the structure of the data (e.g. K-means, DBSCAN, whatever…). Maybe throw in some dimensionality reduction.

Approach 2:
Use the known distance measure/function to calculate pairwise distances among the N members in my dataset. Create a linkage (edge) between two members whenever their distance is lower than some threshold T. Then employ community detection methods on the resulting graph (the right one is TBD).
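
As a rough illustration of Approach 2, the sketch below builds the threshold graph and groups records by connected components. Connected components are only the simplest possible stand-in for a real community detection method, and the Hamming distance, the function names, and the sample records are all assumptions made for the example.

```python
from itertools import combinations


def hamming(a, b):
    """Number of categorical fields in which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def threshold_components(records, distance, threshold):
    """Link records whose distance is below `threshold`; return index groups."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Every pair under the threshold becomes an edge of the graph.
    for i, j in combinations(range(len(records)), 2):
        if distance(records[i], records[j]) < threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical members: (tier, country, city).
members = [
    ("gold", "FR", "Paris"),
    ("gold", "FR", "Lyon"),
    ("silver", "US", "NYC"),
    ("silver", "US", "Boston"),
    ("bronze", "JP", "Tokyo"),
]
clusters = threshold_components(members, hamming, threshold=2)
```

Note that no encoding step is needed here: the graph only ever sees the pairwise distances, which is one practical difference from Approach 1.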

Is there a rule of thumb to understand when to prefer Approach 1, and when to prefer Approach 2? What are the pros and cons of choosing one approach vs the other? I can see why thresholding might be a coarse step (in Approach 2) but there must be some scenarios where Approach 2 is the better approach to take?


Get this bounty!!!

#StackBounty: #clustering #graph-theory A graph-based clustering problem

Bounty: 50

I have a graph in which each node is associated with a timestamp. I have around 15-20 nodes associated with each timestamp.

The edges are not weighted, and there cannot be an edge between nodes which share a timestamp.

I’d like to find connected components of the graph but with the additional constraint that each discovered component can contain at most one node from each timestamp.

I can easily find connected components using networkx in Python, but I’m not sure of the best way to enforce the constraint. Often the connected components contain multiple nodes with the same timestamp.

It feels like the best way to proceed is to treat each connected component as a subgraph and try to find an optimal way of removing edges to enforce the constraint. I want to remove the fewest possible edges such that the constraint is satisfied.

I’m not quite sure how to do that. Any help would be much appreciated.

EDIT:

Here is an example of the subgraph for the largest connected component:
[Figure: connected component]

It’s the edges in the middle that are causing the problem. I essentially want to find the edges to remove which optimally break off the branches, so that each branch contains at most one node from each timestamp.

EDIT 2:

I tried a greedy approach whereby I score each edge according to the total number of constraint violations if that edge is removed. I then remove the edge that leads to the greatest improvement. I do this until there are no constraint violations left. It doesn’t even get close to the solution I’m looking for. I end up removing almost all of the edges.
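
For reference, this is roughly what that greedy procedure looks like when written out, in plain Python rather than networkx. It is a heuristic with no optimality guarantee, and the function names and the tiny example graph are assumptions made for illustration. A violation is counted as one for every extra node that shares a timestamp inside a component.

```python
from collections import Counter


def components(nodes, edges):
    """Connected components via iterative depth-first search."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, comps = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            x = stack.pop()
            comp.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        comps.append(comp)
    return comps


def violations(nodes, edges, stamp):
    """Count surplus nodes per timestamp, summed over all components."""
    total = 0
    for comp in components(nodes, edges):
        counts = Counter(stamp[v] for v in comp)
        total += sum(c - 1 for c in counts.values())
    return total


def greedy_cut(nodes, edges, stamp):
    """Repeatedly remove the edge whose removal leaves the fewest violations."""
    edges = list(edges)
    removed = []
    while violations(nodes, edges, stamp) > 0:
        best = min(
            edges,
            key=lambda e: violations(nodes, [f for f in edges if f != e], stamp),
        )
        edges.remove(best)
        removed.append(best)
    return removed, edges


# Two three-step chains joined by one stray edge between a2 and b3.
stamp = {"a1": 1, "a2": 2, "a3": 3, "b1": 1, "b2": 2, "b3": 3}
nodes = list(stamp)
edges = [("a1", "a2"), ("a2", "a3"), ("b1", "b2"), ("b2", "b3"), ("a2", "b3")]
removed, kept = greedy_cut(nodes, edges, stamp)
```

On this small example the greedy step finds the single stray edge, but when every single removal leaves the violation count unchanged (for instance when the branches are joined by several parallel links) the choice becomes arbitrary, which may explain why it ends up removing almost everything.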


Get this bounty!!!

#StackBounty: #algorithms #graph-theory What are the common practices to weight tags relations?

Bounty: 50

I am working on a webapp (fullstack JS) where users create documents and attach tags to them. They also select a list of tags they are interested in and attach them to their profile.

I am not a math guy, but I did some NLP as a hobbyist and learnt about latent semantic indexing: as I understand it, you create a table where you store each pair of words you parsed, and then add weight to a pair each time both words are found next to each other.

I was thinking of doing the same thing with tags: when 2 tags appear on the same document or profile, I increase the weight of that pair. That would allow me to get a ranking of the “closest” tags to a given one.
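
A minimal sketch of that first idea, written in Python for brevity even though the app itself is in JS. The function names and the sample tag sets are assumptions made for illustration, and raw co-occurrence counts are used as weights with no normalisation.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(tag_sets):
    """Count how often each unordered pair of tags appears together."""
    weights = Counter()
    for tags in tag_sets:
        # Sort so that each pair is always stored under the same key.
        for a, b in combinations(sorted(set(tags)), 2):
            weights[(a, b)] += 1
    return weights


def closest_tags(weights, tag, top=3):
    """Rank the tags that co-occur most often with `tag`."""
    scores = Counter()
    for (a, b), w in weights.items():
        if a == tag:
            scores[b] += w
        elif b == tag:
            scores[a] += w
    return scores.most_common(top)


# Each set is the tags of one hypothetical document or profile.
tag_sets = [
    {"python", "graphs", "ml"},
    {"python", "ml"},
    {"graphs", "math"},
    {"python", "web"},
]
weights = cooccurrence_weights(tag_sets)
ranking = closest_tags(weights, "python")
```

Raw counts favour tags that are simply very common, so dividing by how often each tag appears overall may be worth considering once there is real data.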

Then I remembered that I came across web graphs, where websites were represented in a 2D space (x and y coordinates) and placed depending on their links using an algorithm called force vector.

While I do know how I would implement my first idea, I am not sure about the second one. How do I spread the tag coordinates when created? Do they all have an x:0, y:0 at the start?

Since I assume this is a common case of data sorting, I wondered what would be the common/best practices recommended by people of the field.

Are there documents, articles, libraries (npm?) or Wikipedia pages you could point me to that would help me understand what can or should ideally be done? Is my first option a good one by default?

Also, please let me know in comments if I should add or remove a tag to this question or edit its title: I’m not even sure of how to categorize it.


Get this bounty!!!

#StackBounty: #graph-theory A question about Fleury's algorithm

Bounty: 50

The following is Problem 1.4 in [1]:

Finding an Eulerian path. Show that if a connected graph has two vertices of odd degree and we start at one of them, Fleury’s algorithm will produce an Eulerian path, and that if all vertices have even degree, it (Fleury’s algorithm) will produce an Eulerian cycle no matter where we start.
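
To make the statement concrete, here is a small sketch of Fleury's algorithm as it is usually described: from the current vertex, follow an edge whose removal does not disconnect the remaining graph, and take a bridge only when there is no other choice. The bridge test by plain reachability counting and the function name are assumptions of this sketch, not taken from [1].

```python
from collections import defaultdict


def fleury(edges, start):
    """Walk edges from `start`, preferring non-bridges; return the vertex trail."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    def reachable(src):
        # Number of vertices reachable from src using the unused edges.
        seen, stack = {src}, [src]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return len(seen)

    def is_bridge(u, v):
        # An edge is a bridge if deleting it shrinks what u can reach.
        before = reachable(u)
        adj[u].remove(v)
        adj[v].remove(u)
        after = reachable(u)
        adj[u].append(v)
        adj[v].append(u)
        return after < before

    trail, u = [start], start
    while adj[u]:
        candidates = list(adj[u])
        # Take a bridge only if every remaining edge at u is a bridge.
        v = next((w for w in candidates if not is_bridge(u, w)), candidates[0])
        adj[u].remove(v)
        adj[v].remove(u)
        trail.append(v)
        u = v
    return trail


# A triangle 0-1-2 with a pendant edge 2-3: vertices 2 and 3 have odd degree.
path_edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
path = fleury(path_edges, 3)

# Two triangles sharing vertex 2: every degree is even.
cycle_edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
cycle = fleury(cycle_edges, 0)
```

In the first example the trail starts at one odd vertex and ends at the other, and in the second it returns to its starting vertex, which is the behaviour the problem asks to prove in general.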

Reference

[1] C. Moore and S. Mertens, The Nature of Computation, Oxford University Press, 2015.


I have tried to answer this question for a long time, but I have not made any progress. By the way, this is not homework; I am just interested in solving it.


Get this bounty!!!
