*Bounty: 100*

*Bounty: 100*

I was recently asked to solve the following challenge (in C++) as part of the interview process. However, I haven’t heard from them at all afterwards, and based on past experiences of unsuccessful applicants that I’ve read online, my submission didn’t meet their standards. Since I did solve the challenge to the best of my abilities, I’m at a loss to understand in what ways I could have made a better solution. I’m posting the problem statement (in my own words) and my solution here. Please critique it as you would for a potential applicant to your team (as a means for gauging whether it’s worthwhile to have a subsequent phone-screen with such an applicant).

## Input Details

The utility would take as input an input file containing a list of

equations, one per line. Each equation has the following format:

`<LHS> = <RHS>`

, where`LHS`

is the left-hand side of the equation and is always a variable name.

`RHS`

is the right hand side of the equation and can be composed of the following only:

- Variables
- Unsigned integers
- The
`+`

operator## Assumptions

Input is well-formed i.e.

- Number of variables = Number of equations, with each variable occurring on the LHS of exactly one equation.
- The system of equations has an unique solution, and does not have circular dependencies.
- There are one or more white spaces between each token (numbers, + operator, variables).
- A variable name can only be composed of letters from the alphabet (e.g. for which
`isalpha(c)`

is true).- All integers will fit in a C
`unsigned long`

.## Output Format

The utility would print the value of each variable after evaluating the set of equations, in the format

`<variable name> = <unsigned integer value>`

. The variables would be sorted in ascending (lexicographic) order.## Sample Input Output

Input file:

`off = 4 + r + 1 l = 1 + or + off or = 3 + 5 r = 2`

Expected output for the above input:

`l = 16 off = 7 or = 8 r = 2`

## Implementation Notes

Due to the simplified nature of the input equations, a full-blown linear

equation solver is not required in my opinion (as such a solver would have at least quadratic complexity). A much simplified (and asymptotically faster) solution can be arrived at by modeling the set of input equations as a Directed Acyclic Graph (DAG), by observing the dependencies of the variables from the input equations. Once we can model the system as a DAG, the steps to derive the variable values are as follows:

- Construct the dependency DAG, where each node in the graph corresponds to a variable, and $(a, b)$ is a directed edge from $a$ to $b$
*if and only if*the variable $a$ needs to be*fully evaluated*before evaluating $b$. - Order the vertices in the DAG thus constructed using
*topological sort*. - For each vertex in the sorted order, evaluate its corresponding variable fully before moving on to the next vertex.

The algorithm above has a linear complexity, which is the best we could achieve under the current assumptions. I’ve encapsulated the algorithm in the following class (I’ve used Google’s C++ Style Guide in my code – not sure it’s the *best* choice, but I preferred to follow a style guide that’s at least recognized by and arrived at by a non-trivial number of engineers.)

Class header file:

```
//
// Class that encapsulates a (constrained) linear equation solver. See README.md
// for assumptions on input restrictions.
//
#include <unordered_map>
#include <vector>
#include <list>
#ifndef _EVALUATOR
#define _EVALUATOR
class Evaluator
{
private:
// Stores the values of each variable throughout algorithm
std::vector<UL> variable_values_;
// Hash tables that store the correspondence between variable name and index
std::unordered_map<std::string, UL> variable_index_map_;
std::unordered_map<UL, std::string> index_variable_map_;
// Adjacency list for DAG that stores the dependency information amongst
// variables. If A[i, j] is an edge, it implies variable 'i' appears on the
// RHS of definition of variable 'j'.
std::vector<std::list<UL> > dependency_adj_list_;
// List of equations stored as indices. If the list corresponding to eq[i]
// contains 'j', then variable 'j' appears on the RHS of variable 'i'.
std::vector<std::list<UL> > equation_list_;
// For efficiency, this list stores the number of dependencies for each
// variable, which is useful while executing a topological sort.
std::vector<UL> num_dependencies_;
// Resets all internal data structures
void Clear();
// Prints values of internal data structures to aid in debugging
void PrintState();
// Adds an entry corresponding to each new variable detected while parsing input
UL AddNewVar(std::string& );
// Parse the input equations from filename given as argument, and build the
// internal data structures coressponsing to the input.
bool ParseEquationsFromFile(const std::string&);
// If DAG in dependency_adj_list_ has a valid topological order, returns
// true along with the ordered vertices in the input vector
bool GetTopologicalVarOrder(std::vector<UL>&);
public:
Evaluator() {};
/**
* @brief Evaluate the set of constrained linear equations and returns the
* values of the variables as a list.
*
* @param[in] string: Filename containing list of constrained linear equations.
* @param[in] vector<string>: If solution exists, returns the values of
* variables in lexicographic order (ascending).
*
* @return True if solution exists (always exists for valid input), false if
* input is not well-formed (See README.md for more details about input
* format).
*/
bool SolveEquationSet(const std::string&, std::vector<std::string>& );
};
#endif
```

The main class file:

```
#include "evaluator.h"
#include <sstream>
#include <unordered_set>
#include <set>
#include <queue>
#include <algorithm>
#include <cassert>
#ifdef _EVALUATOR
// Used for early returns if the expression is false
#define TRUE_OR_RETURN(EXPR, MSG)
do
{
bool status = (EXPR);
if (status != true)
{
cerr << __FUNCTION__
<< ": " << MSG << endl;
return false;
}
} while(0)
#endif
using namespace std;
//**** Helper functions local to the file ****
// Returns true if each character in the non-empty string is a digit
bool IsNumber(string s)
{
return !s.empty() && std::all_of(s.begin(), s.end(), ::isdigit);
}
// Given a string, returns a vector of tokens separated by whitespace
vector<string> ParseTokensFromString(const string& s)
{
istringstream iss(s);
vector<string> token_list;
string token;
while (iss >> token)
token_list.push_back(token);
return token_list;
}
// Returns true if the string can be a valid variable name (i.e has
// only alphabetical characters in it).
bool IsValidVar(string& v)
{
for (auto& c: v)
TRUE_OR_RETURN(isalpha(c), "Non-alphabetical char in variable: " + v);
return true;
}
//********************************************
void Evaluator::Clear()
{
variable_values_.clear();
variable_index_map_.clear();
index_variable_map_.clear();
dependency_adj_list_.clear();
equation_list_.clear();
num_dependencies_.clear();
}
void Evaluator::PrintState()
{
for (auto i = 0U; i < dependency_adj_list_.size(); ++i)
cout << index_variable_map_[i] << "(" << i << ") =>"
<< "Value(" << variable_values_[i] << "), Deps("
<< num_dependencies_[i] << ")" << endl;
}
// Ensures all data structures correctly set aside an entry for the new variable
UL Evaluator::AddNewVar(string& v)
{
if (variable_index_map_.count(v) == 0)
{
dependency_adj_list_.push_back(list<UL>());
equation_list_.push_back(list<UL>());
variable_values_.push_back(0);
num_dependencies_.push_back(0);
variable_index_map_.insert(make_pair(v, dependency_adj_list_.size() - 1));
index_variable_map_.insert(make_pair(dependency_adj_list_.size() - 1, v));
assert(num_dependencies_.size() == variable_values_.size() &&
variable_index_map_.size() == variable_values_.size() &&
variable_values_.size() == dependency_adj_list_.size());
}
return variable_index_map_[v];
}
// Parses equation from given input file line-by-line, checking
// for validity of input at each step and returning true only if
// all equations were successfully parsed.
bool Evaluator::ParseEquationsFromFile(const string& sEqnFile)
{
string line;
ifstream infile(sEqnFile);
// This LUT serves as a sanity check for duplicate definitions of vars
// As per spec, only ONE definition (appearance as LHS) per variable is handled
unordered_set<string> defined_vars;
while (getline(infile, line))
{
vector<string> tokens = ParseTokensFromString(line);
string lhs = tokens[0];
// Check if equation is adhering to spec
TRUE_OR_RETURN(tokens.size() >= 3 && IsValidVar(lhs) && tokens[1] == "=",
"Invalid equation: " + line);
// Check if variable on LHS was previously defined - this would make the
// current approach untenable, and require general equation solver.
TRUE_OR_RETURN(defined_vars.count(lhs) == 0, "Multiple defn for: " + lhs);
defined_vars.insert(lhs);
const UL lhs_idx = AddNewVar(lhs);
// The operands appear in alternate positions in RHS, tracked by isOp
for (size_t i = 2, isOp = 0; i < tokens.size(); ++i, isOp ^= 1)
{
string token = tokens[i];
if (isOp)
TRUE_OR_RETURN(token == "+", "Unsupported operator: " + token);
else
{
if (IsNumber(token))
variable_values_[lhs_idx] += stol(token);
else
{
TRUE_OR_RETURN(IsValidVar(token), "Invalid variable name: " + token);
// Token variable must be evaluated before LHS.
// Hence adding token => LHS edge, and adding token to RHS of
// equation_list_[lhs]
auto token_idx = AddNewVar(token);
dependency_adj_list_[token_idx].push_back(lhs_idx);
assert(lhs_idx < equation_list_.size());
equation_list_[lhs_idx].push_back(token_idx);
num_dependencies_[lhs_idx]++;
}
}
}
}
return (variable_index_map_.size() == dependency_adj_list_.size() &&
dependency_adj_list_.size() == variable_values_.size());
}
// Execute the BFS version of topological sorting, using queue
bool Evaluator::GetTopologicalVarOrder(vector<UL>& ordered_vertices)
{
ordered_vertices.clear();
queue<UL> q;
for (auto i = 0U; i < dependency_adj_list_.size(); ++i)
if (num_dependencies_[i] == 0)
q.push(i);
while (!q.empty())
{
UL var_idx = q.front();
ordered_vertices.push_back(var_idx);
q.pop();
for (auto& nbr: dependency_adj_list_[var_idx])
{
assert(num_dependencies_[nbr] >= 0);
num_dependencies_[nbr]--;
if (num_dependencies_[nbr] == 0)
q.push(nbr);
}
}
return (ordered_vertices.size() == dependency_adj_list_.size());
}
// Solves the constrained set of linear equations in 3 phases:
// 1) Parsing equations and construction of the dependency DAG
// 2) Topological sort on the dependency DAG to get the order of vertices
// 3) Substituting the values of variables according to the sorted order,
// to get the final values for each variable.
bool Evaluator::SolveEquationSet(const string& eqn_file, vector<string>& solution_list)
{
Clear();
vector<UL> order;
TRUE_OR_RETURN(ParseEquationsFromFile(eqn_file), "Parsing Equations Failed");
TRUE_OR_RETURN(GetTopologicalVarOrder(order), "Topological Order Not Found");
// Populate variable values in topological order
for (auto& idx: order)
for (auto& nbr: equation_list_[idx])
variable_values_[idx] += variable_values_[nbr];
// Get keys from the LUT sorted in ascending order
set<pair<string, UL> > sorted_var_idx;
for (auto& vi_pair: variable_index_map_)
sorted_var_idx.insert(vi_pair);
for (auto& vi_pair: sorted_var_idx)
solution_list.push_back(vi_pair.first + " = " +
to_string(variable_values_[vi_pair.second]));
return true;
}
#endif
```

The usage of the class is as follows:

```
string eqn_file, log_file;
Evaluator evaluate;
vector<string> solution_list;
// Logic to get input filename from user - skipping it here
bool bStatus = evaluate.SolveEquationSet(eqn_file, solution_list);
for (auto& s: solution_list)
cout << s << endl;
```