#StackBounty: #c++ #algorithm #object-oriented #c++11 #interview-questions Solve a set of "restricted" linear equations effic…

Bounty: 100

I was recently asked to solve the following challenge (in C++) as part of the interview process. However, I haven’t heard from them at all afterwards, and based on past experiences of unsuccessful applicants that I’ve read online, my submission didn’t meet their standards. Since I did solve the challenge to the best of my abilities, I’m at a loss to understand in what ways I could have made a better solution. I’m posting the problem statement (in my own words) and my solution here. Please critique it as you would for a potential applicant to your team (as a means for gauging whether it’s worthwhile to have a subsequent phone-screen with such an applicant).

Input Details

The utility would take as input an input file containing a list of
equations, one per line. Each equation has the following format:
<LHS> = <RHS>, where LHS is the left-hand side of the equation and is always a variable name.
RHS is the right hand side of the equation and can be composed of the following only:

  • Variables
  • Unsigned integers
  • The + operator

Assumptions

Input is well-formed i.e.

  • Number of variables = Number of equations, with each variable occurring on the LHS of exactly one equation.
  • The system of equations has an unique solution, and does not have circular dependencies.
  • There are one or more white spaces between each token (numbers, + operator, variables).
  • A variable name can only be composed of letters from the alphabet (e.g. for which isalpha(c) is true).
  • All integers will fit in a C unsigned long.

Output Format

The utility would print the value of each variable after evaluating the set of equations, in the format <variable name> = <unsigned integer value>. The variables would be sorted in ascending (lexicographic) order.

Sample Input Output

Input file:

off = 4 + r + 1
l   = 1 + or + off
or  = 3 + 5
r   = 2

Expected output for the above input:

l   = 16
off = 7
or  = 8
r   = 2

Implementation Notes

Due to the simplified nature of the input equations, a full-blown linear
equation solver is not required in my opinion (as such a solver would have at least quadratic complexity). A much simplified (and asymptotically faster) solution can be arrived at by modeling the set of input equations as a Directed Acyclic Graph (DAG), by observing the dependencies of the variables from the input equations. Once we can model the system as a DAG, the steps to derive the variable values are as follows:

  • Construct the dependency DAG, where each node in the graph corresponds to a variable, and $(a, b)$ is a directed edge from $a$ to $b$ if and only if the variable $a$ needs to be fully evaluated before evaluating $b$.
  • Order the vertices in the DAG thus constructed using topological sort.
  • For each vertex in the sorted order, evaluate its corresponding variable fully before moving on to the next vertex.

The algorithm above has a linear complexity, which is the best we could achieve under the current assumptions. I’ve encapsulated the algorithm in the following class (I’ve used Google’s C++ Style Guide in my code – not sure it’s the best choice, but I preferred to follow a style guide that’s at least recognized by and arrived at by a non-trivial number of engineers.)

Class header file:

//
// Class that encapsulates a (constrained) linear equation solver. See README.md
// for assumptions on input restrictions.
//
#include <unordered_map>
#include <vector>
#include <list>

#ifndef _EVALUATOR
#define _EVALUATOR

class Evaluator
{
  private:
    // Stores the values of each variable throughout algorithm
    std::vector<UL>                      variable_values_;

    // Hash tables that store the correspondence between variable name and index
    std::unordered_map<std::string, UL>  variable_index_map_;
    std::unordered_map<UL, std::string>  index_variable_map_;

    // Adjacency list for DAG that stores the dependency information amongst
    // variables. If A[i, j] is an edge, it implies variable 'i' appears on the 
    // RHS of definition of variable 'j'.
    std::vector<std::list<UL> >          dependency_adj_list_;

    // List of equations stored as indices. If the list corresponding to eq[i]
    // contains 'j', then variable 'j' appears on the RHS of variable 'i'.
    std::vector<std::list<UL> >          equation_list_;

    // For efficiency, this list stores the number of dependencies for each
    // variable, which is useful while executing a topological sort.
    std::vector<UL>                      num_dependencies_;

    // Resets all internal data structures
    void  Clear();  

    // Prints values of internal data structures to aid in debugging
    void  PrintState();

    // Adds an entry corresponding to each new variable detected while parsing input
    UL    AddNewVar(std::string& );

    // Parse the input equations from filename given as argument, and build the
    // internal data structures coressponsing to the input.
    bool  ParseEquationsFromFile(const std::string&);

    // If DAG in dependency_adj_list_ has a valid topological order, returns
    // true along with the ordered vertices in the input vector
    bool  GetTopologicalVarOrder(std::vector<UL>&);

  public:
    Evaluator() {};

    /**
     * @brief Evaluate the set of constrained linear equations and returns the
     *        values of the variables as a list.
     *
     * @param[in]  string: Filename containing list of constrained linear equations.
     * @param[in]  vector<string>: If solution exists, returns the values of
     *             variables in lexicographic order (ascending).
     *
     * @return True if solution exists (always exists for valid input), false if
     *              input is not well-formed (See README.md for more details about input
     *              format).
     */
    bool SolveEquationSet(const std::string&, std::vector<std::string>& );
};
#endif

The main class file:

#include "evaluator.h"
#include <sstream>
#include <unordered_set>
#include <set>
#include <queue>
#include <algorithm>
#include <cassert>

#ifdef _EVALUATOR

// Used for early returns if the expression is false 
#define TRUE_OR_RETURN(EXPR, MSG)    
  do                                 
  {                                  
    bool status = (EXPR);            
    if (status != true)              
    {                                
      cerr << __FUNCTION__           
           << ": " << MSG << endl;   
      return false;                  
    }                                
  } while(0)                 
#endif

using namespace std;
//****  Helper functions local to the file ****

// Returns true if each character in the non-empty string is a digit
bool IsNumber(string s)
{
  return !s.empty() && std::all_of(s.begin(), s.end(), ::isdigit);
}

// Given a string, returns a vector of tokens separated by whitespace
vector<string> ParseTokensFromString(const string& s)
{
  istringstream   iss(s);
  vector<string>  token_list;
  string          token;
  while (iss >> token)
    token_list.push_back(token);
  return token_list;
}

// Returns true if the string can be a valid variable name (i.e has
// only alphabetical characters in it).
bool IsValidVar(string& v) 
{
  for (auto& c: v)
    TRUE_OR_RETURN(isalpha(c), "Non-alphabetical char in variable: " + v);
  return true;
}

//********************************************

void Evaluator::Clear()
{
  variable_values_.clear();
  variable_index_map_.clear();
  index_variable_map_.clear();
  dependency_adj_list_.clear();
  equation_list_.clear();
  num_dependencies_.clear();
}

void Evaluator::PrintState()
{
  for (auto i = 0U; i < dependency_adj_list_.size(); ++i)
    cout << index_variable_map_[i] << "(" << i << ") =>"  
         << "Value(" << variable_values_[i] << "), Deps(" 
         << num_dependencies_[i] << ")" << endl;
}

// Ensures all data structures correctly set aside an entry for the new variable
UL Evaluator::AddNewVar(string& v)
{
  if (variable_index_map_.count(v) == 0)
  {
    dependency_adj_list_.push_back(list<UL>());
    equation_list_.push_back(list<UL>());
    variable_values_.push_back(0);
    num_dependencies_.push_back(0);
    variable_index_map_.insert(make_pair(v, dependency_adj_list_.size() - 1));
    index_variable_map_.insert(make_pair(dependency_adj_list_.size() - 1, v));

    assert(num_dependencies_.size() == variable_values_.size() &&
           variable_index_map_.size() == variable_values_.size() && 
           variable_values_.size() == dependency_adj_list_.size());
  }
  return variable_index_map_[v];
}

// Parses equation from given input file line-by-line, checking 
// for validity of input at each step and returning true only if 
// all equations were successfully parsed.
bool Evaluator::ParseEquationsFromFile(const string& sEqnFile)
{
  string    line;
  ifstream  infile(sEqnFile);

  // This LUT serves as a sanity check for duplicate definitions of vars
  // As per spec, only ONE definition (appearance as LHS) per variable is handled
  unordered_set<string>  defined_vars; 
  while (getline(infile, line))
  {
    vector<string> tokens = ParseTokensFromString(line);
    string         lhs    = tokens[0];

    // Check if equation is adhering to spec
    TRUE_OR_RETURN(tokens.size() >= 3 && IsValidVar(lhs) && tokens[1] == "=", 
        "Invalid equation: " + line);

    // Check if variable on LHS was previously defined - this would make the 
    // current approach untenable, and require general equation solver.
    TRUE_OR_RETURN(defined_vars.count(lhs) == 0, "Multiple defn for: " + lhs);
    defined_vars.insert(lhs);
    const UL lhs_idx = AddNewVar(lhs);

    // The operands appear in alternate positions in RHS, tracked by isOp
    for (size_t i = 2, isOp = 0; i < tokens.size(); ++i, isOp ^= 1)
    {
      string token = tokens[i];
      if (isOp) 
        TRUE_OR_RETURN(token == "+", "Unsupported operator: " + token);
      else 
      {
        if (IsNumber(token))
          variable_values_[lhs_idx] += stol(token);
        else
        {
          TRUE_OR_RETURN(IsValidVar(token), "Invalid variable name: " + token);

          // Token variable must be evaluated before LHS. 
          // Hence adding token => LHS edge, and adding token to RHS of 
          // equation_list_[lhs]
          auto token_idx = AddNewVar(token);
          dependency_adj_list_[token_idx].push_back(lhs_idx);        
          assert(lhs_idx < equation_list_.size());
          equation_list_[lhs_idx].push_back(token_idx);
          num_dependencies_[lhs_idx]++;
        }
      }
    }
  }
  return (variable_index_map_.size() == dependency_adj_list_.size() && 
          dependency_adj_list_.size() == variable_values_.size());
}

// Execute the BFS version of topological sorting, using queue
bool Evaluator::GetTopologicalVarOrder(vector<UL>& ordered_vertices)
{
  ordered_vertices.clear();
  queue<UL> q;
  for (auto i = 0U; i < dependency_adj_list_.size(); ++i)
    if (num_dependencies_[i] == 0)
      q.push(i);

  while (!q.empty())
  {
    UL var_idx = q.front();
    ordered_vertices.push_back(var_idx);
    q.pop();
    for (auto& nbr: dependency_adj_list_[var_idx])
    {
      assert(num_dependencies_[nbr] >= 0);
      num_dependencies_[nbr]--;
      if (num_dependencies_[nbr] == 0)
        q.push(nbr);
    }
  }
  return (ordered_vertices.size() == dependency_adj_list_.size());
}

// Solves the constrained set of linear equations in 3 phases:
// 1) Parsing equations and construction of the dependency DAG
// 2) Topological sort on the dependency DAG to get the order of vertices
// 3) Substituting the values of variables according to the sorted order,
//    to get the final values for each variable.
bool Evaluator::SolveEquationSet(const string& eqn_file, vector<string>& solution_list)
{
  Clear();
  vector<UL> order;
  TRUE_OR_RETURN(ParseEquationsFromFile(eqn_file), "Parsing Equations Failed");
  TRUE_OR_RETURN(GetTopologicalVarOrder(order), "Topological Order Not Found");

  // Populate variable values in topological order 
  for (auto& idx: order)
    for (auto& nbr: equation_list_[idx])
      variable_values_[idx] += variable_values_[nbr];

  // Get keys from the LUT sorted in ascending order
  set<pair<string, UL> > sorted_var_idx;
  for (auto& vi_pair: variable_index_map_)
    sorted_var_idx.insert(vi_pair);
  for (auto& vi_pair: sorted_var_idx)
    solution_list.push_back(vi_pair.first + " = " + 
        to_string(variable_values_[vi_pair.second]));

  return true;
}
#endif

The usage of the class is as follows:

   string          eqn_file, log_file;
   Evaluator       evaluate;
   vector<string>  solution_list;

   // Logic to get input filename from user - skipping it here
   bool bStatus = evaluate.SolveEquationSet(eqn_file, solution_list); 

   for (auto& s: solution_list)
     cout << s << endl;


Get this bounty!!!