I was looking at this Data Science question on TestDome.
The problem is stated as follows:
Implement the desired_marketing_expenditure function, which returns
the required amount of money that needs to be invested in a new
marketing campaign to sell the desired number of units.
Use the data from previous marketing campaigns to evaluate how the
number of units sold grows linearly as the amount of money invested increases.
For example, for the desired number of 60,000 units sold and previous
campaign data from the table below, the function should return the
Approaching this with linear regression I see this as:
marketing_expenditure = coeff * units_sold + intercept + error
because what I’m trying to find is the
marketing expenditure given a number of units sold.
However, the author of this test seems to have treated
marketing expenditure as the independent variable, in other words:
units_sold = coeff * marketing_expenditure + intercept + error
from which they then calculate
marketing_expenditure by rearranging the equation.
The two approaches are not equivalent and give different results. Depending on which variable is treated as dependent, least squares minimises a different set of squared distances (errors in marketing_expenditure in the first case, errors in units_sold in the second) and therefore fits a different regression line.
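To make the difference concrete, here is a minimal sketch of both fits using NumPy's `polyfit`. The campaign numbers are made up for illustration and are not the TestDome table, and the helper names `direct_fit` and `inverted_fit` are my own.

```python
import numpy as np

# Hypothetical campaign data, invented for illustration only
# (this is NOT the table from the TestDome question).
marketing_expenditure = np.array([300_000, 200_000, 400_000, 300_000, 100_000], dtype=float)
units_sold = np.array([60_000, 50_000, 90_000, 80_000, 30_000], dtype=float)
desired_units = 60_000.0


def direct_fit(expenditure, units, target):
    """Regress expenditure on units, then predict expenditure at the target."""
    coeff, intercept = np.polyfit(units, expenditure, 1)
    return coeff * target + intercept


def inverted_fit(expenditure, units, target):
    """Regress units on expenditure, then solve the fitted line for expenditure."""
    coeff, intercept = np.polyfit(expenditure, units, 1)
    return (target - intercept) / coeff


direct = direct_fit(marketing_expenditure, units_sold, desired_units)
inverted = inverted_fit(marketing_expenditure, units_sold, desired_units)

print(f"expenditure ~ units, predicted directly: {direct:,.2f}")
print(f"units ~ expenditure, line rearranged:    {inverted:,.2f}")
```

On this made-up data the two answers differ by several hundred. Both fitted lines pass through the point of means, and they only coincide when the data are perfectly correlated.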
Which approach is correct and why?