# #StackBounty: #machine-learning #statistics #random-forest #prediction Understand the equations of quantile regression forest (Meinshau…

### Bounty: 50

I am trying to implement a quantile regression forest (https://www.jmlr.org/papers/volume7/meinshausen06a/meinshausen06a.pdf).

However, I have difficulty understanding how the quantiles are computed. I will first summarize the relevant part of the paper and then explain exactly what I don't understand.

Let $$(X_i, Y_i)$$, $$i = 1, \dots, n$$, be $$n$$ independent observations. A tree $$T$$ parametrized by a realization $$\theta$$ of a random variable $$\Theta$$ is denoted by $$T(\theta)$$.

• Grow $$k$$ trees $$T(\theta_t)$$, $$t = 1, \dots, k$$, as in random forests. However, for every leaf of every tree, take note of all observations in this leaf, not just their average.
• For a given $$X = x$$, drop $$x$$ down all trees. Compute the weight $$\omega_i(x, \theta_t)$$ of observation $$i \in \{1, \dots, n\}$$ for every tree as in (4). Compute the weight $$\omega_i(x)$$ for every observation $$i \in \{1, \dots, n\}$$ as an average over $$\omega_i(x, \theta_t)$$, $$t = 1, \dots, k$$, as in (5).
• Compute the estimate of the distribution function as in (6) for all $$y \in \mathbb{R}$$.

Equations (4), (5), and (6) are given below.

$$\omega_i(x, \theta_t) = \frac{\mathbf{1}\{X_i \in R(x, \theta_t)\}}{\#\{j : X_j \in R(x, \theta_t)\}} \quad (4)$$

$$\omega_i(x) = k^{-1} \sum_{t=1}^k \omega_i(x, \theta_t) \quad (5)$$

$$\hat{F}(y \mid X = x) = \sum_{i=1}^n \omega_i(x) \, \mathbf{1}\{Y_i \leq y\} \quad (6)$$

Here $$R(x, \theta_t)$$ denotes the rectangular area corresponding to the unique leaf of the tree $$T(\theta_t)$$ that $$x$$ belongs to.
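For concreteness, here is a minimal Python sketch of how (4) and (5) might be computed. It assumes the forest has already been grown and that, for each tree, the leaf index of every training observation and of the query point $$x$$ is available. The names `forest_weights`, `train_leaves` and `x_leaves` are hypothetical and do not come from the paper or any library. It also assumes every leaf reached by $$x$$ contains at least one training observation.

```python
def forest_weights(train_leaves, x_leaves):
    """Weights omega_i(x) of equations (4) and (5).

    train_leaves[t][i]: leaf index of training observation i in tree t (assumed input).
    x_leaves[t]:        leaf index that the query point x falls into in tree t.
    """
    k = len(train_leaves)      # number of trees
    n = len(train_leaves[0])   # number of training observations
    weights = [0.0] * n
    for leaves_t, leaf_x in zip(train_leaves, x_leaves):
        # Observations whose X_j lies in R(x, theta_t), i.e. in the same leaf as x.
        members = [i for i, leaf in enumerate(leaves_t) if leaf == leaf_x]
        for i in members:
            # Equation (4): 1 / (leaf size); equation (5): average over the k trees.
            weights[i] += 1.0 / len(members) / k
    return weights


# Toy example: 2 trees, 4 training observations.
w = forest_weights([[0, 0, 1, 1], [0, 1, 1, 1]], [0, 1])
print(w, sum(w))
```

Under these assumptions the weights are non-negative and sum to one, so they can be read as a discrete distribution over the training observations.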

I can compute (4) and (5), but I don't understand how to compute (6) and then estimate quantiles from it. I would also add that I don't see where the observations stored in the leaves (first step of the algorithm) are used.
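One possible reading of (6), sketched in Python: $$\hat{F}(y \mid X = x)$$ is a weighted empirical distribution function of the $$Y_i$$, and the $$\alpha$$-quantile is taken as the smallest $$y$$ with $$\hat{F}(y \mid X = x) \geq \alpha$$. Since $$\hat{F}$$ only jumps at the observed $$Y_i$$, it is enough to sort the $$Y_i$$ and accumulate their weights. The function names and the toy weights below are hypothetical, and the weights are assumed to be already computed and to sum to one.

```python
def conditional_cdf(y, weights, y_train):
    """Equation (6): sum of the weights of all observations with Y_i <= y."""
    return sum(w for w, yi in zip(weights, y_train) if yi <= y)


def conditional_quantile(alpha, weights, y_train):
    """Smallest observed Y_i whose cumulative weight reaches alpha."""
    pairs = sorted(zip(y_train, weights))  # sort observations by response value
    cumulative = 0.0
    for yi, w in pairs:
        cumulative += w
        if cumulative >= alpha:
            return yi
    # Guard against floating-point rounding when alpha is very close to 1.
    return pairs[-1][0]


# Toy example with equal weights (assumed, for illustration only).
y_train = [3.0, 1.0, 4.0, 2.0]
weights = [0.25, 0.25, 0.25, 0.25]
print(conditional_cdf(2.5, weights, y_train))
print(conditional_quantile(0.5, weights, y_train))
```

Under this reading, the observations kept in each leaf are what make the weights, and hence every $$Y_i$$ individually, available at prediction time, rather than only the leaf averages.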

Can someone help me understand this algorithm? Any help would be appreciated.
