#StackBounty: #r #time-series #nls NLS analysis in R with timeseries data to remove known effect

Bounty: 50

I have a timeseries dataset collected by sensors. The data is crack displacement (disp) and temperature (temp). Thermal expansion is linear with temperature, so it causes a diurnal signature in my displacement data. I would like to remove the effect from temperature. I’m an engineer, not a statistician, so I am uncertain that I am doing it correctly.

I’ve used R’s nls function to remove the known effect of temperature. I used a sin wave with 24-hour period. I also used a shift parameter (phi) in the function to shift the wave (because the temp sensor responds faster than the disp sensor which is mounted on concrete that takes awhile to heat and cool).

My questions are: Am I doing this correctly? Is there a better way?

Note that I do not want to predict crack displacement, I want to remove effects of temperature and see the underlying ‘unexplained’ displacement.

I’ve developed a pseudo time series that somewhat mimics my data.

Here is my code (requires package lubridate):

# create 1 hour timeseries 1 week long
dat <- data.frame(TIMESTAMP = seq.POSIXt(from = as.POSIXct('2020-03-01'), 
                                          to = as.POSIXct('2020-03-08'),
                                          by = '1 hour'))

# Add displacement data and temperature data.
# Use a sinusoidal pattern with random element
set.seed(1)
dat$disp <- sin(as.numeric(dat$TIMESTAMP)) + runif(n = nrow(dat), min = -0.2, max = 0.2)
dat$temp <- sin(as.numeric(dat$TIMESTAMP)) * runif(n = nrow(dat), min = 0, max = 5) + 20
plot(dat$TIMESTAMP, dat$disp)
plot(dat$TIMESTAMP, dat$temp)

# Add a trend to displacement - this is the unexplained displacement that I am looking for
dat$disp <- dat$disp + seq(from = 0, to = 1, length.out = nrow(dat))*rnorm(1)

# Equation for nls:
# D = b0 + b1*T + b2*sin(omega*t + phi)
    # D = predicted displacement
    # T = temperature
    # omega = angular frequency = 2*pi / 24
    # t = time of day (1...24)
    # phi = time shift between temperature and displacement
    # b0, b1, b2 are fitting parameters

# Provide starting parameters for nls()
b0 <- 1.0
b1 <- 1.0
b2 <- 1.0
phi <- 20

# create time of day (tod) series
library(lubridate)
dat$tod <- hour(dat$TIMESTAMP)

# model
fit <- nls(disp ~ (b0 + b1 * temp + b2 * sin(2 * pi * tod / 24 + phi)), 
           start = list(b0 = b0, b1 = b1, b2 = b2, phi = phi),
           data = dat)

# assign modeling results (residual and fitted points) to dat
dat$res <- resid(fit)
dat$fit <- fitted(fit)

# regression on residuals to get the slope
fit2 <- lm(dat$res ~ dat$TIMESTAMP)

# plot original data
plot(dat$TIMESTAMP, dat$disp)

# plot residuals (the 'unexplained' part of displacement)
points(dat$TIMESTAMP, dat$res, col = 'red')

# plot the fitted values (the 'thermally explained' part of displacement)
points(dat$TIMESTAMP, dat$fit, col = 'green')

# add slope line through the residuals (movement not caused by thermal effects)
abline(a = fit2$coefficients[1], b = fit2$coefficients[2])
```


Get this bounty!!!

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.