I have a timeseries dataset collected by sensors. The data is crack displacement (
disp) and temperature (
temp). Thermal expansion is linear with temperature, so it causes a diurnal signature in my displacement data. I would like to remove the effect from temperature. I’m an engineer, not a statistician, so I am uncertain that I am doing it correctly.
I’ve used R’s
nls function to remove the known effect of temperature. I used a sin wave with 24-hour period. I also used a shift parameter (phi) in the function to shift the wave (because the
temp sensor responds faster than the
disp sensor which is mounted on concrete that takes awhile to heat and cool).
My questions are: Am I doing this correctly? Is there a better way?
Note that I do not want to predict crack displacement, I want to remove effects of temperature and see the underlying ‘unexplained’ displacement.
I’ve developed a pseudo time series that somewhat mimics my data.
Here is my code (requires package
# create 1 hour timeseries 1 week long dat <- data.frame(TIMESTAMP = seq.POSIXt(from = as.POSIXct('2020-03-01'), to = as.POSIXct('2020-03-08'), by = '1 hour')) # Add displacement data and temperature data. # Use a sinusoidal pattern with random element set.seed(1) dat$disp <- sin(as.numeric(dat$TIMESTAMP)) + runif(n = nrow(dat), min = -0.2, max = 0.2) dat$temp <- sin(as.numeric(dat$TIMESTAMP)) * runif(n = nrow(dat), min = 0, max = 5) + 20 plot(dat$TIMESTAMP, dat$disp) plot(dat$TIMESTAMP, dat$temp) # Add a trend to displacement - this is the unexplained displacement that I am looking for dat$disp <- dat$disp + seq(from = 0, to = 1, length.out = nrow(dat))*rnorm(1) # Equation for nls: # D = b0 + b1*T + b2*sin(omega*t + phi) # D = predicted displacement # T = temperature # omega = angular frequency = 2*pi / 24 # t = time of day (1...24) # phi = time shift between temperature and displacement # b0, b1, b2 are fitting parameters # Provide starting parameters for nls() b0 <- 1.0 b1 <- 1.0 b2 <- 1.0 phi <- 20 # create time of day (tod) series library(lubridate) dat$tod <- hour(dat$TIMESTAMP) # model fit <- nls(disp ~ (b0 + b1 * temp + b2 * sin(2 * pi * tod / 24 + phi)), start = list(b0 = b0, b1 = b1, b2 = b2, phi = phi), data = dat) # assign modeling results (residual and fitted points) to dat dat$res <- resid(fit) dat$fit <- fitted(fit) # regression on residuals to get the slope fit2 <- lm(dat$res ~ dat$TIMESTAMP) # plot original data plot(dat$TIMESTAMP, dat$disp) # plot residuals (the 'unexplained' part of displacement) points(dat$TIMESTAMP, dat$res, col = 'red') # plot the fitted values (the 'thermally explained' part of displacement) points(dat$TIMESTAMP, dat$fit, col = 'green') # add slope line through the residuals (movement not caused by thermal effects) abline(a = fit2$coefficients, b = fit2$coefficients) ```