# How can I avoid right-truncated subjects being dropped?

I’m doing a survival analysis of how long individual components remain in the source code of a software project, but some of these components are being dropped by the `survfit` function.

This is what I’m doing:

``````
library(survival)
data <- read.table(text = "component_id weeks removed
1              1       1
2              1       1
3              1       1
4              1       1
5              1       1
6              1       1
7              1       1
8              2       0
9              2       0
10              2       0
11              2       0
12              2       1
13              2       1
14              2       0
15              2       0
16              2       0
17              2       0
18              2       0
19              2       0
20              2       1
21              2       1
22              2       0
23              2       0
24              3       1
25              3       1
26              3       1
27              3       1
28              7       1
29              7       1
30             14       1
31             14       1
32             14       1
33             14       1
34             14       1
35             14       1
36             14       1
37             14       1
38             14       1
39             14       1
40             14       1
41             14       1
42             14       1
43             14       1
44             14       1
45             14       1
46             14       1
47             14       1
48             40       1
49             40       1
50             40       1
51             40       1
52             48       1
53             48       1
54             48       1
55             48       1
56             48       1
57             48       1
58             48       1
59             48       1
60             56       1
61             56       1
62             56       1
63             56       1
64             56       1
65             56       1
66             56       1
67             56       1
68             56       1
69             56       1", header = TRUE)

fit <- survfit(Surv(data$weeks, data$removed) ~ 1)
summary(fit, censored=TRUE)
``````

And this is the output:

``````
Call: survfit(formula = Surv(data$weeks, data$removed) ~ 1)

 time n.risk n.event survival std.err lower 95% CI upper 95% CI
    1     69       7    0.899  0.0363        0.830        0.973
    2     62       4    0.841  0.0441        0.758        0.932
    3     46       4    0.767  0.0533        0.670        0.879
    7     42       2    0.731  0.0567        0.628        0.851
   14     40      18    0.402  0.0654        0.292        0.553
   40     22       4    0.329  0.0629        0.226        0.478
   48     18       8    0.183  0.0520        0.105        0.319
   56     10      10    0.000     NaN           NA           NA
``````

I was expecting the number of events to be 69, but the `n.event` column only adds up to 57, so 12 subjects appear to be dropped.
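To make the discrepancy concrete, the per-time counts stored in the fitted object can be tallied directly. This is only a sketch: it rebuilds the same 69 observations compactly with `rep()`, and the bare vectors `weeks` and `removed` as well as the names `tally`, `total_events` and `total_censored` are shorthand introduced here, not part of the code above.

``````r
library(survival)

# Same 69 observations as in the question, rebuilt compactly
# (`weeks` and `removed` are shorthand vectors for this sketch).
weeks   <- rep(c(1, 2, 3, 7, 14, 40, 48, 56),
               times = c(7, 16, 4, 2, 18, 4, 8, 10))
removed <- c(rep(1, 7),                                        # week 1
             rep(c(0, 1, 0, 1, 0), times = c(4, 2, 6, 2, 2)),  # week 2
             rep(1, 46))                                       # weeks 3 to 56

fit <- survfit(Surv(weeks, removed) ~ 1)

# Per-time accounting: subjects at risk, events, and censored observations
tally <- data.frame(time     = fit$time,
                    n.risk   = fit$n.risk,
                    n.event  = fit$n.event,
                    n.censor = fit$n.censor)
print(tally)

total_events   <- sum(fit$n.event)
total_censored <- sum(fit$n.censor)
``````

With this data the events sum to 57 and the censored observations to 12, all of them at week 2. That matches the jump in `n.risk` from 62 to 46 between weeks 2 and 3 (62 - 4 events - 12 censored), so the 12 subjects with `removed == 0` are counted as censored rather than as events.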

I initially thought I was misusing the package functions, so I tried a `type="interval2"` approach, following a similar situation, but the drops keep happening, now with odd non-integer subject and event counts:

``````
# Upper bound of the interval: the event time if removed, NA (right-censored) otherwise
as.t2 <- function(i, data) if (data$removed[i] == 1) data$weeks[i] else NA
size  <- length(data$weeks)
t1    <- data$weeks
t2    <- sapply(1:size, as.t2, data = data)
interval_fit <- survfit(Surv(t1, t2, type = "interval2") ~ 1)
summary(interval_fit, censored = TRUE)
``````

Next, I found what I’d call a mid-air explanation that clarified the situation a bit further. I understand the drops are caused by non-censored subjects appearing after a “constant censoring time”, but again, why?

That led me to dig deeper and read about right truncation, and I realized that this type of study maps very closely to the drops I’m experiencing. Here’s Klein & Moeschberger:

Truncation of survival data occurs when only those individuals whose event time lies within a certain observational window $$(Y_L,Y_R)$$ are observed. An individual whose event time is not in this interval is not observed and no information on this subject is available to the investigator.

Right truncation occurs when $$Y_L$$ is equal to zero. That is, we observe the survival time $$X$$ only when $$X \leq Y_R$$.
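
For comparison, here is a sketch of how right censoring and right truncation are usually written side by side. The symbols $$C$$ (censoring time), $$T$$ (recorded time) and $$\delta$$ (event indicator) are my own notation and do not appear in the quote; in my data, `weeks` would play the role of $$T$$ and `removed` the role of $$\delta$$.

$$
\text{Right censoring:}\quad T = \min(X, C), \qquad \delta = \mathbf{1}\{X \leq C\} \quad \text{(every subject is recorded)}
$$

$$
\text{Right truncation:}\quad X \text{ is recorded only if } X \leq Y_R \quad \text{(all other subjects never enter the data)}
$$

Under the first scheme a subject with $$\delta = 0$$ still appears in the data with a recorded time; under the second it would not appear at all.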

From my perspective, the dropped subjects carry important information for my study regardless of their time of entry.

How can I stop the drops?

