Online trend estimation and detection of trend deviations in sub-sewershed time series of SARS-CoV-2 RNA measured in wastewater

Symposium on Data Science and Statistics 2024

Dr. Julia C. Schedler

2024-06-07

Background and motivation

Our team

Routine monitoring of wastewater

Figure 1 From Stadler LB, Ensor KB, Clark JR, Kalvapalle P, LaTurner ZW, Mojica L, et al. Wastewater Analysis of SARS-CoV-2 as a Predictive Metric of Positivity Rate for a Major Metropolis. medRxiv. Published online November 6, 2020:2020.11.04.20226191. doi:10/ghjmm9

Terminology

  • WWTP = WasteWater Treatment Plant
  • LS = Lift Station (sub-sewershed)

Motivating questions

  1. How do we scientifically estimate trends in wastewater time series?
  2. How can we detect deviations from the trend at new locations during times of concern?
  3. How can we share our code for reasonable technology transfer?

Data processing steps

  • identify observations below the level of detection using statistical analysis

  • align all observations to Mondays

  • transform copies per L to a log10 scale

  • average replicates to give one weekly measurement per week per location

  • Only use locations where the primary WWTP has at least 85% coverage and observations within 1 month of last date

  • ensure there is a row for each week, even if the observation is missing

  • create an indicator of missing values

  • remove irrelevant features/variables

         dates           name    value ts_missing
1   2021-05-24 Lift station A 3.397031      FALSE
2   2021-05-24 Lift station B 2.710917      FALSE
3   2021-05-24 Lift station C 2.433121      FALSE
4   2021-05-24 Lift station D 2.478166      FALSE
5   2021-05-24           WWTP 3.247796      FALSE
6   2021-05-31 Lift station A       NA       TRUE
7   2021-05-31 Lift station B       NA       TRUE
8   2021-05-31 Lift station C 3.327098      FALSE
9   2021-05-31 Lift station D       NA       TRUE
10  2021-05-31           WWTP 3.686631      FALSE
11  2021-06-07 Lift station A       NA       TRUE
12  2021-06-07 Lift station B 3.533132      FALSE
13  2021-06-07 Lift station C 3.943171      FALSE
14  2021-06-07 Lift station D 4.166666      FALSE
15  2021-06-07           WWTP 3.804769      FALSE
16  2021-06-14 Lift station A       NA       TRUE
17  2021-06-14 Lift station B 4.454591      FALSE
18  2021-06-14 Lift station C       NA       TRUE
19  2021-06-14 Lift station D       NA       TRUE
20  2021-06-14           WWTP 4.210527      FALSE
21  2021-06-21 Lift station A 4.543146      FALSE
22  2021-06-21 Lift station B 4.708266      FALSE
23  2021-06-21 Lift station C 5.869086      FALSE
24  2021-06-21 Lift station D       NA       TRUE
25  2021-06-21           WWTP 4.279630      FALSE
26  2021-06-28 Lift station A 4.356128      FALSE
27  2021-06-28 Lift station B 4.306562      FALSE
28  2021-06-28 Lift station C 5.230189      FALSE
29  2021-06-28 Lift station D       NA       TRUE
30  2021-06-28           WWTP 4.005982      FALSE
31  2021-07-05 Lift station A       NA       TRUE
32  2021-07-05 Lift station B 4.892258      FALSE
33  2021-07-05 Lift station C 5.766061      FALSE
34  2021-07-05 Lift station D 6.174665      FALSE
35  2021-07-05           WWTP 4.382618      FALSE
36  2021-07-12 Lift station A 4.746005      FALSE
37  2021-07-12 Lift station B 4.668747      FALSE
38  2021-07-12 Lift station C 5.096225      FALSE
39  2021-07-12 Lift station D 5.506649      FALSE
40  2021-07-12           WWTP 4.352995      FALSE
41  2021-07-19 Lift station A 4.581049      FALSE
42  2021-07-19 Lift station B 4.093388      FALSE
43  2021-07-19 Lift station C 3.877557      FALSE
44  2021-07-19 Lift station D 4.283020      FALSE
45  2021-07-19           WWTP 4.141902      FALSE
46  2021-07-26 Lift station A 5.452360      FALSE
47  2021-07-26 Lift station B 5.621591      FALSE
48  2021-07-26 Lift station C 5.712927      FALSE
49  2021-07-26 Lift station D 6.096784      FALSE
50  2021-07-26           WWTP 5.017634      FALSE
51  2021-08-02 Lift station A 5.332646      FALSE
52  2021-08-02 Lift station B 5.402850      FALSE
53  2021-08-02 Lift station C 5.170501      FALSE
54  2021-08-02 Lift station D 5.517266      FALSE
55  2021-08-02           WWTP 4.915438      FALSE
56  2021-08-09 Lift station A 5.676073      FALSE
57  2021-08-09 Lift station B 6.232570      FALSE
58  2021-08-09 Lift station C 6.262397      FALSE
59  2021-08-09 Lift station D 6.547182      FALSE
60  2021-08-09           WWTP 5.336388      FALSE
61  2021-08-16 Lift station A       NA       TRUE
62  2021-08-16 Lift station B 5.462582      FALSE
63  2021-08-16 Lift station C 5.237442      FALSE
64  2021-08-16 Lift station D 5.432048      FALSE
65  2021-08-16           WWTP 4.904691      FALSE
66  2021-08-23 Lift station A 4.951239      FALSE
67  2021-08-23 Lift station B 5.439118      FALSE
68  2021-08-23 Lift station C 5.266882      FALSE
69  2021-08-23 Lift station D 5.377799      FALSE
70  2021-08-23           WWTP 4.974239      FALSE
71  2021-08-30 Lift station A       NA       TRUE
72  2021-08-30 Lift station B       NA       TRUE
73  2021-08-30 Lift station C 4.923318      FALSE
74  2021-08-30 Lift station D 4.936557      FALSE
75  2021-08-30           WWTP 4.819800      FALSE
76  2021-09-06 Lift station A 4.505679      FALSE
77  2021-09-06 Lift station B 4.866079      FALSE
78  2021-09-06 Lift station C 4.526690      FALSE
79  2021-09-06 Lift station D 4.503110      FALSE
80  2021-09-06           WWTP 4.642827      FALSE
81  2021-09-13 Lift station A       NA       TRUE
82  2021-09-13 Lift station B       NA       TRUE
83  2021-09-13 Lift station C 4.555870      FALSE
84  2021-09-13 Lift station D 4.534722      FALSE
85  2021-09-13           WWTP       NA       TRUE
86  2021-09-20 Lift station A 4.477462      FALSE
87  2021-09-20 Lift station B 4.552501      FALSE
88  2021-09-20 Lift station C 3.969145      FALSE
89  2021-09-20 Lift station D 3.969242      FALSE
90  2021-09-20           WWTP 4.280986      FALSE
91  2021-09-27 Lift station A 4.358866      FALSE
92  2021-09-27 Lift station B 4.208715      FALSE
93  2021-09-27 Lift station C 3.449691      FALSE
94  2021-09-27 Lift station D 3.468834      FALSE
95  2021-09-27           WWTP 4.074034      FALSE
96  2021-10-04 Lift station A 4.611473      FALSE
97  2021-10-04 Lift station B 4.615874      FALSE
98  2021-10-04 Lift station C 4.233867      FALSE
99  2021-10-04 Lift station D 4.151621      FALSE
100 2021-10-04           WWTP 4.262782      FALSE
101 2021-10-11 Lift station A 4.437959      FALSE
102 2021-10-11 Lift station B 4.168897      FALSE
103 2021-10-11 Lift station C 3.841685      FALSE
104 2021-10-11 Lift station D       NA       TRUE
105 2021-10-11           WWTP 4.116432      FALSE
106 2021-10-18 Lift station A 4.220729      FALSE
107 2021-10-18 Lift station B 3.795609      FALSE
108 2021-10-18 Lift station C 3.578050      FALSE
109 2021-10-18 Lift station D       NA       TRUE
110 2021-10-18           WWTP 4.004019      FALSE
111 2021-10-25 Lift station A 4.056394      FALSE
112 2021-10-25 Lift station B 3.884914      FALSE
113 2021-10-25 Lift station C 3.999813      FALSE
114 2021-10-25 Lift station D       NA       TRUE
115 2021-10-25           WWTP 4.025630      FALSE
116 2021-11-01 Lift station A       NA       TRUE
117 2021-11-01 Lift station B 3.671352      FALSE
118 2021-11-01 Lift station C 3.743671      FALSE
119 2021-11-01 Lift station D       NA       TRUE
120 2021-11-01           WWTP 3.931336      FALSE
121 2021-11-08 Lift station A       NA       TRUE
122 2021-11-08 Lift station B 3.687355      FALSE
123 2021-11-08 Lift station C 3.734131      FALSE
124 2021-11-08 Lift station D 3.425022      FALSE
125 2021-11-08           WWTP 3.877653      FALSE
126 2021-11-15 Lift station A       NA       TRUE
127 2021-11-15 Lift station B 3.762643      FALSE
128 2021-11-15 Lift station C 3.727252      FALSE
129 2021-11-15 Lift station D 3.447529      FALSE
130 2021-11-15           WWTP 3.851034      FALSE
131 2021-11-22 Lift station A       NA       TRUE
132 2021-11-22 Lift station B       NA       TRUE
133 2021-11-22 Lift station C 4.617968      FALSE
134 2021-11-22 Lift station D 4.341540      FALSE
135 2021-11-22           WWTP 4.073687      FALSE
136 2021-11-29 Lift station A       NA       TRUE
137 2021-11-29 Lift station B       NA       TRUE
138 2021-11-29 Lift station C 5.837219      FALSE
139 2021-11-29 Lift station D 5.561777      FALSE
140 2021-11-29           WWTP 4.455586      FALSE
141 2021-12-06 Lift station A 3.738877      FALSE
142 2021-12-06 Lift station B 4.163225      FALSE
143 2021-12-06 Lift station C       NA       TRUE
144 2021-12-06 Lift station D 3.845751      FALSE
145 2021-12-06           WWTP 3.813849      FALSE
146 2021-12-13 Lift station A       NA       TRUE
147 2021-12-13 Lift station B 4.851504      FALSE
148 2021-12-13 Lift station C 4.862413      FALSE
149 2021-12-13 Lift station D 4.690651      FALSE
150 2021-12-13           WWTP 4.292733      FALSE
151 2021-12-20 Lift station A       NA       TRUE
152 2021-12-20 Lift station B 5.361986      FALSE
153 2021-12-20 Lift station C 5.440936      FALSE
154 2021-12-20 Lift station D 5.301950      FALSE
155 2021-12-20           WWTP       NA       TRUE
156 2021-12-27 Lift station A 5.239869      FALSE
157 2021-12-27 Lift station B       NA       TRUE
158 2021-12-27 Lift station C 4.855687      FALSE
159 2021-12-27 Lift station D 4.778626      FALSE
160 2021-12-27           WWTP 4.927720      FALSE
161 2022-01-03 Lift station A 5.850155      FALSE
162 2022-01-03 Lift station B       NA       TRUE
163 2022-01-03 Lift station C 5.946779      FALSE
164 2022-01-03 Lift station D 5.894874      FALSE
165 2022-01-03           WWTP 5.529660      FALSE
166 2022-01-10 Lift station A 5.463797      FALSE
167 2022-01-10 Lift station B 5.033963      FALSE
168 2022-01-10 Lift station C       NA       TRUE
169 2022-01-10 Lift station D       NA       TRUE
170 2022-01-10           WWTP 5.137469      FALSE
171 2022-01-17 Lift station A 5.371474      FALSE
172 2022-01-17 Lift station B 4.974649      FALSE
173 2022-01-17 Lift station C 4.944890      FALSE
174 2022-01-17 Lift station D 4.835231      FALSE
175 2022-01-17           WWTP 5.040134      FALSE
176 2022-01-24 Lift station A 5.079188      FALSE
177 2022-01-24 Lift station B 4.539009      FALSE
178 2022-01-24 Lift station C 4.528667      FALSE
179 2022-01-24 Lift station D 4.398598      FALSE
180 2022-01-24           WWTP 4.735477      FALSE
181 2022-01-31 Lift station A 4.777910      FALSE
182 2022-01-31 Lift station B 4.192846      FALSE
183 2022-01-31 Lift station C 4.349681      FALSE
184 2022-01-31 Lift station D 4.183396      FALSE
185 2022-01-31           WWTP 4.437708      FALSE
186 2022-02-07 Lift station A 4.507263      FALSE
187 2022-02-07 Lift station B 3.781950      FALSE
188 2022-02-07 Lift station C 3.988071      FALSE
189 2022-02-07 Lift station D 3.844223      FALSE
190 2022-02-07           WWTP 4.183525      FALSE
191 2022-02-14 Lift station A 4.231858      FALSE
192 2022-02-14 Lift station B 3.370880      FALSE
193 2022-02-14 Lift station C       NA       TRUE
194 2022-02-14 Lift station D 3.497339      FALSE
195 2022-02-14           WWTP 3.892506      FALSE
196 2022-02-21 Lift station A 4.382902      FALSE
197 2022-02-21 Lift station B 3.694710      FALSE
198 2022-02-21 Lift station C 4.133196      FALSE
199 2022-02-21 Lift station D 4.101206      FALSE
200 2022-02-21           WWTP 3.942561      FALSE
201 2022-02-28 Lift station A 4.147238      FALSE
202 2022-02-28 Lift station B 3.328388      FALSE
203 2022-02-28 Lift station C 3.641227      FALSE
204 2022-02-28 Lift station D       NA       TRUE
205 2022-02-28           WWTP 3.569448      FALSE
206 2022-03-07 Lift station A 4.396039      FALSE
207 2022-03-07 Lift station B       NA       TRUE
208 2022-03-07 Lift station C 4.068679      FALSE
209 2022-03-07 Lift station D       NA       TRUE
210 2022-03-07           WWTP 3.697926      FALSE
211 2022-03-14 Lift station A 3.660485      FALSE
212 2022-03-14 Lift station B       NA       TRUE
213 2022-03-14 Lift station C 2.038824      FALSE
214 2022-03-14 Lift station D 2.073368      FALSE
215 2022-03-14           WWTP 2.931988      FALSE
216 2022-03-21 Lift station A 4.106852      FALSE
217 2022-03-21 Lift station B 3.006341      FALSE
218 2022-03-21 Lift station C 3.091754      FALSE
219 2022-03-21 Lift station D 3.076814      FALSE
220 2022-03-21           WWTP 3.472412      FALSE
221 2022-03-28 Lift station A 4.617891      FALSE
222 2022-03-28 Lift station B       NA       TRUE
223 2022-03-28 Lift station C 4.457814      FALSE
224 2022-03-28 Lift station D 4.374370      FALSE
225 2022-03-28           WWTP 4.117559      FALSE
226 2022-04-04 Lift station A       NA       TRUE
227 2022-04-04 Lift station B       NA       TRUE
228 2022-04-04 Lift station C 4.043378      FALSE
229 2022-04-04 Lift station D 3.908229      FALSE
230 2022-04-04           WWTP 4.062018      FALSE
231 2022-04-11 Lift station A 4.317178      FALSE
232 2022-04-11 Lift station B 3.157110      FALSE
233 2022-04-11 Lift station C 3.614420      FALSE
234 2022-04-11 Lift station D 3.427629      FALSE
235 2022-04-11           WWTP 3.926067      FALSE
236 2022-04-18 Lift station A 4.663436      FALSE
237 2022-04-18 Lift station B       NA       TRUE
238 2022-04-18 Lift station C 4.455957      FALSE
239 2022-04-18 Lift station D       NA       TRUE
240 2022-04-18           WWTP 4.240613      FALSE
241 2022-04-25 Lift station A 4.728741      FALSE
242 2022-04-25 Lift station B       NA       TRUE
243 2022-04-25 Lift station C 4.247844      FALSE
244 2022-04-25 Lift station D 4.045955      FALSE
245 2022-04-25           WWTP 4.227155      FALSE
246 2022-05-02 Lift station A 4.796903      FALSE
247 2022-05-02 Lift station B       NA       TRUE
248 2022-05-02 Lift station C 4.252708      FALSE
249 2022-05-02 Lift station D 4.073360      FALSE
250 2022-05-02           WWTP 4.220640      FALSE
251 2022-05-09 Lift station A 5.247327      FALSE
252 2022-05-09 Lift station B       NA       TRUE
253 2022-05-09 Lift station C 5.260360      FALSE
254 2022-05-09 Lift station D 5.114130      FALSE
255 2022-05-09           WWTP 4.587104      FALSE
256 2022-05-16 Lift station A 5.226994      FALSE
257 2022-05-16 Lift station B       NA       TRUE
258 2022-05-16 Lift station C 4.948804      FALSE
259 2022-05-16 Lift station D 4.851201      FALSE
260 2022-05-16           WWTP 4.457253      FALSE
261 2022-05-23 Lift station A 5.291779      FALSE
262 2022-05-23 Lift station B       NA       TRUE
263 2022-05-23 Lift station C 5.088610      FALSE
264 2022-05-23 Lift station D 5.022635      FALSE
265 2022-05-23           WWTP 4.470757      FALSE
266 2022-05-30 Lift station A 5.152038      FALSE
267 2022-05-30 Lift station B       NA       TRUE
268 2022-05-30 Lift station C 4.755075      FALSE
269 2022-05-30 Lift station D 4.718337      FALSE
270 2022-05-30           WWTP 4.344221      FALSE
271 2022-06-06 Lift station A       NA       TRUE
272 2022-06-06 Lift station B       NA       TRUE
273 2022-06-06 Lift station C 4.551157      FALSE
274 2022-06-06 Lift station D 4.551809      FALSE
275 2022-06-06           WWTP 4.290264      FALSE
276 2022-06-13 Lift station A 5.353931      FALSE
277 2022-06-13 Lift station B       NA       TRUE
278 2022-06-13 Lift station C 5.429759      FALSE
279 2022-06-13 Lift station D 5.467575      FALSE
280 2022-06-13           WWTP 4.713474      FALSE
281 2022-06-20 Lift station A 5.057523      FALSE
282 2022-06-20 Lift station B       NA       TRUE
283 2022-06-20 Lift station C 4.412117      FALSE
284 2022-06-20 Lift station D 4.509241      FALSE
285 2022-06-20           WWTP 4.533620      FALSE
286 2022-06-27 Lift station A       NA       TRUE
287 2022-06-27 Lift station B       NA       TRUE
288 2022-06-27 Lift station C 4.701963      FALSE
289 2022-06-27 Lift station D 4.915736      FALSE
290 2022-06-27           WWTP 4.893376      FALSE
291 2022-07-04 Lift station A       NA       TRUE
292 2022-07-04 Lift station B 4.827485      FALSE
293 2022-07-04 Lift station C 4.726438      FALSE
294 2022-07-04 Lift station D 4.972369      FALSE
295 2022-07-04           WWTP 4.977504      FALSE
296 2022-07-11 Lift station A       NA       TRUE
297 2022-07-11 Lift station B 4.968331      FALSE
298 2022-07-11 Lift station C       NA       TRUE
299 2022-07-11 Lift station D       NA       TRUE
300 2022-07-11           WWTP 5.045952      FALSE
301 2022-07-18 Lift station A       NA       TRUE
302 2022-07-18 Lift station B       NA       TRUE
303 2022-07-18 Lift station C 5.475307      FALSE
304 2022-07-18 Lift station D 5.713996      FALSE
305 2022-07-18           WWTP 5.367066      FALSE
306 2022-07-25 Lift station A 5.663719      FALSE
307 2022-07-25 Lift station B       NA       TRUE
308 2022-07-25 Lift station C 5.202822      FALSE
309 2022-07-25 Lift station D 5.428252      FALSE
310 2022-07-25           WWTP 5.333170      FALSE
311 2022-08-01 Lift station A 5.449763      FALSE
312 2022-08-01 Lift station B 5.336353      FALSE
313 2022-08-01 Lift station C 4.700736      FALSE
314 2022-08-01 Lift station D 4.908666      FALSE
315 2022-08-01           WWTP 5.158984      FALSE
316 2022-08-08 Lift station A 5.580385      FALSE
317 2022-08-08 Lift station B 5.813875      FALSE
318 2022-08-08 Lift station C       NA       TRUE
319 2022-08-08 Lift station D       NA       TRUE
320 2022-08-08           WWTP 5.296705      FALSE
321 2022-08-15 Lift station A       NA       TRUE
322 2022-08-15 Lift station B 5.087009      FALSE
323 2022-08-15 Lift station C 4.353382      FALSE
324 2022-08-15 Lift station D 4.468795      FALSE
325 2022-08-15           WWTP 4.761753      FALSE
326 2022-08-22 Lift station A 4.972341      FALSE
327 2022-08-22 Lift station B       NA       TRUE
328 2022-08-22 Lift station C 4.353001      FALSE
329 2022-08-22 Lift station D 4.490474      FALSE
330 2022-08-22           WWTP 4.647420      FALSE
331 2022-08-29 Lift station A 4.730164      FALSE
332 2022-08-29 Lift station B       NA       TRUE
333 2022-08-29 Lift station C 3.753268      FALSE
334 2022-08-29 Lift station D 3.984822      FALSE
335 2022-08-29           WWTP 4.358192      FALSE
336 2022-09-05 Lift station A 4.772474      FALSE
337 2022-09-05 Lift station B 4.979961      FALSE
338 2022-09-05 Lift station C 4.012023      FALSE
339 2022-09-05 Lift station D 4.325235      FALSE
340 2022-09-05           WWTP 4.352046      FALSE
341 2022-09-12 Lift station A 4.904766      FALSE
342 2022-09-12 Lift station B       NA       TRUE
343 2022-09-12 Lift station C 4.569789      FALSE
344 2022-09-12 Lift station D 4.938988      FALSE
345 2022-09-12           WWTP 4.426454      FALSE
346 2022-09-19 Lift station A 4.979590      FALSE
347 2022-09-19 Lift station B 5.512624      FALSE
348 2022-09-19 Lift station C 4.830270      FALSE
349 2022-09-19 Lift station D 5.238814      FALSE
350 2022-09-19           WWTP 4.480430      FALSE
351 2022-09-26 Lift station A       NA       TRUE
352 2022-09-26 Lift station B 4.463826      FALSE
353 2022-09-26 Lift station C 3.389554      FALSE
354 2022-09-26 Lift station D 3.797496      FALSE
355 2022-09-26           WWTP 3.899522      FALSE
356 2022-10-03 Lift station A 4.783696      FALSE
357 2022-10-03 Lift station B 5.275716      FALSE
358 2022-10-03 Lift station C 4.551455      FALSE
359 2022-10-03 Lift station D 4.957152      FALSE
360 2022-10-03           WWTP 4.298550      FALSE
361 2022-10-10 Lift station A 4.807143      FALSE
362 2022-10-10 Lift station B       NA       TRUE
363 2022-10-10 Lift station C 4.734067      FALSE
364 2022-10-10 Lift station D 5.079587      FALSE
365 2022-10-10           WWTP 4.310610      FALSE
366 2022-10-17 Lift station A       NA       TRUE
367 2022-10-17 Lift station B       NA       TRUE
368 2022-10-17 Lift station C 4.174966      FALSE
369 2022-10-17 Lift station D 4.395995      FALSE
370 2022-10-17           WWTP 4.081558      FALSE
371 2022-10-24 Lift station A 4.591612      FALSE
372 2022-10-24 Lift station B 4.966974      FALSE
373 2022-10-24 Lift station C 4.281430      FALSE
374 2022-10-24 Lift station D 4.303030      FALSE
375 2022-10-24           WWTP 4.109536      FALSE
376 2022-10-31 Lift station A 4.565715      FALSE
377 2022-10-31 Lift station B 4.813525      FALSE
378 2022-10-31 Lift station C 4.246507      FALSE
379 2022-10-31 Lift station D 4.022139      FALSE
380 2022-10-31           WWTP 4.092752      FALSE
381 2022-11-07 Lift station A 4.851096      FALSE
382 2022-11-07 Lift station B 5.179645      FALSE
383 2022-11-07 Lift station C 4.946375      FALSE
384 2022-11-07 Lift station D 4.545763      FALSE
385 2022-11-07           WWTP 4.408887      FALSE
386 2022-11-14 Lift station A       NA       TRUE
387 2022-11-14 Lift station B 5.472800      FALSE
388 2022-11-14 Lift station C 5.548770      FALSE
389 2022-11-14 Lift station D 5.086430      FALSE
390 2022-11-14           WWTP 4.744275      FALSE
391 2022-11-21 Lift station A       NA       TRUE
392 2022-11-21 Lift station B 5.230644      FALSE
393 2022-11-21 Lift station C 5.462300      FALSE
394 2022-11-21 Lift station D 5.067701      FALSE
395 2022-11-21           WWTP 4.797173      FALSE
396 2022-11-28 Lift station A 5.134189      FALSE
397 2022-11-28 Lift station B       NA       TRUE
398 2022-11-28 Lift station C 5.632102      FALSE
399 2022-11-28 Lift station D 5.416644      FALSE
400 2022-11-28           WWTP 4.885524      FALSE
401 2022-12-05 Lift station A 4.847601      FALSE
402 2022-12-05 Lift station B       NA       TRUE
403 2022-12-05 Lift station C 4.862490      FALSE
404 2022-12-05 Lift station D 4.935342      FALSE
405 2022-12-05           WWTP 4.623078      FALSE
406 2022-12-12 Lift station A       NA       TRUE
407 2022-12-12 Lift station B 3.989367      FALSE
408 2022-12-12 Lift station C 4.515704      FALSE
409 2022-12-12 Lift station D 4.911444      FALSE
410 2022-12-12           WWTP 4.522467      FALSE
411 2022-12-19 Lift station A 5.065173      FALSE
412 2022-12-19 Lift station B       NA       TRUE
413 2022-12-19 Lift station C 5.091807      FALSE
414 2022-12-19 Lift station D 5.799730      FALSE
415 2022-12-19           WWTP 4.940762      FALSE
416 2022-12-26 Lift station A       NA       TRUE
417 2022-12-26 Lift station B       NA       TRUE
418 2022-12-26 Lift station C       NA       TRUE
419 2022-12-26 Lift station D       NA       TRUE
420 2022-12-26           WWTP       NA       TRUE
421 2023-01-02 Lift station A       NA       TRUE
422 2023-01-02 Lift station B       NA       TRUE
423 2023-01-02 Lift station C       NA       TRUE
424 2023-01-02 Lift station D       NA       TRUE
425 2023-01-02           WWTP       NA       TRUE
426 2023-01-09 Lift station A 5.045564      FALSE
427 2023-01-09 Lift station B 4.646863      FALSE
428 2023-01-09 Lift station C 4.853137      FALSE
429 2023-01-09 Lift station D 5.881326      FALSE
430 2023-01-09           WWTP 4.918670      FALSE
431 2023-01-16 Lift station A 5.155937      FALSE
432 2023-01-16 Lift station B       NA       TRUE
433 2023-01-16 Lift station C       NA       TRUE
434 2023-01-16 Lift station D       NA       TRUE
435 2023-01-16           WWTP 5.025141      FALSE
436 2023-01-23 Lift station A 5.522798      FALSE
437 2023-01-23 Lift station B       NA       TRUE
438 2023-01-23 Lift station C 6.014155      FALSE
439 2023-01-23 Lift station D 6.942382      FALSE
440 2023-01-23           WWTP 5.353660      FALSE
441 2023-01-30 Lift station A 4.882314      FALSE
442 2023-01-30 Lift station B       NA       TRUE
443 2023-01-30 Lift station C 4.058933      FALSE
444 2023-01-30 Lift station D       NA       TRUE
445 2023-01-30           WWTP 4.646803      FALSE
446 2023-02-06 Lift station A 5.284055      FALSE
447 2023-02-06 Lift station B       NA       TRUE
448 2023-02-06 Lift station C 5.073381      FALSE
449 2023-02-06 Lift station D 5.741934      FALSE
450 2023-02-06           WWTP 5.044864      FALSE
451 2023-02-13 Lift station A 5.317126      FALSE
452 2023-02-13 Lift station B       NA       TRUE
453 2023-02-13 Lift station C 5.489343      FALSE
454 2023-02-13 Lift station D 5.981146      FALSE
455 2023-02-13           WWTP 5.102399      FALSE
456 2023-02-20 Lift station A 5.098546      FALSE
457 2023-02-20 Lift station B       NA       TRUE
458 2023-02-20 Lift station C 5.133391      FALSE
459 2023-02-20 Lift station D 5.503848      FALSE
460 2023-02-20           WWTP 4.895386      FALSE
461 2023-02-27 Lift station A 4.902653      FALSE
462 2023-02-27 Lift station B 4.478055      FALSE
463 2023-02-27 Lift station C 4.849247      FALSE
464 2023-02-27 Lift station D 5.108564      FALSE
465 2023-02-27           WWTP 4.699857      FALSE
466 2023-03-06 Lift station A 4.708244      FALSE
467 2023-03-06 Lift station B       NA       TRUE
468 2023-03-06 Lift station C 4.266383      FALSE
469 2023-03-06 Lift station D 4.453423      FALSE
470 2023-03-06           WWTP 4.469688      FALSE
471 2023-03-13 Lift station A 4.743263      FALSE
472 2023-03-13 Lift station B       NA       TRUE
473 2023-03-13 Lift station C 4.139546      FALSE
474 2023-03-13 Lift station D 4.286500      FALSE
475 2023-03-13           WWTP 4.490017      FALSE

The Data

Log10 Copies per Liter (SARS-CoV-2 concentration) for several locations:

Trend estimation

Previous method: smoothing spline

Code
WWTP = all_ts_observed %>% dplyr::filter(name == "WWTP")
WWTP$y.smooth <- smooth.spline(x = WWTP$dates, 
                               y = zoo::na.approx(WWTP$value))$y

missing_dates_lwr <- unique(WWTP$dates[which(WWTP$ts_missing)])
missing_dates_upr <- unique(WWTP$dates[which(WWTP$ts_missing)+1])[1:length(missing_dates_lwr)]
        

ggplot(data = WWTP, aes(x = dates, y = value)) + 
  annotate("rect", xmin = missing_dates_lwr, 
                       xmax  = missing_dates_upr, 
                       ymin = min(WWTP$value, na.rm = T), 
                       ymax = max(WWTP$value, na.rm = T),
                       alpha = .3) +
        # geom_line(col = "#332288", lwd = 1.5) +
geom_point(col = "#332288", size = 3)+
         geom_line(aes(x = dates, y = y.smooth), col = "#33228850", lwd = 4) +
        theme_minimal() + 
          scale_color_manual(values = c("#33228850", "#332288")) +
        ylab("Log 10 Copies/L") + xlab("Date") + 
        ggtitle("Smoothing Spline for WWTP") + 
  theme(legend.position = "none", 
                      title = element_text(size = 15), 
                      axis.title.x = element_text(size = 20),
                      axis.title.y = element_text(size = 20),
                      axis.text.x =element_text(size = 20), 
                      axis.text.y = element_text(size = 20)) 

Spline methods

Pros

  • Gives a nice, smooth trend estimate
  • Functions available in base R

Cons

  • No way to separate process variability from observation error
  • Clunky when there’s missing data
  • Experience required for model fitting, interpretation, etc. (see A review of spline function procedures in R)
  • Not a time series method!

Time series data require time series methods

  • If a model assumes independence, but data are temporally correlated, standard error estimates will be biased
  • cannot give us uncertainty estimates of the trend (out of the box)

…Can we use a slightly fancier model that provides a similar fit?

State space models

A flexible modeling framework that specifies a model in two levels:

\[\begin{align} \text{data model (observation equation): } y_{t} &= A_t\mu_t + v_t\\ \text{process model (state equation): } \mu_{t+1} &= \Phi_{t} \mu_{t} + R_t w_t. \\ \end{align}\]

The choice of \(A_t, \Phi_t, R_t\) allow for a wide range of models to be specified.

State space second order difference model

\[\begin{align} \text{Observation equation: }& y_{t} = \mu_t + v_{t}\\ \text{State equation: }& (\mu_t - \mu_{t-1}) = (\mu_{t-1} - \mu_{t-2}) + w_t.\\ \text{ OR}\\ \mu_t &= 2\mu_{t-1} - \mu_{t-2} + w_t\\ \text{ OR}\\ \begin{pmatrix} \mu_t\\\mu_{t-1}\end{pmatrix} &= \begin{pmatrix} 2 & -1\\ 1 &0 \end{pmatrix} \begin{pmatrix}\mu_{t-1}\\ \mu_{t-2}\end{pmatrix} + \begin{pmatrix}1\\0\end{pmatrix}w_t. \\ \text{Initial condition: }& \mu_0 \sim N(c_0, m_0). \end{align}\] Parameters are \(\sigma^2_v, \sigma^2_w, c_0, m_0\)

See this example for a more mathematical treatment.

Several ways to obtain trend estimates using this model…

Types of estimation

  • Prediction: forecasting future values of the state
    • For example: one-step-ahead forecast
  • Online (Filtering): best estimate of the current value of the state
    • Estimate \(\mathbb{E}(\mu_t | y_1, \dots, y_{t})\)
  • Retrospective (Smoothing): best estimates of past values of the state given all observations
    • For example: \(\mathbb{E}(\mu_{t-1}|y_1, \dots, y_n)\)

Variance comparison

  • Observation variance: \(\sigma_v^2\), sampling and lab variation
  • State variance: \(\sigma_w^2\), inherent process variability

Table of Standard Deviations

            name   obs state    pop
1           WWTP 0.037 0.013 551150
2 Lift station A 0.036 0.012 373937
3 Lift station B 0.133 0.014   4849
4 Lift station C 0.269 0.011   2442
5 Lift station D 0.268 0.019   1724

State space models

Pros

  • Can specify many model structures
  • Separate process variability from observation error
  • Provides uncertainty estimates
  • Handles missing data
  • Many R packages available: astsa, KFAS, MARSS

Cons

  • Can be difficult to specify model structure
  • Varying ease of use for applied researchers with R packages

Big Pro:

For second order difference model, the retrospective (smoother) trend estimates will be the same as the smoothing spline!

Time series modeling assumption

Autocorrelation of model’s residuals should not be present.

Algorithm 1: Summary of trend estimation

Inputs: Raw lab values

  1. Initialize the model by estimating parameters using the first 10 weeks of data.
  2. Compute online trend estimates and confidence limits and re-estimating parameters with each additional data point.
  3. Compute retrospective trend estimates and confidence limits.
  4. Verify convergence of estimates and time series modeling assumptions.
  5. Compute table of state and observation variances for each time series.
  6. Compare visualizations of retrospective estimates of WWTP and sub-sewershed trends to determine whether a difference was present.

Outputs: Online and retrospective trend estimates, estimates of trend variability and measurement/sampling variability.

Do lift station series give different information than the WWTP?

(and if so, when?)

Retrospective comparison

Lift stations appear to give different information.

How can we integrate information from times of concern into the wastewater surveillance system in an online fashion?

(Online) Detection of deviations

Is there another tool for which we can use the online (filter) estimates?

Statistical Process Control (SPC)

When is a process “out of control” or experiencing a mean shift/structural break?

Traditional applications: (from Montgomery’s Statistical Quality Control)

  • ensuring a given percentage of on-time deliveries to a client
  • speed and consistency of service quality in a bank
  • loading passengers onto an airplane

Control chart options:

  • Shewhart chart
  • CUSUM chart
  • Exponentially weighted moving average chart

EWMA chart

For a time series \(d_t\):

\[ z_t = \lambda d_t + (1- \lambda)z_{t-1} \]

  • \(z_t\) is a weighted average of all past values of \(d_t\)
  • contribution of each past value is controlled by \(\lambda\)
    • we use lag 1 autocorrelation estimate for \(d_t\).

Reasoning for choice of EWMA chart:

  • can use with just 1 value
  • can detect changes in correlated series

Which series do we want to monitor?

What’s our \(d_t\) for the control chart?

Standardized difference series

The standardized difference at time point \(t\), for lift station \(i=1,\ldots,4\) is given by:

\[ d_{i,t} = \frac{y_{i,t} - \widehat{\mu}_t}{\widetilde{\sigma}_d} = \frac{\text{LS Observed - WWTP estimate}}{\text{std dev.}}. \]

  • Need to compute \(\tilde{\sigma}_d^2 = {\rm Var}(y_{i,t} - \hat{\mu}_t)\)

We approximate this variance by: \[ \widetilde{\sigma}_d^2 \approx \widehat{\sigma}_{v_t}^2 + \widehat{\sigma}_{w_t}^2 -2{\rm Corr}(y_{i}, \widehat{\mu})\cdot \widehat{\sigma}_{v_t} \cdot \widehat{\sigma}_{w_t}, \] \({\rm Corr}(y_{i}, \hat{\mu})\) is the Pearson correlation.

Goal: create an EWMA chart for the difference between estimated WWTP state and observed LS values.

EWMA Charts for Each Lift Station

Summary of EWMA chart procedure

Inputs: At least 10 + \(n\) WWTP observations, \(n \geq 1\) sub-sewershed observations

  1. Read in cleaned WWTP series and obtain online trend estimates through the date of the first sub-sewershed observation.
  2. Replace any missing sub-sewershed observations with WWTP online trend estimate for corresponding date.
  3. Create difference time series of sub-sewershed observed copies/liter (\(\log10\)) - WWTP Online Trend Estimate.
  4. Standardize the difference series by dividing by the standard deviation computed approximation
  5. Construct EWMA chart for the standardized difference series.
  6. Inspect EWMA chart for separation.

Outputs: WMA chart for determining separation of sub-sewershed from centralized WWTP.

Conclusion

Summary

  • We used a state space second order difference model to estimate trends in time series
  • We estimated both the lab/sampling variability for each series
  • We developed an EWMA control chart to detect when the lift stations are deviating from the larger WWTP

Ongoing work

  • Joint modeling
  • Spatial component

\[\begin{align} \text{data model (observation equation): } y_{t} &= A_t\mu_t + v_t\\ \text{process model (state equation): } \mu_{t+1} &= \Phi_{t} \mu_{t} + R_t w_t. \\ \end{align}\]

  • Incorporate Batch Sampling

Acknowledgements

This work was supported by the:

  • CDC Foundation (project no. 1085.46)

  • Centers for Disease Control and Prevention (ELC-ED grant no. 6NU50CK000557-01-05 and ELC-CORE grant no. NU50CK000557).

  • Rockefeller Foundation

Learn more about this work!

Paper: “Online trend estimation and detection of trend deviations in sub-sewershed time series of SARS-CoV-2 RNA measured in wastewater.” Link to Paper

https://juliaschedler.com

https://github.com/hou-wastewater-epi-org

https://hou-wastewater-epi.org/

Email me: [email protected] or, after August 15, [email protected]