This repository has been archived by the owner on Dec 2, 2024. It is now read-only.
forked from bbartholdy/mb11CalculusPilot
---
title: "Supplementary Materials"
execute:
  echo: false
  warning: false
knitr:
  opts_chunk:
    message: false
format:
  html:
    format-links: false
    echo: true
    toc: true
    code-fold: true
    embed-resources: true
  pdf:
    toc: true
bibliography: "../paper/references.bib"
---
```{r}
#| label: setup
#| include: false
library(here)
library(readr)
devtools::load_all()
# upload data
# metadata <- read_tsv(here("analysis/data/raw_data/metadata.tsv"))
# demography <- read_csv(here("analysis/data/raw_data/demography.csv"))
# lloq <- read_tsv(here("analysis/data/raw_data/lloq.tsv"))
# uhplc_data_comb <- read_csv(here("analysis/data/derived_data/uhplc-data_combined.csv"))
# #uhplc_calculus <- read_csv(here("analysis/data/derived_data/uhplc-calculus_cleaned.csv"))
# dental_inv <- read_csv(here("analysis/data/raw_data/dental-inv.csv"))
# caries <- read_csv(here("analysis/data/raw_data/caries.csv"))
# periodont <- read_csv(here("analysis/data/raw_data/periodontitis.csv"))
# periap <- read_csv(here("analysis/data/raw_data/periapical.csv"))
# calculus <- read_csv(here("analysis/data/raw_data/calculus.csv"))
# calculus_full <- read_csv(here("analysis/data/raw_data/calculus_full.csv"))
# sinusitis_clean <- read_csv(here("analysis/data/derived_data/sinusitis_cleaned.csv"))
# path_cond_clean <- read_csv(here("analysis/data/derived_data/path-conditions_cleaned.csv"))
source(here("analysis/scripts/setup-qmd.R"))
```
These supplementary figures and tables are a variable hodgepodge of things that
didn't fit in the main manuscript, and additional things I thought might be
useful. The best way to explore/verify the results and interpretations is to
download all the data and code (<https://doi.org/10.5281/zenodo.7649824>)
and just play around with it yourself. Enjoy!
## Samples
Calculus samples are a combination of leftovers from a previous aDNA study and
newly sampled individuals. In some cases individuals from the previous study were
sampled again (if not enough calculus was left over from the previous study)
(@tbl-sampling).
### Selection
```{r}
#| label: tbl-sampling
#| tbl-cap: "Individuals sampled in this study and individuals sampled in the previous study. When both are TRUE, the individual was sampled twice."
metadata %>%
select(id, element, KZ_element) %>%
mutate(
element = case_when(is.na(element) ~ FALSE,
TRUE ~ TRUE),
KZ_element = case_when(is.na(KZ_element) ~ FALSE,
TRUE ~ TRUE),
) %>%
arrange(id) %>%
knitr::kable(col.names = c("ID", "this study", "previous study"))
```
### Demographics
```{r}
#| label: male-sample
males <- demography %>%
filter(sex == "m" | sex == "pm")
```
The sample consists of `r nrow(demography)` individuals, most of whom are
middle adult males (@fig-age-distribution). Middle adult males were
preferentially targeted because they were observed to have larger calculus
deposits; the age and sex distribution of the sample is therefore not
representative of the population. This targeting was also intended to limit
potential confounding factors, and reflects that pipe notches, which served as a
positive control for tobacco, are predominantly seen in male individuals at the site.
```{r}
#| label: fig-age-distribution
#| fig-cap: "Distribution of age and sex in the sample. f = female; pf = probable female; pm = probable male; m = male; eya = early young adult (18-24 years); lya = late young adult (25-34 years); ma = middle adult (35-49 years); old = old adult (50+ years)."
demography %>%
ggplot(aes(x = age, fill = sex)) +
geom_bar() +
theme_minimal()
```
### Missing data
An overview of the missing teeth can be found in @fig-missing-teeth. Missing
scores per tooth can be found in @tbl-missing-scores and @fig-missing-dental.
```{r}
#| label: fig-missing-teeth
#| fig-cap: "Heatmap of missing teeth per individual in the sample. 1 = present, 0 = missing."
dental_inv_long %>%
mutate(status = if_else(status == "p", 1, 0)) %>%
ggplot(aes(x = tooth, y = id, fill = status)) +
geom_tile()
```
```{r}
#| label: missing-values-teeth
caries_missing <- caries %>%
mutate(across(!id, ~ is.na(.x))) %>%
select(!id) %>%
colSums() %>%
as_tibble_row()
periodont_missing <- periodont %>%
mutate(across(!id, ~ is.na(.x))) %>%
select(!id) %>%
colSums() %>%
as_tibble_row()
periap_missing <- periap %>%
mutate(across(!id, ~ is.na(.x))) %>%
select(!id) %>%
colSums() %>%
as_tibble_row()
missing_tbl <- tibble(caries_missing) %>%
add_case(periodont_missing) %>%
add_case(periap_missing) %>%
mutate(score = c("caries", "periodontitis", "periapical"), .before = 1)
```
```{r}
#| label: tbl-missing-scores
#| tbl-cap: "Table of missing scores by tooth."
knitr::kable(missing_tbl)
```
```{r}
#| label: fig-missing-dental
#| fig-cap: "Missing scores per tooth and individual."
#| layout-ncol: 2
#| fig-subcap:
#| - "Caries"
#| - "Periodontitis"
#| - "Periapical lesions"
#| - "Combined caries, periodontitis, periapical."
caries %>%
mutate(across(!id, ~ is.na(.x))) %>%
pivot_longer(-id, names_to = "tooth") %>%
mutate(value = if_else(value == TRUE, "missing", "present")) %>%
ggplot(aes(x = tooth, y = id, fill = value)) +
geom_tile()
periodont %>%
mutate(across(!id, ~ is.na(.x))) %>%
pivot_longer(-id, names_to = "tooth") %>%
mutate(value = if_else(value == TRUE, "missing", "present")) %>%
ggplot(aes(x = tooth, y = id, fill = value)) +
geom_tile()
periap %>%
mutate(across(!id, ~ is.na(.x))) %>%
pivot_longer(-id, names_to = "tooth") %>%
mutate(value = if_else(value == TRUE, "missing", "present")) %>%
ggplot(aes(x = tooth, y = id, fill = value)) +
geom_tile()
dental_long %>%
mutate(across(c(caries, periodont, periap), is.na)) %>%
select(c(caries, periodont, periap, id, tooth)) %>%
#group_by(id, tooth) %>%
rowwise() %>%
mutate(missing = sum(caries, periodont, periap)) %>%
ggplot(aes(x = tooth, y = id, fill = missing)) +
geom_tile()
```
## UHPLC analysis
```{r}
#| label: setup-uhplc
# presence/absence data frame
uhplc_calculus_bin <- uhplc_calculus_long %>%
mutate(presence = if_else(quant > 0, 1, quant))
# successfully replicated samples only
uhplc_calculus_replicated <- uhplc_calculus_bin %>%
filter(id %in% filter(metadata, replicated == TRUE)$id) %>%
group_by(id, sample, compound) %>% # combine batches
summarise(presence = sum(presence)) %>%
filter(presence == 0 | presence == 2) %>% # remove compounds only detected in one batch
group_by(compound) %>%
mutate(presence = if_else(presence == 0, 0, 1)) %>% # convert replications to presence/absence
ungroup()
uhplc_replicated_wide <- uhplc_calculus_replicated %>%
mutate(compound = case_when(compound == "nicotine" ~ "tobacco",
compound == "cotinine" ~ "tobacco",
TRUE ~ compound)) %>%
group_by(id, sample, compound) %>%
summarise(presence = sum(presence)) %>% # combine nicotine and cotinine
#remove_missing() %>%
mutate(presence = case_when(presence > 0 ~ TRUE,
TRUE ~ FALSE)) %>%
pivot_wider(names_from = "compound", values_from = "presence")
```
The UHPLC-MS/MS method was validated in a separate study on cadavers received
for forensic autopsy and toxicological analysis. Results from dental calculus were
validated against compounds detected in whole blood samples from the same
individuals [@sorensenDrugsCalculus2021].
In the original method, samples were washed three times to remove residual
substances originating from oral fluids from the surface of the calculus, so
that only substances from within the calculus were extracted. In our samples the
washes served to remove potential contaminants from the burial environment and
post-excavation handling.
Briefly, dental calculus was treated with citric acid and the dissolution extracts
were cleaned using weak and strong polymeric cation-exchange sorbents. Samples
were washed with 0.5 mL MeOH for 10 seconds and weighed before and after
each wash. The wash solvent was evaporated to a residual volume of 10 µL, to
which 50 µL of 30% methanol was added.
Samples were air-dried for 24 hours at room temperature after each wash.
Extracts from each wash were analysed by injecting 5 µL into the column
on an Exion UHPLC system that consisted of two Exion AD pumps, an Exion AD
multiplate autosampler set at 10 $\pm$ 2 °C and an Exion AC column oven set
at 40 $\pm$ 2 °C (Sciex, Ontario, Canada).
Separation was performed using a Raptor Biphenyl UHPLC column (2.7 $\mu$m, 2.1 mm I.D.
$\times$ 100 mm) (Restek, Bellefonte, PA). The mass spectrometer was a Sciex QTRAP
6500+ with a TurboIonSpray probe for electrospray ionisation.
The remaining calculus was dissolved using lysing tube beads in 800 $\mu$L of 0.5
$\small{M}$ citric acid (CA) and 50 $\mu$L of a solution of stable isotope-labelled
analogues used as internal standards (SIL-IS) for 1 h at ambient
temperature with gentle shaking. The suspension was then mixed with 800 $\mu$L
MeOH, centrifuged at 10,000 $\times$ g for 5 min, and analysed by the same
method as the wash extracts.
Data analysis was performed using Analyst 1.7 and MultiQuant 3.0.3 (Sciex).
Raw quantities of compounds are presented in ng and concentrations in ng/mg.
The samples in the replication batch were processed in the same way, but
analysed on different equipment used exclusively for oral samples.
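As a worked illustration of the two units, the sketch below assumes that a concentration (ng/mg) is the raw quantity (ng) divided by the weight of the calculus sample (mg). The numbers and the column names `quant_ng` and `weight_mg` are made up for the example and are not taken from the data set.

```r
# Toy data: all values and column names are hypothetical.
toy <- data.frame(
  sample    = c("s1", "s2", "s3"),
  quant_ng  = c(1.2, 0.0, 4.5),  # raw quantity of a compound (ng)
  weight_mg = c(6.0, 2.0, 9.0)   # weight of the calculus sample (mg)
)

# Assumed relationship: concentration (ng/mg) = quantity (ng) / weight (mg)
toy$conc_ng_mg <- toy$quant_ng / toy$weight_mg
toy
```

Normalising by weight in this way makes samples of different sizes comparable, which is why both units are reported.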
Raw quantities of compounds detected in the dissolved calculus from batches
1 and 2 are presented in @tbl-uhplc-batch-1 and @tbl-uhplc-batch-2. Since
these tables may or may not be legible in PDF format, not to mention that they
don't adhere to FAIR principles in this format, the raw data can also be downloaded
from Zenodo (<https://doi.org/10.5281/zenodo.8061483>).
```{r}
#| label: tbl-lloq
#| tbl-cap: "Target compounds and lower limits of quantitation (LLOQ)."
lloq %>%
arrange(compound) %>%
knitr::kable(col.names = c("Target", "LLOQ"))
```
```{r}
#| label: tbl-uhplc-batch-1
#| tbl-cap: "Results from the UHPLC analysis first batch. Quantity of compound in the dissolved calculus, represented in ng and rounded to 3 digits after the decimal."
uhplc_calculus_long %>%
filter(batch == "batch1") %>%
mutate(quant = round(quant, 3)) %>%
select(!c(sample, conc, extraction, batch, presence)) %>%
pivot_wider(names_from = "compound", values_from = "quant") %>%
knitr::kable()
```
```{r}
#| label: tbl-uhplc-batch-2
#| tbl-cap: "Results from the UHPLC analysis second batch. Quantity of compound in calculus after third wash, represented in ng and rounded to 3 digits after the decimal."
uhplc_calculus_long %>%
filter(batch == "batch2") %>%
mutate(quant = round(quant, 3)) %>%
pivot_wider(names_from = "compound", values_from = "quant") %>%
select(!c(extraction, batch)) %>%
knitr::kable()
```
### Authentication
No modern synthetic drugs were detected in any of the samples.
Samples were replicated to verify results from the initial analysis. Of the
`r nrow(demography)` samples initially analysed,
`r nrow(filter(metadata, replicated == TRUE))` samples were replicated.
Only caffeine, theophylline, nicotine, cotinine, and salicylic acid were found
in the replicated samples.
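The replication rule used in the setup chunk can be sketched on toy data: a compound counts as replicated in an individual when it is detected in both batches, counts as absent when it is detected in neither, and is dropped when the batches disagree. The data frame below is made up, and base R is used here only to keep the example self-contained.

```r
# Toy presence/absence data (1 = detected, 0 = not detected); values are made up.
toy <- data.frame(
  id       = rep(c("a", "b", "c"), each = 2),
  batch    = rep(c("batch1", "batch2"), times = 3),
  compound = "nicotine",
  presence = c(1, 1,  # detected in both batches
               1, 0,  # detected in one batch only
               0, 0)  # not detected in either batch
)

# Number of batches in which the compound was detected, per individual and compound
n_detected <- aggregate(presence ~ id + compound, data = toy, FUN = sum)

# Keep consistent results only (0 or 2 detections), then convert to presence/absence
replicated <- n_detected[n_detected$presence %in% c(0, 2), ]
replicated$presence <- as.integer(replicated$presence == 2)
replicated
```

In this toy example individual `b` is dropped because the compound was detected in only one batch.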
Most plots show a large increase in the extracted mass of a compound between the
calculus wash extracts (wash 1-3) and the dissolved calculus (calc)
(@fig-auth-plot-batch1 and @fig-auth-plot-batch2). Most samples
containing theophylline and caffeine had the largest quantity of the compound
extracted from the first wash, decreasing in washes 2 and 3. There is
an increase between wash 3 and the dissolved calculus in all samples.
The patterns are consistent across batches 1 and 2. The pattern we expect to see
in a sample is a reduction in quantity from wash 1 to wash 3, followed by a
spike in the final extraction from the dissolved calculus; we interpret this as
the compound being 'ancient', or authentic. Compounds that are completely absent in
all three washes and present in high quantities in the final extraction may, on
the other hand, be suggestive of lab contamination.
This has not been thoroughly tested and is
based only on what we expect to see. The interpretation of these graphs
is therefore itself up for interpretation.
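The expected pattern can be written down as a simple rule. The helper below, `matches_expected_pattern()`, is hypothetical and not part of the analysis code; it is one possible formalisation, and its criteria (non-increasing washes, a non-zero first wash, and a final extraction larger than wash 3) are assumptions.

```r
# Hypothetical helper: does a profile match the expected pattern?
# `quant` holds the quantities for wash 1, wash 2, wash 3 and the dissolved
# calculus, in that order.
matches_expected_pattern <- function(quant) {
  stopifnot(length(quant) == 4)
  washes <- quant[1:3]
  calc   <- quant[4]
  declining <- all(diff(washes) <= 0) && washes[1] > 0  # wash 1 >= wash 2 >= wash 3
  spike     <- calc > washes[3]                         # increase after the final wash
  declining && spike
}

matches_expected_pattern(c(5, 2, 0.5, 8))  # declining washes, then a spike
matches_expected_pattern(c(0, 0, 0, 8))    # absent in all washes, present in calc
matches_expected_pattern(c(1, 3, 4, 2))    # increasing across washes
```

Only the first profile satisfies the rule; the second corresponds to the case described above as possibly suggestive of lab contamination.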
```{r}
#| label: fig-auth-plot-batch1
#| fig-cap: "Plot of extracted quantities of each compound across the three washes and calculus extraction in batch 1. Each line represents an individual."
uhplc_data_long %>%
filter(
batch == "batch1",
) %>%
semi_join(quant_filter, by = c("sample", "compound")) %>%
mutate(
extraction = factor(extraction, levels = c("wash1", "wash2", "wash3", "calc")),
sample = as.factor(sample)
) %>%
ggplot(aes(x = extraction, y = quant, group = sample, colour = sample)) +
geom_line() +
geom_point(size = 0.2) +
facet_wrap(~ compound, scales = "free_y", ncol = 3) +
theme_bw()
```
```{r}
#| label: fig-auth-plot-batch2
#| fig-cap: "Plot of extracted quantities of each compound across the three washes and calculus extraction in batch 2. Each line represents an individual."
uhplc_data_long %>%
filter(
batch == "batch2",
!compound %in% c("cbd", "cbn", "cocaine", "thc", "thca-a", "thcva") # remove compounds not detected in batch 2
) %>%
semi_join(quant_filter, by = c("sample", "compound")) %>% # remove compounds not detected in each sample
mutate(
extraction = factor(extraction, levels = c("wash1", "wash2", "wash3", "calc")),
sample = as.factor(sample)
) %>%
ggplot(aes(x = extraction, y = quant, group = sample, colour = sample)) +
geom_line() +
geom_point(size = 0.2) +
facet_wrap(~ compound, scales = "free_y", ncol = 2) +
theme_bw()
```
### Quantity vs. sample weight
There is no clear relationship between the sample weight and the amount of compound
detected, except for salicylic acid, where the amount of extracted compound increases
with increasing sample weight. In batch 2 there is also a slight positive trend
for caffeine, nicotine, and cotinine.
Nicotine and cotinine display the same relative relationship between samples: where
the nicotine quantity is high compared to other samples, the cotinine quantity
is similarly high (@fig-quant-weight-1 and @fig-quant-weight-2).
Where a positive correlation between the weight of the calculus
sample and the recovered quantity of a compound is present, it suggests that
sample weight may affect the ability to detect compounds, although we were able to detect
compounds in samples as small as 2 mg (@fig-quant-weight-1 and @fig-quant-weight-2).
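The relationships described here are read from the plots. One way a reader could quantify a trend for a single compound is a rank correlation, sketched below on made-up numbers; this is an optional check under stated assumptions, not part of the original analysis, and the column names are hypothetical.

```r
# Toy data for one compound; values and column names are made up.
toy <- data.frame(
  weight_mg = c(2, 4, 5, 7, 9, 12),
  quant_ng  = c(0.4, 0.3, 1.1, 0.9, 2.0, 2.6)
)

# Spearman's rank correlation: monotonic association without assuming linearity
res <- cor.test(toy$weight_mg, toy$quant_ng, method = "spearman")
res$estimate  # rho
res$p.value
```

A rank-based measure is used in the sketch because the number of positive samples per compound is small and the quantities are skewed.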
```{r}
#| label: fig-quant-weight-1
#| fig-cap: "Quantity of a compound (ng) found in a sample plotted against the weight of the calculus sample. Results from batch 1."
uhplc_data_comb %>%
select(sample, contains("batch1")) %>%
select(sample, batch1_weight, contains("calc")) %>%
pivot_longer(
-c(sample, batch1_weight),
names_to = c("compound", "batch"),
names_pattern = "(.*)_(.*)",
values_to = c("conc")
) %>%
filter(conc > 0) %>%
mutate(compound = str_remove(compound, "_calc")) %>%
ggplot(aes(x = batch1_weight, y = conc, col = as.factor(sample))) +
geom_point() +
facet_wrap(~ compound, scales = "free_y") +
theme_bw() +
theme(legend.position = "none") +
labs(x = "Calculus weight (mg)", y = "Quantity (ng)")
```
```{r}
#| label: fig-quant-weight-2
#| fig-cap: "Quantity of a compound (ng) found in a sample plotted against the weight of the calculus sample. Results from batch 2."
uhplc_data_comb %>%
select(sample, contains("batch2")) %>%
select(sample, batch2_weight, contains("calc")) %>%
pivot_longer(
-c(sample, batch2_weight),
names_to = c("compound", "batch"),
names_pattern = "(.*)_(.*)",
values_to = c("conc")
) %>%
mutate(compound = str_remove(compound, "_calc")) %>%
#remove_missing() %>%
filter(conc > 0) %>%
ggplot(aes(x = batch2_weight, y = conc, col = as.factor(sample))) +
geom_point() +
facet_wrap(~ compound, scales = "free_y") +
theme_bw() +
labs(y = "Quantity (ng)", x = "Calculus weight (mg)", col = "Sample number")
```
### Distribution of compounds detected in the samples
<!-- absolute counts in each batch -->
```{r}
#| label: fig-compounds-detect
#| fig-cap: "Number of individuals in which each compound was detected in batches 1 and 2."
uhplc_calculus_long %>%
filter(quant > 0) %>%
ggplot(aes(x = compound, fill = compound)) +
geom_bar() +
facet_wrap(~ batch) +
theme_bw() +
theme(
axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
legend.position = "none"
)
```
The replication showed that caffeine, theophylline, cotinine, nicotine, and
salicylic acid could be consistently detected in the samples, although theophylline
detection decreased between batches 1 and 2. CBD, CBN, cocaine, and THCA-A were not
detected at all in the second batch.
<!-- absolute counts in the replicated individuals -->
```{r}
#| label: fig-compounds-detect2
#| fig-cap: "Number of individuals in which each compound was detected in batches 1 and 2. Only showing replicated individuals."
uhplc_calculus_long %>%
filter(id %in% filter(metadata, replicated == TRUE)$id,
quant > 0) %>%
ggplot(aes(x = compound, fill = compound)) +
geom_bar() +
facet_wrap(~ batch) +
theme_bw() +
theme(
axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
legend.position = "none"
)
```
### Detection and preservation
To see if preservation of the skeletal remains had any effect on the detection of
compounds, absolute quantities of compounds were compared to the various levels of
preservation.
```{r}
#| label: fig-detection-preservation
#| fig-cap: "Plot of relationship between the absolute quantity of a detected compound (ng) and the overall skeletal preservation of the individuals in which the compound was detected. Showing results for batch 1."
uhplc_calculus_long %>%
filter(
!is.na(preservation),
batch == "batch1",
quant > 0,
) %>%
ggplot(aes(x = preservation, y = quant)) +
geom_violin(aes(fill = preservation), alpha = 0.6) +
geom_boxplot(width = 0.2) +
facet_wrap(~ compound, scales = "free_y", labeller = labeller(compound = compound_names), dir = "v") +
theme_bw() +
theme(
legend.position = "none",
panel.grid.major.x = element_blank(),
axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1)
) +
labs(x = "Preservation", y = "Quantity (ng)") +
scale_fill_viridis_d()
```
```{r}
#| label: fig-detection-preservation2
#| fig-cap: "Plot of relationship between the absolute quantity of a detected compound (ng) and the overall skeletal preservation of the individuals in which the compound was detected. Showing results for batch 2."
uhplc_calculus_long %>%
filter(
!is.na(preservation),
quant > 0,
batch == "batch2",
#!compound %in% c("cbd", "cbn", "cocaine", "thc", "thca-a", "thcva"), # remove compounds not detected
) %>%
ggplot(aes(x = preservation, y = quant)) +
geom_violin(aes(fill = preservation), alpha = 0.6) +
geom_boxplot(width = 0.2) +
facet_wrap(~ compound, scales = "free_y", labeller = labeller(compound = compound_names), dir = "v") +
theme_bw() +
theme(
legend.position = "none",
panel.grid.major.x = element_blank(),
axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1)
) +
labs(x = "Preservation", y = "Quantity (ng)") +
scale_fill_viridis_d()
```
The distribution of states of preservation in batches 1 and 2 was checked to
make sure that the number of skeletons in each category is not driving the
relationships shown above. Because our sample contains fewer individuals with
fair preservation, this may bias our interpretations (@fig-preservation-detection).
```{r}
#| label: fig-preservation-detection
#| fig-cap: "Plot of the number of skeletons in each state of preservation separated by batch."
uhplc_calculus_long %>%
ungroup() %>%
distinct(id, batch, .keep_all = TRUE) %>%
remove_missing(vars = c("preservation", "weight")) %>%
ggplot(aes(x = preservation)) +
geom_bar() +
facet_wrap(~ batch) +
theme_bw()
```
### Detection of tobacco
Given that pipe notches are present in the majority of individuals, the presence
of pipe notch(es) in an individual and concurrent detection of nicotine and/or
cotinine is used as a rough indicator of the accuracy of the method.
```{r}
#| label: fig-corr-notches
#| fig-cap: "Plot of relationships between the total number of pipe notches in an individual and the concentration of detected compounds. The only relevant comparisons in this case are nicotine and cotinine. The others are just included because I couldn't be bothered filtering them out."
# correlation between nicotine and cotinine conc, and pipe notches
uhplc_calculus_long %>%
filter(compound %in% c("caffeine", "cotinine", "nicotine", "salicyl", "theophyl")) %>%
ggplot(aes(x = pipe_notch, y = conc)) +
geom_point() +
facet_wrap(~ compound + batch, scales = "free_y") +
theme_bw() +
labs(x= "number of pipe notches", y = "detected compound concentration (ng/mg)")
```
```{r}
#| label: tobacco-accuracy-setup
# tobacco alkaloids in male and probable male individuals
tobacco <- uhplc_calculus_long %>%
filter(
#batch == "batch2",
compound %in% c("nicotine", "cotinine"),
sex == "m" | sex == "pm"
) %>%
mutate(detection = if_else(quant == 0, 0, 1)) # 0 = not detected; NA = not included in batch 2.
tobacco_accuracy <- tobacco %>%
remove_missing(vars = "quant") %>%
group_by(sample, id, .drop = F) %>%
summarise(
detection = sum(detection),
) %>%
left_join(select(demography, id, pipe_notch, preservation, age), by = "id") %>%
mutate(
pipe_notch = if_else(pipe_notch > 0, "Y", "N"),
correct = case_when(
detection == 0 & pipe_notch == "N" ~ 1,
detection > 0 & pipe_notch == "Y" ~ 1,
detection > 0 & pipe_notch == "N" ~ NaN, # no way of knowing macroscopically if the person smoked without pipe
TRUE ~ 0
)
)
accuracy_age <- tobacco_accuracy %>% # move to supplementary material
group_by(age) %>%
summarise(mean = mean(correct, na.rm = T))
tobacco_comb <- tobacco %>%
group_by(batch, sample, id, .add = TRUE) %>%
summarise(
detection = sum(detection),
.groups = "keep" # combine compounds: 0 = none detected; 1 = 1 detected; 2 = both detected
) %>%
ungroup() %>%
left_join(select(demography, id, pipe_notch), by = "id") %>%
mutate(
pipe_notch = if_else(pipe_notch > 0, "Y", "N"),
correct = case_when(
detection > 0 & pipe_notch == "Y" ~ TRUE,
detection == 0 & pipe_notch == "N" ~ TRUE,
detection > 0 & pipe_notch == "N" ~ NA, # can't be sure if this is true or not
TRUE ~ FALSE
)
)
nicotine <- tobacco %>%
filter(compound == "nicotine") %>%
mutate(
pipe_notch = if_else(pipe_notch > 0, "Y", "N"),
correct = case_when(
detection > 0 & pipe_notch == "Y" ~ TRUE,
detection == 0 & pipe_notch == "N" ~ TRUE,
detection > 0 & pipe_notch == "N" ~ NA, # can't be sure if this is true or not
TRUE ~ FALSE
)
)
cotinine <- tobacco %>%
filter(compound == "cotinine") %>%
mutate(
pipe_notch = if_else(pipe_notch > 0, "Y", "N"), # convert counts to Y/N, as for nicotine
correct = case_when(
detection > 0 & pipe_notch == "Y" ~ TRUE,
detection == 0 & pipe_notch == "N" ~ TRUE,
detection > 0 & pipe_notch == "N" ~ NA, # can't be sure if this is true or not
TRUE ~ FALSE
)
)
# accuracy in replicated samples
uhplc_accuracy_batch2 <- uhplc_calculus_long %>%
filter(compound == "nicotine" | compound == "cotinine",
batch == "batch2") %>%
group_by(id) %>%
arrange(desc(presence), .by_group = T) %>%
distinct(id, .keep_all = T) %>%
mutate(
pipe_notch = if_else(pipe_notch > 0, "Y", "N"),
correct = case_when(presence == 1 & pipe_notch == "Y" ~ TRUE,
presence == 0 & pipe_notch == "N" ~ TRUE, # can we be sure if this is correct?
presence == 1 & pipe_notch == "N" ~ NA, # can't be sure if this is true or not
TRUE ~ FALSE))
# ratio of nicotine to cotinine
tobacco_ratio <- tobacco %>%
select(id, batch, compound, conc) %>%
pivot_wider(names_from = compound, values_from = conc) %>%
group_by(id, batch) %>%
summarise(ratio = cotinine / nicotine) %>%
remove_missing() %>%
left_join(select(demography, id, age, sex, preservation, pipe_notch)) %>%
filter(batch == "batch2",
ratio != Inf) %>%
mutate(
age = case_when(
age == "eya" ~ 0,
age == "lya" ~ 1,
age == "ma" ~ 2,
age == "old" ~ 3
),
preservation = case_when(
preservation == "fair" ~ 0,
preservation == "good" ~ 1,
preservation == "excellent" ~ 2
)
)
```
We found no apparent correlation between the number of pipe notches
and the concentration of nicotine or cotinine (@fig-corr-notches), suggesting
that our ability to detect tobacco consumption in dental calculus does not
necessarily rely on targeting frequent smokers; here, we consider individuals
with multiple pipe notches as likely to have been heavy smokers.
The presence of pipe notch(es) in an individual and concurrent detection of nicotine
and/or cotinine is used as a crude indicator of the accuracy of the method. When
the results of both batches are combined, the method detected some form
of tobacco in
`r nrow(filter(tobacco_accuracy, pipe_notch == "Y", detection > 0))`
of `r nrow(filter(tobacco_accuracy, pipe_notch == "Y"))`
individuals with a pipe notch
(`r scales::percent(nrow(filter(tobacco_accuracy, pipe_notch == "Y", detection > 0)) / nrow(filter(tobacco_accuracy, pipe_notch == "Y")), accuracy = 0.1)`).
When the absence of both tobacco alkaloids in an individual without a pipe notch
is also counted as a correct result, the accuracy of the method is
`r scales::percent(mean(tobacco_accuracy$correct, na.rm = T), accuracy = 0.1)`.
Accuracy in the old adult age category is
`r scales::percent(filter(accuracy_age, age == "old")$mean, accuracy = 0.1)`.
In the replicated samples only, tobacco detection was successful in
`r nrow(filter(uhplc_accuracy_batch2, pipe_notch == "Y", presence > 0))`
out of
`r nrow(filter(uhplc_accuracy_batch2, pipe_notch == "Y"))`
pipe smokers
(`r scales::percent(nrow(filter(uhplc_accuracy_batch2, pipe_notch == "Y", presence > 0)) / nrow(filter(uhplc_accuracy_batch2, pipe_notch == "Y")), accuracy = 0.1)`).
Counting individuals with neither a pipe notch nor detected compounds
as correct identifications gives an overall accuracy of
`r scales::percent(mean(uhplc_accuracy_batch2$correct, na.rm = T), accuracy = 0.1)`.
One individual---an old adult, probable female---was positive
for both nicotine and cotinine, and had no signs of a pipe notch.
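The accuracy figures above follow the scoring used in the setup chunk, which can be sketched on toy data. The values below are made up, and `detection_rate` and `accuracy` are illustrative names rather than objects from the analysis.

```r
# Toy data: one row per individual; values are made up.
toy <- data.frame(
  id         = c("a", "b", "c", "d", "e"),
  detection  = c(2, 1, 0, 0, 1),  # number of tobacco alkaloids detected (0-2)
  pipe_notch = c(3, 1, 0, 2, 0)   # number of pipe notches
)

has_notch <- toy$pipe_notch > 0
detected  <- toy$detection > 0

# Agreement between detection and pipe notches. Detection without a notch is
# left as NA because smoking without a pipe cannot be ruled out.
toy$correct <- ifelse(detected & !has_notch, NA, detected == has_notch)

# Proportion of individuals with a pipe notch in whom tobacco was detected
detection_rate <- sum(detected & has_notch) / sum(has_notch)

# Overall accuracy, ignoring the undetermined (NA) cases
accuracy <- mean(toy$correct, na.rm = TRUE)
```

Leaving the "detected, no notch" case as `NA` keeps it out of both the numerator and the denominator of the overall accuracy.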
## Dental analysis
Pipe notches were identified by wear on the mesial and distal
sides of the crowns between two teeth, resulting from the practice of clenching a
pipe between adjacent and isomeric teeth. This wear differs from the occlusal
wear that occurs through mastication. Wear occurring between adjacent and isomeric
teeth was counted as a single pipe notch.
Some of the teeth were missing because they had been sent elsewhere for DNA
sampling. These teeth were considered present when determining antemortem loss
ratios, and absent when scoring caries, periodontitis, and calculus.
An overview of available teeth can be seen in @fig-dental-inv.
```{r}
#| label: fig-dental-inv
#| fig-cap: "Overview of the dental inventory of the sample. Teeth removed for DNA analysis considered 'present'."
dental_long %>%
mutate(status = case_when(status == "dna" ~ "p", TRUE ~ status)) %>%
dental_plot(fill = status)
```
### AMTL
Ratios of antemortem lost teeth per present teeth at the site were calculated per
individual (@tbl-aml-id), tooth class (@tbl-aml-class), and tooth type (@tbl-aml-type).
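The tables below are produced by the package function `amtl_ratio()`, whose internals are not shown here. As a rough sketch of the idea only, the example assumes the ratio is the number of teeth lost antemortem divided by the number of teeth present plus those lost antemortem. The status codes `"aml"` and `"pml"` and all values are hypothetical.

```r
# Toy dental inventory; status codes other than "p" and "dna" are hypothetical.
toy <- data.frame(
  id     = c("a", "a", "a", "a", "b", "b", "b", "b"),
  status = c("p", "p", "aml", "dna", "p", "aml", "aml", "pml")
)

# Teeth removed for DNA sampling are counted as present
toy$status[toy$status == "dna"] <- "p"

# Assumed definition: AMTL ratio = lost antemortem / (present + lost antemortem)
amtl <- sapply(split(toy$status, toy$id), function(s) {
  sum(s == "aml") / sum(s %in% c("p", "aml"))
})
amtl
```

The same calculation can be grouped by tooth class or tooth type instead of by individual.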
```{r}
#| label: tbl-aml-id
#| tbl-cap: "AMTL ratio per individual."
dental_long %>%
group_by(id) %>%
amtl_ratio(.status = status, .add = T) %>%
knitr::kable()
```
```{r}
#| label: tbl-aml-class
#| tbl-cap: "AMTL ratio per tooth class."
dental_long %>%
group_by(class) %>%
amtl_ratio(.status = status, .add = T) %>%
knitr::kable()
```
```{r}
#| label: tbl-aml-type
#| tbl-cap: "AMTL ratio per tooth type."
dental_long %>%
group_by(type) %>%
amtl_ratio(.status = status, .add = T) %>%
knitr::kable()
```
### Caries
Caries were scored by recording their location on each individual tooth. Multiple
locations on a single tooth were separated with `;`. The size of each carious
lesion was also recorded, but not used in further analysis. Large caries covering
multiple surfaces, with an unknown origin, were recorded as 'crown'.
| code | surface |
|---|---|
| mes | mesial surface |
| dis | distal surface |
| lin | lingual surface |
| buc | buccal surface (including labial surface) |
| occ | occlusal surface (including incisal surface) |
| crown | caries covers 2+ surfaces (origin unknown) |
| none | no caries visible on surface |
| NA | not observable/tooth missing |
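
As a minimal sketch of how records in this coding scheme could be turned into counts, the example below splits `;`-separated locations and derives a caries ratio. The tooth records and object names are hypothetical, and treating each listed location (including 'crown') as one lesion is an assumption rather than the rule used to build `caries_count`.

```r
# Hypothetical caries records: one string per tooth, locations separated by ";".
caries_raw <- c(t16 = "occ;mes", t17 = "none", t26 = "crown", t36 = NA, t46 = "buc")

# Split each record into its individual surface codes.
locations <- strsplit(caries_raw, ";", fixed = TRUE)

# Number of carious locations per tooth: "none" counts as zero and
# unobservable teeth (NA) stay NA.
n_lesions <- vapply(locations, function(x) {
  if (all(is.na(x))) return(NA_integer_)
  sum(x != "none")
}, integer(1))

# Caries ratio: teeth with at least one lesion divided by observable teeth.
observable <- !is.na(n_lesions)
caries_ratio_sketch <- sum(n_lesions[observable] > 0) / sum(observable)
caries_ratio_sketch
```

Unobservable teeth are excluded from both the numerator and the denominator, so missing teeth do not lower the ratio.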
Of the `r caries_ratio_site$n_teeth` teeth that were examined, `r caries_ratio_site$count`
had caries
(`r scales::percent(caries_ratio_site$ratio, accuracy = 0.1)`).
Because this pooled frequency is of limited interpretive value on its own, it was further broken down into a ratio
for each individual and each tooth class (@tbl-caries-id-class and @fig-caries-class).
As expected, the molars have a higher frequency of caries than the other teeth.
```{r}
#| label: tbl-caries-id-class
#| tbl-cap: "Table of caries ratios per individual per tooth class."
caries_count %>%
caries_ratio(.caries = count, id, class) %>%
knitr::kable()
```
```{r}
#| label: fig-caries-class
#| fig-cap: "Plot of caries ratios calculated per individual per tooth class."
caries_count %>%
caries_ratio(.caries = count, class, id) %>%
ggplot(aes(x = class, y = ratio)) +
geom_violin(aes(fill = class)) +
geom_boxplot(width = 0.1) +
theme_minimal() +
theme(legend.position = "none") +
labs(x = "tooth class", y = "caries ratio")
```
```{r}
#| label: fig-caries-fun
#| fig-cap: "Plot of caries rate per tooth in pooled sample from all individuals. Teeth reordered along the x-axis to match position in the mouth (yes, the plot is supposed to resemble a mouth)."
upper_order <- c(paste0("t", 18:11), paste0("t", 21:28))
lower_order <- c(paste0("t", 48:41), paste0("t", 31:38))
maxilla <- caries_count %>%
filter(region == "maxilla") %>%
mutate(tooth = factor(tooth, levels = c(upper_order, lower_order))) %>%
group_by(tooth) %>%
summarise(
n_teeth = n(),
count = sum(count, na.rm = TRUE),
rate = count / n_teeth
) %>%
ggplot(aes(x = tooth, y = rate)) +
geom_col(fill = "white") +
theme_dark() +
scale_y_reverse(limits = c(0.33,0), sec.axis = sec_axis(~.)) +
scale_x_discrete(position = "top") +
theme(
axis.title.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line = element_line(colour = "red", size = 1),
axis.line.x.top = element_line(colour = "red", size = 4)
) +
labs(y = "")
mandible <- caries_count %>%
filter(region == "mandible") %>%
mutate(tooth = factor(tooth, levels = c(upper_order, lower_order))) %>%
group_by(tooth) %>%
summarise(
n_teeth = n(),
count = sum(count, na.rm = TRUE),
rate = count / n_teeth
) %>%
ggplot(aes(x = tooth, y = rate)) +
geom_col(fill = "white") +
scale_y_continuous(sec.axis = sec_axis(~.)) +
theme_dark() +
theme(axis.line = element_line(colour = "red", size = 1),
axis.line.y.right = element_line(colour = "red", size = 1),
axis.line.x.bottom = element_line(colour = "red", size = 6),
axis.ticks.x = element_blank()) +
labs(y = "caries ratio")
maxilla / mandible + plot_layout(guides = "collect")
```
### Periodontitis
Periodontitis was scored qualitatively on a scale of 0 to 3 as the amount of
horizontal bone loss from the CEJ to the alveolar bone, accounting for ca. 2 mm
of gingival thickness. The distribution of scores in the pooled sample dentitions
can be seen in @fig-periodont-scores.
```{r}
#| label: fig-periodont-scores
#| fig-cap: "Distribution of periodontitis scores in each tooth (FDI notation) in the pooled sample."
periodont %>%
dental_longer(-id) %>%
remove_missing(vars = "score") %>%
dental_plot(fill = score)
```
### Calculus
```{r}
calc_index <- calculus_full %>%
dental_longer(-id) %>%
calculus_index()
```
Calculus was scored on each tooth surface (interproximal surfaces were given a single score)
on a scale of 0 to 3, ranging from absence of calculus (0) to heavy deposits (3).
The distribution of individual calculus indices within the sample, separated by
quadrant, shows that the lower anterior quadrant had the largest deposits (@fig-calculus-quad).
The lower anterior calculus index had no apparent influence on the presence/absence
of a compound, or vice versa (@fig-calc-compound).
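
A quadrant-level index can be sketched as follows, assuming the index is the mean of the observable surface scores within each quadrant. This is an illustration with hypothetical data and is not the `calculus_index()` function used above, whose exact definition is not reproduced here. Quadrants are derived from FDI tooth numbers, taking incisors and canines (positions 1 to 3) as anterior.

```r
# Hypothetical surface scores (0-3) for one individual; NA = not observable.
# "ip" stands for the single interproximal score (assumed code).
scores <- data.frame(
  tooth = rep(c("t11", "t16", "t31", "t46"), each = 3),
  surface = rep(c("buc", "lin", "ip"), times = 4),
  score = c(0, 1, 0, 1, NA, 2, 3, 2, 3, 1, 1, 0)
)

# Derive the quadrant label from the FDI tooth number:
# first digit 1-2 = upper, 3-4 = lower; second digit 1-3 = anterior.
fdi <- as.integer(sub("t", "", scores$tooth))
arch <- ifelse((fdi %/% 10) %in% c(1, 2), "U", "L")
segment <- ifelse((fdi %% 10) <= 3, "A", "P")
scores$quadrant <- paste0(arch, segment)

# Assumed index: mean observable score per quadrant (rows with NA are dropped
# by the default na.action of the formula interface).
calc_index_sketch <- aggregate(score ~ quadrant, data = scores, FUN = mean)
calc_index_sketch
```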
```{r}
#| label: fig-calculus-quad
#| fig-cap: "Calculus index per quadrant. LA = lower anterior, LP = lower posterior, UA = upper anterior, UP = upper posterior."
calc_index %>%
ggplot(aes(x = quadrant, y = calc_index)) +
geom_violin(aes(fill = quadrant), alpha = 0.6) +
geom_boxplot(width = 0.1) +
theme_minimal() +
theme(panel.grid.major.x = element_blank())
```
```{r}
#| label: fig-calc-compound
#| fig-cap: "Relationship between the presence (1) or absence (0) of a compound and the calculus index of the lower anterior quadrant of an individual."
calc_index %>%
left_join(uhplc_calculus_long, by = "id") %>%
filter(
batch == "batch1",
compound != "thc",
compound != "cbd",
compound != "thcva",
quadrant == "LA"
) %>%
ggplot(aes(x = as.factor(presence), y = calc_index)) +
geom_violin(aes(fill = as.factor(presence)), alpha = 0.6) +
geom_boxplot(width = 0.1) +
facet_wrap(~ compound) +
labs(x = "Presence/absence", y = "Calculus index (LA)") +
theme_bw() +
theme(legend.position = "none")
```
## Pathological conditions
Pathological conditions and lesions that occur frequently in the population were
included in the analysis. Data were
dichotomised to presence/absence to allow statistical analysis. A conservative
approach was taken, so when in doubt, absence of a disease was assumed.
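
The conservative dichotomisation can be sketched as below. The lesion labels, column names, and the helper `dichotomise()` are hypothetical. Keeping unobservable elements as `NA` rather than recoding them as absent is an assumption of this sketch.

```r
# Hypothetical raw observations; labels are illustrative only.
lesions <- data.frame(
  id = c("A", "B", "C", "D"),
  osteoarthritis = c("eburnation", "none", "possible", NA),
  cribra_orbitalia = c("none", "pitting", "uncertain", "pitting")
)

# Code 1 only for unambiguous diagnoses; doubtful cases become 0 (absent).
# Unobservable elements (NA) are kept as NA (assumption).
dichotomise <- function(x, present_codes) {
  out <- as.integer(x %in% present_codes)
  out[is.na(x)] <- NA_integer_
  out
}

lesions$oa <- dichotomise(lesions$osteoarthritis, "eburnation")
lesions$co <- dichotomise(lesions$cribra_orbitalia, "pitting")
lesions[, c("id", "oa", "co")]
```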
Osteoarthritis was considered present in cases where eburnation was visible
on one or more joint surfaces.
Vertebral osteophytosis was identified by marginal lipping and/or osteophyte
formation on the margin of the superior and inferior surfaces of the vertebral
body.
Cribra orbitalia was diagnosed based on the presence of pitting on the superior
surface of the orbit. No distinction was made between active or healing lesions.
Degenerative disc disease, or spondylosis, was identified as a large diffuse
depression of the superior and/or inferior surfaces of the vertebral body
[@rogersPalaeopathologyJoint2000].
Schmorl's nodes were identified as any cortical depressions on the surface of
the vertebral body. A note was made of whether the lesion perforated the vertebral
margin, but both perforating and non-perforating lesions were recorded as present.
Data on chronic maxillary sinusitis from @casnaUrbanizationRespiratory2021 were
included in this study to assess the relationship between upper respiratory
diseases and environmental factors (i.e. tobacco smoke, caffeine consumption).
Chronic maxillary sinusitis (CMS) is the inflammation of the lower paranasal
sinuses, air-filled pockets located in the skull that defend the organism against
inhaled particulate matter and pathogens. This occurs through the production of
mucus carried by small hairs toward an opening situated on the superior part of
the sinus, where pathogens are drained [@slavinDiagnosisManagement2005]. Without
drainage, mucus begins to accumulate in the sinuses, providing an ideal environment
for bacterial growth and thereby contributing to inflammation of the mucous
membranes and subsequently of the bone surfaces [@jangBoneInvolvement2002].
Lesions associated with CMS as defined by @boocockMaxillarySinusitis1995 were
recorded for each individual and classified as "pitting", "spicule-type bone
formation", "remodeled spicules", or "white pitted bone". CMS was scored as absent
when the sinus presented smooth surfaces with little or no associated pitting.
To facilitate inspection, fragmented sinuses were cleaned using a dry toothbrush
and water where necessary. If the sinuses were not observable with the naked eye,
they were examined with a flexible medical endoscope (Pentax, model: FNL-10RBS,
ø = 4 mm; view angle = 30°) inserted through minor breaks naturally occurring on the
inferior nasal conchae and palatine bone, where the bone tissue is thinner.
<!-- description of other diseases
Osteoarthritis
Vertebral osteophytosis
Cribra orbitalia
Degenerative disc disease
Schmorl's nodes
-->
## Statistical analysis
### Point-biserial correlation
Point-biserial (Pearson) correlation was conducted on compound concentrations,
calculus index, caries ratio, and binary variables (@fig-pearson-corr).
This was done to see whether any correlations exist prior to discretisation of continuous
variables. Irrelevant correlations (anything not between two continuous variables or between a
continuous and a binary variable) were removed from the plot.
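
The point-biserial coefficient is the Pearson coefficient computed with one variable coded 0/1, which is why both names are used here. The sketch below checks this numerically on made-up values. The variable names `presence` and `calc_index_la` and the data are hypothetical and are not taken from the study.

```r
# Hypothetical data: a binary variable and a continuous variable.
presence <- c(0, 0, 0, 1, 1, 1, 1, 0)
calc_index_la <- c(0.4, 0.7, 0.5, 1.2, 0.9, 1.5, 1.1, 0.6)

# Pearson correlation with the binary variable coded 0/1.
r_pearson <- cor(calc_index_la, presence)

# Point-biserial formula:
# r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2),
# where s_n is the standard deviation computed with n in the denominator.
n <- length(calc_index_la)
n1 <- sum(presence == 1)
n0 <- sum(presence == 0)
m1 <- mean(calc_index_la[presence == 1])
m0 <- mean(calc_index_la[presence == 0])
s_n <- sqrt(mean((calc_index_la - mean(calc_index_la))^2))
r_pb <- (m1 - m0) / s_n * sqrt(n1 * n0 / n^2)

# The two values agree up to floating-point error.
all.equal(r_pearson, r_pb)
```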
```{r}
#| label: fig-pearson-corr
#| fig-cap: "Pearson correlation plot."
conc_cor %>%