We use the corrr package (Kuhn, Jackson, and Cimentada 2022).
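The call that produced the output is not shown in this report. As a minimal sketch, a shaved and formatted correlation table of this kind can be obtained with `correlate()`, `shave()`, and `fashion()` from corrr. The data frame `toy` and its values are hypothetical and stand in for the real data.

```r
library(corrr)

# Hypothetical toy data; the real data frame is not shown in this report.
set.seed(1)
toy <- data.frame(
  site    = rep(c("a", "b"), each = 10),
  sst     = rnorm(20),
  chl_a   = rnorm(20),
  nitrate = rnorm(20)
)

# Keep only numeric columns explicitly (assumption: this mirrors what the
# report's output describes as "Non-numeric variables removed from input").
toy_num <- toy[vapply(toy, is.numeric, logical(1))]

# Pearson correlations using pairwise complete observations.
cors <- correlate(toy_num, method = "pearson", use = "pairwise.complete.obs")

# shave() blanks the upper triangle; fashion() formats values for printing.
fashion(shave(cors))
```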
Non-numeric variables removed from input: `key`, `date`, and `site`
Warning in stats::cor(x = x, y = y, use = use, method = method): the standard
deviation is zero
Correlation computed with
• Method: 'pearson'
• Missing treated using: 'pairwise.complete.obs'
term nanoeukaryotes picoeukaryotes prochlorococcus
1 nanoeukaryotes
2 picoeukaryotes .34
3 prochlorococcus .38 .02
4 synechococcus .38 .10 .31
5 bacteria_hna .45 .15 .60
6 bacteria_lna .56 .54 .39
7 diatoms .05 -.02 -.10
8 dinoflagellates .10 -.08 .11
9 silicoflagellates -.24 -.19 -.01
10 protozoos .24 .18 -.18
11 toc .24 -.26 .16
12 tn -.37 -.04 -.39
13 ton -.35 .04 -.37
14 nitrate -.22 -.22 -.22
15 nitrite -.15 -.23 -.10
16 silicate .12 .07 -.06
17 phosphate -.02 .01 .00
18 ammonium .16 -.12 .13
19 chl_a .01 -.28 -.10
20 sst .15 -.30 .62
synechococcus bacteria_hna bacteria_lna diatoms dinoflagellates
1
2
3
4
5 .45
6 .43 .45
7 -.32 .10 -.12
8 -.10 .13 -.08 .54
9 -.14 -.14 -.26 -.15 .10
10 -.19 -.03 -.02 .48 .24
11 .31 .34 .19 -.09 .07
12 -.14 -.22 -.31 -.12 -.26
13 -.10 -.19 -.25 -.10 -.25
14 -.23 -.19 -.28 .06 .06
15 -.19 -.03 -.29 .01 .22
16 -.04 .15 .03 .04 .28
17 .03 -.02 .11 -.22 -.18
18 .20 .14 .12 -.24 -.20
19 -.29 -.18 -.33 1.00 1.00
20 .16 .48 -.00 .06 .27
silicoflagellates protozoos toc tn ton nitrate nitrite silicate
1
2
3
4
5
6
7
8
9
10 -.10
11 .20 -.06
12 -.06 -.18 -.06
13 -.09 -.20 -.06 .95
14 .25 .18 -.06 .53 .25
15 .17 -.13 .14 .16 .00 .48
16 .09 -.03 .21 .13 .01 .41 .61
17 .32 -.08 .16 -.05 -.09 .00 -.01 .00
18 -.06 -.00 .11 -.15 -.24 -.01 .09 -.07
19 .31 -.34 -.45 .11 .56 .66
20 .12 .02 .22 -.19 -.19 -.11 .08 -.23
phosphate ammonium chl_a sst
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18 .33
19 .05 .24
20 .02 .29 .01
The .62 for sst:prochlorococcus catches my attention. It seems to stand above the rest.
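Before reading too much into single coefficients, it may help to check how many observations each pair actually shares. With `pairwise.complete.obs`, a pair that overlaps in only two rows always yields a correlation of exactly 1 or -1, which is one possible (unconfirmed) explanation for the 1.00 values in the chl_a row. The sketch below uses base R only; the data frame `toy` and its values are hypothetical.

```r
# Hypothetical toy data with gaps; column names mirror the report.
toy <- data.frame(
  sst     = c(14.1, 15.3, 16.0, 17.2, 18.4, 19.0),
  chl_a   = c(0.8,  1.1,  NA,   NA,   NA,   NA),
  diatoms = c(10,   25,   30,   NA,   40,   55)
)

# 1 where a value is observed, 0 where it is missing.
observed <- (!is.na(as.matrix(toy))) * 1

# n_pairs[i, j] = number of rows where both column i and column j are observed.
n_pairs <- crossprod(observed)
n_pairs

# With only two shared rows, the pairwise correlation is forced to +1 or -1.
r_chl_diatoms <- cor(toy$chl_a, toy$diatoms, use = "pairwise.complete.obs")
r_chl_diatoms
```

Pairs with a very small shared count are better treated as uninformative than as strong correlations.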
Non-numeric variables removed from input: `key`, `date`, and `site`
Correlation computed with
• Method: 'pearson'
• Missing treated using: 'pairwise.complete.obs'
term ecoli_hdt ente_hdt ecoli_nyd ente_nyd
1 ecoli_hdt
2 ente_hdt .12
3 ecoli_nyd .50 .07
4 ente_nyd -.03 .63 -.09
There might be a large discrepancy between sources: the nyd:hdt correlation is only .63 for Enterococcus and .50 for E. coli. Possible explanations:
- Sampling technique (e.g. depth)
- Sampling hour and weekday
- Exact sampling position
- Tide phase
# A tibble: 9 × 3
ente_hdt ente_nyd n
<dbl> <dbl> <int>
1 0 70 1
2 0 NA 4
3 1 NA 2
4 2 NA 1
5 4 9 1
6 4 NA 1
7 6 NA 2
8 200 340 1
9 NA NA 138
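Oh, no! :( There are very few matches between the sampling weeks of the environmental parameters and those of the fecal indicators: the count above has only 9 distinct combinations, 138 of its 151 rows have no Enterococcus value from either source, and only 3 rows have a value from both.
We might conclude that the database is broken and that new samples must be collected synchronously.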
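The overlap between the two sources can also be summarised directly, rather than read off the value combinations. This is a minimal base R sketch; the data frame `toy`, with one row per sampling week, is hypothetical, and only the column names `ente_hdt` and `ente_nyd` come from the output above.

```r
# Hypothetical toy data: one row per sampling week.
toy <- data.frame(
  ente_hdt = c(0,  0,  4, 200, NA, NA, 1,  NA),
  ente_nyd = c(70, NA, 9, 340, NA, NA, NA, NA)
)

# Flag weeks in which each source reported a value.
has_hdt <- !is.na(toy$ente_hdt)
has_nyd <- !is.na(toy$ente_nyd)

# Cross-tabulate availability of the two sources.
overlap <- table(hdt = has_hdt, nyd = has_nyd)
overlap

# Weeks usable for a source-to-source comparison, and weeks with no data.
n_both <- sum(has_hdt & has_nyd)
n_none <- sum(!has_hdt & !has_nyd)
```

Only the weeks counted in `n_both` contribute to the nyd:hdt correlation, so a small value there means the coefficient rests on very little data.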