R tools and scripts for vegetation science and ecology: February 2015

Tuesday, February 10, 2015

Biodiversity mapping functions for R (5): presence/absence of species across a gridded map

Map grid cell presence/absence

I am sequentially posting some self-contained functions for mapping biodiversity metrics in R.

This function is currently contained within some of the endemism functions I have posted, but I have pulled it out so that it can be used in isolation:

map.pa.matrix.R [see full code at https://raw.githubusercontent.com/GregGuerin/biomap/master/map.pa.matrix.R]

Description --
Given georeferenced incidence data for species, generates a binary presence/absence matrix associated with grid cells of a raster.

Details --
This function generates a binary species presence/absence matrix associated with a raster layer based on georeferenced incidence data. This is a data processing step for mapping various biodiversity metrics onto raster layers. The output can be passed to those mapping functions, used like site-based data (at the resolution of the raster) in analyses such as ordination, or queried for the incidence/frequency of particular species.
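As a rough illustration of the idea (not the code in map.pa.matrix.R), the sketch below bins point records into square grid cells using base R only and reduces the counts to 0/1. The data frame `records`, the `cell_size` value and the cell labels are all made up for the example; the real function works with a raster layer.

```r
# Hypothetical point records: one row per species occurrence (made-up data)
records <- data.frame(
  species = c("sp_a", "sp_a", "sp_b", "sp_c", "sp_c"),
  x       = c(0.2, 0.4, 1.7, 0.3, 2.5),
  y       = c(0.1, 0.3, 0.2, 1.6, 1.9)
)

cell_size <- 1  # assumed grid resolution, in the units of x and y

# Label each record with the grid cell it falls in (column index, row index)
col_id <- floor(records$x / cell_size)
row_id <- floor(records$y / cell_size)
records$cell <- paste(col_id, row_id, sep = "_")

# Cross-tabulate cells by species, then reduce counts to presence (1) / absence (0)
counts <- table(records$cell, records$species)
pa <- (unclass(counts) > 0) * 1

pa
```

Each row of `pa` is an occupied grid cell and each column a species, which is the shape expected by site-based analyses such as ordination.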

Usage --
An example:

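library(vegan)  # provides the example 'mite' datasets
# load the function from the link above (it is not part of a package)
source("https://raw.githubusercontent.com/GregGuerin/biomap/master/map.pa.matrix.R")
data(mite)      # species-by-site data
data(mite.xy)   # site coordinates
map.pa.matrix(mite, records="site", site.coords=mite.xy)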

Biodiversity mapping functions for R (4): Phylogenetic Endemism (test against expectation)

Phylogenetic endemism – non-parametric tests

I am sequentially posting some self-contained functions for mapping biodiversity metrics in R. This one is:

pe.null.test.R [see full code at https://raw.githubusercontent.com/GregGuerin/biomap/master/pe.null.test.R]

Description --
Taking the outputs from the 'phylogenetic.endemism' function, tests whether observed phylogenetic diversity/endemism is higher than expected, using non-parametric methods.

Details --
With the outputs from the 'phylogenetic.endemism' function, performs the following tests:

1) non-parametric significance test as to whether observed phylogenetic diversity/endemism is higher or lower than expected, given species richness (and observed species frequencies)

2) identifies and maps outliers (i.e. map grid cells that have higher or lower PD/PE) based on quantiles. As categorical: whether the score lies more than 1.5 times the interquartile range (or other user-defined multiple) outside that range; as continuous: the factor of the interquartile range by which the observed value differs from the median (50% quantile). Returns vectors of values plus raster maps.
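The two outlier scores in (2) can be sketched for a single grid cell as follows. This is only an illustration under assumptions, not the code in pe.null.test.R: it assumes the quantiles come from a vector of null scores for that cell, and it reads "outside the interquartile range" as the usual fences at 1.5 times the interquartile range beyond the quartiles. The values and the names `null_scores`, `observed` and `fence` are made up.

```r
# Assumed inputs: an observed score for one grid cell and a vector of null
# scores for the same cell (both made up for illustration)
null_scores <- c(2, 3, 4, 5, 6, 7, 8, 9, 10)
observed    <- 20
fence       <- 1.5  # user-adjustable multiple of the interquartile range

# Lower quartile, median and upper quartile of the null scores
q   <- quantile(null_scores, probs = c(0.25, 0.5, 0.75), names = FALSE)
iqr <- q[3] - q[1]

# Categorical: -1 = low outlier, 0 = within the fences, 1 = high outlier
outlier_class <- if (observed > q[3] + fence * iqr) {
  1
} else if (observed < q[1] - fence * iqr) {
  -1
} else {
  0
}

# Continuous: distance from the null median in units of the interquartile range
outlier_factor <- (observed - q[2]) / iqr
```

Applied to every grid cell, the two scores give one categorical and one continuous vector, which can then be written back to raster cells for mapping.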

Usage --
An example:

# 'mite', 'mite.xy' and 'mite.tree' are created as in the phylogenetic.endemism example (post 3)
mite.PE <- phylogenetic.endemism(mite, records="site", site.coords=mite.xy, sep.comm.spp="none", phylo.tree=mite.tree, sep.phylo.spp="none", weight.type="geo")

pe.mite.test <- pe.null.test(mite.PE)

[Figure: example raster outputs from phylogenetic.endemism.R and pe.null.test.R]

Biodiversity mapping functions for R (3): Phylogenetic Endemism (raw)

Phylogenetic Endemism

I am sequentially posting some self-contained functions for mapping biodiversity metrics in R. This one is:

phylogenetic.endemism.R [full code at https://raw.githubusercontent.com/GregGuerin/biomap/master/phylogenetic.endemism.R]

Description --
Calculates phylogenetic endemism (phylogenetic diversity inversely weighted by the spatial range of particular branch lengths) or alternatively (unweighted) phylogenetic diversity across gridded maps using individual or site-based point records.

Details --
This implementation of phylogenetic endemism allows alternative calculation of weights for branch length ranges. Weights can be calculated based on the frequency of occurrence in grid cells, or alternatively by the georeferenced span of the range. Unweighted phylogenetic diversity can also be selected.
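To make the weighting concrete, here is a minimal base-R sketch of phylogenetic diversity and frequency-weighted phylogenetic endemism on a toy tree. It is not the code in phylogenetic.endemism.R: the tree ((A,B),C) is written by hand as a branch-by-species table rather than as an 'ape' tree, and all names and values are invented for the example.

```r
# Hypothetical cell-by-species presence/absence matrix (3 cells, 3 species)
pa <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    0, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cell1", "cell2", "cell3"), c("A", "B", "C"))
)

# Hypothetical tree ((A,B),C) as a branch-by-species table:
# 1 where the species descends from the branch
descends <- matrix(
  c(1, 0, 0,   # terminal branch to A
    0, 1, 0,   # terminal branch to B
    0, 0, 1,   # terminal branch to C
    1, 1, 0),  # internal branch subtending A and B
  nrow = 4, byrow = TRUE,
  dimnames = list(c("bA", "bB", "bC", "bAB"), c("A", "B", "C"))
)
branch_length <- c(bA = 1, bB = 1, bC = 2, bAB = 1)

# A branch is present in a cell if any of its descendant species is present
branch_pa <- (pa %*% t(descends) > 0) * 1

# Range of each branch = number of cells it occurs in (frequency weighting)
branch_range <- colSums(branch_pa)

# PD: sum of the branch lengths present in each cell
pd <- drop(branch_pa %*% branch_length)

# PE: the same sum, with each branch length divided by the range of that branch
pe <- drop(branch_pa %*% (branch_length / branch_range))
```

With frequency weights, each branch's length is shared equally among the cells it occurs in, so the PE values summed over all cells equal the total branch length of the tree. A georeferenced span weighting would replace `branch_range` with a distance-based measure.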

Usage --
An example:

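library(vegan)  # provides the example 'mite' datasets
data(mite)
data(mite.xy)
library(ape)    # provides rtree()
# load the function from the link above (it is not part of a package)
source("https://raw.githubusercontent.com/GregGuerin/biomap/master/phylogenetic.endemism.R")
# For this example, generate a phylogenetic tree of the species in the mite dataset with random relationships and branch lengths
mite.tree <- rtree(n=ncol(mite), tip.label=colnames(mite))
####Usage of the function:
mite.PE <- phylogenetic.endemism(mite, records="site", site.coords=mite.xy, sep.comm.spp="none", phylo.tree=mite.tree, sep.phylo.spp="none", weight.type="geo")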

Biodiversity mapping functions for R (2): Weighted Endemism: test against expectation

Weighted Endemism non-parametric tests

I am sequentially posting some self-contained functions for mapping biodiversity metrics in R. This one is:

endemism.null.test.R [full code at https://raw.githubusercontent.com/GregGuerin/biomap/master/endemism.null.test.R]

Update: there is now an associated article: Guerin, G.R., Ruokolainen, L. & Lowe, A.J. (2015) A georeferenced implementation of weighted endemism. Methods in Ecology and Evolution. DOI: 10.1111/2041-210X.12361

Description --
Taking the outputs from the 'weighted.endemism' function (see previous post), tests whether observed endemism is higher than expected, using non-parametric methods.

Details --
With the outputs from the 'weighted.endemism' function, performs the following tests:

1) non-parametric significance test as to whether observed endemism is higher or lower than expected, given species richness (and observed species frequencies)

2) identifies and maps outliers (i.e. map grid cells that have higher or lower endemism) based on quantiles. As categorical: whether the endemism score lies more than 1.5 times the interquartile range (or other user-defined multiple) outside that range; as continuous: the factor of the interquartile range by which the observed value differs from the median (50% quantile). Returns vectors of values plus raster maps.

Raw weighted endemism scores are biased by both the completeness of species sampling and species richness itself. Dividing by the observed number of species ('corrected weighted endemism', CWE, of Crisp et al. 2001) has been proposed as a correction, but under a null model (random species draws) the relationship between endemism scores and species richness is not linear: increasingly infrequent species are drawn as richness increases, which increases CWE. While endemism scores could be corrected in a more sophisticated way, this function does not correct the scores per se but compares them to a null distribution. The null distribution is built by making replicate random draws from the species pool, based on the observed species richness (i.e. the same number of species) and the actual species frequencies (more frequent species are more likely to be drawn). The resulting set of null endemism scores is compared to observed endemism, and grid cells can then be mapped as higher or lower than expected (based on significance testing and comparison to null quantiles).
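The null comparison described above can be sketched in base R as below. This is an illustration under stated assumptions rather than the code in endemism.null.test.R: the presence/absence matrix is made up, weights are simple inverse cell frequencies, only the "higher than expected" direction is tested, and the names `null_we`, `n_rep` and `p_higher` are invented for the example.

```r
set.seed(1)  # the random draws vary between runs without a seed

# Hypothetical cell-by-species presence/absence matrix (made-up data)
pa <- matrix(
  c(1, 1, 1, 0, 0,
    1, 1, 0, 0, 0,
    1, 0, 1, 0, 0,
    1, 1, 0, 1, 0,
    1, 0, 0, 0, 1,
    1, 1, 1, 0, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("cell", 1:6), paste0("sp", 1:5))
)

freq     <- colSums(pa)           # number of cells each species occupies
weights  <- 1 / freq              # inverse-range weights
observed <- drop(pa %*% weights)  # observed weighted endemism per cell
richness <- rowSums(pa)           # observed species richness per cell

n_rep <- 999  # number of null replicates (arbitrary choice)

# One null score: draw n_species from the pool without replacement,
# with more frequent species more likely to be drawn
null_we <- function(n_species) {
  drawn <- sample(names(freq), size = n_species, prob = freq)
  sum(weights[drawn])
}

# One-sided test per cell: share of null scores at least as high as observed
# (a small tolerance guards against floating-point ties)
p_higher <- sapply(seq_len(nrow(pa)), function(i) {
  null_scores <- replicate(n_rep, null_we(richness[i]))
  (sum(null_scores >= observed[i] - 1e-12) + 1) / (n_rep + 1)
})
names(p_higher) <- rownames(pa)
```

Small values of `p_higher` flag cells whose endemism is higher than expected for their richness. Counting null scores at or below the observed value instead gives the "lower than expected" test.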

Usage --
An example:

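# uses the 'mite' and 'mite.xy' example data from package vegan (loaded as in the weighted.endemism post)
endemism_mydata <- weighted.endemism(mite, site.coords=mite.xy, records="site")
endemism.test.example <- endemism.null.test(endemism_mydata)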

[Figure: example output from a regional flora dataset from South Australia, showing non-parametric statistical significance that endemism is higher (or lower) than expected]

Biodiversity mapping functions for R (1): Weighted Endemism (raw)

Weighted Endemism

I am sequentially posting some self-contained functions for mapping biodiversity metrics in R.

When I began a project producing regional biodiversity maps from a large inventory dataset, I wanted to stay in R for its seamless data processing and downstream modelling and analysis capabilities. The ‘Biodiverse’ software package [see https://code.google.com/p/biodiverse/] already has a large range of biodiversity mapping applications, but as far as I know there were previously no self-contained applications in R for what I needed to do.

Long story short, to do the analysis I wanted in R and with some alternative implementations, I had to code it from scratch (building on functionality such as in package ‘raster’). From this I have developed some self-contained and open-source functions for general use.

Additional options, such as moving window analysis and down-scaling, can be incorporated using functions in ‘raster’. I am also working towards integration with other site-based biodiversity estimation functions in R (i.e. to start with incidence data and output rasters and associated data).

The first function here is:

weighted.endemism.R [full code at https://raw.githubusercontent.com/GregGuerin/biomap/master/weighted.endemism.R]:

Update: there is now an associated article: Guerin, G.R., Ruokolainen, L. & Lowe, A.J. (2015) A georeferenced implementation of weighted endemism. Methods in Ecology and Evolution. DOI: 10.1111/2041-210X.12361

Description --
Calculates weighted endemism (species richness inversely weighted by species ranges) across gridded maps using single or site-based point records.

Details --
This implementation of weighted endemism allows alternative calculation of weights for species ranges as well as the option of user-supplied weights. Weights can be calculated based on the frequency of occurrence in grid cells, or alternatively by the geographical size of the species range, calculated in one (span) or two (area) dimensions.
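For orientation, the frequency-weighted version of the metric can be sketched in a few lines of base R. This is not the code in weighted.endemism.R: the matrix is made up, and only the cell-frequency weighting is shown; span or area weights, or user-supplied weights, would replace `range_size`.

```r
# Hypothetical cell-by-species presence/absence matrix (made-up data)
pa <- matrix(
  c(1, 1, 0,
    1, 0, 0,
    1, 0, 1,
    1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("cell", 1:4), c("sp_a", "sp_b", "sp_c"))
)

# Range of each species = number of grid cells it occupies
range_size <- colSums(pa)

# Weighted endemism: for each cell, sum 1/range over the species present
we <- drop(pa %*% (1 / range_size))

we
```

With frequency weights each species contributes a total of 1 across the map, so the scores summed over all cells equal the number of species.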

Usage --
An example:

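library(vegan)  # provides the example 'mite' datasets
# load the function from the link above (it is not part of a package)
source("https://raw.githubusercontent.com/GregGuerin/biomap/master/weighted.endemism.R")
data(mite)      # species-by-site data
data(mite.xy)   # site coordinates
endemism_mydata <- weighted.endemism(mite, site.coords = mite.xy, records="site")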
