Abstract:
This dataset contains gridded spatial predictions of the distribution and density of Antarctic krill (Euphausia superba) in the South Scotia Sea, specifically within Subarea 48.2 of the Convention for the Conservation of Antarctic Marine Living Resources (CCAMLR). Both year-specific and decadal mean predictions are provided for the years 2011-2020. All predictions were generated from a two-part hurdle model that used input data from (i) a spatially and temporally consistent acoustic krill survey around the South Orkney Islands and (ii) year-specific environmental covariates. The first component of the hurdle model was a binomial Generalized Additive Model (GAM), fitted to binary presence-absence krill data, which predicts the probability of krill presence. The second component was a Gaussian GAM, fitted to non-zero krill data, which predicts krill density. Finally, the predictions of the two components were multiplied together to identify where krill were both likely to be present and likely to occur at high densities. Full model details are given in the associated publication. This dataset provides the spatial predictions generated from the binomial GAM, the Gaussian GAM, and their combined product.
Funding:
PNT, SF and JJF were supported by the British Antarctic Survey's National Capability Antarctic Logistics and Infrastructure programme CONSEC, supported by the Natural Environment Research Council, a part of UK Research and Innovation. VW-E and JJF were supported by the Pew Charitable Trusts under grant PA00034295. The South Orkney Islands acoustic trawl survey is part of the ongoing Norwegian Institute of Marine Research (IMR) project KRILL (p.no. 14246), which is supported by the Norwegian Research Council (NFR grant 222798), the Norwegian Ministry of Foreign Affairs, and IMR.
Keywords:
Antarctic krill, South Orkney Islands, South Scotia Sea, hurdle model, interannual variability
Citation:
Freer, J.J., Warwick-Evans, V., Skaret, G., Krafft, B.A., Fielding, S., & Trathan, P.N. (2025). Modelled spatial predictions of the distribution and density of Antarctic krill in the South Scotia Sea between 2011-2020 (Version 1.0) [Data set]. NERC EDS UK Polar Data Centre. https://doi.org/10.5285/4fd0a1bf-da1a-4021-82eb-2fc513910e32
Access Constraints: | Under embargo until publication of associated article. |
---|---|
Use Constraints: | Data supplied under Open Government Licence v3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/. |
Creation Date: | 2025-01-10 |
Dataset Progress: | Complete |
Dataset Language: | English |
ISO Topic Categories: | |
Parameters: | |
Personnel: | | |
Name | Role(s) | Organisation |
UK Polar Data Centre | Metadata Author | British Antarctic Survey |
Dr Jennifer J Freer | Investigator, Technical Contact | British Antarctic Survey |
Dr Victoria Warwick-Evans | Investigator | British Antarctic Survey |
Dr George Skaret | Investigator | Norwegian Institute of Marine Research |
Dr Bjorn A Krafft | Investigator | Norwegian Institute of Marine Research |
Dr Sophie Fielding | Investigator | British Antarctic Survey |
Dr Philip N Trathan | Investigator | British Antarctic Survey |
Parent Dataset: | N/A |
Reference: | Associated publication: Freer JJ, Warwick-Evans V, Skaret G, Krafft BA, Fielding S, Trathan PN (2025). A new dynamic distribution model for Antarctic krill reveals interactions with their environment, predators and the commercial fishery in the South Scotia Sea region. Limnology and Oceanography.
Environmental covariate data used as model input:
Sea ice extent: Fetterer, F., Knowles, K., Meier, W. N., Savoie, M. & Windnagel, A. K. (2017). Sea Ice Index, Version 3. Boulder, Colorado USA. National Snow and Ice Data Center. DOI: 10.7265/N5K072F8
Bathymetry: GEBCO Bathymetric Compilation Group (2021). The GEBCO_2021 Grid - a continuous terrain model of the global oceans and land. NERC EDS British Oceanographic Data Centre NOC. DOI: 10.5285/c6612cbe-50b3-0cff-e053-6c86abc09f8f
Sea surface chlorophyll a and primary productivity: Global Ocean Colour (Copernicus-GlobColour), Bio-Geo-Chemical, L4 (monthly and interpolated) from Satellite Observations (1997-ongoing). E.U. Copernicus Marine Service Information (CMEMS). Marine Data Store (MDS). DOI: 10.48670/moi-00281
Sea surface temperature, sea surface salinity, sea surface height, mixed layer thickness, sea surface geostrophic current velocity, bottom temperature: Global Ocean Physics Reanalysis. E.U. Copernicus Marine Service Information (CMEMS). Marine Data Store (MDS). DOI: 10.48670/moi-00021
Krill acoustic data used as model input: Acoustic data are stored at the Norwegian Marine Data Centre, held at the Institute of Marine Research. Access to them is welcomed for collaborative and comparative efforts; contact BAK.
Literature cited:
Eilers, P. H. C., and B. D. Marx. 1996. Flexible smoothing with B-splines and penalties. Statistical Science 11: 89-102.
Krafft, B. A. and others 2018. Summer distribution and demography of Antarctic krill Euphausia superba Dana, 1852 (Euphausiacea) at the South Orkney Islands, 2011-2015. Journal of Crustacean Biology 38: 682-688.
Skaret, G. and others 2023. Distribution and biomass estimation of Antarctic krill (Euphausia superba) off the South Orkney Islands during 2011-2020. ICES Journal of Marine Science 80: 1472-1486.
Wood, S. 2017. Generalized Additive Models: An Introduction with R, 2nd ed. Chapman and Hall/CRC.
Wood, S. 2019. mgcv: mixed GAM computation vehicle with automatic smoothness estimation. R-package version 1.8-31 http://CRAN.R-project.org/package=mgcv.
Zuur, A. F., E. N. Ieno, N. Walker, A. A. Saveliev, and G. M. Smith. 2009. Mixed effects models and extensions in ecology with R. Springer. |
|
---|---|---|
Quality: | The quality of the model outputs is, in part, dependent on the data used for model input, i.e. the krill density data and the environmental covariates.
Krill density data: As part of the ongoing Norwegian scientific contribution to monitoring the distribution, abundance and population characteristics of Antarctic krill, an acoustic survey takes place annually in waters surrounding the South Orkney Islands (longitudinal stratum boundaries at 43.5 degrees W and 48 degrees W, and latitudinal boundaries at 59.67 degrees S and 62 degrees S). In the present study we use data from the first ten years of this survey (2011 to 2020). All survey transects were run between January and February each year, but in two of the years (2013 and 2015) sea ice prevented full completion of the survey (Krafft et al. 2018). The surveys were conducted using the Norwegian commercial fishing vessels 'Saga Sea' and 'Juvel' (Aker Biomarine ASA, Oslo, Norway and Rimfrost AS, Fosnavag, Norway) as research platforms, except in 2019 when RV 'Kronprins Haakon' was used. The acoustic data used in this study were collected using calibrated hull-mounted Simrad echo sounders. We used the swarm-based approach for acoustic target identification of krill, which is recommended by CCAMLR when the 38, 120 and 200 kHz frequency combination is not available. Details of the processing procedures used to determine krill density are reported and evaluated by Skaret et al. (2023). The retained nautical area scattering coefficient (NASC) allocated to krill per nautical mile was converted to biomass density (g m-2, hereafter referred to as density) using full Stochastic Distorted Wave Born Approximation (SDWBA) model runs to estimate backscattering cross-sectional areas (delta) for each krill length group (1 mm increments) present in the sample. Each acoustic sample value was matched to the covariate raster data according to the latitude, longitude and year of collection. Finally, the combined krill density-covariate dataset was aggregated to the same spatial resolution as the environmental covariates (0.04 × 0.04 decimal degrees) by calculating the mean values within each grid cell for each year. This was done to avoid pseudo-replication, given that multiple acoustic samples can fall within the same grid cell of environmental data, and to reduce any effect of spatial autocorrelation in model residuals.
Environmental covariate data: Twelve environmental covariates were identified as candidate explanatory variables for the model. Three were static, i.e. unchanging with survey year: water depth (bathymetry), bathymetric slope, and distance from the shelf break, defined as the 500 m isobath (values on-shelf were positive and those off-shelf were negative). The nine remaining variables were dynamic across survey years: distance from the sea ice edge (defined as 15% ice concentration), seven sea surface variables (temperature, mixed layer thickness, sea surface height above geoid, salinity, chlorophyll a, primary productivity, and geostrophic current velocity), and bottom temperature. Raster grids of all covariates were obtained from a combination of empirical observations and model re-analyses (see the environmental covariate data sources listed under Reference). For each of the dynamic covariates, two different temporal scales were extracted: (i) a sample scale, which averaged conditions during the sampling months (January to February) independently for each year; and (ii) a decadal-scale climatology, which was the average of summer conditions (January to March) between 2011 and 2020. |
|
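The grid-cell averaging described under Quality (mean density per 0.04 degree cell per year) can be illustrated with a minimal Python sketch. This is not the authors' code, which was written in R. The function names, the sample tuples, and the choice of a grid whose cell edges are aligned to 0 degrees latitude and longitude are all assumptions made for illustration.

```python
import math
from collections import defaultdict

CELL = 0.04  # grid resolution in decimal degrees, as stated in this record


def cell_index(lat, lon, cell=CELL):
    """Integer (row, col) of the grid cell containing a point.

    Assumes cell edges are aligned to 0 degrees (hypothetical grid origin).
    """
    return (math.floor(lat / cell), math.floor(lon / cell))


def aggregate_to_grid(samples, cell=CELL):
    """Mean density per (year, row, col) from (year, lat, lon, density) tuples."""
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for year, lat, lon, density in samples:
        key = (year, *cell_index(lat, lon, cell))
        sums[key][0] += density
        sums[key][1] += 1
    return {key: total / n for key, (total, n) in sums.items()}


# Made-up acoustic samples: the first two fall in the same cell in 2011.
samples = [
    (2011, -60.501, -45.501, 10.0),
    (2011, -60.502, -45.502, 30.0),
    (2011, -60.70, -45.10, 0.0),
    (2012, -60.501, -45.501, 5.0),
]
gridded = aggregate_to_grid(samples)
```

Keying on the year as well as the cell keeps repeat visits to the same location in different years as separate observations, which matches the per-year averaging described in the record.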
Lineage: | Hurdle model approach: To model the distribution of Antarctic krill (hereafter krill) we applied a two-part hurdle model, also known as a zero-altered model (Zuur et al. 2009), which allows for heavily skewed data distributions. The first model predicts the probability of presence of krill. To do this, the krill density data were transformed into a binary zero/non-zero form (n = 2709) and modelled against the environmental covariates using a binomial generalised additive model (GAM; Wood 2017) with a logit link function. The second model investigates the relationship between the non-zero data (i.e. presence-only data, n = 1833) and the environmental covariates. This was carried out using a GAM with a Gaussian distribution and the default identity link function. Based on exploratory density plots, the presence-only data were log transformed to follow a normal distribution. Finally, outputs from both models were multiplied together. This allowed us to identify where krill were both likely to be present and likely to occur at high densities.
Model fitting and selection: All GAMs were fitted using the R package 'mgcv' (Wood 2019), with a Restricted Maximum Likelihood (REML) optimisation method to estimate splines, and penalised thin plate regression splines on all smooth terms (Eilers and Marx 1996). To reduce model overfitting, the basis dimension (k) was limited to between 3 and 6, with the optimum number guided by the edf values and associated p values reported by mgcv's gam.check, and by visualising the partial effects plots for each covariate against the raw data. For both the binomial and Gaussian GAMs, model selection followed a forward stepwise selection approach with five-fold cross validation. Specifically, each environmental covariate was modelled against the response variable independently, and the fit was repeated five times, each time withholding a different random subset of data (fold) for evaluation. The model coefficients from each run were used to predict the outcome for the withheld fold, and performance metrics of the prediction, Root Mean Squared Error (RMSE) and R², were extracted. The best performing covariate (i.e. lowest RMSE and highest R² averaged over the five folds) was retained within the model. This selection process was repeated, allowing for all possible combinations of environmental covariates at their different temporal scales (sample and decadal). At each iteration, the retained set of covariates was assessed for collinearity using Pearson correlation coefficients and Variance Inflation Factors (VIF), and for concurvity using the worst-case measure of overall concurvity for each smooth. If issues were identified (Pearson's r > 0.7, VIF > 3, concurvity > 0.8), the next highest-ranking covariate was selected. Forward selection continued until model performance metrics plateaued and/or issues of collinearity and concurvity could not be overcome. Predictions from the final Gaussian GAM were back transformed to obtain outputs on the original density scale. Once the final set of covariates was selected, predictions of the probability of occurrence and the estimated krill density were projected onto year-specific grids at the scale of Subarea 48.2. The mean and +/-1 standard deviation of the predictions and their product (interpreted as the krill density weighted by the probability of occurrence) across all years were generated. |
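The final combination step in the Lineage text (back-transform the Gaussian GAM prediction, multiply it by the probability of presence, then summarise across years) can be sketched in Python for a single grid cell. This is an illustrative sketch, not the authors' R implementation. It assumes a natural-log transform with a plain exponential back-transform and a sample standard deviation; the record does not state either choice, and the input values are invented.

```python
import math
from statistics import mean, stdev


def combine_hurdle(p_presence, log_density):
    """Density weighted by probability of presence for one grid cell.

    p_presence:  binomial GAM prediction on the probability scale (0-1).
    log_density: Gaussian GAM prediction on the log scale (assumed natural log).
    """
    if not 0.0 <= p_presence <= 1.0:
        raise ValueError("p_presence must lie in [0, 1]")
    # Assumed back-transform: plain exp(), with no bias correction.
    return p_presence * math.exp(log_density)


def decadal_summary(yearly_values):
    """Mean and sample standard deviation of year-specific predictions."""
    values = list(yearly_values)
    return mean(values), stdev(values)


# Invented predictions for one cell: year -> (probability, log density in g m-2).
yearly_predictions = {
    2011: (0.8, math.log(50.0)),
    2012: (0.5, math.log(20.0)),
    2013: (0.1, math.log(100.0)),
}
combined = {
    year: combine_hurdle(p, log_d)
    for year, (p, log_d) in yearly_predictions.items()
}
decade_mean, decade_sd = decadal_summary(combined.values())
```

In this made-up example, 2012 and 2013 end up with about the same combined value even though their conditional densities differ fivefold. That is the behaviour the product is meant to capture: a high density estimate counts for little where presence is unlikely.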
Temporal Coverage: | |
---|---|
Start Date | End Date |
2011-01-01 | 2011-02-28 |
2012-01-01 | 2012-02-28 |
2013-01-01 | 2013-02-28 |
2014-01-01 | 2014-02-28 |
2015-01-01 | 2015-02-28 |
2016-01-01 | 2016-02-28 |
2017-01-01 | 2017-02-28 |
2018-01-01 | 2018-02-28 |
2019-01-01 | 2019-01-01 |
2020-01-01 | 2020-01-01 |
Spatial Coverage: | |
Latitude | |
Southernmost | -64 |
Northernmost | -57 |
Longitude | |
Westernmost | -50 |
Easternmost | -30 |
Altitude | |
Min Altitude | N/A |
Max Altitude | N/A |
Depth | |
Min Depth | 0 |
Max Depth | 0 |
Data Resolution: | |
Latitude Resolution | 0.04 decimal degrees |
Longitude Resolution | 0.04 decimal degrees |
Horizontal Resolution Range | N/A |
Vertical Resolution | N/A |
Vertical Resolution Range | N/A |
Temporal Resolution | N/A |
Temporal Resolution Range | N/A |
Location: | |
Location | Antarctica |
Detailed Location | South Scotia Sea, South Orkney Plateau, CCAMLR subarea 48.2 |
Data Collection: | All analyses were carried out in R version 4.3.3. |
---|
Data Storage: | Folder: annual_krill_rasters
Volume: 4.53 megabytes (50 files)
Format: raster (.tif)
Contents: This folder contains spatial predictions of krill distribution and krill density for each year (2011-2020) and for each model component (binomial GAM, Gaussian GAM, and combined product). The unit for binomial GAM predictions is the probability of krill presence (0-1). The unit for Gaussian GAM predictions is krill density (g m-2). The unit for the combined product is krill density weighted by the probability of presence (g m-2).
Naming format: year _ hurdle model component _ product
Year = year of model prediction (ranges from 2011-2020)
Hurdle model component = binomial GAM (gam_binom), Gaussian GAM (gam_gaus) or the product of binomial and Gaussian models (combined)
Product = average (mean) or +/-1 standard deviation (sd) of model cross-validation folds
Folder: decade_krill_rasters
Volume: 558 kilobytes (6 files)
Format: raster (.tif)
Contents: This folder contains gridded raster files that represent the average of the year-specific predictions (see folder annual_krill_rasters) for each hurdle model component. The unit for binomial GAM predictions is the probability of krill presence (0-1). The unit for Gaussian GAM predictions is krill density (g m-2). The unit for the combined product is krill density weighted by the probability of presence (g m-2).
Naming format: hurdle model component _ product
Hurdle model component = binomial GAM (gam_binom), Gaussian GAM (gam_gaus) or the product of binomial and Gaussian models (combined)
Product = decadal average (decademean) or +/-1 standard deviation (decadesd) |
---|
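A small Python helper for sorting the raster files by year, model component and product might look like the sketch below. The record gives the order of the name parts and their codes but not the literal file names. The underscore separators without spaces, the `.tif` extension, and example names such as `2011_gam_binom_mean.tif` are therefore assumptions, and `parse_raster_name` is a hypothetical helper, not part of the dataset.

```python
import re

# Assumed literal patterns; check them against the real file names before use.
ANNUAL = re.compile(
    r"^(?P<year>\d{4})_(?P<component>gam_binom|gam_gaus|combined)"
    r"_(?P<product>mean|sd)\.tif$"
)
DECADAL = re.compile(
    r"^(?P<component>gam_binom|gam_gaus|combined)"
    r"_(?P<product>decademean|decadesd)\.tif$"
)


def parse_raster_name(filename):
    """Split a raster file name into year, component and product.

    Returns year=None for decadal files. Raises ValueError for names that
    match neither assumed pattern.
    """
    match = ANNUAL.match(filename)
    if match:
        return {
            "year": int(match["year"]),
            "component": match["component"],
            "product": match["product"],
        }
    match = DECADAL.match(filename)
    if match:
        return {
            "year": None,
            "component": match["component"],
            "product": match["product"],
        }
    raise ValueError(f"unrecognised raster name: {filename!r}")


annual = parse_raster_name("2011_gam_binom_mean.tif")
decadal = parse_raster_name("combined_decadesd.tif")
```

Raising an error on unmatched names, rather than skipping them silently, makes it obvious if the real files use a different separator or extension from the one assumed here.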