Modelling maturity & sex with multiple stocks

This vignette walks through a script that will generate a gadget3 model, explaining concepts along the way.

Stocks & substocks

As mentioned before in vignette('introduction-single-stock'), gadget3 stock objects do not have to correspond 1:1 with a species.

We can have multiple stock objects representing the same species in a different stage in their life-cycle, most commonly mature and immature versions, male and female versions, or all 4.

The set up is much the same as before, but major differences will be highlighted.

Initial setup & time-keeping is identical:

library(gadget3)
library(dplyr)

actions <- list()
area_names <- g3_areas(c('IXa', 'IXb'))

# Create time definitions ####################

actions_time <- list(
  g3a_time(
    1990L, 2023L,
    step_lengths = c(3L, 3L, 3L, 3L)),
  NULL)

actions <- c(actions, actions_time)

Stocks

We define 2 stocks instead of one, and a list() containing both for convenience:

# Create stock definition for fish ####################
st_imm <- g3_stock(c(species = "fish", 'imm'), seq(5L, 100L, 5)) |>
  g3s_age(1L, 5L) |>
  g3s_livesonareas(area_names["IXa"])

st_mat <- g3_stock(c(species = "fish", 'mat'), seq(5L, 100L, 5)) |>
  g3s_age(3L, 10L) |>
  g3s_livesonareas(area_names["IXa"])
stocks = list(imm = st_imm, mat = st_mat)

Notice that:

The name of the stock has 2 parts. This makes it possible to have parameters that refer to the species as a whole. In the model output the names will have been combined, e.g. "fish_imm".
The age ranges are not identical, obviously mature stocks are older, and we adjust to suit.

Stock actions

Stock actions now need to include interactions between immature & mature:

actions_st_imm <- list(
  g3a_growmature(st_imm,
    g3a_grow_impl_bbinom(
      maxlengthgroupgrowth = 4L ),
    # Add maturation
    maturity_f = g3a_mature_continuous(),
    output_stocks = list(st_mat),
    transition_f = ~TRUE ),
  g3a_naturalmortality(st_imm),
  g3a_initialconditions_normalcv(st_imm),
  g3a_renewal_normalcv(st_imm),
  g3a_age(st_imm, output_stocks = list(st_mat)),
  NULL)

actions_st_mat <- list(
  g3a_growmature(st_mat,
    g3a_grow_impl_bbinom(
      maxlengthgroupgrowth = 4L )),
  g3a_naturalmortality(st_mat),
  g3a_initialconditions_normalcv(st_mat),
  g3a_age(st_mat),
  NULL)

actions_likelihood_st <- list(
  g3l_understocking(stocks, nll_breakdown = TRUE),
  NULL)

actions <- c(actions, actions_st_imm, actions_st_mat, actions_likelihood_st)

actions_st_imm and actions_st_imm are largely similar to our actions_fish from the previous model, but:

We have added a maturity_f to g3a_growmature() to move individuals to the mature stock. The rate of maturity is coupled to growth, which is why g3a_growmature() does both at the same time.
Immature g3a_age() can also move individuals to the mature stock. This will happen if an immature fish ages beyond the final age bin (5 in our case). At that point it matures “by default”.
Mature has no g3a_renewal_normalcv(), as there is no recruitment directly into the mature stock.

Fleet actions

There is very little difference defining a fleet for a multiple stock model vs. single stocks.

To define a fleet, we need to introduce historical data into the model. In our case we will generate random data to use later:

# Fleet data for f_surv #################################

# Landings data: For each year/step/area
expand.grid(year = 1990:2023, step = 2, area = 'IXa') |>
    # Generate a random total landings by weight
    mutate(weight = rnorm(n(), mean = 1e5, sd = 10)) |>
    # Assign result to landings_f_surv
    identity() -> landings_f_surv

# Length distribution data: Generate 1000 random samples in each year/step/area
expand.grid(year = 1990:2023, step = 2, area = 'IXa', length = rep(NA, 1000)) |>
  # Generate random lengths for these samples
  mutate(length = rnorm(n(), mean = 50, sd = 20)) |>
  # Save unagggregated data into ldist_f_surv.raw
  identity() -> ldist_f_surv.raw

# Aggregate .raw data
ldist_f_surv.raw |>
  # Group into length bins
  group_by(
      year = year,
      step = step,
      length = cut(length, breaks = c(seq(0, 110, 5), Inf), right = FALSE) ) |>
  # Report count in each length bin
  summarise(number = n(), .groups = 'keep') |>
  # Save into ldist_f_surv
  identity() -> ldist_f_surv

# Assume 100 * 100 samples in each year/step/area
expand.grid(year = 1990:2023, step = 2, area = 'IXa', age = rep(NA, 100), length = rep(NA, 100)) |>
  # Generate random whole numbers for age
  mutate(age = floor(pmin(rnorm(n(), mean = 6, sd = 1), 10))) |>
  # Generate random lengths for these samples
  mutate(length = rnorm(n(), mean = 30 + age * 2, sd = 20)) |>
  # Group into length/age bins
  group_by(
      year = year,
      step = step,
      age = age,
      length = cut(length, breaks = c(seq(0, 110, 5), Inf), right = FALSE) ) |>
  # Report count in each length bin
  summarise(number = n(), .groups = 'keep') ->
  aldist_f_surv

# Map maturity stage data to stocks
as.data.frame(aldist_f_surv) |>
  # Generate random maturity stage data from age data, our stock matures between 3..5
  mutate(maturity = age > runif(n(), 3, 5)) |>
  # Map maturity stage to the stock name: Note we don't have to use the full stock name
  mutate(stock = ifelse(maturity, "mat", "imm")) |>
  # Remove redundant columns
  mutate(age = NULL, maturity = NULL) ->
  matp_f_surv

For more information on how this works, see vignette("incorporating-observation-data").

As well as the data we defined last time, we also define a maturity stage dataset. We represent the maturity stage data by adding a “stock” column, when we then compare to model gadget3 will breakdown predicted catches by stock.

matp_f_surv[sample(nrow(matp_f_surv), 10),]

##      year step   length number stock
## 4905 2022    2 [95,100)      1   mat
## 4618 2020    2  [45,50)     24   mat
## 2213 2004    2  [30,35)    108   mat
## 4916 2022    2  [40,45)     24   mat
## 3125 2010    2  [15,20)      8   mat
## 582  1993    2     <NA>     20   mat
## 231  1991    2    [0,5)     52   mat
## 4811 2022    2  [60,65)      9   imm
## 2714 2008    2  [65,70)      1   imm
## 1449 1999    2  [55,60)    214   mat

Also note we don’t provide the full name of a stock, this both makes code re-use easier and means that we would combine multiple stocks if there is for example a “fish_imm_m” and “fish_imm_f”.

Our fleet is defined with the same set of actions as the single-species model:

# Create fleet definition for f_surv ####################
f_surv <- g3_fleet("f_surv") |> g3s_livesonareas(area_names["IXa"])

actions_f_surv <- list(
  g3a_predate_fleet(
    f_surv,
    stocks,
    suitabilities = g3_suitability_exponentiall50(by_stock = 'species'),
    catchability_f = g3a_predate_catchability_totalfleet(
      g3_timeareadata("landings_f_surv", landings_f_surv, "weight", areas = area_names))),
  NULL)
actions_likelihood_f_surv <- list(
  g3l_catchdistribution(
    "ldist_f_surv",
    obs_data = ldist_f_surv,
    fleets = list(f_surv),
    stocks = stocks,
    function_f = g3l_distribution_sumofsquares(),
    area_group = area_names,
    report = TRUE,
    nll_breakdown = TRUE),
  g3l_catchdistribution(
    "aldist_f_surv",
    obs_data = aldist_f_surv,
    fleets = list(f_surv),
    stocks = stocks,
    function_f = g3l_distribution_sumofsquares(),
    area_group = area_names,
    report = TRUE,
    nll_breakdown = TRUE),
  g3l_catchdistribution(
    "matp_f_surv",
    obs_data = matp_f_surv,
    fleets = list(f_surv),
    stocks = stocks,
    function_f = g3l_distribution_sumofsquares(),
    area_group = area_names,
    report = TRUE,
    nll_breakdown = TRUE),
  NULL)

actions <- c(actions, actions_f_surv, actions_likelihood_f_surv)

There are 2 differences to before:

All actions use our list stocks, not an individual stock. Without additional changes, the 2 are treated as a single combined stock.
We set g3_suitability_exponentiall50(by_stock = 'species'), instructing it to have a single alpha & l50 parameter for both our stocks, as they have the same species name.
We also define “matp_f_surv”, to make use of our maturity data. We still use g3l_catchdistribution() as before, the difference is in the columns of our input data.

The by_stock parameter is passed through to g3_parameterized(). We can see the result of setting this in the parameter template. If by_stock = TRUE (the default) then we get parameters for both fish_imm.f_surv.l50 & fish_mat.f_surv.l50:

suppressWarnings({  # NB: The model is incomplete, fish_imm__num isn't defined
simple_model <- g3_to_r(list(g3a_time(1990, 2023), g3a_predate_fleet(
    f_surv,
    stocks,
    suitabilities = g3_suitability_exponentiall50(by_stock = TRUE),
    catchability_f = g3a_predate_catchability_totalfleet(1) )))
})
names(attr(simple_model, "parameter_template"))

## [1] "retro_years"           "fish_imm.f_surv.alpha" "fish_imm.f_surv.l50"  
## [4] "fish_mat.f_surv.alpha" "fish_mat.f_surv.l50"   "project_years"

If by_stock = 'species', then there is a single, shared fish.f_surv.l50 parameter:

suppressWarnings({  # NB: The model is incomplete, fish_imm__num isn't defined
simple_model <- g3_to_r(list(g3a_time(1990, 2023), g3a_predate_fleet(
    f_surv,
    stocks,
    suitabilities = g3_suitability_exponentiall50(by_stock = 'species'),
    catchability_f = g3a_predate_catchability_totalfleet(1) )))
names(attr(simple_model, "parameter_template"))
})

## [1] "retro_years"       "fish.f_surv.alpha" "fish.f_surv.l50"  
## [4] "project_years"

The by_stock parameter is just a convenient shortcut to change the default settings, if we specify g3_parameterized() ourselves we can change the parameterization in other ways, for example by_year = TRUE gives us per-year l50:

suppressWarnings({  # NB: The model is incomplete, fish_imm__num isn't defined
simple_model <- g3_to_r(list(g3a_time(1990, 2023), g3a_predate_fleet(
    f_surv,
    stocks,
    suitabilities = g3_suitability_exponentiall50(
        l50 = g3_parameterized("l50", by_stock = 'species', by_predator = TRUE, by_year = TRUE)),
    catchability_f = g3a_predate_catchability_totalfleet(1) )))
names(attr(simple_model, "parameter_template"))
})

##  [1] "retro_years"           "fish_imm.f_surv.alpha" "fish.f_surv.l50.1990" 
##  [4] "fish.f_surv.l50.1991"  "fish.f_surv.l50.1992"  "fish.f_surv.l50.1993" 
##  [7] "fish.f_surv.l50.1994"  "fish.f_surv.l50.1995"  "fish.f_surv.l50.1996" 
## [10] "fish.f_surv.l50.1997"  "fish.f_surv.l50.1998"  "fish.f_surv.l50.1999" 
## [13] "fish.f_surv.l50.2000"  "fish.f_surv.l50.2001"  "fish.f_surv.l50.2002" 
## [16] "fish.f_surv.l50.2003"  "fish.f_surv.l50.2004"  "fish.f_surv.l50.2005" 
## [19] "fish.f_surv.l50.2006"  "fish.f_surv.l50.2007"  "fish.f_surv.l50.2008" 
## [22] "fish.f_surv.l50.2009"  "fish.f_surv.l50.2010"  "fish.f_surv.l50.2011" 
## [25] "fish.f_surv.l50.2012"  "fish.f_surv.l50.2013"  "fish.f_surv.l50.2014" 
## [28] "fish.f_surv.l50.2015"  "fish.f_surv.l50.2016"  "fish.f_surv.l50.2017" 
## [31] "fish.f_surv.l50.2018"  "fish.f_surv.l50.2019"  "fish.f_surv.l50.2020" 
## [34] "fish.f_surv.l50.2021"  "fish.f_surv.l50.2022"  "fish.f_surv.l50.2023" 
## [37] "fish_mat.f_surv.alpha" "project_years"

See ?vignette('model-customisation') for more.

If we had data on the distribution of mature vs. immature, our observation data could contain a stock column with fish_imm or fish_mat. See vignette("incorporating-observation-data").

Again, further fleets can be added by repeating the code above.

Survey indices

Survey indices should be handed the full stocks list, instead of a single stock, but are otherwise the same as before:

# Create abundance index for si_cpue ########################

# Generate random data
expand.grid(year = 1990:2023, step = 3, area = 'IXa') |>
    # Fill in a weight column with total biomass for the year/step/area combination
    mutate(weight = runif(n(), min = 10000, max = 100000)) ->
    dist_si_cpue

actions_likelihood_si_cpue <- list(

  g3l_abundancedistribution(
    "dist_si_cpue",
    dist_si_cpue,

    stocks = stocks,
    function_f = g3l_distribution_surveyindices_log(alpha = NULL, beta = 1),
    area_group = area_names,
    report = TRUE,
    nll_breakdown = TRUE),
  NULL)

actions <- c(actions, actions_likelihood_si_cpue)

Creating model functions and Parameterization

At this point, we are ready to convert our model into code:

# Create model objective function ####################

model_code <- g3_to_tmb(c(actions, list(
    g3a_report_detail(actions),
    g3l_bounds_penalty(actions) )))

Now we should be configuring parameters based on the template.

Thanks to using wildcards in the g3_init_val() calls, a lot of the parameter settings will work regardless of the model being single- or multi-stock, so we don’t need to change the initial values from the previous model:

# Guess l50 / linf based on stock sizes
estimate_l50 <- g3_stock_def(st_imm, "midlen")[[length(g3_stock_def(st_imm, "midlen")) / 2]]
estimate_linf <- tail(g3_stock_def(st_imm, "midlen"), 3)[[1]]
estimate_t0 <- g3_stock_def(st_imm, "minage") - 0.8

attr(model_code, "parameter_template") |>
  g3_init_val("*.rec|init.scalar", 1000, optimise = FALSE) |>
  g3_init_val("*.init.#", 10, lower = 0.001, upper = 10) |>
  g3_init_val("*.rec.#", 100, lower = 1e-6, upper = 1000) |>
  g3_init_val("*.rec.proj", 0.002) |>
  g3_init_val("*.M.#", 0.15, lower = 0.001, upper = 10) |>
  g3_init_val("init.F", 0.5, lower = 0.1, upper = 10) |>
  g3_init_val("*.Linf", estimate_linf, spread = 2) |>
  g3_init_val("*.K", 0.3, lower = 0.04, upper = 1.2) |>
  g3_init_val("*.t0", estimate_t0, optimise = FALSE) |>
  g3_init_val("*.walpha", 0.01, optimise = FALSE) |>
  g3_init_val("*.wbeta", 3, optimise = FALSE) |>
  g3_init_val("*.*.alpha", 0.07, lower = 0.01, upper = 1.8) |>
  g3_init_val("*.*.l50", estimate_l50, spread = 1.5) |>
  # Treat maturity alpha/l50 separately
  g3_init_val("*.mat.alpha", 0.07, lower = 0.0001, upper = 1.8) |>
  g3_init_val("*.mat.l50", estimate_l50, spread = 3.5) |>
  g3_init_val("*.bbin", 100, lower = 1e-05, upper = 1500) |>
  identity() -> params.in

Finally we are ready for optimisation runs. g3_tmb_adfun() is a wrapper around TMB::MakeADFun() and TMB::compile, producing a TMB objective function. gadgetutils::g3_iterative() then optimises based on iterative reweighting

# Optimise model ################################
obj.fn <- g3_tmb_adfun(model_code, params.in)

params.out <- gadgetutils::g3_iterative(getwd(),
    wgts = "WGTS",
    model = model_code,
    params.in = params.in,
    grouping = list(
        fleet = c("ldist_f_surv", "aldist_f_surv", "matp_f_surv"),
        abund = c("dist_si_cpue")),
    method = "BFGS",
    control = list(maxit = 1000, reltol = 1e-10),
    cv_floor = 0.05)

Once this has finished, we can view the output using gadgetplots::gadget_plots().

# Generate detailed report ######################
fit <- gadgetutils::g3_fit(model_code, params.out)
gadgetplots::gadget_plots(fit, "figs", file_type = "html")

Once finished, you can view the output in your web browser:

utils::browseURL("figs/model_output_figures.html")