### Study area

The CS subpopulation is one of 19 recognized polar bear subpopulations [29] ranging from northwestern Alaska to northeastern Chukotka, Russia (Fig. 1). The CS subpopulation’s range [29] is bounded by the Bering Sea to the south, the East Siberian Sea to the west, the Beaufort Sea to the east, and the continental shelf to the north (Fig. 1), representing an area of approximately 1,600,000 km^{2}. Sea ice in the CS is highly dynamic with ice forming, shifting, and melting throughout the year. Sea ice in the CS currently reaches its maximum extent in March and its minimum extent in September [30]. Polar bears in the CS subpopulation primarily prey on ringed (*Pusa hispida*) and bearded seals (*Erignathus barbatus*), with spring being a critical hunting period [27].

### Animal capture

We captured polar bears from a helicopter and immobilized them with a dart containing zolazepam-tiletamine (Telazol or Zoletil; [31]) between mid-March and late April (2010–2017, excluding 2012 and 2014) in U.S. waters between Point Hope and Shishmaref, Alaska (Fig. 1). We fitted captured adult females with a GPS collar that was set to drop off after 12–15 months. Our sample of adult females did not include any females with cubs-of-the-year as these bears were not available to be captured in our study area given that most bears in the subpopulation den on Wrangel Island [17], approximately 600 km northwest. Similarly, we did not include data from bears collared the previous year that had denned. As noted earlier, we did not fit GPS collars to male bears or sub-adults. Instead, we fit these bears with an Argos satellite telemetry tag affixed to either an ear or glued onto their back.

### Location data

Non-collar satellite-based transmitters for polar bears typically have poor performance [15], with a maximum life span often lasting < 6 months [15, 20, 21]. The poor performance of these devices are thought to be related to tags being lost (e.g., pulled out of ear or lost due to molting), antennae failure, and limited battery life [15]. Our comparative analyses were therefore limited to location data obtained between 1 March and 30 June, when the three classes of bears had active tags. To ensure that capturing bears did not influence their movements, we restricted each bear’s data to ≥ 5 days post-capture [32]. Even though we fit adult females with GPS collars, preliminary analyses found that models had difficulty converging when using GPS data from adult females and Argos data from other classes, even if GPS data were subset to match the acquisition intervals from the Argos tags. This was likely due to large differences in uncertainty for GPS-based locations compared to Argos-based locations [33]. Because GPS collars also collected Argos locations, we only used Argos locations for adult females. Given that Argos transmitters can provide multiple locations during a duty cycle, we restricted Argos data to only one location per duty cycle (i.e., satellite acquisition period), keeping the location with the smallest error ellipse [34]. The frequency of Argos location acquisition varied between years and tag types (e.g., collar vs ear tag). Argos acquisition varied between 1 and 4 days, with average durations between acquisition being 3 (SD = 3.1), 3 (SD = 1.4), and 2 days (SD = 1.0) for adult females, adult males, and sub-adults, respectively. Further, we removed individuals from the analysis that had < 3 locations, which is the minimum number of locations needed to estimate movement metrics. Unequal sampling among individuals was accommodated by our hierarchical modeling approach (see below). Lastly, we restricted our analyses to movement and step selection while on sea ice and not land as most bears are on sea ice in spring. Only denning bears would be on land for prolonged periods during spring and as stated earlier, we did not included bears with cubs-of-the-year in our analysis.

Due to computational constraints, we chose to use a data imputation approach to account for uncertainty in locations [35] rather than directly accounting for it in a state-space model (e.g., [34]). Following the guidance of Scharf et al. [35], for each individual’s location data we fit a continuous time correlated random walk model (CTCRW; [10]) that incorporated the location uncertainty estimated by Argos error ellipses [34] with the ‘crawl’ package [36] for R [37]. We then obtained 25 realizations for each animal’s estimated path, with estimated locations obtained at 4-day intervals, which we used for the subsequent analyses of movements and step selection. The CTCRW model was also able to account for missing Argos fixes for an individual, which would lead to greater variability in estimated locations further from observed locations. Twenty-five datasets were then created such that each dataset had one realized path from each individual (without replacement). We provide a general overview of the data imputation approach in Additional file 1.

### Movement analysis

We were interested in determining if there were differences in movement characteristics across different classes of polar bears to help inform movement models used to address management and conservation concerns [23, 38] and to gain a better understanding of the ecological differences among polar bears. Recently methods have been developed to jointly estimate movement metrics and step selection parameters jointly, termed integrated step selection analysis [39]. While our study objectives lend themselves to the use of this approach, we preferred to take a two-step approach (i.e., estimating movement and step selection parameters independently) for a variety of reasons. Given the limited sample sizes for some individuals in our study, especially adult males and sub-adults because of tag retention and performance issues [15], we wanted to take advantage of a trait of Bayesian hierarchical models that allows for borrowing strength across samples with varying levels of information [40]. This is not currently possible with the integrated step selection approach which would have likely required us to omit data from a number of individuals for our analysis. Additionally, integrated step selection analyses generally prefers separate models to be fit to each individual which are then averaged to develop population-level estimates [39]. Given our data imputation approach to handle location uncertainty, this would have led to 3,200 unique models needing to be run. We therefore developed a simple movement model to evaluate step length and directional persistence differences across the three classes studied and then conducted a separate step selection analysis (detailed below). We note, however, that by not taking an integrated step selection analysis approach, we are unable to correct the movement parameters to remove the influence of resource selection, which may lead to some bias in sample of available points [41].

Our use of a data imputation approach allowed us to obtain expected location data for all individuals at fixed time steps, which reduced model complexity compared to other state-space approaches [42,43,44]. Similarly, the data imputation approach eliminated the need to account for uncertainty associated with observed locations in the hierarchical model (see below), further simplifying the model.

We obtained 4-day step lengths and turn angles (i.e., distance traveled and turn angles between locations obtained at 4-day intervals) from the CTCRW output for each individual and used these values as data in our model. We modeled an individual’s step lengths as coming from a Weibull distribution:

$${l}_{i,t}\sim Weib\left({k}_{i},{g}_{i}\right)$$

where \(l_{{i,t}}\) is the observed step length of individual *i* at time *t*, and \({k}_{i}\) and \({g}_{i}\) are individual-level shape and scale parameters, respectively, for individual *i* in the Weibull distribution. We used the method-of-moments approach [40] to parameterize the model to estimate individual-level parameters based on population-level parameters for each of the three classes:

$$\left[\left.{g}_{i}\right|{c}_{i}\right]\sim Gamma\left(\frac{{\gamma }_{c}^{2}}{{\sigma }_{\gamma }^{2}}, \frac{{\gamma }_{c}}{{\sigma }_{\gamma }^{2}}\right)$$

$$\left[\left.{k}_{i}\right|{c}_{i}\right]\sim Gamma\left(\frac{{\kappa }_{c}^{2}}{{\sigma }_{\kappa }^{2}}, \frac{{\kappa }_{c}}{{\sigma }_{\kappa }^{2}}\right)$$

where *c*_{i} is the class for individual *i* (i.e., adult female, adult male, sub-adult), \({\kappa }_{c}\) and \({\gamma }_{c}\) are population-level shape and scale parameters (specific for each class of bear), respectively, for the Weibull distribution of step lengths, and \({\sigma }_{\kappa }^{2}\) and \({\sigma }_{\gamma }^{2}\) represent variation among individuals for the two population-level parameters. We calculated the mean expected step length for class c as: \({\mu }_{c}={\gamma }_{c}\Gamma (1+1/{\kappa}_{c})\), where Γ is the log gamma function.

We modeled an individual’s turn angles as coming from a wrapped Cauchy distribution:

$${\theta }_{i,t}\sim wCauchy\left({\theta }_{i,t-1},{r}_{i}\right)$$

where \({\theta }_{i,t-1}\) is the turn angle for individual *i* at the previous time step (i.e., *t-1*) and \({r}_{i}\) is individual *i’s* estimated directional persistence. We again used the method-of-moments approach to estimate an individual’s directional persistence based on population-level parameters for each of the three classes:

$$\left[\left.{r}_{i}\right|{c}_{i}\right]\sim Beta\left(\frac{{\phi }_{c}^{2}-{\phi }_{c}^{3}-{\phi }_{c}{\sigma }_{\phi }^{2}}{{\sigma }_{\phi }^{2}},\frac{{\phi }_{c}-2{\phi }_{c}^{2}+{\phi }_{c}^{3}-{\sigma }_{\phi }^{2}+{\phi }_{c}{\sigma }_{\phi }^{2}}{{\sigma }_{\phi }^{2}}\right)$$

where \({\phi }_{c}\) is the population-level mean for directional persistence specific to each class of bear, and \({\sigma }_{\phi }^{2}\) is the inter-individual variation in directional persistence. We gave all population-level parameters vague priors, with \({\gamma }_{c}\sim Uniform\left(\mathrm{0,500000}\right)\), \({\kappa }_{c}\sim Uniform(\mathrm{0,10})\), \({\phi }_{c}\sim Uniform\left(\mathrm{0,1}\right)\), \({\sigma }_{\gamma }\sim Uniform(0, 50000)\), \({\sigma }_{\kappa }\sim Uniform (\mathrm{0,5})\), and \({\sigma }_{\phi }\sim Uniform(0, 0.5)\). Because the model for individual turn angles required the previous turn angle as a parameter, we also estimated the starting (i.e., *t* = 1) step length and turn angle as vague priors, with \({s}_{i,1}\sim Uniform\left(\mathrm{0,100000}\right)\) and \({\theta }_{i,1}\sim Uniform(-\pi ,\pi )\).

### Step selection analysis

To assess differences in polar bear step selection among classes, we developed step selection functions (SSFs) at the scale of 4-day step lengths. For each of the 25 imputed data sets, we estimated SSFs with a hierarchical Bayesian conditional logistic regression model [18, 45], which accounted for the lack of independence among multiple clusters (i.e., one used location with multiple random locations; see below) for individual bears. We obtained samples of available locations for each individual *i*’s used locations by first drawing a random sample from the individual’s posterior distributions for the shape (i.e., *k*_{i}), scale (i.e., *g*_{i}), and directional persistence (i.e., *r*_{i}) parameters. We then obtained 25 random samples of step length from a Weibull distribution with the values of the sampled shape and scale parameters, and 25 random samples of turn angles from a Wrapped Cauchy distribution with the values of the previous turn angle for individual *i* (i.e., *θ*_{i,t-1}) and the sample directional persistence parameter. We defined a used location with its associated available locations as a cluster. A sample of > 20 available locations has been shown to be sufficient for obtaining accurate estimates of habitat selection [46]. We generated available locations as:

$${{\varvec{v}}}_{{\varvec{t}},{\varvec{a}}}= {{\varvec{s}}}_{{\varvec{t}}-1}+\boldsymbol{ }\left(\genfrac{}{}{0pt}{}{{l}_{a}\mathrm{cos}{\theta }_{a}}{{l}_{a}\mathrm{sin}{\theta }_{a}}\right)$$

where *v*_{t,a} is a two-element vector with an available location *a*’s x- and y-coordinates at time *t*, which is a function of the used location at the previous time step (\({{\varvec{s}}}_{{\varvec{t}}-1}\); similarly a two-element vector with x- and y-coordinates) and a randomly sampled step length (\({l}_{a}\)) and turn angle (\({\theta }_{a})\) from the posterior distribution as described above. We estimated the choice probability [47] for the used location, \({{\varvec{s}}}_{{\varvec{i}},{\varvec{t}}}\) for individual *i* at time *t* as:

$$\lceil{{\varvec{s}}}_{{\varvec{i}},{\varvec{t}}}|{{\varvec{s}}}_{{\varvec{i}},{\varvec{t}}-1},\boldsymbol{ }{{\varvec{s}}}_{{\varvec{i}},\boldsymbol{ }{\varvec{t}}-2}, {{\varvec{x}}}_{i,t}\rceil= \frac{{e}^{{\boldsymbol{\alpha }}_{{\varvec{i}}}{{\varvec{x}}}_{{\varvec{s}}}}}{\sum_{r\in {S}_{t}}{e}^{{\boldsymbol{\alpha }}_{{\varvec{i}}}{{\varvec{x}}}_{{\varvec{r}}}}}$$

where \({{\varvec{s}}}_{{\varvec{i}},{\varvec{t}}-1}\) is individual *i*’s used location at time *t-1*, \({{\varvec{x}}}_{i,t}\) is a vector of attributes corresponding to the set of used and available locations for individual *i* at time *t, S*_{t} is the set of all available and used points for individual *i* at time *t*, **x**_{s} is a vector of attributes corresponding to used location \({{\varvec{s}}}_{{\varvec{i}},{\varvec{t}}}\) (see below), and *α*_{i} is a vector of parameters for animal *i* (i.e., individual-level selection parameters). We modeled each individual selection parameter as:

$${\alpha }_{i}\sim Normal({\beta }_{c},{\sigma }_{\beta })$$

where *β*_{c} represents the corresponding population-level selection parameter for class *c*, and \({\sigma }_{\beta }\) represents the inter-individual standard deviation for that selection parameter. To simplify the model, we assumed that \({\sigma }_{\beta }\) did not vary among classes. We assigned all *β*_{c} a vague prior, *β*_{c} ~ Normal(0,100). Similarly, we assigned all \({\sigma }_{\beta }\) a vague prior, \({\sigma }_{\beta }\) ~ Gamma(0.01,0.01).

We estimated selection using covariates that have previously been identified as important for describing polar bear resource selection patterns [18, 19]. Additionally, the following variables might be expected to have different relationships with other classes of polar bears given their connection to prey distribution [48] and observations that some classes may be more likely to occur near communities [25, 26] and therefore closer to land. Specifically, we estimated selection patterns across classes for ocean depth (resolution = 16 km^{2}; http://dx.doi.org/10.7289/V5C8276M, accessed 17 March 2022); sea-ice concentration; sea-ice concentration-squared; the standard deviation of sea-ice concentration within 100 km of points, as an index of sea-ice variability [18]; distance to land; distance to land-squared; distance to the 10% ice concentration contour, as an index of the distance to the sea-ice edge and related to prey distribution [48]; and distance to the 10% ice concentration contour-squared. For all sea ice-related variable, we obtained data from Cavalieri et al. [49]. We also used the sea ice concentration maps to define distance to land as those datasets also classified the presence of land. Sea ice concentration-derived layers had a spatial resolution of 625 km^{2} and were available daily, so we related them the specific day and year a location was obtained.

We scaled each variable by subtracting their overall means from the observed values and then dividing by their overall standard deviations. To assess the predictive capacity of the model we used a cross-validation approach designed for case–control analyses [50]. We withheld 20% of clusters to use as a test set in for each of the 25 imputed data sets and used the remaining 80% of the clusters as the training data set. For each location in the withheld cluster, we calculated the choice probability based on the parameters estimated from the training dataset. We then ranked the calculated choice probabilities within each cluster from lowest to highest, extracted the rank for each used location, and calculated the number of used locations that were ranked in each of the 26 bins (with bins 1 and 26 having the lowest and highest calculated choice probabilities, respectively, within the cluster). Finally, we used Spearman’s rank correlation of the bin counts to assess predictive capacity of the model. A high correlation value is expected for good-fitting models because used locations should rank higher (on average) than available locations.

### Model implementation

For the two analyses (i.e., movement and step selection), we estimated the posterior distribution for each parameter with Monte Carlo Markov Chains using the package ‘rjags’ [51] to run the program JAGS [52] from R [37]. We ran the 25 imputed data sets in parallel, allowing 5,000 iterations for the adaptation phase and a burn in of 100,000 and 10,000 iterations for the movement and step selection analyses, respectively. We then obtained 100,000 and 5,000 iterations from each chain for the movement and step selection analyses (respectively) and thinned each by 100 and 5 (respectively), resulting in a total of 1,000 samples from the posterior distribution. We visually assessed each parameter for convergence. We then combined the 1,000 posterior samples from each imputed data set for each analysis, resulting in a posterior sample of 25,000 which we used to estimating summary statistics for model parameters. Given that our primary study objective was to assess if differences exist in parameter estimates between polar bear classes we assessed if the 95% Credible Intervals (CI) for a given class’s parameter estimate overlapped with the other class’s 95% CIs.