A data-driven method to locate nest sites and estimate reproductive outcome from avian telemetry data

The use of tracking devices for collecting animal movement data has become widespread in recent decades. In parallel, this has sparked a proliferation of methods to infer individual behavior from tracking data. Being able to learn more than only the movement trajectories of animals from telemetry data is one of the major forces pushing the field of movement ecology forward. One application that has been poorly explored is the use of movement data to estimate reproductive success. We pioneered this application in birds by introducing a method to locate nesting attempts and estimate their outcome from tracking data. We implemented our method in the R package nestR and illustrate its application to three bird species: the wood stork (Mycteria americana), the lesser kestrel (Falco naumanni), and the Mediterranean gull (Ichthyaetus melanocephalus). We identified nest site locations based on the analysis of recursive movement patterns of breeding individuals acting as central place foragers. Using trajectories with known breeding attempts, we estimated a set of species-specific criteria for the identification of nest sites, using non-reproductive individuals as controls. We estimated nest survival as a measure of reproductive outcome from nest site re-visitation histories, using a Bayesian hierarchical modeling approach that accounted for temporally variable re-visitation patterns, probability of visit detection, and missing data. Provided the availability of adequate tracking data, our method can be broadly applied to estimate reproductive outcome in a variety of central place forager species. Inferring reproductive outcome from tracking data will allow ecologists to bridge the gap between movement and space use behavior, environmental factors, and their fitness consequences.


Introduction
Technological advancements have resulted in an exponential increase in available animal tracking data in recent decades (Urbano et al., 2010;Kays, Crofoot, Jetz, & Wikelski, 2015;Wilmers et al., 2015). The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. . https://doi.org/10.1101/562025 doi: bioRxiv preprint The ability to obtain movement data at very fine resolutions brings us closer to understanding the behaviors underlying movement (Cagnacci, Boitani, Powell, & Boyce, 2010;Tomkiewicz, Fuller, Kie, & Bates, 2010). Inferring more than just the movement trajectories of animals from telemetry data is one of the major forces pushing the field of movement ecology forward (Nathan et al., 2008;Schick et al., 2008;Wilmers et al., 2015).
Knowing what an animal is doing when it is moving a certain way can improve our understanding of the links between movement and resource dynamics, species interactions, distribution, and individual fitness, which is a fundamental driver of population dynamics and evolutionary processes (Mueller & Fagan, 2008;Schick et al., 2008;Morales et al., 2010). The importance of this pursuit justifies the recent and ongoing proliferation of analytical techniques to infer behavior from movement (Gurarie et al., 2016;Thiebault, Dubroca, Mullers, Tremblay, & Pistorius, 2018). Many of these approaches seek to identify behavioral modes by splitting movement trajectories into behaviorally homogeneous bouts based on quantitative properties of the track (Edelhoff, Signer, & Balkenhol, 2016). For instance, topology-based segmentation techniques classify locations along a track into distinct groups based on the distribution of path metrics, such as turning angles and step length (Van Moorter, Visscher, Jerde, Frair, & Merrill, 2010) or speed (Garriga, Palmer, Oltra, & Bartumeus, 2016). Time-series analyses, such as behavioral change-point analysis (Gurarie, Andrews, & Laidre, 2009) or wavelet analysis (Soleymani, Pennekamp, Dodge, & Weibel, 2017) decompose trajectories by detecting significant changes of a path-signal in time. State-space models use a mechanistic approach to identify latent states in the data corresponding to different behavioral modes (Beyer, Morales, Murray, & Fortin, 2013).
These analytical tools differ from one another in their functioning, but all of them provide ways to relate movement patterns to the underlying behavioral processes (Edelhoff et al., 2016;Gurarie et al., 2016). [...] directed flight, and resting/searching, in magnificent frigatebird (Fregata magnificens) movement tracks using hidden Markov models.
These event-oriented applications aim to isolate focal episodes along a movement track, rather than dividing the entire trajectory into behavioral phases. Whether the objective is segmentation or the detection of events of interest, the signal used depends on what type of behavioral processes are being targeted (Nams, 2014). For wolf kill sites, the signal of interest would be a phase of restricted movement around a carcass (Webb et al., 2008). For calving moose and caribou, it would be a slowdown in movements due to the reduced mobility of newborns (DeMars et al., 2013;Nicholson et al., 2019).
In some contexts, recursive movement patterns can be an indicative signal of underlying processes (Berger-Tal & Bar-David, 2015;Bracis, Bildstein, & Mueller, 2018). Many animals keep returning to places of ecological significance, such as dens, nests, and foraging patches (Bracis et al., 2018). The periodicity of such recursive movements can provide insight into ecological processes (Riotte-Lambert, Benhamou, & Chamaillé-Jammes, 2013). For example, recursive movement patterns have been studied in herbivores with the objective of understanding movement responses to spatio-temporal variability of resources, such as cycles of forage depletion and greening (Bar-David et al., 2009). Bracis et al. (2018) analyzed recursive movement patterns of a turkey vulture (Cathartes aura) to identify roost sites.
Recursive movement patterns can provide insight into behavior and life-history as well (Bracis et al., 2018). Some animals are tied to specific locations in specific phases of their life cycle and exhibit recursive movements to and from those locations (Bracis et al., 2018). These include altricial breeding bird species. While breeding, altricial bird species act as central place foragers (sensu Orians & Pearson, 1979), recursively departing from their nest site to embark on foraging trips and returning to incubate their eggs or provide food for their nestlings (Andersson, 1981;Alonso, Carrascal, Muñoz-Pulido, & Alonso, 1994;Burke & Montevecchi, 2009). Back-and-forth trips can therefore be observed in movement data and used to identify when and where a bird is nesting. Once the location of a nest has been identified, patterns of nest site re-visitation may reveal information about the fate of a breeding attempt and may be used to infer reproductive outcome. Being able to estimate reproductive outcome from tracking data connects movement to reproduction, an important component of individual fitness (Morales et al., 2010;Singh & Ericsson, 2014). This could be especially valuable whenever gathering data on reproductive outcome through direct observation is logistically difficult or impractical (Etterson et al., 2011;Wilmers et al., 2015;Nicholson et al., 2019). Moreover, inferring reproductive outcome from movement establishes a direct link between environmental dynamics and components of individual fitness if tracking data is coupled with remote sensing data (Pettorelli et al., 2014).
While tools to infer behavioral structure along animal trajectories have been extensively used to evaluate responses to environmental factors or internal state, few studies so far have attempted the use of movement data to assess reproductive success (DeMars et al., 2013). The applications of path segmentation to detect ungulate parturition have been the first attempts at the task (DeMars et al., 2013). Besides these examples, efforts to apply movement pattern detection to the estimation of reproductive metrics have been limited. To our knowledge, no study has attempted to infer avian reproductive fitness from movement data.
Here, we introduce a data-driven method to locate breeding attempts along movement trajectories of altricial bird species and estimate their outcome based on patterns of nest re-visitation. Our approach combines event-oriented pattern detection with the analysis of recursive movement patterns. Our workflow is implemented in the R package nestR (https://github.com/picardis/nestR). We demonstrate the broad applicability of our approach by illustrating its use on GPS-tracking data for three altricial species that broadly differ in their breeding habitat and ecology: the wood stork (Mycteria americana), the lesser kestrel (Falco naumanni), and the Mediterranean gull (Ichthyaetus melanocephalus).
Example datasets for all three species are available in nestR.

Method description
Nest site detection - Figure 1 provides a roadmap of our workflow. Nest sites are identified as repeatedly visited locations along individual trajectories (Figure 1). Returns to a location are defined as returns to a circular area of a user-defined radius, obtained by placing a buffer around each point of the trajectory. Due to both behavior and GPS error, GPS points at a revisited location may be spatially scattered around the true center. Defining locations as buffers around points helps account for this scattering. The size of the buffer sets the spatial scale at which re-visitation patterns will be calculated. Re-visitation patterns are described by the following set of parameters: the maximum number of consecutive days a location is visited; the percentage of days visited between the first and last visit; and the percentage of fixes at the nest on the day with maximum attendance. Parameter values are then used as diagnostic features to filter nest sites among re-visited locations, based on the rationale that re-visitation patterns of nest sites differ from those of non-nest sites. Nest sites are often visited for longer stretches of consecutive days, on more days, and more frequently or for longer within a day than other types of re-visited locations. Especially during incubation, many bird species attend their nests intensively (e.g., Norton, 1972;van Vessem & Draulans, 1986;Ojowski, Eidtmann, Furness, & Garthe, 2001). The parameters we use to describe re-visitation patterns are meant to capture these diagnostic behaviors and are used to identify nest sites among re-visited locations. In nestR, the identification of nests is implemented in the function find_nests(). Besides the tracking data and the set of filtering parameters, this function also takes as input a series of species- and data-specific parameters used to tailor the algorithm to different case studies.
These include arguments that restrict the analysis within the breeding season for a given species, and arguments that help account for data sampling rate and fix failure rate. For a more thorough discussion of function arguments, we direct the reader to the nestR package vignette.
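The three re-visitation parameters described above can be illustrated with a minimal sketch. This is not the nestR implementation: the function name, the input layout (daily counts of fixes and of fixes inside the buffer), and the example values are all assumptions made for illustration.

```python
from datetime import date, timedelta


def revisit_summary(fixes_per_day, visits_per_day):
    """Summarize re-visitation of one candidate location (illustrative only).

    fixes_per_day:  dict mapping date -> number of GPS fixes recorded that day
    visits_per_day: dict mapping date -> number of those fixes inside the buffer
    """
    visited = sorted(d for d, n in visits_per_day.items() if n > 0)
    if not visited:
        return None

    # Longest run of consecutive calendar days with at least one visit
    longest = run = 1
    for prev, cur in zip(visited, visited[1:]):
        run = run + 1 if cur - prev == timedelta(days=1) else 1
        longest = max(longest, run)

    # Percentage of days visited between the first and last visit (inclusive)
    span = (visited[-1] - visited[0]).days + 1
    pct_days = 100.0 * len(visited) / span

    # Percentage of fixes in the buffer on the day with maximum attendance
    top_att = max(100.0 * visits_per_day[d] / fixes_per_day[d] for d in visited)

    return {
        "consec_days": longest,
        "pct_days_visited": pct_days,
        "pct_top_attendance": top_att,
    }


# Hypothetical 10-day history with 10 fixes per day
start = date(2020, 4, 1)
visits = {start + timedelta(days=i): v
          for i, v in enumerate([3, 5, 0, 4, 6, 8, 0, 0, 2, 1])}
fixes = {d: 10 for d in visits}
summary = revisit_summary(fixes, visits)
```

In this made-up history the location is visited on 7 of 10 days, the longest run of consecutive visited days is 3, and attendance peaks at 8 of 10 fixes. Thresholds on these three quantities are what the filtering step would then act on.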
Unless prior knowledge is available about re-visitation patterns to nest sites, researchers will need ways to inform their choice of parameter values used for filtering nest sites among re-visited locations (Figure 1). If the true location of a nest is known for a subset of the data, researchers can compare re-visitation patterns at known nest sites to those of non-nest sites and find one or more sets of parameter values to discriminate between them. If no on-ground data is directly available, researchers can visually explore the data and identify trusted nest sites, where possible. For example, likely nest sites can be recognized based on habitat features or proximity to known breeding colonies. An interactive visualization tool is available in nestR in the form of a Shiny (https://shiny.rstudio.com) app that lets users dynamically explore re-visited locations on satellite imagery while manipulating parameters in real time. Once known or trusted nest sites are identified, non-nest sites can be selected based on a criterion of temporal overlap; revisited locations that were visited simultaneously with the breeding attempt can be treated as non-nest sites, assuming birds cannot breed in two places at the same time. Running the function find_nests() with non-constraining values for the filtering parameters will return any re-visited locations, among which the user can pick known nest and non-nest sites to compare (Figure 1).
Several approaches are possible to identify sets of parameter values to distinguish nest from non-nest sites. In nestR, we implemented one possible approach based on classification and regression trees (CART; De'ath & Fabricius, 2000). The function discriminate_nests() applies a CART algorithm to a dataset of known nest and non-nest sites given as input. The tree gets pruned to the optimal number of nodes based on a minimum relative error criterion (De'ath & Fabricius, 2000). If input data is split into training and testing fractions, the CART also provides estimates of classification error rates through cross-validation.
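To give an intuition for how a classification threshold emerges from known nest and non-nest sites, the sketch below searches for the single cut-off on one feature that misclassifies the fewest sites. It is a one-split decision stump, far simpler than a pruned CART and not what discriminate_nests() does internally; the function name and the attendance values are hypothetical.

```python
def best_threshold(values, labels):
    """Find the cut-off on one feature that best separates nests from non-nests.

    values: feature value per site (e.g., percent attendance on the top day)
    labels: True for known nest sites, False for known non-nest sites
    Sites with value >= threshold are classified as nests.
    Returns (threshold, number of misclassified sites).
    """
    best = None
    for t in sorted(set(values)):
        # Count sites whose predicted class disagrees with the known class
        errors = sum((v >= t) != y for v, y in zip(values, labels))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


# Hypothetical top-day attendance (%) at known nest and non-nest sites
attendance = [95, 88, 92, 40, 55, 30, 91, 60]
is_nest = [True, True, True, False, False, False, True, False]
threshold, n_errors = best_threshold(attendance, is_nest)
```

A full CART repeats this kind of search recursively over all features and then prunes the tree, which is why its output can combine several parameters into one rule.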
The CART identifies one or more sets of parameter values that best discriminate between nest and non-nest sites (Figure 1). Researchers can then plug these parameter values into the filtering arguments of find_nests() to retain only nest sites among re-visited locations (Figure 1). Researchers can decide how to deal with candidate breeding attempts that temporally overlap with each other. Assuming birds cannot breed in two places at the same time, the recommended option is to pick the top candidate among any set of temporally overlapping ones and discard the rest. For each nest site, find_nests() outputs a summary of re-visitation patterns and a history of nest re-visitation (in the form of a presence/absence time series, with GPS fixes within the nest buffer assigned to the status "present" and GPS fixes outside of the nest buffer to "absent"). This time series is then used to estimate the outcome of breeding attempts.
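The handling of temporally overlapping candidates can be sketched as a greedy selection. The ranking criterion (a generic "score", for instance the total number of visits), the field names, and the example attempts are assumptions for illustration and may differ from how nestR ranks candidates.

```python
def keep_top_candidates(attempts):
    """Keep the best candidate among temporally overlapping attempts.

    attempts: list of dicts with 'start' and 'end' (day numbers, inclusive)
              and 'score' (hypothetical ranking criterion, higher is better).
    Candidates are considered from best to worst; one is kept only if it
    does not overlap in time with any candidate already kept.
    """
    kept = []
    for cand in sorted(attempts, key=lambda a: a["score"], reverse=True):
        overlaps = any(
            cand["start"] <= k["end"] and k["start"] <= cand["end"]
            for k in kept
        )
        if not overlaps:
            kept.append(cand)
    return sorted(kept, key=lambda a: a["start"])


# Hypothetical candidates for one individual-year
candidates = [
    {"id": "A", "start": 10, "end": 60, "score": 400},
    {"id": "B", "start": 30, "end": 45, "score": 120},  # overlaps A
    {"id": "C", "start": 70, "end": 90, "score": 80},   # later, no overlap
]
kept_ids = [a["id"] for a in keep_top_candidates(candidates)]
```

Here candidate B is discarded because it overlaps the better-scoring candidate A, while C is retained as a separate, non-overlapping attempt.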
Reproductive outcome estimation - The outcome of each identified breeding attempt is estimated using a Bayesian hierarchical modeling approach (Figure 1). A breeding attempt is considered successful if the nest site was visited until the end of a complete breeding cycle for the focal species. The model estimates nest survival based on patterns of nest site re-visitation, taking into account imperfect detection of nest visits and missing fixes. The MCMC algorithm is implemented in JAGS (Plummer, 2003) via the R package rjags.
The model specification includes two processes: the survival process, which is not directly observable, and the observation process, which is the signal observed in the re-visitation histories. Much like a Bayesian implementation of a Cormack-Jolly-Seber capture-mark-recapture model (Lebreton, Burnham, Clobert, & Anderson, 1992), the latent nest survival variable is modeled at the daily scale as a function of survival status at the previous time-step and daily survival probability: Observed visits on a given day are modeled as a function of current nest survival status, probability of visit detection on that day, and number of GPS fixes available on that day: Where the probability of detection is: Reproductive outcome is defined as the probability that the nest was still surviving on the last day of the theoretical duration of a complete breeding attempt: Where T is the duration of a complete breeding attempt. Both survival and detection probability are modeled using a binomial Generalized Linear Model as a function of the day of the attempt: The model is fully specified by using uninformative priors on the Beta parameters, in this case a normal distribution with a mean of 0 and precision of 1e-5. In the current implementation, daily survival and detection are assumed to be the same for all nests in the population. The model outputs daily estimates of survival and detection probability at the population level, as well as daily survival estimates for each breeding attempt along with credible intervals.
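The model described in words above can be written compactly as follows. This is a sketch consistent with the verbal description, not a verbatim transcription. The symbols are notation assumed here: $z_t$ for latent nest survival status on day $t$, $\phi_t$ for daily survival probability, $Y_t$ for the number of detected visits, $N_t$ for the number of available GPS fixes, $p_t$ for detection probability, and $\beta$ for the regression coefficients. The conditioning shown for $p_t$ is also an assumption.

```latex
% Survival process (latent): a nest can be alive on day t only if it was alive on day t-1
z_t \sim \mathrm{Bernoulli}\left(z_{t-1}\,\phi_{t-1}\right)

% Observation process: visits detected among the N_t fixes available on day t
Y_t \sim \mathrm{Binomial}\left(N_t,\; z_t\,p_t\right)

% Detection probability, assumed conditional on an active nest and available fixes
p_t = \Pr\left(\text{visit detected} \mid z_t = 1,\; N_t > 0\right)

% Reproductive outcome: probability of survival to the end of a complete attempt of duration T
\Pr\left(z_T = 1\right)

% Daily survival and detection as logit-linear functions of the day of the attempt
\mathrm{logit}(\phi_t) = \beta_{\phi 0} + \beta_{\phi 1}\,t
\qquad
\mathrm{logit}(p_t) = \beta_{p 0} + \beta_{p 1}\,t

% Uninformative priors on all coefficients (mean 0, precision 10^{-5})
\beta \sim \mathrm{Normal}\left(0,\; \tau = 10^{-5}\right)
```

Under this formulation, a day with no fixes ($N_t = 0$) contributes no information to the likelihood, which is how missing data are accommodated.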

Applications to data
We applied our workflow to GPS-tracking data for 148 individual-years for wood storks (henceforth storks), 53 for lesser kestrels (henceforth kestrels) and 29 for Mediterranean gulls (henceforth gulls).
All tags were solar-powered and recorded fixes primarily during daytime. Details about devices, settings, harnesses and study areas regarding storks and kestrels can be found in Borkhataria et al. (2008) and Cecere et al. (2018), respectively. We split data into individual-years such that breeding activities were contained within them. To find nest sites, we restricted the analysis to data within the breeding season for each species (Table 1). While both kestrels and gulls have a well-defined breeding season between April and August in our study areas (Snow, Perrins, Hillcoat, Gillmor, & Roselaar, 1997), storks in the southeastern U.S. can breed at slightly different times of the year depending on latitude (Coulter, Rodgers, Ogden, & Depkin, 1999; Table 1). In this case, we only excluded the window of time where no breeding activities were expected to occur anywhere in the range from the analysis.
Given the spatial resolution of the GPS data (Table 1) and the expected scale of movements around the nest site for all three species, we used a buffer of 40 m around each GPS position. We initially screened trajectories for any re-visited locations using non-constraining values in filtering parameters (which is equivalent to not applying any filtering). We then used on-ground data on known nest locations to select true nests and non-nest sites among re-visited locations. Kestrels and gulls were captured at the nest site (Table 1), so the location of the nest was known. For storks, on-ground data on nest locations was available for 10 individual-years (Bear D., unpublished data). We explored the remaining stork trajectories and identified those for which the top visited location was at a known breeding colony (data from USFWS 2018). We identified 97 of these, marked them as trusted and treated them as known nest sites for the rest of the analysis.
We used CART to compare re-visitation patterns between nest and non-nest sites. We split each of the three datasets in two parts for training and testing (3:2 ratio). We used the resulting sets of parameter values to filter nest sites among re-visited locations in the trajectories of breeding individuals (Table 2).
Even when the CART did not suggest that the number of consecutive days visited was an important predictor of true nest sites, we chose a reasonable value to use as a threshold for this parameter (Table 2). We did not expect to have enough power to discern nest from non-nest sites for attempts that failed in the first handful of days. We only retained a top candidate among any sets of breeding attempts that were temporally overlapping. We used non-breeder trajectories (subadults in the case of storks, non-breeding season data in the case of kestrels and gulls) to validate our results against false positives. We calculated the positive predictive value of our algorithm as the percentage of known nest sites among the total number of nest sites we found for each species. We calculated the sensitivity of our algorithm as the percentage of the known nest sites that were identified. We calculated the false negative rate as the percentage of known nest sites that we failed to identify. Finally, we calculated the false positive rate as the percentage of non-breeding individual-years for which we erroneously identified a nest site.
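The four performance measures defined above reduce to simple ratios. The sketch below computes them from counts; the function and argument names are illustrative, and the counts in the example are hypothetical, not results from this study.

```python
def detection_metrics(known_found, total_found, known_total,
                      nonbreeders_with_nest, nonbreeders_total):
    """Compute nest detection performance measures, all in percent.

    known_found:           known nest sites that the algorithm identified
    total_found:           all nest sites the algorithm identified
    known_total:           all known nest sites
    nonbreeders_with_nest: non-breeding individual-years with a nest identified
    nonbreeders_total:     all non-breeding individual-years
    """
    sensitivity = 100.0 * known_found / known_total
    return {
        # Known nests among everything the algorithm called a nest
        "positive_predictive_value": 100.0 * known_found / total_found,
        # Known nests that were recovered
        "sensitivity": sensitivity,
        # Known nests that were missed (complement of sensitivity)
        "false_negative_rate": 100.0 - sensitivity,
        # Non-breeders erroneously assigned a nest
        "false_positive_rate": 100.0 * nonbreeders_with_nest / nonbreeders_total,
    }


# Hypothetical counts, for illustration only
metrics = detection_metrics(known_found=40, total_found=50, known_total=50,
                            nonbreeders_with_nest=2, nonbreeders_total=20)
```

Note that sensitivity and the false negative rate always sum to 100%, whereas the false positive rate is computed on a separate set of trajectories (the non-breeders).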
We fit the nest survival model described above to estimate the outcome of the identified breeding attempts. Since kestrels and gulls were captured after they had already started breeding, the initial part of every breeding attempt was missing from the data. To account for this, we subtracted the average number of days from start to hatching (for kestrels) and to late incubation (for gulls) from the full breeding cycle when specifying the expected duration of a complete attempt (Table 1).

Results
The initial screening with no filtering identified 9954 re-visited locations (i.e., potential nest sites) for storks, 1285 for kestrels, and 1342 for gulls. Results from the CART showed that the optimal set of parameter values to discriminate nest from non-nest sites was 18 minimum consecutive days visited and 90% minimum nest attendance on the top day for storks, and 52% and 45% minimum attendance on the top day for kestrels and gulls, respectively (Figure 2). In addition to the parameters suggested by the CART, we used a minimum of 4 and 8 consecutive days visited for kestrels and gulls, respectively (Table 2). By filtering re-visited locations using the parameters listed in Table 2, we identified 106 nest sites for storks, 49 for kestrels, and 28 for gulls, which closely matches the number of nest sites we were expecting to find (Table 1). As a consequence, the positive predictive value of the algorithm ranged from 78% to 94%, the sensitivity from 75% to 86%, and the false negative rate from 14% to 25% (Table 3). The false positive rate was below 7% for storks and gulls but reached 68% for kestrels (Table 3). The probability of detecting nest visits decreased throughout the breeding attempt for all three species (Figure 3). The distribution of survival estimates at the individual level differed between known failed and successful attempts, although with some degree of overlap, especially for kestrels (Figure 4).
No data on true outcomes were available for storks, therefore we were unable to verify survival estimates for this species.

Discussion
We present an original, data-driven method to identify nest site locations of altricial avian species and estimate the outcome of breeding attempts based on GPS-tracking data. This is among the first attempts to use telemetry data to infer a major component of fitness (DeMars et al., 2013), and the first applied to birds. We demonstrated the broad applicability of our method by illustrating its use on GPS-tracking data from three species representative of different ecosystems, including a subtropical wading bird, a small steppe raptor, and a seabird. The implementation of our workflow in the R package nestR is straightforward and requires the use of intuitive, biologically interpretable parameters.
Nest site detection - Our nest site detection method performed well on all three species, allowing us to correctly identify most of the known nest sites from movement trajectories of breeding individuals. We achieved high positive predictive value and sensitivity. Moreover, we may have underestimated the positive predictive value by not being able to confirm possible second attempts following early failures.
All the species we tested our method on may attempt to breed a second time if their first clutch fails early on, and it is possible that the nest sites we were unable to confirm included second attempts in addition to non-nests.
Differences between our case-study datasets both in terms of species ecology and data characteristics also presented us with different challenges. The main problem we had to deal with when applying our method to storks was not having much on-ground data. Most of the nest locations we treated as known were trusted nest sites for which we did not have on-ground confirmation (97 out of 107). Moreover, we had no information regarding the outcome of the breeding attempts, and therefore we were not able to validate results of the outcome estimation for this species. The strengths of this dataset were, first, the large sample size; and second, the fact that birds were tagged at fledging or while non-breeding and were tracked for several years, so we had data for the entire breeding cycle for all the attempts. For kestrels and gulls, this was not the case. Both kestrels and gulls were tagged after the breeding attempt had already started, so the incubation phase was partly (for gulls) or entirely (for kestrels) missing from the data. Missing the first part of breeding attempts affected our ability to identify nest sites. Nest attendance is high during incubation in many altricial bird species whose nestlings are not able to thermoregulate autonomously (DuRant, Hopkins, Hepp, & Walters, 2013; see Clark, 1980 for an example on storks). Not having tracking data for kestrels and gulls for the phase where re-visitation patterns most obviously diagnose nesting behavior hampered our ability to identify nest sites. For storks the lower false negative rate might have resulted from the relatively strict constraint we enforced for the minimum number of consecutive days; it is possible that some of the stork breeding attempts we missed failed before reaching the 18-day mark. Not identifying breeding attempts whose duration does not exceed the minimum constraint applied is a logical implication rather than a failure. 
In 3 cases out of 7, early failed attempts (within the 2 days following tagging) also explain missed detection of gull nest sites. For kestrels, the behavior of males might also have played a role in hampering nest site detection: 3 of the 10 nest sites that we failed to identify appear to be possible roosting sites used by males at night in the proximity of the nesting colony. Because the use of the roosting sites is temporally overlapping with the breeding attempt, roosting sites were likely to be selected for males instead of nest sites whenever attendance was higher at the roost compared to the nest. When considered in the light of these unavoidable data limitations and behavioral constraints, the performance of our nest site detection method is well above satisfactory.
Another drawback of the kestrel and gull datasets, especially gulls, is that sample size was small. Small sample size might have played a role in determining the high error rates we obtained as estimates from the CART and from our post-hoc assessment. Increasing the proportion of data used for training the CART should lead to more discerning power and more accurate results, likely at the cost of an accurate estimation of cross-validation error rates. When constrained by small sample sizes, the choice of whether to prioritize more accuracy in parameter selection or cross-validation is a judgement call that depends on the study objectives and circumstances. If a post-hoc evaluation of classification performance is deemed sufficient, researchers may choose to use the entire subset of data for which nest locations are known to train the CART and skip the cross-validation.
Several factors may contribute to explain the discrepancies (both positive and negative) we observed between the error rates we expected based on the CART output and the realized error rates. The error rates estimated by the CART do not take into account an additional layer of skimming that is introduced later in the analysis, when competing attempts that temporally overlap with a top candidate are discarded. This should reduce the realized false positive rate compared to what is expected based on cross-validation alone. In our case, this was verified for both storks and gulls. However, spurious nest sites were identified in some of the non-breeding kestrel tracks. This is likely explained by species-specific behavior: non-breeding kestrels spend long stretches of time and consecutive days on a perch while scanning for prey or resting. Distinguishing these patterns of attendance and re-visitation from those of a nest might be challenging without applying restrictions based on seasonality and geographical area. The underestimation of false negative rates based on the CART cross-validation compared to our post-hoc assessment is in part due to the constraints we added in terms of minimum consecutive days for kestrels and gulls. While decreasing our power to detect some early failed breeding attempts, applying these additional constraints increased the positive predictive value in both cases. Discrepancies between the expected and realized error rates also result, in part, from the random selection of the sample used to train the CART.
Error rates for nest site identification vary in importance depending on the study objectives. If the objective is to estimate reproductive outcome, ensuring that attempts are not missed should receive priority over avoiding the selection of non-nest sites.
Any re-visited location that gets erroneously identified as a nest site would likely be classified as a failed attempt eventually anyway, and would not affect estimates of the absolute number of successful attempts at the population level. In this case, we suggest that researchers may want to focus on minimizing false negatives. Conversely, if the objective of a study is, for instance, to analyze factors associated with nest site selection, minimizing false positives should be the priority.
In an ideal situation, researchers would have access to high (or high-enough-) resolution data for the focal species, with abundant on-ground data about nest locations and outcome to train the algorithm of choice, estimate error rates, and validate results. Once this is done, data characteristics being equal, the parameters found to identify nest sites can be applied to new individuals of the same species for which on-ground information is not available. If CART is the tool of choice to inform the choice of parameter values, we underline that classification thresholds in output should be used with caution, and considered more as range indications than as clear-cut rules. We also recommend that parameter values suggested by the outcome of the CART should be critically evaluated for their biological significance before use, and that adjustments should be made as needed based on knowledge of the species biology.
Future efforts to improve our method for the identification of nest locations will include incorporating uncertainty in our estimates of nest sites, allowing us to interpret classification results in a probabilistic framework.
Reproductive outcome estimation - We obtained a satisfactory degree of separation in the estimated probabilities of survival between failed and successful attempts in both gulls and kestrels, with the estimates generally reflecting the true outcome of breeding attempts. However, having incomplete data for each breeding attempt for kestrels and gulls affected the accuracy of our outcome estimation.
Missing the initial part of the attempts required adjustments of the parameter used to define the duration of a breeding cycle in the nest survival models, which inevitably came with some assumptions and generalizations. Namely, we assumed that the number of days that were missing from the data was the same for all individuals within each species. If inaccurate, this assumption can result in wrong estimation of reproductive outcome. Assuming too short a duration of the missing initial phase can lead to underestimating reproductive outcome, by predicting true successes as failures, while assuming too long a duration can result in the opposite error, leading to the prediction of true failures as successes.
Given the asynchrony between tagging dates and the actual start dates of each breeding attempt, subtracting the same number of days from the breeding cycle for all individuals likely led to temporal mismatches. If estimation errors are unrelated to the true outcome, they should not affect population-level inference on the proportion of successful breeding events. In the absence of the data limitations we discussed, the performance of our method would likely improve compared to the results we presented.
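The direction of both errors can be seen with a deliberately simplified, deterministic stand-in for the nest survival model, in which an attempt counts as successful if the nest is still attended once the observed part of the breeding cycle is complete. The Python sketch below is not the Bayesian model used in this paper, and the cycle length, number of missing days, and attendance durations are hypothetical.

```python
def observed_cycle_length(full_cycle_days, assumed_missing_days):
    """Days of the breeding cycle expected to fall within the tracking data."""
    if not 0 <= assumed_missing_days < full_cycle_days:
        raise ValueError("assumed_missing_days must be in [0, full_cycle_days)")
    return full_cycle_days - assumed_missing_days


def predict_success(days_attended, full_cycle_days, assumed_missing_days):
    """Predict success if attendance spans the whole observed part of the cycle."""
    return days_attended >= observed_cycle_length(full_cycle_days, assumed_missing_days)


FULL_CYCLE = 60    # hypothetical duration of a complete breeding cycle (days)
TRUE_MISSING = 20  # hypothetical number of days truly missing before tagging

true_success_days = FULL_CYCLE - TRUE_MISSING  # nest attended to the end of the cycle
true_failure_days = 32                         # nest abandoned before the end

# Compare an assumption that is too short, correct, and too long.
for assumed in (10, 20, 30):
    print(assumed,
          predict_success(true_success_days, FULL_CYCLE, assumed),
          predict_success(true_failure_days, FULL_CYCLE, assumed))
```

With these numbers, assuming 10 missing days classifies the true success as a failure, whereas assuming 30 missing days classifies the true failure as a success. Only the correct value of 20 recovers both outcomes.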
An advantage of the kestrel and gull datasets, unlike the stork dataset, was the high temporal resolution.
Lower sampling rates are more likely to result in lower detectability of nest visits, especially in some phases of the breeding attempt. For example, storks greatly reduce the frequency of their nest visits in the late nestling-rearing phase, so that with data at a 1-hour resolution the detection of nest visits drops in the final part of breeding attempts. Detecting nest visits is critical for the estimation of reproductive outcome. Therefore, the higher the temporal resolution of the GPS data, the more reliable the estimates of reproductive outcome will be. Detectability of nest visits depends on the interaction between species behavior and data resolution, as the probability of detecting a visit results from the combination of the actual frequency and duration of visits and the sampling rate. Therefore, there is no absolute rule of thumb as to what constitutes an adequate temporal resolution. Rather, the sampling rate may be adjusted according to the behavioral characteristics of each species when the study is designed.
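The interplay between visit behavior and sampling rate can be made concrete with a back-of-the-envelope calculation. The Python sketch below is an idealization and not part of our method. It assumes GPS fixes at perfectly regular intervals, visits of fixed duration that start at independent random times, no failed fixes, and no location error. Under these assumptions a single visit of duration d is captured by at least one fix with probability min(1, d / T), where T is the fix interval. All function names and numbers are hypothetical.

```python
def visit_detection_prob(visit_min, fix_interval_min):
    """P(at least one fix falls within a visit) for regular fixes and a random visit start."""
    if visit_min < 0 or fix_interval_min <= 0:
        raise ValueError("visit_min must be >= 0 and fix_interval_min must be > 0")
    return min(1.0, visit_min / fix_interval_min)


def daily_detection_prob(n_visits, visit_min, fix_interval_min):
    """P(at least one of n independent visits in a day is detected)."""
    if n_visits < 0:
        raise ValueError("n_visits must be >= 0")
    p = visit_detection_prob(visit_min, fix_interval_min)
    return 1.0 - (1.0 - p) ** n_visits


# Hypothetical scenario: 10-minute nest visits, frequent early and rare late in the attempt.
for fix_interval in (15, 60):
    for visits_per_day in (6, 1):
        p_day = daily_detection_prob(visits_per_day, 10, fix_interval)
        print(f"fix interval {fix_interval} min, {visits_per_day} visits/day: {p_day:.2f}")
```

The same sampling rate can therefore be adequate in one phase of the attempt and inadequate in another, which is why the choice of sampling rate depends on the behavior of the focal species.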
The considerations we have made so far highlight how knowledge of the ecology of the focal species is necessary to successfully use the tools we presented in this paper. Researchers also need to be aware of data constraints and limitations. Overall, our application cases provide an illustration of how researchers can make the most of their imperfect data and still get useful, robust results while considering real-world shortcomings.
Synthesis and significance - The most important implication of our work is the ability to use telemetry data to obtain estimates of bird reproductive outcome. Reproductive outcome is an important component of fitness, and estimating it from tracking data will help bridge the long-sought connection between movement and space use and components of demography at the individual level (Nathan et al., 2008;Morales et al., 2010). Our method allows researchers to obtain critical information on reproductive outcome for birds that nest in remote or inaccessible locations where it is difficult or risky to collect on-ground data (Götmark, 1992;Mayer-Gross, Crick, & Greenwood, 1997;Etterson et al., 2011;Wilmers et al., 2015). Getting information on reproductive outcome from tracking data also has the advantage of allowing ready combination with environmental conditions experienced by individual birds (Pettorelli et al., 2014). This opens new, previously unattainable research avenues regarding links between components of individual fitness, movement strategies, and habitat.
A limitation of our approach is that it does not provide estimation of reproductive success in terms of number of offspring, but only in terms of overall success or failure (where success corresponds to at least one nestling fledged and failure to none). However, estimating reproductive success at a finer level might be possible in species whose behavioral signature in terms of nest site re-visitation patterns differs depending on clutch size.
Besides estimating reproductive outcome, our method is useful as a tool to identify nest site locations.
Identifying nest sites is valuable in and of itself: for example, it may allow researchers to find previously unknown nesting sites in species for which breeding locations are partly or entirely unknown. Overall, our method can appeal to researchers with different objectives. First, it may be useful to researchers who want to investigate reproductive outcome in relation to movement and environmental factors. Second, it may serve researchers who want to obtain data on reproductive outcome for species that are not easy to monitor on the ground. Third, it may be of use to researchers who want to identify the location of nest sites for analyses of breeding site fidelity, nest-site selection (Powell, Hodgman, Glanz, Osenton, & Fisher, 2010), or discovery of new breeding sites. Conservation and management applications may both benefit from the availability of the tools we presented. Our [...] where the study is explicitly designed with these objectives in mind. Potentially, our framework can even be adjusted for the identification of types of re-visited locations other than nest sites and taxa other than birds.
Future research could focus on analyzing the temporal variation in the periodicity of nest visits throughout the attempt, and possibly use that to recognize specific phases within the attempt (incubation, early nestling-rearing, etc.). This would provide insight into the temporal component of nesting patterns, allowing researchers to pinpoint the occurrence of events such as egg-laying or hatching.

Conclusions
In the context of the ever-growing availability of telemetry data for more and smaller species and at higher spatio-temporal resolution, the potential of the application we presented is vast. Our method can be used to identify nest sites and estimate reproductive outcome of a broad variety of avian species.
When looked at from multiple angles and in different ecological contexts, the relationship between movement and reproductive success can be investigated in a cohesive overarching framework at the interface of avian and movement ecology.
Ecology, 96 (7), 1741-1753. doi:10.1890/14-1401.1
[...] error rates. Values marked with an asterisk were not taken from the CART output, but were chosen as reasonable thresholds for selecting nest sites. Consecutive days: maximum number of consecutive days spent at a location. Days visited: percentage of days when a location was visited between the first and last visits. Attendance on top day: percentage of GPS fixes at a location on the day with maximum attendance.
All rights reserved. No reuse allowed without permission.

Figures and Tables
Figure 1 - Workflow of the analysis to identify nest sites and estimate reproductive outcome from telemetry data. The R package nestR includes functions to tackle each of the steps depicted in the boxes.