This code creates a fake dataset by drawing 25 samples from band_colors with replacement. It then combines the results element by element, producing a list of 25 three-color band combinations. Finally, it removes any duplicates and keeps the first 12 results.
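As a rough sketch of what that description amounts to, the steps could look like the following. The contents of band_colors and the seed are assumptions made only for illustration; the real values are not shown in this section.

```r
library(dplyr)

# Assumed color codes; the actual band_colors vector may differ
band_colors <- c("B", "G", "K", "W", "Y")

set.seed(42)  # arbitrary seed, used only to make the sketch repeatable

birds <- paste0(
  sample(band_colors, 25, replace = TRUE),  # first band of each bird
  sample(band_colors, 25, replace = TRUE),  # second band
  sample(band_colors, 25, replace = TRUE)   # third band
) %>%
  unique() %>%  # drop repeated three-color combinations
  head(12)      # keep at most the first 12

birds
```

Because the draws are made with replacement, the same combination can appear more than once, which is why the duplicates are removed before taking the first 12.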
We generated 100 observations, but bloy_chicks has only 94. I think this is because the code removes duplicate rows using distinct().
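A small, made-up example shows the same effect on a smaller scale. The beach names and bird codes below are hypothetical and are not taken from bloy_chicks.

```r
library(dplyr)

# Hypothetical sightings; rows 1 and 3 are identical on purpose
sightings <- tibble(
  beach = c("North", "South", "North", "East"),
  bird  = c("BGK", "WYK", "BGK", "BGK")
)

# distinct() keeps one copy of each unique row
deduplicated <- sightings %>% distinct()

nrow(sightings)     # 4 rows before
nrow(deduplicated)  # 3 rows after the repeated row is dropped
```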
# Find most frequent beach per bird
beach_freq <- bloy_chicks %>%
  group_by(bird) %>%
  count(bird, beach) %>%
  filter(n == max(n)) %>%
  ungroup()

# Find first date for each bird+beach
beach_early <- bloy_chicks %>%
  group_by(bird, beach) %>%
  summarize(earliest = min(survey),
            .groups = "drop")

# Join the two conditions and retain most frequent beach, only earliest
hatch_beach <- beach_freq %>%
  left_join(beach_early, by = c("bird", "beach")) %>%
  group_by(bird) %>%
  filter(earliest == min(earliest)) %>%
  sample_n(1) %>% # Randomly choose 1 row. See ?sample_n
  ungroup()
Custom function!
The logic:
Put the logic for estimating the hatching beach in a single function.
Group the data by bird
Summarize each group using your custom function
Using this workflow: Most frequent site -> earliest day -> choose 1
find_hatching_beach <- function(site, date) {
  # Start with a data frame (or tibble) of site and date for *one* bird
  # Use pipes and dplyr functions to find the hatching beach
  bird_observations <- tibble(site, date)
  result <- bird_observations %>%
    count(site) %>%
    filter(n == max(n)) %>%
    left_join(bird_observations, by = c("site")) %>%
    filter(date == min(date)) %>%
    sample_n(1)
  # use as many pipes and dplyr functions as necessary
  # result should end up as a data frame with one row for the hatching beach
  return(result$site) # return the hatching beach
}

# split-apply-combine
hatch <- bloy_chicks %>%
  group_by(bird) %>%
  summarize(hatch = find_hatching_beach(beach, survey))
The column I used for “date” was “survey”; the column I used for “site” was “beach”.
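To check the workflow (most frequent site, then earliest day, then choose 1), it can help to step through it by hand for a single bird. The sketch below uses hypothetical sightings, not rows from bloy_chicks, and is built so that two sites tie on frequency.

```r
library(dplyr)

# Hypothetical sightings for one bird; site names are made up
obs <- tibble(
  site = c("North", "South", "South", "North", "East"),
  date = as.Date(c("2023-06-08", "2023-06-01", "2023-06-15",
                   "2023-06-22", "2023-06-29"))
)

# Step 1: most frequent site(s). North and South tie with 2 sightings each.
most_frequent <- obs %>%
  count(site) %>%
  filter(n == max(n))

# Step 2: among the tied sites, keep only the earliest sighting.
earliest <- most_frequent %>%
  left_join(obs, by = "site") %>%
  filter(date == min(date))

# Step 3: if more than one row were left, pick one at random.
hatching <- earliest %>% sample_n(1)

hatching$site  # "South", because its first sighting (2023-06-01) is the earliest
```

Here the random choice in step 3 has nothing to decide, since only one row survives step 2. It matters only when the tied sites also share the same earliest date.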
The hatching beach for both TWG and WYB was Mitchell’s.