Tutoring new song elements to male birds in the wild: Lessons learnt from playback tests with the collared flycatcher

Many vocalisations of songbirds are sexually selected and socially learnt behavioural traits that are subject to cultural evolution. Cultural inheritance requires that individuals imitate song elements and build them into their repertoire, but little is known about how such learning takes place in natural populations of birds with large repertoires. Using a Hungarian population of the collared flycatcher (Ficedula albicollis) as a model, we used a playback approach to test how often adult males build new song elements (artificially modified or originating from distant populations) into their repertoire during the mating season. We predicted that when individuals incorporate new elements into their repertoire, the formerly unfamiliar elements from the playback songs would be recovered in the recorded songs of the focal males. We performed a teaching procedure with 26 males, in which we played back song sequences containing three artificially modified and three foreign syllables to each male. We recorded the song of the focal males twice a day for 2–6 days. Then, we applied a thorough search based on a combined automatic and manual identification method to detect the tutorial syllables in the recorded songs. We found one foreign syllable type in the recordings from one male, which indicates that male collared flycatchers may learn new syllable types in the courtship season. As our study has some limitations, we highlight some general challenges concerning the use of playback approaches in the field for demonstrating the learning of particular song elements.

Ornis Fennica 99: 52-59. 2022

Introduction
Cultural transmission can be defined as the inheritance of phenotypic traits through the process of social learning (Jenkins 1978, Slater 1986, Luther & Baptista 2010, Garland et al. 2011). Consequently, individuals can accumulate and use information from others concerning food preference, sexual behaviour, predator avoidance and habitat choice. Such learning helps individuals increase their fitness by adapting to a quickly changing environment, and leads to processes of cultural evolution (Mesoudi et al. 2016, Aplin 2019). Cultural evolution has been shown to affect the communication system of many animal species, of which birdsong is the most studied model. Several studies have identified local dialects (Harbison et al. 1999, Nelson et al. 2004, Podos & Warren 2007) or changes in the repertoire composition of a population over time (Byers et al. 2010, Williams et al. 2013), suggesting that cultural evolution takes place. However, the underlying assumption of individual learning remains to be proven in many cases.
Few studies have demonstrated experimentally that individual birds are able to copy song elements from tutor songs, and most of these were performed in captivity. These experiments revealed that song learning is often linked to a specific sensory phase, when tutees need to be exposed to tutor songs, while the production of the learnt elements corresponds to a sensorimotor phase, when birds sing the learnt songs (Marler 1970, Baptista & Petrinovich 1986, Baptista & Morton 1988, Slater et al. 1988, Beecher & Brenowitz 2005). In songbirds, there is considerable interspecific variance in the timing of learning: closed-ended learners have a restricted sensitive phase (Nottebohm 1984, Böhner 1990, Beecher & Brenowitz 2005, Kiefer et al. 2014), while open-ended learners remain sensitive throughout their lifetime (McGregor & Krebs 1989, Chaiken et al. 1994, Brainard & Doupe 2002, Eriksen et al. 2011, Araya-Salas & Wright 2013). Laboratory studies are biased towards closed-ended learners with simple songs (a small repertoire of syllables in repeated sequences), and most field experiments were also conducted on such birds (Jenkins 1978, Mennill et al. 2018). Meanwhile, evidence for the learning of particular song elements is scarce for species with complex songs (a large repertoire of syllables in various orders). For example, in the pied flycatcher (Ficedula hypoleuca), it has been shown that adult males were able to imitate unfamiliar syllables in playback tests in the field (Eriksen et al. 2011). Further studies on similar species would be of particular importance because the underlying learning mechanisms in open-ended learners with complex songs potentially involve many elements with potentially different functions (Garamszegi et al. 2012).
The demonstration of vocal imitation in species with complex songs is a challenging task for at least two reasons. First, ideally one should study natural systems, because individuals may not sing the whole repertoire in captivity, and/or may not be as responsive to social stimuli in the laboratory as in the wild (Rivera-Gutierrez et al. 2011). Second, learning should be proven experimentally; otherwise, it is impossible to tell whether a newly detected element in the repertoire is the result of learning from an immediate vocal interaction, or whether it was already known and the current stimulus merely recalled it from memory. As an oscine, the collared flycatcher (Ficedula albicollis) is strongly assumed to learn its song elements (Kroodsma & Miller 2016). Furthermore, song learning has been demonstrated experimentally in the closely related pied flycatcher (Eriksen et al. 2011). We also know that both temporal and spatial variations in repertoire content exist at the population level, which implies a role for social learning in this species (Vaskuti et al. 2016), but alternative explanations (such as genetic drift) cannot be ruled out. Here, we aim to study how frequently collared flycatcher males imitate syllables in territorial interactions using a playback design. We played back modified songs of the same species that included syllables unknown to the population. We predicted that when imitation occurs, the novel elements would be detectable in the song of the focal males.

Preparation of playback sequences
Each playback tutorial sequence was built from three different sources of syllables: recordings made in 2017 at the experimental site (Source 1); foreign syllables originating from recordings made in distant places (Source 2); and artificially modified syllables (Source 3). We considered the syllables from Source 1 as known and those from Sources 2 and 3 as unknown to the focal population (Fig. 1). All of these syllables were taken from recordings of the best available quality (low background noise and no vocal disturbance from other birds). Source 1 recordings were used to generate the baseline sequence of syllables into which tutorial syllables from Sources 2 and 3 were inserted (Fig. 1a).
The syllables from Source 2 were obtained from song recordings downloaded from the Xeno-Canto website (www.xeno-canto.org) and originating from several countries of Europe (Supplementary Table 1). The minimum distance of these recordings from our study sites was ca. 400 km and the maximum distance was ca. 1300 km (871 ± 291 km in mean ± SD). We assumed that the syllables from these recordings have species-specific characteristics and include population-specific syllables that are unknown to the males in the studied Hungarian population.
The modified syllables (Source 3) originated from the same area as Source 1 syllables, but they were modified to create novel syllable types. To carry out this manipulation we used the "Pitch shifter" function of Adobe Audition 3.0 (Adobe Systems Inc.). With this tool we shifted the frequency of the syllables, while their length remained the same. The modified syllables remained within the frequency range that is typical for the species, but had a frequency profile that was unknown in the population.
To ensure that the tutorial syllables (Sources 2 and 3) were not present in the repertoire of the local population, we conducted a thorough search in our long-term syllable library (see supplementary material). Altogether 39 syllable types from 16 recordings (Source 2 and Source 3) were used in our experiment.
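The distances quoted above can be reproduced from coordinates with a great-circle calculation. The following is a minimal sketch, not the procedure used in the study: it assumes the haversine formula with a mean Earth radius, takes the study-site coordinates reported in the Field procedure section, and uses a hypothetical recording location that is not taken from Supplementary Table 1.

```python
import math

# Study-site coordinates as reported in the Field procedure section
# (47°43'16''N, 18°59'56''E), converted to decimal degrees.
STUDY_SITE = (47 + 43 / 60 + 16 / 3600, 18 + 59 / 60 + 56 / 3600)

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


# Hypothetical recording location, for illustration only.
example_recording = (45.0, 9.0)
distance = haversine_km(STUDY_SITE, example_recording)
```

Applying the same function to every Source 2 recording would give the list of distances from which a minimum, maximum, mean and SD can be taken.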

Field procedure
The playback tests were performed in April and May of 2018-2019 in the Pilis-Visegrádi Mts., Hungary (47°43'16''N, 18°59'56''E) within a range of around eight kilometers. Males involved in the study were free-living individuals occupying natural tree holes. Males were identified as unpaired by their conspicuous courtship behaviour (i.e. singing and displaying). After finding a suitable male, we placed the playback installation 4-6 m high on a tree trunk about 20 meters away from the nest hole of the focal male. With this setup, we imitated a newly arrived singing conspecific neighbour. The volume of the speaker was set by a human listener to obtain a natural sound intensity similar to that of singing males. After starting the playback (around 6-8 AM), we simultaneously recorded around 50 good quality songs from the focal male, and then we left the area and left the playback running. We returned to the focal territory after 4-5 hours and recorded another set of ca. 50 songs, just before remounting the installation and terminating the playback for that given day. On subsequent days, we repeated this procedure until the focal male had paired and stopped singing. This approach resulted in recordings from 26 males, spanning 2-6 days per male (3.8 ± 1.5 days in mean ± SD) and including 2-16 successful recordings (7.9 ± 3.7 recordings in mean ± SD) from each male.

Detection of the tutorial syllable types in the recordings
In the first step of syllable detection, we scanned the recordings for the presence of the tutorial syllable types. We first used a spectrographic cross-correlation approach with the 'monitoR' package (Hafner & Katz 2018) in R (R Core Team 2019) to detect candidate syllables that could potentially represent learnt syllables. To do so, for each tutorial syllable type, we built a filter window based on the minimum and maximum frequency of the template syllable, to narrow down the automatic scanning to the appropriate frequency range and to remove the effect of background noise outside this range. To determine the detection threshold, we used the part of the recordings that contained the playback songs from the speaker, where we were certain that the tutorial syllable appeared in the recording. The cross-correlation values between the template syllables and their corresponding syllables retrieved from the recordings were between 0.55 and 0.88 (0.68 ± 0.09 in mean ± SD). Therefore, we set the detection threshold at a cross-correlation cut-off value of 0.55 for the automatic selection of candidate syllables potentially representing true copies (see supplementary material in the online version of this article).
In the second step, we manually screened the candidate syllables to eliminate false positives by visual inspection of the spectrographic representation of the syllables. The final judgement by human observers was necessary to draw conclusions about qualitative matches while allowing for some level of variance within the same syllable type. The conclusions of the visual inspection were finally confirmed by the three authors to reach a consensus on which cases represented learnt syllable types.
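The logic of this automatic step can be illustrated with a small sketch. It is not the monitoR implementation; it assumes that the spectrogram and the template are already available as amplitude matrices restricted to the template's frequency band, and it scores each time offset with a plain Pearson correlation. The function names and the toy matrices are hypothetical; only the 0.55 cut-off comes from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def scan(spectrogram, template, threshold=0.55):
    """Slide the template along the time axis; return (offset, score) candidates.

    Both arguments are lists of frequency rows, each row holding one amplitude
    per time frame, with the same number of rows (the filtered frequency band).
    """
    width = len(template[0])
    n_frames = len(spectrogram[0])
    flat_template = [v for row in template for v in row]
    hits = []
    for offset in range(n_frames - width + 1):
        window = [v for row in spectrogram for v in row[offset:offset + width]]
        score = pearson(flat_template, window)
        if score >= threshold:
            hits.append((offset, score))
    return hits


# Toy data: a 2-band template embedded at time frame 2 of a short recording.
template = [[0, 1, 0],
            [1, 0, 1]]
recording = [[0, 0, 0, 1, 0, 0, 0],
             [0, 0, 1, 0, 1, 0, 0]]
hits = scan(recording, template)
```

Every offset whose score reaches the threshold is kept as a candidate only; as described in the second step, candidates still need visual inspection before they are accepted as copies.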

Results
Based on our screening routine, we found that one tutorial syllable type appeared in the recordings of one of the 26 males involved in the tutoring tests. We detected 11 instances of this template-like tutorial syllable type in the given individual (Fig. 2). The first instance appeared in the songs from the second recording of the first day. Like the original, all the copied syllables were between 4 and 6 kHz in frequency and 0.2-0.25 seconds long, and had a similar structure, with a shorter higher-frequency part (5-6 kHz) and a longer lower-frequency part (4-5 kHz). Differences arose mainly in the relative duration of these parts or the duration of the whole syllable, and in whether the frequency track of each part was slightly decreasing or increasing. The cross-correlation scores between the instances and the template were between 0.590 and 0.652 (0.614 ± 0.020 in mean ± SD).

Discussion
In general, we found weak evidence for the learning of new syllable types in the collared flycatcher, as most of the tested males did not incorporate novel elements from the playback recordings into their songs. Below, we provide a critical interpretation of these results, and then, based on the shortcomings we experienced, we provide some methodological recommendations that can be used to improve future studies.
We cannot be sure that the individual that produced syllables similar to one of the playback stimuli actually learnt the template syllable. In the process of learning, syllables go through crystallization, after which they are sung with relatively low variation (Read & Weary 1992, Tumer & Brainard 2007). As our test examines a period close to the imitative learning of syllables, the learnt element would not be expected to match the tutorial syllable type exactly on the sonograms. Therefore, such learning mistakes should be considered when detecting imitation events (Marler 1970, Slater et al. 1988). This may warrant more permissive approaches to syllable categorization that allow a certain degree of within-individual variation in the same syllable type; however, the extent of such mistakes remains unknown in our model species. Accordingly, we cannot be sure whether the detected similarity between the template syllable and the 11 template-like syllables is due to true learning (variations in Fig. 2 capture the variance of the same syllable type) or due to an observer effect (variations in Fig. 1 capture the among-syllable type variance).
Even if we accept the above instances from a single male as evidence for successful learning of the template syllable type, we must conclude that the success rate of our tutoring test was low (it would mean that only 3.8% of males were able to pick up a new syllable type from a playback). Several reasons could explain the low rate of learning of novel syllable types by the males in our tests. First, it is possible that the chosen stimulus did not achieve a sufficiently natural effect, so the constructed playback sequences may not have been suitable for inducing a biological response from the focal birds. For example, although we aimed to mimic a natural situation with the structure of the playback sequences, we repeated the same set of sequences several times. Furthermore, the playback songs were played from exactly the same location and at the same volume, which may also have represented an unnatural situation. Also, we performed the tests in the absence of a visual stimulus that would establish a particular social context, while the presence of a live tutor might be necessary for successful learning (Rice & Thompson 1968, Kroodsma & Pickert 1984, Baptista & Petrinovich 1984, 1986, Chaiken et al. 1993). Playback sequences that better reflect the natural variance of song content, or more elaborate playback conditions (visual stimulus, varying volume and direction of playback), might have led to better results (Beecher & Burt 2004).
We believe that our recording regime was sufficient to recover the learnt syllables, as previous studies showed that 20 songs are enough to reliably describe the song repertoire of a male collared flycatcher; in particular, the vast majority of the syllables known by an individual are already produced within 15 songs (Garamszegi et al. 2002, Garamszegi et al. 2012). We recorded 100 songs daily for 2-6 days to reveal the repertoire of each collared flycatcher male. Nonetheless, it is plausible that rarely sung, newly acquired syllables might occur only later, after the playback procedure (Chaiken et al. 1994, Kiefer et al. 2010). We also cannot exclude the possibility that collared flycatcher males do not copy syllables from each other during the courtship period, but instead learn novel song elements outside the breeding sites (Sorensen et al. 2016).
Despite the above remarks, our study points out some important phenomena that should be considered in similar tutoring tests in species with complex songs and that could be used to design more robust experiments. We would especially like to emphasise the problem of learning mistakes, which may lead to extra variation in the physical structure of the learnt syllable and thus raise uncertainty around judgements about imitative learning. Additional data processing techniques, like cluster analysis of syllables (e.g. software KOE https://koe.io.ac.nz, Fukuzawa et al. 2020), might reveal the learnt syllables in a more sensitive way than the spectrographic cross-correlation technique we used. The structural variation of syllables could also be analysed along the sequence of recordings from each male, under the prediction that learning mistakes decrease as the male practises the acquired syllables, so that the within-individual variance of the same syllable type should decrease over time. Future studies in this direction may provide insights into the detailed mechanisms of vocal learning in general.
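The 3.8% figure follows from one presumed learner among 26 tested males. The sketch below reproduces that arithmetic and, as an illustration only, attaches a Wilson score interval to show how uncertain a proportion based on a single event is. The interval is not part of the original analysis, and the choice of the Wilson method and of z = 1.96 is an assumption made here.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


n_males = 26       # males involved in the tutoring tests
n_learners = 1     # males with template-like syllables in their recordings

rate = n_learners / n_males                 # proportion of presumed learners
low, high = wilson_interval(n_learners, n_males)
print(f"success rate: {rate:.1%}, interval: {low:.1%} to {high:.1%}")
```

With so few events the interval is wide, which is consistent with treating the observed rate as weak evidence rather than as a precise estimate of how often learning occurs.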
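The last prediction can be made concrete with a short sketch. It assumes that one structural feature, here syllable duration in seconds, has been measured for every instance of the putatively learnt syllable, grouped by recording session. The values and the session numbering are hypothetical and serve only to show the shape of the test; they are not data from this study.

```python
from statistics import variance

# Hypothetical durations (s) of one syllable type, keyed by recording session.
sessions = {
    1: [0.20, 0.25, 0.21, 0.24],
    2: [0.21, 0.24, 0.22, 0.23],
    3: [0.22, 0.23, 0.22, 0.23],
}


def variance_by_session(data):
    """Return [(session, sample variance)] ordered by session index."""
    return [(s, variance(values)) for s, values in sorted(data.items())]


def is_decreasing(values):
    """True if each value is strictly smaller than the one before it."""
    return all(b < a for a, b in zip(values, values[1:]))


variances = [v for _, v in variance_by_session(sessions)]
decreasing = is_decreasing(variances)
```

A strict monotonic check is deliberately simple. With real recordings, a trend test across sessions and several structural features per syllable would probably be more appropriate.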

Fig. 1. Playback sequences used for tutoring: (A) Spectrogram of a song we played back. Each song contained syllables originating from the local population, tutorial syllables originating from a foreign population and syllables that were modified artificially. (B) The block diagram of the song sequences. The sequences of the songs were arranged in a natural way including shorter and longer pauses.

Fig. 2. Spectrograms of the tutorial syllable originating from an Italian population (recording number in Xeno-Canto: XC375479) and its potential copies found in the songs of one tutored individual. The tutorial syllable type is indicated with a bold frame in the upper left corner.