Background

Most research on the behavior and ecology of wild animal populations requires that study subjects be individually recognizable. Individual identification is necessary to ensure unbiased data collection and to account for individual variation in the variables of interest. For short-term studies, researchers may rely on unique methods for identification based on conspicuous natural variation among individuals at the time of data collection, such as differences in body size and shape or the presence of injuries and scars. These methods may or may not allow for identification of individuals at later dates. To address many evolutionary questions, however, it is necessary to collect data on known individuals over long periods of time [1]. Indeed, longitudinal studies are essential for characterizing life history parameters, trait heritability, and fitness effects (reviewed in [1]). Consequently, they are invaluable for identifying the demographic and evolutionary processes influencing wild animal populations [1].

Unfortunately, longitudinal monitoring can be challenging, particularly for long-lived species. One of the primary challenges researchers face is establishing methods for individual identification that allow multiple researchers to collect consistent and accurate demographic and behavioral data over long periods of time (in some cases several decades). Current methods for individual identification often involve either capturing and tagging animals with unique identifiers, such as combinations of colored collars and/or tags [2–5], or taking advantage of natural variation in populations (e.g., scars, skin and pelage patterns) and relying on researchers’ knowledge of individual differences [6–9]. The former method (or a combination of the two methods) has been used in some of the best established long-term field studies, such as the St. Kilda Soay Sheep and Isle of Rum Red Deer Projects [2, 3], as well as the Wytham Tit and Galápagos Finch Projects [4, 5]. Because they have long-term (multi-generation) data on known individuals, these projects have contributed substantially to the field of evolutionary biology by documenting how and why populations change over time (e.g., [10–13]).

Similar methods involving capturing and collaring have been used in many longitudinal studies of wild primates, such as owl monkeys [14], titi monkeys [15], colobines [16], and in particular, many Malagasy lemurs [17–20]. Through the long-term monitoring of individuals, many of these studies have provided important data on longevity, lifetime reproductive success, and dispersal patterns [15, 17, 18, 20–23].

Despite its utility for many longitudinal studies, the tagging process might sometimes be inappropriate or otherwise impractical. Tagging often requires that study subjects be captured via mist netting or in nest boxes (for birds) [4, 5], trapping (e.g., Sherman traps or corrals for some mammals) [2, 3, 24], and, in the case of some larger mammals, including many primates, darting via blow gun or air rifle [10, 25–27]. Capturing has several advantages, such as enabling data to be collected that would otherwise be impossible (e.g., blood samples, ectoparasites), but it can also be expensive, often making it unfeasible for studies with large sample sizes and/or those conducted over large spatial and temporal scales. Furthermore, capturing and tagging may pose additional risks to already threatened species. For example, such methods have been shown in some cases to cause acute physiological stress responses [16], tissue damage [28] and injury (e.g., broken bones, paralysis) [29], as well as disrupt group dynamics, and pose risks to reproduction, health, and even life [29–32].

An alternative method for individual identification relies on researcher knowledge of variation in individual appearances. It is less invasive and removes some of the potential risks associated with capturing and tagging. Such methods have been successfully used in long-term studies of elephants, great apes, and baboons (among others) and have provided similarly rich long-term datasets that have been used to address demographic and evolutionary questions [6–9]. However, this method is more vulnerable to intra- and inter-observer error and thus can require substantial training. Moreover, for research sites involving multiple short-term studies in which researchers may use different methods for individual identification, it can be difficult to integrate data [33]. Additionally, long-term research is often hindered by disruptions to data collection (e.g., between studies, due to lack of research funds, political instability [1]). These breaks can result in lapses of time during which no one is present to document potential changes to group compositions and individual appearances, which can also complicate integrating data collected at different time points.

Under such circumstances, projects would benefit from a database of individual identifications, as well as a rapid method for identifying individuals that requires little training and can be used across different field seasons and researchers. The field of animal biometrics offers some solutions [34]. For example, some methods that have shown promise in mammalian (among other) research, including studies of cryptic animals, combine photography with computer-assisted individual identification programs to facilitate long-term systematic data collection (e.g., cheetahs: [35]; tigers: [36]; giraffes: [37]; zebras: [38]). These methods use quantifiable aspects of appearances to identify individuals based on probable matches in the system [34]. Because assignments are based on objective measures, these methods can minimize intra- and inter-observer error and facilitate integrating data collected across different studies [34]. At the same time, in study populations with large sample sizes, researchers might be limited in the number of individuals known on-hand. Computer-assisted programs can facilitate processing data to rapidly identify individuals when datasets are large, which reduces the limitations on sample size/scale imposed by the previous methods [34].

Despite their potential utility, such methods have not been incorporated into most studies of wild primates. For wild lemur populations in particular, capture and collar methods remain common despite their drawbacks [17–20]. As a result, multi-generation studies of lemur populations that incorporate individual identification are limited.

Here we present a method in development for non-invasive individual identification of wild lemurs that can help mitigate some of the disadvantages associated with other methods, while also facilitating long-term research (Table 1). Our system, called LemurFaceID, utilizes computer facial recognition methods, developed by the authors specifically for lemur faces, to identify individual lemurs based on photographs collected in wild populations [39].

Table 1 Individual identification methods

Facial recognition technology has made great strides in its ability to successfully identify humans [40], but this aspect of computer vision has much untapped potential. Facial recognition technology has only recently expanded beyond human applications. While there has been limited work with non-human primates [41, 42], to our knowledge, facial recognition technology has not been applied to any of the >100 lemur species. However, many lemurs possess unique facial features, such as hair/pelage patterns, that make them appropriate candidates for applying modified techniques developed for human facial recognition (Fig. 1).

Fig. 1

Examples of different lemur species. Photos by David Crouse (Varecia rubra, Eulemur collaris, and Varecia variegata at the Duke Lemur Center), Rachel Jacobs (Eulemur rufifrons in Ranomafana National Park), and Stacey Tecot (Hapalemur griseus, Eulemur rubriventer in Ranomafana National Park; Propithecus deckenii in Tsingy de Bemaraha National Park; Indri indri in Andasibe National Park)

We focus this study on the red-bellied lemur (Eulemur rubriventer). Males and females in this species are sexually dichromatic with sex-specific variation in facial patterns ([43]; Fig. 2). Males exhibit patches of white skin around the eyes that are reduced or absent in females. In addition, females have a white ventral coat (reddish-brown in males) that variably extends to the neck and face. Facial patterns are individually variable, and the authors have used this variation to identify individuals in wild populations, but substantial training was required. Since the 1980s, a population of red-bellied lemurs has been studied in Ranomafana National Park, Madagascar [44–47], but because researchers used different methods for individual identification, gaps between studies make it difficult to integrate data. Consequently, detailed data on many life history parameters for this species are lacking. A reliable individual identification method would help provide these critical data for understanding population dynamics and addressing evolutionary questions.

Fig. 2

Red-bellied lemurs. The individual on the right is female, and the individual on the left is male

In this paper we report the method and accuracy results of LemurFaceID, as well as its limitations. This system uses a relatively large photographic dataset of known individuals, patch-wise Multiscale Local Binary Pattern (MLBP) features, and an adapted Tan and Triggs [48] approach to facial image normalization to suit lemur face images and improve recognition accuracy.

Our initial effort (using a smaller dataset) was focused on making parametric adaptations to a face recognition system designed for human faces [49]. This system used both MLBP features and Scale Invariant Feature Transform (SIFT) features [50, 51] to characterize face images. Our initial effort exhibited low performance in recognition of lemur faces (73% rank-1 recognition accuracy). In other words, for a given query, the system reported the highest similarity between the query and the true match in the database only 73% of the time. Examination of the system revealed that the SIFT features were sensitive to local hair patterns. As matting of hair changed from image to image, the features changed substantially and therefore reduced match performance. The high dimensionality of the SIFT features also may have led to overfitting and slowing of the recognition process. Because of this, the use of SIFT features was abandoned in the final recognition system.

While still adapting methods originally developed for humans, LemurFaceID is specifically designed to handle lemur faces. We demonstrate that the LemurFaceID system identifies individual lemurs with a level of accuracy that suggests facial recognition technology is a potentially useful tool for long-term research on wild lemur populations.

Methods

Data collection

Study species

Red-bellied lemurs (Eulemur rubriventer) are small to medium-sized (~2 kg), arboreal, frugivorous primates, and they are endemic to Madagascar’s eastern rainforests [46, 52] (Fig. 3a). Despite their seemingly widespread distribution, the rainforests of eastern Madagascar have become highly fragmented [53], resulting in an apparent patchy distribution for this species. It is currently listed by the IUCN as Vulnerable with a decreasing population trend [54].

Fig. 3

Map of Madagascar and study site. a Range of E. rubriventer, modified from the IUCN Red List (www.iucnredlist.org). Range data downloaded May 26, 2016. Ranomafana National Park (RNP) is shown within the grey outline and depicted in black. b RNP depicting all photograph collection sites. Modified from [74], which is published under a CC BY License

Study site

Data collection for this study was concentrated on the population of red-bellied lemurs in Ranomafana National Park (RNP). RNP is approximately 330 km² of montane rainforest in southeastern Madagascar [22, 55] (Fig. 3b). Red-bellied lemurs in RNP have been the subjects of multiple research projects beginning in the 1980s [44–47].

Dataset

Our dataset consists of 462 images of 80 red-bellied lemur individuals. Each individual had a name (e.g., Avery) or code (e.g., M9VAL) assigned by researchers when it was first encountered. Photographs of four individuals are from the Duke Lemur Center in North Carolina, while the remainder are from individuals in RNP in Madagascar. The number of images (1–21) per individual varies. The dataset only includes images that contain a frontal view of the lemur’s face with little to no obstruction or occlusion. The dataset comprises images with a large range of variation; these include images with mostly subtle differences in illumination and focus (generally including subtle differences in gaze; ~25%), as well as images with greater variation (e.g., facial orientation, the presence of small obstructions, illumination and shadows; ~75%). Fig. 4 contains a histogram of the number of images available per individual. Amateur photographers captured photos from RNP using a Canon EOS Rebel T3i with 18–55 and 75–300 mm lenses. Lemurs were often at heights between 15–30 m, and photos were taken while standing on the ground. Images from the Duke Lemur Center were captured with a Google Nexus 5 or an Olympus E-450 with a 14–42 mm lens. Lemurs were in low trees (0–3 m), on the ground, or in enclosures, and photos were taken while standing on the ground.

Fig. 4

Number of images per individual

The majority of images taken in Madagascar were captured from September 2014 to March 2015, though some individuals had images captured as early as July 2011. Images from the Duke Lemur Center were captured in July 2014. Due to the longer duration of the image collection in Madagascar, there was some difficulty establishing whether certain individuals encountered in 2014 had been encountered previously. In three cases, there are photographs in the dataset labeled as belonging to two separate individuals that might be of the same individual. These images were treated as belonging to separate individuals when partitioning the dataset for experiments, but if images that might belong to a single individual were matched together, it was counted as a successful match. Figure 5 illustrates the facial similarities and variations present in the dataset. Figure 5a illustrates the similarities and differences between the 80 wild individuals (inter-class similarity), while Fig. 5b shows different images of the same individual (intra-class variability). In addition to the database of red-bellied lemur individuals, a database containing lemurs of other species was assembled. This database includes 52 images of 31 individuals from Duke Lemur Center and 138 images of lemurs downloaded using an online image search through Google Images. We used only those images with no apparent copyrights. These images were used to expand the size of the gallery for lemur identification experiments.

Fig. 5

Variation in lemur face images. a Inter-class variation. b Intra-class variation. Some images in this figure are modified (i.e., cropped) versions of images that have been previously published in [74] under a CC BY License

Recognition system

Figure 6 illustrates the operation of our recognition system (LemurFaceID). This system was implemented using the OpenBR framework (openbiometrics.org; [56]).

Fig. 6

Flowchart of LemurFaceID. Linear discriminant analysis (LDA) is used for reducing feature vector dimensionality to avoid overfitting

Image pre-processing

Eye locations have been found to be critical in human face recognition [40]. The locations of eyes are critical to normalizing the facial image for in-plane rotation. We were unable to design and train a robust eye detector for lemurs because our dataset was not sufficiently large to do so. For this reason, we used manual eye location. Prior to matching, the user marks the locations of the lemur’s eyes in the image. Using these two points, with the right eye as the center, a rotation matrix M is calculated to apply an affine transformation to align the eyes horizontally. Let lex, ley, rex, and rey represent the x and y coordinates of the left and right eyes, respectively. The affine matrix is defined as:

$$ M = \left[\begin{array}{ccc} 1 & 0 & rex \\ 0 & 1 & rey \\ 0 & 0 & 1 \end{array}\right] \times \left[\begin{array}{ccc} \cos\left(\theta\right) & -\sin\left(\theta\right) & 0 \\ \sin\left(\theta\right) & \cos\left(\theta\right) & 0 \\ 0 & 0 & 1 \end{array}\right] \times \left[\begin{array}{ccc} 1 & 0 & -rex \\ 0 & 1 & -rey \\ 0 & 0 & 1 \end{array}\right], \qquad \theta = \operatorname{atan}\left(\frac{ley-rey}{lex-rex}\right) $$

The input image is rotated by the matrix M and then cropped based on the eye locations. Rotation is applied prior to cropping so that the area cropped will be as accurate as possible. The Inter-Pupil Distance (IPD) is taken as the Euclidean distance between the eye points. The image is cropped so that the eyes are \( \frac{IPD}{2} \) pixels from the nearest edge and 0.7 × IPD pixels from the top edge, with a total dimension of IPD × 2 pixels square. This image is then resized to the final size of 104 × 104 pixels, which facilitates the patch-wise feature extraction scheme described below. This process is illustrated in Fig. 7. Following rotation and cropping, the image is converted to gray-scale and normalized. Although individual lemurs do show variation in pelage/skin coloration, we disregard color information from the images. In human face recognition studies, skin color is known to be sensitive to illumination conditions and therefore is not considered to be a reliable attribute [57, 58].
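The alignment and cropping geometry can be sketched as follows. This is a minimal illustration and not the OpenBR implementation: it assumes that the two outer factors of M are translations to and from the right eye, that the crop is anchored on the eye with the smaller x coordinate, and that a point in the original image is moved into the aligned frame by the inverse of M (a warping routine that expects an output-to-input map would instead use M directly). The function names are hypothetical.

```python
import numpy as np

def eye_alignment_matrix(lex, ley, rex, rey):
    """Compose M = T(right eye) @ R(theta) @ T(-right eye) from the marked eyes."""
    theta = np.arctan((ley - rey) / (lex - rex))  # assumes lex != rex
    c, s = np.cos(theta), np.sin(theta)
    to_origin = np.array([[1, 0, -rex], [0, 1, -rey], [0, 0, 1]], dtype=float)
    rotate = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=float)
    back = np.array([[1, 0, rex], [0, 1, rey], [0, 0, 1]], dtype=float)
    return back @ rotate @ to_origin

def to_aligned(M, x, y):
    """Position of an original-image point in the aligned frame (assumed convention)."""
    px, py, _ = np.linalg.inv(M) @ np.array([x, y, 1.0])
    return px, py

def crop_box(aligned_left, aligned_right):
    """Return (x0, y0, side) of the square crop around two level eye points."""
    ipd = float(np.hypot(aligned_left[0] - aligned_right[0],
                         aligned_left[1] - aligned_right[1]))
    x0 = min(aligned_left[0], aligned_right[0]) - ipd / 2  # IPD/2 to the side edge
    y0 = aligned_right[1] - 0.7 * ipd                      # 0.7 x IPD to the top edge
    return x0, y0, 2 * ipd                                 # square of side 2 x IPD
```

For eyes marked at (140, 90) and (100, 60), the IPD is 50 pixels, the levelled left eye lands at (150, 60), and the crop is a 100-pixel square with its top-left corner at (75, 25), which would then be resized to 104 × 104 pixels.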

Fig. 7

Eye selection, rotation, and cropping of a lemur image

Since the primary application of the LemurFaceID system is to identify lemurs from photos taken in the wild, the results must be robust with respect to illumination variations. To reduce the effects of ambient illumination on the matching results, a modified form of the illumination normalization method outlined by Tan and Triggs [48] is applied. The image is first convolved with a Gaussian filter with σ = 1.1, and is then gamma corrected (γ = 0.2). A Difference of Gaussians (DoG) operation [48] (with parameters σ₁ and σ₂ corresponding to the standard deviations of the two Gaussians) is subsequently performed on the image. This operation eliminates small-scale texture variations and is traditionally performed with σ₁ = 1 and σ₂ = 2. In the case of lemurs, there is an ample amount of hair with a fine texture that varies from image to image within individuals. This fine texture could confuse the face matcher, as changes in hair orientation would result in increased differences between face representations. To reduce this effect in the normalized images, σ₁ is set to 2. The optimal value of σ₂ was empirically determined to be 5. The result of this operation is then contrast equalized using the method outlined in Tan and Triggs [48], producing a face image suitable for feature extraction. Figure 8 illustrates a single lemur image after each pre-processing step.
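The normalization chain described above can be sketched in NumPy as follows. The smoothing, gamma, and DoG parameters are those given in the text; the contrast-equalization constants (alpha = 0.1, tau = 10) are the defaults from Tan and Triggs and are an assumption here, since the text does not state them. The reflect padding and the three-sigma kernel radius are also illustrative choices, not a description of the OpenBR code.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with reflect padding (kernel radius = 3 sigma)."""
    radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="reflect")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def normalize_illumination(gray, sigma0=1.1, gamma=0.2, sigma1=2.0, sigma2=5.0,
                           alpha=0.1, tau=10.0, eps=1e-12):
    """Smooth, gamma correct, apply DoG, then contrast equalize a gray image."""
    img = gaussian_blur(np.asarray(gray, dtype=float), sigma0)
    img = np.maximum(img, 1e-6) ** gamma                 # gamma correction
    dog = gaussian_blur(img, sigma1) - gaussian_blur(img, sigma2)
    # Two-stage contrast equalization (assumed Tan and Triggs defaults).
    dog = dog / (np.mean(np.abs(dog) ** alpha) ** (1 / alpha) + eps)
    dog = dog / (np.mean(np.minimum(tau, np.abs(dog)) ** alpha) ** (1 / alpha) + eps)
    return tau * np.tanh(dog / tau)                      # compress extreme values
```

Because the gamma step turns a global brightness factor into a constant multiplier and the equalization step divides that multiplier out, an image and a uniformly brighter copy of it yield essentially the same output in this sketch.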

Fig. 8

Illumination normalization of a lemur image

Feature extraction

Local Binary Pattern (LBP) representation is a method of characterizing local textures in a patch-wise manner [50]. Each pixel in the image is assigned a value based on its relationship to the surrounding pixels, specifically based on whether each surrounding pixel is darker than the central pixel or not. Out of the 256 possible binary patterns in a 3 × 3 pixel neighborhood, 58 are defined as uniform (having no more than 2 transitions between “darker” and “not darker”) [50]. The image is divided into multiple patches (which may or may not overlap), and for each patch a histogram of the patterns is developed. Each of the 58 uniform patterns occupies its own bin, while the non-uniform patterns occupy a 59th bin [50]. This histogram makes up a 59-dimensional feature vector for each patch. In our recognition system, we use 10 × 10 pixel patches, overlapping by 2 pixels on a side. This results in 144 total patches for the 104 × 104 face image.
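The counts quoted above (58 uniform patterns, 59 histogram bins, 144 patches) can be checked with a short sketch. It uses the basic 3 × 3 neighborhood and assumes that patches start every 8 pixels (patch size minus overlap) and that only patches lying fully inside the image are kept; the function names are hypothetical.

```python
import numpy as np

def transitions(pattern):
    """Number of 0/1 changes when the 8 bits are read around the circle."""
    bits = [(pattern >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))

# Uniform patterns get bins 0..57; all other patterns share bin 58.
UNIFORM = [p for p in range(256) if transitions(p) <= 2]
BIN_OF = {p: i for i, p in enumerate(UNIFORM)}

def lbp_codes(gray):
    """8-bit LBP code for each interior pixel, using the 3 x 3 neighborhood."""
    gray = np.asarray(gray)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    ring = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dr, dc) in enumerate(ring):
        neighbor = gray[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc]
        codes |= (neighbor < center).astype(np.int64) << bit  # 1 if darker
    return codes

def patch_histogram(codes):
    """59-bin histogram of the LBP codes falling inside one patch."""
    hist = np.zeros(59)
    for code in np.asarray(codes).ravel():
        hist[BIN_OF.get(int(code), 58)] += 1
    return hist

def patch_origins(size=104, patch=10, overlap=2):
    """Top-left corners of all patches that fit inside a size x size image."""
    starts = range(0, size - patch + 1, patch - overlap)
    return [(r, c) for r in starts for c in starts]
```

Under these assumptions there are 12 patch positions per side, which gives the 144 patches stated in the text.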

Multi-scale Local Binary Pattern (MLBP) features are a variation on LBP which use surrounding pixels at different radii from the central pixel [50], as shown in Fig. 9. For this application, we used radii of 2, 4, and 8 pixels. Therefore, each patch generates 3 histograms, one per radius, each of which is normalized, and then concatenated and normalized again, both times by L2 norm. This process results in a 177-dimensional feature vector for each 10 × 10 patch. Figure 10 shows an example of three face images of the same individual with an enlarged grid overlaid. As demonstrated by the highlighted areas, patches from the same area in each image will be compared in matching.
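A minimal sketch of the per-patch MLBP descriptor follows. It assumes eight neighbors per radius sampled at whole-pixel offsets (horizontal, vertical, and diagonal) with edge padding, rather than the interpolated circular sampling that LBP implementations commonly use, so it illustrates the histogram bookkeeping rather than reproducing the system's exact features. The function names are hypothetical.

```python
import numpy as np

def _transitions(p):
    bits = [(p >> i) & 1 for i in range(8)]
    return sum(bits[i] != bits[(i + 1) % 8] for i in range(8))

# Lookup table: uniform patterns map to bins 0..57, all others to bin 58.
_uniform = [p for p in range(256) if _transitions(p) <= 2]
LUT = np.full(256, 58)
LUT[_uniform] = np.arange(58)

def lbp_bins(gray, radius):
    """Uniform-LBP bin per pixel, with neighbors `radius` pixels away."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    padded = np.pad(gray, radius, mode="edge")
    ring = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros((h, w), dtype=np.int64)
    for bit, (dr, dc) in enumerate(ring):
        r0, c0 = radius + dr * radius, radius + dc * radius
        neighbor = padded[r0:r0 + h, c0:c0 + w]
        codes |= (neighbor < gray).astype(np.int64) << bit
    return LUT[codes]

def l2_normalize(v):
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

def mlbp_patch_feature(gray, top, left, patch=10, radii=(2, 4, 8)):
    """Concatenate one normalized 59-bin histogram per radius (3 x 59 = 177)."""
    parts = []
    for radius in radii:
        bins = lbp_bins(gray, radius)[top:top + patch, left:left + patch]
        hist = np.bincount(bins.ravel(), minlength=59).astype(float)
        parts.append(l2_normalize(hist))
    return l2_normalize(np.concatenate(parts))
```

For clarity the sketch recomputes the LBP map for every patch; a practical version would compute the three maps once per image and slice them.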

Fig. 9

Local binary patterns of radii 1, 2, and 4. Image from https://upload.wikimedia.org/wikipedia/commons/c/c2/Lbp_neighbors.svg, which is published under the GNU Free Documentation License, Version 1.2 under the Creative Commons

Fig. 10

Patches and corresponding LBP histograms compared across different images of a single lemur (Avery)

To extract the final feature vector, linear discriminant analysis (LDA) is performed on the 177-dimensional feature vector for each patch. LDA transforms the feature vector into a new, lower-dimensional feature vector such that the new vector still captures 95% of the variation between individuals, while minimizing the amount of variation between images of the same individual. For this transformation to be robust, a large training set of lemur face images is desirable. LDA is trained on a per-patch basis to limit the size of the feature vectors considered. The resulting vectors for all the patches are then concatenated and normalized to produce the final feature vector for the image. Because each patch undergoes its own dimensionality reduction, the final dimensionality of the feature vector will vary from one training set to another. The LemurFaceID system reduces the mean size of the resultant image features from 396,850 dimensions to 7,305 dimensions.
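The per-patch reduction can be illustrated with a generic Fisher LDA sketch. This is not the OpenBR routine, whose internals may differ: the ridge term, the whitening approach, and the reading of "95% of the variation between individuals" as 95% of the summed between-individual eigenvalues are all assumptions made for illustration. It requires training features from at least two individuals.

```python
import numpy as np

def fit_patch_lda(features, labels, keep=0.95, ridge=1e-3):
    """Return (mean, projection) for one patch; rows of `features` are images."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    mean_all = features.mean(axis=0)
    dim = features.shape[1]
    within = np.zeros((dim, dim))
    between = np.zeros((dim, dim))
    for label in np.unique(labels):
        group = features[labels == label]
        mean_g = group.mean(axis=0)
        centered = group - mean_g
        within += centered.T @ centered              # same-individual scatter
        diff = (mean_g - mean_all)[:, None]
        between += len(group) * (diff @ diff.T)      # between-individual scatter
    # Whiten the regularized within-individual scatter.
    evals_w, evecs_w = np.linalg.eigh(within + ridge * np.eye(dim))
    whiten = evecs_w / np.sqrt(evals_w)
    # Diagonalize the between-individual scatter in the whitened space.
    evals_b, evecs_b = np.linalg.eigh(whiten.T @ between @ whiten)
    order = np.argsort(evals_b)[::-1]
    evals_b = np.clip(evals_b[order], 0, None)
    evecs_b = evecs_b[:, order]
    share = np.cumsum(evals_b) / evals_b.sum()
    k = int(np.searchsorted(share, keep) + 1)        # smallest k reaching `keep`
    return mean_all, whiten @ evecs_b[:, :k]

def project(features, mean_all, projection):
    """Apply a fitted per-patch transform to new feature vectors."""
    return (np.asarray(features, dtype=float) - mean_all) @ projection
```

Because the number of retained components depends on the training data for each patch, the length of the concatenated vector changes from one training split to another, as noted in the text.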

Face matching

In preparation for matching two lemur faces, a gallery (a database of face images and their identities against which a query is searched) is assembled containing feature representations of multiple individual lemurs. The Euclidean distance d between feature vectors of a query image and each image in the gallery is calculated. The final similarity metric is defined as [1 − log(d + 1)]; higher values indicate more similar faces. A query can consist of 1 or more images, all of which must be of the same lemur. For each query image, the highest similarity score for each individual represents that individual’s match score. The mean of these scores, over multiple query images, is calculated to obtain the final individual scores. The top five ranking results (i.e., individuals with the 5 highest scores) are presented in descending order. We evaluated the LemurFaceID system’s recognition performance with queries consisting of 1 and 2 images.
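The scoring and fusion steps can be sketched as follows, assuming a natural logarithm (the text does not state the base) and a gallery held as a list of (identity, feature vector) pairs; the function names and the data layout are hypothetical.

```python
import numpy as np

def similarity(query_vec, gallery_vec):
    """1 - log(d + 1), where d is the Euclidean distance between features."""
    d = np.linalg.norm(np.asarray(query_vec) - np.asarray(gallery_vec))
    return 1.0 - np.log(d + 1.0)

def rank_individuals(query_vecs, gallery, top_k=5):
    """Fuse one or more query images of the same lemur and rank individuals."""
    per_image = []
    for q in query_vecs:
        best = {}
        for identity, g in gallery:
            s = similarity(q, g)
            # Keep the highest score among each individual's gallery images.
            if identity not in best or s > best[identity]:
                best[identity] = s
        per_image.append(best)
    # Average each individual's match score over the query images.
    fused = {identity: float(np.mean([b[identity] for b in per_image]))
             for identity in per_image[0]}
    ranked = sorted(fused.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]
```

An identical pair of feature vectors has d = 0 and therefore the maximum similarity of 1, and scores become negative once d exceeds e − 1 under the natural-log assumption.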

Figure 11a shows match score histograms for genuine (comparing 2 instances of the same lemur) vs. impostor (comparing 2 instances of different lemurs) match scores with 1 query image. Figure 11b shows score histograms with fusion of 2 query images. Note that the overlap between genuine and impostor match score histograms is substantially reduced by the addition of a second query image.

Fig. 11

Histograms of genuine (correct match) vs. impostor (incorrect match) scores. a Results with only one query image (4,265 genuine, 831,583 impostor). b Results with 2 query images (4,317 genuine, 841,743 impostor)

Statistical analysis

We evaluated the accuracy of the LemurFaceID system by conducting 100 trials over random splits of the lemur face dataset (462 images of 80 red-bellied lemurs) that we collected. To determine the response of the recognition system to novel individuals, the LDA dimensionality reduction method must be trained on a different set of individuals (i.e., training set) from those used to evaluate matching performance (known as the test set). To satisfy this condition, the dataset was divided into training and testing sets via random split. Two-thirds of the 80 individuals (53 individuals) were designated as the training set, while the remainder (27 individuals) comprised the test set. In the test set, two-thirds of the images for each individual were assigned to the system database (called the ‘gallery’ in human face recognition literature) and the remaining images were assigned as queries (called the ‘probe’ in human face recognition literature). Individuals with fewer than 3 images were placed only in the gallery. The gallery was then expanded to include a secondary dataset of other species to increase its size.
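One trial of this partitioning can be sketched as below. The rounding of the two-thirds fractions and the dictionary layout (individual name mapped to a list of image identifiers) are assumptions; the expansion of the gallery with images of other species is omitted.

```python
import random

def split_dataset(images_by_individual, seed=0):
    """Return (training individuals, gallery images, probe images) for one trial."""
    rng = random.Random(seed)
    individuals = sorted(images_by_individual)
    rng.shuffle(individuals)
    n_train = round(len(individuals) * 2 / 3)      # 53 of 80 individuals
    train_ids, test_ids = individuals[:n_train], individuals[n_train:]
    gallery, probe = {}, {}
    for individual in test_ids:
        images = list(images_by_individual[individual])
        rng.shuffle(images)
        if len(images) < 3:
            gallery[individual] = images           # too few images to query
            continue
        n_gallery = round(len(images) * 2 / 3)
        gallery[individual] = images[:n_gallery]
        probe[individual] = images[n_gallery:]
    return train_ids, gallery, probe
```

Calling the function with 100 different seeds would give 100 random splits of the kind described above, with LDA trained only on the images of the training individuals.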

Testing was performed in open-set and closed-set identification scenarios. Open-set mode allows for conditions encountered in the wild, where lemurs (query images) may be encountered that have not been seen before (i.e., individuals are not present in the system database). Queries whose fused match score is lower than a certain threshold are classified as containing a novel individual. Closed-set mode assumes that the query lemur (lemur in need of identification) is represented in the gallery and may be useful for identifying a lemur in situations where the system is guaranteed to know the individual, such as in a captive colony.

For open-set testing, one-third of the red-bellied lemur individuals in the gallery were removed. Their corresponding images in the probe set therefore made up the set of novel individuals. For open-set, the mean gallery size was 266 images, while for closed-set the mean size was 316 images. Across all trials of the LemurFaceID system, the mean probe size was 42 images.

Results

Results of the open-set performance of LemurFaceID are presented in Fig. 12, which illustrates the Detection and Identification Rate (DIR) against the False Accept Rate (FAR). DIR is calculated as the proportion of non-novel individuals that were correctly identified at or below a given rank. FAR is calculated as the proportion of novel individuals incorrectly matched to a gallery individual at or below a given rank. In general, individuals are correctly identified >95% of the time when the five top-ranked candidates are considered (rank 5), regardless of FAR, but DIR is lower (<95%) at rank 1, only approaching 95% when FAR is high (0.3).
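The two rates can be computed for a single score threshold and rank as in the sketch below. It follows a common open-set convention, which is an assumption about the exact evaluation used here: a known query counts as detected and identified if its true identity appears within the top `rank` candidates with a score at or above the threshold, and a novel query counts as a false accept if its best score reaches the threshold. The input layout is hypothetical, and at least one known and one novel query are required.

```python
def dir_far(results, threshold, rank=1):
    """results: list of (true identity or None if novel, ranked [(identity, score)])."""
    known_hits = known_total = novel_accepts = novel_total = 0
    for true_id, ranked in results:
        if true_id is None:
            # Novel individual: any accepted match is a false accept.
            novel_total += 1
            novel_accepts += ranked[0][1] >= threshold
        else:
            known_total += 1
            known_hits += any(identity == true_id and score >= threshold
                              for identity, score in ranked[:rank])
    return known_hits / known_total, novel_accepts / novel_total
```

Sweeping the threshold from high to low traces out a DIR versus FAR curve of the kind plotted in Fig. 12, with one curve per rank.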

Fig. 12

DIR curve for open-set matching with 2 query images. Plots show the proportion of in-gallery lemurs that were correctly identified (DIR) at (a) rank 1 and (b) rank 5 versus the proportion of novel individuals that were matched to a gallery individual (FAR)

Rank 1 face matching results for closed-set operation are reported in Table 2, and the Cumulative Match Characteristic (CMC) curves for 1-image query and 2-image fusion (combining matching results for the individual query images) are shown in Fig. 13. This plot shows the proportion of correct identifications at or below a given rank. The mean percentage of correct matches (i.e., Mean True Accept Rate) increases when 2 query images are fused: individuals are correctly identified at Rank 1 in 98.7% ± 1.81% of queries using 2-image fusion, compared to a Rank 1 accuracy of 93.3% ± 3.23% when matching results for a single query image are used.
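A CMC curve of this kind can be computed from the rank at which each query's true identity appears, as in the sketch below; the input format (one 1-based rank per query) and the function name are assumptions.

```python
def cmc_curve(true_ranks, max_rank=5):
    """Proportion of queries whose true identity is ranked at or below each rank."""
    n = len(true_ranks)
    return [sum(r <= k for r in true_ranks) / n for k in range(1, max_rank + 1)]

# Five hypothetical queries whose true matches were ranked 1, 1, 2, 5, and 7.
curve = cmc_curve([1, 1, 2, 5, 7])
```

The first value of the curve is the Rank 1 accuracy for that trial; averaging it over the random splits would give a mean and spread of the kind reported in Table 2.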

Table 2 Face matcher evaluation results (Rank 1, closed-set)

Fig. 13

CMC curves for closed-set performance. a Performance of our method with 1 image as query. b Performance of our method with 2 images as query. CMC indicates the percentage of correct matches at each rank and below

Discussion

Our initial analyses of LemurFaceID suggest that facial recognition technology may be a useful tool for individual identification of lemurs. This method represents, to our knowledge, the first system for machine identification of lemurs by facial features. LemurFaceID exhibited a relatively high level of recognition accuracy (98.7%; 2-query image fusion) when used in closed-set mode (i.e., all individuals are present in the dataset), which could make this system particularly useful in captive settings, as well as wild populations with low levels of immigration from unknown groups. Given the success of LemurFaceID in recognizing individual lemurs, this method could also allow for a robust species recognition system, which would be useful for presence/absence studies.

The accuracy of our system was lower using open-set mode (i.e., new individuals may be encountered) where, regardless of the False Accept Rate (FAR), non-novel individuals were correctly identified at rank 1 less than 95% of the time and less than 85% of the time given a FAR of 0. These numbers are expected to improve with a larger dataset of photographs and individuals. In our current sample, we also included photographs exhibiting only subtle variation between images. Given that the ultimate goal of LemurFaceID is to provide an alternative, non-invasive identification method for long-term research, it will also be important to test its accuracy using a larger dataset that includes only photographs with large variation (e.g., collected across multiple, longer-term intervals).

We also note that our system focuses specifically on classifying individuals using a dataset of known individuals in a population. Such a tool can be particularly useful for maintaining long-term research on a study population. This approach differs, however, from another potential application of face recognition methods, which would be to identify the number of individuals from a large image dataset containing unknown individuals only (i.e., clustering) [59, 60]. The addition of a clustering technique could allow for more rapid population surveys or facilitate the establishment of new study sites, but such techniques can be challenging as clustering accuracy is expected to be lower than the classification accuracy [59, 60]. That said, in future work, the feature extraction and scoring system of LemurFaceID could potentially be combined with clustering techniques for segmenting datasets of unknown individuals.

Despite some current limitations, LemurFaceID provides the groundwork for incorporating this technology into long-term research of wild lemur populations, particularly of larger-bodied (>2 kg) species. Moving forward, we aim to 1) expand our photographic database, which is necessary to automate the lemur face detector and eye locator, 2) increase open-set performance by improving the feature representation to provide better separation between scores for in-gallery and novel individuals, and 3) field test the system to compare the classification accuracy of LemurFaceID with that of experienced and inexperienced field observers. Once optimized, a non-invasive, computer-assisted program for individual identification in lemurs has the potential to mitigate some of the challenges faced by long-term research using more traditional methods.

For example, facial recognition technology would remove the need to artificially tag individuals, which removes potential risks to animals associated with capturing and collaring; some of these risks, including injury, occur more frequently in arboreal primates [29]. At the same time, many costs incurred using these techniques are removed (e.g., veterinary services, anesthesia), as are potential restrictions on the number of individuals available for study (e.g., local government restrictions on captures). More traditional non-invasive techniques that rely on researchers’ knowledge of natural variation can be similarly advantageous, but facial recognition programs can help ensure that data are collected consistently across multiple researchers. That said, we would not recommend researchers become wholly reliant on computer programs for individual identification of study subjects, but training multiple researchers to accurately recognize hundreds of individuals is time-consuming and costly, as well as potentially unrealistic. Facial recognition technology can facilitate long-term monitoring of large populations by removing the need for extensive training, or potentially accelerate training by making phenotypic differences more tangible to researchers and assistants. Moreover, in studies with large sample sizes where immediate recognition of all individuals might be impossible, facial recognition technology can process data more quickly. For example, LemurFaceID takes less than one second to recognize a lemur (using a quad core i7 processor), which will save time identifying individuals when manual comparisons of photographs/descriptions are necessary.

Ultimately, LemurFaceID can help expand research on lemur populations by providing a method to systematically identify a large number of individuals over extended periods of time. As is the case with other long-term studies of natural populations, this research has the potential to provide substantial contributions to evolutionary biology [1]. More specifically, lemurs are an endemic mammalian lineage that evolved in Madagascar beginning >50 million years ago [61]. Over time, they have greatly diversified, with >100 species recognized today [43]. They occupy diverse niches (e.g., small-bodied, nocturnal gummivores; arrhythmic frugivores; large-bodied, diurnal folivores) across Madagascar’s varied habitats (e.g., rainforests; spiny, dry forest) [43], and they have recently (in the last ~2,000 years) experienced extensive ecological change owing largely to human impact [62]. Accordingly, this mammalian system provides unique opportunities for studying ecological and evolutionary pressures impacting wild populations.

Data obtained from longitudinal studies of lemurs can also aid in conservation planning and management for this highly endangered group of mammals. Demographic structure and life history parameters documented from long-term research can provide insights into the causes of population change and be used to model extinction risk [63–65]. LemurFaceID also has potential for more direct applications to conservation. One notable threat to lemurs [66, 67], as well as many other animal species [68, 69], is live capture of individuals for the pet trade. LemurFaceID could provide law enforcement, tourists, and researchers with a tool to rapidly report sightings and identify captive lemurs (species and individuals). A database of captive lemurs can help with continued monitoring to determine if individuals remain constant over time.

Importantly, the face recognition methods we developed for LemurFaceID could be useful for individual identification in other primates, as well as in non-primate species, especially those with similarly variable facial pelage/skin patterns (e.g., bears, red pandas, raccoons, sloths). Furthermore, as camera trapping has become increasingly useful for population monitoring of many cryptic species (e.g., [70, 71]), our facial recognition technology could potentially be incorporated into long-term, individual-based studies conducted remotely. That said, the methods will need to be modified to suit each lineage.

To illustrate this point, recent publications have also explored facial recognition for primates. For example, Loos and Ernst’s [41] system for recognizing chimpanzees has a similar approach to pre-processing as LemurFaceID, but they use a different illumination normalization method and correct for greater differences in perspective. In feature extraction, their use of speeded-up robust features (SURF), a gradient-based feature similar to SIFT, underscores the difference between lemur and chimpanzee faces, namely the lack of hair/fur in chimpanzees to confound the directionality of the features [41]. Their selection of Gabor features also reflects the relative lack of hair, as such indicators of edgeness would exhibit significantly more noise in lemurs [72]. More recently, Freytag et al. [73] were able to improve upon recognition accuracy of chimpanzees by applying convolutional neural network (CNN) techniques. Their results identify CNNs as a promising direction for animal face recognition research, but such methods also require datasets that are orders of magnitude larger than our current dataset [73]. Thus, although they are beyond the scope of this study, CNNs could be an interesting avenue for future research in lemur face recognition.

In contrast to these approaches, Allen and Higham [42] use a biologically-based model for identification of guenons. Their feature selection is based on guenon vision models, using the dimensions of facial spots to identify species and individuals [42]. While E. rubriventer individuals also possess prominent facial spots, these are not common across different lemur species and are therefore unsuitable for use in our system. The wide variety of approaches used underscores that there is no “one size fits all” approach to animal facial recognition. Once developed, however, this technology has the potential to facilitate long-term research in a host of species, expand the types of research questions that can be addressed, and help create innovative conservation tools.

Conclusions

Our non-invasive, computer-assisted facial recognition program (LemurFaceID) was able to identify individual lemurs based on photographs of wild individuals with a relatively high degree of accuracy. This technology would remove many limitations of traditional methods for individual identification of lemurs. Once optimized, our system can facilitate long-term research of known individuals by providing a rapid, cost-effective, and accurate method for individual identification.