The application serves as an external memory for the user. To this end, the biographical information of each person must be pre-registered together with his or her facial features. The biographical information consists of a number of items organized into six categories (Table 1). Because the relevance of different aspects of personal information depends on the social context (e.g. a formal meeting or a casual conversation) (Xu et al. 2015), the biographical information is organized according to how important an item or a category is in the current context. The app uses the environment as the context of a social interaction, which can be identified by a scene recognition method (Li et al. 2014). However, we did not activate automatic scene recognition for context-awareness, because the system was trained on specific scenes in a particular building and may therefore fail to recognize the context in an unconstrained environment. In the current app, the context is selected manually.
Meanwhile, facial features of a few persons are pre-registered in the system. In particular, for each person, images of the facial area are captured from multiple angles. These facial images are fed to the system to extract key facial features for identification. Details of the face recognition algorithm are given in the subsection Software.
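The context-dependent organization of the biographical information described above can be illustrated with a short sketch. It is only an assumption of how such an ordering might be implemented: the category names, the context names, the weights, the sample profile, and the function `order_categories` are all hypothetical placeholders and are not taken from Table 1 or from the SocioGlass code.

```python
from typing import Dict, List

# Hypothetical importance of each category per context (higher = shown earlier).
# These names and weights are illustrative assumptions, not the app's actual values.
CONTEXT_WEIGHTS: Dict[str, Dict[str, int]] = {
    "formal_meeting": {"work": 3, "education": 2, "hobbies": 1},
    "casual_conversation": {"hobbies": 3, "education": 2, "work": 1},
}


def order_categories(profile: Dict[str, List[str]], context: str) -> List[str]:
    """Return the profile's category names, most important first for `context`."""
    weights = CONTEXT_WEIGHTS.get(context, {})
    # Categories without a weight get 0; ties are broken alphabetically.
    return sorted(profile, key=lambda category: (-weights.get(category, 0), category))


# Placeholder profile used only to exercise the function.
profile = {
    "work": ["Research scientist"],
    "hobbies": ["Cycling"],
    "education": ["PhD in computer vision"],
}

ordered_formal = order_categories(profile, "formal_meeting")
ordered_casual = order_categories(profile, "casual_conversation")
```

With a manually selected context, as in the current app, only the `context` argument changes; the stored profile stays the same.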
The system consists of a wearable camera (i.e. a Google Glass) and a smart phone running Android OS. The two devices are connected via Bluetooth. The camera provides egocentric vision/perception, i.e. it shares a similar viewing angle with the user. Our application adopts a client-server-cloud structure. At the front end, the client (i.e. Google Glass) acquires images, displays results, and issues voice instructions. The server runs on the mobile phone, which relays the images to the cloud. Each image is processed in the cloud to find matching faces. If a person is recognized, the personal information is sent back to the client and displayed to the user. Figure 1 shows a photo of a user demonstrating the Glass and the smart phone running SocioGlass.
The core technology of the application is face recognition. When the app is started, the Glass camera continuously captures image sequences at a resolution of 640 × 480 pixels. To reduce the data flow between the devices, each frame is cropped so that only the central region (320 × 240 pixels) is used for processing.
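The cropping step can be sketched as follows. This is a minimal illustration, assuming that the crop is exactly centered and that a frame is stored as a row-major grid of grayscale values; the function name `center_crop` and the synthetic test frame are hypothetical and do not come from the SocioGlass implementation.

```python
from typing import List, Sequence


def center_crop(image: Sequence[Sequence[int]], crop_w: int, crop_h: int) -> List[List[int]]:
    """Return the centered crop_w x crop_h region of a row-major image."""
    height = len(image)
    width = len(image[0])
    if crop_w > width or crop_h > height:
        raise ValueError("crop size exceeds image size")
    left = (width - crop_w) // 2   # 160 for a 640-pixel-wide frame
    top = (height - crop_h) // 2   # 120 for a 480-pixel-high frame
    return [list(row[left:left + crop_w]) for row in image[top:top + crop_h]]


# Synthetic 640 x 480 frame (480 rows of 640 values), used only for illustration.
frame = [[(x + y) % 256 for x in range(640)] for y in range(480)]
cropped = center_crop(frame, 320, 240)
```

The cropped region contains one quarter of the pixels of the original frame (320 × 240 versus 640 × 480).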
SocioGlass builds on the face recognition algorithm proposed in (Mandal et al. 2014). The algorithm performs face detection using OpenCV, eye localization using OpenCV and ISG (Integration of Sketch and Graph) patterns, and face recognition using an extended eigenfeature regularization and extraction (ERE) approach involving a subclass discriminant analysis method (Mandal et al. 2015). This method has been evaluated extensively on numerous large face image databases. For unconstrained face recognition, it has achieved an error rate of 11.6 % with just 80 features on the challenging YouTube face image database (Mandal et al. 2015). With a similar number of features on a wearable-device database of 88 people comprising about 7075 face images, captured mostly with Google Glass, it achieves more than 90 % accuracy with just 7 images in the gallery (Mandal et al. 2014). Furthermore, to support accurate and fast face recognition in dynamic environments (e.g. varying illumination conditions, motion blur, and changes in viewing angle), we adopt a multi-threaded asynchronous structure that leverages opportunistic multi-tasking on the smart phone (Chia et al. 2015).
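To give an idea of what a multi-threaded asynchronous structure can look like, the sketch below runs frame capture and recognition on separate threads. It is a generic producer-consumer pattern and not the design of (Chia et al. 2015): the policy of dropping frames that arrive while the recognizer is busy, the placeholder `recognize` function, and all other names are assumptions made for illustration.

```python
import queue
import threading
from typing import List, Optional


def recognize(frame_id: int) -> str:
    """Placeholder for the slow recognition step (not the authors' algorithm)."""
    return f"result-for-frame-{frame_id}"


def worker(frames: "queue.Queue[Optional[int]]", results: List[str]) -> None:
    """Consume frames until the None sentinel arrives."""
    while True:
        frame_id = frames.get()
        if frame_id is None:
            break
        results.append(recognize(frame_id))


def run_pipeline(num_frames: int) -> List[str]:
    """Capture on the calling thread, recognize on a worker thread."""
    frames: "queue.Queue[Optional[int]]" = queue.Queue(maxsize=1)
    results: List[str] = []
    thread = threading.Thread(target=worker, args=(frames, results))
    thread.start()
    for frame_id in range(num_frames):
        try:
            # Never block the capture loop: hand over the frame only if there is room.
            frames.put_nowait(frame_id)
        except queue.Full:
            pass  # Assumed policy: drop frames while the recognizer is busy.
    frames.put(None)  # Blocking put, so the stop signal is always delivered.
    thread.join()
    return results


results = run_pipeline(10)
```

Because capture never waits for recognition, the number of processed frames depends on timing; only the first frame is guaranteed to be processed, and the results always follow capture order.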
The apk files can be downloaded for the mobile phone (http://perception.i2r.a-star.edu.sg/socioglass/socioglassserver-debug.apk) and for Google Glass (http://perception.i2r.a-star.edu.sg/socioglass/socioglass-debug.apk). Instructions on the installation and usage of the app are available in the user guide (http://perception.i2r.a-star.edu.sg/socioglass/SocioGlass_user_guide-v1.pdf).
The interaction protocol of SocioGlass is as follows.
Provided that the mobile phone and Google Glass are paired and connected via Bluetooth, and that the mobile phone is connected to the cloud server via the Internet, the user starts the app on the phone (Fig. 2b). The user can then put the phone aside, since all subsequent interaction takes place on the Glass.
When the user starts the app on the Glass, the camera of Google Glass starts to capture live image feeds and sends them to the phone via the Bluetooth connection (Fig. 2a). A textual instruction is shown on the Glass display: “Place the target person in the box below”. The user is thus prompted to position the Glass so that the face of the target person lies in the central region of the Glass display.
The phone receives the images from the Glass and starts to detect faces. If a face is detected, it is matched against the pre-registered faces in the database. If a match is found, the person’s biographical information is retrieved and sent to the Glass. The Glass then displays the information together with a portrait photo of that person (Fig. 2c). The portrait photo is displayed on the left side and the biographical information on the right side. The portrait photo and the text under it (i.e. name, position, and company) are always visible. The user can navigate between categories (bottom-right tab menus) by swiping forward or back on the Glass touch pad, and browse the items within a category (if the category contains more than three items) by swiping up or down. To make the information easily accessible, we include an “All” category, which contains all information items of a person in alphabetical order.
If the system detects a face that it cannot recognize, it displays “Unknown”, which means that the person is not enrolled in the database.
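The decision logic of the protocol above (no face detected, face matched, face unknown) can be summarized in a short sketch. It assumes that each face has already been converted to a feature vector and uses a plain nearest-neighbour search with a distance threshold as a stand-in for the ERE-based recognizer; the threshold value, the toy feature vectors, the sample records, and the names `nearest_identity` and `lookup` are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Gallery = Dict[str, List[Sequence[float]]]


def nearest_identity(probe: Sequence[float], gallery: Gallery) -> Tuple[Optional[str], float]:
    """Return the enrolled identity closest to the probe and its distance."""
    best_name: Optional[str] = None
    best_dist = math.inf
    for name, vectors in gallery.items():
        for vector in vectors:
            dist = math.dist(probe, vector)  # Euclidean distance (Python 3.8+)
            if dist < best_dist:
                best_name, best_dist = name, dist
    return best_name, best_dist


def lookup(probe: Optional[Sequence[float]], gallery: Gallery,
           bios: Dict[str, Dict[str, str]], threshold: float) -> Optional[Dict[str, str]]:
    """Map a detected face to the record that the Glass would display."""
    if probe is None:
        return None  # No face detected: nothing to display.
    name, dist = nearest_identity(probe, gallery)
    if name is None or dist > threshold:
        return {"name": "Unknown"}  # Face detected but not enrolled.
    return bios[name]


# Toy gallery and records, invented purely for illustration.
gallery = {"person_a": [(0.0, 0.0), (0.1, 0.0)], "person_b": [(1.0, 1.0)]}
bios = {
    "person_a": {"name": "Person A", "position": "Engineer"},
    "person_b": {"name": "Person B", "position": "Designer"},
}

match = lookup((0.05, 0.0), gallery, bios, threshold=0.5)
unknown = lookup((5.0, 5.0), gallery, bios, threshold=0.5)
no_face = lookup(None, gallery, bios, threshold=0.5)
```

Enrolling several vectors per person mirrors the multi-angle registration described earlier, since a probe only needs to be close to one of them.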
The procedure is shown in the video that can be downloaded at http://perception.i2r.a-star.edu.sg/socioglass/Socioglass.mp4.