
1 Introduction

Mirrors by nature reconstruct a duplicated world optically, reflecting light from the real world to the viewer’s eyes. Standing in front of a mirror, a user can perceive not only his/her own figure but also the surroundings, including objects, other figures, and spaces. This makes mirrors essential household items, fulfilling different functions at various locations. For example, a full-length mirror not only assists users with dressing when placed in the bedroom but also creates the illusion of more space when hung on the wall of the living room.

Researchers have tried to extend the functionality of mirrors by augmenting digital properties on top of the reflective reality. These efforts are generally categorized as the study of Augmented Reflection. In contrast to Augmented Reality, Augmented Reflection overlays 3D digital information on the reflection of the real world instead of on the real world itself. This technique has the potential to realize many household use cases and scenarios requiring 3D floating information, wearable-device-free interaction, and two-handed direct manipulation [1, 3, 4, 6, 10, 12].

Most related work has focused on technological advances, aimed at reflective optics and computational geometry, to combine natural and rendered optical information. Related systems have successfully aligned the reflections of real-world objects with their corresponding 3D information to create a high-fidelity illusion of coexistence. However, such augmented mirrors, much like the Magic Mirror in the story of Snow White (Fig. 1), still follow the conventional interaction model, where information retrieval is triggered by users’ deliberate inquiries according to their explicit demands.

Fig. 1.

Magic Mirror in the story of Snow White answers Evil Queen’s requests [13].

What if an augmented mirror were like the Mirror of Erised (Fig. 2), capable of serving users’ implicit needs instead of waiting for their explicit requests? Could this proactive capability provide even more intuitive interaction than the current model does? Our vision is that augmented mirrors are not only passive portals for users to access augmented information about everyday objects, but also active servers that dynamically fulfill users’ instant needs, elicited from their day-to-day interactions with physical objects and associated information.

Fig. 2.

Mirror of Erised in Harry Potter shows the most desperate desire of a person’s heart [11].

To explore this vision, we propose an advanced augmented mirror system that reflects images of real-world objects by nature, collocates digitally rendered information with the reflections, and further learns the correlations among users, objects, and augmented information.

2 Related Work

There are two kinds of augmented mirror systems: one uses the video see-through technique, and the other implements the optical-combiner concept.

In video see-through systems, the mirrors are large-format displays showing digital information overlaid on 2D video streams captured by high-resolution cameras. This approach is technically simpler and popular in commercial applications, such as i-Mirror [15], Toshiba’s Virtual 3D Dressing Room [14], and the Adidas virtual mirror [2]. Its limitation is a fixed and slightly tilted perspective, caused by the camera’s position.

Optical-combiner systems, which realize the true Augmented Reflection idea, combine a half-silvered mirror with a 2D display to collocate digitally rendered 3D information with the reflection of the real world. This implementation creates a magical illusion, making users believe the rendered 3D information truly exists; examples include The Mixed Reality Mirror Display [4], Air Drum [3], and Holoflector [6]. These systems rely on tracking the user’s head position to generate dynamic perspectives and are hence limited to single-user scenarios. Furthermore, the lack of stereo depth cues affects the comfort of visual perception.

3 Design Space

Following this vision, we divide the design space into three parts to explore: augmentation, interaction, and correlation.

3.1 Augmentation

Most published research addressing the augmentation of reflections with virtual elements has primarily dealt with its technical aspects, creating the illusion of coexistence by developing view-dependent 3D rendering algorithms. Less work has been done on exploring design possibilities, i.e., arranging virtual and reflected elements in numerous ways to generate diverse design solutions. We therefore identified a framework suggesting various information placements for different use cases.

Around.

Augmentations can be placed around the reflection of an object, making the best use of the surrounding 3D virtual space instead of being limited by the real estate of a 2D screen. Virtual elements, such as texts, lines, and graphics, can be placed close to a mirrored object as annotations, put between mirrored objects to illustrate their relations, or located near where users are gazing to draw attention. The tabletop surface near an object can provide physical affordances for users to interact with touchable Graphical User Interface (GUI) elements on the mirrored tabletop (Fig. 3A).

Fig. 3.

Design space for augmentation design.

On.

Augmentations can be attached to the surface of an object’s reflection, enhancing the object’s surface properties. Virtual elements, such as underlines, frames, symbols, and backgrounds, can be attached to highlight or modify existing information. 2D GUI elements, such as icons, buttons, and hyperlinks, can be added as interactive properties, triggering additional functionality and linking to external information on websites or clouds (Fig. 3B).

Inside.

Augmentations can be embedded inside the reflection of an object, becoming a hidden layer of information to be revealed when needed. Virtual elements, such as volumes, structures, and contents, can simulate internal properties that users have physical difficulty perceiving and inspecting. This internal information can be predefined content providing corresponding information on demand, synchronized real-world inputs updating the states of contents in real time, or even simulated volumetric data generated from users’ direct inputs (Fig. 3C).

3.2 Interaction

Current interaction techniques for augmented reflection enable users to use physical tools, two hands, and the body to interact with virtual contents in the mirror. These techniques rely heavily on hand-eye coordination, performing physical actions to align with visual feedback in the mirror. However, we believe tangible-mediated interaction can provide physical affordances and hence be more intuitive. We therefore developed three tangible-mediated interaction techniques: object-mediated, surface-mediated, and face-mediated interaction.

Object-Mediated.

The significant advantage of an augmented mirror is that it allows users two-handed direct manipulation of physical objects. Having full control of physical objects with their hands, users can use one hand to slightly move an object to call out information, pick up/put down the object to switch functionality, or rotate it to adjust properties. Users can also use two hands to manipulate the proximity of two physical objects to specify their relationship or transfer properties from one to another (Fig. 4A).

Fig. 4.

Design space for interaction design.

Surface-Mediated.

2D GUI elements are augmented on the mirrored tabletop to leverage the affordance of the physical tabletop surface. Popular touch actions, such as tap to select, swipe to browse, and drag to move, are supported. The same touch actions can be applied to GUI elements attached to the surface of a mirrored object, enabling users to trigger the associated functionality described in the augmentation section (Fig. 4B).

Face-Mediated.

In addition to using hands to manipulate tangible user interfaces and fingers to operate graphical user interfaces, a user’s face can become an input channel through facial recognition. A user’s presence in front of the mirror retrieves the augmented information and functionality belonging to him/her. Using a finger to point at distinct parts of the face, such as the eyes, nose, or mouth, can also retrieve the corresponding augmented information (Fig. 4C).

3.3 Correlation

Current augmented mirror systems, like other augmented reality systems, manually bind physical objects to their augmented information, ensuring that users always receive the right information about the target objects. However, there are hidden correlations among objects, information, and users that emerge only through frequent day-to-day interactions and have not yet been addressed. These correlations might reflect users’ behavior patterns and even implicit needs, and hence are valuable for advancing our augmented mirror system.

Dependency.

Our system can not only recognize a user’s presence but also identify the objects the user is using. When objects are frequently manipulated by a user, the system learns and builds dependencies between the objects and the user. In addition to one object mapping to one user, an object can belong to multiple users, while a user can also own many objects (Fig. 5A).
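This dependency learning can be sketched as a co-occurrence counter: whenever a recognized user manipulates a recognized object, a count is incremented, and counts above a threshold become learned ownership links, naturally supporting many-to-many mappings. The threshold value and all names below are our own illustration, not from the system described in the paper.

```python
from collections import Counter

class DependencyLearner:
    """Learn many-to-many user-object dependencies from co-occurrence counts."""

    def __init__(self, threshold=5):
        self.threshold = threshold        # interactions needed to infer ownership
        self.counts = Counter()           # (user, object) -> interaction count

    def observe(self, user, obj):
        """Record one observed manipulation of `obj` by `user`."""
        self.counts[(user, obj)] += 1

    def owners(self, obj):
        """All users whose interaction count with `obj` passes the threshold."""
        return {u for (u, o), n in self.counts.items()
                if o == obj and n >= self.threshold}

    def belongings(self, user):
        """All objects this user has interacted with often enough to 'own'."""
        return {o for (u, o), n in self.counts.items()
                if u == user and n >= self.threshold}
```

For example, after five observations of two different users placing the same car key, `owners("car_key")` would return both users, while an object handled only once or twice stays unbound.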

Fig. 5.

Design space for correlation design.

Group.

The objects belonging to a user can be further categorized into groups according to the ways the user interacts with them. A user might collect different objects before using them, revealing the intention of forming these objects into a group. A user might also use one object after another within a short period of time, showing that these objects have a higher chance of forming a group (Fig. 5B).

Sequence.

Objects in a group might contain hidden orders, providing rules for the system to predict a user’s possible next steps. Users use objects with various functions in different orders to fit different purposes, forming unique sequences. Our system recognizes and learns these sequences, and can later remind a user of his/her belongings, keep a user on the right procedure, and recommend to other users how to achieve similar goals (Fig. 5C).

4 Use Cases

After exploring and defining the general design principles for the augmented mirror, we extended these generic principles into three demonstration scenarios to provide a better understanding and to envision real benefits.

4.1 Hallway

A user always leaves a car key on the table in front of the mirror when arriving home. The mirror gradually becomes aware that the user owns the key. When the user checks his or her appearance in front of the mirror before going out, the mirror augments a graphical reminder around the key to draw the user’s attention. The mirror also displays a UI with two icons, representing ignition and the garage door, on the mirrored tabletop near the key. The user touches one button to start the car for warming up and the other to open the garage door remotely (Fig. 6).

Fig. 6.

Interacting with Erised in the hallway setting.

4.2 Kitchen

A wife puts a food material on the countertop, which has a mirror attached vertically. She seasons the material with seasoning A, seasoning B, and seasoning C in turn. When her hand moves the seasoning cans, volumetric renderings are displayed on the mirrored cans showing the remaining amounts. When she notices that the can of seasoning B has only a few servings left, she lifts up then puts down the can to call out an augmented menu. She rotates the can to highlight the ‘buy’ function in the menu and lifts up then puts down the can again to select it. The selection triggers a connection to the supermarket and orders a new can of seasoning B (Fig. 7B, C). She then uses a knife to cut the material into four pieces, and the mirror learns the locations where she has cut it.

Fig. 7.

Interacting with Erised in the kitchen setting.

Some days later, the husband must cook by himself while the wife is on a business trip. When he puts the material his wife prepared for him on the counter (Fig. 7A), the way his wife previously cooked it is augmented on the mirrored image of the material step by step, including which seasonings to use, which sequence to follow, and where to cut (Fig. 7D).

4.3 Bathroom

One day, a woman buys four facial care products for her forehead, eyes, nose, and mouth, and puts them on the washstand in front of a mirror. She always consciously takes a little cream with her finger from a specific care product and applies it to the related part of her face. She also unconsciously uses her fingers and hands to check parts of her face before using the care products. These conscious and unconscious actions are learned by the mirror.

Another day, a man living with the woman uses his finger to check his nose in front of the mirror. As soon as he feels that he might need a care product for his nose, he notices an augmented indicator on a bottle (Fig. 8A). When he uses his finger to point to different parts of his face, augmented indicators show up on the other bottles in turn. He picks up and opens the bottle for the nose and touches a button near it to link to a website showing the product details and instructions (Fig. 8C, D).

Fig. 8.

Interacting with Erised in the bathroom setting.

5 System Implementation

To achieve the above-mentioned use cases, we built customized hardware and developed software by integrating several popular toolkits and algorithms.

5.1 Hardware

The developed system has two hardware modules: a mirror module and a tabletop module. The mirror module has a 39 cm × 56 cm half-silvered mirror overlaid on a 24-in. LCD display to merge virtual and real images. The mirror module also has a Microsoft Kinect 2 attached on top of it to capture depth images for facial recognition and tracking.

The mirror module is placed on the tabletop module, which consists of a 39 cm × 39 cm translucent tabletop, a 1920 px × 768 px camera covered with an IR filter facing upward 40 cm below the tabletop, and two 30 × 30 IR LED panels also illuminating the tabletop from below. This setup limits the camera to seeing only IR-reflective tabletop activities (Fig. 9).

Fig. 9.

Software and hardware design and implementation.

5.2 Software

To realize the high-fidelity illusion of the coexistence of mirrored and digitally rendered images, we implemented the popular eye-dependent 3D rendering algorithm. The Microsoft Kinect 2 on the mirror module captures a depth image of the user and recognizes and tracks the user’s head position [7]. In the 3D digital scene, the moving head position is bound to a perspective camera looking into the viewport on the LCD screen behind the half-silvered mirror.
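Eye-dependent rendering of this kind amounts to an off-axis (generalized) perspective projection: the view frustum is recomputed each frame from the tracked head position and the fixed corners of the screen behind the mirror. The sketch below illustrates that computation; the function name, NumPy usage, and coordinate conventions are our own and not taken from the paper.

```python
import numpy as np

def off_axis_frustum(eye, screen_ll, screen_lr, screen_ul, near=0.1):
    """Compute off-axis frustum bounds (left, right, bottom, top) at the
    near plane for a viewer at `eye`, given three corners of the display
    surface (lower-left, lower-right, upper-left) in world coordinates."""
    vr = screen_lr - screen_ll            # screen right axis
    vu = screen_ul - screen_ll            # screen up axis
    vn = np.cross(vr, vu)                 # screen normal, toward the viewer
    vr, vu, vn = (v / np.linalg.norm(v) for v in (vr, vu, vn))

    # Vectors from the eye to the screen corners.
    va = screen_ll - eye
    vb = screen_lr - eye
    vc = screen_ul - eye

    d = -np.dot(va, vn)                   # perpendicular eye-to-screen distance
    s = near / d                          # scale frustum extents to the near plane
    left = np.dot(vr, va) * s
    right = np.dot(vr, vb) * s
    bottom = np.dot(vu, va) * s
    top = np.dot(vu, vc) * s
    return left, right, bottom, top
```

When the viewer is centered in front of the screen the frustum is symmetric; as the tracked head moves off-axis, the bounds become asymmetric, which is what keeps rendered content registered with the reflection.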

We adopted the reacTIVision framework to make the tabletop module an interactive surface capable of recognizing and tracking fiducial markers and finger touches [5]. The recognized marker identifications (IDs) are pointers to the associated augmented information and GUI elements. The continuous action of lifting up then putting down a marker is interpreted as a click event, while rotating the marker is used for rotary control. Finger touches are recognized to trigger, swipe, and drag-and-drop GUI elements.
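The lift-then-put-down interpretation can be pictured as a small per-marker state machine fed with one observation per camera frame. This is an illustrative sketch only: a real client would consume reacTIVision’s TUIO add/update/remove events rather than raw per-frame booleans.

```python
class MarkerState:
    """Track one fiducial marker: reappearing after a lift counts as a
    click; the last observed angle serves as the rotary control value."""

    def __init__(self):
        self.seen = False      # has the marker ever been placed?
        self.present = False   # is it currently visible to the camera?
        self.clicks = 0
        self.angle = 0.0

    def update(self, detected, angle=None):
        """Feed one frame's observation for this marker ID."""
        if detected and not self.present:
            if self.seen:      # put down again after a lift -> click event
                self.clicks += 1
            self.present = self.seen = True
        elif not detected:
            self.present = False
        if detected and angle is not None:
            self.angle = angle  # radians, used for rotary control
        return self.clicks, self.angle
```

A place/lift/place cycle thus yields exactly one click, matching the seasoning-can menu selection described in the kitchen scenario.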

To identify different users in front of the mirror, we implemented a facial identification module based on a NodeJS Support Vector Machine (SVM) API [9]. The Kinect HD Face API recognizes users’ faces and identifies the position of each facial element, including the forehead, eyes, nose, and mouth. The relative positions between these elements are then calculated to define five unique features of a face. These features are ultimately fed into an SVM for training.
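The exact feature computation is not detailed above, so the sketch below makes an assumption: each landmark’s distance to the landmark centroid, normalized to sum to one so that distance to the camera cancels, yields five scale-invariant features. It also substitutes Python’s scikit-learn for the NodeJS SVM API, and the synthetic training data is entirely hypothetical.

```python
import numpy as np
from sklearn.svm import SVC

def face_features(landmarks):
    """Collapse five facial landmark positions (forehead, two eyes, nose,
    mouth) into five scale-invariant features: each landmark's distance
    to the landmark centroid, normalized so camera distance cancels."""
    lm = np.asarray(landmarks, dtype=float)            # shape (5, 2)
    d = np.linalg.norm(lm - lm.mean(axis=0), axis=1)   # centroid distances
    return d / d.sum()

# Hypothetical training data: two users, 20 noisy landmark observations each.
rng = np.random.default_rng(0)
faces = {user: rng.uniform(0.0, 1.0, (5, 2)) for user in (0, 1)}
X = [face_features(faces[u] + rng.normal(0, 0.005, (5, 2)))
     for u in (0, 1) for _ in range(20)]
y = [u for u in (0, 1) for _ in range(20)]

clf = SVC(kernel="rbf").fit(X, y)   # identifies which user is at the mirror
```

At run time, a fresh set of tracked landmarks is reduced by `face_features` and passed to `clf.predict` to decide whose augmented information the mirror should show.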

We also used a NodeJS Hidden Markov Model (HMM) API [8] to implement the correlation module, which builds the relationships between users, objects, and information. The computed results of the facial recognition module, the marker IDs recognized by the reacTIVision toolkit, and the GUI elements triggered by finger touches are defined as attributes to train the HMM, which predicts the augmented information for the users and physical objects (Fig. 9).
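The sequence-prediction idea can be illustrated in deliberately simplified form: the sketch below swaps the HMM for a plain first-order Markov chain over recognized marker IDs, accumulating transition counts from observed usage sequences and offering the most frequent successor as the predicted next step. The object names are hypothetical, echoing the kitchen scenario.

```python
from collections import defaultdict

class SequenceModel:
    """Learn object-usage orders as a first-order Markov chain -- a
    simplification of the HMM-based correlation module for illustration."""

    def __init__(self):
        # counts[a][b] = how often object b was used right after object a
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, sequence):
        """Record one observed sequence of object (marker) IDs."""
        for a, b in zip(sequence, sequence[1:]):
            self.counts[a][b] += 1

    def predict_next(self, obj):
        """Most frequent successor of `obj`, or None if never followed."""
        successors = self.counts.get(obj)
        if not successors:
            return None
        return max(successors, key=successors.get)

model = SequenceModel()
model.observe(["seasoning_A", "seasoning_B", "seasoning_C", "knife"])
model.observe(["seasoning_A", "seasoning_B", "seasoning_C"])
```

After observing the wife’s cooking sequences, such a model could suggest the next seasoning to the husband step by step, as in the kitchen scenario.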

6 Preliminary Evaluation and Results

To collect feedback, we invited eight users to experience our augmented mirror and conducted a preliminary study.

Since the correlation module requires a relatively large dataset and takes time to train, it was difficult to demonstrate during the user tests. Therefore, we pre-trained the models for the three demonstration scenarios and let the users experience the training results. The users directly manipulated the physical objects to interact with the mirror and answered a questionnaire containing seven Likert-scale questions. After the questionnaire, we interviewed the users to acquire explanations for their answers.

The results of our survey indicated some benefits the system provides, including:

  • The users felt that the effect of augmenting virtual elements onto the mirrored objects was magical.

  • They would love to have such a smart mirror at home to assist them with their daily activities.

  • The augmented indicators reminded them in a relatively “calm” way compared to conventional pop-up notifications on mobile phones.

  • Manipulating the physical objects to call out related digital information and GUI elements is intuitive.

  • They felt the next-step instructions helped them continue their tasks without searching online or asking family members.

Some problems were also identified, including:

  • Users had difficulty focusing on the merged images in the mirror (using one eye was better than two).

  • The rendered images were a little bit dark.

  • Latency occurred when users moved their heads too fast, causing the rendered and mirrored images to become misaligned.

  • The touch sensitivity and precision were not high enough and caused some operational issues.

7 Conclusion, Limitations and Future Work

Based on the collected data, we conclude that augmented reflection offers an intuitive and convenient way for users to retrieve digital information about everyday objects, with the freedom to use both hands and without wearing or carrying any additional device. The study also revealed that augmented information from predicted correlations provides users with the needed information at the right time, allowing them to make better decisions and take appropriate actions. Most of the problems found in this study, such as poor rendering performance, insufficient brightness, occlusion problems, and the lack of visual depth, are technical issues noted for future improvement and further study.