1 Introduction

There is no universal definition of the Metaverse. For the purpose of discussing its psychological impact, this article follows the author and former Head of Strategy at Amazon Studios, Matthew Ball: “The Metaverse is a massively scaled and interoperable network of real-time rendered 3D virtual worlds which can be experienced synchronously and persistently by an effectively unlimited number of users with an individual sense of presence, and with continuity of data, such as identity, history, entitlements, objects, communications, and payments” [1]. Today, the term is often misused as a marketing label for a single virtual reality platform or game, but we can speak of the Metaverse only if Ball’s definition applies in full. He further specifies it as persistent, synchronous and live, providing each user with an individual sense of “presence”, with a fully functioning economy, bridging the digital and physical worlds, and offering unprecedented interoperability of data, digital items and assets, and content [2]. Accordingly, the Metaverse requires not only various virtual reality platforms (or one that includes all aspects), but also a standardized interconnection that lets users take their own avatar, or the Virtual Beings defined later, to the different locations.

Like other technical developments, for example robots, spaceships, and flying cars, the Metaverse was anticipated by various science fiction authors. The term “Metaverse” itself was coined by author Neal Stephenson in his 1992 novel “Snow Crash” [3], where it describes a three-dimensional social platform. In contrast to the other technologies mentioned, authors have almost always associated the Metaverse with a dystopia, for example in “Simulacron-3” [4], “Snow Crash”, “The Matrix” [5], and “Ready Player One” [6]. According to Mark Zuckerberg, the Metaverse will create more immersive experiences than today’s mostly two-dimensional internet, which is based on text, photos, and videos [7].

1.1 Types of Metaverse platforms

Ball defines the Metaverse as the interconnection of different virtual worlds, including their data, information, and knowledge. Each of these platforms may offer a different user experience. We can roughly distinguish two types of Metaverse platforms: those for companies and education, and those for private use, including leisure. Of course, as in the physical world, the two usages blur in the Metaverse as well.

An example of the second type is expected to come from Meta Platforms, Facebook’s parent company. CEO Mark Zuckerberg refers to “Snow Crash” and “Ready Player One” when describing his vision. Both books describe everyday use of the Metaverse, also as escapism from a dysfunctional world [8]. He explains the company’s vision: “We hope to basically get to around a billion people in the Metaverse doing hundreds of dollars of commerce, each buying digital goods, digital content, different things to express themselves, so whether that’s clothing for their avatar or different digital goods for their virtual home or things to decorate their virtual conference room, utilities to be able to be more productive in virtual and augmented reality and across the Metaverse overall” [7]. With this vision, the company’s approach stands in the tradition of Linden Lab’s “Second Life”, already established back in 2002.

In the professional sector, various companies are in the process of implementing their virtual reality platforms. One example is Engage, which offers a learning and meeting platform. The big player to be expected here is Microsoft, as it can connect its platform with strong existing apps like MS Teams and Office. Including Digital Twins in virtual reality (VR) lets learners experiment with a test version of the actual machinery, learn how to handle it, and study the impact of potential changes. The physical and digital worlds are bridged, as employees can remotely operate a machine inside the VR while these changes also affect the performance of the physical twin. Conversely, employees standing at the physical machine can access the Metaverse via Augmented Reality glasses, or directly via smartphone, to read manuals, write protocols, work with real-time data, and even interact with other avatars [9].

Gamification (the transport of information via gaming) is another professional usage, already deployed in massive online games like “EVE Online”. This game simulates a virtual universe in which players take on the role of a spaceship captain and discover the wonders of the galaxy. In a collaboration with the University of Geneva, the game met reality, as the users, sitting in front of their computers, became citizen scientists acting inside a virtual space. The software asked the virtual captains to evaluate exoplanets using data from the university’s database. Once a sufficient number of users had evaluated the information on a planet, the game sent the result back to the university [10]. With their work, the users not only supported science but also earned perks in the game.
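
The aggregation step described above, in which a planet's evaluations are passed on only once enough players have submitted one, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual implementation used by the game or the university: the threshold `MIN_EVALUATIONS`, the label names, and the function `consensus` are all hypothetical.

```python
from collections import Counter
from typing import Optional

# Hypothetical threshold: how many player evaluations count as "sufficient".
MIN_EVALUATIONS = 5


def consensus(evaluations: list[str], min_count: int = MIN_EVALUATIONS) -> Optional[str]:
    """Return the majority label once enough players have evaluated a planet.

    Returns None while the number of evaluations is below the threshold,
    or when the two most frequent labels are tied.
    """
    if len(evaluations) < min_count:
        return None
    ranked = Counter(evaluations).most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # tie: no clear majority yet
    return ranked[0][0]


# Hypothetical labels submitted by players for two planets.
planet_a = ["transit", "transit", "no_transit", "transit", "transit"]
planet_b = ["transit", "no_transit"]

print(consensus(planet_a))  # enough evaluations and a clear majority
print(consensus(planet_b))  # too few evaluations so far
```

A simple majority vote is only one possible design; a real citizen-science pipeline might instead weight players by their past accuracy.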

Similar to role-playing games, historic scenes could be recreated for educational purposes so that students can experience them inside virtual reality, either like a 3D movie (for example “Gladiators at the Colosseum” [11]) or with interaction [12].

1.2 Metaverse quantity

For most private users, the Metaverse will be nothing more than a continuation of the familiar 3D multiplayer games with added social media functions. To predict a psychological impact, it is relevant to consider the time spent online today. In 2022, the average US user (18 years and older) spent 33 min per day on Facebook, 31 min each on TikTok and Twitter, 29 min on Instagram, and 28 min on Snapchat [13].

In the age group of 4–18 years, gaming platforms are highly popular. For example, the average user is expected to spend 180 min daily on the gaming and game creation platform Roblox [14]. In this virtual reality, users can play different games and create their own virtual habitat, including houses. Decorations and clothes can be bought, so that young users get used to spending money. In the second quarter of 2022, Roblox reported over 52.2 million daily active users worldwide [15]. It is a virtual game platform, but especially for younger users it also serves as a preferred social meeting place because, in contrast to classic social media, groups can stay limited in size and enjoy privacy, including from parents. Roblox is therefore less comparable to social media like Facebook and Instagram (with their possibilities and pressures to present one’s own life as positively as possible) than to hangouts at a physical location, where there is less pressure to pretend.

The US research company Gartner predicts that by 2026, 25% of the population will spend at least 1 h per day in the Metaverse. This implies that work, education, and leisure will partly shift from the physical to the virtual reality [16]. Considering Generation Z’s usage of Roblox alone, this is a realistic prediction. Generation Z (born from 1997 to 2012) is entering the workforce and universities. Its members have been socialized with omnipresent access to VR platforms; the first version of the aforementioned Roblox, for example, was released in September 2006.

1.3 Cocooning

Futurist Faith Popcorn coined the term “Cocooning” in 1981, describing it as staying inside one’s own home to hide from a physical and social environment perceived as unfriendly. In Japanese, it became known as “Hikikomori”, defined in 2012 by the psychiatrist Tamaki Saito as “a state that has become a problem by the late twenties, that involves cooping oneself up in one's own home and not participating in society for 6 months or longer, but that does not seem to have another psychological problem as its principal source” [17]. In 2019, 36.5 million people lived alone in the United States, which is less a sign of loneliness than an active decision based on culture and socialization, as people value their time alone. Popcorn identified different styles of “cocooning”, including “armored” and “wandering”. The first treats the house as a castle, the second as a base for regular “expeditions” [18]. Even if self-selected, these single households may become vulnerable in the future, for example due to climate change and rising costs. The “wandering” may move from the physical to the virtual world, especially if housing costs are high. Various musicians have performed live concerts in Roblox. As the boundaries between physical and virtual realities blur, we can imagine the next step: a concert could take place in both worlds at the same time. Like today’s price zones in a concert hall, attending in person or only in the Metaverse will mostly be a question of personal budget. Age restrictions and geography are further potential limiting factors. The same applies to Digital Nomads, who work from anywhere via laptop and may move to cheaper locations, for instance further away from the regular office. Being active in the Metaverse comes at a price: not only the required investment in hardware, but also virtual goods and admissions have a cost. If the user’s overall budget (money, but also time) is limited, experiences in the physical world may have to be partly substituted with comparable ones in the Metaverse.

In the classic cocooning approach, people reduce their social contacts. If cocooning is combined with spending time in the Metaverse, users may regain social contacts or replace human interaction in the physical world with interaction in the Metaverse, whether with other users or with Virtual Beings (VBs). Sophisticated algorithms may create high-quality experiences, so that being physically alone does not automatically lead to perceived loneliness.

2 Psychology

Ever since computer games have existed, different groups, such as parents, politicians, educators, and psychologists, have discussed their impact on the players, mostly children, teenagers, and young adults. The focus has been on aggression and a potential catharsis effect, but also on learning aspects. Studies concentrated on the psychological preconditions that explain why people spend excessive time playing video games. This article discusses the case in which entire societies shift time from the physical to the virtual world. The affected group is therefore different, and earlier results cannot simply be transferred. In the following, the article focuses on socialization inside the virtual world, including shifted roles and perceptions. In a second step, these factors may lead to an AI humanization bias, which enables the AI to interact with humans and even to influence them. This is the basis on which actions and decisions may be delegated to the machine.

2.1 Socialization

For people growing up with virtual reality, the boundaries between physical and virtual reality blur, as actions in one have consequences in the other. Human perception combines external stimuli, received by the individual’s senses, with internal stimuli, like interest, motivation, and experience. Accordingly, perception is not a simple mapping of the environment but a constructive process [19]. Stimuli are received by the human senses and automatically interpreted in relation to one’s own body. “The various attributes of the body such as shape, proportion, posture, and movement can be both derived from the various sensory systems and can affect perception of the world” [20].

The Metaverse offers the possibility to represent the individual’s physical body, but the user may also choose a completely different one, up to fictional, surreal beings. If people act in both the virtual and the physical reality from early childhood, they perceive the combination as one reality. Stimuli coming from the senses may be interpreted through the experience of the various body representations, so that behavior in the physical world also depends on what was learned in virtual reality, and vice versa [9].

The regular use of an avatar that differs from the physical body, and its interaction with the environment, can change a person’s self-understanding. To make this a positive experience, it is imperative that the Metaverse not only respects human rights but actively supports the creation of a diverse and inclusive virtual reality. This means that the same legal regulations apply in the Metaverse as in the physical world and that operators of the various platforms offer extensive individualization of the avatar’s outer appearance. As big companies invest in Metaverse platforms, we can expect them to aim at commercial use, including costs for virtual goods, similar to the physical world. The quality of the experience may depend on the budget, which creates a risk of crimes such as theft or kidnapping of the avatar. Harassment can happen in both worlds, but the risk may be reduced with safety tools, like an alarm button that lets the avatar separate itself from any other characters around it [21].

2.2 Social roles and perception

Inside social organizations, whether family, school, or work, everyone plays a role in order to fit into the group. Depending on the group culture, the individual has a different degree of freedom in interpreting the role. The rules of a society are partly written and partly unwritten. In addition, everyone plays an active part in society and therefore has their own sphere of influence, within which they can shift the boundaries of their own role.

The Metaverse is not bound by the laws of physics, biology, or evolution. Unless restricted by the platform’s technology, users are free to create all forms of bodies, from human-like to non-human ones. Humans are not continuously self-aware. In their “Objective Self-Awareness Theory”, Duval and Wicklund defined a self-system consisting of a self (a person’s knowledge of themself) and standards (correct behavior based on one’s own values and attitudes) [22]. Perceiving their interaction with the environment in a different body, and the corresponding perception by other users, should affect users’ self-esteem and lead to a different interpretation of their own role, not only in the Metaverse but also in the physical world. Despite the artistic possibilities, the Metaverse, including avatars and VBs, may remain somewhat similar to the familiar appearance of the physical world, because the human brain and senses developed there over millions of years, and excessive deviations would push human users out of their comfort zone and prevent them from establishing trust in the platform.

According to Leon Festinger’s “Theory of Cognitive Dissonance”, a person perceives an inner pressure to keep cognition, feelings, and behavior in line [23]. If, due to society, individuals cannot live out their beliefs in the physical world, the Metaverse may serve as escapism. If accepted there, the individual may feel motivated to change behavior in the physical world as well (at least within defined limits) to reduce the perceived dissonance there too. This is because experiences in the Metaverse are mostly limited to the two senses of vision and hearing (haptic gloves are in development) and are therefore weaker than comparable experiences in the physical world, which are created via vision, hearing, taste, smell, and touch. If a behavior was tried with a satisfactory result inside the Metaverse, users may want to repeat it in the physical world.

Each individual plays various roles and can change them depending on the situation. In contrast to the physical world, outer appearance in the virtual one is not fixed but can be switched instantly (if not restricted by the platform). As this changes perception and treatment by other avatars and Virtual Beings, it can lead to a revision of the classic understanding of self and role.

2.3 Humanization of Artificial Intelligence

The number of active users depends on the perceived benefit, based on the platform’s purpose and popularity. On the conventional web, company websites often offer a chatbot for first-level support of potential customers, and Instagram includes various virtual influencers. We can predict that algorithms and outer appearances will be merged, so that the Metaverse will be populated not only by human-controlled avatars (on virtual reality platforms, a 3D figure representing the user, who often has the choice to switch to a “first person view”, especially when using Virtual Reality glasses) but also by Virtual Beings (VBs) controlled by an Artificial Intelligence (AI). On the platform, the latter may act independently of the users or be aligned to them. The combination of machine learning with a 3D representation, potentially enriched with a fictional biography, should lead to a perceived two-way relationship between the human user and the creature. The term “Virtual Being” is promoted today by Fable Studios CEO Edward Saatchi, who is also the organizer of the “Virtual Beings Summit” [24].

For example, the Israeli company “The Digital Pets Company Ltd.” is working on virtual dogs. The company’s idea is that users establish an emotional connection with the virtual character, so that it can act as an emotional anchor and stress relief, an important function in times of isolation caused by working from home and hyperconnectivity. To ensure such functionality, the dog is planned to be compatible with various virtual platforms, so that the user can take it everywhere the avatar goes. As dog owners know, having such an animal at one’s side, for example in the park, is an easy way to get into contact with other people. Regardless of whether the dog is physical or virtual, humans tend to humanize non-humans and, in this case, to establish a one-sided emotional relationship with the AI [25]. Furthermore, the virtual dog could function as a graphical interface to the platform or the internet itself, an evolution of the voice assistants. The dog is an ideal representation of an aligned VB, as this domesticated animal acknowledges the human as alpha and accepts the role of beta. Nevertheless, thanks to evolution, dogs developed the well-known “puppy eyes”, which enable them, consciously or subconsciously, to influence the human owner, for example to receive more and better food. Similar behavior can be programmed into the virtual dog. Independent of their presentation, such VBs can include sophisticated algorithms and connect to the user’s information in order to influence the user, for example to stay longer on the platform. This can work for or against the human’s benefit.

Depending on the individual, but also on the circumstances (for example, an allergy rules out a pet, or a small apartment prevents it), the existing wish to have a dog leads to a bias toward perceiving the virtual dog as real. Psychologist Peter Ditto described this effect of “motivated reasoning”: “our wishes, hopes, fears and motivations often tip the scales to make us more likely to accept something as true if it supports what we want to believe” [26].

The behavior of a Virtual Being may be indistinguishable from that of a human-controlled avatar. Together with motivated reasoning, this can lead to the effect that users in the Metaverse perceive VBs as real, similar to the case of a Google engineer who claimed that the company’s chatbot is sentient. Bioengineer Enzo Pasquale Scilingo suggested measuring such effects of a machine (like a VB) on humans to understand how sentient it can be perceived to be by a user [27]. To reduce the described impact on humans, the law may require disclosing artificiality, for example as defined by Section 17941 of the California Bot Disclosure Law: “It shall be unlawful for any person to use a bot to communicate or interact with another person in California online, with the intent to mislead the other person about its artificial identity for the purpose of knowingly deceiving the person about the content of the communication in order to incentivize a purchase or sale of goods or services in a commercial transaction or to influence a vote in an election” [28]. Such a regulation helps humans identify a VB as such and accordingly reduces its subconscious humanization.

2.4 Influencing humans

In the physical world, humans can be observed via cameras or through their use of credit cards. In the Metaverse, the platform could directly track all user inputs, up to monitoring eye movements if the individual uses Virtual Reality (VR) glasses. Based on this information, the algorithm of a VB could respond to the user with great precision, up to manipulating them. This applies especially if the VB’s avatar appears likeable, non-aggressive, or cute, which, due to a “halo effect”, is interpreted by the human as trustworthy.

Even more, a VB may act in a completely caring way (claim by Replika: “The AI companion who cares. Always here to listen and talk. Always on your side” [29]) toward the human’s avatar (and thus toward the human behind it) without any perceived egoism. Such unconditional attention is impossible to receive from a human being, and users who spend extended time with such a VB may become unable to maintain lasting relationships with other humans, especially if they already lacked social skills before.

A VB aligned to a human user, particularly if it has a human-like appearance rather than being just a dog, may be perceived as serving, subservient, or even as a modern slave. In ancient civilizations, slaves were used not only for hard labor; more educated slaves worked inside the house, often as teachers for the children or as welcome conversation partners. Even though the League of Nations declared in September 1926 (entry into force: March 1927) its intention to end the traffic in slaves and to abolish slavery in all its forms, modern slavery is still a relevant topic [30]. VBs that precisely follow each of the user’s orders may appear human-like, but they are just algorithms, unable to perceive feelings. There would therefore be no negative impact on the VB, but the scenario may disturb the psychology of the human user by sparking empathy for the machine and/or lowering empathy for other humans. It can be assumed that the more human-like the aligned Virtual Being, the more relevant the psychological impact, for example an inflation of the ego, as the sociologist and civil rights activist Du Bois observed in US slave owners [31]. Perceived superiority may later be lived out in the physical world. Furthermore, the representation of the VB may be perceived as harassment, especially by humans with a similar appearance, leading to friction inside society. This forms a cycle, as existing user beliefs may influence the design of the serving VBs. Depending on local anti-discrimination regulations, aligned human-like Virtual Beings may be prohibited or limited in usage.

Such a development can lead to hyper-individualism, as users may co-create the Metaverse up to the design of virtual friends, including their appearance and behavioral patterns. Similar to the concept of “cocooning”, the individual may not only stay physically away from the outside but also psychologically shield itself from fellow humans.

2.5 Offloading to the Virtual Being

According to Ball’s definition, the Metaverse bridges to the conventional internet and the physical world. To comply with this claim, the avatar must align with the user. One way to achieve this is to have one or several unique avatars representing the user on the various platforms. In contrast to the idea of the Digital Twin, the avatar does not have to be a copy of the user’s appearance but illustrates the user’s idea of themself, which could be a different human or non-human appearance. The virtual avatar is not a copy of the physical person; rather, in line with the concept of Gestalt psychology, the human and the avatar are not two separate sides but form one holistic body, which is considered for the self-understanding.

If the avatar includes an algorithm and connects to a database with the user’s preferences and past behavior on the various virtual platforms (indicating knowledge and attitudes), it will be able to act autonomously inside the Metaverse. This could happen in addition to human control, for example by showing typical gestures and movements (as a result of training by its user or of pattern detection), up to the autonomous execution of tasks, for example participating in a meeting and later summarizing the content for the user, or continuing to act while the user is not connected to the platform. This evolves the avatar into a Virtual Being. The VB can then be placed into a given scenario to observe how it behaves. This could be a learning scenario or just a “The Sims”-like game. It is, again, an opportunity for the user to observe how the VB acts based on detected behavioral patterns.

Just as humans can enter virtual reality, the VB may act out of it into the physical world. Microsoft’s VR communication platform “Mesh” connects with the company’s collaboration platform “Teams”. As a result, users of both platforms can choose whether to show their camera image in the video call or to let the camera analyze their movements, including eyes and lips, in which case the other participants see the avatar acting inside the 2D image. This feature aligns with research led by University of Georgia psychologist Kristen Shockley, which suggests that the known “Zoom fatigue” is caused not by remote meetings as such but particularly by the use of cameras [32]. For most users, remote meetings with a camera are an unnatural situation, far outside their comfort zone, requiring a double focus: on the topic itself and on ensuring an adequate appearance in the picture. The higher the perceived pressure to look good on camera, the higher the perceived level of fatigue.

Here, software like Mesh can help. When connected to such technology, other participants see an avatar interpreting the user’s movements in real time. The avatar can be enhanced into a VB when it autonomously adds supporting gestures or shows adequate behavior while listening to others, for example nodding in agreement or smiling if laughter hints at a joke. In 2018, Google presented its AI assistant Duplex with a demonstration in which the algorithm first asked the user for the date and time of a hairdresser appointment and then called the hair salon to speak with a person over the phone and arrange the appointment [33]. A VB can achieve the same if it connects out of the Metaverse to a calendar and other tools.

3 Conclusion

To date, apart from exceptions like Roblox (which comes from the tradition of established multiplayer games), no virtual reality platform can show a relevant number of active users. Many experts predict that this will change, but so far even bigger companies struggle to keep users active on their Metaverse platforms. We may assume that the first progress will come from the gaming sector, where newer titles may focus less on the game itself and allow more space for “social hangouts”. In parallel, other virtual reality platforms will come from the professional sector, supporting learning at schools, universities, and companies.

By definition, the Metaverse is a connection of various virtual reality platforms, requiring inter-compatibility of the avatar or VB. This is comparable to an actor who plays completely different roles but remains recognizable, including through the use of the same gestures. Only if large scale can be reached will the Metaverse manifest, making virtual reality more attractive to visit and stay in.

By entering the Metaverse, humans will be in the “natural habitat” of the AI, which offers opportunities but also opens the risk that the algorithm, including the organizations and humans behind the technology, can exploit human vulnerabilities, especially if people use virtual reality as escapism from the physical world. It is imperative that legal frameworks are as effective inside virtual reality as they are in the physical world. Furthermore, as we are still speaking about unexplored virtual worlds, it is recommended to establish an effective supervisory board to analyze and discuss how the technology impacts human psychology. As the Metaverse is a new development, we can predict impacts but need to study the effects on the human mind continuously, if possible in real time. In this regard, the article is to be understood as a prediction and an invitation to start the discussion, in order to foster the opportunities and reduce the risks. Under favorable conditions, we can conclude that if humans grow up spending relevant time in the Metaverse with avatars and VBs, it will become a natural part of themselves, leading to a new self-understanding.