1 Introduction

The symbiotic relationship between humans and emerging technologies is crucial in the socio-technical transformation toward Industry 5.0, given the significant value that humans hold in future industrial workplaces [1, 2]. Human-cyber-physical systems (HCPS) [3] and cyber-physical-social systems (CPSS) [4] are emerging paradigms that incorporate human factors into cyber-physical systems. Basic human needs like safety, health, and well-being attract great attention, especially in human-centric smart manufacturing [5]. As such, human-centric techniques play an important role in putting core human needs at the center of the production process [6].

Industry 5.0 identifies digital twins (DT) as one of the enabling technologies best suited to merging the physical and virtual worlds [1]. The success of DT technology in physical systems (e.g., machines and devices) has motivated the exploration of DT in humans [7]. The human digital twin (HDT) has emerged as a promising human-centric technology that has drawn attention from academia and industry [3, 8]. For instance, in the past three years, some concepts relevant to HDT (e.g., digital human and digital twin of person) have appeared in Gartner’s hype cycle for emerging technologies, experiencing upward growth in the innovation trigger phase (see Figure 1). Accordingly, both Wang et al. [3, 9] and Sparrow et al. [10] regard HDT as a promising enabling technology for HCPS that brings humans and physical systems together in a virtual space.

Figure 1 Emerging technologies relevant to HDT in Gartner’s hype cycle

The majority of current HDT research is in the fields of healthcare and medicine, such as the DT of the human body for managing athletes’ fitness [7] and patients’ health [11], or the DT of organs for medical services [12]. For instance, Dassault Systèmes has developed a DT heart that uses magnetic resonance imaging and electrocardiogram (ECG) data to visualize anatomy that is difficult to see in the real world [13]. This kind of HDT is strongly tied to human life and anatomy. In comparison, there are fewer studies on HDT in the manufacturing industry, and the few that exist describe frameworks and concepts only briefly. Wang et al. [8] proposed the HDT-driven HCPS framework that has the potential to enhance human-robot collaboration (HRC). Hafez et al. [14] introduced the HDT concept for monitoring the human-machine state.

In industrial fields, little in-depth research has been conducted on the DT of workers because, from a techno-economic perspective, human needs have not received as much attention as production efficiency and product quality [6]. Nevertheless, more and more researchers argue that humans are irreplaceable in manufacturing for their flexibility, creativity, and problem-solving capabilities [3]. However, human modeling and simulation remain a challenge in HDT because humans are far more complex systems than machines and other devices, and because human behavior is unpredictable [3]. To build a human model, several researchers have combined finite element models (FEM) with computed tomography (CT) [7], but the numerous physiological parameters involved result in huge computational costs [7]. Human-centric manufacturing systems are concerned with human factors, rather than the specifics of human anatomy, in order to improve human well-being and overall system performance [15]. Therefore, an easier and more practical method is needed for human modeling and simulation in HDT.

Human factors, often known as ergonomics, considers different aspects of humans from an interdisciplinary viewpoint, including physical, psychological (affective and cognitive), and social aspects [16]. Digital human modeling (DHM) provides digital solutions for modeling and simulating the physical and cognitive aspects of humans, and bridges computer-aided engineering and human factors engineering [17, 18]. For instance, DHM tools can generate biomechanical models using anthropometric data, enabling the analysis of muscle forces and spine loads during manual material handling [19]. Additionally, DHM models have been constructed and validated with professional domain knowledge from disciplines like anatomy, biomechanics, and psychology [20]. For these reasons, DHM is taken as the entry point of this research and has the potential to enhance HDT modeling and simulation.

To the best of our knowledge, there has been little discussion of the relationship between DHM and HDT. Paul et al. [21] first suggested that a core element of Ergonomics 4.0 is the transition of DHM into DT, the two being distinguished from each other by real-time capability and personalization. The former concentrates on the biomechanical and cognitive models of human manikins [18], while the latter primarily tackles the cyber-physical integration of physical entities [22]. However, it can be difficult for DHM to represent real-world human conditions accurately. Although DHM uses anatomical information to build highly realistic and accurate generic human body models, it often involves scaling the generic model to reflect the statistical body dimensions of the end-user population [21]. In addition, DHM applies inverse kinematics techniques with precise kinematics and dynamics data, which may not be suitable for all simulation situations. It is challenging for DHM to reflect how the human body interacts with realistic environments. In contrast, HDT takes into account real-time data, simulation data, and the fusion of physical and virtual data to meet the needs for personalization and responsiveness [22]. Based on abundant data and models, HDT is able to construct multi-scale human body models to fit different application scenarios and fidelity needs. From the perspective of human factors, HDT could be the future trend for human modeling in DHM, but this is still worth discussing.

According to recent studies, the current trend in human factors is to implement real-time assessments and interventions in the production process [21, 23]. However, DHM is often confined to the design phase because the manikins must be manually adjusted and lab sensors are bulky and expensive [24]. In recent years, a growing number of studies have been combining human factors with advanced digital technologies like the Internet of Things (IoT), artificial intelligence (AI), and eXtended reality (XR) to respond to these development trends [21, 25]. For instance, advanced, unobtrusive, body-worn sensors such as inertial measurement units (IMUs) and wireless wearable electromyography (EMG) are used for on-site measurement, enabling biomechanical analysis during work [26, 27]. These innovations help to rapidly provide accurate data for the virtual-real mapping of humans throughout production or service stages [28]. From the human factors perspective, HDT is a digital representation of the human body that integrates modeling, simulation, and digital technologies to assess the condition of human factors and provide feedback to the system. It is essential to give a holistic view of the relationship between DHM and HDT and to investigate a unified HDT framework that integrates human factors and digital technologies.

In response, this study seeks to answer the following three research questions:

  • What are the distinguishing features and evolutionary shifts involved in transitioning from DHM to HDT?

  • From a human factors perspective, what technical details should be available regarding the HDT framework?

  • What are the future industry applications and challenges associated with HDT?

The rest of this paper is organized as follows: in Section 2, the current state of the art in DHM and HDT is reviewed; in Section 3, the evolutionary shifts involved in transitioning from DHM to HDT are introduced; in Section 4, the HDT framework is elaborated in detail; and in Section 5, the future perspectives are highlighted with open discussions.

2 Overview of DHM and HDT

2.1 Brief Review of DHM

DHM is defined as the simulation of human anthropometric, biomechanical, and perceptual-cognitive attributes in a computerized environment [18, 28]. With DHM, people can make proactive ergonomics designs and economically simulate and test a diverse variety of underlying hypotheses about human behavior [18].

The literature review for DHM focuses on modeling and simulating human behaviors and cognition in science, not in the arts. Since the 1960s, scientific DHM has evolved gradually, driven by rising needs and technological advancements. The evolution of DHM since the 1960s is shown in Figure 2.

Figure 2 Main development lines of DHM (Adapted and modified from Ref. [28])

Based on the functionality, DHM models can be broadly divided into three main branches: anthropometric models, biomechanical models, and perceptual-cognitive models [28]. The functions, applications, and tools corresponding to these three types are summarized in Table 1.

Table 1 Functions, applications, and common tools in DHM

The first type is anthropometric models, which describe the geometric dimensions and physical properties of the human body [38]. These models provide insights for product or space design considering the geometric constraints of the human body, such as workplace or cockpit design in terms of reachability analysis, visibility analysis, and ergonomic assessment (e.g., OWAS and RULA). Representative tools include JACK [29], RAMSIS [30], and SANTOS [31].

The second type focuses on the biomechanical properties of the human body, realizing multibody mechanics of the human musculoskeletal system [39]. Musculoskeletal modeling is usually driven by kinematics and kinetics principles, such as Lagrange’s equation [28]. With these, the calculation of musculoskeletal forces and loading, gait analysis, and the prediction of postures and dynamics can be achieved. Typical tools include AnyBody [32], OpenSim [33], and 3DSSPP [34]. SANTOS and JACK also added biomechanical modules in their later versions. Though rarely studied, biomechanical models can also be developed using finite element analysis [28, 40].
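
For reference, the Lagrange equation mentioned above takes the following standard form for a multibody system. The symbols used here are introduced only for illustration and are not notation taken from Ref. [28].

```latex
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} = Q_i,
\qquad L = T - V
```

Here T and V are the kinetic and potential energy of the body segments, q_i are the generalized coordinates (e.g., joint angles), and Q_i are the non-conservative generalized forces (e.g., joint torques produced by muscles).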

The third type deals with the perceptual-cognitive aspect of humans. It aims to simulate how people perceive, process, and memorize information and how decisions are made so that the cognitive state and performance of humans can be predicted [41]. Some researchers also refer to it as human performance modeling [42]. This type is less studied than the other two, as modeling the cognitive aspect of a human is more challenging. Most underlying theories are conceptual or empirical, so the context of use is limited. Typical models include ACT-R [35], QN-MHP [36], and SOAR [37].

It is worth noting that the distinction between the three types is not absolute. For instance, anthropometric models are sometimes a prerequisite for measuring complex movements in biomechanical models, since they provide the lengths of limb segments, the locations of mass centers, and so on [37, 43, 44].

Currently, many DHM tools have been commercialized and are widely used in different industries [32, 45]. Some DHM software tools attempt to synchronize motion and physiological data acquired from sensors by integrating multiple modules and providing functional modules or interfaces [45, 46]. For example, biomechanical analysis and monitoring are achieved by integrating musculoskeletal models with synchronized EMG, force, and motion data [47, 48]. Although DHM tools have worked on data synchronization, several limitations remain: (1) they can only be used in a lab setting for short-term monitoring, (2) their model parameters are fixed by offline learning, and (3) they pay little attention to the historical behavior and contextual information of individuals.
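
As a minimal sketch of what synchronizing EMG, force, and motion data can involve, the following Python snippet resamples several sensor streams onto a common reference timeline by nearest-timestamp lookup. The function names, the stream layout, and the nearest-neighbour strategy are assumptions made for illustration; they do not describe how any of the cited DHM tools implement synchronization.

```python
from bisect import bisect_left


def nearest_sample(timestamps, values, t):
    """Return the value whose timestamp is closest to t (timestamps sorted ascending)."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return values[0]
    if i == len(timestamps):
        return values[-1]
    before, after = timestamps[i - 1], timestamps[i]
    # Prefer the later sample only when it is strictly closer to t.
    return values[i] if (after - t) < (t - before) else values[i - 1]


def synchronize(reference_ts, streams):
    """Resample every stream onto reference_ts.

    streams maps a (hypothetical) stream name to a (timestamps, values) pair.
    Returns one dict per reference timestamp.
    """
    rows = []
    for t in reference_ts:
        row = {"t": t}
        for name, (ts, vals) in streams.items():
            row[name] = nearest_sample(ts, vals, t)
        rows.append(row)
    return rows
```

In practice, interpolation or hardware triggering may be preferable to nearest-neighbour lookup when the sampling rates of the sensors differ widely.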

2.2 Brief Review of HDT

With increasing applications of DT in many fields, research on HDT is gradually emerging. The state-of-the-art of HDT is presented in terms of both human health and performance.

In terms of human health, some researchers have suggested that HDT technology can be used to monitor and predict an individual’s health status. As a related standard, ISO/IEEE 11073 makes it easier to use the gathered health data [49], laying the foundation for the implementation of HDT. El Saddik et al. [50] developed a DT ecosystem to promote health and well-being in a data-driven manner, although it lacks mechanism models (e.g., biomechanical models of humans). Okegbile et al. [51] proposed an HDT framework for personalized healthcare services and highlighted its key technologies, challenges, and future directions. Ferdousi et al. [52] compared the well-being DT with the product DT and gave attention to the mental health and social aspects of human beings. Moreover, HDT focusing on certain scenarios or human organs has also been widely discussed. Barricelli et al. [7] studied HDTs of a sports team, which could monitor and manage athletes by collecting measurements describing their behavior. Cardio twin, a DT of the human heart that runs on the edge and is based on the ecosystem in Ref. [50], was presented to detect, prevent, and reduce the risk of heart disease [12]. He et al. [53] proposed a method for constructing a shape-performance integrated DT of the lumbar spine to predict the real-time biomechanics of the lumbar spine during human movement. These studies have greatly promoted the development of HDT, but there are still certain limitations in scalability and universality.

In terms of human performance, there is no recognized standard or framework for HDT due to the wide range of application scenarios involved. The main topic of relevant studies is how HDT can be used to enhance human performance or better support human roles in various scenarios. Bilberg et al. [54] proposed a DT-driven HRC assembly system for dynamic skill-based task allocation between humans and robots, task sequencing, and real-time control of flexible assembly cells. Fan et al. [55] explored a vision-based HDT modeling approach for three human statuses in an HRC scenario: 3D human posture, action intention, and ergonomic risk. According to Graessler et al. [56], the development of an HDT that can represent employees enables user feedback to be generated instantly and used to support decision-making in the production system. In human-centered application scenarios, some studies focus on creating a more lifelike twin of humans to improve the interactive experience by combining interaction and visualization technologies. Orts-Escolano et al. [57] proposed an end-to-end system for Augmented Reality/Virtual Reality (AR/VR) telepresence that reconstructs people, furniture, and objects in 3D in real time and transmits them to remote users. Chen et al. [58] used AR technology to deliver interactive, holistic, whole-body visual information to create a virtual human body, and conducted safe work posture training. Meanwhile, Wang et al. [9] presented a comprehensive discussion of the definition, architecture, enabling technologies, and applications of HDT from an Industry 5.0 perspective.

Currently, although studies on HDT reflect the key features of DT such as interoperability, real-time capability, and fidelity, the implementation of HDT is still in its early stage [51]. Most studies focus on a specific organ, function, or aspect of life, lack a holistic description of the human body, and have poor applicability in other scenarios like manufacturing. Compared with DHM, HDT includes real-time information detection and an AI-inference module [12], which can address the limitations of DHM to some extent, such as the absence of real-time data import and the reliance on parameters derived from offline learning. Moreover, HDT is applied not only for simulation but also for monitoring and intervention.

2.3 Comparison of DHM and HDT

Based on the reviews above, Table 2 summarizes a comparison of the key features of DHM and HDT. Both technologies use digital representations to demonstrate human behavior. DHM is used to simulate and predict the physical and cognitive behavior of humans, grounded in underlying mathematical models and theoretical principles, but it is rarely used to track real-time human performance or physical safety [42]. In contrast, HDT relies on computational resources and real-time data from the physical world to maintain continuous high-fidelity monitoring, prediction, and proactive interaction. As a result, DHM is a valuable approach for understanding human behavior, providing strong interpretability based on well-established scientific knowledge, whereas HDT emphasizes timeliness and personalized services.

Table 2 Comparison of DHM vs. HDT in key features

3 Moving from DHM to HDT

Modeling and simulation are essential steps in the creation of an HDT model. The literature review indicates that HDT integrates a variety of technologies that provide efficient ways to address the limitations of DHM. In turn, DHM provides a strong theoretical foundation for modeling and simulating humans in their physical, physiological, and psychological aspects [59]. This indicates that the transition from DHM to HDT deserves closer examination to further the development of human factors. As shown in Figure 3, industrial systems now contain diverse elements, which increase the complexity of interaction between humans and systems. This increase has led to a gradual shift in human factors from observation-based techniques, such as time-and-motion studies, to techniques based on multiple sensors and intelligent algorithms [60]. From the perspective of human factors, the evolution of DHM into HDT will embrace three characteristics: model-data-hybrid-enabled, online learning, and timely interaction.

Figure 3 Major advances of human factors in the industry

3.1 From Model-Enabled to Model-Data-Hybrid-Enabled

DHM is a model-driven approach that generates posture and motion based on anthropometric databases, inverse kinematics, motion capture (MoCap) systems, etc. [61]. Although model-enabled approaches provide a strong theoretical foundation, they lack diversity in personal characteristics and environmental conditions [62]. Data-driven approaches may be able to bridge the gap. For instance, by combining the anthropometric model with data-based body size variation, new motions for humans getting in and out of vehicles can be generated without MoCap [62]. Additionally, data-driven methods are in high demand for personalization, continuous monitoring, and evaluation, whereas model-enabled solutions are generally theoretical or hypothetical [25, 63]. For example, a musculoskeletal model can reveal a lot of information about muscular dynamics, but it is so complex that it requires substantial computing resources [64]. A hybrid model that combines a musculoskeletal model with a predictive skeletal model could significantly reduce the use of experimental data and increase calculation efficiency [64]. As a result, our objective is to develop a model-data-hybrid-enabled HDT framework that involves different DHM models and multi-modal data sources.

3.2 From Offline to Online Learning Paradigm

HDT is anticipated to be used in realistic scenarios to monitor and evaluate the health status of workers and system performance [12], as well as to provide two-way feedback between humans and machines/robots [3]. Therefore, it is crucial to implement real-time feedback and an online learning paradigm [65]. However, common DHM tools have limitations in real-time use, such as the need for short-term data recording and offline post-analysis in a lab setting, and the fixed form of some model parameters (e.g., mechanical properties of muscle fiber) [62]. Thus, a data-driven AI-inference module is essential in the HDT framework to fill these gaps. On the one hand, machine learning (ML) methods can be used to evaluate performance [66] and ergonomic risk [67] based on data from non-intrusive devices (e.g., Kinect and IMUs). For example, Luka et al. [68] used ML to estimate muscle fatigue online in real time based on data from offline biomechanical models. On the other hand, deep learning (DL) [66] methods such as recurrent neural networks (RNN) [69] achieve better accuracy on time-series data, enabling the analysis of motion history and context information.
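
To make the contrast between offline and online learning concrete, the sketch below shows a linear model whose parameters are updated one sample at a time by stochastic gradient descent, instead of being fixed after a single offline fit. It is a generic illustration under assumed names (`OnlineLinearModel`, `learning_rate`) and is not the method used in Refs. [65, 68].

```python
class OnlineLinearModel:
    """Linear regressor updated incrementally as each new sample arrives."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate  # step size, chosen here for illustration

    def predict(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def update(self, x, y):
        """One gradient step on the squared error of a single (x, y) sample."""
        error = self.predict(x) - y
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]
        self.bias -= self.learning_rate * error
        return error
```

Each call to `update` corresponds to one newly streamed observation, so the model keeps adapting during use; an offline-trained model would instead keep its parameters fixed once deployed.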

3.3 From Pre-Design to Timely Interaction

DHM is a proactive ergonomics method used to pre-design products, workstations, and tasks [70]. HDT, by contrast, focuses on real-time monitoring and intervention and places more emphasis on perceptibility and timely interaction. Perceptibility is the ability to perceive human behavior and reactions, and it is related to advances in wireless and sensor technologies [71]. Timely interaction helps to ensure operators’ health and safety and enhances the user experience [25]. Specifically, prompt feedback to individuals is essential for the proactive prevention of musculoskeletal disorders, action training guidance, decision support, etc. [72]. For example, timely interaction between humans and robots in HRC scenarios could improve accuracy and safety in complex and flexible co-tasks [71, 72]. In addition, by monitoring operators and intervening in a timely manner, HDT can quickly update robot control strategies, e.g., planning a collision-free trajectory [73,74,75]. In general, a modularized user interface (UI) in HDT whose components are flexible and replaceable is essential to ensure timeliness as new modes of interaction (e.g., gestures and voice) emerge.

4 Framework of HDT in Human Factors

Based on the above analysis, a unified HDT framework is proposed that integrates human factors and digital techniques to deal with complex and dynamic situations in reality. As shown in Figure 4, the proposed HDT framework is composed of the Physical Twin (PT), the Virtual Twin (VT), and the linkage between the PT and VT. It is designed based on a common connotation of DT according to Refs. [76, 77] and follows a widely accepted HDT framework for health and well-being in Ref. [50], which has been applied to the DT of the human heart [12]. These studies provide a solid theoretical foundation for the HDT framework. However, the existing HDT framework is not suitable for our application scenarios and goals from a human factors perspective.

Figure 4 Framework of HDT from a human factors perspective

The HDT is a model-data-hybrid-enabled method that can extend proactive ergonomic design to intelligent ergonomic services. It combines human modeling methods with AI techniques to build multi-scale human models based on abundant data from the PT. Iterative and self-updating models can then be enabled by continuous data and information in a feedback loop, allowing real-time analysis and timely interaction between the PT and VT.

In this section, the proposed HDT framework is presented in detail. The content of the three main components (data source, digital engine, and interface) has been tailored to meet the technical requirements of human factors analysis. Firstly, this framework prioritizes the twinning of work-related human factors, such as body movements and cognitive status, rather than high-fidelity anatomical representations. Additionally, it integrates ergonomic methods like biomechanical modeling with AI-inference methods, which ensures model timeliness as well as accuracy, efficiency, and personalization through iteration. Furthermore, it offers multi-scale human body models adaptable to different fidelity needs. For instance, the HDT model for an operator can incorporate varying levels of detail: high-detail hand models are necessary for assembly workers collaborating with robots to enable real-time adjustments in robot path planning based on their gestures, whereas manual laborers need low-detail musculoskeletal models with an emphasis on general work postures to prevent musculoskeletal disorders (MSDs).

4.1 Data Source

Multi-modal data from multiple sources need to be obtained by smart sensors in order to characterize the physical human being comprehensively, covering physical, psychological, and social aspects. It is important to note that the acquisition of such data in daily work should preferably adhere to three principles: non-interference, non-intrusiveness, and non-infringement of privacy.

There are four types of data according to their sources and usages. (1) Physiological signals, including electrophysiological signals such as EMG, ECG, and electroencephalogram (EEG), which measure the electrical signals emitted by skeletal muscles, the heart, and the brain, respectively [78], as well as vital-sign data such as heartbeat and blood pressure from wearable sensors and devices. (2) Motion data from Kinect or from optical or IMU-based MoCap systems, used to measure real body movements, and sensing data from pressure and infrared sensors to identify human posture and activities. (3) Other data from social networks and the environment to capture the social environment and contextual information. (4) Demographic data of the subjects, such as age, gender, BMI, and years, which are crucial to model personalization.
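
One possible way to organize these four data types in software is sketched below as a pair of Python dataclasses. The schema, including the class names `Demographics` and `HdtSample` and every field name, is hypothetical and serves only to illustrate how multi-modal records could be grouped per timestamp.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Demographics:
    """Subject-level attributes used for model personalization."""
    age: int
    gender: str
    bmi: float


@dataclass
class HdtSample:
    """One time-stamped observation of the physical twin (hypothetical schema)."""
    timestamp: float  # seconds since the start of recording
    physiological: Dict[str, float] = field(default_factory=dict)  # e.g. {"heart_rate": 72.0}
    motion: Dict[str, Tuple[float, float, float]] = field(default_factory=dict)  # joint -> (x, y, z)
    context: Dict[str, str] = field(default_factory=dict)  # e.g. {"task": "assembly"}
    demographics: Optional[Demographics] = None

    def available_modalities(self) -> List[str]:
        """List the data-source types that actually carry data in this sample."""
        names = []
        if self.physiological:
            names.append("physiological")
        if self.motion:
            names.append("motion")
        if self.context:
            names.append("context")
        if self.demographics is not None:
            names.append("demographics")
        return names
```

Keeping the modalities optional reflects that not every sensor is available at every moment, so downstream modules can check which inputs are present before running an analysis.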

All of these data are relevant to the study of human factors aimed at improving performance, well-being, and health, and the majority of the data will be collected, stored, and transmitted in a synchronized manner [66]. For instance, wearable IMU and EMG sensors have been shown to capture motion and muscular activity in real time, enabling online MSD assessment by offering reliable and objective measurements [27]. Besides, EEG signals are widely used in conjunction with big data analytics and AI algorithms for emotion recognition [79] and mental fatigue detection [80].

There are undoubtedly technical challenges during collection, storage, and transmission. On the one hand, flexible and nonintrusive sensors should be developed to collect data with the minimal possibility of discomfort and disruption to daily tasks. On the other hand, data transmission and synchronization mechanisms should be ensured due to the daily upload of a large volume of data to the server. For instance, if a huge volume of instant video stream data is required, a rapid and efficient communication system is a major guarantee of the data transmission capabilities. Additionally, adopting an edge-cloud IoT architecture can increase the capacity of data storage, processing, and computing, and edge devices may make it possible to respond to massive amounts of data more rapidly and in real-time.

4.2 Digital Engine

The digital engine serves as the trunk or brain of the VT in the proposed framework. The digital engine is composed of two parts: a human modeling engine and an AI-inference engine. For real-time monitoring, assessment, prediction, and optimization, the digital engine plays an important role in model-data-hybrid-enabled simulation and intelligent analysis to extract behavioral and contextual information.

The human modeling engine is a model-enabled module. To give a comprehensive description of the human body, it should consider health, cognition, and biomechanics models by using physiological signal data, motion data, contextual information, etc. Health modeling mainly considers mental health and physical health. For mental health, models focus on cognitive load measurements to address mental stress from the workplace [81]. For physical health, relevant mathematical models include static endurance time models and dynamic muscle fatigue models [82]. In terms of cognitive modeling, perceptual-cognitive models are used to evaluate operators’ capacity and ability for conducting mental tasks (e.g., perception, awareness, and memory) [81, 83]. On the side of biomechanics, 3D biomechanical models of the musculoskeletal system are built via motion data and an anthropometry database. Kinematics and kinetics calculations are used to simulate movements for examining joint loads and muscle force [84].
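
As a much simplified illustration of the kinetics calculations mentioned above, the following sketch computes the quasi-static moment about a single joint that is needed to hold a body segment and a hand-held load still against gravity. It assumes a rigid single-segment model with hypothetical parameter names; full musculoskeletal tools resolve many segments, muscle forces, and dynamic effects that are ignored here.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def static_joint_moment(segment_mass, com_distance, load_mass, load_distance, angle_deg=0.0):
    """Moment (N*m) about a joint required to hold a segment and a load at rest.

    segment_mass, load_mass: masses in kg.
    com_distance, load_distance: distances in m from the joint, measured along the segment.
    angle_deg: elevation of the segment from the horizontal, in degrees.
    """
    # Only the horizontal projection of each distance acts as a lever arm for gravity.
    lever_factor = math.cos(math.radians(angle_deg))
    return G * lever_factor * (segment_mass * com_distance + load_mass * load_distance)
```

For example, a horizontal forearm of 1.5 kg with its mass center 0.15 m from the elbow, holding a 5 kg load at 0.35 m, would be evaluated as `static_joint_moment(1.5, 0.15, 5.0, 0.35)`; these numbers are illustrative and not taken from an anthropometric database.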

Notably, the twinning of cognitive abilities is the most challenging. Because the structure and function of the human brain are intricate, not to mention its cognitive mechanisms, it is currently hard to establish an accurate model of cognitive performance. Nonetheless, non-invasive neuroimaging techniques, such as EEG, are used to facilitate brain-computer interfaces and enhance safety within symbiotic HRC scenarios, according to Wang et al. [85]. These technologies hold the potential to reflect cognitive activities and the brain’s functional states in support of HDT. However, this neural information is constrained by accuracy, resolution, and latency limits [85].

The AI-inference engine is a data-enabled module that includes data pre-processing, feature extraction and representation, data analytics and inference, and decision-making. Data pre-processing is in charge of filtering the acquired raw data, synchronizing it with timestamps, and normalizing it. Based on the pre-processed data, feature extraction is conducted to develop a feature set that represents the physiological traits, which helps to reduce redundant data in the dataset. Data analytics and inference are the next crucial step. Popular AI algorithms like ML and DL can provide a range of solutions for various objectives, including recognition, monitoring, assessment, and prediction [86,87,88]. The outputs, such as MSD risk assessments, action safety guidance, and health state inferences, can support decision-making. When it comes to AI inference, the modalities of the input data and the computational resources need to be treated with caution.
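
The stages of this module can be sketched as a small pipeline. In the Python example below, normalization by a per-subject reference value and a moving-average filter stand in for pre-processing, two amplitude features stand in for feature extraction, and a fixed threshold stands in for the inference step that a trained ML or DL model would perform. All function names and the threshold value are assumptions for illustration, and timestamp synchronization is omitted for brevity.

```python
import math
from statistics import mean


def normalize(signal, reference):
    """Scale raw samples by a per-subject reference value (e.g. a calibration maximum)."""
    if reference <= 0:
        raise ValueError("reference must be positive")
    return [s / reference for s in signal]


def moving_average(signal, window=3):
    """Causal moving-average filter; the first samples use a shorter window."""
    return [mean(signal[max(0, i - window + 1): i + 1]) for i in range(len(signal))]


def extract_features(signal):
    """Summarize a window of samples with two simple amplitude features."""
    return {
        "rms": math.sqrt(mean(s * s for s in signal)),
        "peak": max(abs(s) for s in signal),
    }


def assess(features, rms_threshold=0.3):
    """Placeholder decision rule; a trained model would replace this step."""
    return "elevated" if features["rms"] > rms_threshold else "normal"


def run_pipeline(raw, reference, rms_threshold=0.3):
    """Pre-processing, then feature extraction, then inference."""
    cleaned = moving_average(normalize(raw, reference))
    features = extract_features(cleaned)
    return features, assess(features, rms_threshold)
```

Separating the stages into small functions mirrors the modular structure described above, so that, for instance, the threshold rule could be swapped for a learned classifier without touching the pre-processing code.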

Additionally, since both the human modeling engine and the AI-inference engine have advantages and disadvantages, it is sometimes better to combine the two technologies to ensure real-time efficiency and personalization. For instance, a vision-based DL algorithm extracts a human 2D skeletal pose in real time; skeletal models can then be used to predict the kinematics and kinetics of human motions; finally, muscle forces and joint loads can be estimated after importing the reconstructed posture data into musculoskeletal models. This method is computationally efficient since the hybrid model combines the skeletal model’s rapid motion prediction with muscular dynamics assessment [64]. In terms of personalization, physical activity levels, preferences, and behavior patterns vary widely among individuals. The general approach of the human modeling engine is effective in developing generalized human models with unified characteristics (e.g., BMI and gender) but is limited to a statistical population level. Hence, the AI-inference engine emerges as an essential tool in building personalized models. The AI module combines data from the generic model with real-world data, including individual features and contextual information. It learns about individual preferences and behavioral patterns using this integrated data. Furthermore, it enables dynamic updating of the parameters of HDT models. For example, Liao et al. [88] developed DT models of drivers to predict personalized lane change behavior online, considering their preferences. Additionally, Moztarzadeh et al. [89] increased the usefulness of DTs for cancer prediction by dynamically modifying decision trees based on dynamic data.
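
One small step in the hybrid example above, turning an extracted 2D skeletal pose into a joint angle that a skeletal or musculoskeletal model can consume, can be sketched as follows. The pose estimator itself is not shown, and the function name and the keypoint argument names are hypothetical; the keypoints are assumed to be (x, y) image coordinates.

```python
import math


def joint_angle_deg(proximal, joint, distal):
    """Angle in degrees at `joint` between the segments joint->proximal and joint->distal.

    Each argument is an (x, y) keypoint, e.g. shoulder, elbow, and wrist for the elbow angle.
    """
    v1 = (proximal[0] - joint[0], proximal[1] - joint[1])
    v2 = (distal[0] - joint[0], distal[1] - joint[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    # atan2 of |cross| and dot is numerically stable near 0 and 180 degrees.
    return math.degrees(math.atan2(abs(cross), dot))
```

Because the angle is computed from 2D projections, it depends on the camera viewpoint; a 3D pose or multiple views would be needed for angles that are independent of the viewing direction.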

However, creating an iterative and self-updating model requires a technological breakthrough due to the real-time and high-frequency motion data needed. Therefore, data transmission interoperability, data normalization, and communication are essential when building a large-scale multidisciplinary HDT model.

4.3 Interface

Thanks to the AI-inference engine, HDT can learn users’ behavior and preferences through complex interaction with the real environment. The interface plays a central role in virtual-reality interaction, supporting the immersive interaction of digital and physical humans using different interaction techniques. In this framework, HDT can ultimately represent and visualize the physical and cognitive status of humans digitally and send feedback to the corresponding physical human based on the modeling and analysis results. As shown in Figure 4, the interface, functioning as the skin of the VT, includes three modules: function, visualization, and multimodal interaction.

The function module serves the following purposes: monitoring, assessment, prediction, and recognition. Some of these functions are widely applied in human factors, including performance analysis and prediction, ergonomics assessment, health monitoring, and occupational risk monitoring, as well as intention and action recognition. In addition, for visualization, a digital representation of the physical human is made available to create an end-to-end immersive twin that can be displayed more vividly on terminal devices. To this end, this HDT framework concentrates on accurate virtual-real mapping of essential human factors, such as body movements, working postures, and physical and cognitive status, rather than on a high-fidelity representation of human anatomy. Finally, the digital avatar must interact with both physical humans and other VTs through multimodal interaction. Advanced technologies like AR/VR, wearables, and high-performance computing help to create an all-around and immersive interactive experience with sight, sound, touch, and smell. For example, AR is used to generate virtual objects in the same environment through precise spatial and temporal mapping. Wearable UI may become a trend that enables users to easily interact with small wearable devices with touchscreen displays and receive timely feedback (e.g., vibration alerts or message notifications) [7]. High-performance computing is also necessary for better synchronization and a more user-friendly experience.
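To make the monitoring and assessment functions more concrete, the sketch below shows how a working posture might be checked from 2D skeletal keypoints. It is only an illustration under stated assumptions: the function names, the image-coordinate convention (y grows downward), and the 20-degree flexion limit are hypothetical and are not prescribed by the framework or by any ergonomics standard cited here.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def joint_angle(a: Point, b: Point, c: Point) -> float:
    """Angle in degrees at point b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("keypoints must not coincide")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def trunk_flexion_deg(shoulder: Point, hip: Point) -> float:
    """Angle between the trunk (hip -> shoulder) and the upward vertical.

    Assumes image coordinates, where "up" is the negative y direction.
    """
    point_above_hip = (hip[0], hip[1] - 1.0)
    return joint_angle(shoulder, hip, point_above_hip)


def posture_alert(shoulder: Point, hip: Point, limit_deg: float = 20.0) -> bool:
    """Return True when trunk flexion exceeds a (hypothetical) limit."""
    return trunk_flexion_deg(shoulder, hip) > limit_deg


if __name__ == "__main__":
    # Upright trunk: shoulder directly above the hip in image coordinates.
    print(posture_alert(shoulder=(100.0, 50.0), hip=(100.0, 150.0)))
    # Bent trunk: shoulder displaced forward as far as it is raised.
    print(posture_alert(shoulder=(200.0, 50.0), hip=(100.0, 150.0)))
```

A result of this kind could feed the multimodal interaction module, for example by triggering a vibration alert on a wearable device when the limit is exceeded.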

5 Future Perspectives

5.1 Industrial Application Trends

The widespread adoption of DT technology across all areas of our lives and work has promoted the development of HDT in the medical and sports fields. However, HDT is still in its infancy in the industrial field because human needs were not prioritized in the technology-centered period; instead, the focus has been on the ongoing pursuit of production quality and efficiency. In light of the growing emphasis on human factors in Industry 5.0 and the HCPS paradigm, our research has found that HDT has the potential to provide intelligent services, including data visualization, monitoring, prediction, and immersive interaction, by considering human factors throughout the product lifecycle. Furthermore, a few leading companies have shown interest in applying HDT in the industrial field, including the clothing, manufacturing, and metaverse industries (see Figure 5). Meanwhile, multi-scale human body models can be utilized to adapt to different application needs.

Figure 5

Representative HDT application trends in industries

With customization and personalization becoming prevalent over the years, anthropometric measurement has become a key technology in clothing customization, so that clothing sizes can fit every customer. In the past, personal tailoring required precise tape measurements of each client. Currently, Magic Weaver Inc. is emerging as a representative HDT application in clothing customization, as it demonstrates the feasibility of AI body digitalization and measuring solutions for clothing companies, including smart size recommendations and tailor-made services [90]. The best-fit sizes of individual customers are computed from their photos through 3D human body modeling technologies. By generating realistic human body models, it has also reduced the return and exchange rate by 70% for some customers, as well as the invalid inventory of clothing companies. In the future, vision-based recognition methods for whole-body measurements can be used to generate HDT models at an anthropometric level.

In the manufacturing industry, humans and robots are working closely together, and collaboration is becoming a hot topic in HRC scenarios. Humans and robots share the same workspace for flexible automation, with robots handling physically demanding, repetitive, and tedious tasks. HDT is conducive to monitoring the human body and providing real-time information about the environment for worker safety in HRC. As a typical application, KUKA’s collaborative, sensitive robots can work more effectively with operators than conventional robots [91]. One of the key reasons is that the motions of humans and robots can be monitored with sensors and predicted digitally, so that humans and robots can interact in close proximity with efficiency, precision, and safety. Therefore, cognitive modeling of HDT using monitoring data will enhance communication between humans and robots for flexible task coordination.

In addition, occupational health and safety issues have always attracted attention in the manufacturing field [92]. As technology advances, industrial exoskeletons are integrated with AI, sensors, and other technologies to augment human capabilities and reduce fatigue [93]. Accordingly, multi-scale HDT models can be combined with exoskeletons to improve the physiological state of workers by providing wearable tools and ergonomics data. For instance, German Bionic Inc. developed wearable exoskeleton devices with software platforms for real-time monitoring and posture risk identification [94]. Ekso Bionics Inc. offered lightweight exoskeletons with 3D body visualization for assembly workers [95]. Furthermore, Cyberdyne Inc. created the HAL® exoskeleton to enhance users’ physical functions according to their intentions via biometric sensors [96]. In the future, wearable exoskeletons combined with more bio-electric sensors may support higher-fidelity HDT models to monitor workers’ health risk factors and provide adaptive external support.

In the emerging metaverse industry, many companies are devoted to improving the connection between users and the virtual world using VR/AR technologies to create a complete loop of human-machine interaction. A touch glove developed by Meta (Facebook) Reality Labs reproduces a wide range of subtle sensations, including pressure, texture, and vibration, giving users the sense that they are physically touching virtual objects [97]. Pebble Feel, a wearable device that enables VR users to perceive the temperature in a virtual environment, was unveiled by Shiftall, a Panasonic subsidiary [98]. In the future, the metaverse will need a socially interactive, immersive network environment within the HDT framework to serve as a multi-user platform.

5.2 Challenges

In this work, a comprehensive overview of the shift from DHM to HDT and its characteristics was presented. Then, an HDT framework was proposed, combining human factors with advanced digital techniques to meet the needs of modern industry. Meanwhile, the promising future of HDT for industrial applications was also highlighted. However, HDT still has a long way to go in terms of theories, technologies, implementation, and public attention. Several technical and social challenges must be addressed to fully realize its potential.

  • Multidisciplinary cooperation. As human modeling in the HDT framework has considered multiple aspects, knowledge from various disciplines is needed, such as brain science, psychology, and biomechanics. For instance, a finite element model was constructed by combining a digital twin of the lumbar spine with AI, data analytics, and biomechanics to predict real-time biomechanics during human movement [53]. This requires investigating the combination of multidisciplinary methodologies, theories, or models, as even creating a high-fidelity dynamic visualization of a small part of the lumbar spine is complex.

  • Privacy-preserving. The HDT utilizes various data to identify and assess the states of users, raising privacy and security concerns [99]. There are three considerations in this regard. Firstly, user privacy is a critical concern and is influenced by how data is acquired. For example, the motion data used to provide action guidance in HDT can be obtained through cameras, wearable devices, or ambient sensors. The level of user acceptance of these data acquisition methods will depend on the degree of intrusiveness and obtrusiveness associated with daily usage. Secondly, user privacy concerns vary based on the type of data source. Personal information should be used and transmitted prudently with the user’s consent. The accessibility of this data for specific tasks requires discussion. Finally, the central processing of user data by third parties can raise privacy concerns because users are concerned about their data being maliciously leaked or attacked. Edge computing and federated learning can help process sensitive data more securely. Further exploration is needed to determine the scenario applicability of general authentication approaches and encryption mechanisms in edge computing.

  • Multimodal data fusion. The selection of sensing devices and sensors determines the usability and variety of HDT applications. The multi-source heterogeneous data in HDT require integrating the valuable information from each modality while discarding redundant features. For instance, an open question is how to combine body movements, heart rate, and working context data to recognize a worker’s stress level. Multimodal sensory data fusion can compensate for inconsistent measurements from individual sensors [78].

  • High-fidelity issue. The ML and DL approaches place a greater emphasis on correlation than on causality, which can result in a lack of interpretability. However, an explainable model is important for gaining user trust and for supporting its applications. For instance, if a high risk of MSDs is predicted, would this prediction be trusted and used to develop follow-up measures without explanation? Therefore, there are still challenges to be addressed in evaluating model-data hybrid-enabled technologies in HDT with high fidelity. The growing field of eXplainable Artificial Intelligence (XAI) can assist in the development of models with interpretability [100].

  • Autonomy. The cyber twin of the HDT can perform data analysis and decision-making. However, care should be taken in determining the level of automation for these decisions because they may be based on incomplete information. As a result, the distribution of privileges between the user and the system requires further investigation. To advance the development and implementation of HDT, related policy, legislation, and ethical considerations must be taken into account.

  • Regulatory compliance. The ability of HDT to capture and reflect changes in individual characteristics accurately and in a timely manner may give rise to privacy and data regulatory issues. Data related to individuals, such as facial data, should belong to those individuals [101]. The owner’s informed consent is essential to ensure the legal use of data. This is critical in domains like medical treatment. Strict laws and guidelines should govern HDT ownership and data security, with regulatory mechanisms tailored to the sensitivity of information in each industry. Through technological innovation, legal regulation, and ethical guidance, the potential and risks of HDT can be balanced to achieve a positive impact on individuals and society.
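As a small illustration of the multimodal data fusion challenge listed above, the following sketch fuses redundant estimates of the same quantity from several sensors by inverse-variance weighting, so that less reliable sensors contribute less. The function name `fuse_estimates`, the heart-rate example, and the assumption of independent sensor errors with known variances are hypothetical; the fusion methods used in the cited works may differ.

```python
from typing import List, Tuple


def fuse_estimates(estimates: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Fuse (value, variance) pairs that measure the same quantity.

    Assumes independent sensor errors. Each sensor is weighted by the
    inverse of its variance, so noisier sensors count for less.
    Returns the fused value and the variance of the fused estimate.
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    weights = []
    for _, variance in estimates:
        if variance <= 0:
            raise ValueError("variances must be positive")
        weights.append(1.0 / variance)
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, estimates))
    fused_value /= total_weight
    return fused_value, 1.0 / total_weight


if __name__ == "__main__":
    # Hypothetical heart-rate readings (bpm) from a chest strap (precise)
    # and a camera-based estimate (noisier).
    value, variance = fuse_estimates([(70.0, 1.0), (80.0, 4.0)])
    print(value, variance)
```

The fused variance is never larger than the smallest input variance, which is one sense in which fusion compensates for inconsistent measurements from individual sensors. Fusing heterogeneous modalities (e.g., motion, heart rate, and context) generally requires learned feature-level or decision-level fusion rather than this simple rule.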

It is hoped that this effort will lead to more open discussions among practitioners and researchers, and motivate cross-disciplinary ideas, which continuously enrich the theories and technologies for the future of HDT.