Introduction

The rapid growth of digital technologies has brought about a new era in the tourism industry, with big data emerging as a key driver of innovation and progress. This has led to a range of opportunities and challenges for industry stakeholders (Ardito et al., 2019). Technological advancements in big data have also moved the sector toward sustainable practices (Bulchand-Gidumal, 2021; Rahmadian et al., 2022). Leveraging big data can provide valuable insights to tourism stakeholders, enabling them to predict travel demand, make informed decisions, manage knowledge flow and customer interactions, and provide efficient and effective services (Buhalis & Leung, 2018). It also offers benefits such as increased productivity, improved customer satisfaction, personalized marketing campaigns, and streamlined operations. Additionally, big data supports decision-making and enables services based on various data sources (Loureiro, 2018; Zhang et al., 2019). To facilitate decision-making, big data is further integrated into artificial intelligence-based technology and the Internet of Things (IoT) (Nguyen et al., 2017; Lisi & Esposito, 2015). For example, mobile applications for smart tourism can use big data analytics to provide information about visitors, enabling analysis of travel routes and their impact on the environment (Kim et al., 2019). Another example is IoT-based systems for personalized navigation in smart museums, which improve the visitor’s cultural experience (Del Fiore et al., 2016).

The emerging concept of Digital Twin (DT) technology relies on artificial intelligence (AI), machine learning (ML), and the IoT. It is generally described as a comprehensive simulation of a system, such as a machine, that incorporates various physics, scales, and probabilities. This simulation mirrors the entire lifecycle of its twin by leveraging the most accurate physical model, real-time sensor data, historical records, and other relevant information (Glaessgen & Stargel, 2012). A DT also functions as a virtual image that defines the comprehensive physical and functional characteristics of the entire product life cycle, and it can transmit and receive product information. One of the primary functions of a DT is to predict system reactions before they occur by matching the system’s current analysis and current response with behavioral predictions (Schleich et al., 2017). The accuracy of such predictions is contingent on the completeness of data collection, the ability to share the data, and the simulation itself. DT has been implemented in many sectors and industries, including smart cities (Farsi et al., 2020). The rapid growth of new ICT and IoT applications in tourism makes the implementation of DT an important milestone for smart and sustainable tourism. With the development of cyber-physical systems and IoT, data is increasingly becoming a key asset and competitive advantage that can be utilized for multiple purposes, including DT applications.

As the importance of data as a strategic asset continues to grow rapidly across industries, including tourism, effective data management has become a key concern for organizations. To address this, a suitable big data governance framework must be established. Big data governance refers to managing big data in an organization and its use for decision-making through various analytical tools. Governance is crucial in this digital realm to create legitimacy, balance power among stakeholders, and ensure compliance with complex laws and regulations, particularly those related to privacy and security (Khatri & Brown, 2010). Therefore, a big data governance framework is essential to ensure that DT technology, as a software architecture, can be used effectively as a tool to address the problem of sustainable tourism based on big data, AI, and IoT. The primary objective of the framework is to facilitate seamless data access for optimal machine learning performance while ensuring lawful and ethical handling, storage, and processing of both supplier and user data in accordance with applicable regulatory guidelines (Al-Badi et al., 2018).

Furthermore, in recent years, effective knowledge management in software architecture has also received increasing attention (Tang et al., 2010). To address this issue, the documentation of architectural decisions (ADs) has emerged as a promising approach (Van Heesch et al., 2012). These decisions establish the system’s overall structure, behaviour, and quality. In this study, we aimed to address practical applicability and validate our previous research on the documentation framework for architecture decisions (DFAD) by combining the academic approach with practical constraints in the industrial realm, specifically the government sector. We did so through a case study on the public sector’s use of DT for smart and sustainable tourism, which involved a big data collection mechanism and communication among multiple stakeholders. Through stakeholder engagement, feedback collection, and the case study, we refined our proposed DFAD, taking into account the insights and feedback we gathered. We also aimed to demonstrate the use of the DFAD to support the role of data governance in digital technology for smart and sustainable tourism and to address the research gap on knowledge management in software architecture.

This paper provides an overview of the conceptual decision documentation framework and DT technology for smart and sustainable tourism, and it promotes the use of the DFAD in supporting big data governance in the field. In Section “Background and related work”, we present the background and related work on the DFAD and DT technology. In Section “Methodology”, we describe our case study’s design, including the research questions, topic selection, data collection, and analysis. In Section “Results”, we present the study results, highlighting the effectiveness of the DFAD in supporting big data governance toward smartness and sustainability. Finally, in Section “Conclusion”, we conclude by discussing the implications for practitioners and researchers and offering suggestions for future research.

Background and related work

This section discusses the importance of big data governance and its framework in ensuring sustainability in the use of digital technology. To implement big data governance in digital technology systems, we propose the use of a DFAD with its benefits. Additionally, we present our previous research that proposed the potential use of DT technology for smart and sustainable tourism. Given that DT technology employs big data, AI, and IoT, incorporating big data governance is necessary to ensure its success. Therefore, the proposed DFAD framework can be used as a solution to address the challenges associated with the governance of big data in DT technology for sustainability.

Big data governance

Effective data management is a critical aspect of organizational governance (Hong et al., 2019). However, it requires more than just collecting and storing data, as unmanaged data can lead to low data quality and higher business transaction costs (Manjunath, Ravindra, & Ravikumar, 2010). Therefore, a comprehensive approach is necessary to guide organizations in establishing and improving their data governance practices (Cohn, 2015). Data governance encompasses a combination of individuals, methodologies, protocols, and technological tools that empower organizations to harness data as valuable digital assets (Khatri & Brown, 2010). It establishes a unified structure to oversee and preserve data quality, security, accessibility, relevance, and integrity. Additionally, it guarantees the responsible utilization of actual data to define business objectives, uphold operational procedures, and make crucial decisions. The primary goal of implementing data governance is to ensure the sustainable use of data in achieving an organization’s business objectives (Tallon, 2013).

Data governance is crucial in ensuring high-quality data and providing organizations with a consistent and reliable track record (Fu et al., 2011). This quality data can bring significant benefits, including faster and improved decision-making (Al Nuaimi et al., 2015). Additionally, data governance is concerned with determining the roles and responsibilities for decision-making about an organization’s data assets (Liaw et al., 2014). Data governance goes beyond data storage, cleaning, and integration; it encompasses a framework of policies, practices, and business rules for collecting, protecting, and utilizing data to support business objectives (Almeida & Calistru, 2013). This process is continuous and repeatable, aiming to ensure proper data management while minimizing potential risks (Mahanti, 2021). Big data governance is an emerging field that involves a set of processes, methods, technologies, and practices to efficiently and securely discover, collect, process, and store vast amounts of structured and unstructured data (Malik, 2013). The definition is further expanded by Feki and Boughzala (2016), who emphasize the comprehensive approach of big data governance, focusing on management and analysis across the entire data lifecycle while prioritizing security, data protection, and cost-effectiveness for sustainable data creation. Central to big data governance is decision-making rights and responsibilities, given the extensive data processing within and outside organizations. However, Yang et al. (2019) warn that using big data, coupled with business disruption, increases the risk of data breaches and requires effective management of data quality, security, and ethical data processing. Therefore, Zwitter (2015) advocated for the implementation of a governance system to mitigate potential risks associated with the use of big data.

Big data governance is important for organizations to achieve their objectives. Grover et al. (2018) and Al Nuaimi et al. (2015) emphasize the importance of big data governance in ensuring efficient and secure data management while maximizing return on investment. However, using big data can also pose security risks if proper security controls are not in place (Cuzzocrea, 2014). As the amount of collected data increases, so do the scale and cost of risks, necessitating adequate security controls for data at rest, in transit, and when it leaves the network (Cumbley & Church, 2013). Ensuring proper access control is essential to data security, which involves restricting data access, monitoring analysis activities, and limiting access (Sabelfeld & Myers, 2003). Additionally, big data governance must adhere to four types of regulations: data protection, data classification, compliance, security and privacy rules, and process (business) rules (Ghavami, 2020).

A big data governance framework encompasses a comprehensive set of policies, guidelines, and procedures that enable organizations to effectively manage and maintain large amounts of structured and unstructured data (Hong et al., 2019). The framework includes various components such as people, processes, tools, and technology to enable efficient and secure data discovery, collection, processing, analysis, and storage, as illustrated in Fig. 1 (Al-Badi et al., 2018). In addition, it establishes data quality, security, privacy, and ethical standards for data management and analysis, and it defines the roles and responsibilities of stakeholders throughout the data lifecycle. The framework also outlines decision-making rights and responsibilities related to data management, establishes security controls for data protection, and defines business rules for data management and analysis.

The implementation of a big data governance framework can provide numerous benefits to organizations. First, it improves data quality by maintaining accuracy, consistency, and completeness, leading to more informed decision-making and insights (Nisar et al., 2021). Second, it enhances data security by safeguarding against unauthorized access and adhering to relevant data protection regulations (Demchenko et al., 2014). Third, it increases efficiency by streamlining data management processes and reducing redundancies, freeing up resources for other business activities (Kim & Cho, 2018). Fourth, it aids risk management by identifying and mitigating potential risks associated with big data, such as data breaches or quality issues (Al-Badi et al., 2018). Fifth, it fosters collaboration and promotes communication between various departments and stakeholders (Kim & Cho, 2018). Overall, a big data governance framework empowers organizations to efficiently manage their valuable data assets, comply with relevant regulations, and mitigate potential risks, resulting in better utilization of data for business success.

Fig. 1 Big data governance framework

Ethics and digital technologies

Ethics has gained significant prominence in the realm of software engineering. It encompasses a collection of moral values and principles designed to direct ethical behavior (Walsh, 2015). Regarding software engineering, ethics entails a set of guidelines and regulations that enable software specialists to prioritize the well-being, justice, and safety of users and societies (Gotterbarn, 2002). The incorporation of ethical concerns significantly influences the development of software-intensive systems, given their widespread presence and integration into societies. These software systems exert significant influence over people’s lives, social interactions (including actions and decisions), equality (including opportunities and rights), and justice, making ethical aspects an indispensable component of system building (Alidoosti et al., 2022). Software systems have the potential to give rise to ethical concerns, impacting stakeholders by challenging their values and presenting ethical dilemmas. For instance, certain applications have led to privacy and data security violations by exposing users’ identities and locations. Moreover, in the context of big data utilization, four notable ethical issues arise: privacy, group privacy, propensity, and research ethics (Zwitter, 2014).

Dealing with these matters and incorporating ethical considerations into software systems is a complex task. It requires a comprehensive understanding of the ethical challenges that may harm individuals and societies, the affected parties, their concerns, the essential ethical values (which can be inherently ambiguous), and the methods to integrate them into software architecting. Given the extensive influence of software architecture design on a system, as well as people and society at large, it becomes a crucial realm for addressing significant ethical decisions, aligning to specified quality requirements.

Decision documentation framework

To govern such a system or technology, we have developed several documentation templates based on the Decentralized Network Governance (DNG) concept and the documentation framework for architecture decisions (ISO/IEC/IEEE Systems and software engineering - Architecture description, n.d.; Zwitter & Hazenberg, 2020). By establishing a clear decision-making process, the proposed templates aim to define decision-making roles, responsibilities, and procedures. This not only enables effective monitoring and evaluation but also identifies areas for improvement, ensuring the system remains aligned with organizational goals and objectives. Furthermore, it ensures compliance with relevant laws and regulations, providing a more transparent and accountable governance structure for the system or technology.

To this end, we propose the following procedures:

  1. Identification of stakeholders

  2. Identification of laws and regulations

  3. Documentation frameworks for architecture decisions (DFAD)

Step 1 - Identification of stakeholders

Successful software engineering projects depend on stakeholder involvement throughout the various stages of the project. Stakeholders should be aware of potential issues and feel that their ideas, opinions, and contributions are valued. Involving stakeholders in the decision-making process is critical, as it helps align project goals with the organization’s strategic objectives (McManus, 2004). It also enables better communication between stakeholders and development teams, reducing the likelihood of misunderstandings and increasing the likelihood of successful project outcomes. Effective stakeholder management is therefore essential for the success of software engineering projects. To identify the stakeholders and their roles and contributions to this project, we need to understand the conceptual model of the architecture framework (Fig. 2) and the project life cycle.

Fig. 2 Conceptual Model of Architecture Framework (ISO/IEC/IEEE, 2011)

Step 2 - Identification of laws and regulations

The identification of stakeholders is followed by the identification of the rules and regulations that apply to the software engineering project. The rules and regulations are classified into two categories, namely technological and non-technological. The level of each law should be understood in advance, as it determines the authority and scope of the law, starting from the constitution. This classification provides a framework for the decision-making process and helps ensure that the software engineering project complies with all relevant laws and regulations. Figure 3 shows the general hierarchy of law found in most countries and specifies how different levels of law are applied in practice. Zwitter and Hazenberg (2020) identified laws and regulations to govern digital technology, as shown in Table 1.
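The two ideas in this step, classifying regulations by category and ordering them by level of authority, can be illustrated with a minimal Python sketch. Only the top level (the constitution) and the two categories come from the text. The lower level names, the `Regulation` class, and the example titles are hypothetical placeholders and are not taken from Fig. 3 or Table 1.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Level 0 is the constitution, as stated in the text.
# The lower levels are illustrative placeholders (assumption).
LEVELS = ["constitution", "statute", "government regulation", "ministerial regulation"]


@dataclass(frozen=True)
class Regulation:
    title: str
    level: str           # must be one of LEVELS
    technological: bool  # True for the "technological" category


def by_authority(regs: List[Regulation]) -> List[Regulation]:
    """Order regulations from highest to lowest authority."""
    return sorted(regs, key=lambda r: LEVELS.index(r.level))


def split_by_category(regs: List[Regulation]) -> Tuple[List[Regulation], List[Regulation]]:
    """Separate technological from non-technological regulations."""
    tech = [r for r in regs if r.technological]
    non_tech = [r for r in regs if not r.technological]
    return tech, non_tech


# Hypothetical entries, for illustration only.
regs = [
    Regulation("Ministerial rule on data exchange", "ministerial regulation", True),
    Regulation("Constitutional right to privacy", "constitution", False),
    Regulation("Statistics act", "statute", False),
]
ordered = by_authority(regs)
tech, non_tech = split_by_category(regs)
```

Sorting by level first makes it easier to see which rule prevails when two regulations touch the same decision.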

Fig. 3 Hierarchy of law

Table 1 Laws and regulations for digital technology

Step 3 - Documentation framework for architecture decisions

Finally, we present a DFAD that comprises a collection of documentation templates to govern smart and sustainable tourism digital technology. Our objective is to provide a comprehensive view of the decision-making process that involves each stakeholder at different stages of the project, defines the governing system of each stakeholder based on their roles and responsibilities, ensures adherence to security requirements, and aligns with relevant regulations. The proposed framework aims to facilitate a structured decision-making process that supports the efficient and secure management of digital technology, while addressing the diverse needs of all stakeholders involved in the project.

To that end, we propose the following tools:

  (1) A documentation framework for architecture decisions, modified from the conventions of ISO/IEC/IEEE 42010, the international standard for the description of system and software architectures (ISO/IEC/IEEE Systems and software engineering - Architecture description, n.d.). This documentation framework consists of a decision details viewpoint (Table 2) and a relationship viewpoint (Fig. 4).

  (2) A standard operating procedure (SOP) (Fig. 5), modified from the Statistics Office’s SOP.

Fig. 4 Decision relationship viewpoint

Fig. 5 Standard operating procedure

Table 2 Decision details viewpoint

The first part of the documentation framework is a decision details viewpoint (Table 2). It gives detailed information about single architecture decisions, consisting of major information (e.g., decision outcome, options, and arguments) and minor but useful information (e.g., issue, state, related decisions, and actors). We modified the existing template by selecting the elements and adding a governing system as an important element to manage the interaction of each actor. We also added activities on digital technology and the Generic Statistical Business Process Model (GSBPM) to give a comprehensive overview of the decisions.
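As an illustration, the elements of the decision details viewpoint can be captured in a single record. The following Python sketch is not the template in Table 2; the field names mirror the elements listed above, while the class name, the default state label, the `missing_fields` helper, and all example values are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DecisionDetails:
    """One architecture decision, using the elements named in the text."""
    name: str
    issue: str                  # problem the decision addresses
    outcome: str                # the chosen option
    options: List[str]          # alternatives that were considered
    arguments: str              # rationale for the outcome
    state: str = "tentative"    # hypothetical state label
    related_decisions: List[str] = field(default_factory=list)
    actors: List[str] = field(default_factory=list)
    governing_system: str = ""  # element added by the framework
    activity: str = ""          # activity on digital technology
    gsbpm_phase: str = ""       # GSBPM phase linked to the decision

    def missing_fields(self) -> List[str]:
        """Return the names of key elements that are still empty."""
        required = {
            "issue": self.issue,
            "outcome": self.outcome,
            "arguments": self.arguments,
            "governing_system": self.governing_system,
        }
        missing = [name for name, value in required.items() if not value.strip()]
        if not self.options:
            missing.append("options")
        if not self.actors:
            missing.append("actors")
        return missing


# Hypothetical example record, for illustration only.
record = DecisionDetails(
    name="Select big data source",
    issue="Which data source should feed the digital twin?",
    outcome="MPD",
    options=["MPD", "User generated content"],
    arguments="",
    actors=["business analyst", "data scientist"],
    gsbpm_phase="Collect",
)
incomplete = record.missing_fields()  # ['arguments', 'governing_system']
```

A completeness check of this kind is one simple way to flag records that still lack a rationale or a governing system before they are circulated to stakeholders.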

The second part of the documentation framework is a decision relationship viewpoint (Fig. 4), which shows the current state of architecture decisions, and how they relate to other decisions. This template enables transparency and makes the relationships between architectural design decisions explicit and traceable. The final part of the documentation framework we use is the SOP (Fig. 5), adapted and modified from the SOP of BPS-Statistics Indonesia. This template allows the project to be more systematic and cohesive by showing detailed procedures and flows of each activity.
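The relationship viewpoint can be thought of as a directed graph in which decisions are nodes and relationships are labeled edges. The sketch below, under that assumption, shows how such a graph makes dependencies traceable. The decision identifiers, state labels, and relation names are illustrative and are not the vocabulary of Fig. 4 or of ISO/IEC/IEEE 42010.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Each edge is (source decision, relation, target decision).
Edge = Tuple[str, str, str]
Graph = Dict[str, List[Tuple[str, str]]]


def build_graph(edges: List[Edge]) -> Graph:
    """Group outgoing relations by source decision."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return dict(graph)


def trace(graph: Graph, start: str) -> List[str]:
    """Return every decision reachable from start, in discovery order."""
    seen: Set[str] = {start}
    order: List[str] = []
    stack = [start]
    while stack:
        node = stack.pop()
        for _relation, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                stack.append(target)
    return order


# Hypothetical decisions and their current states.
states = {"D1": "decided", "D2": "tentative", "D3": "decided"}
edges = [
    ("D1", "depends on", "D2"),
    ("D2", "depends on", "D3"),
    ("D1", "is related to", "D3"),
]
graph = build_graph(edges)
linked = trace(graph, "D1")
linked_states = [(d, states[d]) for d in linked]
```

Tracing from a decision lists every other decision that a change could affect, together with its current state, which is the kind of explicit traceability the viewpoint is meant to provide.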

Digital Twin technology for smart and sustainable tourism

Smart tourism is an emerging phenomenon that involves the integration of information technologies into the tourism industry, providing new opportunities for tourism businesses, destinations, and tourists (Katsoni & Segarra-Oña, 2019). This marks a substantial transition towards increased intelligence within the tourism industry as it converges the physical and governance aspects of tourism into the digital domain (Boes et al., 2016). The smart tourism ecosystem consists of three key elements: customers (tourists), businesses or enterprises, and destinations. For the customer, the emphasis is on delivering personalized, smart assistance through real-time and extensive insights into the tourist experience. For the business, it revolves around accessing shared data to foster collaboration and resource sharing among tourism enterprises (Xiang & Fesenmaier, 2017). At the destination level, the implementation of smart tourism aims to boost competitiveness and enhance the overall quality of life for all stakeholders, encompassing both residents and visitors (Boes et al., 2016). The integration of big data has evolved as a fundamental element of the information technology infrastructure in smart destinations. This inclusion empowers more advanced decision-making processes, which demand robust technologies and cutting-edge algorithms (Oussous et al., 2018). Moreover, the promotion of sustainability practices within the tourism sector has gained considerable importance (Rahmadian et al., 2022; Xu et al., 2020). Sustainable tourism refers to the notion of not only fulfilling economic objectives but also considering the environmental and social dimensions of a destination, with a focus on minimizing or rectifying the adverse impacts on the economy, society, and environment (Vázquez et al., 2019). Promoting effective collaboration for sustainable tourism development presents a dual challenge, comprising both optimism and critical awareness. The critical aspect emphasizes that the tourism industry holds responsibilities not only towards itself as an industry but also towards its stakeholders, customers, governance bodies, and the communities it impacts (Liburd & Edwards, 2018).

The concept of smart tourism shares a strong connection with the idea of a smart city, wherein the integration of intelligence in various aspects, such as mobility, living, people, governance, economy, and environment, plays a pivotal role (Giffinger & Haindl, 2009). A smart city is a city that utilizes advanced information technology, combined with various urban systems and disciplines, to enhance societal and economic outcomes (Bibri & Krogstie, 2017). As cities strive to become smarter, the concept of smart tourism has emerged as an integral part of this endeavor. Destinations seek to attract visitors by offering unique and innovative services that leverage technology, sustainability, and accessibility to enhance the overall tourist experience. This convergence of domains in smart tourism represents a paradigm shift in the approach to tourism development. By utilizing advanced technologies and data analytics, smart tourism has the potential to unlock new opportunities for tourism stakeholders, including governments, businesses, and tourists themselves. This presents an exciting avenue for further research and investigation into the potential implications and benefits of smart tourism in the broader context of smart city development (Katsoni & Segarra-Oña, 2019).

This paper proposes that the integration of DT technology could serve as a valuable tool in the development of smart and sustainable tourism initiatives within the context of smart cities. DT is an emerging field within the realm of artificial intelligence, machine learning, and the IoT. It can be defined as a digital representation of a physical entity. Through the utilization of big data and other supporting resources, stakeholders can create virtual models of regions by analyzing the flow of tourist activities and assessing their impact on the environment. These insights can be applied to various aspects of policy, such as infrastructure and facility provision, over-tourism mitigation, tourism risk reduction, and destination strategies, including marketing, branding, and competitiveness. Demunter (2017) suggests that the potential sources of big data for tourism include communication systems, websites, business process-generated data, sensors, and weather information. By leveraging DT technology, stakeholders can make informed decisions that benefit both the tourism industry and the broader community.

In light of these considerations, we present a conceptual framework for designing and implementing a DT in the context of smart and sustainable tourism, together with a DT architectural synthesis. We outline the processes and architectural elements required to implement a successful DT, which helps us understand the software and then ensure its sustainability. The conceptual framework of DT implementation for smart and sustainable tourism was inspired by Wan et al. (2019) and is depicted in Fig. 6. The framework comprises four key steps: identification of big data sources, data management, sensemaking, and decision-making. A robust information management framework guides the design of the DT’s data architecture. The first step involves identifying the big data sources necessary for the system. There are several potential big data sources for a DT for smart and sustainable tourism. For this study, we propose the use of MPD as the data source. Other potential big data sources are geolocation datasets, user generated content (UGC), Google mobility data, sensors, and other information such as temperature, humidity, and air quality. The second step is data management, which is crucial for efficient and secure data collection, storage, and utilization. The third step, sensemaking, determines where big data analysis will be deployed, including modeling and data mining. The final step is decision-making, where the DT is employed to enhance smart and sustainable tourism. An important aspect of this framework is the establishment of a feedback loop that includes a post-implementation evaluation to inform data-driven decision-making processes.
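The four steps and the feedback loop can be read as a simple pipeline. The following Python sketch is a toy illustration under strong assumptions: the visitor counts are made-up numbers, the "model" is a plain average, and the capacity threshold and decision labels are hypothetical. It is meant only to show how the output of each step feeds the next and how an evaluation can inform the following cycle.

```python
from statistics import mean
from typing import Dict, List


def identify_sources() -> Dict[str, List[float]]:
    # Step 1: hypothetical daily visitor counts per data source.
    return {"MPD": [120.0, 150.0, 180.0], "sensors": [110.0, 160.0, 170.0]}


def manage_data(raw: Dict[str, List[float]]) -> Dict[str, List[float]]:
    # Step 2: keep only sources without negative (invalid) counts.
    return {name: vals for name, vals in raw.items() if all(v >= 0 for v in vals)}


def make_sense(clean: Dict[str, List[float]]) -> float:
    # Step 3: a trivial stand-in for modeling: average the source means.
    return mean(mean(vals) for vals in clean.values())


def decide(estimate: float, capacity: float) -> str:
    # Step 4: toy decision rule against an assumed carrying capacity.
    return "limit access" if estimate > capacity else "no action"


def evaluate(decision: str, observed_after: float, capacity: float) -> float:
    # Feedback loop: if the measure did not bring visits under capacity,
    # use a stricter threshold in the next cycle.
    if decision == "limit access" and observed_after > capacity:
        return capacity - 10.0
    return capacity


def run_cycle(capacity: float) -> str:
    return decide(make_sense(manage_data(identify_sources())), capacity)


decision = run_cycle(capacity=140.0)
next_capacity = evaluate(decision, observed_after=145.0, capacity=140.0)
```

In a real DT, each function would be replaced by the corresponding governed process (for example, data management subject to the access and quality rules of the governance framework), but the direction of data flow stays the same.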

Overall, this framework serves as a guide for the successful implementation of a DT within the context of smart and sustainable tourism. In our case study, we use MPD as the big data source for the digital system at the level of big data for sustainable tourism, as shown in Fig. 6. In conjunction with the DT architectural synthesis, as depicted in Fig. 7, and the DFAD, we expect to achieve software sustainability. Within the realm of software, there are two viewpoints regarding software sustainability: sustainable software and software engineering for sustainability. The first viewpoint focuses on the principles, practices, and procedures that promote the longevity of software, often referred to as technical sustainability. The second viewpoint emphasizes the use of software systems to support one or more sustainability dimensions, with a focus on concerns beyond the software systems themselves (Penzenstadler, 2013). According to Lago et al. (2015), a discussion of the sustainability of a specific software system should address the four dimensions of sustainability: economic, social, environmental, and technical. Through the framework we proposed, we aim to accommodate these dimensions of software sustainability and align with the implementation of big data governance.

Fig. 6 Conceptual framework of Digital Twin on smart sustainable tourism

Fig. 7 Architectural synthesis method

Methodology

The purpose of a decision architect is to provide a decision documentation tool that meets the industry’s expectations and can be easily adopted by architects in their daily work. To determine the level of interest among software architects in using this tool, three factors are considered: usefulness, ease-of-use, and contextual factors. According to Davis (1989), both usefulness and ease-of-use are essential in predicting an individual’s willingness to use new technology. Nonetheless, contextual factors, including an individual’s background, social influence, and facilitating conditions, significantly influence the adoption of new tools (Venkatesh & Maruping, 2008). To better understand the benefits of the DFAD, the study formulates the following research questions: (1) "How can big data governance be applied to DT technology for smart and sustainable tourism?" and (2) "How do stakeholders perceive the usefulness of decision architects, and how can it be improved to provide a better user experience?". The study aims to validate and improve a previously proposed decision framework, known as DFAD, in the context of decision architects and software architecture. By addressing these research questions, the study can provide insights into the potential benefits and challenges of using decision architects in software architecture and improve the user experience of the tool.

To better understand the practical application of the developed decision documentation tool, we conducted a qualitative research case study. This approach allowed us to evaluate the tool in a realistic context, providing valuable insights into the intent of software architects to use it in industrial software projects. As Yin (2009) highlighted, the case study method is particularly useful in situations where there is limited control over variables.

The steps of the case study are as follows. First, as scenarios, we chose three decisions strongly related to the goal of architecting a digital twin for smart and sustainable tourism. These initial steps in architecting the system are crucial for organizations, but they often go unnoticed because more emphasis is placed on software creation. Second, we created the documentation framework on architecture decisions for these three decisions based on our initial project. Third, we conducted semi-structured interviews. A semi-structured interview is a verbal interchange in which the interviewer attempts to get information from another person through a list of predetermined questions (Clifford & Doody, 2018). Conducting these interviews allows for a thorough examination of respondents’ perceptions, events, and experiences. Furthermore, their opinions and statements can be cross-referenced and elucidated, leading to a deeper and more comprehensive understanding (Kvale, 2007). We created an interview topic guide and used a narrative approach that allowed for open-ended discussions of unanticipated themes. During the interviews, we presented the DFAD and its application to the three selected scenarios to several stakeholders. The interview participants were a business analyst, an analyst, technical experts, and a data scientist. Based on their input, we refined our framework to improve its quality for future use. Additionally, we attended an online MPD meeting among the stakeholders to gain further insights into how decisions are made and documented in practice. Overall, this case study provides valuable insights into the practical application of the decision documentation tool we developed.

The interviews took place at the Statistics Indonesia headquarters office in Jakarta from June to July 2022. We targeted the stakeholders related to the decision documentation (n=4). All the interviews were conducted by one research team member with training and experience in interviewing techniques. Each interview covered all topics of the guideline and lasted between 30 and 40 min, depending on when saturation was reached. The interviews were audio-recorded with the consent of the respondents. All recordings were transcribed and coded by the first author and checked by the third author. The first and third authors also determined the three architectural decisions applied for the study. The second author assessed the research protocol and the findings. The complete study design protocol received ethical approval from the faculty.

Results

The successful implementation of smart and sustainable tourism through the use of innovative technologies, particularly by governmental agencies, necessitates careful planning and management, especially when dealing with sensitive big data. It is also important to address the technical dimension of sustainability to ensure the durability of the system. Despite the potential benefits of leveraging big data to enhance decision-making processes, big data governance remains a relatively underexplored area among decision-makers. The adoption of new technologies introduces a range of risks, including concerns regarding data security, communication, data quality, and regulatory compliance. Addressing these risks requires a comprehensive framework that accounts for the full spectrum of data governance considerations. Effective governance of big data is essential to ensuring that the potential benefits of innovative technologies are realized while minimizing the potential risks.

In this section, we present the results of our study, which aims to fill this gap. To answer the first research question, we apply the DFAD to DT technology for smart and sustainable tourism and incorporate the DFAD into the big data governance framework that we developed in our previous study, as shown in Fig. 8. We anticipate that our proposed framework will prove valuable to statistics offices, facilitating the organization, management, and governance of big data as a source for official statistical products. The framework’s ability to support effective decision-making, transparency, accountability, and compliance with legal and ethical standards could enhance the quality and reliability of official statistical products, thus contributing to the overall goals of smart and sustainable tourism.

This framework also emphasizes the importance of selecting and partnering with stakeholders to ensure the project’s success. The DFAD could play a supportive role in mapping the stakeholders directly affected or involved in the project. For example, in this case study, we can identify these stakeholders based on their institutions and roles as outlined in the standard operating procedure. Additionally, stakeholder identification is also specified in the decision details viewpoint. Beyond mapping the stakeholders directly affected, the big data governance framework extends to mapping indirectly impacted stakeholders, including international organizations, the general public, researchers, media, and other governmental bodies.

DFAD to benefit big data governance

Fig. 8 Big data governance framework for official statistics

In addition to validating the DFAD, this study enabled us to apply and validate our proposed big data governance framework as a potentially valuable tool for modernizing and enhancing decision-making in statistical processes related to the use of big data, AI, ML, and IoT. To achieve this objective, we followed these steps:

First step, we proposed the use of MPD as a potential data source for DT technology, which could serve as a pilot project for applying DFAD.

Based on our analysis, implementing DFAD in this case study has several advantages in supporting the big data governance framework. Namely, it can:

  1. address almost all the dimensions of the big data governance framework,

  2. support transparency and accountability for each architecture decision,

  3. encourage the organization to follow the GSBPM process that applies to new innovations in official statistics,

  4. mitigate potential risks by documenting the decisions and their alternatives from the very first step,

  5. enable quality control at each step,

  6. ensure regulatory compliance of the digital system and the use of big data, especially with regard to sensitive data,

  7. assist in mapping the stakeholders, ensure they are heard, help divide tasks within the organization, and define proper communication tools or mechanisms, and

  8. ensure that the components of IT/big data infrastructure, human resources, and institutions for the digital system are addressed.

The relationship or benefits of DFAD to the big data governance framework are presented in Fig. 9.

Fig. 9 Benefits of decision framework on architecture decisions to the big data governance framework for official statistics

DFAD on DT technology for smart and sustainable tourism

Second step, to identify the stakeholders and their roles and contributions to this project, we need to understand the conceptual model of the DT architecture framework (Fig. 6) and the project life cycle based on GSBPM (Fig. 10). We then incorporate our information about the stakeholders, their roles, and how they communicate and coordinate through the decision details viewpoint and standard operating procedure.

Fig. 10 Generic Statistical Business Process Model (UN Fundamental Principles, 2014)

Third step, we identified the laws and regulations applicable to this project, as shown in Table 3.

Table 3 Laws and regulations for Digital Twin technology

Fourth step, we selected three key issues regarding this topic: decision 1, decision 2, and decision 3.

In Decision 1, we explored the feasibility of using MPD as a potential data source for digital twin (DT) software systems. We argue that MPD could be effectively combined with other technologies and sources such as ML, AI, and IoT to support DT applications. In addition to MPD, we also considered other potential data sources, such as Google Mobility or User Generated Content (UGC).

Decision 2 focused on improving the existing data script. This step was crucial to rectify any errors in the script and improve data accuracy, quality, and reliability, given that the current script had been in use for several years. As an alternative, we also considered the possibility of continuing to use the previous script due to the obligation to release official tourism statistics on a monthly basis.

In Decision 3, we explored two options for testing the data script. The primary option was to test it in the mobile network operator (MNO) environment, given that MPD can only be processed in this environment. Alternatively, we considered testing the script using alternative data sources, such as web scraping (Google Mobility), in the data lake of the Statistics Office (BPS). This decision was critical to ensure that the data script was functioning correctly and delivering accurate results.

Fifth step, we developed architecture decisions for the three selected key issues, consisting of:

  1. Decision detail viewpoints: Tables 4, 5, 6,

  2. Relationship viewpoints: Fig. 11, and

  3. Standard operating procedures for these three decisions: Figs. 12, 13, 14,

intending to apply the concept of big data governance, address the sustainability dimensions, and achieve the objectives of the software system or digital technology.

Table 4 Decision details viewpoint: Scenario 1. The use of MPD as potential data source c
Table 5 Decision details viewpoint: Scenario 2. Improve the MPD script
Table 6 Decision details viewpoint: Scenario 3. Test the script
Fig. 11 Relationship viewpoint

Fig. 12 SOP 1

Fig. 13 SOP 2

Fig. 14 SOP 3

In the decision details viewpoint (Tables 4, 5, 6), we provide detailed information about single architecture decisions. This viewpoint addresses transparency, legal compliance, communication, ethical concerns, and risk identification through its comprehensive approach to analyzing each architecture decision. For instance, each decision record addresses technical aspects, privacy, cyber-security, administrative procedure, and regulation issues.
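As a minimal sketch of how one entry of the decision details viewpoint could be held in machine-readable form, the Python fragment below models a single decision and checks which concerns it does not yet address. The class name, field names, and concern labels are illustrative assumptions, not the actual column headings of Tables 4, 5, 6; the example values paraphrase Decision 3 as described in the text, and the notes are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DecisionDetail:
    """One decision details entry (hypothetical field names)."""

    decision_id: str
    title: str
    rationale: str
    alternatives: list[str] = field(default_factory=list)
    issues: dict[str, str] = field(default_factory=dict)  # concern -> note
    stakeholders: list[str] = field(default_factory=list)

    def missing_concerns(self, required: set[str]) -> set[str]:
        """Return the required concerns that this record does not address."""
        return required - set(self.issues)


# Concern labels taken from the prose; the exact labels are an assumption.
REQUIRED_CONCERNS = {
    "technical",
    "privacy",
    "cyber-security",
    "administrative procedure",
    "regulation",
}

decision_3 = DecisionDetail(
    decision_id="3.0",
    title="Test the script in the MNO environment",
    rationale="MPD can only be processed in the MNO environment",
    alternatives=["Test with alternative data sources in the BPS data lake"],
    issues={
        # Placeholder notes; the real entries live in the paper's tables.
        "privacy": "considered when choosing the test environment",
        "cyber-security": "considered when choosing the test environment",
        "confidentiality": "considered when choosing the test environment",
    },
    stakeholders=["MNO (data provider)"],  # illustrative, not exhaustive
)

# Concerns still to be documented before the record is considered complete.
print(sorted(decision_3.missing_concerns(REQUIRED_CONCERNS)))
```

A completeness check of this kind is one possible way to support the quality control and transparency goals that the viewpoint is meant to serve; it is not part of the framework as published.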

In the relationship viewpoint (Fig. 11), we show how the decisions relate to one another, which further promotes transparency by elucidating the relationships among architectural decisions and making them traceable. For instance, decision 2.0 (improve MPD script) is related to decision 1.0 (use of MPD as data source) and decision 3.0 (test the script in the sandbox of MNO). There is also an alternative for decision 2.0, which is decision 2.1 (use the former MPD script).
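The relationships named in this paragraph can be sketched as a small labeled graph, which shows one way traceability between decisions might be queried. The relation labels (`related_to`, `alternative_for`) and the function names are assumptions made for illustration; Fig. 11 may use different relation types.

```python
from collections import defaultdict

# Decision identifiers and titles as given in the text.
TITLES = {
    "1.0": "Use of MPD as data source",
    "2.0": "Improve MPD script",
    "2.1": "Use the former MPD script",
    "3.0": "Test the script in the sandbox of MNO",
}

# (source, relation, target); the relation labels are assumed.
RELATIONS = [
    ("2.0", "related_to", "1.0"),
    ("2.0", "related_to", "3.0"),
    ("2.1", "alternative_for", "2.0"),
]


def build_index(relations):
    """Index every link from both endpoints so it can be traced either way."""
    index = defaultdict(list)
    for source, relation, target in relations:
        index[source].append((relation, target))
        index[target].append((f"inverse:{relation}", source))
    return index


def trace(decision_id, index):
    """Return the ids of decisions directly linked to decision_id, sorted."""
    return sorted({other for _, other in index.get(decision_id, [])})


INDEX = build_index(RELATIONS)
for other in trace("2.0", INDEX):
    print(f"2.0 is linked to {other}: {TITLES[other]}")
```

Indexing each link from both endpoints means a reader can start from any decision, including the alternative 2.1, and find the decisions it touches.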

The final part of the documentation framework is the SOP (Figs. 12, 13, 14), which makes the project more systematic and cohesive by showing the detailed procedures, the actors and stakeholders involved, and the tasks and flow of each activity. For instance, in SOP 1 (Fig. 12), there are eight involved actors, each assigned specific tasks. The process commences with the subject matter experts (analysts), who are tourism experts from Statistics Indonesia, responsible for identifying data needs and defining variables concerning the use of MPD. Subsequently, they engage in discussions with business analysts and supervisors. Moving forward, data engineers are consulted to analyze aspects related to cybersecurity, data access, and privacy. The legal team then reviews this analysis before the supervisor initiates the project plan. Once these crucial steps are secured, the process proceeds with communication, negotiation, and further deliberations involving relevant stakeholders, such as the Ministry of Tourism (data user) and the MNO (data provider).
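The flow of SOP 1 described above can be sketched as an ordered list of steps, each with its actors and task. This is only an illustration under stated assumptions: it encodes the steps and actors named in the prose (which mentions fewer than the eight actors shown in Fig. 12), the task wording is paraphrased, and the type and function names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Step(NamedTuple):
    actors: Tuple[str, ...]
    task: str


# Steps paraphrased from the prose description of SOP 1 (not from Fig. 12).
SOP_1 = [
    Step(("subject matter experts",),
         "identify data needs and define variables for the use of MPD"),
    Step(("subject matter experts", "business analysts", "supervisor"),
         "discuss the identified needs"),
    Step(("data engineers",),
         "analyze cybersecurity, data access, and privacy aspects"),
    Step(("legal team",), "review the analysis"),
    Step(("supervisor",), "initiate the project plan"),
    Step(("Ministry of Tourism (data user)", "MNO (data provider)"),
         "communicate, negotiate, and deliberate"),
]


def actors_in_order(sop: List[Step]) -> List[str]:
    """Return each distinct actor once, in order of first appearance."""
    seen: List[str] = []
    for step in sop:
        for actor in step.actors:
            if actor not in seen:
                seen.append(actor)
    return seen


def tasks_for(actor: str, sop: List[Step]) -> List[Tuple[int, str]]:
    """Return (1-based step number, task) pairs that involve the actor."""
    return [(i, s.task) for i, s in enumerate(sop, start=1) if actor in s.actors]


for number, task in tasks_for("supervisor", SOP_1):
    print(f"step {number}: {task}")
```

Keeping the actors attached to each step makes it straightforward to answer the two questions an SOP is meant to settle: who is involved, and at which point in the flow.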

DFAD to benefit stakeholders to comply with data governance and ethics

Incorporating ethical considerations into software architecture design demands that architects comprehend and account for ethical aspects. These may entail: (i) developing ethical awareness among architects; (ii) recognizing relevant stakeholders (e.g., those whom the system might negatively impact) and categorizing a spectrum of ethical concerns; (iii) identifying stakeholders’ ethical values and understanding the interconnections among them; (iv) identifying ethical challenges that architects encounter during the design process and comprehending their origins and characteristics; and (v) quantifying and validating ethical values (Alidoosti et al., 2022). The use of DFAD will benefit and assist stakeholders such as software architects and engineers, data scientists, data analysts, supervisors or project managers, business analysts, legal staff or departments, quality management assessors, and the auditor board. The implementation of DFAD fosters transparency in every architectural decision. In practice, by leveraging the decision details viewpoint, all stakeholders gain access to the rationale behind decisions, the available alternatives, the identified problems and detailed issues, both technical and non-technical (including privacy, security, and ethics), and a clear understanding of the responsible or involved stakeholders. For instance, in Table 6, the decision was made by assessing the importance of addressing data privacy, security, and confidentiality in order to choose the best environment for testing the MPD script.

Additionally, the decision flow and related activities can be evaluated using the relationship viewpoint and standard operating procedure. These tools collectively enhance transparency, accountability, and trustworthiness in architectural decisions within a public sector organization like Statistics Indonesia. This not only benefits project stakeholders but also extends to external auditors tasked with assessing institutional project accountability, as well as the general public.

Stakeholders’ perspectives and reflection on the role of data governance for digital technology

To address the second research question, we presented the developed frameworks to the participants and solicited their feedback and opinions. The respondents unanimously recognized the usefulness of these frameworks in facilitating decision tracking and supporting big data governance from the early stages. Currently, Statistics Indonesia has no specific documentation system for architectural decisions or other decisions related to big data analysis. This makes it difficult to track the flow of decisions, the rationale behind them, and the relationships among them. The absence of such a documentation system has also caused challenges during the annual auditing process conducted by the national auditor board. The auditing process evaluates not only the financial procedure and correctness of each statistical activity, but also the quality, the compliance with laws and regulations, and the rationale behind statistical decisions. Therefore, DFAD helps improve the level of documentation, which is very important internally for producing official statistics products, as well as externally for providing transparency and accountability. DFAD allows the stakeholders to provide a complete picture of each decision, which helps to overcome the challenges above.

However, they also provided valuable input that the decision details viewpoint needed to be more concise and less complicated, prompting us to make improvements by eliminating unnecessary information. Because this framework was new to the stakeholders, explaining the potential use of DFAD to them in detail took considerable effort.

Nevertheless, it is clear that DFAD provides essential structure, consistency, and quality to the process of making critical architectural choices. By reducing risks, enabling informed decision-making, and promoting alignment with business objectives and regulations, it plays a pivotal role in the success of software and system development projects. First, the framework supports efficiency by offering a clear structure and step-by-step process for decision-making, streamlining the entire process. Second, transparency is ensured as the framework provides all relevant information to stakeholders, such as the reasons and arguments behind each decision, the involved actors, and the flow or relations among decisions, aligning each aspect with the organization’s goals and strategic objectives. Third, the framework helps in risk reduction by identifying potential issues, risks, and benefits in advance, empowering decision-makers to anticipate and plan accordingly. Moreover, the framework facilitates conflict resolution by offering a systematic way to address competing concerns or alternatives, leading to more informed and effective decisions. Fourth, it emphasizes documentation and communication, ensuring comprehensive records of decisions and the rationale behind them. This documentation proves valuable for quality management assessment, auditing, and maintaining effective communication with stakeholders. Lastly, the framework supports overall quality and continuous improvement. Decision-makers can learn from past decisions and strategize for future developments, fostering growth and progress.

In addition to highlighting the benefits, it is essential to acknowledge the weaknesses of the DFAD from various stakeholders’ perspectives. First, the complexity of some decision frameworks can discourage the adoption of DFAD, particularly for teams with limited resources or knowledge. This may impede its practicality and applicability in certain situations. Second, architecture decisions can still be influenced by subjective factors, which may potentially compromise the objectivity of the decision-making process. Third, the introduction of a new framework may encounter resistance from stakeholders, requiring significant time and effort to convince them of its value and achieve widespread adoption. Lastly, it is crucial to involve legal experts and business analysts, considering that framework implementation requires in-depth knowledge and expertise, particularly in addressing legal compliance issues.

Despite the identified weaknesses, the stakeholders considered that the advantages of DFAD outweigh them. However, it is crucial to take these limitations into account and tailor the framework to suit the specific needs and context of the project or organization. Furthermore, complementing the decision framework with expert judgment and experience can effectively mitigate some of these weaknesses. A fundamental aspect of successful data governance policies is ensuring they are concise, easy to understand, and in alignment with the organization’s overall objectives, including the incorporation of the DFAD. To facilitate the implementation of such policies, designating data stewards or data custodians as points of contact for stakeholders is vital. These data stewards can assist stakeholders with data-related inquiries, ensure compliance, and streamline data governance processes for continuous operations. In addition, implementing a feedback mechanism and maintaining proactive communication with stakeholders are recommended.

Building upon the insights gained from the interviews, we improved the quality of each viewpoint compared to our previous work. This included refining the decisions, the quality of arguments, the coherence among decisions, and the task details, which also aimed to address our third research question. We further refined the frameworks by shortening certain items to enhance their ease of understanding and applicability. This feedback was crucial in improving the usability of the frameworks and ensuring that they meet the needs and expectations of the industry. The iterative refinement process allowed us to incorporate valuable input from the participants and make necessary adjustments, resulting in more effective and practical frameworks.

Conclusion

This paper proposes the use of the documentation framework for architecture decisions (DFAD) as an integral part of big data governance in the context of digital technology for achieving smartness and sustainability. DFAD provides organizations with a comprehensive governance framework for big data, which ensures successful implementation of digital systems. DFAD supports critical aspects of big data governance, including transparency, security, accountability, communication and coordination, quality control, and ethical and legal compliance. Future research could extend this study’s findings by incorporating DFAD into all architecture decisions to optimize its potential benefits. However, the study recognizes the limitations caused by the COVID-19 pandemic, which restricted the sample size. Therefore, caution is necessary when interpreting the study’s results, and further research should explore DFAD’s potential as a tool for enhancing big data governance in digital technology.

Additionally, the study findings demonstrate that practitioners perceive DFAD as a valuable tool for decision-making. DFAD facilitates the tracking of decisions from the early stages of the decision-making process, enables the identification of stakeholders for each step, and helps mitigate potential risks. Documentation is very important for official statistics products, and DFAD supports that purpose. It also supports communication and coordination among stakeholders and assists in identifying laws and regulations that are relevant to the technology. However, feedback suggests that DFAD could benefit from enhanced usability and reduced complexity. A clear understanding of the DFAD is critical for its adoption, and stakeholders may be hesitant to use it without sufficient clarity. We acknowledge that creating DFAD may require considerable effort and time, and stakeholders may not accept it if it proves too burdensome. Hence, we propose an active role for data stewards and business analysts in supporting the applicability of DFAD and data governance in the organization. Nonetheless, the study results indicate that DFAD holds the potential to enhance effective decision-making in the context of big data governance. The use of DFAD assists stakeholders in understanding architectural decisions in projects, and these assessments provide valuable insights for refining the framework in the future.

In conclusion, this study provides a valuable contribution to the understanding of DFAD’s effectiveness as a governance tool for digital systems. The research findings offer insights that can inform the development of future tools by providing a better understanding of the constraints and requirements of software architects regarding decision documentation. Finally, this study supports the advancement of effective decision-making processes in digital systems and contributes to the sustainable development of modern technologies. The implications of this study may guide future research, policymakers, and practitioners in the design and implementation of governance frameworks for digital systems, with the aim of achieving smartness and sustainability through a multidisciplinary approach.