1 Objectives

This analysis critically examines the challenges of managing Scope 3 emissions in the automotive sector and advocates for the integration of advanced technologies for effective management. It proposes the creation of a tailored Software as a Service (SaaS) platform to enhance the accuracy and accessibility of emission data. Additionally, it outlines strategic approaches for reducing the sector’s carbon footprint, and provides forward-looking insights to guide future research and practices in both industry and academia.

2 Introduction

2.1 The imperative of GHG inventory according to the GHG Protocol

The global commitment to sustainability has intensified in the wake of pressing climate concerns, placing a premium on holistic strategies to curb greenhouse gas (GHG) emissions (Huang and Zhai 2021; Qin et al. 2021). Central to these strategies is the systematic accounting of GHG emissions facilitated by the Greenhouse Gas Protocol (GHG Protocol) (WRI and WBCSD 2011). The Protocol classifies emissions into three scopes: Scope 1 (direct emissions from assets owned or operated by the company), Scope 2 (indirect emissions from the generation of purchased energy, e.g., electricity, heat, and steam), and Scope 3 (indirect emissions from all other value chain activities). This classification gives organizations a scaffold to quantify, regulate, and communicate their carbon footprint. Developed in accordance with the Protocol, the GHG inventory is more than a regulatory formality; it serves as a pivotal managerial instrument aligned with international sustainability objectives. It empowers entities to base their decisions on reliable data, establish reduction benchmarks, and participate in carbon trading or offsetting initiatives (Kreibich and Hermwille 2021; Jia and Lin 2020; Qian et al. 2024).

2.2 Management challenges and research gaps of Scope 3 emissions

While existing guidelines and practices for Scope 1 and Scope 2 emissions have reached maturity, Scope 3 emissions, which often constitute the largest portion of an organization’s GHG footprint (EPA 2021; Klaaßen and Stoll 2021; Hertwich and Wood 2018), still lack similar precision and clarity in their determination. The most widely accepted standard for Scope 3, the GHG Protocol Corporate Value Chain Accounting and Reporting Standard (WRI and WBCSD 2011), delineates fifteen Scope 3 categories, encompassing both upstream and downstream activities. The implementation of this standard, however, is constrained by the difficulty of obtaining emission data from suppliers and customers (Hettler and Graf-Vlachy 2023; Dahlmann and Rohrich 2019). Overlooking or underestimating Scope 3 emissions may result in a serious misinterpretation of an organization’s overall GHG emissions, thereby impeding the adoption of appropriate GHG mitigation strategies and the cultivation of low-carbon supply chains. To date, research in this domain has largely been confined to the pitfalls of depending on industry average data for computations, the hurdles of avoiding double counting among reporting entities, and the necessity of procuring primary data (Stenzel and Waichman 2023; Puschmann and Quattrocchi 2023; Gopalakrishnan 2022; Busch et al. 2022; Li et al. 2019; Patchell 2018; Blanco et al. 2016; Downie and Stubbs 2012; Matthews et al. 2008). There is a clear absence of implementable strategies and effective tools to navigate the intricate challenges of Scope 3 emission quantification.

2.3 New methodologies and technological applications in typical industry

The automotive industry, with its extensive and multifaceted supply chains, stands at the forefront of the Scope 3 emissions challenge (McKinsey 2023; Martin et al. 2023; Deloitte 2023; PwC 2022; Carbon Trust 2017). Present GHG reporting practices are typically defined by operational control or financial interest and therefore fail to capture all emissions associated with extended supply chains and vehicle disposal. Sustainable progress within the auto industry requires novel methodologies that can not only quantify GHG emissions comprehensively and precisely but also mitigate them systematically and strategically. This study aims to bridge this research gap by proposing a framework that integrates contemporary technology for all-encompassing management of Scope 3 emissions in the automotive sector.

3 Technological innovations to address Scope 3 challenges

To effectively manage and reduce Scope 3 emissions in the automotive manufacturing sector, various technological innovations could be adopted. The application of cloud computing, text analytics, and machine learning will notably upgrade the accuracy and effectiveness of emission data management.

3.1 Cloud computing for data management and emission quantification

Based on the preceding discussion, the estimation of Scope 3 emissions necessitates reliance on industry averages or financial information, due to the challenges associated with data synchronization within the value chain. This reliance may compromise the accuracy of corporate Scope 3 GHG inventories. Furthermore, such data imprecision also impacts the accuracy of subsequent machine learning model predictions.

Cloud computing technology has proven effective for data storage, processing, and analysis (Chen and Zhang 2014). This research envisages harnessing the robust infrastructure of cloud computing to develop a Software as a Service (SaaS) platform tailored for stakeholders in the automotive sector. Its cloud-based architecture, real-time updates, and cross-platform accessibility will enable the seamless gathering of relevant primary data from diverse participants in the value chain.

3.2 Text analysis for emissions reduction insights

Text analysis technologies have progressively been employed in environmental studies (Uthirapathy and Sandanam 2023; Shen and Wang 2023; Effrosynidis et al. 2022; Gong et al. 2022; Baier et al. 2020). These technologies are adept at dissecting dense text data (Albrecht et al. 2020), offering a nuanced understanding of environmental patterns and trends. The research advocates the deployment of advanced textual analytics, particularly topic modeling, to distill knowledge from both industrial operations and academic theory. The aim is to develop tactical blueprints for the eco-friendly transformation of the automotive industry supply chain, with abatement of GHG emissions as the key performance indicator.

3.3 Machine learning for emissions prediction and optimal mitigation strategies

Machine learning has been garnering considerable attention for its potent data processing capabilities. Recent research has explored the use of these technologies in predicting Scope 3 carbon emissions, revealing both potential and challenges in this field. Serafeim and Velez Caicedo (2022) utilized Scope 1 and 2 emissions, and market capitalization as features, and compared Random Forest (RF), K-Nearest Neighbors, and Adaptive Boosting algorithms; while Nguyen et al. (2023) employed Elastic Net Regression, Extreme Gradient Boosting, RF, Linear Boost, and Linear Forest, and used features like financial metrics including net income, total assets, and employee numbers. Although machine learning excels in pattern recognition and trend analysis, its performance is substantially contingent upon the type and quality of the data used. In current research, due to the scarcity of data associated with certain categories of Scope 3, even the most sophisticated machine learning models have shown suboptimal predictive performance (Nguyen et al. 2023).

This study addresses this predicament by using precise carbon emission data, gathered and computed via the SaaS application, as the label in the machine learning dataset. Applying a tiered approach, it will compare traditional machine learning algorithms, ensemble learning algorithms, and neural networks to predict carbon emissions and to forecast the efficacy of specific emission reduction strategies identified through topic modeling, thereby assisting enterprises in developing effective carbon reduction projects.

4 Quantification of organizational GHG emissions

In the context of the traditional automobile manufacturing industry, this segment of the research quantifies Scope 3 GHG emissions using a cloud computing tool.

The study adheres to established guidelines, including ISO 14064 Parts 1, 2, and 3 (ISO 2018) and the GHG Protocol Standards (WRI and WBCSD 2011), accounting for emissions from each Scope 3 category. To set the inventory boundary, we identified all organizational assets and operations of the cooperative partner, and used value stream mapping to define all upstream and downstream activities with inherent GHG emission implications.

To grasp the relative levels of the various Scope 3 activities quickly, recognize emission hotspots, and prepare for the subsequent collection of primary data and development of emission reduction measures, we initially adopted a “progressive narrowing” approach. This approach identifies the main emission sources within the fifteen categories of Scope 3, as depicted in Fig. 1.

Fig. 1 Four steps of “progressive narrowing” method

The structured framework spotlights the predominant emission contributors, guiding the development of a customized SaaS platform that assists in collecting complete and accurate activity-related data. We calculated GHG emissions using the following formulation from WRI and WBCSD (2011):

$$GHG\,Emissions=Activity\,Data\times\,Emission\,Factor$$
(1)

where the emission factor (EF) is expressed as the amount of a particular gas produced per unit of the emissions-producing activity or material input (EPA 2023). This study selected EFs specifically developed for the automobile industry, sourced from authoritative databases like the International Energy Agency’s Emission Factors Database or relevant industry-specific documents. To ensure the applicability and accuracy of these EFs across different regions and stages, the selection criteria, namely scientific validity, geographical relevance, and industry-specific applicability, were carefully assessed. Selection involved a comprehensive review of the methodology behind EF derivation, the representativeness of the underlying data, and compatibility with Scope 3 emission categories. Additionally, EFs were adapted as needed to reflect differences arising from regional practices and regulatory standards.
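As a minimal sketch of Eq. 1, the following Python snippet multiplies activity data by an emission factor for a handful of records and ranks the resulting category totals. The record structure, activity values, and emission factors are hypothetical placeholders chosen for illustration; they are not values used in this study.

```python
from dataclasses import dataclass


@dataclass
class ActivityRecord:
    category: str           # Scope 3 category label
    activity_data: float    # quantity of activity, in the unit the EF expects
    emission_factor: float  # kg CO2e per unit of activity (hypothetical value)


def ghg_emissions(record: ActivityRecord) -> float:
    """Eq. 1: GHG emissions = activity data x emission factor."""
    return record.activity_data * record.emission_factor


def emissions_by_category(records):
    """Sum emissions per category and sort from largest to smallest."""
    totals = {}
    for record in records:
        totals[record.category] = totals.get(record.category, 0.0) + ghg_emissions(record)
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Hypothetical records; all numbers are illustrative only.
records = [
    ActivityRecord("Use of sold products", 12000.0, 2.5),
    ActivityRecord("Purchased goods and services", 800.0, 1.5),
    ActivityRecord("Upstream transportation and distribution", 5000.0, 0.25),
    ActivityRecord("Purchased goods and services", 200.0, 1.5),
]
totals = emissions_by_category(records)
for category, value in totals.items():
    print(f"{category}: {value:.1f} kg CO2e")
```

Sorting the totals puts the largest contributors first, which mirrors the hotspot identification that precedes primary data collection.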

To gauge the precision of emission calculations using these EFs, a dual approach was deployed: statistical analysis to estimate the range of possible errors, and sensitivity analysis to understand the impact of these errors on overall emission estimates. To verify the accuracy of our emission calculations, we conducted comparative analyses against available real-world emission data and obtained third-party validation.
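One simple way to carry out such a sensitivity analysis is a one-at-a-time perturbation, sketched below: each EF is shifted down and up by its relative uncertainty while the others stay fixed, and the change in total emissions is recorded. The source does not specify which sensitivity method was used, so this is an assumed illustration; the activity names, EFs, and uncertainty ranges are hypothetical.

```python
def total_emissions(activity, factors):
    """Total emissions as the sum of activity data x emission factor."""
    return sum(activity[name] * factors[name] for name in activity)


def one_at_a_time_sensitivity(activity, factors, rel_uncertainty):
    """Perturb each EF by +/- its relative uncertainty, holding the others fixed.

    Returns the baseline total and, per EF, the (low, high) change in the total.
    """
    base = total_emissions(activity, factors)
    changes = {}
    for name in factors:
        low = dict(factors)
        high = dict(factors)
        low[name] = factors[name] * (1 - rel_uncertainty[name])
        high[name] = factors[name] * (1 + rel_uncertainty[name])
        changes[name] = (
            total_emissions(activity, low) - base,
            total_emissions(activity, high) - base,
        )
    return base, changes


# Hypothetical inputs; all numbers are illustrative only.
activity = {"fuel_l": 1000.0, "steel_kg": 400.0}
factors = {"fuel_l": 2.0, "steel_kg": 1.5}           # kg CO2e per unit
rel_uncertainty = {"fuel_l": 0.25, "steel_kg": 0.5}  # +/- 25 % and +/- 50 %

base, changes = one_at_a_time_sensitivity(activity, factors, rel_uncertainty)
print(base, changes)
```

EFs whose perturbation moves the total the most are the ones where better primary data would narrow the overall estimate fastest.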

4.1 Customized SaaS platform development

Once the “progressive narrowing” approach had identified the pivotal Scope 3 emission categories, such as the “use of sold products”, which was deemed a major contributor to emissions, the research focus shifted to the development of a SaaS platform tailored for product users, as illustrated in Fig. 2. This customized platform enables end-users to input usage data easily, supporting precise carbon emission calculations.

Fig. 2 SaaS platform schematic for product user engagement in tracking Scope 3 emissions

The cornerstone of the study is the clarification of the specific types of information to be collected from users. In parallel, effort was devoted to the software’s User Interface and User Experience. By improving the interface design and interaction dynamics, the platform can guide users effectively through data entry. Once the mandatory data is correctly entered, an intuitive dashboard is displayed, helping users quickly comprehend the implications of their carbon footprint.

4.2 Agile development model

Following the initial deployment of the SaaS platform to product users, it undergoes systematic refinement through iterative development (Martin 2003), steadily extending its usefulness to a broader range of stakeholders (Fig. 3). Upon accessing the platform, users can select or update their role through the registration or personal information section. The interface then customizes and displays the information fields pertinent to the specified role. For instance, if a user is identified as an employee of a freight forwarding company, the page will display a dropdown menu for selecting the mode of transportation, together with fields for entering data such as transportation mileage, in order to calculate emissions related to the “Transportation and Distribution” category of Scope 3 GHG emissions. By adopting the Docker containerization platform, the tool supports scalability and smooth deployment across multiple cloud platforms, preserving its versatility across diverse user environments (Kane and Matthias 2023). This strategic expansion enables the thorough aggregation of primary data across various Scope 3 categories, addressing prevalent concerns of subpar data quality. Importantly, combined with the outcomes of the subsequent research phases, the tool will offer targeted insights into potential emission reduction strategies, such as fuel substitution and cleaner transport modes, thereby enabling a strategic and responsive approach to GHG management.

Fig. 3 Evolution of the SaaS platform for diverse stakeholders’ engagement in managing all 15 categories of Scope 3 emissions
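The role-dependent form described in this subsection can be sketched as a mapping from user roles to input fields, followed by an emission calculation for the freight forwarder case. The role names, field names, and per-kilometre emission factors below are hypothetical and are not taken from the platform itself; operational factors are often expressed per tonne-kilometre, which this simplified sketch ignores.

```python
# Hypothetical mapping from user role to the fields shown on the input form.
ROLE_FIELDS = {
    "freight_forwarder": ["transport_mode", "distance_km"],
    "product_user": ["vehicle_model", "distance_km", "fuel_litres"],
}

# Hypothetical emission factors in kg CO2e per km, for illustration only.
TRANSPORT_EF = {"road": 0.5, "rail": 0.125, "sea": 0.0625}


def fields_for_role(role):
    """Return the input fields the interface would display for a given role."""
    if role not in ROLE_FIELDS:
        raise ValueError(f"unknown role: {role}")
    return ROLE_FIELDS[role]


def transport_emissions(form):
    """Compute emissions from a completed freight forwarder form."""
    missing = [name for name in ROLE_FIELDS["freight_forwarder"] if name not in form]
    if missing:
        raise ValueError(f"missing mandatory fields: {missing}")
    mode = form["transport_mode"]
    if mode not in TRANSPORT_EF:
        raise ValueError(f"unsupported transport mode: {mode}")
    return form["distance_km"] * TRANSPORT_EF[mode]


form = {"transport_mode": "rail", "distance_km": 800.0}
print(fields_for_role("freight_forwarder"))
print(transport_emissions(form))
```

Validating mandatory fields before computing anything reflects the requirement that the dashboard appears only once the mandatory data is correctly entered.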

4.3 Design principle

The platform’s design emphasizes transparency, a notable feature of the SaaS model (Erl and Monroy 2023). This transparency allows stakeholders to access and interpret data inputs and corresponding outputs from various supply chain sectors, thus promoting collaborative efforts to minimize carbon footprints and fostering participation in environmental management. Another prominent feature of the platform is its ability to store historical data in a cloud-based database. This function provides supply chain managers with a detailed overview of their organization’s carbon footprint. Coupled with the platform’s elaborate analytical capabilities of generating reports, charts, and tables, this solution enables emission trend analysis and project progress visualization, significantly streamlining their management processes. Further, the application will be built on agile (Martin 2008) and modern cloud-native development practices to rapidly adapt to user feedback and evolving standards.

4.4 Data security and privacy protection

Given that cloud platforms store and process vast amounts of sensitive information, it is essential to implement strong encryption and access controls and to comply with applicable data protection laws. In addition, regular security audits and vulnerability assessments can markedly strengthen data protection.

5 Supply chain optimization as a critical pathway to emission reduction

On the basis of the discussions presented in the preceding section, Scope 3 emissions involve multiple stakeholders, including suppliers, vendors, consumers, and recycling companies. When designing project-level GHG activities, emphasis should be placed on supply chain segments that the company can control directly or indirectly, given that the company’s capacity to influence the product use phase — particularly in terms of consumer behavior — is comparatively limited.

The focus of this section is to generate emission reduction strategies within the supply chain, with topic modeling serving as the fundamental approach (Fig. 4). In addition, we assisted the research subjects in developing reduction projects that align with the enterprise’s criteria. A baseline scenario was then established for each project, setting the stage for the subsequent phase of practical application. This allows a comparative analysis against post-implementation GHG reductions, and the procedure can be expressed as

$$Reduction=Baseline\,Emissions-Project\,Emissions.$$
(2)

Fig. 4 Flowchart of text analysis process

In addition, a monitoring plan was formulated, detailing the schemes to be employed, the essential parameters to be measured, and the measurement frequency.
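Eq. 2 can be applied per monitoring period, as in the minimal sketch below. The quarterly baseline and project figures are hypothetical, and the function names are illustrative only.

```python
def emission_reduction(baseline, project):
    """Eq. 2: reduction = baseline emissions - project emissions."""
    return baseline - project


def reduction_share(baseline, project):
    """Reduction expressed as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline emissions must be positive")
    return emission_reduction(baseline, project) / baseline


# Hypothetical quarterly figures in t CO2e, for illustration only.
baseline_by_quarter = [1200.0, 1100.0, 1300.0, 1250.0]
project_by_quarter = [900.0, 880.0, 975.0, 1000.0]

reductions = [
    emission_reduction(b, p)
    for b, p in zip(baseline_by_quarter, project_by_quarter)
]
print(reductions, sum(reductions))
```

Computing the reduction per period rather than only in aggregate matches a monitoring plan that specifies a measurement frequency.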

5.1 Data acquisition

To ensure thorough coverage of scholarly articles that intersect with our research interests, the Scopus database was chosen for data compilation (Harzing and Alakangas 2016). The “progressive narrowing” method allows the search to focus on mitigation measures for the most critical sources of emissions. However, because papers may not always directly mention specific emission categories or the corresponding parts of the supply chain, the following search string was applied to the title, abstract, and keywords fields to ensure an exhaustive study of potential reduction strategies: (“GHG” OR “greenhouse gas”) AND (“supply chain”) AND (“sustainab*”). The search was limited to peer-reviewed journal articles and reviews from 2013 to 2024, focusing on the most recent and relevant literature.

An online survey with open-ended questions was designed following the procedure proposed by Dillman et al. (2014) for survey design, dissemination, and collection. The survey targets middle-level and top-level managers from the automotive industry, concentrating on those engaged in strategic initiatives for GHG emissions reduction and technological innovations in environmental management. The survey link will be sent to the designated respondents by email. To improve the response rate, a multi-touch outreach strategy will be implemented, including personalized emails, scheduled follow-ups, and reminders. Prior to conducting the survey, we will share the findings from our preliminary research, which include a ranked list of emission categories as determined by the “progressive narrowing” method. The intention of the survey is to gain these industry leaders’ insights on key emission sources and their experiences in management practices.

5.2 Topic modeling and its visualization

The study utilized Latent Dirichlet Allocation (LDA), a generative probabilistic model that represents each document as a mixture of topics and each topic as a distribution over words. The model is implemented with Gensim, an open-source library for unsupervised semantic modeling. LDA enables an in-depth analysis of voluminous textual data, from which valuable insights can be gleaned for supply chain optimization schemes.

Both primary and secondary data will be analyzed to enrich the depth and breadth of the research. Primary data, sourced from survey responses, offers first-hand observations of prevailing industry practices. Secondary data, extracted from academic publications, provides theoretical underpinnings and empirical validations.

For thematic structure visualization, the study initially adopted word clouds, offering a visually engaging representation of key themes. In these clouds, the prominence of each word is indicated by its size, reflecting its frequency or significance within the dataset. This visualization method instantaneously conveys the most salient concepts derived from the textual data, such as specific technologies, policy measures, or market trends. The conspicuous presence of the term “renewable energy” in the word cloud, for instance, underscores the importance of exploring strategies for its adoption within the supply chain.
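The frequencies that drive word sizes in such a cloud can be computed with the standard library alone, as in the sketch below. The sample documents, the stopword list, and the multi-word phrases joined into single tokens are hypothetical; the rendering step itself is left out.

```python
import re
from collections import Counter

# A deliberately small, hypothetical stopword list.
STOPWORDS = {"the", "of", "and", "in", "to", "a", "for", "is", "with"}


def term_frequencies(documents, phrases=()):
    """Count term frequencies, treating the given phrases as single tokens."""
    counts = Counter()
    for doc in documents:
        text = doc.lower()
        for phrase in phrases:
            # Join multi-word phrases so they survive tokenization.
            text = text.replace(phrase, phrase.replace(" ", "_"))
        tokens = re.findall(r"[a-z_]+", text)
        counts.update(token for token in tokens if token not in STOPWORDS)
    return counts


# Hypothetical snippets standing in for abstracts or survey responses.
documents = [
    "Renewable energy adoption in the supply chain",
    "Suppliers switch to renewable energy",
    "Energy efficiency of logistics",
]
frequencies = term_frequencies(documents, phrases=("renewable energy", "supply chain"))
print(frequencies.most_common(3))
```

The resulting counts could then be handed to a word cloud renderer, with the most frequent terms drawn largest.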

Complementing word clouds, the research incorporated pyLDAvis, a tool designed to improve the interpretability of LDA models. pyLDAvis provides an interactive view that graphically depicts the topics uncovered in the textual analysis. Its dynamic interface is instrumental in examining the relationships among topics. For example, if “electric vehicles” frequently co-occurs with “government incentives”, this suggests a potential strategy of leveraging such incentives to integrate electric vehicles into supply chain transportation.
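The co-occurrence reasoning in this example can be checked independently of any visualization tool with a simple document-level count, sketched below. This is not how pyLDAvis works internally; it is a plain substring-based tally over hypothetical sample sentences, with the term list supplied by the analyst.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, terms):
    """Count, per pair of terms, the number of documents containing both."""
    counts = Counter()
    for doc in documents:
        text = doc.lower()
        present = sorted(term for term in terms if term in text)
        counts.update(combinations(present, 2))
    return counts


# Hypothetical sentences standing in for abstracts or survey responses.
documents = [
    "Government incentives accelerate electric vehicles in logistics fleets",
    "Electric vehicles and renewable energy reduce transport emissions",
    "Government incentives for electric vehicles lower fleet costs",
]
terms = ["electric vehicles", "government incentives", "renewable energy"]
pairs = cooccurrence_counts(documents, terms)
print(pairs.most_common())
```

Pairs are stored in alphabetical order, so each combination of terms has a single key regardless of the order in which the terms appear in a document.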

6 Emission predictions for reduction measures and policy assessment

To address the forecasting of Scope 3 GHG emissions, the research will employ, compare, and validate various machine learning models for robust data analysis and prediction accuracy. Figure 5 displays the comprehensive workflow.

Fig. 5 Flowchart of machine learning procedure

6.1 Data collection

The dataset will comprise historical quarterly carbon emissions calculated by the SaaS platform, along with the data pertinent to production and market activities, such as the quarterly production and sales volumes of automobiles, the types and quantities of utilized raw materials, and various metrics of energy consumption. Furthermore, with the objective of predicting the potential impacts of emission reduction initiatives and assisting enterprises in making informed decisions regarding the adoption and investment in specific environmental technologies, there is a need to amass data concerning the implementation of relevant technologies and strategies, including the proportion of electric vehicles in the transportation fleet, selection of suppliers, and the ratio of waste recycling and processing in factories.

6.2 Data vectorization

For this dataset, two data vectorization tasks are carried out to ensure the consistency and analyzability of the data. Binary categorical features, such as whether electric vehicles are integrated within the transport fleet, are encoded numerically as 0 or 1. Multi-class categorical features, such as the types of energy utilized (e.g., coal, natural gas, and renewable energy), are converted into a series of binary dummy variables and then incorporated into the original dataset.
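Both encoding tasks can be illustrated with a short standard-library sketch. The column names and sample rows are hypothetical; in practice a data-frame library would typically perform the same transformation.

```python
def encode_binary(value, positive="yes"):
    """Map a binary text value to 1 (positive) or 0 (anything else)."""
    return 1 if str(value).strip().lower() == positive else 0


def one_hot(rows, column):
    """Replace a categorical column with one binary dummy column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical records, for illustration only.
rows = [
    {"ev_in_fleet": "yes", "energy_type": "coal"},
    {"ev_in_fleet": "no", "energy_type": "renewable"},
    {"ev_in_fleet": "yes", "energy_type": "natural_gas"},
]
for row in rows:
    row["ev_in_fleet"] = encode_binary(row["ev_in_fleet"])
encoded = one_hot(rows, "energy_type")
print(encoded[0])
```

Every encoded row carries the same set of dummy columns, so the result can be fed to a model as a fixed-width feature table.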

6.3 Data imbalance and scarcity handling

To confront data imbalance, particularly with rare events or underrepresented categories within the dataset, the Synthetic Minority Over-sampling Technique (SMOTE), which generates synthetic samples from minority classes, will be employed to balance the dataset. In addition, anomaly detection algorithms will be used to identify and emphasize rare emission patterns, enhancing the model’s sensitivity to pivotal but scarce emission events. These strategies are intended to equip the models to handle diverse scenarios, thereby improving the robustness and reliability of the carbon emission predictions.
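The core idea of SMOTE, interpolating between a minority sample and one of its nearest minority neighbours, can be sketched in a few lines. This is a simplified illustration under assumed inputs, not a substitute for a tested library implementation; the sample points, `k`, and the seed are hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic points on segments between minority samples.

    Each synthetic point lies between a randomly chosen minority sample and
    one of its k nearest minority neighbours (Euclidean distance).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        neighbours = sorted(
            (point for point in minority if point is not base),
            key=lambda point: math.dist(base, point),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical minority-class feature vectors, for illustration only.
minority = [(1.0, 2.0), (2.0, 3.0), (3.0, 5.0)]
new_points = smote(minority, n_synthetic=4)
print(new_points)
```

Because every synthetic point is an interpolation, the new samples stay inside the region spanned by the observed minority data rather than extrapolating beyond it.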

6.4 Algorithm selection

In the preliminary phase of the study, with a relatively limited volume of data, traditional machine learning models such as Linear Regression, Decision Trees (DT), and RF will be employed. These models facilitate the extraction of initial insights into the trends of carbon emissions and contribute to the understanding of direct correlations between individual variables and carbon emissions. For example, DT can be utilized to discern key factors that influence carbon emissions, and to determine their respective threshold values, which can be understood as the specific points where carbon emissions will exhibit significant changes.
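How a decision tree arrives at such a threshold can be shown with a single regression split: every midpoint between adjacent feature values is tried, and the one that minimizes the summed squared error of the two resulting groups is kept. The production and emission figures below are hypothetical, and the sketch covers one feature and one split only.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((value - mean) ** 2 for value in values)


def best_threshold(feature, target):
    """Return (threshold, error) of the best single split on one feature."""
    pairs = sorted(zip(feature, target))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no split possible between identical feature values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        error = sum_squared_error(left) + sum_squared_error(right)
        if error < best[1]:
            best = (threshold, error)
    return best


# Hypothetical quarterly production volumes and emissions (illustrative only).
production = [100, 200, 300, 400, 500, 600]
emissions = [10.0, 12.0, 11.0, 30.0, 32.0, 31.0]
threshold, error = best_threshold(production, emissions)
print(threshold, error)
```

In this toy data the emissions jump between 300 and 400 units, so the split lands at the midpoint of that gap.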

As the quantity of data gradually expands, ensemble learning algorithms will be implemented, focusing on Gradient Boosting Decision Trees (GBDT), which are widely regarded as strong performers on regression problems. In GBDT, each subsequent decision tree is trained on the residual errors of the ensemble built so far, so that every iteration aims to reduce the remaining prediction error (Ke et al. 2017). GBDT is adept at integrating various influential factors, including production levels, types of energy usage, and shifts in policy and regulation. If, for example, a company plans to broadly adopt renewable energy in a particular year and to benefit from new environmental tax incentives, the GBDT model can be used to forecast the effects of these policy adaptations and behavioral changes on the firm’s future carbon emissions (Li et al. 2024).
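The residual-fitting loop at the heart of GBDT can be sketched with depth-one trees (stumps) and squared-error loss. This is a teaching sketch under those assumptions, not the implementation intended for the study; production libraries add regularization, deeper trees, and many optimizations. The data, learning rate, and tree counts are hypothetical.

```python
def fit_stump(feature, targets):
    """Fit a one-split regression tree; return (threshold, left_mean, right_mean)."""
    order = sorted(range(len(feature)), key=lambda i: feature[i])
    best = None
    for pos in range(1, len(order)):
        lo, hi = feature[order[pos - 1]], feature[order[pos]]
        if lo == hi:
            continue
        left = [targets[i] for i in order[:pos]]
        right = [targets[i] for i in order[pos:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = sum((t - left_mean) ** 2 for t in left)
        error += sum((t - right_mean) ** 2 for t in right)
        if best is None or error < best[0]:
            best = (error, (lo + hi) / 2, left_mean, right_mean)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def stump_predict(stump, value):
    threshold, left_mean, right_mean = stump
    return left_mean if value <= threshold else right_mean


def fit_gbdt(feature, target, n_trees=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(target) / len(target)
    predictions = [base] * len(target)
    stumps = []
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(target, predictions)]
        stump = fit_stump(feature, residuals)
        stumps.append(stump)
        predictions = [
            p + learning_rate * stump_predict(stump, v)
            for p, v in zip(predictions, feature)
        ]
    return base, stumps, learning_rate


def gbdt_predict(model, value):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, value) for s in stumps)


def mean_squared_error(model, feature, target):
    return sum((gbdt_predict(model, v) - t) ** 2 for v, t in zip(feature, target)) / len(target)


# Hypothetical training data, for illustration only.
x = [1, 2, 3, 4, 5, 6]
y = [5.0, 7.0, 6.0, 15.0, 18.0, 30.0]
mse_one_tree = mean_squared_error(fit_gbdt(x, y, n_trees=1), x, y)
mse_twenty_trees = mean_squared_error(fit_gbdt(x, y, n_trees=20), x, y)
print(mse_one_tree, mse_twenty_trees)
```

Comparing the training error after one tree and after twenty shows the effect of successive residual fitting; on real data the number of trees would be chosen against a held-out set to avoid overfitting.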

With further increases in data volume and feature complexity, multi-layer Artificial Neural Networks (ANN) become a suitable choice for handling data with complicated patterns and nonlinear relationships. Considering the array of emission reduction measures that an automobile manufacturing enterprise might undertake, such as utilizing Internet of Things (IoT) technology for monitoring energy consumption in production processes, applying big data to optimize production planning and resource utilization, and installing solar panels in production facilities, ANNs can be employed to analyze the combined effects of these diverse tactics. Using the Keras library, a sequential ANN model incorporating multiple hidden layers with varying numbers of neurons can be constructed (Goodfellow et al. 2016). Such a model could predict the milestones and expected outcomes of emission reduction projects, thus providing decision support for enterprises in formulating effective emission reduction strategies amidst a dynamic market and policy environment. In this manner, deep learning can help enterprises better understand and manage their carbon footprint, advancing the achievement of sustainable development goals.

7 Future directions and challenges

As technology advances and market conditions evolve, strategies for managing and reducing Scope 3 emissions in the automotive manufacturing sector must continually adapt and upgrade. This section explores promising future trajectories of these strategies alongside potential challenges they may meet.

7.1 Potentials of technological advances

Artificial intelligence: advancements in machine learning will refine predictive models, enhancing corporate insights into emission patterns and facilitating the formulation of more efficient reduction strategies.

IoT applications: enhanced IoT integration in supply chain management promises more granular and real-time monitoring, improving emissions management through precise data collection.

Blockchain technology: blockchain adoption can elevate data transparency and security, paving the way for streamlined information exchange among stakeholders.

7.2 Shifts in policy and market landscapes

Policy formulation: the growing focus on climate change globally anticipates the development of policies and regulations aimed at carbon emissions, potentially affecting corporate strategies for emission management.

Market dynamics: the expanding interest in sustainability among consumers and investors is likely to prompt more enterprises to proactively manage Scope 3 emissions.

7.3 Potential challenges

Complexity in data analysis: exponential data growth complicates its effective management and analysis.

Globalized supply chains: the globalization of supply chains requires firms to adapt to regulatory environments and operational challenges across jurisdictions.

Return on investment: investment in emerging technologies and enhancements in supply chain processes often requires significant initial outlays, which could deter certain enterprises.

Stakeholder engagement maintenance: it is indispensable to manage diverse stakeholders’ expectations and needs, establish effective communication and feedback mechanisms, and resolve conflicts of interest.

Addressing these challenges calls for flexible and adaptive approaches, with an emphasis on continuously monitoring market and policy shifts, and regularly reassessing reduction tactics for long-term viability. Furthermore, it is crucial to ensure that mitigation efforts effectively meet both immediate and future needs.

8 Conclusion

This article reviews the status quo of Scope 3 emissions management and introduces a comprehensive strategy for Scope 3 emissions accounting, management, and reduction, with a focus on the automotive industry. Through analyzing existing literature and practices, we find that despite facing challenges with data gathering, standardization gaps, and cross-border regulatory differences, there exists immense potential for significantly cutting emissions by embracing technological advancements and strategic enhancements. We particularly emphasize the crucial role of advanced technologies like cloud computing, text analysis, and machine learning in enhancing Scope 3 emissions management. These technologies facilitate more accurate corporate emission quantification and further reveal novel emission reduction pathways.

Moreover, future technological innovations, policy shifts, and market dynamics have been discussed, highlighting the adaptation challenges that enterprises might face in the ever-changing technical and social landscape. For long-term effective management of Scope 3 emissions, enterprises need to deploy adaptive and flexible strategies, maintain continuous tracking of external variations, and periodically evaluate and modify their reduction measures. By aligning with technological progress and market evolution, the automotive manufacturing sector is positioned to not only manage emissions effectively but also promote sustainability and long-term growth, contributing to the broader goals of sustainable development.