Assessing the Gap between Technology and the Environmental Sustainability of European Cities

The growth of urban populations has increased interest in the opportunities and challenges that Information and Communication Technology (ICT) presents for carbon footprint reduction, which fosters cities’ environmental sustainability. Using Principal Component Analysis (PCA), six ICT-related variables from European Union (EU) cities were combined into a single two-dimensional ICT index. Then, through cluster analysis, cities were grouped into four clusters based on the ICT index and carbon dioxide (CO2) emissions. Using ICT as an indicator of smartness and CO2 emissions as an indicator of sustainability, we show that it is possible for a city to be smart but not sustainable and vice versa. Results also indicate that there is a gap between cities in northern Europe, which are the top performers in both categories, and cities in south-eastern Europe, which do not perform as well. The need for a common strategy for achieving integrated smart, sustainable and inclusive growth at a European level is demonstrated.


Introduction
The growth and development of an economy are intrinsically tied to its investment in the development, adoption and innovative use of modern technologies (European Commission 2010a; Gouvea et al. 2017). The effective deployment and use of ICT can increase the rate of innovation in societies and economies (Gouvea et al. 2017). Furthermore, ICT is the binding element of the triple-helix dimension of sustainability: economy, environment and society (Gouvea et al. 2017; UNECE 2015a).
The EU is taking a leading role in championing the push for a digital agenda for Europe, as the successful implementation of an ICT strategy is vital to the inclusive, equitable and sustainable economic development of Europe (Gouvea et al. 2017). This vision is contained in the EU's strategy for smart, sustainable and resilient growth, which articulates the EU's plan to achieve an inclusive, sustainable Europe and sets targets for reducing greenhouse gases (GHG) by 2020 (European Commission 2010a). Some of the targets include more energy-efficient ICT products and ICT-enabled energy-efficient buildings (OECD 2009). A pivotal component of this plan is an urban agenda which aims to make cities attractive and sustainable through an integrated and coordinated approach (European Commission 2010b).
The role played by ICT in the development of smart and sustainable cities globally cannot be over-emphasised (European Commission 2010a). Cities are important to the climate change and environmental sustainability discourse because cities around the globe accommodate more than 70% of the global population and are responsible for about 75% of global energy use and energy-related GHG emissions (IPCC 2014a; UN-HABITAT 2011). According to Gouvea et al. (2017), “it is clear that without the diffusion and dissemination of ICT; countries cannot garner the full environmental benefits and the positive externalities that are generated as a result”. It is crucial to assess the relationship between ICT and environmental sustainability within cities in Europe, considering the serious environmental problems the world is facing and the interest of the EU in pursuing a digital agenda to make its cities smarter.
Previous studies have sought to understand the association between ICT and environmental sustainability using regression. Wu and Raghupathi (2018) performed a country-level study to examine this relationship and found out that ICT is positively correlated with sustainability and thus has the potential to promote environmental sustainability. Furthermore, Gouvea et al. (2017) reported a positive linear relationship between environmental sustainability and ICT in countries. However, Añón Higón et al. (2017) reported a contrasting result identifying a non-linear relationship between ICT and environmental sustainability in the shape of an inverted U-curve. The findings of these studies are based on the macro-level using country data and fail to account for the variations in this relationship from one country to another. One may ask if the relationship found is the same across all countries studied or if there are exceptions.
Furthermore, there is still a knowledge gap on the micro-level interaction of ICT and environmental sustainability. By “micro-level interaction”, we mean studying this interaction using city-level data as opposed to the country-level data (macro-level) used in previous studies. Country-level data are aggregations produced by averaging measurements across different units (cities, towns and villages) within the country. This differs from city-level data, which are used to infer the city-level interaction of ICT and environmental sustainability, study within-country variations and perform inter-city comparisons. This paper strives to contribute to this discourse by using a different approach to study this interaction at a micro-level using city data. Thus, this research aims to answer the following research question: is there a relationship between the ICT development and environmental sustainability of cities? A follow-up to this question is: what is the nature of this relationship, and is it consistent or does it vary across different cities? This question is vital to understanding the concept of smart cities in relation to environmental sustainability. The concepts of “smart city” and “sustainable city” are often used interchangeably in the literature without a proper understanding of the relationship between them (Bibri and Krogstie 2017a).
Furthermore, several smart city assessment frameworks strongly focus on ICT development without understanding its relationship with environmental sustainability (Ahvenniemi et al. 2017). We intend to clarify this issue empirically by using data from cities. Specifically, we used data from the year 2016 for 129 cities in the 28 member states of the EU (EU-28). Our research approach is as follows:
1. Propose a framework to measure the ICT development of EU cities.
2. Calculate a single measure of ICT development (an ICT development score) based on the framework developed in objective 1.
3. Determine an appropriate measure of environmental sustainability in EU cities.
4. Establish the relationship between the ICT development and environmental sustainability of EU cities.
5. Cluster all cities in Europe based on their ICT and environmental sustainability relationship.
Our work makes significant contributions to the gaps identified in the literature. We present an exploratory analysis of the relationship between the smart cities concept and environmental sustainability using data on 129 cities within the EU. In the context of this paper, the smart cities concept is viewed from the perspective of ICT development within a city and is measured using six ICT indicators, while environmental sustainability is proxied using the GHG emissions of cities. Furthermore, we contribute to understanding the role different dimensions of ICT development play in urbanisation and environmental sustainability. This study yields valuable insights for policymakers in the urban domain on the dynamics of different levels of ICT development and its relationship to environmental sustainability for different cities within the EU.
The rest of this article is structured as follows: In Section 2, we conduct a literature review of key concepts related to this study; in Section 3, we present the framework to measure the connection between ICT and environmental sustainability; in Section 4, we present our data and methodology; Section 5 shows the results of our analysis; Section 6 discusses the results obtained presenting the implications and limitations of the paper as well as possible future work; and Section 7 presents the conclusions.

Literature Review
We delve into the literature to further understand and contextualise the “sustainability”, “smart” and “city” concepts within the scope of our study.

The “Sustainability” Concept
Sustainability is often used in the literature with reference to development to connote the act of balancing the pace of development with the amount of resources required to achieve such a pace. Environmental sustainability can be traced back to the eighteenth century when Carl von Clausewitz postulated that we should not be cutting down trees at a rate higher than that at which they are replaced (von Clausewitz 2009). This postulation is the underlying principle of environmental sustainability: not using the earth's resources faster than they are replenished. The unsustainable use of the earth's resources to meet man's immediate needs has resulted in, among other things, an increase in the amount of CO2 in the atmosphere. An increase in atmospheric CO2 will, in turn, lead to the warming of our planet, raising the risk of flooding, drought and an increase in natural disasters, among other things (IPCC 2018).
The concept of sustainability was first introduced to the urban domain in the early 1990s (Wheeler and Timothy 2014). It was born out of a realisation of the risks that urban development poses to the environment, which may lead to a perilous future (Bibri and Krogstie 2017a). According to Bibri and Krogstie (2017a), urban sustainability is “a desired state in which urban society strives for achieving a balance between environmental protection and integration, economic development and regeneration, and social equity and justice within cities as long-term goals through the strategic process of sustainable urban development as a desired trajectory”. This definition is supported by the wider literature, which hinges sustainable development on three components: economy, environment and society (Jenks and Jones 2008; UNECE 2015b).

The “Smart” Concept
The word “smart” has been used in several contexts in the literature. However, it has been most used to refer to the application of ICT in the creation and management of products, services and systems (Nasrawi et al. 2016). Its use in the urban domain can be traced back to the 1960s when the term “cybernetically planned cities” was used to refer to the use of ICT to plan urban environments (Gabrys 2014). Since then, the word “smart” has been more uniformly used when referring to the use of ICT in the urban domain (Mustafa and Kar 2017). This practice ranges from understanding how cities function in order to improve their management, to offering services that improve the quality of life of citizens (Bibri and Krogstie 2017a; Rana et al. 2018). It has also improved other aspects of society connected to the urban domain such as education, communication, financial services and commerce (Cruz-Jesus et al. 2017).

The “City” Concept
According to Nasrawi et al. (2016), a city “is a place where people live that is larger or more important than a town”. Its importance is due to the size of its population and the concentration of economic opportunities within it. Cities have grown rapidly because of urbanisation, and this has profound consequences for global sustainability. This research views cities as an object to which the “smart” concept and the “sustainability” concept can be linked. In the literature, there are two research strands related to cities and sustainability. One focuses on the impact of urbanisation and cities on sustainable development, and the other focuses on how we can make cities more sustainable. One of the ways to make cities more sustainable is through the proper deployment of ICT in its various forms to monitor, understand, probe and plan cities. This approach has come to be known as the “smart sustainable city concept” by academia and the international community (Bibri and Krogstie 2017a).

The “Smart Sustainable City” Concept
Smart sustainable cities is a concept that fuses environmental sustainability, urbanisation and technological development (Höjer and Wangel 2015). It is a term that combines smart city and sustainable city, used to describe the use of ICT to enable cities to become more sustainable and to improve the quality of life of citizens (Akande et al. 2019; Al-Nasrawi et al. 2015; Bibri and Krogstie 2017b; Höjer and Wangel 2015; Kramers et al. 2016). The use of ICT in urban systems and domains includes but is not limited to “sensing, collecting, storing, coordinating, integrating, processing, analysing, synthesising, manipulating, modelling, simulating, managing, exchanging, and sharing data for the purpose of monitoring, understanding, probing and planning modern cities to achieve particular goals” (Bibri and Krogstie 2017a). The emerging field of smart sustainable cities is gradually developing into two main research strands: one that focuses on the effects and implications of ICT use and infrastructure on urban sustainability (Kramers et al. 2014; Nasrawi et al. 2016) and another that deals with the development of integrated frameworks to measure the combined smartness and sustainability of cities (Ahvenniemi et al. 2017). This research is in line with the first strand, focusing on the relationship between ICT and environmental sustainability in cities.

The Link between ICT and Environmental Sustainability
It is worth noting that, despite the obvious link between ICT and environmental outcomes, there are no statistical indicators that directly measure this link (OECD 2009). Research exploring the impact of ICT and the information society on environmental sustainability only began appearing in the literature in the year 2000 (Ospina and Heeks 2010). A survey of this literature reveals three interrelated research strands: sustainable development, mitigation and adaptation.
The first strand of research in this field takes a global perspective to link sustainable development and the information society. These studies discussed the potential effect the rapid expansion of ICT could have on the achievement of the Millennium Development Goals and environmental sustainability. Slob and van Lieshout (2002) studied the trends of ICT development and their implications for sustainability. Although they found that ICT has stimulated economic growth, they were not certain of the effect that rapid developments in ICT would have on the environment and called for more research into technology-environment interactions. Furthermore, Willard and Halder (2003) considered the implications of the Information Society for sustainable development, highlighting the consequences environmental information systems, e-commerce and e-participation will have on sustainability from a broad perspective.
The second research strand focuses on the role ICT applications play in the reduction of CO2 emissions. Here, the International Telecommunications Union plays a prominent role, identifying the direct, indirect and systematic effects of ICT on CO2 emissions and recommending policies and activities aimed at reducing the adverse effects and promoting the positive effects (ITU 2008). This strand focuses on how the use of ICT can affect the environment in developed countries. For example, using video conferencing instead of travelling, or sending e-mails instead of postal mail in paper form, can reduce the carbon footprint of the user (Wu and Raghupathi 2018). Literature here focuses on the opportunities posed by ICT to reduce CO2 emissions generated by energy consumption (ITU 2009), travel and mobility (Sustainable Development Commission 2010) and industries (Mingay and Pamlin 2008).
The third strand of research explores how ICT can play a role in climate change adaptation in developing regions, because these are most vulnerable to the effects of increased CO2 emissions. This research strand examines how ICT can be used as a tool to foster innovation and involve all stakeholders in climate change strategies (Ospina and Heeks 2010). This is exemplified by research on poverty reduction and environmental protection in Kenya (Mungai 2005) and the use of SMS to report air pollution in the Philippines (Dongtotsang and Sagun 2006).
Of these three strands, our research falls within the second as we explore how to exploit this relationship and apply it within the context of cities and urbanisation (Yi and Thomas 2007).
ICT products and services are playing an essential role in improving the sustainability and liveability of cities. These products and services range from the new wave of computing such as big data analytics, Internet of Things (IoT) and Artificial Intelligence (AI) to ICT-enabled applications such as eCommerce, eBanking, eHealth, eLearning, and e-mail (Bibri and Krogstie 2017a; Cruz-Jesus et al. 2017; Pappas et al. 2018). They enable new sustainable ways of doing business and providing services in what is known as “Digital Transformation” (Pappas et al. 2018; Venkatesh 2008). Furthermore, when combined with e-government, these e-services can facilitate the transition of society into “digital nations”. Most importantly, they promote dematerialisation, which has implications for the environment. Big data generated from various IoT technologies give businesses an insight into how their products and services are affecting the environment, providing them with an opportunity to help decarbonise the global economy (Mikalef et al. 2017). Furthermore, ICT gives businesses a competitive advantage via eCommerce (Gorla et al. 2017; Guo and Gao 2017), provides a cost-effective and flexible way to learn via eLearning (Joseph et al. 2017; Teo et al. 2018), improves communication between medical personnel and their patients via eHealth (Maresova and Klimova 2017; Serrano et al. 2018), increases the speed and flexibility of financial transactions via eBanking (Garín-Muñoz et al. 2018) and reduces the cost of communication via e-mail (Gouvea et al. 2017). Overall, the ICT services highlighted above have reduced the need to commute and are inherently connected with environmental sustainability.
There are various studies on the connection between ICT, urbanisation and sustainability. Using ideas proposed by Giffinger et al. (2007), Bifulco et al. (2016) studied the roles of ICT in improving sustainability within smart cities. Using the Network Readiness Index of the World Economic Forum as a proxy for ICT and the Environmental Performance Index as a proxy for environmental sustainability, Gouvea et al. (2017) studied this link at a country level. This link was studied using ordinary least squares regression with ICT as the predictor variable. They found a positive and significant linear relationship between ICT and environmental sustainability. Their work was in line with Wu and Raghupathi (2018), who did an exploratory study on the strategic association between ICT and sustainability at country level using data from the World Bank Group. Wu and Raghupathi (2018) studied this relationship using multiple linear regression with five ICT factors, derived from averaging various ICT measurements, as explanatory variables. They found that ICT factors are positively associated with sustainability. However, Añón Higón et al. (2017), using ordinary least squares regression, discovered that the relationship between ICT and CO2 emissions is not linear but rather “inverted U-shaped” at country level. Watson et al. (2010) also discussed the role of ICT in improving energy efficiency and fostering changes that reduce the environmental impact of cities.
Despite the wealth of literature analysing the relationship between ICT development and environmental sustainability across different countries from a broad perspective, this relationship is not yet fully understood at a micro-level. Hence, there is a need to take a comprehensive approach to understand the relationship between ICT solutions and environmental sustainability (Bibri and Krogstie 2017a, b). Although Kramers et al. (2016) argue that it is difficult to determine the role of ICT in environmental sustainability because of the larger socio-technical system which encapsulates ICT, we believe that this complex relationship can be broken down by operationalising theoretical constructs and studying specific aspects of this relationship. A foundational step in deciphering this relationship is understanding how it varies across different cities at a micro-level.

Measuring the Connection between ICT and Environmental Sustainability
Measuring the link between ICT and environmental sustainability requires the use of indicators. The Organization for Economic Co-operation and Development (OECD) Working Party on Indicators for the Information Society (WPIIS) is the international body charged with the responsibility for setting international standards for ICT indicators, which it has been doing since 1997. Although there are existing information society / ICT indices which we could have made use of, these indices exist only at macro (national) level. Some of these indices include the IDC's Information Society Index (Karvalics 2005), the ICT Development Index (ITU 2005, 2017a), the International Telecommunications Union's (ITU) Digital Access Index and Digital Opportunity Index (ITU 2005), the World Bank's Digital Adoption Index (World Bank Group 2016), the United Nations Development Program's Technology Achievement Index and the Infostate Index (Huyer et al. 2005). These macro-level indices represent a country with a single index and fail to consider the individual differences between the cities in the country. For example, in line with Oni and Papazafeiropoulou's (2014) idea, we would expect the characteristics of a capital city to differ from those of other cities within the same country.
Furthermore, some of these existing ICT indices are opaque about their derivation methods, while others incorporate several indicators that are not relevant for our study (Taylor 2016). This opacity implies that the methodological characteristics of existing ICT indices cannot be fully evaluated. The indicators included in these ICT indices reflect the purpose for which they were developed, which is not aligned with ours. Hence, we build our own index by analysing technologies which are relevant to the scope of our study, as opposed to making use of a generic index. The challenge with this approach is the unavailability of consistent data, relevant for this study, for all cities. We need to make a compromise between the number of variables to include in our study and the number of cities for which these data are consistently available. Cruz-Jesus et al. (2017) described this as making “a trade-off between the depth and the width of the analysis”. If more variables are used and fewer cities are included in our study, our findings may be narrow and limited to only the cities we have included, making it impossible to generalise our conclusions to other contexts. Conversely, if we include more cities and make use of fewer variables, our analysis may not fully capture the various dimensions of ICT development, and hence, we may arrive at misleading conclusions.
Lastly, the process of associating a weight with each variable used to build the ICT index can lead to unreliable results if done subjectively (Bruno et al. 2010). Some of the subjective approaches used in previous studies include the Equal Weighting approach (World Bank Group 2016) and the participatory approach. The Equal Weighting approach assumes that all variables contribute equally to the index (OECD 2008). However, this approach is unattractive because there is no empirical basis for this assumption. The participatory approach involves the use of principal stakeholders to assign weights based on their experience. Both approaches create indices without taking cognisance of the interrelationship between indicators. Having understood the opportunities and obstacles of building an ICT index from variables, we opted for the use of multivariate statistics to explore the suitability, underlying nature and structure of our data and used that information to assign weights and build an index. This approach takes care of the redundant variables included in our analysis as a result of the subjective selection of indicators.

Conceptual Model
We turn to the literature to build on previous theoretical work to help us develop a conceptual model and select appropriate indicators for analysis. The conceptual framework proposed by OECD for measuring the information society consists of the following interconnected modules: ICT supply, ICT Infrastructure, ICT demand, ICT products, information and electronic content and ICT in a broader context (OECD 2005). Several researchers have adapted modules from this framework and used it to study the ICT impact relationship in different contexts such as economic development (Cruz-Jesus et al. 2017) and sustainable development (Azadnia et al. 2017). Furthermore, building on the framework proposed by Berkhout and Hertin (2001) to study the demand side (users and uses) of ICT and its effect on the environmental sustainability, we sought to explore this link by including indicators that measure the infrastructure and pervasiveness of ICT in our study.
ICT demand entails the use of ICT goods or services by various groups and industries to carry out their activities efficiently. The European Commission's Digital Agenda for Europe clearly outlines the role of e-services and their importance to service delivery in connection with improving the quality of life of citizens (European Commission 2014). Specifically, eBanking, eLearning, e-mail and eHealth were listed as being “some of the most innovative and advanced online services” (European Commission 2014). The use of these services has been included as variables to outline the use of ICT by individuals within cities. Furthermore, we included the percentage of enterprises selling online to take cognisance of the pervasiveness of ICT among business units. This variable is an important indicator of dematerialisation and electronic commerce (Cruz-Jesus et al. 2012; OECD 2009).
ICT infrastructure comprises the services on which the information society relies for its proliferation (Rossy de Brito et al. 2018). We decided to include Broadband Internet per 100,000 persons because of the importance of a broadband connection for accessing the Internet. Broadband Internet is becoming increasingly important because of the recent explosion of bandwidth-intensive applications such as online gaming and audio and video streaming (Kim and Hwang 2012). Broadband Internet per 100,000 persons has been used by many authors to measure ICT development (Cruz-Jesus et al. 2017; OECD 2009; Vicente and López 2011). Data on this variable are widely available with relatively long time series, and there are standardised methodologies and collection procedures for these data (OECD 2009). This variable is included to measure the level of connectivity with regard to ICT infrastructure within cities (Cruz-Jesus et al. 2012).
In the domain of environmental sustainability, several authors have discussed the strong connection between environmental sustainability and GHG emissions in cities (Owusu and Asumadu-Sarkodie 2016; Privitera et al. 2018). Although GHG comprises CO2, methane, nitrous oxide and fluorinated gases, CO2 is the primary pollutant, accounting for about 80% of emissions (World Bank 2007). CO2 emissions originate from anthropogenic sources such as power generation, fossil fuel generation and use, construction and heating. Hence, we use CO2 as our leading indicator of environmental sustainability. Our conceptual framework is schematised in Fig. 1. It is made up of two pillars (perspectives): technological development, measured by ICT development, and environmental sustainability, measured by GHG emissions. The technological development pillar is divided into two dimensions: ICT demand, which measures the pervasiveness of ICT by characterising its various users and uses, and ICT infrastructure, which measures the connectivity of the information society using broadband Internet. The environmental sustainability pillar is assessed as the environmental impact of GHG emissions as measured by CO2 emissions.

Data
Based on our theoretical framework as outlined in Section 3.1, we have selected six variables to characterise the ICT development of cities and one variable to characterise the environmental sustainability of cities in Europe. These variables are listed in Table 1. They were selected based on the literature and on recommendations from the ITU, the OECD, the European Commission and the Intergovernmental Panel on Climate Change (IPCC) (European Commission 2010a; OECD 2009).
CO2 emissions are a key indicator of the contribution of cities to climate change. The parameter has been used in the past as a core indicator of environmental sustainability and climate change by the IPCC (IPCC 2014b).
It is important to note that the variables listed in Table 1 are not all-inclusive of every factor that contributes to the ICT development and digital sustainability of a city. However, we believe they are adequate, within the limits of data availability, to measure the ICT development and environmental sustainability of cities in Europe. Data for the variables were sourced from the ITU, OECD and Eurostat as compiled by 2thinknow City Benchmarking. These secondary data sources were primarily chosen for the following reasons: - They are the most authoritative source of data about the availability of ICTs in households and usage of ICTs by individuals (Eurostat 2016; ITU 2017b

Exploratory Data Analysis
An exploratory analysis was done using statistically robust numerical techniques to understand the structure of our data. Summary statistics were calculated for each variable, with the coefficient of asymmetry (skewness) and histograms used to check for the presence of outliers. Although all variables under consideration are numerical, the range of values of eBank, eLearn, e-mail, eHealth and eCom differs from that of BroIt. While eBank, eLearn, e-mail, eHealth and eCom are expressed as percentages with a range of 100, BroIt is expressed as a count with a range of 100,000. Hence, if used in their raw form, variables with higher numerical values would be assigned higher weights, which would undermine the contribution of other variables in our analysis. We therefore normalised all explanatory variables using the minimum and maximum values of each respective variable, transforming them to a scale between 0 and 1 before performing our hierarchical clustering tests (Akande et al. 2017). Furthermore, because PCA depends on the correlation structure of our data, Pearson's linear correlation coefficient was calculated to assess the existence of a linear relationship between our explanatory variables. The presence of a correlation between our explanatory variables will help us get meaningful results (Cruz-Jesus et al. 2017).
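The normalisation and correlation steps described above can be sketched as follows. This is a minimal illustration only: the function names and the five-city values are hypothetical and are not the study's data or code.

```python
from math import sqrt

def min_max_scale(values):
    """Rescale a list of numbers linearly onto the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def pearson(x, y):
    """Pearson's linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical values for five cities (illustrative only).
ebank = [35.0, 60.0, 82.0, 48.0, 91.0]                  # percentages
broit = [21000.0, 30500.0, 41000.0, 26000.0, 43500.0]   # per 100,000 persons

ebank_scaled = min_max_scale(ebank)
broit_scaled = min_max_scale(broit)

# Min-max scaling is a positive linear transformation, so it changes the
# scale of each variable but leaves Pearson's coefficient unchanged.
r = pearson(ebank_scaled, broit_scaled)
```

After scaling, both variables lie between 0 and 1, so neither dominates a distance-based method such as clustering purely because of its units.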

Principal Component Analysis (PCA)
PCA was used to transform our data from high dimensional space into a low-dimensional space (Spicer 2005). However, before performing PCA, we checked the suitability of our dataset to be analysed by testing for certain PCA assumptions. Firstly, we tested the linear relationship between all variables using Pearson correlation coefficients. Secondly, we tested for the sampling adequacy of our dataset using the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy. Thirdly, we tested our data's suitability for reduction using Barlett's test of Sphericity (Cruz-Jesus et al. 2017). After that, we performed PCA on the six ICT variables to derive a single measure each of ICT development. PCA describes the variation in our data by a set of uncorrelated variables known as the principal components. The ICT variables produced six principal components. We then investigated if the first few components account for most of the variation in our original dataset to retain them and discard the remaining components. Selecting the number of principal components to retain is a fundamental decision, as choosing fewer principal components than required will produce an incomplete representation of results and selecting more principal components than necessary will result in the inclusion of noise in the results (Friesen et al. 2016). Kaiser's criterion was used to determine the number of principal components to retain (Cruz-Jesus et al. 2017). Variable eigenvectors were investigated to determine which variables contributed the most to each dimension to aid the interpretation of our PCA results. We made use of the contribution of each variable to the inertia explained by each axis (CTA) and the part of variance associated with each variable explained by each axis (CTR) (Koch 2013) to determine the contributing variables to each axis. We made use of CTA and CTR because they can be used in interpreting our results statistically and geometrically (Abdi and Williams 2010).
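For reference, the three checks named above are commonly written as follows. These are the standard textbook forms, given here as a sketch; the symbols $r_{ij}$, $a_{ij}$, $R$, $n$, $p$ and $\lambda_k$ are notation introduced for this illustration and do not appear in the original text.

```latex
% KMO measure of sampling adequacy: r_{ij} are the Pearson correlations and
% a_{ij} the partial correlations between variables i and j.
\mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^{2}}
                    {\sum_{i \neq j} r_{ij}^{2} + \sum_{i \neq j} a_{ij}^{2}}

% Bartlett's test of sphericity: R is the p x p correlation matrix and
% n the number of observations (cities).
\chi^{2} = -\left( n - 1 - \frac{2p + 5}{6} \right) \ln \lvert R \rvert ,
\qquad \mathrm{df} = \frac{p(p-1)}{2}

% Kaiser's criterion on the correlation matrix: retain component k when
\lambda_{k} > 1 , \qquad \text{where } \sum_{k=1}^{p} \lambda_{k} = p
```

With the six ICT variables ($p = 6$), Bartlett's statistic has 15 degrees of freedom, and the six eigenvalues sum to 6, so a retained component explains more variance than any single standardised variable.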
Cronbach's Alpha was used to measure the internal consistency of each component (Cruz-Jesus et al. 2017). Finally, each selected component was weighted based on its variance. Variable loadings were multiplied by each selected principal component's weight and summed up to obtain a factor score. From the normalised table, we then calculated the coordinate of each city in relation to this factor score, multiplying the coordinate of vectors representing these cities by their respective factor score and summing them together (Friesen et al. 2016). This is more fully explained in Appendix B.
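The two calculations in this paragraph can be sketched in Python as follows. This is a hedged illustration under stated assumptions: `cronbach_alpha` uses the standard textbook formula, while `variance_weights` and `composite_score` reflect one plausible reading of the variance-based weighting (each retained component weighted by its share of the retained variance). The function names and the toy inputs are hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Standard Cronbach's alpha for a cases-by-items matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)        # variance of the summed scale
    return k / (k - 1) * (1.0 - item_var_sum / total_var)

def variance_weights(explained):
    """Weight each retained component by its share of the retained variance."""
    explained = np.asarray(explained, dtype=float)
    return explained / explained.sum()

def composite_score(component_scores, weights):
    """Weighted sum of a city's coordinates on the retained components."""
    return np.asarray(component_scores, dtype=float) @ np.asarray(weights)

# Two retained components explaining 60% and 17% of the variance.
weights = variance_weights([0.60, 0.17])
# Hypothetical coordinates of one city on the two axes.
index_value = composite_score([[1.0, 2.0]], weights)[0]
# Two perfectly consistent items give alpha = 1.
alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```

With variances of 60% and 17%, the weights come out at roughly 0.78 and 0.22, so the first axis dominates the combined score.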

Cluster Analysis
Cluster analysis was used as a robust method for identifying homogeneous groups of objects (in this case, cities) called "clusters" sharing similar ICT development / environmental sustainability characteristics in such a way that these groups "can achieve maximum internal homogeneity (within the cluster) and maximum external heterogeneity (between clusters)" (Brian Everitt et al. 2011; Oliveira et al. 2017).
After developing a quantitative index to measure the ICT development of cities in Europe using PCA, we made use of cluster analysis to create homogeneous groups of cities based on this index and the amount of CO2 emission in each city. We made use of a hybrid clustering approach, combining a hierarchical clustering method and a partitioning method (Kumar et al. 2014). Hierarchical clustering was done to determine the number of clusters to extract, since we had no prior information about the number of clusters into which to group the cities (Balcan and Gupta 2014). In hierarchical clustering, the structure of our data determines the number of clusters at each hierarchy. We also made use of the centroids of the selected clusters in the hierarchical clustering as seeds for the partitioning clustering method. The basic algorithm for hierarchical clustering can be described as follows:
1. There are N objects to group.
2. We find the closest two points and merge them into a new point.
3. We compute the similarity (distance) between this new point and the remaining points.
4. Iterate through steps 2 and 3 until there is only one point remaining.
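The four steps above can be sketched in plain Python. The example below is a minimal illustration that merges the two closest clusters and replaces them with their size-weighted centroid (centroid linkage); it is not the Ward's implementation used in the study, and the function name `agglomerate`, the `n_clusters` stopping parameter, and the toy points are assumptions introduced here.

```python
from math import dist

def agglomerate(points, n_clusters=1):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    # Step 1: every object starts as its own cluster (centroid, member indices).
    clusters = [(tuple(p), [i]) for i, p in enumerate(points)]
    merges = []  # (merge distance, merged members), i.e. the dendrogram heights
    while len(clusters) > n_clusters:
        # Step 2: find the closest pair of cluster centroids.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist(clusters[a][0], clusters[b][0])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        (ca, ma), (cb, mb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        # The merged "point" is the size-weighted centroid of the pair.
        centroid = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        merges.append((d, sorted(ma + mb)))
        # Step 3 happens implicitly: distances to the new centroid are
        # recomputed on the next pass. Step 4: loop until done.
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((centroid, ma + mb))
    return clusters, merges

# Hypothetical two-dimensional data (e.g. index value, emission value).
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
clusters, merges = agglomerate(points, n_clusters=2)
```

Recording the distance at each merge is what allows a dendrogram to be drawn and later cut at a chosen number of clusters.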
The hierarchical algorithm and distance measurement used determine the quality of the clusters obtained (Leisch 2006). We compared the cluster solutions obtained by using different hierarchical clustering algorithms. These include the centroid, Ward's, average-linkage, single-linkage and complete-linkage methods. Each of these methods was evaluated using its R-squared (R²) value, and the method with the highest R² value was selected. The R-squared value is a measure of the proportion of the total variance that is retained in each solution (see Appendix A). The selected hierarchical algorithm was thereafter tested using different distance measures, including Euclidean distance, squared Euclidean distance, Minkowski distance and city-block distance (Brian Everitt et al. 2011). Each of these distance measures gave similar results. The result of the best hierarchical clustering algorithm was visualised using a dendrogram (also known as a binary tree), which provides a visual depiction of the formation path of the clusters (Salah et al. 2012).
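To make the comparison criterion concrete, the sketch below computes R² for a cluster solution as one minus the ratio of the within-cluster sum of squares to the total sum of squares. This is a common definition and is assumed here to match the one intended in the text; the function name `r_squared` and the toy data are hypothetical.

```python
def r_squared(points, labels):
    """Share of total variance retained by a cluster solution: 1 - SSW / SST."""
    dims = len(points[0])

    def centroid(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(dims)]

    def sum_of_squares(rows, centre):
        return sum((r[j] - centre[j]) ** 2 for r in rows for j in range(dims))

    # Total sum of squares around the grand centroid.
    sst = sum_of_squares(points, centroid(points))
    # Within-cluster sum of squares around each cluster's own centroid.
    ssw = 0.0
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        ssw += sum_of_squares(members, centroid(members))
    return 1.0 - ssw / sst

# Hypothetical example: two tight, well-separated groups.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
r2_two_groups = r_squared(points, [0, 0, 1, 1])
r2_one_group = r_squared(points, [0, 0, 0, 0])
```

A value near 1 means that most of the variation lies between clusters rather than within them, which is why the method with the highest value was preferred.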
The k-means clustering algorithm requires that the number of clusters, K, be specified before applying the algorithm. To determine the number of clusters to use from the hierarchical clustering, we made a visual comparison of the number of clusters and the distance value at which individuals agglomerate to form a group. The dendrogram was thereafter "cut" at the appropriate number-of-clusters level. The centroids of each cluster were used to generate initial seeds for the partitioning clustering method (k-means). The k-means algorithm is also implemented in four steps:
1. Partition objects into k non-empty subsets.
2. Calculate initial seeds as the centroids of the clusters of the current partition.
3. Assign each object to the cluster that has the closest centroid.
4. Iterate through steps 2 and 3 until the centroids no longer move.
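The seeded k-means step can be sketched as follows. This is a minimal illustration in plain Python, assuming Euclidean distance and seeds supplied from a prior hierarchical solution; the function name `kmeans`, the `max_iter` safeguard and the toy data are assumptions made for the example.

```python
from math import dist

def kmeans(points, seeds, max_iter=100):
    """k-means started from given seed centroids (e.g. hierarchical centroids)."""
    centroids = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assign each object to the cluster with the closest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Stop when assignments, and therefore centroids, no longer change.
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each centroid as the mean of its current members.
        for k in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids

# Hypothetical data and seeds taken from a two-cluster hierarchical cut.
points = [(0, 0), (0, 1), (10, 10), (10, 11)]
seeds = [(0, 0.5), (10, 10.5)]
labels, centroids = kmeans(points, seeds)
```

Starting from the hierarchical centroids rather than random seeds makes the result reproducible and removes the need to guess K in advance.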
This two-step approach is guaranteed to generate reliable results (Brian Everitt et al. 2011). The mean of the ICT index and CO2 emission (previously normalised) were computed within each final cluster to profile each city, allowing us to formulate a hypothesis about the distribution of the association between ICT and CO2 among cities in Europe.
Furthermore, we used Global Moran's I spatial autocorrelation to test whether the relationship between ICT development and environmental sustainability is associated with the geographical location of each city (Gutiérrez et al. 2017).
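For reference, Global Moran's I can be computed from a vector of values and a spatial weights matrix as in the sketch below. The formula is the standard one; the function name `morans_i`, the toy values and the chain-shaped neighbourhood matrix are hypothetical and do not reproduce the weights used in ArcMap.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    weights[i][j] is the spatial weight between locations i and j
    (0 on the diagonal).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)          # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]    # weighted cross-products
                for i in range(n) for j in range(n))
    variance_term = sum(d * d for d in dev)
    return (n / s0) * cross / variance_term

# Hypothetical example: four locations on a line, each neighbouring the next.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 1, 5, 5], chain)     # similar values sit together
alternating = morans_i([1, 5, 1, 5], chain)   # neighbours always differ
```

Positive values indicate that similar observations sit next to each other, negative values indicate a checkerboard-like pattern, and values near -1/(n-1) are what would be expected under spatial randomness.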
Choropleth maps to visualise our results were designed in ArcMap, and all other statistical analyses were carried out in R (B. Everitt and Hothorn 2006) and SAS (Der and Everitt 2010).

Exploratory Analysis
A total of 129 European cities for which we were able to obtain data were analysed for all variables under consideration. The mean percentage of the population using eBanking, eHealth, e-mail and eCommerce services is less than the median of each variable. Hence, the distribution of each of these variables can be inferred to be negatively skewed. Conversely, the mean percentage of the population using eLearning services, the mean number of broadband Internet connections per 100,000 persons and the mean amount of CO2 emission are greater than the median of each variable, making the distribution of each of these variables positively skewed. Further exploration of the coefficient of asymmetry and histogram of each variable showed that the variables had an absolute skewness value close to zero and did not need to be transformed (Aesaert et al. 2017). It is interesting to note that Timisoara (Romania) has the lowest values among the cities under consideration, with only 5% of its population using eBanking services and 42% using e-mail, while Copenhagen (Denmark) has the highest values in both categories, with 93% of its population using eBanking services and 97% using e-mail. Timisoara (Romania) also has the lowest percentage of people using the Internet for eCommerce-related activities, with a value of 7%. The highest percentages of a city's population using the Internet for health-related and learning activities were in London, with values of 80% and 20% respectively. Although Dubrovnik (Croatia) had the lowest amount of CO2 emission, it also had the lowest percentage of people using the Internet for learning activities. Paris (France) had the highest amount of CO2 emission among all cities under consideration.
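The mean-versus-median reasoning used above can be checked with a short Python sketch. It compares the two statistics and computes the moment coefficient of skewness; the function name `skewness` and the sample values are hypothetical and are not taken from the study's dataset.

```python
from statistics import mean, median

def skewness(values):
    """Moment coefficient of skewness: third central moment over variance^1.5."""
    n = len(values)
    m = mean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical samples.
right_tailed = [1, 2, 2, 3, 10]      # a few large values pull the mean up
left_tailed = [-10, -3, -2, -2, -1]  # a few small values pull the mean down
symmetric = [1, 2, 3]

right_skew = skewness(right_tailed)
left_skew = skewness(left_tailed)
```

A mean above the median goes with a positive coefficient (a longer right tail), a mean below the median goes with a negative one, and a coefficient close to zero suggests that no transformation is needed.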
Even though these results give us an idea of the relationship between ICT and environmental sustainability among cities in Europe, simple univariate statistics are not sufficient to assess this relationship clearly because we are dealing with multiple variables. Hence, we make use of multivariate statistical techniques to analyse all seven variables.
The correlogram of the explanatory ICT variables in Fig. 2 shows that each variable has at least one correlation coefficient of 0.55 with another variable that is statistically significant at the 0.01 level. This illustration shows the presence of a linear relationship between all variables. Of particular interest is the relationship between the percentage of people using the Internet for banking services and the number of people using e-mail, which has a correlation value of 0.9, showing that they are very strongly correlated. Conversely, broadband Internet penetration has a low correlation value of 0.25 with the number of people using the Internet to seek health-related information.

Principal Component Analysis
We performed the KMO test to confirm the sampling adequacy of our data for PCA. The KMO for the overall dataset gave 0.77, which means that our PCA will produce reliable results. We also executed Bartlett's test of sphericity to test the suitability of our data for reduction. This test gave us a chi-square value of 370.2 with a p value less than 0.01 making it significant. Hence, we have adequate correlations between our variables for it to be reduced to a smaller number of components.
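Both suitability checks can be computed directly from the correlation matrix, as in the sketch below. It uses the standard formulas for Bartlett's statistic and the overall KMO measure on synthetic data; the function names and the data are hypothetical, and the sketch does not reproduce the reported values.

```python
import numpy as np

def bartlett_sphericity(X):
    """Bartlett's test statistic and degrees of freedom for a data matrix."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    R = np.corrcoef(X, rowvar=False)
    chi2 = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    dof = p * (p - 1) // 2
    return float(chi2), dof

def kmo_overall(X):
    """Overall Kaiser-Meyer-Olkin measure of sampling adequacy."""
    R = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    inv = np.linalg.inv(R)
    # Partial correlations from the inverse correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diag = ~np.eye(R.shape[0], dtype=bool)
    r2 = (R[off_diag] ** 2).sum()         # squared correlations
    a2 = (partial[off_diag] ** 2).sum()   # squared partial correlations
    return float(r2 / (r2 + a2))

# Hypothetical data: 129 rows, five correlated variables and one independent one.
rng = np.random.default_rng(0)
latent = rng.normal(size=(129, 1))
X = np.hstack([latent + 0.4 * rng.normal(size=(129, 5)),
               rng.normal(size=(129, 1))])
chi2, dof = bartlett_sphericity(X)
kmo = kmo_overall(X)
```

With six variables the test has 15 degrees of freedom, for which the 0.01 critical value of the chi-square distribution is about 30.58, so a statistic of 370.2 lies far inside the rejection region.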
We carried out a PCA to reduce the complexity of the ICT variables. As indicated in our methodology, selecting the number of principal components to retain is an important decision. Following the Kaiser criterion, we selected the first two principal components, which have a cumulative variance of 77%, to represent our data (Table 2).
The first principal axis (PC1) has a variance of 60%. Variables with CTA and CTR values higher than the average in an axis are considered to contribute significantly to that axis. These CTA and CTR values and their corresponding variables have been highlighted in green in Table 2. eBank, eHealth, e-mail, eCommerce and eLearn all contribute significantly to the inertia attributable to this axis (91.7%). Furthermore, the second axis (PC2), which represents an additional 17% of the total variance, only has broadband contributing significantly to it (50.2%). Overall, all explanatory variables under consideration are very well represented on the two selected components. We obtained a Cronbach's Alpha value of 0.91 for the first component and 0.80 for the second component, indicating acceptable reliability of our results (Dunn et al. 2014). Figure 3 gives some interesting insights into the behaviour of cities in Europe with respect to their ICT development. From our analysis, their ICT development can be represented by two dimensions. The percentages of the population using the Internet to access banking services, health-related services, mail services and learning services and to sell things online are strongly correlated with the first axis. Hence, this axis is related to the use of ICT and is therefore named ICT use. The second axis is only strongly correlated with broadband Internet access, which is related to the availability of ICT infrastructure. Hence, we call the second axis ICT infrastructure. We computed the loading for each city and plotted it in Fig. 3. An initial visual inspection of the plot reveals that, except for certain capital cities, cities from the same country tend to have similar ICT infrastructure and use characteristics and are hence clustered together.
Furthermore, we computed a factor score for each ICT variable as the weighted summation of the two principal loadings, multiplied it by their respective values in each city and summed it to obtain an ICT development index which can be found in Appendix C. Odense in Denmark is the best-ranked city for the two components together. It performs very well in terms of joint ICT use and infrastructure. Conversely, Timisoara in Romania is the least ICT developed city in the EU having a meagre ICT development score. These findings are consistent with Cruz-Jesus et al. (2012) who found Denmark to be the highest digitally developed country and Romania to be the least digitally developed country in the EU.

Clustering
We thereafter compared various hierarchical clustering aggregation algorithms using the ICT index and the amount of CO2 emission. The comparison of the R-squared values of the various hierarchical methods considered is shown in Fig. 4. Ward's aggregation criterion was chosen because it consistently had a high R-squared value, which is a measure of the extent to which clusters are different from each other. Beyond this, Ward's aggregation method has been found to be more robust than the other algorithms, being able to handle noise efficiently (Balcan and Gupta 2014; Brian Everitt et al. 2011).
Hierarchical clustering was thereafter done on the ICT index and CO2 emission using Ward's method. The result of this clustering is shown in the dendrogram in Appendix E, which shows the cities on the horizontal axis and the distance at which they agglomerate on the vertical axis. The dendrogram intuitively suggests the presence of four well-detached clusters. Hence, we cut our dendrogram at the four-cluster level and opted for a four-cluster solution. The centroids of the four clusters obtained from the hierarchical clustering were extracted and used as seeds to initialise the k-means clustering algorithm. We also obtained four clusters using this approach. From the analysis of the clusters, we found that certain cities belong to a cluster with a high value of the ICT index and low CO2 emission. Although this cluster is labelled cluster 3 in Fig. 5, we call it the "smart and sustainable" cluster. The average value of ICT development for cities in this cluster is 2.53, while the average CO2 emission for cities in this cluster is 6.39 million metric tonnes. We also noticed that another group of cities has extremely low values of ICT development and a moderately low amount of CO2 emission. Hence, these cities form the "not smart but sustainable" group, labelled as cluster 2 in Fig. 5, because the average ICT development index is 0.99 while the average CO2 emission is 8.65 million metric tonnes.
Furthermore, another group of cities has moderate ICT development and extremely high CO2 emission. Hence, this group is called the "smart but not sustainable" cluster, labelled as cluster 4 in Fig. 5. Finally, we have a group of cities with small CO2 emission and average ICT development, and hence we refer to the cities in this group as being "on the path of being smart and sustainable". It is labelled as cluster 1 in Fig. 5. All cities under study and their respective clusters are listed in Appendix C.

Geographical Analysis
The geographical distribution of the relationship between ICT development and environmental sustainability in cities was observed using maps to allow for the visual assessment of their comparative spatial distribution (Fig. 6). Some interesting insights drawn from Fig. 6 include:
• The geographic location of each city has a relationship with the link between ICT development and environmental sustainability. The 33 "not smart but sustainable" cities are predominantly found in south-eastern Europe, with the exception of five cities (15%) found in western and central Europe; all 29 "smart and sustainable" cities are found on the northern belt of Europe; and eleven "smart but not sustainable" cities are spread around central Europe.
• An obvious follow-up point is the presence of spatial autocorrelation. This factor means that cities in the same group are clustered together in space. This clustering implies that the relationship between ICT and environmental sustainability is not randomly distributed across the EU; rather, physically adjacent cities tend to have similar characteristics and form clusters. The presence of spatial autocorrelation was tested using Moran's Index. We obtained a Moran's Index of 0.083024 with a statistically significant z-score of 4.748540 and p value of 0.000002, indicating a tendency towards clustering with a less than 1% likelihood that our clustered pattern is a result of random chance. We have identified these clusters as "smart and sustainable", "not smart but sustainable", "smart but not sustainable" and "on the path of being smart and sustainable".
• Although we see that different cities within a country tend to have similar characteristics in terms of ICT development and environmental sustainability, we also noticed some exceptions. Of the 28 EU countries under study, five countries (Italy, France, Germany, Spain and the United Kingdom) have more than ten cities included in our study. The availability of data from a significant number of cities makes these five countries good for an intra-country comparison. Except for Italy, the capital city of each of these countries belongs to a different group from the other cities within it.
• Finally, it did not come as a surprise that smart and sustainable cities are located in the United Kingdom, Belgium, the Netherlands, Denmark, Sweden, Finland and Luxembourg, which are clustered in the north of Europe. Furthermore, the best-performing cities in terms of ICT development and environmental sustainability are in Nordic countries.
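The link between the reported z-score and p value can be checked with a few lines of Python. The sketch assumes a two-sided test against the standard normal distribution, which is an assumption about how the p value was obtained; the function name `two_sided_p` is hypothetical.

```python
from math import erfc, sqrt

def two_sided_p(z):
    """Two-sided p value of a z-score under the standard normal distribution."""
    return erfc(abs(z) / sqrt(2))

# z-score reported for the Moran's Index.
p_value = two_sided_p(4.748540)
```

Under this assumption the p value is on the order of 0.000002, in line with the figure quoted above and well below the 0.01 threshold.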

Discussion
The exponential growth of ICT in the last decade and its application in the push for smart cities have been fundamental in addressing several urbanisation challenges, ranging from energy use to the quality of life of citizens. This phenomenon has led to a debate on the role of ICT in urbanisation and urban sustainability because of its growing use in urban systems and domains (Bibri and Krogstie 2017a). Cities are particularly important because they generate a large share of global GHG emissions due to their urban population density while consuming 70% of the world's resources (Riffat et al. 2016). However, the relationship between ICT and environmental sustainability is quite complex and has not been empirically studied at the subnational (city) level (Dedrick 2010; Melville 2010). Understanding this relationship is paramount in helping policymakers improve urban sustainability with the support of smart ICT. In this study, using our proposed ICT framework, we measured the ICT development of cities in Europe using PCA, and we were able to extract and represent the two dimensions of ICT development: ICT infrastructure and ICT use. We were also able to represent the ICT development of cities in Europe using a single index and found cities in Sweden, Denmark, the Netherlands and Finland to have a high level of ICT development, while cities in Romania and Bulgaria have a low level of ICT development. This result corroborates previous studies which obtained similar results at the country level (Cruz-Jesus et al. 2017). Furthermore, we performed a cluster analysis of the ICT index and CO2 emission of cities in Europe, obtaining four clusters of cities with homogeneous ICT-CO2 relationships.
Regarding the relationship between ICT development and environmental sustainability across cities in the EU, we found that this relationship can be grouped into four categories: "not smart but sustainable", "smart and sustainable", "smart but not sustainable" and "on the path of being smart and sustainable". This finding is consistent with earlier research (Añón Higón et al. 2017), implying that the relationship between ICT and environmental sustainability is not the same for all cities but differs depending on their stage of ICT development. Hence, the possibility exists for this relationship to be positive, as in the case of cities categorised as "smart and sustainable", or negative, as seen in cities categorised as "smart but not sustainable". At low levels of ICT development, a further increase in ICT development may result in an increase in CO2 emission in cities. However, after a certain ICT development threshold, a further increase in ICT development in cities will result in a decrease in CO2 emissions. This finding differs from conclusions in previous studies (Gouvea et al. 2017; Wu and Raghupathi 2018) that infer that the relationship between ICT and environmental sustainability is linearly positive and the same for all countries. On the other hand, this finding supports previous suggestions that even though ICT contributes significantly to GHG emissions, it also has the potential to reduce emissions by a factor of five (Buttazzoni 2008; European Commission 2009).
Comparing the relationship between ICT development and GHG emission of cities within a country, we found that there is a tendency for the smart/sustainable profile of capital cities to differ from that of the other cities within the same country. Regarding GHG emission, our findings reveal that the magnitude of this difference is large, as the capital city of a country has much higher CO2 emission than other cities within the same country. This result is consistent with findings by Chen et al. (2016) and is expected, as capital cities are usually larger, more populated and more urbanised and hence emit more CO2 than other cities within the same country. In terms of ICT development, our findings reveal that the difference in ICT development between the capital city of a country and other cities within the same country is not as pronounced as that of GHG emission. However, decomposing the ICT index into its principal components shows that this difference is more pronounced in principal component 2, which represents broadband connections (see Table 2). This aspect can be explained by the fact that ICT services such as eBanking, eLearning, e-mail, eHealth and eCommerce are more centrally organised and offered by the government and companies. Hence, the main barrier to the use of these services is the availability of ICT infrastructure, which is higher in capital cities than in other cities (Prieger 2003). This finding is also in line with Ono and Zavodny (2007), who posit that differences in ICT access at an intranational level often occur because of different characteristics. This inference could also imply that the extent to which citizens have access to ICT infrastructure significantly influences their ability to use ICT services.
Comparing the relationship between ICT development and GHG emission of capital cities across Europe, we found Amsterdam and Copenhagen to be the best-performing capital cities in terms of smartness and sustainability, while Bucharest and Warsaw are the worst-performing capital cities. This assessment is also in line with previous studies with similar analysis at the country level (Añón Higón et al. 2017). The most substantial indicator of the difference between the performance of the capital cities is the amount of CO2 emission. The best-performing capital cities in Europe have found ways to utilise ICT and other means effectively to reduce their carbon footprint; the worst-performing capital cities have not been as successful in doing so. Also, our results show that the second indicator of the difference between the performance of capital cities within Europe is the level of ICT infrastructure. The wealth of countries has been identified in the literature as a major driving factor in determining the level of ICT infrastructure (Cruz-Jesus et al. 2017). This reasoning can also be applied to cities, as we observed that wealthier cities perform better than others.
The main contribution of our study is two-fold. Firstly, we proposed a new approach to study the relationship between ICT development and CO2 emission using cities in the EU as a case study. We employed multivariate statistics to overcome the limitations identified in alternative approaches to creating an ICT development index and created clusters of cities with similar smartness and sustainability profiles. Secondly, our study sheds light, through intra-country and inter-city comparisons, on the role that the different components of ICT development play in urbanisation and environmental sustainability at a micro-level.

Practical Implications
Understanding the relationship between technological development and the environmental sustainability of cities is critical for entities that promote smart and sustainable urban development.
The first implication can be drawn from the type of cities found in each cluster. Cities in Nordic countries and northern Europe perform best in terms of ICT development and environmental sustainability in the EU. The wealth of cities has been established in the literature as a major driving factor of their performance because economic wealth is a critical prerequisite for building ICT infrastructure and adopting ICT (Billón et al. 2008; Cruz-Jesus et al. 2012). It is essential that less wealthy cities in Europe recognise the need to take advantage of the growing market of low-cost technologies to foster environmental sustainability while taking cognisance of the potential adverse effects of such technologies. Furthermore, they also need to adopt urban policies and regulatory frameworks to foster the use of ICT in dematerialisation, travel substitution and building and energy efficiency. Finally, they need to explore environmentally sustainable business practices and move towards a knowledge-based economy driven by ICT for all-round "smart and sustainable" economic growth.
The second implication can be drawn from the way policymakers and the public perceive the concept of smart cities and sustainable cities. A common misconception is that all smart city concepts incorporate the goals of sustainable urban development. This misconception has been driven by previous literature which showed that the relationship between ICT development and environmental sustainability is positively linear (Gouvea et al. 2017; Wu and Raghupathi 2018). As a result, various smart city assessment frameworks have been designed that focus strongly on ICT with little attention given to environmental sustainability (Ahvenniemi et al. 2017). However, as evidenced by this research, the relationship between ICT and environmental sustainability varies among different cities. Hence, it is advisable that urban policymakers be intentional about explicitly including environmental sustainability in their smart city goals and leverage the use of ICT in achieving their sustainability goals.
Consequently, the concept of "smart sustainable city" should be used for emphasis by stakeholders when referring to the use of ICT to promote environmental sustainability. This shift presents a more holistic front and does not undermine the importance of either ICT or environmental sustainability in the quest for a more liveable city.
The last practical implication is derived from the gap in ICT development and environmental sustainability between the bigger cities (capital cities) and other cities within a country. Capital cities are urbanised because of the myriad opportunities that capital cities offer as opposed to smaller cities. However, this urbanisation comes with various challenges, one of which is the emission of GHG in large quantities. People will not mind staying in smaller cities as long as they have the same level of access to services available in large cities, among other factors. ICT can help to bridge this gap by offering services such as education, health, mail, banking and commerce remotely. However, people's ability to use all these is dependent on available infrastructure. People with broadband connections will take advantage of services offered through the Internet, using more applications and for a longer time (Van Dijk 2009). Policymakers should keep in mind that to reduce urbanisation and associated GHG emissions in large cities, they will need to broaden access to broadband Internet in smaller cities. This will reduce the pressure on the bigger cities as citizens will be able to carry out various services available through the Internet in smaller cities.
Overall, urban policymakers need a well-rounded approach when designing strategies that make use of ICT to drive environmental sustainability focusing on the built environment, living environment and resource flows. This strategy can range from the effective use of ICT to increase citizen engagement on climate change issues, to facilitating the use of applications allowed by ICT such as remote working and telecommuting.

Limitations and Future Research
It is important to note that in an attempt to operationalise theoretical constructs that are not directly measurable, we have proposed observable measures based on literature. While we made an effort to perform a complete and multidimensional exploratory analysis, we had to make a compromise between the depth and width of our analysis (Cruz-Jesus et al. 2017). There is an inverse relationship between the number of cities we could have included in our analysis (width) and the number of indicators we could have used to characterise ICT development and environmental sustainability (depth). Hence, we ended up including just 129 cities in Europe and seven indicators. So, some features of ICT and environmental sustainability may not be covered as we could not include other cities and other indicators because of data constraints. Additional variables relating to teleworking incidence, changes in paper production and physical mail, patenting activity connecting ICT and the environment and research and development by the ICT sector on environmental sustainability may affect the results of this study. Including other cities in Europe and beyond may also impact our results as other cultural, political and economic factors come into play. Furthermore, this study does not infer a direct causal relationship between ICT and environmental sustainability. We simply performed a cross-sectional study to cluster a homogenous group of cities based on the relationship between ICT and environmental sustainability.
For future work, it would be interesting to expand this research and include other indicators and other cities beyond Europe to better capture the relationship between ICT and environmental sustainability on a global stage. It would also be interesting to carry out this analysis for historical years to characterise the long-term relationship between ICT and environmental sustainability and its evolution over the years using a longitudinal study. This work would make it possible to forecast a "smartainability" (smart and sustainable) path of global cities into the future based on past data.

Conclusion
Several authors have explored the relationship between ICT and environmental sustainability, but this link has not been empirically explored at a sub-national level using multivariate statistics. In this research, we set out to explore the link between ICT development and environmental sustainability at a micro-level, clustering 129 cities in Europe based on this relationship. We achieved this by using PCA to create an ICT index for the cities under study and subsequently using cluster analysis to group and characterise each city based on its ICT index and CO2 emission. We explored the two dimensions of ICT development and also carried out intra-country and inter-city comparisons of the joint ICT-CO2 characteristics of European cities. Based on our analysis, there are four groups of cities with similar ICT (smart) and CO2 (sustainable) characteristics, namely: smart but not sustainable cities, not smart but sustainable cities, smart and sustainable cities and those on the path of being smart and sustainable.
Furthermore, we found that although cities from the same country usually have similar ICT infrastructure and use characteristics, capital cities tend to have a different profile. This difference is primarily due to the amount of GHG emission and ICT infrastructure in capital cities. Comparing the profile of capital cities across the EU, we found capital cities in northern Europe to be the best-performing cities in terms of smartness and sustainability, while capital cities that are "not smart but sustainable" are predominantly located in south-eastern Europe. The practical implications of this research have also been highlighted. The results of this research will help urban policymakers design appropriate strategies for achieving integrated smart, sustainable and inclusive growth at the European level.

Compliance with Ethical Standards
Conflicts of Interest: The authors declare that they have no conflict of interest.

Appendix 1
R² is a measure of the similarity (or dissimilarity) of groups in a cluster. It is given by the formula below:

Appendix 2
Procedure to obtain the score of each city from a PCA with 6 selected variables. Let $S_i$ be the final score attributed to each city $i$ ($i = 1, 2, 3, \ldots, 129$), where $\lambda_k$ is the $k$-th eigenvalue of the variance and covariance matrix $x_i^{t} x_i$, with $x_i$ being the 6-dimensional vector describing city $i$ after data normalization. So, $\sum_{k=1}^{2} \lambda_k$ is the total inertia explained by the first two axes.
Also, $\gamma_i^k$ is the coordinate of city $i$ on axis $k$ generated by the unit eigenvector $\mu_k$ given by: