
Learning Objectives

After reading this chapter, you should understand:

  1. Loading and cleaning data for use in model estimation

  2. Specifying measurement models in SEMinR syntax

  3. Specifying the structural model in SEMinR syntax

  4. Estimating a PLS path model using SEMinR syntax

  5. Summarizing a PLS path model in SEMinR

  6. Bootstrapping a PLS path model in SEMinR

  7. Accessing the contents of the summary objects

  8. Exporting the PLS-SEM results for reporting

SEMinR is a software package developed for the R statistical environment (R Core Team, 2021) that brings a user-friendly syntax to creating and estimating structural equation models. SEMinR is open source, which means that anyone can inspect, modify, and enhance the source code. SEMinR is distributed under a GNU General Public License version 3 (GPL-3), implying it is completely free for personal, academic, and commercial use – as long as any changes made to it, or applications built using it, are also open source.

SEMinR is hosted on GitHub (► https://github.com/sem-in-r/seminr). We encourage users to follow the GitHub page for SEMinR and contribute to this project or use the issues feature to report bugs or problems. Users of SEMinR can also interact with the developers and each other at the Facebook group (► https://www.facebook.com/groups/seminr). Participants regularly discuss recent developments, best practices, and tutorials on basic functionality of SEMinR. We also encourage you to follow this Facebook group for updates on bugs, issues, and new features.

The SEMinR syntax enables applied practitioners of PLS-SEM to use terminology that is very close to their familiar modeling terms (e.g., reflective, composite, and interactions), instead of specifying underlying matrices and covariances. Specifically, the syntax was designed to:

  • Provide a domain-specific language to build and estimate PLS path models in R

  • Use both variance-based PLS-SEM and covariance-based SEM (CB-SEM) to estimate composite and common factor models (► Chap. 1)

  • Simply and quickly specify model relationships and more complex model elements, such as interaction terms (see ► Chap. 8) and higher-order constructs (Sarstedt, Hair Jr, Cheah, Becker, & Ringle, 2019)

SEMinR uses its own PLS-SEM estimation engine and integrates with the lavaan package (Rosseel, 2012) for CB-SEM estimation. SEMinR supports the state of the art of PLS-SEM and beyond. The development team regularly improves the program, incorporates new methods, and supports the users with useful reporting options in their analyses.

In ► Chap. 2, we introduced R and RStudio. After reading that chapter, you should now be familiar with writing scripts, creating objects, and installing packages. In this chapter, we discuss how to use the SEMinR package. The first step is to install the SEMinR package and load it into the RStudio environment (◘ Fig. 3.1). SEMinR was built using R version 4.0.3 – depending on how recently you installed R and RStudio, you might need to update these software files to the latest version before installing SEMinR. Refer to ► Chap. 2 for instructions on installing the latest versions of R and RStudio.

Fig. 3.1
A window of the console depicts the output program for installing the source package. Installing and unpacking packages using staged installation, moving data sets, installing help indices, copying figures, building package indices, installing vignettes, and testing.

Installing and loading the SEMinR package. (Source: authors’ screenshot from RStudio)

# Download and install the SEMinR package
# You only need to do this once to equip
# RStudio on your computer with SEMinR
install.packages("seminr")

# Make the SEMinR library ready to use
# You must do this every time you restart RStudio and wish to use SEMinR
library(seminr)
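If you rerun a script regularly, you may prefer not to download the package each time. The following is a minimal sketch in base R; the helper name ensure_package() is our own illustration and is not part of SEMinR. It installs a package only when it is not yet available and separately checks the running R version against the version mentioned above.

```r
# Hypothetical helper (not part of SEMinR): install a package only if missing
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report whether the package is available after the check
  requireNamespace(pkg, quietly = TRUE)
}

# Check that the running R version is at least the one SEMinR was built with
r_is_recent <- getRversion() >= "4.0.3"
r_is_recent
```

Calling ensure_package("seminr") followed by library(seminr) should then behave like the code above, while skipping the download when the package is already installed.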

3.1 The Corporate Reputation Model

With the SEMinR package installed and loaded to the environment, we now introduce the example dataset and model that will be used throughout this textbook. We draw on Eberl’s (2010) corporate reputation model, which is also used in A Primer on Partial Least Squares Structural Equation Modeling (PLS-SEM) (Hair, Hult, Ringle, & Sarstedt, 2022). The goal of the model is to explain the effects of corporate reputation on customer satisfaction (CUSA) and, ultimately, customer loyalty (CUSL). Corporate reputation represents a company’s overall evaluation by its stakeholders (Helm, Eggert, & Garnefeld, 2010). This construct is measured using two dimensions. One dimension represents cognitive evaluations of the company, which is the company’s competence (COMP). The second dimension captures affective judgments, which determine the company’s likeability (LIKE). Research has shown that the model performs favorably (in terms of convergent validity and predictive validity) compared to alternative reputation measures (Sarstedt, Wilczynski, & Melewar, 2013).

In summary, the simple corporate reputation model has two main theoretical components: (1) the target constructs of interest – namely, CUSA and CUSL (endogenous constructs) – and (2) the two corporate reputation dimensions COMP and LIKE (exogenous constructs), which are key determinants of the target constructs. ◘ Figure 3.2 shows the constructs and their relationships.

Fig. 3.2
A flow diagram of reputation model. COMP is connected to comp 1 to 3 and CUSA and C U S L with arrows. LIKE is connected to like 1 to 3 and CUSA and C U S L with arrows. COMP, LIKE, and CUSA are connected to CUSA L which is further connected to cusl 1 to 3.

Simple corporate reputation model. (Source: authors’ own figure)

Each of these constructs is measured by means of multiple indicators, except satisfaction. For instance, the exogenous construct COMP is reflectively measured by three indicator variables, comp_1, comp_2, and comp_3. Respondents answered on a scale from 1 (totally disagree) to 7 (completely agree) to evaluate the statements that these three items represent (◘ Table 3.1). Two other constructs in the simple model (CUSL and LIKE) can be described in a similar manner, and the third (CUSA) has only a single indicator. ◘ Table 3.1 summarizes the indicator wordings for the four constructs considered in this simple corporate reputation model. In ► Chap. 5, we will extend the simple model by adding four formatively measured constructs.

Table 3.1 Indicators for the reflectively measured constructs of corporate reputation model

Now that you are familiar with the reputation model, we will demonstrate the syntax used by SEMinR. Briefly, there are four steps to specify and estimate a structural equation model using SEMinR:

  1. Loading and cleaning the data

  2. Specifying the measurement models

  3. Specifying the structural model

  4. Estimating, bootstrapping, and summarizing the model

3.2 Loading and Cleaning the Data

When estimating a PLS-SEM model, SEMinR expects you to have already loaded your data into an object. This data object is usually a data.frame class object, but SEMinR will also accept a matrix class object. For more information about these objects, you can access the R documentation using the ? operator (e.g., ?matrix). The read.csv() function allows you to load data into R if the data file is in a .csv (comma-separated value) or .txt (text) format. Note that there are other packages that can be used to load data in Microsoft Excel’s .xlsx format or other popular data formats.

Comma-separated value (CSV) files are a type of text file in which each line contains the data of one subject or case of your dataset. The values in each line correspond to the different variables of interest (e.g., the first, second, or third value of a line corresponds with the first, second, or third variable in the dataset, from left to right). These values are typically separated by commas but can also be separated by other special characters (e.g., semicolons). The first line of the file, called the header line, typically consists of the variable names and is also separated by commas or other special characters. Thus, a variable will have its name in the first row at a certain position (e.g., fifth data entry), and its values will be in all the following lines of data at the same position (e.g., also at the fifth data entry position). Files in a .csv format are a popular way of storing datasets, and we will use one as an example in this chapter. Many software packages, such as Microsoft Excel and SPSS, can export data into a .csv format.

We can load data from a .csv file using the read.csv() function. Remember that you can use the ? operator to find help about a function in R (e.g., use ?read.csv) at any time. ◘ Table 3.2 shows several arguments for the read.csv() function as included in the help file.

Table 3.2 A (shortened) list of arguments for the read.csv() function

In this section, we will demonstrate how to load a .csv file into the RStudio global environment. The file we will use is called Corporate Reputation Data.csv and can be downloaded from the book’s website at ► https://www.pls-sem.net/downloads/. Once you have downloaded the Corporate Reputation Data.csv file, transfer it to your R project working directory as discussed in ► Chap. 2. If you inspect the Corporate Reputation Data.csv file in a text editor, it should appear as in the screenshot in ◘ Fig. 3.3. Note that this .csv file uses semicolons instead of commas to separate variable names and values.

Fig. 3.3
A window presents the c s v file of corporate reputation data in a text editor. It lists the service provider, service type, and sample type.

The Corporate Reputation Data.csv file viewed in a text editor. (Source: authors’ screenshot from R)

In ◘ Fig. 3.3, we see that this sample data has a header row consisting of the variable names (columns). In addition, the semicolon (;) is used as a separator character, and the missing values are coded as −99. If you wish to import this file to the global environment, you can use the read.csv() function, specifying the arguments file = "Corporate Reputation Data.csv", header = TRUE, and sep = ";" and assigning the output to the corp_rep_data variable:

# Load the corporate reputation data
corp_rep_data <- read.csv(file = "Corporate Reputation Data.csv",
                          header = TRUE, sep = ";")
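To see how the header and sep arguments interact with the file layout, the following minimal sketch writes a tiny semicolon-separated file to a temporary location and reads it back. The file contents and the object names demo_file and demo_data are hypothetical and serve only as an illustration; they are not the corporate reputation data.

```r
# Write a tiny semicolon-separated file to a temporary location
# (hypothetical values, used only to illustrate the file layout)
demo_file <- tempfile(fileext = ".csv")
writeLines(c("comp_1;comp_2;cusa",
             "4;5;6",
             "-99;3;7"),
           con = demo_file)

# Read it back: the first line is the header, values are split at ";"
demo_data <- read.csv(file = demo_file, header = TRUE, sep = ";")

# Inspect the dimensions and the column names that were read
dim(demo_data)
names(demo_data)
```

Note that read.csv() reads the value -99 as an ordinary number; it is only treated as missing once we tell the estimation function which value codes missing data.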

When clicking on the corp_rep_data object in the environment panel of RStudio, the source window opens at the top left of the screen (◘ Fig. 3.4).

Fig. 3.4
A screenshot from R Studio that presents a table with 8 columns. These are service type, c s o r 1 to 5, and global. 1 to 14 rows of 344 entries are presented.

Inspecting the corp_rep_data object. (Source: authors’ screenshot from RStudio)

Important

Inspect the loaded data to ensure that the correct numbers of columns (indicators), rows (observations or cases), and column headers (indicator names) appear in the loaded data. Note that SEMinR uses the asterisk (“*”) character when naming interaction terms as used in, for example, moderation analysis, so please ensure that asterisks are not present in the indicator names. Duplicate indicator names will also cause errors in SEMinR. Finally, missing values should be represented with a missing value indicator (such as −99, which is commonly used), so they can be appropriately identified and treated as missing values.
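The checks described above can be sketched in a few lines of base R. The data frame demo_data and its values are hypothetical stand-ins for a freshly loaded dataset, and the object names are our own.

```r
# Hypothetical stand-in for a freshly loaded dataset
demo_data <- data.frame(comp_1 = c(4, -99, 6),
                        comp_2 = c(5, 3, -99),
                        cusa   = c(6, 7, 5))

# Number of rows (observations) and columns (indicators)
dim(demo_data)

# Indicator names must be unique and must not contain an asterisk
has_duplicates <- anyDuplicated(names(demo_data)) > 0
has_asterisk   <- any(grepl("*", names(demo_data), fixed = TRUE))

# Count how often the missing value code -99 occurs in each column
missing_per_column <- colSums(demo_data == -99)
missing_per_column
```

Keep in mind that read.csv() by default adjusts invalid or duplicated column names while loading (argument check.names = TRUE), so comparing the loaded names with the header line of the raw file remains worthwhile.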

We encourage you to follow the above steps to download and read a dataset. Alternatively, you can also access that particular dataset directly from SEMinR. To help demonstrate its features, SEMinR comes bundled with two datasets, the corporate reputation dataset (Hair et al., 2022; corp_rep_data) and the European Customer Satisfaction Index (ECSI) dataset (Tenenhaus, Esposito Vinzi, Chatelin, & Lauro, 2005; mobi). When the SEMinR library has been loaded to the global environment (library(seminr)), the data are accessible by simply calling the object names (corp_rep_data or mobi).
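As a brief sketch of this alternative, the following code loads the package and looks at the bundled corp_rep_data object directly. The vector name simple_items is our own; the column names are the indicators of the simple model described above.

```r
library(seminr)

# The bundled dataset is available once the package is loaded
n_obs <- nrow(corp_rep_data)
n_obs

# Show the indicators used in the simple corporate reputation model
simple_items <- c("comp_1", "comp_2", "comp_3",
                  "like_1", "like_2", "like_3",
                  "cusa", "cusl_1", "cusl_2", "cusl_3")
head(corp_rep_data[, simple_items])
```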

Whichever way you have loaded the corp_rep_data, we can now inspect the dataset by using the head() function. head() is a useful function that outputs the first few rows of an object:

# Show the first several rows of the corporate reputation data
head(corp_rep_data)

It is clear from inspecting the head of the corp_rep_data object (◘ Fig. 3.5) that the file has been loaded correctly and has the value “-99” set for the missing values. With the data loaded correctly, we now turn to the measurement model specification.

Fig. 3.5
A screenshot of the console output page. The first several rows of the corporate reputation data, head, service provider, service type, and sample type are presented.

The head of the corporate reputation dataset. (Source: authors’ screenshot from RStudio)

3.3 Specifying the Measurement Models

Path models are made up of two elements: (1) the measurement models (also called outer models in PLS-SEM), which describe the relationships between the latent variables and their measures (i.e., their indicators), and (2) the structural model (also called the inner model in PLS-SEM), which describes the relationships between the latent variables. We begin with describing how to specify the measurement models.

The basis for determining the relationships between constructs and their corresponding indicator variables is measurement theory. A sound measurement theory is a necessary condition to obtain useful results from any PLS-SEM analysis. Hypothesis tests involving the structural relationships among constructs will only be as reliable or valid as the construct measures.

SEMinR uses the constructs() function to specify the list of all construct measurement models. Within this list, we can then define various constructs:

  • composite() specifies the measurement of individual constructs.

  • interaction_term() specifies interaction terms.

  • higher_composite() specifies hierarchical component models (higher-order constructs; Sarstedt et al., 2019).

The constructs() function compiles the list of constructs and their respective measurement model definitions. We must supply it with any number of individual composite(), interaction_term(), or higher_composite() constructs using their respective functions. Note that neither a dataset nor a structural model is specified in the measurement model stage, so we can reuse the measurement model object across different datasets and structural models.

The composite() function describes the measurement model of a single construct and takes the arguments shown in ◘ Table 3.3.

Table 3.3 The arguments for the composite() function

SEMinR strives to make the specification of measurement items shorter and cleaner using multi_items(), which creates a vector of multiple measurement items with similar names, or single_item(), which describes a single measurement item. For example, we can use composite() for PLS path models to describe the reflectively measured COMP construct with its indicator variables comp_1, comp_2, and comp_3: composite("COMP", multi_items("comp_", 1:3), weights = mode_A); Sect. 3.5 contains explanations of mode A and mode B. When no measurement weighting scheme is specified, the argument default is set to mode_A. Similarly, we can use composite() to define the single-item measurement model of CUSA as composite("CUSA", single_item("cusa")). Combining the four measurement models within the constructs() function, we can define the measurement model for the simple model in ◘ Fig. 3.2. Note: If an error occurs, make sure you used the library(seminr) command in R to load the SEMinR package before executing the program code.

# Create measurement model
simple_mm <- constructs(
  composite("COMP", multi_items("comp_", 1:3)),
  composite("LIKE", multi_items("like_", 1:3)),
  composite("CUSA", single_item("cusa")),
  composite("CUSL", multi_items("cusl_", 1:3)))
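To make the role of the helper functions more tangible, the following short sketch shows what multi_items() and single_item() return when called on their own. The object names comp_items, cusa_item, and comp_construct are our own.

```r
library(seminr)

# multi_items() pastes the prefix and the numbers into indicator names
comp_items <- multi_items("comp_", 1:3)
comp_items

# single_item() passes a single indicator name through
cusa_item <- single_item("cusa")
cusa_item

# Writing the indicator names out by hand is therefore an alternative
comp_construct <- composite("COMP", c("comp_1", "comp_2", "comp_3"),
                            weights = mode_A)
```

Using multi_items() mainly saves typing and reduces the risk of misspelled indicator names when a construct has many similarly named items.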

The program code above facilitates the specification of standard measurement models. However, the constructs() function also allows specifying more complex models, such as interaction terms (Memon et al., 2019) and higher-order constructs (Sarstedt et al., 2019). We will discuss the interaction_term() function for specifying interactions in more detail in ► Chap. 8.

3.4 Specifying the Structural Model

With our measurement model specified, we now specify the structural model. When a structural model is being developed, two primary issues need to be considered: the sequence of the constructs and the relationships between them. Both issues are critical to the concept of modeling because they represent the hypotheses and their relationships to the theory being tested.

In most cases, researchers examine linear independent–dependent relationships between two or more constructs in the path model. Theory may suggest, however, that model relationships are more complex and involve mediation or moderation relationships. In the following section, we briefly introduce these different relationship types. In ► Chaps. 7 and 8, we explain how they can be estimated and interpreted using SEMinR.

SEMinR makes structural model specification more human readable, domain relevant, and explicit by using these functions:

  • relationships() specifies all the structural relationships between all constructs.

  • paths() specifies relationships between sets of antecedents and outcomes.

The simple model in ◘ Fig. 3.2 has five relationships. For example, to specify the relationships from COMP and LIKE to CUSA and CUSL, we use the from and to arguments in the paths() function: paths(from = c("COMP", "LIKE"), to = c("CUSA", "CUSL")).

# Create structural model
simple_sm <- relationships(
  paths(from = c("COMP", "LIKE"), to = c("CUSA", "CUSL")),
  paths(from = c("CUSA"), to = c("CUSL")))
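The two paths() calls above yield the five relationships of the simple model because every construct listed in from is linked to every construct listed in to. The following base R sketch only illustrates this pairing logic; it does not reproduce how SEMinR stores the structural model internally, and the object names are our own.

```r
# Base R illustration (not SEMinR internals): every antecedent in `from`
# is paired with every outcome in `to`
first_set <- expand.grid(from = c("COMP", "LIKE"),
                         to   = c("CUSA", "CUSL"),
                         stringsAsFactors = FALSE)
second_set <- data.frame(from = "CUSA", to = "CUSL")

# Combine both sets and count the resulting relationships
all_paths <- rbind(first_set, second_set)
all_paths
nrow(all_paths)
```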

Note that neither a dataset nor a measurement model is specified in the structural model stage, so we can reuse the structural model object simple_sm across different datasets and measurement models.

3.5 Estimating the Model

After having specified the measurement and structural models, the next step is the model estimation using the PLS-SEM algorithm. For this task, the algorithm needs to determine the scores of the constructs that are used as input for (single and multiple) partial regression models within the path model. After the algorithm has calculated the construct scores, the scores are used to estimate each partial regression model in the path model. As a result, we obtain the estimates for all relationships in the measurement models (i.e., the indicator weights/loadings) and the structural model (i.e., the path coefficients).

The setup of the measurement models depends on whether the construct under consideration is modeled as reflective or formative. When a reflective measurement model is assumed for a construct, the indicator loadings are typically estimated through mode A. It estimates the relationship from the construct to each indicator based on a reflective measurement model that uses bivariate regressions (i.e., a single indicator variable represents the dependent variable, while the construct score represents the independent variable). As a result, we obtain correlations between the construct and each of its indicators (i.e., correlation weights), which become the indicator loadings. In contrast, when a formative measurement model is assumed for a construct, the indicator weights are typically estimated using multiple regression. More specifically, the measurement model estimation applies PLS-SEM’s mode B, in which the construct represents a dependent variable and its associated indicator variables are the multiple independent variables. As a result, we obtain regression weights for the relationships from the indicators to the construct, which represent the indicator weights. While the use of mode A (i.e., correlation weights) for reflective measurement models and mode B (i.e., regression weights) for formative measurement models represents the standard approach to estimate the relationships between the constructs and their indicators in PLS-SEM, researchers may choose a different mode per type of measurement model in special situations (see also Hair et al., 2022; Rigdon, 2012).
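The difference between the two modes can be illustrated with a small base R sketch. It uses simulated data and a simulated stand-in for the construct score, so it shows a single outer estimation step under our own assumptions and not the full iterative PLS-SEM algorithm; all object names are hypothetical.

```r
set.seed(1)
n <- 200

# Simulated data (hypothetical): three standardized indicators and a
# standardized stand-in for the construct score
latent <- rnorm(n)
indicators <- scale(sapply(1:3, function(j) 0.8 * latent + rnorm(n, sd = 0.6)))
colnames(indicators) <- paste0("x_", 1:3)
score <- as.numeric(scale(latent + rnorm(n, sd = 0.3)))

# Mode A: one bivariate regression per indicator (indicator ~ score);
# with standardized variables each slope equals the correlation
mode_a <- apply(indicators, 2, function(x) unname(coef(lm(x ~ score))[2]))

# Mode B: one multiple regression (score ~ all indicators)
mode_b <- coef(lm(score ~ indicators - 1))

mode_a
mode_b
```

In mode A, each indicator is treated on its own, whereas in mode B the regression weights account for the correlations among the indicators, which is why the two sets of values generally differ.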

Structural model calculations are executed as follows. The partial regressions for the structural model specify an endogenous construct as the dependent variable in a regression model. This endogenous construct’s direct predecessors (i.e., latent variables with a direct relationship leading to the specific endogenous construct) are the independent variables in a regression used to estimate the path coefficients. Hence, there is a partial regression model for every endogenous construct to estimate all the path coefficients in the structural model.

All partial regression models are estimated by the PLS-SEM algorithm’s iterative procedures, which comprise two stages. In the first stage, the construct scores are estimated. Then, in the second stage, the final estimates of the indicator weights and loadings are calculated, as well as the structural model’s path coefficients and the resulting R² values of the endogenous latent variables. Appendix A of this textbook provides a detailed description of the PLS-SEM algorithm’s stages (see also Lohmöller, 1989).

To estimate a PLS path model, algorithmic options and argument settings must be selected. These include the structural model weighting scheme. SEMinR allows the user to apply two structural model weighting schemes: (1) the factor weighting scheme and (2) the path weighting scheme. While the results differ little across the alternative weighting schemes, path weighting is the most popular and recommended approach. This weighting scheme provides the highest R² value for endogenous latent variables and is generally applicable for all kinds of PLS path model specifications and estimations. Chin (1989) provides further details on the different weighting schemes available in PLS-SEM.

SEMinR uses the estimate_pls() function to estimate the PLS-SEM model. This function applies the arguments shown in ◘ Table 3.4. Please note that arguments with default values do not need to be specified but will revert to the default value when not specified.

Table 3.4 Arguments for the estimate_pls() function

We now estimate the PLS-SEM model by using the estimate_pls() function with arguments data = corp_rep_data, measurement_model = simple_mm, structural_model = simple_sm, inner_weights = path_weighting, missing = mean_replacement, and missing_value = "-99" and assign the output to corp_rep_simple_model.

# Estimate the model
corp_rep_simple_model <- estimate_pls(
  data = corp_rep_data,
  measurement_model = simple_mm,
  structural_model = simple_sm,
  inner_weights = path_weighting,
  missing = mean_replacement,
  missing_value = "-99")

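Note that arguments can be omitted whenever their default values are used. Here, this applies to inner_weights and missing, whereas missing_value is still specified because the missing values in this dataset are coded as −99. The following code is equivalent to the previous code block: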

# Estimate the model with default settings
corp_rep_simple_model <- estimate_pls(
  data = corp_rep_data,
  measurement_model = simple_mm,
  structural_model = simple_sm,
  missing_value = "-99")
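To check how much the choice of the structural model weighting scheme matters for this example, the model can be estimated once with each scheme. The following sketch assumes the bundled corp_rep_data object and uses path_factorial, which is the SEMinR name for the factor weighting scheme; the object names model_path and model_factor are our own.

```r
library(seminr)

# Measurement and structural models of the simple corporate reputation model
simple_mm <- constructs(
  composite("COMP", multi_items("comp_", 1:3)),
  composite("LIKE", multi_items("like_", 1:3)),
  composite("CUSA", single_item("cusa")),
  composite("CUSL", multi_items("cusl_", 1:3)))

simple_sm <- relationships(
  paths(from = c("COMP", "LIKE"), to = c("CUSA", "CUSL")),
  paths(from = c("CUSA"), to = c("CUSL")))

# Estimate the model once per inner weighting scheme
model_path <- estimate_pls(
  data = corp_rep_data,
  measurement_model = simple_mm,
  structural_model = simple_sm,
  inner_weights = path_weighting,
  missing_value = "-99")

model_factor <- estimate_pls(
  data = corp_rep_data,
  measurement_model = simple_mm,
  structural_model = simple_sm,
  inner_weights = path_factorial,
  missing_value = "-99")

# Largest absolute difference between the two sets of path coefficients
max(abs(model_path$path_coef - model_factor$path_coef))
```

A small difference would be in line with the statement above that results differ little across the weighting schemes.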

When the PLS-SEM algorithm has converged, the message “Generating the seminr model. All 344 observations are valid” will be shown in the console window (◘ Fig. 3.6).

Fig. 3.6
A console window depicts creating the measurement model, constructs, simple relationships, estimating the model, measurement, structural model, and generating the seminr model.

The estimated simple corporate reputation model. (Source: authors’ screenshot from RStudio)

3.6 Summarizing the Model

Once the model has been estimated, we can summarize the model and generate a report of the results using the summary() function, which is used to extract the output and parameters of importance from an estimated model. SEMinR supports the use of summary() for the estimate_pls(), bootstrap_model(), and predict_pls() functions.

The summary() function applied to a SEMinR model object produces a summary.seminr_model class object, which can be stored in a variable and contains the sub-objects shown in ◘ Table 3.5 that can be inspected using the $ operator (e.g., summary_simple_corp_rep$meta). These sub-objects relate to model estimates, which serve as a basis for the assessment of the measurement and structural models (Hair, Risher, Sarstedt, & Ringle, 2019).

Table 3.5 Elements of the summary.seminr_model object

# Summarize the model results
summary_simple_corp_rep <- summary(corp_rep_simple_model)

# Inspect the model's path coefficients and the R^2 values
summary_simple_corp_rep$paths

# Inspect the construct reliability metrics
summary_simple_corp_rep$reliability
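Because the summary is an ordinary R object, its sub-objects can be listed with names() and stored in their own variables for later use. The following self-contained sketch assumes the bundled corp_rep_data object; the variable names simple_loadings and simple_total_effects are our own.

```r
library(seminr)

# Specify and estimate the simple corporate reputation model
simple_mm <- constructs(
  composite("COMP", multi_items("comp_", 1:3)),
  composite("LIKE", multi_items("like_", 1:3)),
  composite("CUSA", single_item("cusa")),
  composite("CUSL", multi_items("cusl_", 1:3)))

simple_sm <- relationships(
  paths(from = c("COMP", "LIKE"), to = c("CUSA", "CUSL")),
  paths(from = c("CUSA"), to = c("CUSL")))

corp_rep_simple_model <- estimate_pls(
  data = corp_rep_data,
  measurement_model = simple_mm,
  structural_model = simple_sm,
  missing_value = "-99")

summary_simple_corp_rep <- summary(corp_rep_simple_model)

# List the available sub-objects of the summary
names(summary_simple_corp_rep)

# Store individual sub-objects in their own variables
simple_loadings <- summary_simple_corp_rep$loadings
simple_total_effects <- summary_simple_corp_rep$total_effects
simple_loadings
```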

◘ Figure 3.7 shows the results stored in the summary_simple_corp_rep$paths and summary_simple_corp_rep$reliability sub-objects.

Fig. 3.7
A console window summarizes the model results, inspects the structural paths, and displays the construct reliability metrics.

Inspecting the summary report elements. (Source: authors’ screenshot from RStudio)

3.7 Bootstrapping the Model

PLS-SEM is a nonparametric method – thus, we need to perform bootstrapping to estimate standard errors and compute confidence intervals. Bootstrapping will be discussed in more detail in ► Chaps. 5 and 6, but for now, we introduce the function and arguments.

SEMinR conducts high-performance bootstrapping using parallel processing, which utilizes the full performance of the central processing unit (CPU). The bootstrap_model() function is used to bootstrap a previously estimated SEMinR model. This function applies the arguments shown in ◘ Table 3.6.

Table 3.6 Arguments for the bootstrap_model() function

In our example, we use the bootstrap_model() function and specify the arguments seminr_model = corp_rep_simple_model, nboot = 1000, cores = NULL, and seed = 123. In this example, we use 1,000 bootstrap subsamples. However, the final result computations should draw on 10,000 subsamples (Streukens & Leroi-Werelds, 2016). These computations may take a short while (i.e., the R console appears unresponsive while they run). We first assign the output of the bootstrapping to the boot_simple_corp_rep variable. We then summarize this variable, assigning the output of summary() to the sum_boot_simple_corp_rep variable. The summarized bootstrap model object (i.e., sum_boot_simple_corp_rep) contains the elements shown in ◘ Table 3.7, which can be inspected using the $ operator.

Table 3.7 Elements of the summary.bootstrap_model object

# Bootstrap the model
boot_simple_corp_rep <- bootstrap_model(seminr_model = corp_rep_simple_model,
                                        nboot = 1000, cores = NULL, seed = 123)

# Store the summary of the bootstrapped model
sum_boot_simple_corp_rep <- summary(boot_simple_corp_rep)

# Inspect the bootstrapped structural paths
sum_boot_simple_corp_rep$bootstrapped_paths

# Inspect the bootstrapped indicator loadings
sum_boot_simple_corp_rep$bootstrapped_loadings
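The following self-contained sketch repeats the procedure with a deliberately small number of subsamples so that it runs quickly, and it requests a different confidence level through the alpha argument of summary(). We assume here that alpha = 0.10 yields a 90% interval; the choice of nboot = 200 and cores = 2 is ours and serves only as an illustration, not as a recommendation for final analyses.

```r
library(seminr)

# Specify and estimate the simple corporate reputation model
simple_mm <- constructs(
  composite("COMP", multi_items("comp_", 1:3)),
  composite("LIKE", multi_items("like_", 1:3)),
  composite("CUSA", single_item("cusa")),
  composite("CUSL", multi_items("cusl_", 1:3)))

simple_sm <- relationships(
  paths(from = c("COMP", "LIKE"), to = c("CUSA", "CUSL")),
  paths(from = c("CUSA"), to = c("CUSL")))

corp_rep_simple_model <- estimate_pls(
  data = corp_rep_data,
  measurement_model = simple_mm,
  structural_model = simple_sm,
  missing_value = "-99")

# Quick trial run: few subsamples, two CPU cores, fixed seed
boot_demo <- bootstrap_model(seminr_model = corp_rep_simple_model,
                             nboot = 200, cores = 2, seed = 123)

# Summaries with the default level and with alpha = 0.10
sum_boot_default <- summary(boot_demo)
sum_boot_90 <- summary(boot_demo, alpha = 0.10)

# One row per structural path
sum_boot_90$bootstrapped_paths
```

Fixing the seed makes repeated runs of the same script comparable, which helps when checking results during model development.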

◘ Figure 3.8 shows the results of the bootstrap procedure for the path coefficients and indicator loadings. Note that bootstrapping is a random process, and your results might be slightly different from those presented here.

Fig. 3.8
A console window illustrates the bootstrapping model 1. It begins with storing the summary and is followed by an inspection of the bootstrapped structural paths and indicator loadings.

Bootstrapped structural paths and indicator loadings. (Source: authors’ screenshot from RStudio)

3.8 Plotting, Printing, and Exporting Results to Articles

When model estimation, evaluation, and analysis have been completed, it is often necessary to export the results generated in R to a report, such as an Apache OpenOffice Writer document (.odt) or Microsoft PowerPoint presentation (.ppt or .pptx). Throughout this book, we provide screenshots for demonstrating the code outputs to the console in RStudio. However, we do not recommend copying and pasting results from screenshots into research reports or articles. Instead, we recommend exporting tables and matrices to .csv files, which can be imported into documents or presentations, and exporting figures to .pdf files to ensure the best print quality. In this section, we demonstrate how to best export results for high print quality and readability.

The write.csv() function takes an object from the global environment and writes it into a .csv file in the working directory of the project. We use two of its arguments: x is the name of the object to be written to file, and file is the name of the file to be created and written to. Thus, if we wish to report the bootstrapped loadings from the previously discussed simple model, we would use the write.csv() function with arguments x = sum_boot_simple_corp_rep$bootstrapped_loadings and file = "boot_loadings.csv".

# Write the bootstrapped loadings object to csv file
write.csv(
  x = sum_boot_simple_corp_rep$bootstrapped_loadings,
  file = "boot_loadings.csv")
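Before formatting an exported table, it can be useful to read the file back and confirm that row and column labels survived the export. The following base R sketch uses a small hypothetical matrix as a stand-in for a results object; the values, labels, and object names are illustrative only.

```r
# Hypothetical stand-in for a results matrix such as the bootstrapped loadings
demo_results <- matrix(c(0.82, 0.80, 0.03, 0.04),
                       nrow = 2,
                       dimnames = list(c("comp_1 -> COMP", "comp_2 -> COMP"),
                                       c("Original Est.", "Bootstrap SD")))

# Write the matrix to a file in the temporary directory
out_file <- file.path(tempdir(), "demo_results.csv")
write.csv(x = demo_results, file = out_file)

# Read the file back; row names were written to the first column.
# check.names = FALSE keeps the column labels exactly as written
check <- read.csv(out_file, row.names = 1, check.names = FALSE)
check
```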

Once the boot_loadings.csv file has been saved into the working directory, we can open it with Apache OpenOffice Calc, Microsoft Excel, or other spreadsheet software. These spreadsheet software applications enable formatting and editing the table to produce high-quality tables in reports. We followed this procedure to create ◘ Table 3.8 of bootstrapped indicator loadings for the simple corporate reputation model.

Table 3.8 Export of bootstrapped indicator loadings from SEMinR

Next, we discuss how to generate high-quality figures from the SEMinR results. First, we generate a sample plot for export from RStudio. To do this, we use a sub-object of summary_simple_corp_rep and plot the constructs’ internal consistency reliabilities (i.e., Cronbach’s alpha, rhoA, and rhoC) with plot(summary_simple_corp_rep$reliability). Once this plot displays in the plots tab in RStudio (◘ Fig. 3.9), click the Export dropdown list and select Save as PDF to bring up the save plot as a .pdf window. Select the size and output name and save the document. Note that the size will affect the rendering of the plot and might need to be adjusted several times before the ideal format is found. These .pdf images can be imported directly into documents and reports at very high print quality. Alternatively, we can save the plot as an image file in .png, .jpeg, .eps, and many other formats. To do so, click the Export dropdown list in the plots tab and select Save as Image.

Fig. 3.9
A window in which a graph that plots the relative positions of alpha, rho a, and rho c for COMP, LIKE, CUSA, and C U S L. The save as P D F option is selected from the export menu in the plots section.

Exporting the plot from RStudio using Save as PDF. (Source: authors’ screenshot from RStudio)
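The same export can also be scripted, which makes the figure size reproducible. The following is a minimal sketch using the pdf() graphics device from base R; the file name, the size, and the plotted values are hypothetical, and the bar plot is only a stand-in for the reliability plot.

```r
# Open a PDF graphics device; width and height are given in inches
plot_file <- file.path(tempdir(), "demo_plot.pdf")
pdf(file = plot_file, width = 7, height = 5)

# Stand-in plot with arbitrary illustrative values; with an estimated model
# this line would be plot(summary_simple_corp_rep$reliability)
barplot(c(A = 0.7, B = 0.8, C = 0.9), ylim = c(0, 1),
        ylab = "Illustrative values")

# Close the device so that the file is written to disk
dev.off()
```

Changing width and height in the script and rerunning it replaces the trial-and-error resizing in the export window.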

Summary

In this chapter, we introduced the SEMinR syntax necessary for loading data, specifying and estimating a PLS path model, and reporting the results. Unlike popular graphical user interface software that uses menus and buttons, using a programming language, such as R, creates many opportunities for errors and bugs to be introduced. It is crucial that you are well versed in the SEMinR syntax, functions, and arguments before you proceed to the next few chapters. For this reason, we strongly recommend reviewing this chapter several times and attempting to complete the exercises before moving onto subsequent chapters. The upside of the programming approach is that every step and parameter of your analysis are explicitly defined for others to repeat or replicate. In addition, more experienced users can draw on a large number of supplementary R packages that extend the analyses supported by SEMinR.

The SEMinR syntax for PLS-SEM is broadly divided into four stages: (1) loading and cleaning the data, (2) specifying the measurement models, (3) specifying the structural model, and (4) estimating, bootstrapping, and summarizing the model. When loading data, it is important that the format of the file to be imported is well understood to prevent later errors. Special attention should be paid to the column headers, the separator and decimal characters used, and the missing value indicator. The raw data file can be inspected using a text editor prior to importing it into the RStudio environment. The imported data should also be compared to the raw data file to ensure no errors occurred in the process.

The measurement model is specified using the SEMinR functions constructs(), composite(), interaction_term(), and multi_items() or single_item(). The measurement model can be specified once and reused across different datasets and structural model configurations. The structural model is specified using relationships() and paths(), with the intuitive arguments from and to defining specific paths. The PLS path model is estimated using the estimate_pls() function, which allows for specifying the inner model weighting scheme (path weighting or factorial). The bootstrap_model() function is used to bootstrap a previously estimated SEMinR model. Reports are generated using the summary(), plot(), and print() functions, and high-quality figures and tables can be exported to reports and presentations.
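As a compact recap, the stages can be sketched with the corp_rep_data dataset bundled with SEMinR. The object names and indicator names are assumed to follow the simple corporate reputation model used in this chapter. The settings nboot = 100 and cores = 1 are assumptions chosen only to keep the run short; they are not recommended values for reporting.

```r
library(seminr)

# Measurement model: four composites with multi-item or single-item measures
simple_mm <- constructs(
  composite("COMP", multi_items("comp_", 1:3)),
  composite("LIKE", multi_items("like_", 1:3)),
  composite("CUSA", single_item("cusa")),
  composite("CUSL", multi_items("cusl_", 1:3)))

# Structural model: paths between the constructs
simple_sm <- relationships(
  paths(from = c("COMP", "LIKE"), to = c("CUSA", "CUSL")),
  paths(from = "CUSA", to = "CUSL"))

# Estimate the model with the path weighting scheme and mean replacement of -99 codes
corp_rep_simple_model <- estimate_pls(
  data = corp_rep_data,
  measurement_model = simple_mm,
  structural_model = simple_sm,
  inner_weights = path_weighting,
  missing = mean_replacement,
  missing_value = "-99")

# Summarize the estimated model and inspect two of its sub-objects
summary_simple_corp_rep <- summary(corp_rep_simple_model)
print(summary_simple_corp_rep$paths)
print(summary_simple_corp_rep$reliability)

# Bootstrap the estimated model (small nboot and a single core, for illustration only)
boot_simple_corp_rep <- bootstrap_model(
  seminr_model = corp_rep_simple_model,
  nboot = 100,
  cores = 1,
  seed = 123)
summary_boot <- summary(boot_simple_corp_rep)
```

Because the measurement and structural models are stored as separate objects, either one can be swapped out and the model re-estimated without rewriting the rest of the script.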

Exercise

The SEMinR package comes bundled with a model (i.e., the influencer model) that analyzes whether consumers are likely to follow social media influencers’ purchase recommendations and whether they feel connected with the influencer. Specifically, the model examines the impact of self-influencer connection (SIC; i.e., the degree to which respondents identify with the influencer presenting a specific product) on product liking (PL), perceived quality (PQ), and purchase intention (PI). That is, product liking and perceived quality act as potential mediators in the relationship between self-influencer connection and purchase intention. Finally, the model hypothesizes a direct effect of purchase intention on willingness to pay (WTP). Pick (2020) provides the theoretical background on a similar influencer model.

The data were collected as part of a larger study on social media influencer marketing (Pick, 2020) via an online survey between January and April 2019. The final dataset consists of N = 222 observations. The dataset is bundled with the SEMinR package and is named influencer_data. Participants saw either a “real” influencer called Franklin who presented a fitness shake (N = 100) or a “fake” influencer called Emma who presented a hand blender (N = 122) (indicator, influencer_group). After seeing the real or fake influencer, participants provided information about their self-influencer connection on a 7-point Likert scale (1 = completely disagree, 7 = completely agree). Different from Pick (2020), we consider self-influencer connection as a formative measure using the set of items shown in ◘ Table 3.9. To assess the formative construct’s convergent validity, a single item was included in the survey (indicator, sic_global), which serves as the criterion measure for a redundancy analysis.
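Before specifying the model, it can help to inspect the bundled dataset. The following is a minimal sketch that assumes the seminr package is installed and that influencer_data is available after loading it; only the column names mentioned above (influencer_group and sic_global) are referenced.

```r
library(seminr)

# Number of observations and variables in the bundled dataset
dim(influencer_data)

# Group sizes for the real and fake influencer conditions
table(influencer_data$influencer_group)

# Distribution of the global single item used as the criterion in the redundancy analysis
summary(influencer_data$sic_global)

# Names of all available indicators
names(influencer_data)
```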

Table 3.9 Indicators for the formatively measured construct of the influencer model

In the next step, respondents stated their perceived influencer competence, perceived quality, product liking, and purchase intention, all of which are measured reflectively on a 7-point Likert scale (1 = completely disagree, 7 = completely agree). Finally, willingness to pay is measured using a single question asking respondents how much they would be willing to pay (in euros) for the presented product. See ◘ Table 3.10 for a complete list of item wordings. The table also includes the items for perceived influencer competence, an additional construct that we will introduce in ► Chap. 8.

Table 3.10 Indicators for the reflectively measured constructs of the influencer model

The influencer model is illustrated in ◘ Fig. 3.10. If you need help or hints, consult the SEMinR demo topic file for the influencer model:

# Access the demo file for the influencer model
demo(topic = "seminr-pls-influencer", package = "seminr")

  1. Reproduce the influencer measurement models in SEMinR syntax.

  2. Reproduce the influencer structural model in SEMinR syntax.

  3. Estimate the influencer model using the standard settings. Remember to specify the influencer_data dataset.

Fig. 3.10 Influencer model. (Source: authors’ own figure)

[Figure: path diagram of the influencer model. The indicators sic 1 to sic 7 are connected to SIC, which is connected to PL and PQ. SIC, PL, and PQ are connected to PI, which is connected to WTP and to the indicators p 1 to p 5. PL and PQ are connected to the indicators pl 1 to 4 and pq 1 to 4, respectively.]