Artwork pricing model integrating the popularity and ability of artists

Park, Jinsu; Lee, Yoonjin; Yang, Daewon; Park, Jongho; Jung, Hohyun

doi:10.1007/s10182-024-00504-3


Original Paper
Open access
Published: 02 July 2024


AStA Advances in Statistical Analysis


Jinsu Park¹,
Yoonjin Lee²,
Daewon Yang³,
Jongho Park⁴ &
…
Hohyun Jung²,⁵ (ORCID: orcid.org/0000-0002-8460-5933)

Abstract

Considerable research has been devoted to understanding the popularity effect on the art market dynamics, meaning that artworks by popular artists tend to have high prices. The hedonic pricing model has employed artists’ reputation attributes, such as survey results, to understand the popularity effect, but the reputation attributes are constant and not properly defined at the point of artwork sales. Moreover, the artist’s ability has been measured via random effect in the hedonic model, which fails to reflect ability changes. To remedy these problems, we present a method to define the popularity measure using the artwork sales dataset without relying on the artist’s reputation attributes. Also, we propose a novel pricing model to appropriately infer the time-dependent artist’s abilities using the presented popularity measure. An inference algorithm is presented using the EM algorithm and Gibbs sampling to estimate model parameters and artist abilities. We use the Artnet dataset to investigate the size of the rich-get-richer effect and the variables affecting artwork prices in real-world art market dynamics. We further conduct inferences about artists’ abilities under the popularity effect and examine how ability changes over time for various artists with remarkable interpretations.


1 Introduction

The art market has attracted significant interest for a variety of reasons, reflecting its unique blend of cultural, economic, and social factors (Schönfeld and Reinstaller 2007; David et al 2013). Artwork transactions take place through various channels, such as art galleries, online art marketplaces, and auctions (Ashenfelter and Graddy 2003). Auctions play a significant role in the art market and are important for analyzing how artwork prices are determined because the bidding process is open and participants can see competing bids, which helps establish a fair market value for artworks (Candela and Scorcu 1997; Higgs and Forster 2014). Therefore, many researchers aim to understand the factors influencing art prices, predict price trends, and gain insights into the art market dynamics (Peluso et al 2017; Kraeussl and Logher 2010). Research in these areas contributes to a deeper understanding of the art market, provides insights and recommendations for collectors, investors, and policymakers, and informs strategies for auction houses, artists, and dealers (Peluso et al 2017). It is an evolving field that continues to adapt to changes in the art market and to drive advances in data analysis techniques.

The repeat sales method and the hedonic pricing model are representative techniques used in artwork pricing. The repeat sales method (Galbraith and Hodgson 2018) involves analyzing the resale prices of artworks that have been sold multiple times over a period. However, this method relies on having sufficient data on repeat sales, which may be limited for certain artworks, especially those with infrequent transactions. In this paper, we utilize the framework of the hedonic model (Ekeland et al 2004), which is a specific application of linear regression used to estimate the price or value of a product or asset based on its attributes. In the context of artwork pricing, a hedonic model employs artwork attributes (e.g., artist reputation, size, medium) as independent variables to estimate the artwork’s log price as the dependent variable (Shin et al 2014; Sproule and Valsan 2006; Garay et al 2022). Artwork prices are often influenced by a combination of two key factors: artist attributes and artwork attributes (Forster and Higgs 2018; Beckert and Rössel 2013; Rengers and Velthuis 2002). These factors work in tandem to shape the perceived value and market price of an artwork. Artist attributes include gallery reputation, the number of solo exhibitions, media awareness, awards, and whether the artist is alive. Examples of artwork attributes that significantly contribute to artwork prices are size, medium, and materials.

The popularity effect, also known as the rich-get-richer phenomenon, means that popular trends, ideas, and products often become even more popular, while less-known options struggle to gain attention and traction. We can observe the popularity effect in many human-made societies including social networks (Jung and Phoa 2021; Jung 2023; Song and Park 2022), economics (Durham et al 1998), and online platforms (Bratanova et al 2016; Jung et al 2020). The popularity effect also occurs in art markets since artworks by well-known and highly regarded artists, or those that have gained significant attention and acclaim, tend to command higher prices and greater demand. This effect is a crucial aspect of how artworks are priced and sold in the art market. Researchers have conducted numerous studies to understand how an artist’s reputation influences the pricing and demand for their artworks (Ursprung and Wiermann 2011; Beckert and Rössel 2013). However, these works simply use the artist’s reputation attributes, such as survey results and the number of solo exhibitions, as independent variables of the model, and these attributes are measured at a specific time point. Such static reputation variables make it difficult to capture changes in reputation over time; thus, the artist’s reputation at the time of sale can be inaccurate. Moreover, the artist’s reputation variables may be highly correlated with one another, which hurts the interpretability of the model results.

Researchers have attempted to explain the innate ability of artists that cannot be explained by controlled variables. Bocart et al (2018) conducted Bayesian modeling to explain gender differences in ability. The Gaussian assumption of ability is similar to the approach of this paper, but they assumed static ability and focused on gender differences. Marchenko and Sonnabend (2022) also investigated the gender effect on German visual artists through the notion of artistic ability. Hosoya (2020) conducted an extensive study on artistic ability from the perspective of a random effect model, assuming invariant artistic ability. Kackovic et al (2022) investigated students’ artistic ability via visual arts programs. The term ability is sometimes used interchangeably with terms like talent, and it is often measured using artwork prices (Angelini et al 2023; Etro and Stepanova 2018).

An artist’s ability can change over time due to various factors, including experience, practice, exposure to new techniques and styles, and personal growth (Markusen et al 2006). Pablo Picasso, one of the most influential artists of the 20th century, showed remarkable fluctuations in ability over his lifetime due to his adaptability and willingness to explore new artistic horizons. His early realistic style gave way to the emotional depth of his Blue Period, showcasing his evolving ability to convey complex emotions. Influenced by African and Iberian art, he embraced abstraction during his Cubist phase, deconstructing forms and perspectives. Therefore, we should allow an artist’s ability to change, especially when dealing with data covering a long period, to properly understand the dynamic nature of artwork markets. Studies on such changes in artist ability have been conducted from various perspectives, including career effects and aging effects (Galenson 2000; Galenson and Weinberg 2000; Galenson 2004).

In this study, we develop a novel pricing model to analyze the magnitude of the popularity effect in the artwork market and estimate the artist’s ability. Also, we propose a time-dependent artist popularity measure that can be inferred from artwork sales history data without using the artist’s reputation attributes. Therefore, the proposed measure can capture the popularity at the point of artwork sales. We employ the proposed popularity measure to determine whether the popularity effect occurs in the system. In addition, genuine artist ability under the popularity effect can be inferred to help with artist recommendation systems and artwork price prediction. The influence of variables on pricing can also be analyzed, as in the hedonic pricing model. An inference algorithm is presented to estimate the model parameters and the artist’s ability values. A simulation study is conducted to show the validity of the presented algorithm. The real data analysis detects the rich-get-richer nature of the artwork pricing mechanism and provides inferences on the artists’ abilities.

The remainder of the paper is structured as follows. Section 2 proposes the popularity measure and our pricing model. Section 3 presents the inference algorithm of the proposed model and discusses its validity. Section 4 analyzes the auction artwork sales data with interpretations. Section 5 concludes the paper with final remarks.

2 The proposed popularity measure and the model

2.1 Background

We deal with an artwork price dataset whose variables include artwork name, artist, date of work, date of sale, and price. In addition, the dataset can have other variables that may affect artwork prices, such as artwork size and medium. Let $i=1,2, \cdots , I$ index the artists in the dataset. We use the notation $t=1,2,\ldots ,T$ for the date of sale and $t'=1,2,\ldots , T'$ for the date of work. Suppose that $K_{i,t}$ artworks of artist i are sold at the date of sale t, with prices $p_{i,t,k}$, features $x_{i,t,k,m}, m=1,2,\cdots ,M$, and date of work $t'_{i,t,k}$ for each artwork $k=1,2,\ldots ,K_{i,t}$, where M is the number of independent variables. Similarly, we may index the artworks using the date of work, which is another temporal variable. Suppose that artist i created $K'_{i,t'}$ artworks at the date of work $t'$, with prices $p_{i,t',k'}$, features $x_{i,t',k',m}, m=1,2,\cdots ,M$, and date of sale $t_{i,t',k'}$ for each artwork $k'=1,2,\ldots , K'_{i,t'}$. The logarithmic transformation $y_{i,t,k} = \ln (p_{i,t,k})$, $y_{i,t',k'} = \ln (p_{i,t',k'})$ is applied to the prices, which is a common practice in pricing models since prices usually have a positive and right-skewed distribution.

Let $u_{i,t}$ be the popularity of an artist i at the date of sale t. We use the date of sale for the popularity measure since people consider how popular the artist is at the time of purchasing an artwork. We should carefully choose the popularity measure considering the characteristics of the data, as it will greatly influence the evaluation of the rich-get-richer phenomenon. In this paper, the sum of log prices of the artworks of artist i sold before time t is used to measure the artist’s popularity, as popular artists tend to have many works sold at high prices. In addition, we divide by 10,000 to make the scale similar to other variables. The resulting measure is given by

$$\begin{aligned} u_{i,t} = \frac{1}{10000} \sum _{s=1}^{t-1} \sum _{k=1}^{K_{i,s}} y_{i,s,k}, \quad i=1,2,\cdots ,I,~ t=1,2,\cdots ,T. \end{aligned}$$

(1)
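As a minimal sketch of Eq. (1), the popularity values can be computed from a flat list of sales with a cumulative sum that is shifted by one period, so that $u_{i,t}$ only uses sales strictly before t. The function name `popularity`, the array layout, and the toy prices are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def popularity(log_prices, artist, sale_t, n_artists, n_periods, scale=10_000.0):
    """Return u with u[i, t-1] = (1/scale) * sum of log prices of artist i sold at s < t.

    artist: 0-based artist index of each sale; sale_t: 1-based date of sale of each sale.
    """
    per_period = np.zeros((n_artists, n_periods))
    # Accumulate the total log price of each artist in each sale period.
    np.add.at(per_period, (artist, sale_t - 1), log_prices)
    cumulative = np.cumsum(per_period, axis=1)
    u = np.zeros_like(cumulative)
    # Shift by one period: popularity at t excludes sales made at t itself.
    u[:, 1:] = cumulative[:, :-1]
    return u / scale

# Toy data (made up): two artists, three sale dates.
y = np.log(np.array([1000.0, 2000.0, 500.0, 4000.0]))
artist = np.array([0, 0, 1, 0])
sale_t = np.array([1, 2, 2, 3])
u = popularity(y, artist, sale_t, n_artists=2, n_periods=3)
```

By construction every artist has zero popularity at $t=1$, since no earlier sales exist in the data.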

Let $a_{i,t'}$ be the ability of an artist i at the date of work $t'$. Note that we employ the date of work as the time index for the artist’s ability since the quality of the artwork is determined by the ability level of the date of work rather than the date of sale. We assume that an artist i creates the first artwork and the last artwork at $t'=t'_{0,i}$ and $t'=t'_{1,i}$, respectively. The initial ability of an artist i is assumed to follow a normal distribution with mean 0 and variance $\tau ^2$. In most cases, the variance of the artists’ initial ability values is unknown. Therefore, we let $\tau ^2$ be a Bayesian parameter that follows an exponential distribution with mean $\lambda$ as a prior distribution so that the proper value of $\tau ^2$ for the data will be inferred through the posterior distribution. Then, the distribution of initial ability is given by

$$\begin{aligned} a_{i,t'_{0,i}} \sim N(0, \tau ^2), \quad i=1,2,\cdots ,I. \end{aligned}$$

(2)

We allow the ability values to change over time in a random walk for an artist i, $i=1,2,\cdots ,I$, as follows:

$$\begin{aligned} a_{i,t'} \sim N(a_{i,t'-1}, \sigma _a^2), \quad t'=t'_{0,i}+1, \cdots , t'_{1,i}. \end{aligned}$$

(3)

The variance $\sigma _a^2$ of the random walk process determines the magnitude of random jumps between adjacent time steps. In this paper, we set the two hyperparameters $\lambda = 4.0$ and $\sigma _a^2=0.1^2$ so that the initial abilities are expected to have a standard deviation of about 2.0 across artists and the ability changes with a standard deviation of 0.1 per time step.
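To make the prior in Eqs. (2) and (3) concrete, the following sketch draws $\tau ^2$ from its exponential prior and then simulates random-walk ability paths. For simplicity it assumes that all artists share the same entry time and the same number of time steps, which is not required by the model; the function name `simulate_abilities` is hypothetical.

```python
import numpy as np

def simulate_abilities(n_artists, n_steps, lam=4.0, sigma_a=0.1, seed=0):
    """Simulate ability paths from the prior: exponential tau^2, normal start, random walk."""
    rng = np.random.default_rng(seed)
    tau2 = rng.exponential(scale=lam)                 # tau^2 has prior mean lambda
    a0 = rng.normal(0.0, np.sqrt(tau2), size=n_artists)   # Eq. (2)
    # Independent N(0, sigma_a^2) increments between adjacent dates of work, Eq. (3).
    steps = rng.normal(0.0, sigma_a, size=(n_artists, n_steps - 1))
    paths = np.concatenate([a0[:, None], a0[:, None] + np.cumsum(steps, axis=1)], axis=1)
    return tau2, paths

tau2, paths = simulate_abilities(n_artists=200, n_steps=14)
```

Each row of `paths` is one artist's ability trajectory; with $\sigma _a = 0.1$ the paths drift slowly relative to the spread of the initial abilities.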

2.2 Formulation of model

We propose a pricing model considering the popularity and ability of artists, given by

$$\begin{aligned} y_{i,t,k} = \sum _{m=1}^M \beta _m x_{i,t,k,m} + \gamma u_{i,t} + a_{i, t'_{i,t,k}} + \epsilon _{i,t,k} \end{aligned}$$

(4)

for $i=1,2,\ldots ,I,~ t=1,2,\ldots ,T,~ k=1,2,\ldots ,K_{i,t}$, where the error term follows the normal distribution with zero mean and the variance of $\sigma _\epsilon ^2$, that is, $\epsilon _{i,t,k} \sim N(0, \sigma _\epsilon ^2)$. We may re-index the observations using the date of work, and the identical representation of Eq. (4) can be given by

$$\begin{aligned} y_{i,t',k'} = \sum _{m=1}^M \beta _m x_{i,t',k',m} + \gamma u_{i,t_{i,t',k'}} + a_{i, t'} + \epsilon _{i,t',k'} \end{aligned}$$

(5)

for $i=1,2,\ldots ,I,~ t'=t'_{0,i}, t'_{0,i}+1, \cdots , t'_{1,i},~ k'=1,2,\ldots ,K'_{i,t'}$, where $\epsilon _{i,t',k'} \sim N(0, \sigma _\epsilon ^2)$.

The regression coefficient $\beta _m$ represents the relationship between the m-th independent variable and the log price. The model may contain an intercept term by setting one of the independent variables to 1 for all observations. The parameter $\gamma$ describes the size of the rich-get-richer phenomenon of the artwork pricing system. A positive $\gamma$ indicates a positive popularity effect, in which popular artists achieve higher artwork prices over time, and a stronger rich-get-richer effect results in a larger $\gamma$. If the popularity effect works in the opposite direction, i.e., popular artists have a disadvantage in artwork prices, then a negative $\gamma$ would be observed, although a negative popularity effect would rarely be observed in real-world artwork pricing dynamics. However, care should be taken when comparing estimates across different datasets, as the estimate of $\gamma$ may vary depending on the scale of popularity $u_{i,t}$ and the characteristics of the datasets.

Similar to the random effect model (Kim et al 2021; Francke and Van de Minne 2021), the effect of artist i’s ability on the price is directly explained by the ability term $a_{i, t'_{i,t,k}}$ in the proposed pricing model. Note that the ability is evaluated at the date of work $t'_{i,t,k}$ of the artwork sold at the date of sale t.

Suppose the dataset has $N = \sum _{i=1}^I \sum _{t=1}^T K_{i,t} = \sum _{i=1}^I \sum _{t'=t'_{0,i}}^{t'_{1,i}} K'_{i,t'}$ observations. For ease of notation, let $x_{obs,m} = (x_{i,t,k,m})_{i=1,2,\cdots ,I, t=1,2,\cdots ,T, k=1,2,\ldots ,K_{i,t}}$, $m=1,2,\cdots ,M$, $u_{obs} = (u_{i,t})_{i=1,2,\cdots ,I, t=1,2,\cdots ,T, k=1,2,\ldots ,K_{i,t}}$, $a_{obs} = (a_{i,t'_{i,t,k}})_{i=1,2,\cdots ,I, t=1,2,\cdots ,T, k=1,2,\ldots ,K_{i,t}}$, and $y_{obs} = (y_{i,t,k})_{i=1,2,\cdots ,I, t=1,2,\cdots ,T, k=1,2,\ldots ,K_{i,t}}$ be the vectors of each independent variable, popularity, ability, and dependent variable for all the observations. We denote by

$$\begin{aligned} X_{obs} = \left[ x_{obs,1}, \cdots , x_{obs,M}, u_{obs}\right] \end{aligned}$$

the $N \times (M+1)$ matrix, where the columns $x_{obs,1}, \cdots , x_{obs,M}$, and $u_{obs}$ are the observation vectors of the independent variables and popularity.
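The matrix form can be illustrated with a small numerical sketch: stacking the feature columns and the popularity column gives $X_{obs}$, and the mean of Eq. (4) is then $X_{obs}\delta + a_{obs}$ with $\delta = (\beta _1, \cdots , \beta _M, \gamma )$. All numbers below are randomly generated placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 6, 2

x = rng.normal(size=(N, M))              # hypothetical feature columns x_{obs,1..M}
u_obs = rng.uniform(0.0, 0.01, size=N)   # hypothetical popularity per observation
a_obs = rng.normal(0.0, 2.0, size=N)     # hypothetical ability a_{i,t'_{i,t,k}} per observation

# N x (M+1) design matrix: features first, popularity as the last column.
X_obs = np.column_stack([x, u_obs])

beta = np.array([0.6, 0.1])              # placeholder regression coefficients
gamma = 1.4                              # placeholder popularity coefficient
delta = np.append(beta, gamma)
sigma_eps = 0.5                          # placeholder error standard deviation

# Mean of the log price in Eq. (4), then one simulated draw of y_obs.
mean_log_price = X_obs @ delta + a_obs
y_obs = mean_log_price + rng.normal(0.0, sigma_eps, size=N)
```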

Table 1 summarizes the notations in this paper.

Table 1 Notations and their descriptions


3 Inference algorithm

3.1 Probability distributions and their properties

The artist ability values $a=\{a_{i,t'} \mid i=1,2,\cdots ,I,~ t'=t'_{0,i}, t'_{0,i}+1, \cdots , t'_{1,i}\}$ and the initial ability variance $\tau ^2$ are the two latent variables of the model. The model parameters are the regression coefficients $\beta _m$, $m=1,2,\cdots ,M$ for the independent variables, the popularity coefficient $\gamma$, and the error variance $\sigma _\epsilon ^2$. Let us denote the sequence of regression coefficients $\delta = \left( \beta _1, \cdots , \beta _M, \gamma \right)$ and the sequence of all the model parameters $\theta = \left( \beta _1, \cdots , \beta _M, \gamma , \sigma _\epsilon ^2 \right) = (\delta , \sigma _\epsilon ^2)$.

We will use the EM algorithm to update latent variables and model parameters alternately. The E-step updates latent variables through Gibbs sampling, which requires the conditional distributions of latent variables. We also require the complete-data likelihood function to update parameters in the M-step. We first review the basic probability distributions of the model to address these requirements.

The prior distribution of $\tau ^2$ is given by

$$\begin{aligned} p(\tau ^2) = \frac{1}{\lambda } e^{-\tau ^2/\lambda }, \quad \tau ^2 > 0. \end{aligned}$$

(6)

The prior distribution of initial ability values $a_{i,t'_{0,i}}$, $i=1,2,\cdots ,I$ given $\tau ^2$, shown in Eq. (2), can be written as

$$\begin{aligned} p(a_{i,t'_{0,i}} | \tau ^2) = \frac{1}{\sqrt{2\pi }\tau } e^{-\frac{1}{2 \tau ^2} a_{i,t'_{0,i}}^2}, \quad -\infty< a_{i,t'_{0,i}} < \infty . \end{aligned}$$

(7)

The random walk process of the ability change in Eq. (3) can be described by the probability density function

$$\begin{aligned} p(a_{i,t'} | a_{i,t'-1}) = \frac{1}{\sqrt{2\pi }\sigma _a} e^{-\frac{1}{2 \sigma _a^2} \left( a_{i,t'} - a_{i,t'-1}\right) ^2}, \quad -\infty< a_{i,t'} < \infty \end{aligned}$$

for $i=1,2,\cdots ,I$, $t'=t'_{0,i}+1, \cdots , t'_{1,i}$. Finally, we have the probability density function of response variable $y_{i,t,k}$, given by

$$\begin{aligned} p(y_{i,t,k} | a, \theta ) = \frac{1}{\sqrt{2\pi }\sigma _\epsilon } e^{-\frac{1}{2 \sigma _\epsilon ^2} \left( y_{i,t,k} - \left( \sum _{m=1}^M \beta _m x_{i,t,k,m} + \gamma u_{i,t} + a_{i, t'_{i,t,k}}\right) \right) ^2}, \quad -\infty< y_{i,t,k} < \infty \end{aligned}$$

for $i=1,2,\cdots ,I$, $t=1,2,\cdots ,T$, $k=1,2,\ldots ,K_{i,t}$ according to Eq. (4). Another representation of the model provided in Eq. (5) yields the probability density function of $y_{i,t',k'}$ given by

$$\begin{aligned} p(y_{i,t',k'} | a_{i,t'}, \theta ) = \frac{1}{\sqrt{2\pi }\sigma _\epsilon } e^{-\frac{1}{2 \sigma _\epsilon ^2} \left( y_{i,t',k'} - \left( \sum _{m=1}^M \beta _m x_{i,t',k',m} + \gamma u_{i,t_{i,t',k'}} + a_{i, t'}\right) \right) ^2}, \quad -\infty< y_{i,t',k'} < \infty \end{aligned}$$

for $i=1,2,\cdots ,I$, $t'=t'_{0,i}, t'_{0,i}+1, \cdots , t'_{1,i}$, $k'=1,2,\ldots ,K'_{i,t'}$. Also, we denote

$$\begin{aligned} p(y_{i,t'} | a_{i,t'}, \theta ) = \prod _{k'=1}^{K'_{i,t'}} p(y_{i,t',k'} | a_{i,t'}, \theta ), \end{aligned}$$

where $y_{i,t'} = \{y_{i,t',k'} \mid k'=1,2,\ldots ,K'_{i,t'} \}$.

We are now ready to derive the required conditional distributions of latent variables for Gibbs sampling. The conditional distributions for the artist’s abilities are summarized in Proposition 1.

Proposition 1

Let us denote by

$$\begin{aligned} b_{i,t',k'} = \sum _{m=1}^M \beta _m x_{i,t',k',m} + \gamma u_{i,t_{i,t',k'}} \end{aligned}$$

the linear combination term of the independent variables and popularity in Eq. (5), for $i=1,2,\cdots ,I,~ t'=t'_{0,i}, t'_{0,i}+1, \cdots , t'_{1,i},~ k'=1,2,\ldots ,K'_{i,t'}$. Then we have the following conditional distributions for the abilities of artist i, $i=1,2,\cdots ,I$.

The conditional distribution of the initial ability is given by
$$\begin{aligned} p(a_{i,t'_{0,i}} | \tau ^2, a_{i,t'_{0,i}+1}, y_{i,t'_{0,i}}, \theta ) \sim N\left( \mu _{i,t'_{0,i}}, \sigma _{i,t'_{0,i}}^2\right) , \end{aligned}$$
(8)
where the mean and variance of the normal distribution are
$$\begin{aligned} \sigma _{i,t'_{0,i}}^2 = \left( \frac{1}{\tau ^2} + \frac{1}{\sigma _a^2} + \frac{K'_{i,t'_{0,i}}}{\sigma _\epsilon ^2}\right) ^{-1}, \quad \mu _{i,t'_{0,i}} = \sigma _{i,t'_{0,i}}^2 \left( \frac{a_{i,t'_{0,i}+1}}{\sigma _a^2} + \frac{1}{\sigma _\epsilon ^2} \left( \sum _{k'=1}^{K'_{i,t'_{0,i}}}\left( y_{i,t'_{0,i},k'} - b_{i,t'_{0,i},k'}\right) \right) \right) . \end{aligned}$$
The conditional distributions of the intermediate ability values are given by
$$\begin{aligned} p(a_{i,t'} | \tau ^2, a_{i,t'-1}, a_{i,t'+1}, y_{i,t'}, \theta ) \sim N\left( \mu _{i,t'}, \sigma _{i,t'}^2\right) , \end{aligned}$$
(9)
where the mean and variance of the normal distribution are
$$\begin{aligned} \sigma _{i,t'}^2 = \left( \frac{2}{\sigma _a^2} + \frac{K'_{i,t'}}{\sigma _\epsilon ^2}\right) ^{-1}, \quad \mu _{i,t'} = \sigma _{i,t'}^2 \left( \frac{a_{i,t'-1} + a_{i,t'+1}}{\sigma _a^2} + \frac{1}{\sigma _\epsilon ^2} \left( \sum _{k'=1}^{K'_{i,t'}}\left( y_{i,t',k'} - b_{i,t',k'}\right) \right) \right) , \end{aligned}$$
for $t'=t'_{0,i}+1, t'_{0,i}+2, \cdots , t'_{1,i}-1$.
The conditional distribution of the last ability is given by
$$\begin{aligned} p(a_{i,t'_{1,i}} | \tau ^2, a_{i,t'_{1,i}-1}, y_{i,t'_{1,i}}, \theta ) \sim N\left( \mu _{i,t'_{1,i}}, \sigma _{i,t'_{1,i}}^2\right) , \end{aligned}$$
(10)
where the mean and variance of the normal distribution are
$$\begin{aligned} \sigma _{i,t'_{1,i}}^2 = \left( \frac{1}{\sigma _a^2} + \frac{K'_{i,t'_{1,i}}}{\sigma _\epsilon ^2}\right) ^{-1}, \quad \mu _{i,t'_{1,i}} = \sigma _{i,t'_{1,i}}^2 \left( \frac{a_{i,t'_{1,i}-1}}{\sigma _a^2} + \frac{1}{\sigma _\epsilon ^2} \left( \sum _{k'=1}^{K'_{i,t'_{1,i}}}\left( y_{i,t'_{1,i},k'} - b_{i,t'_{1,i},k'}\right) \right) \right) . \end{aligned}$$

Proof

Using Bayes’ rule, we obtain the relationship for the initial ability of an artist i given by

$$\begin{aligned} p(a_{i,t'_{0,i}} | \tau ^2, a_{i,t'_{0,i}+1}, y_{i,t'_{0,i}}, \theta ) \propto p(a_{i,t'_{0,i}} | \tau ^2) p(a_{i,t'_{0,i}+1} | a_{i,t'_{0,i}}) p(y_{i,t'_{0,i}} | a_{i,t'_{0,i}}, \theta ). \end{aligned}$$

For the intermediate ability values of an artist i, we have

$$\begin{aligned} p(a_{i,t'} | \tau ^2, a_{i,t'-1}, a_{i,t'+1}, y_{i,t'}, \theta ) \propto p(a_{i,t'} | a_{i,t'-1}) p(a_{i,t'+1} | a_{i,t'}) p(y_{i,t'} | a_{i,t'}, \theta ) \end{aligned}$$

for $t'=t'_{0,i}+1, t'_{0,i}+2, \cdots , t'_{1,i}-1$. Finally, the conditional density of the last ability of an artist i can be written as

$$\begin{aligned} p(a_{i,t'_{1,i}} | \tau ^2, a_{i,t'_{1,i}-1}, y_{i,t'_{1,i}}, \theta ) \propto p(a_{i,t'_{1,i}} | a_{i,t'_{1,i}-1}) p(y_{i,t'_{1,i}} | a_{i,t'_{1,i}}, \theta ). \end{aligned}$$

It is straightforward to obtain the probability density function of normal distributions in Eqs. (8), (9), and (10) by computing the product of normal densities.

$\square$
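The three cases of Proposition 1 differ only in which neighbouring terms contribute to the precision and the mean: the initial ability gets the prior term $1/\tau ^2$ and its successor, an intermediate ability gets both neighbours, and the last ability gets only its predecessor. The sketch below implements one Gibbs sweep for a single artist under that reading. The function names and argument layout are hypothetical, and the per-period residual sums $\sum _{k'} (y_{i,t',k'} - b_{i,t',k'})$ are assumed to be precomputed.

```python
import numpy as np

def conditional_moments(j, a, resid_sum, counts, tau2, sigma_a2, sigma_eps2):
    """Mean and variance of the normal conditional of a[j] given its neighbours and data.

    a:         current ability path of one artist, ordered t'_0, ..., t'_1 (length >= 2)
    resid_sum: per-period sums of y - b; counts: per-period artwork counts K'.
    """
    precision = counts[j] / sigma_eps2
    weighted = resid_sum[j] / sigma_eps2
    if j == 0:
        precision += 1.0 / tau2            # N(0, tau^2) prior on the initial ability
    else:
        precision += 1.0 / sigma_a2        # random-walk link to the previous ability
        weighted += a[j - 1] / sigma_a2
    if j < len(a) - 1:
        precision += 1.0 / sigma_a2        # random-walk link to the next ability
        weighted += a[j + 1] / sigma_a2
    var = 1.0 / precision
    return var * weighted, var

def gibbs_sweep(a, resid_sum, counts, tau2, sigma_a2, sigma_eps2, rng):
    """Update every ability of one artist in place, one time point after another."""
    for j in range(len(a)):
        mean, var = conditional_moments(j, a, resid_sum, counts, tau2, sigma_a2, sigma_eps2)
        a[j] = rng.normal(mean, np.sqrt(var))
    return a
```

As a sanity check, with no observed artworks in a period the conditional mean of an intermediate ability reduces to the average of its two neighbours with variance $\sigma _a^2/2$.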

Next, the conditional distribution of $\tau ^2$ is given by

$$\begin{aligned} p(\tau ^2 | a_{i,t'_{0,i}}, i=1,\cdots ,I) \propto p(\tau ^2) \prod _{i=1}^I p(a_{i,t'_{0,i}} | \tau ^2). \end{aligned}$$

(11)

We use the transformation $\eta = \ln (\tau ^2)$ to sample $\tau ^2$ effectively through adaptive rejection sampling (ARS), which requires the log-concavity of the target distribution (Gilks and Wild 1992). Proposition 2 shows that this requirement is satisfied.

Proposition 2

The probability density function $p(\eta | a_{i,t'_{0,i}}, i=1,\cdots ,I)$ is log-concave in $\eta$.

Proof

By using Eq. (11) and $\left| \frac{d \tau ^2}{d \eta }\right| = e^{\eta }$, we obtain

$$\begin{aligned} p(\eta | a_{i,t'_{0,i}}, i=1,\cdots ,I) \propto p(\tau ^2 = e^\eta ) \prod _{i=1}^I p(a_{i,t'_{0,i}} | \tau ^2 = e^\eta ) \cdot e^{\eta }. \end{aligned}$$

(12)

By taking the logarithm and plugging in the probability density functions in Eqs. (6) and (7), we can write

$$\begin{aligned} \ln p(\eta | a_{i,t'_{0,i}}, i=1,\cdots ,I) = - \frac{1}{\lambda } e^{\eta } - \left( \frac{I}{2} - 1 \right) \eta - \frac{1}{2} e^{-\eta } \sum _{i=1}^I a_{i,t'_{0,i}}^2 + C, \end{aligned}$$

where C is constant with respect to $\eta$. Then the second-order derivative

$$\begin{aligned} \frac{\partial ^2}{\partial \eta ^2} \ln p(\eta | a_{i,t'_{0,i}}, i=1,\cdots ,I) = - \frac{1}{\lambda } e^{\eta } - \frac{1}{2} e^{-\eta } \sum _{i=1}^I a_{i,t'_{0,i}}^2 \end{aligned}$$

is negative for any $\eta$, showing the logarithm of the target distribution is concave in $\eta$.

$\square$

We can sample $\eta$ via ARS and take the inverse transformation $\tau ^2 = e^{\eta }$ to obtain samples of $\tau ^2$ for Gibbs sampling.
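The sketch below evaluates the unnormalized log density of $\eta$ obtained by substituting Eqs. (6) and (7) into Eq. (12), which gives the terms $-e^{\eta }/\lambda$, $-(I/2-1)\eta$, and $-\frac{1}{2} e^{-\eta } \sum _i a_{i,t'_{0,i}}^2$. NumPy does not ship an ARS routine, so the sampler here is a plain grid-based inverse-CDF draw used as a stand-in for ARS; it is not the sampler described in the paper, and the function names and grid limits are assumptions.

```python
import numpy as np

def log_density_eta(eta, sum_sq, n_artists, lam):
    """Unnormalized log density of eta = ln(tau^2) given the initial abilities."""
    return (-np.exp(eta) / lam
            - (n_artists / 2.0 - 1.0) * eta
            - 0.5 * np.exp(-eta) * sum_sq)

def sample_tau2_grid(a0, lam, rng, lo=-10.0, hi=10.0, n_grid=4001):
    """Draw tau^2 by inverse-CDF sampling of eta on a grid (a stand-in, not ARS)."""
    grid = np.linspace(lo, hi, n_grid)
    logp = log_density_eta(grid, np.sum(a0 ** 2), len(a0), lam)
    weights = np.exp(logp - logp.max())      # subtract the max for numerical stability
    cdf = np.cumsum(weights) / np.sum(weights)
    idx = min(np.searchsorted(cdf, rng.uniform()), n_grid - 1)
    return np.exp(grid[idx])                 # inverse transformation tau^2 = exp(eta)
```

The log-concavity stated in Proposition 2 can be checked numerically by confirming that the second differences of `log_density_eta` on a grid are all negative.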

Next, we turn our attention to the M-step. We need the complete-data likelihood function to construct an optimizing function, given by

$$\begin{aligned} \begin{aligned} L(\theta | a, \tau ^2, y_{obs})&= p(y_{obs} | \theta , a, \tau ^2) \\&= \prod _{i=1}^I \prod _{t=1}^T \prod _{k=1}^{K_{i,t}} p(y_{i,t,k} | a, \theta ). \end{aligned} \end{aligned}$$

Then, the complete-data log-likelihood function of the model can be expressed by

$$\begin{aligned} \begin{aligned} l(\theta | a, \tau ^2, y_{obs})&= \ln L(\theta | a, \tau ^2, y_{obs}) \\&= \sum _{i=1}^I \sum _{t=1}^T \sum _{k=1}^{K_{i,t}} \ln p(y_{i,t,k} | a, \theta ). \end{aligned} \end{aligned}$$

Let $Q(\theta | \theta _{prev})$ be the expectation of the log-likelihood function given by

$$\begin{aligned} Q(\theta | \theta _{prev}) = E \left[ l(\theta | a, \tau ^2, y_{obs}) | y_{obs}, \theta _{prev}\right] , \end{aligned}$$

where the expectation is over the latent variables a and $\tau ^2$. The EM algorithm updates the parameter $\theta$ using the previous parameter value $\theta _{prev}$ by maximizing $Q(\theta | \theta _{prev})$ with respect to $\theta$. In practice, obtaining the exact value of $Q(\theta | \theta _{prev})$ is not feasible. Therefore, we approximate the objective function by

$$\begin{aligned} \hat{Q}(\theta | \theta _{prev}) = l(\theta | a_{est}, \tau ^2_{est}, y_{obs}) \end{aligned}$$

(13)

by employing the estimates of the latent variables $a_{est}$ and $\tau ^2_{est}$, where the estimates should be obtained given the observations $y_{obs}$ and the previous parameter estimate $\theta _{prev}$. We can utilize the ordinary least squares method to obtain the parameter estimate of $\theta$ that maximizes the function $\hat{Q}(\theta | \theta _{prev})$ for the M-step.
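Because the ability term enters Eq. (4) additively, maximizing $\hat{Q}$ over $\delta$ amounts to an ordinary least squares regression of $y_{obs} - a_{est}$ on $X_{obs}$. The sketch below shows this step; estimating $\sigma _\epsilon ^2$ by the mean squared residual (dividing by N) is the maximum-likelihood choice and is an assumption here, since the text does not state which divisor is used. The function name `m_step` is hypothetical.

```python
import numpy as np

def m_step(X_obs, y_obs, a_est):
    """Approximate M-step: OLS of (y - a_est) on X_obs, then a variance estimate."""
    target = y_obs - a_est                      # remove the estimated ability term
    delta_hat, *_ = np.linalg.lstsq(X_obs, target, rcond=None)
    resid = target - X_obs @ delta_hat
    sigma_eps2_hat = np.mean(resid ** 2)        # assumed ML form: divide by N
    return delta_hat, sigma_eps2_hat
```

Here `delta_hat` stacks the estimates of $\beta _1, \cdots , \beta _M$ followed by $\gamma$, matching the column order of $X_{obs}$.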

3.2 Algorithm and inference

We present the EM algorithm to obtain the model parameter estimates $\theta$ and the latent variables a and $\tau ^2$. We obtain G Gibbs samples in the E-step for each latent variable. We discard the first $G_0$ samples to reduce potential bias due to the initial values. Furthermore, we initialize the Gibbs sampler in each iteration with the last Gibbs sample of the previous iteration. In this paper, we set $G=110$ and $G_0=10$.

We set the ability estimate at the s-th iteration of the EM algorithm, $\hat{a}^{(s)} = \left( \hat{a}_{i,t'}^{(s)}\right) _{i=1,2,\cdots ,I, t'=t'_{0,i}, t'_{0,i}+1, \cdots , t'_{1,i}}$, to the average value of the corresponding Gibbs samples, given by

$$\begin{aligned} \hat{a}_{i,t'}^{(s)} = \frac{1}{G-G_0} \sum _{g=G_0+1}^G a_{i,t',(g)}^{(s)}. \end{aligned}$$

In the M-step of Algorithm 1, we employ the current estimate $\hat{a}_{obs}^{(s)}$ given by

$$\begin{aligned} \hat{a}_{obs}^{(s)} = \left( \hat{a}_{i, t'_{i,t,k}}^{(s)} \right) _{i=1,2,\cdots ,I, t=1,2,\cdots ,T, k=1,2,\cdots ,K_{i,t}} \end{aligned}$$

for $a_{est}$ in Eq. (13). Note that $\tau _{est}^2$ in Eq. (13) is not directly used in the ordinary least squares optimization process.

We repeat the E-step and the M-step until the convergence of $\hat{\theta }^{(s)}$, $s=1,2,\cdots$. Then Algorithm 1 provides the final estimates $\hat{\theta } = \hat{\theta }^{(S)} = (\hat{\delta }^{(S)}, \hat{\sigma }_{\epsilon }^{2,(S)})$ for the model parameters. The standard errors of the estimate $\hat{\delta }$ can be obtained as the square roots of the diagonal elements of the $(M+1) \times (M+1)$ matrix $\hat{\sigma }_{\epsilon }^{2} \left( X_{obs}^T X_{obs}\right) ^{-1}$.
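A minimal sketch of the standard error computation follows, taking the standard errors as the square roots of the diagonal of $\hat{\sigma }_{\epsilon }^{2} (X_{obs}^T X_{obs})^{-1}$, which is the usual least squares covariance formula. The function name is hypothetical.

```python
import numpy as np

def standard_errors(X_obs, sigma_eps2_hat):
    """Standard errors of delta_hat from the least squares covariance matrix."""
    cov = sigma_eps2_hat * np.linalg.inv(X_obs.T @ X_obs)   # (M+1) x (M+1)
    return np.sqrt(np.diag(cov))
```

Dividing each element of $\hat{\delta }$ by its standard error then gives the usual z-type statistic for judging statistical significance.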

We also obtain Gibbs samples $a_{(g)} = a_{(g)}^{(S-1)}$, $\tau _{(g)}^2 = \tau _{(g)}^{2,(S-1)}$, $g=1,2,\cdots ,G$ for the latent variables from Algorithm 1. We define the point estimates for latent variables as the average values of the samples:

$$\begin{aligned} \hat{a} = \frac{1}{G-G_0} \sum _{g=G_0+1}^G a_{(g)}, \quad \hat{\tau }^2 = \frac{1}{G-G_0} \sum _{g=G_0+1}^G \tau _{(g)}^2. \end{aligned}$$

(14)

Since the ability value is given at each time point, we can consider the average value over time as a single-value estimate of the ability of artist i, given by

$$\begin{aligned} \bar{\hat{a}}_i = \frac{1}{t'_{1,i}-t'_{0,i}+1} \sum _{t'=t'_{0,i}}^{t'_{1,i}} \hat{a}_{i,t'}. \end{aligned}$$

(15)
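Equations (14) and (15) are plain averages, first over the retained Gibbs samples and then over the artist's active dates of work. The sketch below illustrates both steps on randomly generated placeholder samples for a single artist; the array shapes and the helper name `posterior_mean` are assumptions.

```python
import numpy as np

def posterior_mean(samples, burn_in):
    """Average Gibbs samples along axis 0 after discarding the first burn_in draws."""
    return np.mean(samples[burn_in:], axis=0)

G, G0 = 110, 10
rng = np.random.default_rng(3)

# Placeholder Gibbs output: G draws of one artist's ability path over 6 dates of work,
# and G draws of tau^2. These are random numbers, not output of the actual sampler.
a_samples = rng.normal(0.5, 0.1, size=(G, 6))
tau2_samples = rng.gamma(4.0, 1.0, size=G)

a_hat = posterior_mean(a_samples, G0)        # Eq. (14), one value per date of work
tau2_hat = posterior_mean(tau2_samples, G0)  # Eq. (14) for tau^2
a_bar = a_hat.mean()                         # Eq. (15), average over t'_0, ..., t'_1
```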

4 Applications to artwork dataset

4.1 Dataset

We use the Artnet auction data, where Artnet is an online platform that operates a global marketplace and informational resource for the art market. In this paper, we focus on contemporary art by selecting artists listed among the Top 500 artists by fine art and NFT auction turnover in the annual reports “The Art Market Trends” from 2018 to 2022 provided by Artprice (www.artprice.com). We further select artists born after 1920 and categorized as “Post-War Art” and “Contemporary Art.” Up to 500 artworks are randomly sampled for each selected artist so that less famous artists are properly represented, preventing potential popularity bias. Next, we remove artworks that do not have images or sold price records.

The obtained dataset contains 64,176 artworks created by 260 artists and sold at auctions. We consider data with the date of work from the years 1951 to 2020 and the date of sale from the years 1993 to 2022. As in the simulation study, we use 5 years and 1 year as time windows for the date of work and the date of sale, respectively, so that we have time points $t'=1\,(1951\text{--}1955), 2\,(1956\text{--}1960), \cdots , 14\,(2016\text{--}2020)$ for the date of work and $t=1\,(1993), 2\,(1994), \cdots , 30\,(2022)$ for the date of sale. We choose artists with three or more artworks in each of five or more consecutive dates of work $t'= t'_0, t'_0+1, \cdots , t'_1$ from the time of entry $t'_0$, where $t'_0$ is the first time three or more artworks appear in the system and $t'_1 - t'_0 \ge 5$. We remove the artworks that are not created at $t'= t'_0, t'_0+1, \cdots , t'_1$. Consequently, our final Artnet dataset contains 42,599 artworks created by 152 artists. We consider the following independent variables:

Log Area ($m=1$) is the logarithmically transformed value of the physical surface area occupied by the artwork.
Signed ($m=2$) refers to whether the artwork is signed by the artist.
Inscribed ($m=3$) refers to whether the artwork has additional writing, markings, or inscriptions.
Medium1 is the material used to create the artwork and is classified into seven categories: oil ($m=4$), acrylic ($m=5$), water ($m=6$), pastel ($m=7$), pencil ($m=8$), mixed media ($m=9$), and others (reference category).
Medium2 is the material on which an artwork is created and is classified into six categories: canvas ($m=10$), cardboard ($m=11$), board ($m=12$), paper ($m=13$), leather ($m=14$), and others (reference category).
Year of sale: $t=1(1993, m=15), 2(1994, m=16), \cdots , 30(2022, m=44)$

The artist’s popularity values are calculated using Eq. (1). The dependent variable is the log price (dollar) of artworks.
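The time windows and the year-of-sale dummy indices described above can be reproduced with simple integer arithmetic. The helper names below are hypothetical; the window starts (1951 and 1993), the 5-year width, and the offset of 14 for the year-of-sale dummies follow the variable list in this section.

```python
def work_period(year, start=1951, width=5):
    """Map a year of work to its 5-year window index t' (1951-1955 -> 1, ..., 2016-2020 -> 14)."""
    return (year - start) // width + 1

def sale_period(year, start=1993):
    """Map a year of sale to its index t (1993 -> 1, ..., 2022 -> 30)."""
    return year - start + 1

def sale_dummy_index(year):
    """Index m of the year-of-sale dummy variable (1993 -> 15, ..., 2022 -> 44)."""
    return 14 + sale_period(year)
```

For instance, an artwork created in 1987 and sold in 2005 falls in date-of-work window `work_period(1987)` and date-of-sale index `sale_period(2005)`.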

4.2 Result

Table 2 Estimated regression coefficients and corresponding standard errors for the Artnet dataset

Table 2 shows the parameter estimation results of the Artnet dataset. Figure 1 shows the trace plots of the regression coefficients $\beta _1$, $\beta _2$, $\beta _4$, $\beta _{30}$, and $\gamma$, showing the convergence of the regression coefficients. Although not shown in Fig. 1, the convergence of other parameters is also stable.

We can see that most parameters are statistically significant. Positive $\hat{\beta }_1=0.65$, $\hat{\beta }_2=0.04$, and $\hat{\beta }_3=0.09$ indicate that large, signed, and inscribed artworks tend to be expensive. The parameter estimates from $\hat{\beta }_{15}$ to $\hat{\beta }_{44}$ corresponding to the date of sale imply that the recently sold artworks tend to have higher prices. The order of Medium1 from most expensive to cheapest turns out to be oil($\hat{\beta }_4=0.65$), acrylic($\hat{\beta }_5=0.46$), pastel($\hat{\beta }_7=0.32$), water($\hat{\beta }_6=0.17$), pencil($\hat{\beta }_8=0.06$), mixed media($\hat{\beta }_9=0.04$), and others(reference category). On the other hand, the order of Medium2 from most expensive to cheapest is canvas($\hat{\beta }_{10}=0.73$), board($\hat{\beta }_{12}=0.36$), others(reference category), cardboard($\hat{\beta }_{11}=-0.15$), leather($\hat{\beta }_{14}=-0.26$), and paper($\hat{\beta }_{13}=-0.33$). Most importantly, the statistically significant and positive popularity parameter $\hat{\gamma }=1.38$ suggests that artworks by artists with a high reputation are likely to be pricey.
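Because the dependent variable is the log price, each dummy coefficient can be read as a multiplicative price effect relative to its reference category. The short Python sketch below illustrates this reading using the estimates quoted above. It assumes the log is the natural logarithm and that the area and price use the same log base; both are assumptions, and the helper names are hypothetical.

```python
import math

# Estimates quoted in the text; "others" is the reference category (0.0).
MEDIUM1 = {"oil": 0.65, "acrylic": 0.46, "water": 0.17, "pastel": 0.32,
           "pencil": 0.06, "mixed media": 0.04, "others": 0.0}
MEDIUM2 = {"canvas": 0.73, "cardboard": -0.15, "board": 0.36,
           "paper": -0.33, "leather": -0.26, "others": 0.0}
BETA_LOG_AREA = 0.65


def price_ratio(beta):
    """Price multiple relative to the reference category (natural log assumed)."""
    return math.exp(beta)


def rank(coefs):
    """Categories ordered from most expensive to cheapest."""
    return sorted(coefs, key=coefs.get, reverse=True)


def area_scaling(beta_log_area, factor):
    """Price multiple when the area is multiplied by `factor`, all else equal."""
    return factor ** beta_log_area


if __name__ == "__main__":
    for name in rank(MEDIUM2):
        print(f"{name:10s} x{price_ratio(MEDIUM2[name]):.2f} vs. others")
    print(f"doubling the area: x{area_scaling(BETA_LOG_AREA, 2):.2f}")
```

Under these assumptions, an oil work is priced at roughly $e^{0.65}$ times an otherwise comparable work in the reference medium, holding the other variables, popularity, and ability fixed.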

Through the proposed model, we can infer artist ability values using the popularity and the independent variables as control variables. Artworks by famous artists tend to be expensive, and we try to infer how much this is caused by the artist’s reputation and the artist’s ability. In other words, we estimate the artist’s ability excluding the effect due to the artist’s popularity. Also, we can infer how the artist’s ability changes over time. Figure 2 shows the ability inference results for five artists in the dataset. Adrian Gheni has positive ability estimates at all times, indicating his intrinsic ability is higher than other artists. We can also see that his ability gradually increases over time. Note that the average ability should be 0 according to Eqs. (2) and (3). On the other hand, A.R. Penek’s ability variability is small, and he has negative ability estimates for all $t'=1,2,\cdots ,14$. Alex Katz’s ability estimates and their trend suggest that his ability is approximately average and tends to decrease gradually.

Table 3 Top 5 artists based on the ability estimate for each time and the averaged estimate of the Artnet dataset

Table 3 shows the top 5 artists based on the estimated ability values at each time point and the average estimate over time, computed by Eqs. (14) and (15), respectively. We can see that Jasper Johns (3.74) has the greatest overall ability. We can also identify the most capable artists at specific time points; for example, Wayne Thiebaud is the most talented artist from 2001 to 2020. Note that the proposed model estimates the ability values considering the rich-get-richer effect caused by the reputation of the artists.

5 Concluding remarks

In this paper, we propose an artwork pricing model to measure the size of the popularity effect and to infer the intrinsic ability of artists in the presence of the popularity effect. The proposed model has the advantage of utilizing independent variables that affect artwork prices like the hedonic model. The estimation algorithm of the proposed model is presented to infer the model parameters and artist ability values via the EM algorithm and Gibbs sampling. Monte Carlo simulations show that the proposed algorithm works well by showing the convergence of the parameters and the suitability of the inference on the artist’s ability. Additionally, the algorithm works reasonably on real data analysis as well.

We present a popularity measure in Eq. (1) that can be inferred from the dataset without explicit reputation measures of artists. It is worth noting that the presented popularity measure can be customized depending on the characteristics of the dataset. For example, if we have the artist’s reputation measures such as the number of gallery openings and media awareness, then the popularity measure can be modified considering the explicit artist’s reputation values. Also, we usually consider how famous the artist is when the artwork is sold, but we may also use the date of work if appropriate. In this paper, the presented popularity measure has a discrete structure with significant jumps over time. Hence, employing a smoothed popularity measure could be another valuable alternative.

The inference on the artist’s ability is performed on the Artnet dataset regarding the popularity effect in the system. The artist’s ability is allowed to change over time, reflecting the artist’s experience, self-growth, inspiration due to technological advancement, and decline in ability due to personal issues such as aging and health. The degree of change in the artist’s ability is set as a hyperparameter so that we can control for the expected change in the artist’s ability. Also, we may consider the other structures of ability dynamics by modifying the ability assumptions in Eqs. (2) and (3) according to the characteristics of a dataset. For example, we may modify the assumption of the artist’s ability to have discrete levels. Furthermore, the proposed model can cover any time intervals and even irregular time intervals by defining the time index of the date of work. For example, it may be more reasonable to consider the artist’s various creative periods taking into account the characteristics of each period in art history.

The inferred ability values of artists can be applied to the artist recommender system in places such as artwork sales platforms while considering the system’s rich-get-richer effect. In future work, one may develop an artwork recommender system using the idea of the proposed model to eliminate the popularity bias. Another direction of future work includes extending the proposed model to consider the interaction between the popularity and ability of artists. The artist’s ability is a latent variable in the proposed model, affecting their artwork prices. A single latent variable may oversimplify the multifaceted nature of ability, such as creativity, innovation, and technical skill. A model incorporating multi-dimensional latent variables could be another direction for future research.

Data availability statement

The datasets are available from the corresponding author upon request.

References

Angelini, F., Castellani, M., Pattitoni, P.: Artist names as human brands: brand determinants, creation and co-creation mechanisms. Empir. Stud. Arts 41(1), 80–107 (2023)
Article Google Scholar
Ashenfelter, O., Graddy, K.: Auctions and the price of art. J. Econ. Lit. 41(3), 763–786 (2003)
Article Google Scholar
Beckert, J., Rössel, J.: The price of art: Uncertainty and reputation in the art field. Eur. Soc. 15(2), 178–195 (2013)
Article Google Scholar
Bocart, F., Gertsberg, M., Pownall, R.A.: Glass ceilings in the art market. Available at SSRN (2018)
Bratanova, B., Loughnan, S., Klein, O., et al.: The rich get richer, the poor get even: perceived socioeconomic position influences micro-social distributions of wealth. Scand. J. Psychol. 57(3), 243–249 (2016)
Article Google Scholar
Candela, G., Scorcu, A.E.: A price index for art market auctions. J. Cult. Econ. 21, 175–196 (1997)
Article Google Scholar
David, G., Oosterlinck, K., Szafarz, A.: Art market inefficiency. Econ. Lett. 121(1), 23–25 (2013)
Article MathSciNet Google Scholar
Durham, Y., Hirshleifer, J., Smith, V.L.: Do the rich get richer and the poor poorer? Experimental tests of a model of power. Am. Econ. Rev. 88(4), 970–983 (1998)
Google Scholar
Ekeland, I., Heckman, J.J., Nesheim, L.: Identification and estimation of hedonic models. J. Pol. Econ. 112(S1), S60–S109 (2004)
Article Google Scholar
Etro, F., Stepanova, E.: Power-laws in art. Phys. A 506, 217–220 (2018)
Article Google Scholar
Forster, J., Higgs, H.: Artwork characteristics and prices in the New Zealand secondary art market, 1988–2011. N. Z. Econ. Pap. 52(2), 150–169 (2018)
Google Scholar
Francke, M., Van de Minne, A.: Modeling unobserved heterogeneity in hedonic price models. Real Estate Econ. 49(4), 1315–1339 (2021)
Article Google Scholar
Galbraith, J.W., Hodgson, D.J.: Econometric fine art valuation by combining hedonic and repeat-sales information. Econometrics 6(3), 32 (2018)
Article Google Scholar
Galenson, D.W.: The careers of modern artists. J. Cult. Econ. 24(2), 87–112 (2000)
Article Google Scholar
Galenson, D.W.: The life cycles of modern artists. Hist. Meth 37(3), 123–136 (2004)
Article Google Scholar
Galenson, D.W., Weinberg, B.A.: Age and the quality of work: the case of modern American painters. J. Polit. Econ. 108(4), 761–777 (2000)
Article Google Scholar
Garay, U., Puggioni, G., Molina, G., et al.: A Bayesian dynamic hedonic regression model for art prices. J. Bus. Res. 151, 310–323 (2022)
Article Google Scholar
Gilks, R.W., Wild, P.: Adaptive rejection sampling for Gibbs sampling. Appl. Stat. 41, 337–348 (1992)
Article Google Scholar
Higgs, H., Forster, J.: The auction market for artworks and their physical dimensions: Australia-1986 to 2009. J. Cult. Econ. 38, 85–104 (2014)
Article Google Scholar
Hosoya, G.: The artwork and the beholder: a probabilistic model for the joint scaling of persons and objects. Psychol. Aesthet. Creat. Arts 14(2), 224 (2020)
Article Google Scholar
Jung, H.: Eliminating the biases of user influence and item popularity in bipartite networks: a case study of Flickr and Netflix. Phys. A 618, 128695 (2023)
Article Google Scholar
Jung, H., Phoa, F.K.H.: On the effects of capability and popularity on network dynamics with applications to YouTube and twitch networks. Phys. A 571, 125663 (2021)
Article MathSciNet Google Scholar
Jung, H., Lee, J.G., Lee, N., et al.: Ptem: a popularity-based topical expertise model for community question answering. Ann. Appl. Stat. 14(3), 1304–1325 (2020)
Article MathSciNet Google Scholar
Kackovic, M., Hartog, J., van Ophem, H., et al.: The promise of potential: a study on the effectiveness of jury selection to a prestigious visual arts program. Kyklos 75(3), 410–435 (2022)
Article Google Scholar
Kim, A., Kim, C.H., Noh, M.: A small-area data analysis for cancer registration data of Busan. J. Korean Data Anal. Soc. 23(4), 1559–1567 (2021)
Article Google Scholar
Kraeussl, R., Logher, R.: Emerging art markets. Emerg. Mark. Rev. 11(4), 301–318 (2010)
Article Google Scholar
Marchenko, M., Sonnabend, H.: Artists’ labour market and gender: evidence from German visual artists. Kyklos 75(3), 456–471 (2022)
Article Google Scholar
Markusen, A., Johnson, A., Connelly, C., et al.: Artists’ Centers: Evolution and Impact on Careers, Neighborhoods and Economics. University of Minnesota, Center for Urban and Regional Affairs, Minneapolis (2006)
Google Scholar
Peluso, A.M., Pino, G., Amatulli, C., et al.: Luxury advertising and recognizable artworks: new insights on the “art infusion’’ effect. Eur. J. Mark. 51(11/12), 2192–2206 (2017)
Article Google Scholar
Rengers, M., Velthuis, O.: Determinants of prices for contemporary art in Dutch galleries, 1992–1998. J. Cult. Econ. 26, 1–28 (2002)
Article Google Scholar
Schönfeld, S., Reinstaller, A.: The effects of gallery and artist reputation on prices in the primary market for art: a note. J. Cult. Econ. 31, 143–153 (2007)
Article Google Scholar
Shin, D., Lee, K., Lee, H.: Neoliberal marketization of art worlds and status multiplexity: price formation in a Korean art auction, 1998–2007. Poetics 43, 120–148 (2014)
Article Google Scholar
Song, H.Y., Park, H.W.: Comparison of popular YouTube video scripts and commentary networks in the economic sector: focusing on Sampro Tv channels. J. Korean Data Anal. Soc. 24(2), 843–859 (2022)
Article Google Scholar
Sproule, R., Valsan, C.: Hedonic models and pre-auction estimates: abstract art revisited. Econ. Bull. 26(5), 1–10 (2006)
Google Scholar
Ursprung, H.W., Wiermann, C.: Reputation, price, and death: an empirical analysis of art price formation. Econ. Inq. 49(3), 697–715 (2011)
Article Google Scholar

Download references

Acknowledgements

This study was supported by the Sungshin Women’s University research grant of 2022 (H20220024).

Author information

Authors and Affiliations

Department of Information Statistics, Chungbuk National University, Cheongju, 28644, Republic of Korea
Jinsu Park
Department of Statistics, Sungshin Women’s University, Seoul, 02844, Republic of Korea
Yoonjin Lee & Hohyun Jung
Department of Information and Statistics, Chungnam National University, Daejeon, 34134, Republic of Korea
Daewon Yang
Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, 23955, Saudi Arabia
Jongho Park
Data Science Center, Sungshin Women’s University, Seoul, 02844, Republic of Korea
Hohyun Jung

Corresponding author

Correspondence to Hohyun Jung.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Simulation study

A.1 Synthetic dataset

We perform a simulation study to verify the validity of the estimation process of the proposed model. We generate a synthetic dataset by imitating the process of creating artwork, using the log area and the date of sale as independent variables. The artwork is assumed to be created during the period from 1951 to 2020, and the time index for the date of work is defined using a 5-year window: $t'=1(1951\text{--}1955), 2(1956\text{--}1960), \cdots , 14(2016\text{--}2020)$. On the other hand, the date of sale considers the years from 1993 to 2022 denoted by $t=1(1993), 2(1994), \cdots , 30(2022)$. We have 31 independent variables including the logarithm of area($m=1$) and 30 dummy variables corresponding to each date of sale $t=1(m=2), 2(m=3), \cdots , 30(m=31)$. The true parameter values are set to $\beta _1=2.0, \beta _2=5.0, \beta _3=5.1, \beta _4=5.2, \cdots , \beta _{31}=7.9, \gamma =1.0$, and $\sigma _\epsilon ^2=1.0^2$. The regression coefficients indicate that artworks that are large, recently sold, and created by popular artists are likely to be expensive. We set the variance of the initial abilities $\tau ^2=2^2$ and the step variance of the ability change $\sigma _a^2=0.1^2$ in Algorithm 1.

Let the number of artists be $I = 100$ in the system. We first generate the true ability of artists according to Eqs. (2) and (3). We assume that each artist creates 500 artworks, where each artwork is created at $T_{work} \sim Uniform(1951, 1952,..., 2020)$, and the year of sale is randomly chosen from $Uniform\left( \max (T_{work},1993), 2022\right)$. The log area of an artwork is sampled from the lognormal distribution with the parameters $\mu =3$ and $\sigma =0.5$. Finally, the artwork price is determined according to Eq. (4), where an artist’s popularity is computed by Eq. (1). Then the generated synthetic dataset has 50,000 artworks, and Table 4 displays the first five artworks of the synthetic dataset.
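The covariate part of this generation process can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the function and field names are hypothetical, the log area is drawn from a lognormal distribution exactly as the sentence above reads, and the ability, popularity, and price steps of Eqs. (1) to (4) are deliberately omitted.

```python
import random


def simulate_covariates(n_artists=100, works_per_artist=500, seed=0):
    """Draw the date of work, date of sale, and log area for each artwork."""
    rng = random.Random(seed)
    rows = []
    for artist in range(1, n_artists + 1):
        for _ in range(works_per_artist):
            # Year of work: uniform on the integers 1951, ..., 2020.
            year_work = rng.randint(1951, 2020)
            # Year of sale: uniform on max(year_work, 1993), ..., 2022.
            year_sale = rng.randint(max(year_work, 1993), 2022)
            # Literal reading of the text: lognormal with mu=3, sigma=0.5.
            log_area = rng.lognormvariate(3.0, 0.5)
            rows.append({
                "artist": artist,
                "year_work": year_work,
                "year_sale": year_sale,
                "t_work": (year_work - 1951) // 5 + 1,   # t' in 1..14
                "t_sale": year_sale - 1993 + 1,          # t in 1..30
                "log_area": log_area,
            })
    return rows


def true_coefficients():
    """True values from A.1: beta_1 = 2.0, beta_2..beta_31 = 5.0, 5.1, ..., 7.9."""
    beta = [2.0] + [round(5.0 + 0.1 * k, 1) for k in range(30)]
    gamma, sigma_eps_sq = 1.0, 1.0
    return beta, gamma, sigma_eps_sq
```

With the default arguments the sketch yields 100 × 500 = 50,000 rows, matching the size of the synthetic dataset.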

Table 4 First five artworks of the synthetic dataset

A.2 Parameter estimation

Table 5 True and estimated regression coefficients with corresponding standard errors of the synthetic dataset

Table 5 summarizes the parameter estimation results of the proposed model with Algorithm 1. T statistics and P-values are derived through hypothesis testing, assessing whether each regression coefficient equals its true value. We can see that the estimated parameter values are similar to the true parameter values and all P-values are larger than 0.03, suggesting that the estimation algorithm works properly.

Figure 3 shows the trace plots of the regression coefficients $\beta _1$, $\beta _{11}$, $\beta _{21}$, $\beta _{31}$, and $\gamma$ over 10,000 iterations of Algorithm 1. We can see the convergence of the regression coefficients, ensuring the reliability of parameter estimates.

A.3 Inference on artist ability

Figure 4 shows the true and estimated ability values of the five artists according to Eq. (14). The confidence interval for each ability value $a_{i,t'}$ is calculated based on $\pm 2$ times the standard deviation of Gibbs samples $a_{i,t',(g)}$, $g=G_0+1, G_0+2, \cdots , G$ centered on the estimate $\hat{a}_{i,t'}$. The true ability values are mostly contained in the confidence interval, implying the validity of the ability inference. Further, we examine the performance of the ability inference using quantitative evaluation measures MSE (mean squared error), RMSE (root mean squared error), MAE (mean absolute error), and PCC (Pearson correlation coefficient) based on the true values $a_{i,t'}$ and the estimated values $\hat{a}_{i,t'}$ for all the artists $i=1,2,\cdots ,I$ and all the time points $t'=1,2,\cdots , T'$. The evaluation measures are computed by

$$\begin{aligned} \begin{aligned} MSE&= \frac{1}{I \cdot T'} \sum _{i=1}^I\sum _{t'=1}^{T'} \left( a_{i,t'} - \hat{a}_{i,t'}\right) ^2, \\ RMSE&= \sqrt{MSE}, \\ MAE&= \frac{1}{I \cdot T'} \sum _{i=1}^I\sum _{t'=1}^{T'} \left| a_{i,t'} - \hat{a}_{i,t'}\right| , \\ PCC&= \frac{\sum _{i=1}^I \sum _{t'=1}^{T'} (\hat{a}_{i,t'} - \bar{\hat{a}}) (a_{i,t'} - \bar{a})}{\sqrt{\sum _{i=1}^I \sum _{t'=1}^{T'} (\hat{a}_{i,t'} - \bar{\hat{a}})^2} \sqrt{\sum _{i=1}^I \sum _{t'=1}^{T'} (a_{i,t'} - \bar{a})^2}}, \end{aligned} \end{aligned}$$

where $\bar{a} = \frac{1}{I \cdot T'} \sum _{i=1}^I \sum _{t'=1}^{T'} a_{i,t'}$ and $\bar{\hat{a}} = \frac{1}{I \cdot T'} \sum _{i=1}^I \sum _{t'=1}^{T'} \hat{a}_{i,t'}$ are the average values of true and estimated ability values, respectively. We have $MSE=0.0129$, $RMSE=0.1137$, $MAE=0.0935$, and $PCC=0.9987$ for our synthetic dataset. The relatively small error-based measures of MSE, RMSE, and MAE indicate the true and estimated ability values are similar to each other considering the scale of the ability values. Also, we can see that PCC is very close to 1, indicating a strong positive correlation between the true and estimated ability values, which further supports the accuracy of the inference of the true ability values.
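For readers who wish to reproduce these measures, the formulas above translate directly into the following Python sketch. The function names are hypothetical; abilities are passed as $I \times T'$ nested lists, and the interval helper assumes the sample standard deviation of the post-burn-in Gibbs draws, which the text does not specify.

```python
import math
import statistics


def ability_metrics(true_a, est_a):
    """Compute MSE, RMSE, MAE, and PCC over all artists and time points."""
    t = [v for row in true_a for v in row]
    e = [v for row in est_a for v in row]
    n = len(t)
    mse = sum((a - b) ** 2 for a, b in zip(t, e)) / n
    mae = sum(abs(a - b) for a, b in zip(t, e)) / n
    t_bar, e_bar = sum(t) / n, sum(e) / n
    cov = sum((b - e_bar) * (a - t_bar) for a, b in zip(t, e))
    ss_e = sum((b - e_bar) ** 2 for b in e)
    ss_t = sum((a - t_bar) ** 2 for a in t)
    pcc = cov / (math.sqrt(ss_e) * math.sqrt(ss_t))
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "PCC": pcc}


def interval_from_samples(estimate, gibbs_samples, burn_in):
    """Estimate +/- 2 standard deviations of the draws after burn-in G_0."""
    kept = gibbs_samples[burn_in:]
    sd = statistics.stdev(kept)  # sample standard deviation (assumption)
    return estimate - 2.0 * sd, estimate + 2.0 * sd
```

Note that PCC is invariant to a constant shift of the estimates, so it should be read together with the error-based measures rather than on its own.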

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Park, J., Lee, Y., Yang, D. et al. Artwork pricing model integrating the popularity and ability of artists. AStA Adv Stat Anal (2024). https://doi.org/10.1007/s10182-024-00504-3

Received: 21 November 2023
Accepted: 19 June 2024
Published: 02 July 2024
DOI: https://doi.org/10.1007/s10182-024-00504-3
