Crowdfunding research investigates the principles and interactions of donors, backers, or investors, often referred to collectively as the “crowd,” with project initiators, entrepreneurs, artists, or any individual or organization seeking financial support via platforms [1,2,3]. Following established definitions of crowdfunding, we hereafter term these backers, projects, and platforms [4, 5]. Irrespective of the type of crowdfunding (equity-based, lending-based, or reward- and donation-based; see, e.g., [6,7,8]), previous research on crowdfunding has produced a plethora of valuable findings for theory development and practice. Among these, researchers have consistently looked at the effects of project features on crowdfunding success variables over time [9,10,11,12,13]. However, what has been widely ignored is the type of relationship that project features possess with the respective success variables. So far, the focus has been on the direction and significance of certain project characteristics. For example, project initiators may ask how many updates are required to maximize the likelihood of achieving the funding goal, irrespective of their distribution over the funding period. One can argue that “more is always better,” resulting in attempts at daily updates. Conversely, backers may grow weary of numerous updates to work through, indicating a possible wear-out effect and implying the existence of an optimal level of updates. Other variables that can be controlled or at least affected by project initiators may follow other patterns. For instance, Metcalfe’s law [14], which states that the value of a social network is proportional to the square of the number of its members, implies a potential quadratic relationship between social media variables like Facebook likes and project success. In these cases, the “more is always better” logic may be advantageous, and project initiators are well advised to switch from variables with declining impact to social media variables. 
In a methodological sense, both examples indicate non-linear effects that cannot be captured by the predominantly linear (logit) regression methods used in previous research [15]. Further and notwithstanding the importance of time-series effects, research on the non-linear nature of relationships between variables and crowdfunding success is scarce [16].
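To illustrate why a purely linear specification can miss such effects, the following minimal Python sketch fits a straight line and a quadratic to synthetic data with an inverted U-shape. The data, the peak at five updates, and the helper names (`solve_linear_system`, `fit_polynomial`, `r_squared`) are illustrative assumptions and are not taken from the Kickstarter analysis.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_polynomial(xs, ys, degree):
    """Ordinary least squares for y = c0 + c1*x + ... via the normal equations."""
    powers = range(degree + 1)
    xtx = [[sum(x ** (i + j) for x in xs) for j in powers] for i in powers]
    xty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in powers]
    return solve_linear_system(xtx, xty)


def r_squared(xs, ys, coefs):
    """Share of variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    fitted = [sum(c * x ** i for i, c in enumerate(coefs)) for x in xs]
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic (made-up) inverted U: the outcome peaks at five updates.
updates = list(range(11))
success = [25 - (u - 5) ** 2 for u in updates]

linear = fit_polynomial(updates, success, 1)      # [intercept, slope]
quadratic = fit_polynomial(updates, success, 2)   # [c0, c1, c2]
peak = -quadratic[1] / (2 * quadratic[2])         # vertex of the parabola

print("linear slope:", linear[1], "R2:", r_squared(updates, success, linear))
print("quadratic peak:", peak, "R2:", r_squared(updates, success, quadratic))
```

On this symmetric toy example, the linear slope is essentially zero even though the outcome depends strongly on the predictor, whereas the quadratic term recovers the location of the peak.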

Hence, we try to fill this void by looking at non-linear effects of variables in a large-scale sample from the reward-based crowdfunding platform Kickstarter consisting of approximately 294,000 projects between 2009 and 2016. The intent of the paper is to help improve the probability of success of crowdfunding projects. Since little is known about non-linear effects, we follow an explorative approach, applying non-linear generalized additive models that are compared to a traditional linear model. With current advances in research methodologies and data availability, an oversimplified linear relationship loses its validity. We therefore try to provide a first step within the crowdfunding domain, but urge researchers to adopt these methodologies in order to question, confirm, or disprove established relationships and findings.

The paper is structured as follows. The “Theoretical background” section deals with the theoretical approaches that can explain the expected non-linear effects in information systems research and related disciplines. The “Methodology” section introduces the model and the preliminary results. Finally, conclusions and discussion are presented.

Theoretical background

Previous research

Only a few researchers have investigated non-linear effects on crowdfunding success so far. Most explicitly, Gleasure and Feller [15] investigate 5736 campaigns from a donation-based charity platform with curvilinear (U-shaped) regressions. They find that comments as well as donations from project initiators to other projects affect funding following an inverted U-shape (i.e., moderate amounts improve funding). In contrast, anonymous donations show a U-shaped effect on funding (i.e., low or high amounts improve funding). Kuppuswammy and Roth use survey data from 284 Kickstarter projects (reward-based) and establish a marginal effect of funding on external financing following a cubic shape (increasing first, then declining), thus a downstream effect of funding success [6]. Ward and Ramachandran apply squared terms for the interactions of project age with comments and project age with updates in 3865 projects from a donation-based platform [17]. In this inherently time-series study, these simplest types of non-linear proxies of time and comments/updates have not yielded any significant results. Using a probit model, that is, explicitly modeling funding as the probability of an individual pledge, Giudici et al. [18] find a negative non-linear relationship of geo-social capital (i.e., the variance of backers’ origins) with funding behavior (decreasingly negative), which is even more pronounced when the backers have many Facebook friends (individual social capital). Here, 699 backers from 11 Italian crowdfunding platforms (of unknown type) that provided further data have been investigated. Finally, and focusing on geographic aspects as well, Agrawal et al. investigate 4712 projects from a donation-based platform and find multiple non-linear effects of the amount of investments already pledged on the investment probability of backers [19]. 
For instance, locally distant backers have a progressively higher probability of investing for increasing amounts of investments, while locally close backers are degressively less likely to invest for increasing amounts of investments. To the best of our knowledge, these effects have not yet been modeled as non-linear (amounts of investments were grouped). Overall, these studies indicate a substantial lack of research. Predominantly charity- or artist-based platforms and a focus on variables that cannot be affected by project initiators (e.g., geographic location of backers) raise questions about the generalizability and robustness of previous effects. Insights from a more variable-rich, reward-based platform may thus help to overcome these issues.

Possible effects in crowdfunding

So far, no theory or theoretical approach has been provided that can explain the expected non-linear effects in information systems research and related disciplines. We therefore follow an exploratory approach in this study that aims to stimulate theory development by outlining four potential theoretical explanations for non-linear instead of linear effects.

First, optimum stimulation level theory [20] stipulates that an individual has the highest performance (i.e., awareness) at moderate levels of stimulation (arousal), while lower or higher levels lead to less awareness. This theory has been found to explain diverse phenomena such as information search behavior [21] or website complexity perception [22]. Transferred to crowdfunding, the inverted U-shaped effect of comments on funding success [15] can be explained by this theory as well. Only a few comments may possess low value for potential backers as they are too limited in their content, resulting in a low level of stimulation and thus low awareness. Conversely, a very large number of comments may be perceived as too much to work through in a given time (cognitive overload) and thus leads to overstimulation and low awareness. Consequently, a moderate number of comments may be neither too trivial nor too much information for an interested investor and yields increased awareness.

Second, Metcalfe’s law assumes that the value of a (social) network grows with the square of the number of its users [14]. Network effects as well as viral effects have already been used to explain dynamics in social networks such as Facebook or Twitter [23]. As outlined above, the mere number of supporters in the form of Facebook likes, shares, or Twitter tweets may be seen as an indicator of support for the project.
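As a short worked illustration of the quadratic reading of the law given in the introduction (value as the square of the number of members), one common way to motivate it is to count the possible pairwise connections among n members. The symbol C(n) is notation introduced here for illustration only:

$$ C(n)=\binom{n}{2}=\frac{n\left(n-1\right)}{2}\approx \frac{n^2}{2},\kern2em \frac{C(2n)}{C(n)}=\frac{2\left(2n-1\right)}{n-1}\to 4\kern0.5em \mathrm{as}\kern0.5em n\to \infty $$

Under this assumption, doubling the number of supporters roughly quadruples the number of possible connections, which is the sense in which a “more is always better” logic could hold for social media variables.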

Third, in financing markets with asymmetric information between entrepreneurs and backers like crowdfunding, any information contains a signal to the potential investor [24,25,26]. Some factors (e.g., Facebook friends) can clearly be expected to either increase the signaled quality or at least do no harm the higher they become [27]. However, others like project comments might signal bad preparation or too much uncertainty once the number exceeds a certain threshold. For example, in their fsQCA analysis on different paths to success, Kraus et al. only found the number of comments relevant for one out of three configurations leading to success [13]. Increased perceived uncertainty of projects reduces funding significantly [28].

Fourth and finally, a “saddle” effect is plausible from wear-out effects [29]. For example, marginal utility theory [30] states that additional consumption of an item with an objectively constant value can lead to complete saturation. Applied to information systems research, it has been found that content contribution in social media can saturate [31]. In a crowdfunding context, it is possible that multiple viewable project updates (each of which may be justified and contain the same amount of information with constant quality) wear out, as (interested) backers may find the project information repetitive. It should be noted that these explanations are not mutually exclusive. Posting an enormous number of updates can lead to cognitive overload and thus reduce their value to zero. It is also plausible that there is a “critical mass” [32] after which a former saddle effect becomes exponential or an exponential effect becomes saturated. To the best of our knowledge, these effects have not been investigated before in a time-invariant setting.

Methodology

Due to the implications from previous research, we selected a large-scale dataset from a reward-based platform, Kickstarter. The data was obtained using a self-programmed web crawler that collected a wide set of variables that a project initiator can affect, for projects from Kickstarter’s start in 2009 to the end of 2016. Overall, 294,150 valid projects were retrieved. Film or video projects (19%, n = 54,525) are most frequent, followed by music (16%, n = 45,606), publishing (11%, n = 31,255), games (8%, n = 23,964), and technology (8%, n = 22,584). Projects have an average funding period of 34.32 days (SD = 13.10) and an average goal of 45,961.35 USD (SD = 1,139,705.09) and are backed by 90.61 investors on average (SD = 782.16) who pledge a mean of 7440.47 USD (SD = 75,234.47). That yields a mean success rate of 36.09% (SD = 48.02). Furthermore, projects provide on average 2.54 updates (SD = 4.55) and 7.60 rewards (SD = 4.83). Kickstarter users posted 27.56 comments (SD = 1010.63), 156.55 (SD = 1391.13) Facebook shares, and 37.89 (SD = 646.29) Twitter tweets per project. Creators had on average 820.26 (SD = 978.18) Facebook friends. All descriptive information corresponds to previous research [11, 33].


Non-linear effects can be incorporated into a variety of model types. However, incorporating non-linear effect terms in linear models, usually via polynomials, requires hypotheses about their nature (e.g., exponential, cubic) and successive significance testing. Since significance testing is not advisable in large-scale datasets [34], our approach is inherently explorative. As there is no theoretical assumption about which variable follows which non-linear pattern, a family of non-linear models is chosen that matches this exploratory approach. Generalized additive models (GAM) [35] try to find segments with (unique) non-linear patterns and aggregate these segments into a continuous function. Recent developments [36] have improved GAMs substantially and eased their application to the present type of dataset. We have selected two GAMs to estimate our models. The first GAM uses simple cubic (third-degree polynomial) b-splines fitted to data up to the 95th percentile to reduce the sensitivity to outliers. Hence, these models, termed “b-splines,” will show a strongly “smoothed” general trend among the variables. A second GAM uses low-rank isotropic smoothers using thin plates as penalty parameters to avoid oversaturation. Hereafter termed “tp-splines,” these advanced GAMs will produce a less smoothed, more data-driven trend. To compare the results with linear models, a traditional linear regression model using maximum likelihood estimators is applied, and the same quantile restriction is used.
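The two preparatory steps described above, restricting a variable to its 95th percentile and expanding it in a cubic b-spline basis, can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions: the nearest-rank percentile rule, the knot positions, and the function names (`trim_at_percentile`, `bspline_basis`, `cubic_basis_row`, `spline_value`) are hypothetical and do not reproduce the software or settings used for the reported models.

```python
def trim_at_percentile(values, percent=95):
    """Keep values at or below the nearest-rank percentile cutoff (assumed rule)."""
    ordered = sorted(values)
    rank = -(-percent * len(ordered) // 100)  # integer ceiling of percent*n/100
    cutoff = ordered[max(rank, 1) - 1]
    return [v for v in values if v <= cutoff], cutoff


def bspline_basis(i, degree, knots, x):
    """Value of the i-th B-spline basis function at x (Cox-de Boor recursion)."""
    if degree == 0:
        # Half-open intervals: the basis is zero exactly at the upper boundary.
        return 1.0 if knots[i] <= x < knots[i + 1] else 0.0
    left = right = 0.0
    left_den = knots[i + degree] - knots[i]
    if left_den > 0:
        left = (x - knots[i]) / left_den * bspline_basis(i, degree - 1, knots, x)
    right_den = knots[i + degree + 1] - knots[i + 1]
    if right_den > 0:
        right = ((knots[i + degree + 1] - x) / right_den
                 * bspline_basis(i + 1, degree - 1, knots, x))
    return left + right


def cubic_basis_row(x, interior_knots, lower, upper):
    """All cubic basis function values at x for a clamped knot vector."""
    degree = 3
    knots = [lower] * (degree + 1) + list(interior_knots) + [upper] * (degree + 1)
    n_basis = len(knots) - degree - 1
    return [bspline_basis(i, degree, knots, x) for i in range(n_basis)]


def spline_value(x, weights, interior_knots, lower, upper):
    """Smooth term f(x) as a weighted sum of the basis functions."""
    row = cubic_basis_row(x, interior_knots, lower, upper)
    return sum(w * b for w, b in zip(weights, row))


# Hypothetical example: counts 1..100, trimmed, then one basis row at x = 4.
kept, cutoff = trim_at_percentile(list(range(1, 101)))
row = cubic_basis_row(4.0, interior_knots=[5.0, 10.0], lower=0.0, upper=20.0)
print(cutoff, len(kept), row)
```

In a fitted model, the weights passed to `spline_value` would be estimated from the data; within the knot range, the basis functions are non-negative and sum to one, so the weights alone determine the shape of the curve.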

Our modeling resembles previous approaches [10,11,12, 37], i.e., we measure crowdfunding success as the ratio of amount pledged to goal, and we add a variety of control variables such as the log of goal, starting year, starting month, category, number of backers, and duration of the project. Further, we incorporate the numbers of updates, comments, rewards, Facebook friends, Facebook shares, and Twitter tweets as the focal variables that can be affected, at least to some degree, by project initiators, with (b-spline, tp-spline) or without (linear model) a non-linear term. The general notation is therefore (consecutive numbering of parameters β for all values X):

$$ {Y}_{\mathrm{success}}={\beta}_1\log \left({X}_{\mathrm{goal}}\right)+{\beta}_2{X}_{\mathrm{year}}+{\beta}_3{X}_{\mathrm{month}}+{\beta}_4{X}_{\mathrm{category}}+{\beta}_5{X}_{\mathrm{backers}}+{\beta}_6{X}_{\mathrm{duration}}+{\omega}_1\ast {\beta}_7{X}_{\mathrm{updates}}+{\omega}_2\ast {\beta}_8{X}_{\mathrm{comments}}+{\omega}_3\ast {\beta}_9{X}_{\mathrm{rewards}}+{\omega}_4\ast {\beta}_{10}{X}_{\mathrm{FB}\ \mathrm{friends}}+{\omega}_5\ast {\beta}_{11}{X}_{\mathrm{FB}\ \mathrm{shares}}+{\omega}_6\ast {\beta}_{12}{X}_{\mathrm{Twitter}\ \mathrm{tweets}} $$

Omega (ω) is the additional term for either a b-spline (non-linear), a tp-spline (non-linear), or a linear model (ω = 1). Contrast categories are used for year (2009 to 2016), month (January to December), and category (15 categories ranging from art to theater). For the same reasons as explained before, we refrain for the time being from model comparison tests within nested models of the same type (for instance, more parsimonious linear models) and from relying on significance tests.
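Read in standard GAM notation, the specification can be sketched as follows. The intercept β0, the smooth functions f_j, the basis functions B_k, the basis weights γ_jk, and the index sets are notation introduced here for illustration and are not part of the original specification; the first term is read as the coefficient times the log of the goal, as described in the text.

$$ {Y}_{\mathrm{success}}={\beta}_0+{\beta}_1\log \left({X}_{\mathrm{goal}}\right)+\sum \limits_{c}{\beta}_c{X}_c+\sum \limits_{j=1}^6{f}_j\left({X}_j\right),\kern2em {f}_j(x)=\left\{\begin{array}{ll}{\beta}_jx& \mathrm{linear}\ \mathrm{model}\\ {}\sum \limits_{k=1}^{K_j}{\gamma}_{jk}{B}_k(x)& \mathrm{b}\hbox{-} \mathrm{spline}\ \mathrm{or}\ \mathrm{tp}\hbox{-} \mathrm{spline}\ \mathrm{model}\end{array}\right. $$

Here c runs over the remaining control variables (year, month, category, backers, duration) and j over the six focal variables (updates, comments, rewards, Facebook friends, Facebook shares, Twitter tweets). In this reading, the linear model is the special case in which each smooth term reduces to a single slope.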

Preliminary results

Non-linear and linear effects

In order to check the validity of our model, we first provide the estimates in Table 1. The analysis shows results consistent with previous research [11, 38] regarding the direction and significance of the coefficients (Table 1).

Table 1 Coefficients. Non-linear and linear effects

Interpreting a coefficient based on a single number is often misleading. Therefore, as the non-linearity of the effects is difficult to ascertain from the table, we now turn towards illustrative evidence. Following Lin et al., who pointed out that studies with large sample sizes should not rely solely on p values, we also provide visual evidence for our models to illustrate the effects [34].

Figure 1 depicts the preliminary results regarding the non-linearity of updates, comments, rewards, Facebook friends, Facebook shares, and Twitter tweets based on their predicted values from the full models (with control variables). Most remarkably, none of the six variables clearly follows a linear pattern:

  • Updates’ effect on crowdfunding success is degressive both from a general trend (b-spline) and from a more data-driven trend (tp-spline) with a maximum of five updates per project (point of saturation). Despite marginal differences for ten and more updates in both splines, a wear-out effect is obvious compared to linear regression assuming a positive relationship.

  • Comments seem to be beneficial up to a maximum of four (tp-spline) and eight (b-spline) comments. Beyond that, backers’ comments continuously lose importance until they become detrimental. We prefer the tp-spline relationship here, as residuals increased somewhat for larger numbers of comments in b-splines, indicating poorer prediction. Taken as a whole, this finding may be contrary to Gleasure and Feller, who have found an inverted U-shape [15]. However, it is plausible that a limited number of comments in their data may have concealed the latter relationship. In the range of 0 to 20 comments, the inverted U-shape is apparent. Too many comments might deter backers, as this could signal ambiguity about the projects’ goal and characteristics.

  • Rewards provided by projects follow an inverted U-shape as well, in a very stretched way for both non-linear models. Remarkably, their maximum effect differs: seven rewards for the more sensitive tp-spline and nine for the b-spline. Linear regression again assumes a positive effect. That is, scarce as well as abundant numbers of rewards are detrimental compared to moderate quantities. This makes sense, as choice clutter can create confusion among backers and therefore diminish returns (Tiwana [39]).

  • Facebook friends seem to have a slight U-shaped relationship with crowdfunding success according to the non-linear models, in contrast to the linear model, which cannot find a substantial effect. Thus, only the “stars” with an enormous number of social media supporters (maximum 4859) benefit from their popularity. As with both subsequent variables, this effect is rather subtle.

  • In contrast to this, Facebook shares illustrate a degressive effect with local maxima at 583 (tp-spline) and 1530 (b-spline) shares, hence supporting the assumption of a wear-out effect. This is considerably lower than the implication from the linear regression, which indicates increasing importance of Facebook forwarding. A possible explanation might be that extremely high values of Facebook shares point towards fraud, as proposed by Wessel et al. [38].

  • Lastly, Twitter tweets follow a shape comparable to that of Facebook shares in the linear model, while both non-linear models once again indicate a wear-out effect for larger numbers of tweets. Maximum efficiency is achieved at quite different values: 2064 (b-spline) or 365 (tp-spline) tweets. Again, linear regression assumes that more tweets result in more success.
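The maxima and points of saturation reported in the list above correspond to the location at which a fitted curve takes its largest predicted value. A minimal sketch of how such a point can be read off a fitted smooth on a grid is shown below. The curve `hypothetical_updates_effect` is a made-up stand-in for a fitted spline, and both function names are assumptions for illustration; neither reproduces the estimated models.

```python
def argmax_on_grid(curve, lower, upper, steps=1000):
    """Return (x, y) of the largest curve value on an evenly spaced grid."""
    best_x, best_y = lower, curve(lower)
    for k in range(1, steps + 1):
        x = lower + (upper - lower) * k / steps
        y = curve(x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y


def hypothetical_updates_effect(x):
    """Made-up inverted-U effect curve; stands in for a fitted smooth term."""
    return 0.4 * x - 0.04 * x ** 2


best_x, best_y = argmax_on_grid(hypothetical_updates_effect, 0.0, 20.0)
# A maximum strictly inside the range suggests saturation or wear-out;
# a maximum at the upper boundary would instead suggest "more is better"
# within the observed range.
is_interior = 0.0 < best_x < 20.0
print(best_x, best_y, is_interior)
```

The grid resolution limits the precision of the located maximum, so differences of the kind reported between the b-spline and tp-spline maxima should be interpreted with the smoothness of each curve in mind.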

Fig. 1 Non-linear and linear effects

Goodness of fit and explained variance

In order to evaluate the advantage or disadvantage of all types of models, we apply global as well as local criteria. Fit tests like a chi-square-based likelihood test are inappropriate because of the different model assumptions as well as the large dataset. We thus apply the information criteria BIC and log-likelihood. Further, we rerun isolated effect models to obtain estimates for the relative contribution of each of the six variables. Table 2 summarizes the model comparisons.
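For reference, the Bayesian information criterion combines the log-likelihood with a penalty for model complexity; lower values indicate a better trade-off. The short sketch below uses the textbook formula. The function name `bic` and all numeric inputs are hypothetical and are not values from Table 2; for penalized smooths, an effective number of degrees of freedom typically takes the place of the raw parameter count.

```python
import math


def bic(log_likelihood, n_parameters, n_observations):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_parameters * math.log(n_observations) - 2.0 * log_likelihood


# Hypothetical comparison of two models fitted to the same 1000 observations.
bic_linear = bic(log_likelihood=-520.0, n_parameters=12, n_observations=1000)
bic_spline = bic(log_likelihood=-480.0, n_parameters=20, n_observations=1000)

# The model with the lower BIC is preferred despite using more parameters
# if its gain in log-likelihood outweighs the complexity penalty.
preferred = "spline" if bic_spline < bic_linear else "linear"
print(bic_linear, bic_spline, preferred)
```

Because the penalty grows with the logarithm of the sample size, additional spline flexibility has to buy a correspondingly larger gain in log-likelihood in very large samples.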

Table 2 Goodness of fit

Overall, both non-linear models were substantially better able to predict success through updates and comments while being on par with or slightly better than the linear model for all other variables (rewards, Facebook friends and shares, Twitter tweets). Consequently, the share of variance explained improved from .51 to .55. That does not appear to be much; nonetheless, successful crowdfunding is substantially determined by vital project (e.g., goal, category), time (year, month, duration), and participation aspects (backers). Equal aggregated R2 values of .45 confirm the importance of these control variables. As a consequence of better prediction, BIC and log-likelihood indicate that the data is best predicted by the rather sensitive tp-spline model, which is also the most parsimonious (BIC, d.f.).


Implications for practice

A common theme in our analysis of 294,150 projects from the crowdfunding platform Kickstarter is that the “more is always better” logic indicated by positive linear relationships is consistently erroneous for factors that project initiators can at least affect or stimulate to some degree. It could be shown that non-linear (tp-spline) models capture the underlying effects of updates, comments, rewards, Facebook friends, Facebook shares, and Twitter tweets in Kickstarter data better than linear models. Consequently, most factors show a wear-out effect. Crowdfunding success is maximized for five updates, four comments, seven rewards, around 600 Facebook shares, and approximately 400 Twitter tweets. Only Facebook friends provide distinct value for very large quantities (~ 5000 friends), but with increased funding success for projects that have fewer than 500 friends (U-shape). Our analyses may guide practitioners to better manage participation and social media targets, especially in the environment of very scarce resources that is typical for crowdfunding projects. For example, if social media response is already high and no or only marginal further improvement can be made via Facebook and Twitter dissemination, project initiators should focus on using resources to provide initial updates and expand rewards to the anticipated limits. Since we have controlled for various background effects (e.g., funding goals), used representative data for a reward-based crowdfunding platform, and found results consistent with previous research, we assume that our predictions provide a reliable basis for practitioners in terms of allocating resources to the strings that should be pulled.

Implications for further research progress

Our preliminary analyses of non-linear effects in crowdfunding are still at an early stage for several reasons. First, due to the outlier sensitivity of non-linear as well as linear models, we have so far used only the .95 quantiles of all Kickstarter data. Exploratory analyses with .99 quantile data revealed comparable effects, but further outlier detection instead of restrictive sampling is on the agenda for future analyses. Second, modeling should be expanded to both simpler and more complex models and further advanced techniques. Most importantly, investigating fixed and random effects in mixed models [36] may allow a better representation of the underlying effects in Kickstarter and other crowdfunding platforms. Non-linear interaction effects, despite being rather difficult to understand, are also on our roadmap. By the same token, panel data can help to better differentiate time-specific and time-invariant effects. For example, project updates can develop a wear-out effect over time when backers and users are annoyed by untimely updates, as well as a stable wear-out effect when there are multiple updates to work through on the project page. Third, and notwithstanding the important implications for practice, theory development should be strengthened to explain the effects found. Since research on non-linear effects is so scarce, we have not found enough theoretical anchors and concepts to fully understand the theoretical implications. Again, we hope to achieve progress on this issue. However, our exploratory approach and preliminary results, which show that the “more is always better” logic might be erroneous or at least misleading, may help to stimulate discussions and dissemination, thereby advancing theory development, extending replicative analyses, and improving guidelines aimed at practitioners.