Data sample
We collected real crowdfunding project data from Kickstarter.com to carry out our empirical analysis. We use Kickstarter mainly for two reasons. First, Kickstarter is a popular and prevalent crowdfunding platform. Founded in 2009, it has become one of the largest crowdfunding platforms in the world, with more than nine million backers, three million of whom are repeat backers. As of today, more than 93,000 projects have been successfully funded and more than two billion dollars have been raised (Kickstarter 2015). Second, the majority of research on crowdfunding uses Kickstarter data (Greenberg et al. 2013; Li and Duan 2014), which makes the comparison of our results against those previously reported more meaningful and reliable.
Kickstarter does not provide a public API (Application Program Interface), and non-live projects (e.g., completed, canceled) are not directly searchable. However, live projects are organized by category and are convenient to navigate. Our data collection consists of two main steps. First, starting in late 2012, we scraped "live" projects from the Kickstarter website using a specially developed crawler. The crawler visited the website every other day and captured all newly launched live projects. Second, we scraped project data from earlier years based on the live projects already collected. A project's profile page lists the historical projects created and backed by the owner, and comments and updates contain backers' information that leads to other projects they backed. Similar to the approach of Zvilichovsky et al. (2015), we used the "live" projects as seeds and iterated recursively from projects to backers and from backers to projects until the number of newly discovered projects per iteration converged. This step was performed periodically, whenever a sufficiently large number of new projects had been scraped.
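A minimal sketch of this snowball-style discovery, assuming hypothetical helpers `fetch_live_projects` and `fetch_projects_linked_to` that stand in for the actual scraping logic (Kickstarter offers no such API):

```python
# Snowball crawl: seed with live projects, follow owner/backer links,
# and stop when few new projects are discovered per iteration.
def snowball_crawl(fetch_live_projects, fetch_projects_linked_to,
                   min_new_per_iteration=10):
    known = set(fetch_live_projects())  # seed: "live" projects
    frontier = set(known)
    while frontier:
        discovered = set()
        for project_id in frontier:
            # Follow owner histories, comments, and updates to other projects.
            discovered |= set(fetch_projects_linked_to(project_id))
        new = discovered - known
        known |= new
        frontier = new
        if len(new) < min_new_per_iteration:  # convergence criterion
            break
    return known
```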
Our data sample covers all projects from 2009 to November 2014. We excluded projects whose funding campaigns were still ongoing (6559 projects). In addition, we excluded projects that were canceled (15,116 projects), purged (36 projects), or suspended (584 projects). Purged and suspended projects are usually handled by Kickstarter according to its policy or terms of use, whereas projects are canceled by owners for a variety of reasons. It is likely that the majority of projects were canceled because they were unlikely to reach their funding goals and owners wanted to avoid a dismal end (stonemaiergames.com 2013). A brief examination also finds that many projects were canceled because they were simply test projects, with unreasonably low funding goals (e.g., $1, $2) or durations (e.g., 1 day). Some projects were canceled because owners wanted to make improvements and re-launch, or because "amazing partners reached out" during the campaign (needwant.com 2016). Interestingly, some projects were canceled even after being successfully funded, because of unforeseen changes on the part of either owners or backers (themarysue.com 2016). We did not treat canceled projects as failures because they are not typical failed projects. Following a previous study (Mollick 2014), we also removed projects with a funding goal below $100 (1982 projects) or above $1,000,000 (294 projects), because such extremely small or large projects may have characteristics very different from the majority. Finally, we removed projects with fewer than 100 words in their descriptions because, upon inspection, they were either incomplete or represented non-serious efforts to raise funds. Our final data sample consists of 151,752 projects across all 15 funding categories. These steps are summarized in Table 2.
Table 2 Sample selection and description
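A minimal pandas sketch of these selection steps, assuming illustrative column and file names (`state`, `goal`, `description`) rather than the actual schema:

```python
import pandas as pd

projects = pd.read_csv("kickstarter_projects.csv")  # hypothetical file

# Keep only completed campaigns (drops live, canceled, purged, suspended).
projects = projects[projects["state"].isin(["successful", "failed"])]
# Drop extreme funding goals, following Mollick (2014).
projects = projects[(projects["goal"] >= 100) & (projects["goal"] <= 1_000_000)]
# Drop incomplete or non-serious descriptions (fewer than 100 words).
projects = projects[projects["description"].str.split().str.len() >= 100]
print(len(projects))  # 151,752 in the paper's final sample
```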
Table 3 shows the descriptive statistics for the variables used in this paper. The projects in our sample have an average funding goal of $15,126, with half of them requesting less than $5000. The average (median) campaign duration is 34 (30) days; 47 % of projects have at least one image, with an average (median) count of 4.67 (1); and 80 % of projects have at least one demo video, with an average (median) count of 1.18 (1). The results also show that descriptions have an average (median) length of 646 (482) words, are generally positive (with a net positive tone), and are easy to understand (with a Fog index around 13). In addition, although owners usually have some past experience, with several projects backed or created, their past expertise is very limited. On average, they have raised 22 percent of the required funds on their previous projects, if any (more than 75 % of projects are created by first-time owners).
Table 3 Descriptive statistics
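For reference, the Fog index reported in Table 3 is the standard Gunning formula, 0.4 × (average sentence length + percentage of complex words); a rough sketch, with syllables estimated heuristically from vowel groups:

```python
import re

def fog_index(text):
    """Gunning fog index: 0.4 * (words per sentence + % complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    def syllables(word):
        # Rough heuristic: count groups of consecutive vowels.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    complex_words = [w for w in words if syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))
```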
In the following sections, we present our empirical analysis from three angles. First, we briefly discuss the current status of crowdfunding on Kickstarter by category and by year. Second, we evaluate the incremental influence of the newly identified antecedents and report the practical improvement when they are included in predicting funding success. Third, we investigate the timeliness of project data and provide evidence that older project data is gradually becoming less relevant and losing predictive power for newly created projects.
Descriptive results
Overall funding status by category
Tables 4 and 5 present the status of crowdfunding on Kickstarter by category. Overall, the success rate of our sample projects is 46 %, which is comparable to that reported by Kickstarter (Kickstarter 2015). Most basic project properties vary across the 15 categories. Table 4 shows that, in terms of project count, the top three categories are Film & Video, Music, and Publishing, and the bottom three are Dance, Craft, and Journalism. However, a category attractive to owners is not necessarily attractive to backers. For example, Dance is one of the three categories least attractive to owners but has the highest success rate of all categories, possibly because of the low funds required and less market saturation (competition). On average, a project seeks to raise $14,541. Technology and Games require the highest funds, $39,073 and $27,520, respectively; Music and Crafts require the least, $7289 and $5794, respectively. It is also worth noting that, generally, categories requiring higher (lower) funds have lower (higher) success rates.
Table 4 Previously identified antecedents by category*
Table 5 Newly identified antecedents by category
Mollick (2014) reports that crowdfunding projects mostly succeed by narrow margins but fail by large amounts. We find this is more noticeable in popular categories with a large number of projects. For example, untabulated results show that only 3 % of funded projects in the Film & Video category have a funding ratio greater than 2 (two times the required fund), while that percentage is around 25 % in the Comics and Design categories. Duration varies little across categories, ranging from 33 to 36 days, possibly because Kickstarter has a default duration of 30 days and caps it at 60 days (it was 90 days before June 2011). Surprisingly, some projects have durations of less than 5 days. Further investigation shows they are mostly small projects, with funding goals below $500.
Table 5 presents the data on the new antecedents derived from project descriptions. Projects in Games have the longest descriptions (1124 words) and those in Music have the shortest (453 words). It is also worth noting that overall readability is high (with a Fog index around 13). A possible reason is that, compared to traditional business plans, project descriptions are more likely to be written in informal language. Both past experience and past expertise vary greatly across categories, reflecting the differing popularity of and competition among categories.
Overall funding status by year
Tables 6 and 7 present the status of crowdfunding on Kickstarter by year. We find that project properties are relatively more stable over time than across categories. The results show a clear trend: the number of projects, the number of backers, and the funding goal amounts all increase over time, though the growth rate is decreasing. This is reasonable because, as more users join Kickstarter and become familiar with the platform, more projects are created. In addition, the mutual trust between project owners and backers grows with the familiarity and maturity of the platform, so more expensive projects are likely to be funded in later years. The results also show that durations before 2011 are higher, which is consistent with the fact that Kickstarter allowed durations of up to 90 days before June 2011 but reduced the limit to 60 days afterward.
Table 6 Previously identified antecedents by year*
Table 7 Newly identified antecedents by year
The results in Table 7 show that, although the readability and objectivity of project descriptions are relatively stable, project owners are disclosing more information through project descriptions over time, with an exception in 2014. Specifically, the length of project descriptions increases from 405 to 718 words. Another interesting finding is the increasing trend of both past experience and past expertise. As shown in the results, the value of past experience increases consistently from 3.65 in 2009 to 8.15 in 2014, and the value of past expertise also increases consistently from 0.03 in 2009 to 0.32 in 2014. This is reasonable because, over time, project owners gain experience and expertise by backing and creating more projects, which makes their projects more persuasive and thus more likely to raise funds.
Impact of project descriptions on predicting funding success
Logistic regression results
To investigate whether the newly identified antecedents have an incremental influence on funding success, we run two logistic models. Model A represents the mainstream model, which includes only antecedents identified by previous research from basic project properties (i.e., control variables); Model B represents the proposed model, which additionally includes the exemplary antecedents identified by this study. The results are reported in Table 8 below.
Table 8 Logistic testing results of antecedents
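A sketch of the two specifications, assuming illustrative variable and file names rather than the paper's exact ones; note the quadratic tone term in Model B, which captures the curvilinear effect discussed below:

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("projects_sample.csv")  # hypothetical prepared sample

# Model A (mainstream): basic project properties only.
model_a = smf.logit(
    "success ~ log_goal + duration + reward_levels + fb_friends"
    " + images + videos", data=df).fit()

# Model B (proposed): adds the description-based antecedents,
# with I(tone**2) as the quadratic tone term.
model_b = smf.logit(
    "success ~ log_goal + duration + reward_levels + fb_friends"
    " + images + videos + log_length + readability + objectivity"
    " + tone + I(tone**2) + past_experience + past_expertise",
    data=df).fit()
print(model_b.summary())
```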
As shown in Model B (the proposed model), and consistent with the unimodel of persuasion, we find that antecedents identified from both the content of project descriptions (length, readability, and objectivity) and the owner traits reflected in them (pastExperience and pastExpertise) are significantly associated with funding success. Specifically, for every 1 % increase in length, the log odds of funding success increase by 0.38; the corresponding figures are 0.68 and 0.58 for a 1 % increase in past experience and past expertise, respectively. We find tone is positively associated with funding success. However, the quadratic term of tone has a negative coefficient, which indicates a curvilinear relationship between tone and funding success. In other words, moderate use of positive tone can demonstrate project owners' confidence and optimism and thus increases the likelihood of success, whereas excessive use of positive tone may weaken a project's credibility and have an adverse effect. These results are consistent with those reported by Parhankangas and Ehrlich (2014). Finally, we find readability is negatively associated with funding success, with a 1 % increase reducing the log odds by 0.05. This is puzzling because we expected a positive association, since a more readable project description is easier for potential backers to understand. However, as discussed in the descriptive statistics section, project descriptions are mainly written in informal language with a low Fog index (easy to understand). Under these circumstances, formally written project descriptions may signal the preparedness and professionalism of project owners (Chen et al. 2009), thus increasing backers' positive perceptions and the likelihood of funding success. Similar results have been reported by a previous study (Luo et al. 2013).
The results for the other antecedents are consistent between Models A and B. For example, a higher funding goal and a longer campaign duration are negatively associated with funding success; a higher number of reward levels has a positive influence on funding success; and, as expected, higher numbers of Facebook friends, images, and videos are also positively associated with funding success.
Evaluation of predicting performance
We first compare the prediction performance of our proposed model (Model B) with that of the mainstream model (Model A) discussed above. We use the entire data sample to train (via logistic models) and test the prediction performance. In addition to the accuracy rate, we use the F-measure to evaluate prediction performance. The F-measure considers both precision and recall and thus provides a balanced performance evaluation (Ferri et al. 2009; Sokolova and Lapalme 2009; Powers 2011). The results are reported in Table 9.
Table 9 Performance measures and confusion matrix
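A sketch of how the Table 9 metrics can be computed with scikit-learn, given true labels `y` and predicted labels `y_hat` (success = 1):

```python
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

def report(y, y_hat):
    # For binary labels, confusion_matrix ravels to tn, fp, fn, tp.
    tn, fp, fn, tp = confusion_matrix(y, y_hat).ravel()
    print(f"accuracy   {accuracy_score(y, y_hat):.2%}")
    print(f"F-measure  {f1_score(y, y_hat):.2%}")
    print(f"TPR {tp / (tp + fn):.2%}   TNR {tn / (tn + fp):.2%}")
    print(f"FPR {fp / (fp + tn):.2%}   FNR {fn / (fn + tp):.2%}")
```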
As shown in Table 9 Panel B, the proposed model has a prediction accuracy (F-measure) of 73.09 % (70.31 %), while the mainstream model has a prediction accuracy (F-measure) of 69.34 % (66.20 %). This indicates that the proposed model predicts funding success better. In addition, the confusion matrices of both models further show that the proposed model has higher true positive and true negative rates (69.3 % and 76.32 %) than the mainstream model (65.28 % and 72.79 %). Correspondingly, the proposed model has lower false positive and false negative rates (23.68 % and 30.7 %) than the mainstream model (27.21 % and 34.72 %). These results show that the newly identified antecedents have incremental predictive power.
However, training and testing a predictive model on the same data sample is not good practice; the recommended practice is to use different datasets for training and testing (Bengio and Grandvalet 2004). In this step, besides the proposed and mainstream models, we include a third model, called the baseline model, which is based on informed guessing. In the baseline model, we classify each project as "success" or "failure" simply according to the overall probability of funding success. For example, if 40 % of projects are successfully funded, the overall probability of funding success is 40 %; each project is then classified as "success" with probability 40 % and as "failure" with probability 60 %. We then calculate prediction performance by comparing projects' assigned status values (i.e., success or failure) with their true status values.
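A minimal sketch of this informed-guessing baseline:

```python
import numpy as np

# Classify each test project as "success" with probability equal to the
# training sample's overall success rate (informed guessing).
def baseline_predict(y_train, n_test, seed=0):
    rng = np.random.default_rng(seed)
    p_success = np.mean(y_train)            # e.g., ~0.46 in this sample
    return rng.random(n_test) < p_success   # True = predicted "success"
```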
For each predictive model, we employ an N-fold cross-validation test (with N set to 3, 5, and 10) to evaluate prediction performance. The N-fold cross-validation test has been widely used to validate classification performance (Bengio and Grandvalet 2004; Li 2008). For each N, our data sample is randomly divided into N parts and N experiments are performed, with N-1 parts used as training data for the predictive model to classify the remaining part. The average prediction performance is reported for the given N. The results of our N-fold cross-validation tests are reported in Table 10.
Table 10 N-Fold Cross-validation tests accuracy (F-Measure)
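A sketch of the N-fold test with scikit-learn, assuming `X` is the feature matrix of a given model and `y` the binary funding outcome:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# For each N, fit on N-1 folds and score the held-out fold, then average.
for n in (3, 5, 10):
    scores = cross_val_score(LogisticRegression(max_iter=1000),
                             X, y, cv=n, scoring="f1")
    print(f"{n}-fold mean F-measure: {scores.mean():.2%}")
```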
The results show that our proposed model achieves the highest performance, with an average accuracy rate (F-measure) around 73 % (70 %). The average accuracy rate (F-measure) of the mainstream model is around 69 % (66 %), and that of the baseline model around 59 % (57 %). This indicates that the proposed model improves on the informed-guessing baseline by roughly 14 percentage points and on the mainstream model, based on basic project properties, by 4 percentage points. The differences among the three models are statistically significant under the t-test. More importantly, considering that the mainstream model beats the baseline model by only 9 percentage points (66 % vs. 57 %), the 4 additional percentage points gained by our proposed model (70 % vs. 66 %) are fairly substantial, representing 44 % (i.e., 4 divided by 9) of the mainstream model's improvement over informed guessing. Together, these results show that our newly introduced variables have a significant practical impact on predicting the funding success of projects.
Both accuracy and the F-measure are designed to capture the overall performance of predictive models. Sometimes, however, more specific information is needed to make funding decisions. This is especially true when we evaluate prediction performance from the perspective of project owners. Because of limited time and resources, project owners may not be interested in the overall success rate; instead, they care more about whether their projects, if predicted to succeed, will truly be successfully funded. In other words, they want a predictive model with a high true positive rate and a low false positive rate. Although this information has been reported in the confusion matrices, a useful visual illustration is the ROC (Receiver Operating Characteristic) curve, which plots the true positive rate against the false positive rate at various threshold settings (Fawcett 2006). The results are illustrated in Fig. 1.
As shown in Fig. 1, compared to the mainstream model, our proposed model has an ROC curve that is more convex toward the upper left. This indicates that the proposed model has a higher true positive rate and a lower false positive rate, making it more useful for project owners who want to adjust their project settings and evaluate the likelihood of funding success before launch.
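A sketch of how a figure like Fig. 1 can be produced, assuming `p_a` and `p_b` hold each model's predicted success probabilities for the same true labels `y`:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

for label, p in [("mainstream (Model A)", p_a), ("proposed (Model B)", p_b)]:
    fpr, tpr, _ = roc_curve(y, p)   # TPR vs. FPR across thresholds
    plt.plot(fpr, tpr, label=label)
plt.plot([0, 1], [0, 1], "k--", label="chance")
plt.xlabel("false positive rate")
plt.ylabel("true positive rate")
plt.legend()
plt.show()
```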
Predictive power of project data over time
In the previous subsections, to ensure comparability, we followed previous studies and conducted our analysis ignoring the temporal (i.e., time) information of projects. In other words, all projects were put in a single pool and predictions ran in both directions: older project data were used to predict newer projects and, at the same time, newer projects were used to predict older ones. This can be seen clearly in our N-fold cross-validation test, in which the total sample is randomly divided into N parts without considering each project's creation time. When predicting the funding success of a real project, however, the only data available are the historical data preceding that project, and it seems unreasonable to use future project data to train the predictive model and predict the funding success of past projects.
On the other hand, as a young crowdfunding platform, Kickstarter has experienced great changes since its inception, in areas such as system functions and platform policy. In addition, both backers and owners have changed their behaviors considerably through their use of the platform. Furthermore, the numbers of users and projects have grown drastically over time, changing the competitive environment of crowdfunding. These changes make us wonder whether project data from earlier years have become "out of date" and lost predictive power for future project success, and whether the sub-sample of project data immediately preceding the projects being predicted contains the most relevant information and has the highest predictive power.
To answer these questions, we slice the whole sample (2009 to 2014) by month and construct narrower but sufficiently large subsamples (e.g., 6 or 12 months; see Footnote 4), then analyze how project data from different subsamples affect prediction performance. We construct six subsamples, each consisting of one year's project data, from 2009 to June 2014 (the last subsample contains only 6 months' data). Each subsample is used as training data to predict the funding success of projects launched between July and November 2014 (our data sample ends in November 2014).
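A sketch of this temporal evaluation, assuming a prepared frame `df` with a datetime launch column and a feature list `FEATURES` (both hypothetical names):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Hold-out test set: projects launched July-November 2014.
test = df[(df["launched"] >= "2014-07-01") & (df["launched"] < "2014-12-01")]

for year in range(2009, 2015):
    start = f"{year}-01-01"
    # The 2014 training slice covers January-June only.
    end = "2014-07-01" if year == 2014 else f"{year + 1}-01-01"
    train = df[(df["launched"] >= start) & (df["launched"] < end)]
    clf = LogisticRegression(max_iter=1000).fit(train[FEATURES],
                                                train["success"])
    score = f1_score(test["success"], clf.predict(test[FEATURES]))
    print(year, f"{score:.2%}")
```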
Figure 2 presents the prediction performance (F-measure) obtained by training on each year's data from 2009 to 2014. Consistent with our conjectures, we find that, overall, prediction performance increases over time for both the mainstream and proposed models. We exclude the informed-guessing model from this analysis because its performance depends only on the success rate of each year, which increases and then decreases, as shown in Table 6. The figure shows two larger jumps, in 2010 and 2014. In addition to the fact that the 2009 project data are the oldest relative to the 2014 projects, another possible reason is that, since Kickstarter was founded in 2009, the project data from that year may contain more noise and inconsistency. Together, these two factors may explain why the prediction performance obtained with 2009 data is much lower than that of other years. The performance jump in 2014, on the other hand, mainly reflects the timeliness of data, because the first half-year's project data are used as training data to predict the projects of the second half-year. The results also show that, from 2010 to 2013, although the improvement is relatively small, the overall trend of prediction performance is clearly increasing. Together, these results provide evidence that historical project data is gradually becoming less relevant and losing predictive power for newly created projects. We also replaced the F-measure with the accuracy rate and found a similar pattern.