An Intelligent System for Insider Trading Identification in Chinese Security Market

Computational Economics

Abstract

Insider trading is a form of criminal behavior in security markets and has existed since such markets were born. As of 2018, the Chinese security market had a history of less than 30 years; nonetheless, insider trading occurred frequently. In this study, we explore the features of insider trading behavior by studying relevant indicators during the sensitive period (the time window before the release of insider information). For this purpose, we propose an intelligent system that integrates Principal Component Analysis (PCA) and Random Forest (RF) to identify insider trading in the Chinese security market. In the proposed method, we first collect twenty-six relevant indicators for insider trading samples that occurred from 2007 to 2017 and for corresponding non-insider trading samples in the Chinese security market. Next, PCA reduces the indicator dimension and extracts the principal components. The RF algorithm then learns the relations between insider trading samples and the principal components. In the identification phase, the trained PCA-RF model classifies insider trading and non-insider trading samples and analyzes the relative importance of the indicators for insider trading identification. Experimental results show that, under time window lengths of 30, 60, and 90 days, the recall of the proposed method for out-of-sample identification was 73.53%, 83.87%, and 79.41%, respectively. We further investigated the voting threshold of RF and found that when it was raised above 70%, the proposed method achieved an identification accuracy above 90%. In addition, the relative importance results of RF indicated that three indicators were crucial for insider trading identification. Moreover, the identification accuracy and efficiency of the proposed method were substantially superior to those of the benchmark methods.
In summary, the experimental results indicate that the proposed method can be applied efficiently to the Chinese security market and can therefore provide useful suggestions to market regulators for insider trading investigations.
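
The voting threshold mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes that each tree in the forest casts a 0/1 vote and that a sample is flagged as insider trading only when the share of "insider" votes exceeds the threshold; the function names and the vote data are hypothetical and are not taken from the paper.

```python
def insider_vote_share(tree_votes):
    """Fraction of trees that voted 'insider' (1) for one sample."""
    return sum(tree_votes) / len(tree_votes)


def flag_insider(tree_votes, threshold=0.5):
    """Flag a sample only when the insider vote share exceeds the threshold."""
    return insider_vote_share(tree_votes) > threshold


# Hypothetical votes from a 10-tree forest for three samples (1 = insider).
samples = {
    "A": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],  # 90% of trees vote insider
    "B": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],  # 60%
    "C": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],  # 20%
}

for threshold in (0.5, 0.7):
    flagged = [name for name, votes in samples.items()
               if flag_insider(votes, threshold)]
    print(f"threshold={threshold:.0%}: flagged {flagged}")
```

In this toy data, raising the threshold from 50% to 70% drops sample B and keeps only the sample on which the trees agree most strongly. This shows the general trade-off only; it does not reproduce the reported results.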

Abbreviations

PCA: Principal component analysis
PC: Principal component
RF: Random forest
CSRC: China Securities Regulatory Commission
SVM: Support vector machine
KNN: K nearest neighbors
ANN: Artificial neural network
NB: Naive Bayesian
DT: Decision tree
SEC: Securities and Exchange Commission
PRA: Prudential Regulation Authority
CARTs: Classification and regression trees
FPR: False positive rate
ACC: Accuracy
PRE: Precision
REC: Recall
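
The four evaluation abbreviations above (ACC, PRE, REC, FPR) follow the standard confusion-matrix definitions. A minimal Python sketch is given below, assuming label 1 means insider trading and label 0 means non-insider trading; the function names and the example labels are illustrative and do not come from the paper.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and false positive rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "PRE": tp / (tp + fp) if tp + fp else 0.0,
        "REC": tp / (tp + fn) if tp + fn else 0.0,
        "FPR": fp / (fp + tn) if fp + tn else 0.0,
    }


# Made-up labels: four insider samples followed by four non-insider samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
print(classification_metrics(y_true, y_pred))
# {'ACC': 0.75, 'PRE': 0.75, 'REC': 0.75, 'FPR': 0.25}
```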

Acknowledgements

This work is funded by Hubei Provincial Department of Education (No. Q20171208), Science Foundation of China Three Gorges University (No. KJ2016A001), and Starting Grant of China Three Gorges University (No. 20170907).

Author information

Corresponding author

Correspondence to Shangkun Deng.

Appendix

  1. (1)

    Volatility

    It is the degree of variation of a security's price series over time, measured by the standard deviation of logarithmic returns.

  2. (2)

    ERCSS (Excess Return Compared with Same Scale)

    It is the excess return of a security over the average return of securities with the same company scale. The calculation formula is as follows:

    $$ {\text{ERCSS}} = {\text{Return}}\;{\text{of}}\;{\text{security}} - {\text{Return}}\;{\text{of}}\;{\text{securities}}\;{\text{with}}\;{\text{same}}\;{\text{company}}\;{\text{scale}} $$
  3. (3)

    ERCSM (Excess Return Compared with Same Market)

    It is the excess return of a security over the average return of the market. The calculation formula is as follows:

    $$ {\text{ERCSM}} = {\text{Return}}\;{\text{of}}\;{\text{security}} - {\text{Return}}\;{\text{of}}\;{\text{market}} $$
  4. (4)

    ERCSR (Excess Return Compared with Same Risk)

    It is the excess return of a security over the average return of investments with the same risk. The calculation formula is as follows:

    $$ {\text{ERCSR}} = {\text{Return}}\;{\text{of}}\;{\text{security}} - {\text{Return}}\;{\text{of}}\;{\text{the}}\;{\text{investment}}\;{\text{with}}\;{\text{same}}\;{\text{risk}} $$
  5. (5)

    SC (Sigma Coefficient)

    The SC is the standard deviation of security prices in a certain period.

  6. (6)

    TSTR (Total Share Turnover Rate)

    It is the frequency of stock transfer in a certain period. The formula is as follows:

    $$ {\text{TSTR}} = ({\text{Volume}}\;{\text{of}}\;{\text{transaction}}/{\text{Total}}\;{\text{stocks}}) \times \, 100\% $$
  7. (7)

    FSTR (Floating Stock Turnover Rate)

    It measures how frequently the floating shares in circulation change hands in a certain period. The formula is as follows:

    $$ {\text{FSTR}} = ({\text{Volume}}\;{\text{of}}\;{\text{transaction}}/{\text{Floating}}\;{\text{stocks}}\;{\text{in}}\;{\text{circulation}}) \times \, 100\% $$
  8. (8)

    BC (Beta Coefficient)

    The BC is a measure of an asset’s risk and return in relation to a market. A security’s beta coefficient is calculated by dividing the covariance of the security’s returns and the benchmark’s returns by the variance of the benchmark’s returns over a certain period.

  9. (9)

    P/E ratio (Price-Earnings ratio)

    The P/E ratio is the ratio of a company’s stock price to its earnings per share and is used in valuing companies. It is calculated as:

    $$ {\text{P/E}} = {\text{Share}}\;{\text{Price}}/{\text{Earnings}}\;{\text{Per}}\;{\text{Share}} $$
  10. (10)

    P/B ratio (Price-Book ratio)

    The price-to-book ratio, or P/B ratio, is a financial ratio used to compare a company’s current market price to its book value. It is calculated as:

    $$ {\text{P/B}} = {\text{Market Price Per Share}}/{\text{Book Value Per Share}}$$
  11. (11)

    P/S ratio (Price-Sales ratio)

    Price–sales ratio, or P/S ratio, is a valuation metric for stocks. It is calculated by dividing the company’s stock price by the revenue per share:

    $$ {\text{P/S}} = {\text{Share}}\;{\text{Price}}/{\text{Revenues}}\;{\text{Per}}\;{\text{Share}} $$
  12. (12)

    DR (Debt Ratio)

    The DR is a financial ratio that indicates the percentage of a company’s assets that is financed by debt. It is the ratio of total liabilities to total assets. The calculation formula is as follows:

    $$ {\text{DR}} = \left( {{\text{Total}}\;{\text{Liabilities}}/{\text{Total}}\;{\text{Assets}}} \right) \times \, 100\% $$
  13. (13)

    CR (Current Ratio)

    It is a ratio that measures whether a company has enough resources to meet its short-term obligations. It compares a company’s current assets to its current liabilities:

    $$ {\text{CR}} = ({\text{Current}}\;{\text{Assets}}/{\text{Current}}\;{\text{Liabilities}}) \times \, 100\% $$
  14. (14)

    OPR (Operating Profit Ratio)

    It is the operating profit expressed as a percentage of net sales. This percentage is used to comprehensively reflect the business efficiency of a company. The calculation formula is as follows:

    $$ {\text{OPR}} = ({\text{Operating}}\;{\text{profit}}/{\text{Net Sales}}) \times \, 100\% $$
  15. (15)

    QR (Quick Ratio)

    The QR is the ratio of the company’s quick assets to its current liabilities. It is calculated as follows:

    $$ {\text{QR}} = ({\text{Quick}}\;{\text{assets}}/{\text{Current}}\;{\text{liabilities}}) \times 100\% $$
  16. (16)

    TAT (Total Asset Turnover)

    The TAT is the ratio of the net sales income to the average total assets in a certain period of time. The calculation formula is as follows:

    $$ {\text{TAT}} = {\text{Sales}}\;{\text{Revenue}}/{\text{Total}}\;{\text{Assets}} $$
  17. (17)

    RGR (Revenue Growth Rate)

    The RGR refers to the ratio of the increase in operating income of the company to the total operating income of the previous year. The calculation formula is as follows:

    $$ {\text{RGR}} = \left( {{\text{Increased}}\;{\text{operating}}\;{\text{income}}/{\text{Total}}\;{\text{operating}}\;{\text{income}}\;{\text{in}}\;{\text{the}}\;{\text{previous}}\;{\text{year}}} \right) \times 100\% $$
  18. (18)

    TAGR (Total Asset Growth Rate)

    The TAGR is the ratio of the total assets growth of the listed company to the total assets at the beginning of the current year, reflecting the growth of the assets of the company in the current year. The calculation formula is as follows:

    $$ {\text{TAGR}} = ({\text{Total asset growth in the current year}}/{\text{Total assets at the beginning of the current year}}) \times 100\% $$
  19. (19)

    ROE (Return On Equity)

    The ROE is the ratio of net income to shareholders’ equity. The calculation formula is as follows:

    $$ {\text{ROE}} = {\text{Net}}\;{\text{Income}}/{\text{Shareholders Equity}}$$
  20. (20)

    ROA (Return On Assets)

    The ROA is an indicator used to measure how much net profit is generated per unit of assets. The calculation formula is as follows:

    $$ {\text{ROA}} = {\text{Net}}\;{\text{Income}}/{\text{Total Value of Assets}} $$
  21. (21)

    CR5 Index

    It is the proportion of shares held by the five largest shareholders.

  22. (22)

    CR10 Index

    It is the proportion of shares held by the ten largest shareholders.

  23. (23)

    Z index

    The Z index refers to the ratio of the largest shareholder’s stock amount to the second-largest shareholder’s stock amount.

  24. (24)

    H5 index

    It is the sum of the squared shareholding proportions of the five largest shareholders. The closer the index is to 1, the greater the difference in shareholding proportion among the five largest shareholders.

  25. (25)

    H10 index

    It is the sum of the squared shareholding proportions of the ten largest shareholders. The closer the index is to 1, the greater the difference in shareholding proportion among the ten largest shareholders.

  26. (26)

    ARAGMS (Attendance Ratio at Annual General Meeting of Shareholders)

    The ARAGMS is the ratio of the number of shareholders attending the annual general meeting to the total number of shareholders.
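
Several of the indicators defined in this appendix can be sketched in a few lines of Python. The sketch below covers volatility (1), the excess returns (2) to (4), the turnover rates (6) and (7), the beta coefficient (8), and the ownership concentration indices (21) to (25). The function names, the input data, and the use of the sample standard deviation are assumptions made for illustration; the paper's exact estimation choices are not stated in this excerpt.

```python
import math
import statistics


def volatility(prices):
    """Indicator (1): standard deviation of logarithmic returns."""
    log_returns = [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]
    return statistics.stdev(log_returns)  # sample std dev (an assumption)


def excess_return(security_return, benchmark_return):
    """Indicators (2)-(4): the benchmark is same-scale, market, or same-risk return."""
    return security_return - benchmark_return


def turnover_rate(volume, shares):
    """Indicators (6)-(7), in percent: total shares give TSTR, floating shares FSTR."""
    return volume / shares * 100.0


def beta(security_returns, market_returns):
    """Indicator (8): covariance with the market divided by the market variance."""
    mean_s = statistics.fmean(security_returns)
    mean_m = statistics.fmean(market_returns)
    cov = sum((s - mean_s) * (m - mean_m)
              for s, m in zip(security_returns, market_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    return cov / var  # the common 1/(n-1) factor cancels


def cr_index(stakes, n):
    """Indicators (21)-(22): combined stake of the n largest shareholders."""
    return sum(sorted(stakes, reverse=True)[:n])


def z_index(stakes):
    """Indicator (23): largest stake divided by the second-largest stake."""
    largest, second = sorted(stakes, reverse=True)[:2]
    return largest / second


def h_index(stakes, n):
    """Indicators (24)-(25): sum of squared stakes of the n largest shareholders."""
    return sum(s ** 2 for s in sorted(stakes, reverse=True)[:n])


# Made-up inputs; stakes are fractions of all shares outstanding.
prices = [10.0, 10.5, 10.2, 10.8, 11.0]
market = [0.010, -0.020, 0.015, 0.005]
stock = [0.020, -0.040, 0.030, 0.010]  # exactly twice the market returns
stakes = [0.40, 0.20, 0.10, 0.05, 0.05, 0.02]

print(f"volatility = {volatility(prices):.4f}")
print(f"ERCSM      = {excess_return(0.08, 0.05):.2%}")
print(f"TSTR       = {turnover_rate(5_000, 100_000):.2f}%")
print(f"beta       = {beta(stock, market):.2f}")
print(f"CR5        = {cr_index(stakes, 5):.2f}")
print(f"Z index    = {z_index(stakes):.2f}")
print(f"H5 index   = {h_index(stakes, 5):.4f}")
```

The same `excess_return` and `turnover_rate` helpers serve several indicators because the definitions differ only in the benchmark or the share base that is passed in.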

About this article

Cite this article

Deng, S., Wang, C., Fu, Z. et al. An Intelligent System for Insider Trading Identification in Chinese Security Market. Comput Econ 57, 593–616 (2021). https://doi.org/10.1007/s10614-020-09970-8

  • DOI: https://doi.org/10.1007/s10614-020-09970-8
