Proceedings of the 9th International Conference on Computer Recognition Systems CORES 2015 pp 641-647

# Prediction of Trend Reversals in Stock Market by Classification of Japanese Candlesticks

## Abstract

The K-means clustering algorithm has been used to classify patterns of Japanese candlesticks that accompany the approach to trend reversals in the prices of several assets listed on the Warsaw Stock Exchange (GPW). It has been found that trend reversals seem to be preceded by specific combinations of candlesticks with notable frequency. Surprisingly, the same patterns appear in both “bullish” and “bearish” trend reversals. These findings should stimulate further studies on the applicability of so-called technical analysis in stock markets.

### Keywords

Clustering, K-means, Trend reversals, Japanese candlesticks

## 1 Introduction

[…] *not* random walks, and at least the short-time correlations *are* present. Whether they can indeed be exploited to maximize returns is an open question. What we investigate here is meant to be a very small contribution toward answering it.

It is well known that a very important part of technical analysis is locating the possible end of a given trend, upward or downward. Many technical-analysis indicators, such as MACD (Moving Average Convergence-Divergence) or RSI (Relative Strength Index), are used by technical analysts for this purpose. Another tool used to this end is the graphical patterns formed by sequences of Japanese candlesticks. A Japanese candlestick is a four-element sequence containing the opening, maximal, minimal, and closing prices of an asset during a given trading session. We do not analyze the patterns themselves here; instead we attempt to answer the following rather humble question: Are trend reversals accompanied more often by some types of candlesticks than by others? For that purpose we first classify the candlesticks so that the above “types” are well defined. For the classification we employ a well-established clustering algorithm called K-means [9, 10, 11, 12, 13]. Let us notice that clustering the candlesticks of a given asset allows one to ascribe labels to them. This, in turn, makes it possible to investigate how sequences with given labels have performed in the past and what predictive power (if any) sequences with particular labels have.

The main body of this work is organized as follows. In Sect. 2 we recall and modify the definition of the Japanese candlesticks, which form our working example. In Sect. 3 we provide our results on the relation between candlestick types and trend reversals in the prices of several stocks. Finally, Sect. 4 comprises some concluding remarks.

## 2 Japanese Candlesticks as a Representation of Value of Assets in Stock Market

A Japanese candlestick is a four-element sequence \((O(a, t), X(a, t), N(a, t), C(a, t))\), where *O* denotes the opening value of the asset *a* on the trading day *t*, *X* is the maximum value (*high*) reached during the trading session, *N* is the minimum (*low*), and *C* is the closing value. There exists a well-known graphical representation of the candlestick [14], often considered important in what is called the technical analysis of stock markets. In what follows we employ a sequence of five elements \((O, X, N, C, V)\), which we call an *augmented Japanese candlestick*, where *V* represents the transaction volume associated with the asset and the trading day. An augmented candlestick of the asset *a* on the day *t* can be denoted as a 5-tuple \((O(a, t), X(a, t), N(a, t), C(a, t), V(a, t))\). […] *t* and ending time \(t+n\).

We define the (Euclidean) distance between two candlesticks (differently from our previous work [15]) as […], with *O*, *X*, *N*, and *C* divided by the standard deviation of *C*. Similarly, the volume is also divided by its standard deviation. In this way, the standard deviations of the renormalized *C* and *V* are exactly 1. All candlesticks analyzed further are normalized in the above sense.

Using the K-means algorithm we have classified the Japanese candlesticks as they appear in the group of the 20 largest stocks (WIG20) on the Warsaw Stock Exchange. This has been done for each of the 20 stocks separately. We have assumed that there are 32 clusters. This number corresponds to the 32 combinations of five semi-quantitative binary features of the candlesticks: a candlestick can be “black” (close value lower than open value) or “white”; its body length (i.e., the absolute value of the difference between the close and open values) can be larger or smaller than average; its upper “shadow” (i.e., the difference between the maximum and the larger of the open and close values) can be longer or shorter than average; the lower “shadow” can also be long or short, with analogous meaning; finally, the corresponding volume can be larger or smaller than average. All this gives \(2^5 = 32\) combinations of features, hence 32 clusters. An example of the coordinates of the centroids associated with each cluster for BZWBK stocks is shown in Fig. 1. We have used the implementation of the K-means algorithm provided by the Scikit-learn package [16, 17].

To improve the presentation, for every centroid with coordinates \((O, X, N, C, V)\) we subtracted the first coordinate from the first four to obtain \((0, X - O, N - O, C - O, V)\) and displayed its candlestick representation. This is the reason why the candles in the upper part of Fig. 1 have the same level of opening values.
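The normalization, clustering, and centroid shift described in this section can be sketched as follows. This is only an illustration under stated assumptions, not the authors' code: the rows are assumed to be ordered as (O, X, N, C, V), the data are synthetic rather than WIG20 quotes, and the function names as well as the `n_init` and `random_state` settings are hypothetical choices that the text does not specify. The `KMeans` class of Scikit-learn minimizes squared Euclidean distances, so running it on the rescaled 5-tuples is consistent with a Euclidean distance between normalized candlesticks.

```python
import numpy as np
from sklearn.cluster import KMeans


def normalize_candles(candles):
    """Divide O, X, N, C by std(C) and V by std(V), as described in the text."""
    candles = np.asarray(candles, dtype=float)
    scaled = candles.copy()
    scaled[:, :4] /= candles[:, 3].std()  # prices scaled by the std of the close
    scaled[:, 4] /= candles[:, 4].std()   # volume scaled by its own std
    return scaled


def cluster_candles(candles, n_clusters=32, seed=0):
    """Cluster normalized augmented candlesticks; return labels and centroids."""
    # n_init and random_state are illustrative settings (assumptions).
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(normalize_candles(candles))
    return labels, model.cluster_centers_


def shift_centroids(centroids):
    """Map (O, X, N, C, V) to (0, X - O, N - O, C - O, V) for plotting."""
    shifted = np.array(centroids, dtype=float)
    shifted[:, :4] -= shifted[:, [0]]  # subtract the open from the four prices
    return shifted


# Synthetic augmented candlesticks, used only to exercise the functions.
rng = np.random.default_rng(0)
n = 500
close = 100 + np.cumsum(rng.normal(size=n))
open_ = close + rng.normal(scale=0.5, size=n)
high = np.maximum(open_, close) + rng.random(n)
low = np.minimum(open_, close) - rng.random(n)
volume = rng.integers(1_000, 10_000, size=n)
candles = np.column_stack([open_, high, low, close, volume])

labels, centroids = cluster_candles(candles)
shifted = shift_centroids(centroids)
```

After the shift, every centroid has a first coordinate of zero, which is why all candles drawn from the centroids start at the same opening level.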

## 3 Trend Reversals and Candlesticks

It is by no means self-evident what a trend in data obtained from a random process really means quantitatively, even though the intuitive meaning is understandable. In particular, it is not clear when a trend actually starts and when it ends. To quantify these concepts, we have employed the following simple procedure. To every closing value of a given asset we have ascribed the slope of the straight line fitted to the preceding 10 (ten) closing values. As an indicator of the start and end of a trend we have chosen the change of sign of this slope. A justification for using the number 10 is that a period of two trading weeks is usually considered important by technical analysts.

To every change of sign of the slope as defined above, appearing between the *n*th and \((n+1)\)th trading sessions, we have ascribed one of the 32 cluster labels (from 0 to 31), namely that of the candlestick which appeared in session \(n+1\). If, however, a cluster with a given label contained fewer than *N* / 240 candlesticks, where *N* is the number of trading days considered, it has been discarded. For instance, in the case of BZWBK stocks (\(N = 5327\)) we have kept only 22 clusters (and 22 labels). We have applied this filter to exclude candlesticks that appear too rarely, on average less than once per year.

In Fig. 2 we have plotted an example of the time series generated by a stock in GPW (BZWBK stock, trading sessions No. 1000–1500) together with the associated time dependence of the slopes generated by the 10 preceding closing values. In Fig. 3 we have plotted the relative frequencies of appearance of the cluster labels *j* for the change from downward to upward trend and from upward to downward trend for BZWBK stocks.
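The slope-based definition of a trend reversal given above can be sketched in plain Python. This is an illustration under assumptions, not the authors' implementation: the 10-value window is assumed to end at (and include) the current session, sessions with an incomplete window get no slope, and slopes that are exactly zero are not counted as sign changes. The names `rolling_slopes` and `trend_reversals` are hypothetical.

```python
def rolling_slopes(closes, window=10):
    """Least-squares slope of the line fitted to each run of `window` closes.

    slopes[i] is fitted to closes[i - window + 1 : i + 1]; the first
    window - 1 entries are None because the window is incomplete there.
    """
    xs = range(window)
    x_mean = (window - 1) / 2
    denom = sum((x - x_mean) ** 2 for x in xs)
    slopes = [None] * len(closes)
    for i in range(window - 1, len(closes)):
        ys = closes[i - window + 1 : i + 1]
        y_mean = sum(ys) / window
        num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
        slopes[i] = num / denom
    return slopes


def trend_reversals(slopes):
    """Return (session, kind) pairs where the slope changes sign.

    A sign change between sessions n and n + 1 is reported at session n + 1;
    kind is "du" (down-to-up) or "ud" (up-to-down).
    """
    reversals = []
    for n in range(len(slopes) - 1):
        before, after = slopes[n], slopes[n + 1]
        if before is None or after is None:
            continue
        if before < 0 < after:
            reversals.append((n + 1, "du"))
        elif before > 0 > after:
            reversals.append((n + 1, "ud"))
    return reversals
```

The cluster label attached to a reversal is then simply the label of the candlestick at the reported session \(n+1\).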
The relative frequencies in Fig. 3 have been calculated as the ratios \(M_{j, du}/L_{j}\) and \(M_{j, ud}/L_{j}\), where \(M_{j, du}\) is the number of appearances of a candle belonging to the *j*th cluster near a down-to-up trend reversal, \(M_{j, ud}\) is the number of appearances of a candle belonging to the *j*th cluster near an up-to-down trend reversal, and \(L_{j}\) is the total number of appearances of any candle belonging to the *j*th cluster. This and similar figures which we have obtained for other stocks traded at GPW have been somewhat surprising to us. Indeed, it seems that there are types of candlesticks that appear frequently close to the trend reversals (as defined above) and relatively rarely outside the regions close to the zeros of the slope series. Since the total number of these zeros has been of the order of *N* / 10 (575 for BZWBK stocks), where *N* is the number of trading sessions, one may be tempted to say that candlesticks of some types are, in fact, concentrated in the trend-reversal regions. We have observed that this behavior is more pronounced in the stocks that have been traded longer in GPW and is much less visible for stocks relatively new in the market. It is quite evident, however, that the types of candlesticks most significant from the point of view of trend reversals are those that, in general, appear quite infrequently. This is rather intuitive from the point of view of technical analysis. What is more, for the down-to-up reversal, the most significant candlestick is the one with a relatively long and light body (i.e., the closing price is larger than the opening price), relatively short “shadows”, and quite large volume. This means that during the trading day there is almost steady growth and the interest of investors in buying the asset does not diminish. The fact that, with such a mood among investors, a new “bullish” trend is likely seems intuitive.
Similarly, the candlesticks that appear relatively often near the up-to-down reversals are those with closing prices lower than the opening ones but with relatively long “shadows” and large volumes. It seems that the trading days on which prices are characterized by such candlesticks are very turbulent, but such that the pessimistic moods finally take hold and the “bearish” reversal becomes likely. Let us notice that the candlestick that appears most frequently in the trend-reversal regions is the one with a very short body (closing price very close to the opening price). It is called “doji” and, according to technical-analysis folklore, it traditionally signifies indecision in the market. A trend reversal may indeed follow, but the predictive value of a “doji” usually depends on further information about the context.
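The ratios \(M_{j, du}/L_{j}\) and \(M_{j, ud}/L_{j}\) used in this section, together with the *N* / 240 rarity filter, can be sketched as below. This is a minimal illustration, not the authors' code: the function name `reversal_frequencies`, the `min_fraction` parameter, and the input layout (a list of per-session cluster labels plus a list of `(session, kind)` reversal pairs with kind `"du"` or `"ud"`) are assumptions introduced for the example.

```python
from collections import Counter


def reversal_frequencies(labels, reversals, min_fraction=1 / 240):
    """Return {j: (M_j_du / L_j, M_j_ud / L_j)} for every retained cluster j.

    labels[t] is the cluster label of the candlestick of session t.
    Clusters with fewer than len(labels) * min_fraction members are dropped.
    """
    totals = Counter(labels)                   # L_j: all candles in cluster j
    near = {"du": Counter(), "ud": Counter()}  # M_{j,du} and M_{j,ud}
    for session, kind in reversals:
        near[kind][labels[session]] += 1
    threshold = len(labels) * min_fraction
    return {
        j: (near["du"][j] / total, near["ud"][j] / total)
        for j, total in sorted(totals.items())
        if total >= threshold
    }
```

A cluster whose ratio is high for both kinds of reversal would correspond to a candlestick type that accompanies “bullish” and “bearish” reversals alike.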

## 4 Concluding Remarks

Using the standard K-means clustering algorithm, we have classified the Japanese candlesticks of stocks traded on the Warsaw Stock Exchange. To each trend-reversal point we have attached the label of the cluster to which the candlestick appearing at that point belongs. We have found that there exist types of candlesticks that frequently tend to appear close to the trend-reversal regions and others that cannot be found in such regions. Needless to say, the above results are very preliminary and require careful reexamination. What is especially important is to find a much more convincing definition of the trend and the trend reversal than that given in this work in terms of linear regression slopes. What is more, as always stressed by technical analysts, the candlesticks are to be considered within the specific market context. This can be done using other technical-analysis indicators. We hope to report the results of such improved analysis in a forthcoming publication.

### References

- 1. Murphy, J.: Technical Analysis of Financial Markets. New York Institute of Finance, New York (1999)
- 2. Kaufman, P.: Trading Systems and Methods. Wiley, New York (2013)
- 3. Malkiel, B.: A Random Walk Down Wall Street. Norton, New York (1981)
- 4. Fama, E., Blume, M.: Filter rules and stock-market trading. J. Bus. **39**, 226–241 (1966)
- 5. Brock, W., Lakonishok, J., LeBaron, B.: Simple technical trading rules and the stochastic properties of stock returns. J. Financ. **47**(5), 1731–1764 (1992)
- 6. Lo, A., MacKinlay, A.: Stock market prices do not follow random walks: evidence from a simple specification test. Rev. Financ. Stud. **1**, 41–66 (1988)
- 7. Lo, A., MacKinlay, A.: A Non-Random Walk down Wall Street. Princeton University Press, Princeton (1999)
- 8. Lo, A., Mamaysky, H., Wang, J.: Foundations of technical analysis: computational algorithms, statistical inference, and empirical implementation. J. Financ. **55**(4), 1705–1765 (2000)
- 9. MacQueen, J.: Some methods for classification and analysis of multivariate observations. In: Le Cam, L.M., Neyman, J. (eds.) Proceeding of 5th Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 281–297. University of California Press, Berkeley (1967)
- 10. Steinhaus, H.: Sur la division des corps matériels en parties. Bull. Acad. Polon. Sci. **4**(12), 801–804 (1957)
- 11. Lloyd, S.: Least squares quantization in PCM (1957) Bell Telephone Laboratories Paper
- 12. Forgy, E.: Cluster analysis of multivariate data: efficiency versus interpretability of classifications. Biometrics **21**(3), 768–769 (1965)
- 13. Hartigan, J.: Clustering Algorithms. Wiley, New York (1975)
- 14. Wikipedia: Candlestick chart – Wikipedia, The Free Encyclopedia (2014) http://en.wikipedia.org/w/index.php?title=Candlestick_chart. [Online; accessed 19-December-2014]
- 15. Chmielewski, L., Janowicz, M., Orłowski, A.: Clustering algorithm based on molecular dynamics with Nose-Hoover thermostat. Application to Japanese candlesticks. In: Rutkowski, L., et al. (eds.) Artificial Intelligence and Soft Computing: Proceeding of International Conference ICAISC 2015. Lecture Notes in Artificial Intelligence, vol. 9120, pp. 330–340. Springer (2015)
- 16. Scikit-learn Community: Scikit-learn - machine learning in Python (2015) http://scikit-learn.org. [Online; accessed 10-February-2015]
- 17. Pedregosa, F., Varoquaux, G., Gramfort, A., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. **12**, 2825–2830 (2011)