NYSE TAQ Database and Financial Durations
In Chapter 1, we reviewed basic concepts in theoretical and applied market microstructure and detailed the trading mechanisms used in several key exchanges. As stated in the introduction, our empirical work focuses on tick-by-tick data for stocks traded on the NYSE. In this chapter, we start by describing the intraday database that is available from this exchange (see Section 2). The Trade and Quote database, also called the TAQ database, provides intraday information on the price and quote processes for stocks traded on the NYSE and NASDAQ-AMEX. Although databases featuring financial information have been around for a long time, databases providing intraday information to the general public have only been available since the early nineties. Today, most stock exchanges make a more or less complete record of their intraday activity available to the academic community. The release of this kind of information has given rise to a substantial amount of empirical research on trading mechanisms, the intraday characteristics of the markets (liquidity, volatility), and the price formation process. While intraday databases provide researchers with a substantial amount of valuable information, we also highlight some of the potential problems that arise from the specific nature of these data.
Keywords: Market Maker, Order Book, Marked Point Process, Trade Process, Trading Mechanism
- 1. Information about ordering this database is available at the NYSE Web site, www.nyse.com. Strictly speaking, the TAQ database is not the first intraday database of the NYSE, as a much older database (the TORQ database) is routinely used in empirical work (Engle and Russell, 1998). The Trades, Orders, Reports, and Quotes (TORQ) dataset was constructed by Hasbrouck and the NYSE in 1991. It provides intraday information on the price and quote processes for a sample of stocks traded on the NYSE over a 3-month period. More recently, NASDAQ has released its own database, which gives intraday information on the quotes posted by all the market makers active for a given stock traded on NASDAQ (thus not only the best bid-ask quotes but also the other valid quotes).
- 2. The ticker of the stock is the identification code of the stock used at the exchange. For example, BA is the identification code for BOEING at the NYSE.
- 3. See also the first footnote of this chapter.
- 4. It should be mentioned, however, that some researchers (usually affiliated with the NYSE) were granted access to the historical inventory databases kept by the specialists at the NYSE (e.g. Hasbrouck and Sofianos, 1993).
- 5. We implemented this procedure using the GAUSS econometric program to extract the data needed before estimating the models presented in the next chapters. Our code is available on request.
- 6. We did not retrieve the quoted depth at the ask and bid prices, as we do not include this information in the econometric models of the next chapters, but it is straightforward to do so.
- 7. See, for example, Bauwens and Giot (1998, 2000), Engle and Russell (1997, 1998), Giot (2000a) or Gerhard and Hautsch (1999).
- 8. A formal definition of the rule used to filter the quotes is given in Engle and Russell (1997).
- 9. However, it is valuable information if the bid-ask spread is to be modelled.
- 10. Gouriéroux, Jasiak and Le Fol (1999) introduce volume durations for the trade process.
- 11. See the discussion of liquidity provided in subsection 2.4 of Chapter 1. A related measure is VNET, which is introduced for price durations.
- 12. We also looked at other actively traded stocks such as Coca-Cola, Boeing, Exxon and ATT, and we found that they generally have the same intraday characteristics as IBM and Disney.
- 13. In the basic version of the Poisson process, or the corresponding exponential model for the durations, the mean of the durations is by definition equal to their standard deviation.
- 14. When c_p = $0.25 for the IBM stock, the dispersion index is equal to 1.13, and it is equal to 0.63 for volume durations with c_v = 50,000.
- 15. The hump close to the origin is not an artifact of the kernel density estimation of a density that starts at the origin. We used the gamma kernel proposed by Chen (1998). The bandwidth was set at $(0.9\, s\, n^{-0.2})^2$, where s is the standard deviation of the data and n is the number of observations.
- 16. Information about the database and how to order it can be found on the Olsen Web site at www.olsen.ch.
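
Notes 13 to 15 refer to two descriptive tools for durations: the dispersion index and a gamma kernel density estimate with bandwidth (0.9 s n^(-0.2))^2. The Python sketch below is only an illustration and is not the authors' GAUSS code. It assumes that the dispersion index is the ratio of the standard deviation to the mean (so that it equals 1 for exponential durations, as note 13 implies), and that the gamma kernel takes the form of a gamma density with shape x/b + 1 and scale b evaluated at each observation. The function names and the simulated sample are hypothetical.

```python
import math
import random
import statistics


def dispersion_index(durations):
    """Assumed definition: sample standard deviation divided by sample mean."""
    return statistics.stdev(durations) / statistics.fmean(durations)


def rule_of_thumb_bandwidth(durations):
    """Bandwidth b = (0.9 * s * n**-0.2)**2, as stated in note 15."""
    s = statistics.stdev(durations)
    n = len(durations)
    return (0.9 * s * n ** -0.2) ** 2


def gamma_kernel_density(x, durations, b):
    """Gamma kernel estimate at x >= 0 (assumed form).

    Averages the Gamma(shape = x / b + 1, scale = b) density evaluated at
    each observation. Observations equal to zero are skipped: their kernel
    value is 0 whenever x > 0, so this is only an approximation at x = 0.
    """
    shape = x / b + 1.0
    log_norm = math.lgamma(shape) + shape * math.log(b)
    total = 0.0
    for d in durations:
        if d > 0.0:
            total += math.exp((shape - 1.0) * math.log(d) - d / b - log_norm)
    return total / len(durations)


# Hypothetical data: exponential durations with a mean of 10 seconds,
# i.e. the benchmark Poisson case for which the dispersion index is 1.
rng = random.Random(0)
sample = [rng.expovariate(0.1) for _ in range(20000)]

index = dispersion_index(sample)
bandwidth = rule_of_thumb_bandwidth(sample)
density_at_10 = gamma_kernel_density(10.0, sample, bandwidth)
print(index, bandwidth, density_at_10)
```

On real price or volume durations, a dispersion index above 1 indicates overdispersion relative to the exponential benchmark and a value below 1 indicates underdispersion, which is how the values 1.13 and 0.63 in note 14 can be read.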