Classic: Avoid the Dangers Of Data-Mining, Part 2

Apr 16, 2013 02:49AM ET


Models that work well on data about the past may not work in the future.

Check methods for weak points, like overfitting or ignoring illiquidity or business relationships.

Keep in mind some practical considerations when testing a theory.

Other Areas of Data-Mining

In 1992-1993, a number of bright investors believed they had “picked the lock” of the residential mortgage-backed securities market. Many of them had fitted complex multifactor models that let them estimate the likely amount of prepayment within mortgage pools.

Armed with that knowledge, they bought some of the riskiest securities backed by portions of the cash flows from the pools. They probably estimated the past relationships properly, but the models failed when no-cost prepayment became common, and failed again when the Federal Reserve raised rates aggressively in 1994. The failures were astounding: David Askin’s hedge funds, Orange County, the funds at Piper Jaffray that Worth Bruntjen managed, some small life insurers, etc. If that wasn’t enough, there were many major financial institutions that dropped billions on this trade without failing.

What’s the lesson? Models that worked well in the past might not work so well in the future, particularly at high degrees of leverage. Small deviations from what made the relationship work in the past can be amplified by leverage into huge disasters.
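The arithmetic behind that amplification can be sketched in a few lines. This is only an illustration under simplified assumptions: the function name `levered_return`, the 2% expected return, and the 3% realized loss are all hypothetical numbers chosen for the example, not figures from the 1994 episode.

```python
def levered_return(asset_return, leverage, funding_rate=0.0):
    """Return on equity for a position partly financed with borrowed money.

    leverage is total assets divided by equity (1.0 means no borrowing).
    funding_rate is the cost of the borrowed portion over the same period.
    """
    borrowed = leverage - 1.0
    return leverage * asset_return - borrowed * funding_rate


# Hypothetical case: the model expects +2%, the market delivers -3%.
for leverage in (1, 5, 10, 30):
    expected = levered_return(0.02, leverage)
    realized = levered_return(-0.03, leverage)
    print(f"{leverage:>2}x  expected {expected:+.0%}  realized {realized:+.0%}")
```

A five-point miss that would be a nuisance for an unlevered investor becomes a loss of a large share of equity at 10x, and more than the entire equity at high enough leverage.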

I recommend , by Jarrod X. Wilcox . Chapter 10 discusses this issue in detail. This is the best single book I know of on quantitative methods in investing.

4. Be careful when a method uses a huge number of screens in order to come down to a tiny number of stocks and then, with little or no further analysis, says these are the ones to buy or sell. Though the method may have worked very well in the past, accounting data are, by their very nature, approximate and manipulable; they require further processing in order to be useful. Screening only winnows down the universe of stocks to a number small enough for security analysis to begin. It can never be a substitute for security analysis.

5. Avoid using quantitative methods that lack a rational business explanation. Effective quantitative methods usually come from processes that mimic the actions of intelligent businessmen. Never confuse correlation with causation. Sometimes two economic variables with little obvious financial relationship to each other will show a statistically significant relationship in the past. The fact that two financial variables were correlated in the past does not mean that they will be correlated in the future. This is particularly true when there is no business reason that relates them.

6. Look for the use of a control. A control is a portion of the data series not used to estimate the relationship. It’s left to the side to test the relationship after the “best” model is chosen. Often, the control will indicate that the “best” method isn’t all that good. And beware of methods that use the control data multiple times in order to test the best methods. That defeats the purpose of a control by data-mining the control sample.


7. One of the trends in accounting is to make increasingly detailed rules in an attempt (wrongheaded) to fit each individual company more precisely. The problem is that this makes many ratios hard to compare across companies and industries without extra massaging of the data. As a result, screening becomes a less useful tool for thinning out a stock universe. For quantitative analysis to succeed, the data need to represent the same thing across different firms.
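To illustrate points 5 and 6, here is a minimal sketch of what a control sample protects against. Everything in it is an assumption made for illustration: the "returns" and the 200 candidate "signals" are pure random noise, so no real relationship exists, and the function names are hypothetical. The code picks whichever signal best fits the first half of the data, then scores that single signal once on the held-out half.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def mine_best_signal(n_signals=200, n_periods=120, split=60, seed=0):
    """Choose the noise series that best 'explains' returns in-sample,
    then score that one series a single time on the control periods."""
    rng = random.Random(seed)
    returns = [rng.gauss(0, 1) for _ in range(n_periods)]
    signals = [[rng.gauss(0, 1) for _ in range(n_periods)]
               for _ in range(n_signals)]

    def in_sample_strength(signal):
        return abs(pearson(signal[:split], returns[:split]))

    best = max(signals, key=in_sample_strength)
    fit = pearson(best[:split], returns[:split])
    control = pearson(best[split:], returns[split:])
    return fit, control


fit, control = mine_best_signal()
print(f"in-sample correlation: {fit:+.2f}")
print(f"control correlation:   {control:+.2f}")
```

Because the winner was selected from many candidates, its in-sample correlation will typically look meaningful even though it is noise, while the control figure will usually sit much closer to zero. Rerunning the selection until the control number also looks good would reproduce the very problem the control is meant to catch.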

Practical Recommendations
There are many pitfalls in quantitative analysis. But three simple considerations will help protect investors from the dangers of data-mining.

1. Paper trade any new quantitative method that you consider using. Be sure to charge yourself reasonable commissions, and take into account the bid/ask spread. Take into account market impact costs if you are trading in a particularly illiquid market. Even after all this, remember that your real-world results will often underperform the model.

2. Think in terms of sustainable competitive advantage. What are you bringing to the process that is not easily replicable? How does the method allow you to use your business judgment? Is the method so commonly used that even if it is a good model, returns still might be meager? Even good methods can be overused.

3. If doing quantitative analysis, do it honestly and competently. Form your theory before looking at the data and then test your theory. Then, if the method is a good one, apply the results to your control. If you perform quantitative analysis this way, you will have fewer methods that seem to work, but the ones that pass this regimen should be more reliable.
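The cost adjustments in the first recommendation can be sketched as a simple deduction from each paper trade. This is a rough additive approximation, not a full transaction-cost model, and the function name and every rate below are hypothetical values chosen for the example.

```python
def net_return(gross_return, commission_rate, bid_ask_spread, market_impact=0.0):
    """Approximate round-trip return after costs.

    All inputs are fractions of trade value. A round trip pays commission
    on both legs, crosses half the quoted spread going in and half coming
    out, and pays market impact on both legs.
    """
    round_trip_cost = 2 * commission_rate + bid_ask_spread + 2 * market_impact
    return gross_return - round_trip_cost


# Hypothetical paper trade: 1.5% gross gain, 0.1% commission per side,
# 0.4% quoted spread, 0.2% impact per side in a thinly traded stock.
gross = 0.015
net = net_return(gross, commission_rate=0.001,
                 bid_ask_spread=0.004, market_impact=0.002)
print(f"gross {gross:.2%}  net {net:.2%}")
```

In this made-up case, costs consume two-thirds of the gross gain, which is why a method that looks good before costs can be marginal once they are charged.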
