Data Mining as an Audit Tool

Data mining is both an art and a science. Roughly stated, its purpose is to extract useful information from data. Data mining has been used for many years and in a number of different ways; however, it is only recently, with the advent of more powerful computers and more powerful software languages, that the practice has made significant gains in popularity, particularly when it comes to mining large databases.

Also known as predictive analytics, it is this set of methodologies that allows retailers to recommend purchases based on what you have purchased from them (and often, from other vendors) in the past. It is predictive analytics that allows a seller to tell you that "customers who purchased this item also purchased that item."

Using data mining techniques, lenders are able to determine the probability that someone will default on a loan, allowing them to adjust interest rates based on risk. Predictive analytics also is used to conduct credit card fraud analysis in real time. I am sure that many of you at some point have received a call from your credit card company asking you to validate a "suspicious" purchase.

Issues at the Pump

Recently I went to a gas station I go to quite often to fill up, and as I usually do, I swiped my credit card at the pump. For some reason, the gas was coming out very slowly; it took more than a minute just to pump one gallon of gas, so I finished the transaction and went across the street to another gas station. This time, when I swiped my card the transaction was denied, and I had to use another credit card. Several minutes later I got a call from my credit card company asking me to verify that I had indeed attempted a purchase at that second gas station. How in the world could they have responded that quickly? Well, I conduct hundreds if not thousands of transactions a year, and when it comes to gas, when I am home they know that I normally purchase from one location.

I also rarely (if ever) purchase a gallon of gas from one location and then, within minutes, attempt to purchase gas again from another location. In this case, the credit card company used some predictive analytical algorithm to score my purchase with regard to fraud risk, and it scored high enough to trigger an action (such as putting the account on hold until I could verify the purchase). It's pretty cool stuff, actually.
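To make the idea of "scoring" a purchase concrete, here is a minimal sketch in Python. It is not how any card issuer actually works; the point values, the 15-minute window, the hold threshold and every name in it (Purchase, fraud_risk_score, HOLD_THRESHOLD) are assumptions made up for illustration. It simply adds points when a purchase breaks the two habits described above.

```python
from dataclasses import dataclass

@dataclass
class Purchase:
    merchant: str
    category: str
    minute: int  # minutes since midnight, for simplicity

# Hypothetical cutoff: scores at or above this would trigger a hold.
HOLD_THRESHOLD = 70

def fraud_risk_score(purchase, history, usual_merchants):
    """Add up illustrative risk points for one attempted purchase."""
    score = 0
    # Habit 1: the cardholder normally buys this category at known places.
    if purchase.merchant not in usual_merchants.get(purchase.category, set()):
        score += 40
    # Habit 2: the cardholder rarely buys the same category at two
    # different places within a few minutes (15 here, an assumption).
    for past in history:
        minutes_apart = purchase.minute - past.minute
        if (past.category == purchase.category
                and past.merchant != purchase.merchant
                and 0 <= minutes_apart <= 15):
            score += 50
            break
    return score

usual = {"gas": {"Station A"}}
history = [Purchase("Station A", "gas", 600)]
second_attempt = Purchase("Station B", "gas", 605)

score = fraud_risk_score(second_attempt, history, usual)
on_hold = score >= HOLD_THRESHOLD
```

In this toy version, the second fill-up breaks both habits and lands above the cutoff, while a lone purchase at the usual station scores zero. A real system would learn its weights from millions of transactions rather than have them typed in by hand.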

The Application of Data Mining

I use data mining a lot in my work. It allows me to predict who is most likely to be an end user of my services, making my marketing efforts more efficient. In one recent case, we used data mining in order to determine the probability that a particular claim would be denied by a particular payer, giving us the opportunity to review charts in advance of billing. Regarding possible pending audits, we see potential for the use of data mining techniques, including predictive analytics, to identify "bad" claims (as defined by CMS).

Let's take a look at CERT as an example. In the most recent CERT study, it was reported that 4.5 percent of the reviewed claims were considered overpaid due to lack of medical necessity. A medical necessity denial (or overpayment) normally is determined from the documentation contained within a chart; however, there are other factors that come into play. Let's say that a patient comes to the office complaining of a runny nose, itchy eyes and other symptoms that result in a diagnosis of seasonal rhinitis (ICD-9 code 477.0). Now let's say the provider codes the patient with a 99204, the second-highest level of new office visit. An audit very well may support that level of visit based on the documentation guidelines; the auditor, however, might question whether that level of E/M code (complexity of DDx) is commensurate with the level of complexity of the diagnosis. In this case, it is likely that this claim line would result in a denial due to lack of medical necessity.

Understand the issue here: there is a direct relationship between the documentation and the procedure code, and between the documentation and the diagnosis code, but unfortunately there is no nexus between the procedure code and the diagnosis code. This is where the issue of medical necessity rears its highly judgmental and elastic head.

Predicting Claims Subject to Audit

So, then, how could we possibly know ahead of time which claims have the greatest probability of being subjected to a medical necessity review? Here is where I employ predictive analytics. To start, I would access the 4,500 or so claims that were determined to have been overpaid. Next, I would divide the database in half. Then I would take the first group of 2,250 claims and run them through a data mining program, training a number of different algorithms. Then I would run the other 2,250 claims through these trained algorithms to see which predicted the medical necessity outcome most accurately. What I have gained now is the ability to take all of the claims from your office, run them through my data mining algorithm and spit out the claims that are most likely to be audited for medical necessity. By extension, I have created a model that uses probability to predict the likelihood that any particular claim will be subject to an audit. Pretty cool, huh?
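For readers who want to see the split-in-half idea in action, here is a rough sketch in Python. Everything in it is an assumption for illustration: the claims are randomly generated rather than taken from CERT, each claim is reduced to two made-up numbers (an E/M level and a diagnosis complexity, both 1 to 5), and the two "algorithms" are deliberately simple stand-ins for whatever a real data mining program would train. It also assumes the data includes both denied and paid claims, since a model needs examples of both outcomes to tell them apart.

```python
import random

def make_claims(n, seed=0):
    """Generate hypothetical claims: (em_level, dx_complexity, denied)."""
    rng = random.Random(seed)
    claims = []
    for _ in range(n):
        em_level = rng.randint(1, 5)
        dx_complexity = rng.randint(1, 5)
        # Assumed pattern: a visit level well above the diagnosis
        # complexity makes a medical necessity denial more likely.
        p_denied = 0.8 if em_level - dx_complexity >= 2 else 0.1
        claims.append((em_level, dx_complexity, rng.random() < p_denied))
    return claims

def train_baseline(train):
    """Always predict the most common outcome in the training half."""
    majority = sum(d for _, _, d in train) * 2 >= len(train)
    return lambda em, dx: majority

def train_gap_rule(train):
    """Learn the E/M-versus-diagnosis gap that best separates denials."""
    def hits(threshold):
        return sum(((em - dx) >= threshold) == d for em, dx, d in train)
    best = max(range(-4, 6), key=hits)
    return lambda em, dx: (em - dx) >= best

def accuracy(model, claims):
    return sum(model(em, dx) == d for em, dx, d in claims) / len(claims)

claims = make_claims(4500)
half = len(claims) // 2
train, holdout = claims[:half], claims[half:]  # 2,250 claims each

# Train every candidate on the first half, then score it on the second.
models = {"baseline": train_baseline(train), "gap_rule": train_gap_rule(train)}
scores = {name: accuracy(model, holdout) for name, model in models.items()}
best_name = max(scores, key=scores.get)

# Apply the winner to a new claim: a level 4 visit (like 99204) billed
# with a low-complexity diagnosis (complexity 1 is an assumed value).
flagged = models[best_name](4, 1)
```

The reason for holding back the second half is that a model always looks good on the claims it was trained on; its accuracy on claims it has never seen is the fairer measure of how it will perform on the claims from your office.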