The title of this article was inspired by Michael Stagar, certified fraud examiner, with whom I recently shared a brief discussion on RACmonitor.com.
Mr. Stagar expressed several salient notions concerning the problems we face with regard to government audits. In response, I have created “the auditor’s conundrum,” and it goes like this:
“The less qualified the auditor, the greater the potential for financial damage to the healthcare provider and financial benefit to the system (CMS) – a system that is ultimately responsible for contracting with these auditors.”
In other words, the government, which contracts with the companies that send their dark minions among us, seems to have a vested interest in not having qualified folks conducting provider audits.
“Many RAC … auditors ‘cannot explain’ in the most simplest of mathematical/statistical terms the concept of sampling and do not comprehend in the most simplest of mathematical/statistical terms ‘why, when, where, what or how’ one form of sampling technique or testing is correct or incorrect,” Stagar said.
The above supports the need for us as medical providers to pay attention to the methods auditors use in order to reach their conclusions – and to what you can do, as non-statisticians, to defend yourselves appropriately.
I know I have written more than a few times about how to ensure that extrapolation audits are conducted fairly. Let’s once again review the CMS Statement of Work to get a better understanding as to when and why an extrapolation audit would be conducted:
Section 1842(a)(2)(6) of the Social Security Act requires the government to review, identify and deny inappropriate, medically unnecessary or excessive services. Extrapolation techniques are used when the size of a group of claims prohibits a complete review of every claim. In many cases of extrapolation, a statistically valid random sample is drawn from that group of claims in order to estimate potential payment error. In its Statement of Work, CMS indicates that extrapolation may be used when there has been a determination that, within any group of claims, there is a “sustained or high level of payment error” – and again, the guidelines note that this determination should be based on a statistically valid random sample drawn from that group.
An important point to note is that the SOW also states that the determination of whether a sample indicates a sustained or high level of payment error is not subject to legal or administrative review. This means that no matter what the error rate is, you can’t dispute whether it constitutes a high or sustained level of error. I recently worked on a case in which an auditor found only four of 100 claims overpaid, a 4 percent error rate, and still chose to extrapolate to the entire group of claims. In so many words, that’s just plain crazy – yet the only way to challenge this is to challenge the random sample. CMS clearly states, again, that the finding of a “sustained or high level of payment error” should be based upon a statistically valid random sample drawn from the group.
In the past I have written about things any practice can do to validate (non-statistically speaking, of course) whether a sample passes the smell test, such as checking the average paid amount per claim or comparing the distribution and rank order of procedure codes and modifiers of the sample with those of the universe. I also have covered topics that deal with when to stratify a sample, the effect of multi-modal distribution, and other really fun matters! This time, however, I want to talk more about a concept than the actual statistical issues themselves.
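For readers who keep their claims data in a spreadsheet or database, the two smell-test checks mentioned above can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the function name smell_test, the field names "code" and "paid", and every dollar amount below are hypothetical, and the sketch is a quick comparison rather than a statistical test.

```python
from collections import Counter
from statistics import mean

def smell_test(universe, sample, top_n=5):
    """Compare a sample with its universe on two simple, non-statistical checks."""
    def summarize(claims):
        # Check 1: average paid amount per claim
        avg_paid = mean(c["paid"] for c in claims)
        # Check 2: rank order of the most frequently billed procedure codes
        counts = Counter(c["code"] for c in claims)
        ranked = [code for code, _ in counts.most_common(top_n)]
        return avg_paid, ranked

    u_avg, u_rank = summarize(universe)
    s_avg, s_rank = summarize(sample)
    return {
        "universe_avg_paid": u_avg,
        "sample_avg_paid": s_avg,
        "avg_paid_ratio": s_avg / u_avg,
        "universe_top_codes": u_rank,
        "sample_top_codes": s_rank,
    }

# Hypothetical data, for illustration only (paid amounts are made up)
universe = (
    [{"code": "99213", "paid": 70.0}] * 60
    + [{"code": "99214", "paid": 105.0}] * 30
    + [{"code": "99215", "paid": 145.0}] * 10
)
sample = (
    [{"code": "99214", "paid": 105.0}] * 5
    + [{"code": "99215", "paid": 145.0}] * 4
    + [{"code": "99213", "paid": 70.0}] * 1
)

result = smell_test(universe, sample)
for key, value in result.items():
    print(key, value)
```

In this made-up case the sample's average paid amount is well above the universe's and the rank order of codes is reversed, which is the kind of mismatch that would prompt a closer look at how the sample was drawn.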
First, what is a cluster sample? Say you want to determine the average number of items a typical bagger at a typical grocery store puts into a specific type of bag (a standard plastic grocery bag). Let’s say that we use a sample size of 100. We then could get a list of all of the grocery stores in the country (somewhere north of 250,000) and draw from that list a simple random sample (SRS) of 100 stores. The problem with this method is that the sample could end up scattered across a large number of states, possibly (though not likely) all 50, which would be a bit of a burden if you wanted to observe and count all the items in all the bags personally. Using a cluster sample, we instead could identify a random sample of states (say, 10 for our example here) in order to limit the cost and travel required as part of the study. This is essentially what is called a two-stage cluster sample; to go this route, first we would randomly select 10 states, and then draw an SRS of 100 stores from those states. There are some issues with this, such as the potential for a challenge to the precision, but there are plenty of ways to overcome such a challenge.
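Here is a minimal Python sketch of the two-stage idea, assuming a made-up sampling frame of 50 states with 40 stores each. The function name two_stage_sample and all of the identifiers are hypothetical; a real study would start from an actual list of stores.

```python
import random

def two_stage_sample(stores_by_state, n_states, n_stores, seed=None):
    """Stage 1: pick states at random. Stage 2: SRS of stores from those states."""
    rng = random.Random(seed)
    # Stage 1: randomly select the clusters (states)
    chosen_states = rng.sample(sorted(stores_by_state), n_states)
    # Stage 2: simple random sample of stores pooled across the chosen states
    pool = [(state, store)
            for state in chosen_states
            for store in stores_by_state[state]]
    return chosen_states, rng.sample(pool, n_stores)

# Hypothetical frame: 50 states, 40 stores in each
stores_by_state = {
    f"state_{i:02d}": [f"store_{i:02d}_{j:03d}" for j in range(40)]
    for i in range(50)
}

states, stores = two_stage_sample(stores_by_state, n_states=10, n_stores=100, seed=1)
print(len(states), "states;", len(stores), "stores")
```

Passing a seed makes the draw repeatable, which matters if someone later asks you to show exactly how the sample was selected.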
In a multi-stage cluster sample, we would dig even deeper. For example, let’s say we wanted to study a limited number of baggers (five for each store, for example). In this case, we would have a three-stage cluster sample: we would select at random five baggers from each of the randomly selected 100 stores (which, again, are from the randomly selected 10 states). We even could introduce a fourth stage, which might be a random sample of five customers of each of the five baggers at each of the 100 stores in the 10 states. And on, and on, and on.
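To show how the stages nest, here is a rough Python sketch of a four-stage draw (states, then stores, then baggers, then customers). The frame is entirely invented (50 states, 20 stores per state, eight baggers per store, 12 customers per bagger), and the function name multi_stage_sample is hypothetical.

```python
import random

def multi_stage_sample(frame, n_states, n_stores, n_baggers, n_customers, seed=None):
    """Nested random draws: states -> stores -> baggers -> customers."""
    rng = random.Random(seed)
    # Stage 1: random sample of states
    states = rng.sample(sorted(frame), n_states)
    # Stage 2: SRS of stores pooled across the selected states
    store_pool = [(s, t) for s in states for t in sorted(frame[s])]
    stores = rng.sample(store_pool, n_stores)
    selected = []
    for state, store in stores:
        # Stage 3: random sample of baggers within each selected store
        for bagger in rng.sample(sorted(frame[state][store]), n_baggers):
            # Stage 4: random sample of customers for each selected bagger
            for customer in rng.sample(frame[state][store][bagger], n_customers):
                selected.append((state, store, bagger, customer))
    return selected

# Hypothetical nested frame: state -> store -> bagger -> list of customers
frame = {
    f"S{s:02d}": {
        f"T{t:02d}": {f"B{b}": [f"C{c:02d}" for c in range(12)] for b in range(8)}
        for t in range(20)
    }
    for s in range(50)
}

observations = multi_stage_sample(frame, n_states=10, n_stores=100,
                                  n_baggers=5, n_customers=5, seed=7)
print(len(observations))  # 100 stores x 5 baggers x 5 customers
```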
So where does this apply within healthcare, particularly as it pertains to auditing? In other industries, the main benefit to cluster sampling is to reduce the costs and resources involved in conducting a study or survey. In healthcare, however, it has some other benefits, too. First of all, I think most would agree that sicker patients (in this case, chronically ill patients with multiple co-morbidities) visit the doctor more often than healthier patients. In an internal medicine practice, for example, patients with chronic conditions such as diabetes or COPD end up in the office more often than healthier patients with acute problems such as bronchitis or an injury. At the end of any given year, for any given practice, if you were to look at the number of claims per patient you likely would see (as we do in our research) that those “sicker” patients also report more claims than healthier patients – and in many cases, the claims for those sicker patients would contain higher levels of service. In essence, as I have discussed in the past, this actually would create two tiers of patients with regard to billing characteristics. The goal in this case would be to stratify the sample into at least two different statistically determined classifications: one for patients with more claims per time period and one for patients with fewer. Unfortunately, it is unlikely that an auditor would consider this, so it is up to you to bring it up.
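As a rough illustration of the two-tier idea, the Python sketch below counts claims per patient and splits the claims into a "high" and a "low" utilization stratum. The cutoff used here (the median number of claims per patient) is only a placeholder assumption of mine, not a statistically determined classification, and the function name and patient data are hypothetical.

```python
from collections import Counter
from statistics import median

def stratify_by_utilization(claims, cutoff=None):
    """Split (patient_id, claim_id) pairs into two strata by claims per patient."""
    per_patient = Counter(pid for pid, _ in claims)
    if cutoff is None:
        # Placeholder assumption: use the median claims per patient as the cutoff
        cutoff = median(per_patient.values())
    strata = {"high": [], "low": []}
    for pid, cid in claims:
        key = "high" if per_patient[pid] > cutoff else "low"
        strata[key].append((pid, cid))
    return cutoff, strata

# Hypothetical patients and their claim counts for one time period
claim_counts = {"A": 6, "B": 5, "C": 1, "D": 2, "E": 1}
claims = [(pid, f"{pid}-{i}")
          for pid, n in claim_counts.items()
          for i in range(1, n + 1)]

cutoff, strata = stratify_by_utilization(claims)
print("cutoff:", cutoff)
print("high-utilization claims:", len(strata["high"]))
print("low-utilization claims:", len(strata["low"]))
```

Once the claims are split this way, each stratum can be sampled and reviewed on its own, so the heavier-billing tier does not drive the results for everyone else.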
You also could apply a cluster analysis here – and my experience is that it works much better than the typical SRS we see on claims within any given group. Even if the above condition doesn’t exist, a cluster analysis has been shown to represent most groups of claims accurately in typical audit situations. Here is how it would work: First, we would draw a random sample of beneficiaries (i.e., patients), with the number equal to the number of claims we want to examine. So, as in a typical audit in which we would want to examine 30 claims, we would use simple random sampling techniques to select 30 patients at random. Then, for each patient, again using SRS techniques, we would draw one claim for the given time period. For example, let’s say that one of the selected patients reported six claims during a certain period. We then would randomly pick a number between one and six and review that claim. If a patient reported only one claim, then that one claim would be reviewed.
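The beneficiary-first approach described above can be sketched in Python as follows. The claims data are invented (200 hypothetical patients with between one and six claims each), and the function name one_claim_per_patient is my own label for the idea.

```python
import random
from collections import defaultdict

def one_claim_per_patient(claims, n_patients, seed=None):
    """Draw an SRS of patients, then one random claim for each selected patient."""
    rng = random.Random(seed)
    by_patient = defaultdict(list)
    for pid, cid in claims:
        by_patient[pid].append(cid)
    # Stage 1: simple random sample of beneficiaries
    patients = rng.sample(sorted(by_patient), n_patients)
    # Stage 2: one randomly chosen claim per selected beneficiary
    # (a patient with a single claim always contributes that claim)
    return [(pid, rng.choice(by_patient[pid])) for pid in patients]

# Hypothetical universe: 200 patients with 1 to 6 claims each
claims = [(f"P{i:03d}", f"P{i:03d}-claim{k}")
          for i in range(200)
          for k in range(1, 2 + i % 6)]

audit_sample = one_claim_per_patient(claims, n_patients=30, seed=42)
print(len(audit_sample), "claims from", len({pid for pid, _ in audit_sample}), "patients")
```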
Hopefully you can see the benefit here: If we were to draw 30 claims at random from the group of claims, sicker patients would have a greater probability that one (or even more) of their claims would be selected, potentially introducing bias into the study. This also likely would result in having a greater number of claim lines per claim for the sample, or even a higher paid amount per claim. In such cases, either or both could end up exaggerating the extrapolated overpayment amount.
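A little arithmetic makes the difference concrete. The Python sketch below assumes a made-up practice with 80 healthier patients reporting two claims each and 20 sicker patients reporting eight claims each, and compares the expected share of sampled claims coming from each group under the two designs. The function name expected_share and all of the numbers are hypothetical.

```python
def expected_share(groups):
    """groups maps a label to (number of patients, claims per patient)."""
    total_patients = sum(n for n, _ in groups.values())
    total_claims = sum(n * c for n, c in groups.values())
    return {
        name: {
            # Claim-level SRS: a group's share tracks its share of all claims
            "claim_level_srs": n * c / total_claims,
            # Patient-first draw: a group's share tracks its share of patients
            "patient_first": n / total_patients,
        }
        for name, (n, c) in groups.items()
    }

# Hypothetical practice: (patients, claims per patient)
groups = {"healthier": (80, 2), "sicker": (20, 8)}

shares = expected_share(groups)
for name, share in shares.items():
    print(name, share)
```

Under these assumptions, the sicker patients make up 20 percent of the patients but account for half of the claims, so a claim-level draw of 30 would be expected to include about 15 of their claims, compared with about six when patients are selected first. These figures describe only the makeup of the sample under the stated assumptions.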
While multi-stage cluster sampling works well within a medical practice, it is of particular benefit in situations in which there is a higher line-to-event ratio, such as in hospitals, SNFs and other inpatient settings. It also is highly beneficial in situations in which there is a high degree of diversity in patient characteristics.
For organizations with multiple locations, cluster sampling can work well to reduce the cost of conducting reviews – and when the sampling process is managed appropriately, challenges found with this method can be mitigated.
About the Author
Frank Cohen is the senior analyst for The Frank Cohen Group, LLC. He is a healthcare consultant who specializes in data mining, applied statistics, practice analytics, decision support and process improvement.
Michael Stagar, certified fraud examiner, (Mstagarcfe@zoominternet.net)