In the past, I have opined on retroactive baseline probe audits (RBPAs) and why I think they are both ineffective and potentially dangerous, from a purely mathematical perspective. But one of the problems often spawned by RBPAs is the corporate (or practice) error rate, which in my opinion can prove as dangerous as the probe audit itself, if not more so.
First, a refresher: during an RBPA, a practice will select some fixed number of charts to review for a given physician over a given period of time. In my experience, the modal count seems to be 30, meaning that 30 charts representing 30 line items (services or procedures) are selected for a given doctor and then audited by an internal coder in an attempt to establish a potential pattern of improper coding. So for a 600-physician group, this amounts to some 18,000 charts per year.
From a coding perspective, since the purpose is to audit an already-coded event, which requires a bit more time and effort, we are talking in the neighborhood of 2.2 FTEs, assuming that this is all they do, day in and day out. So, what do you get for your money? Depending upon the specialty, a typical provider will report some 75 to 100 unique procedures and submit some 2,400 claims per year. If you pull 30 claims for a physician, let’s say at random, what do you think the chances are that those charts will all represent unique services or procedures?
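As a back-of-the-envelope sketch of that FTE figure: the minutes-per-chart value below is my assumption, not a number from the article, chosen to show how roughly 2.2 FTEs falls out of 18,000 re-audited charts per year.

```python
# Hypothetical arithmetic behind the ~2.2 FTE estimate. The 15 minutes per
# re-audited chart is an assumed figure; actual audit times vary by specialty.
MINUTES_PER_CHART = 15        # assumed time to re-audit an already-coded chart
HOURS_PER_FTE_YEAR = 2080     # 40 hours/week * 52 weeks

charts_per_year = 600 * 30    # 600 physicians, 30 charts each = 18,000
audit_hours = charts_per_year * MINUTES_PER_CHART / 60
ftes = audit_hours / HOURS_PER_FTE_YEAR
print(f"{charts_per_year} charts -> {ftes:.1f} FTEs")  # -> 18000 charts -> 2.2 FTEs
```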
Well, it is a vanishingly small number. And if you did happen to achieve this, what would you have accomplished? One chart for each of 30 unique procedures or services, which is certainly not enough to draw any general conclusion about whether a procedure is being coded correctly or not. In fact, the best you could do is to issue an error rate of either 100 percent or 0 percent per procedure: hardly a useful statistic in either case. And from the outset, you miss the ability to review coding for 60 percent (45 of 75, using the low end of the unique-procedure range) of the unique procedures/services reported by that provider.
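To put a rough number on how unlikely that all-unique draw is, here is a birthday-problem-style sketch. It assumes, purely for illustration, that a provider's claims are spread evenly across 75 equally likely procedures; real claim mixes are skewed toward a handful of frequent services, which makes the true probability even smaller.

```python
def prob_all_unique(n_procedures: int, n_draws: int) -> float:
    """P(all draws land on distinct procedures), assuming each of the
    n_procedures is equally likely on every draw (a simplification)."""
    p = 1.0
    for i in range(n_draws):
        p *= (n_procedures - i) / n_procedures
    return p

# 30 charts drawn from a mix of 75 equally likely procedures
p = prob_all_unique(75, 30)
print(f"P(30 random charts cover 30 distinct procedures) = {p:.4f}")
```

Under this simplified model the probability comes out well under one percent, consistent with the claim above.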
Yet, as I said, pulling 30 charts at random creates another problem. Why? Because a statistically valid random sample, if that is what you end up with, can put you in the position of having to self-disclose should you discover some pattern or trend of improper coding. And even if you were able to get something like 10 charts for a single unique procedure, what would be the value of your results? Let's say you determined that three of the 10 charts were coded in error, so you report the error rate as 30 percent. There are a few problems with this, and they speak to the purpose of this article: to identify the dangers of creating enterprise-wide provider error rates. First of all, most government and private auditors calculate an error rate in dollars: the dollars represented by the lines coded in error divided by the total dollars represented in the sample, with dollars normally expressed as the amount paid. But many of the organizations with whom I work that report some overall coding error rate use chart counts, as in the example above, rather than dollars.
Getting back to our example, what does three out of 10 really mean? Well, most of us would report this as a 30 percent error rate, and that would be correct. However, it would only be correct for the 10 CHARTS IN OUR SAMPLE (sorry for yelling). If we wanted to infer this to the larger population of that unique procedure (remember, the 10 charts were all for one procedure), the 95 percent confidence interval around the error rate would run from roughly 6.7 to 65 percent, which would pretty much render the results useless.
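That 6.7-to-65-percent spread is what the exact (Clopper-Pearson) 95 percent confidence interval gives for three errors in 10 charts. A minimal, standard-library-only sketch that reproduces it by bisecting on the binomial tail probabilities:

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion."""
    def bisect(too_small) -> float:
        lo, hi = 0.0, 1.0
        for _ in range(60):                  # 60 halvings: ample precision
            mid = (lo + hi) / 2
            if too_small(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2
    # Lower bound: the p at which P(X >= k | p) = alpha/2 (0 when k == 0).
    lower = 0.0 if k == 0 else bisect(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper bound: the p at which P(X <= k | p) = alpha/2 (1 when k == n).
    upper = 1.0 if k == n else bisect(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper

lo, hi = clopper_pearson(3, 10)
print(f"3/10 errors -> 95% CI {lo:.1%} to {hi:.1%}")  # roughly 6.7% to 65.2%
```

With only 10 charts, the interval is so wide that the point estimate of 30 percent carries almost no inferential weight.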
Let’s use this as a basis to get back to the original hypothesis of enterprise-wide error rates. Let’s say that, of those 30 charts, you were able to represent 15 unique procedures at an average of two charts per procedure. As shown above, no matter what you found, the results would be useless. In addition, you would have missed the other 60 unique procedures entirely, so you begin your global error-rate assessment covering only a fifth of the potential unique procedures, which should be a red flag in itself. Even if you were to combine these 30 charts with the other 17,970 from the other 599 providers, you still would be very lucky to capture any more than 20 percent of all risk events – which, again, renders your assessment virtually useless.
So in essence, determining an error rate for a given provider, at least using this method, is worthless. But wait, there’s more! Let’s say you use all 18,000 charts as a group to identify your error rate. It’s a bit of a stretch, but it can be done. You still likely would fail to capture a large enough volume of each unique procedure to achieve an acceptable precision; however, it gets us a bit closer to the goal. As our example continues, let’s say that, of those 18,000 charts, you found that 3,500 of them were coded improperly. That would give us a 95 percent confidence interval of 18.9 to 20 percent: a much tighter interval. Remember, however, that this is aggregated for all charts for all providers, which discounts the relative importance of assessing error by provider. And it is at the provider level where remediation is normally conducted. Remediating an enterprise is a near impossible task.
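With a sample this large, the 18.9-to-20-percent interval follows from the standard normal (Wald) approximation for a binomial proportion; a sketch:

```python
from math import sqrt

# Normal-approximation (Wald) 95% CI for the aggregate example:
# 3,500 of 18,000 charts coded in error. With n this large, the
# approximation is adequate.
k, n = 3500, 18000
p_hat = k / n
half_width = 1.96 * sqrt(p_hat * (1 - p_hat) / n)   # 1.96 = z for 95%
lo, hi = p_hat - half_width, p_hat + half_width
print(f"error rate {p_hat:.1%}, 95% CI {lo:.1%} to {hi:.1%}")
# -> error rate 19.4%, 95% CI 18.9% to 20.0%
```

Note the contrast: the half-width here is about six-tenths of a point, versus a spread of nearly 60 points in the 10-chart example. Sample size is what tightens the interval, but the tight interval applies only to the undifferentiated aggregate, not to any individual provider.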
Back to our example: if you accept the concept of a general aggregate error rate as being acceptable, then what you are doing is reporting to the world (or maybe just the auditors) that nearly one out of every five claims you file is coded incorrectly. That message floating around a healthcare organization is simply a qui tam waiting to happen. And I am not saying this because I think internal auditing is unimportant, or that a healthcare organization shouldn’t be aggressive about ensuring it follows all of the rules, regulations, and laws; it’s because in all likelihood, the error rate you are reporting is wrong. It’s not only wrong, but so wrong that it is simply of no value other than to create more risk for your organization.
So, if you can’t report an organizational error rate, what can you report? In general, only that which you find. I would report only what is relevant and accurate about the event at hand. If, for example, you find an error rate of 19.4 percent for the 18,000 charts that you audit, then report it as 19.4 percent for just that sample of audits. I would bet dollars to donuts that what you have is not a statistically valid random sample, nor is it representative of the universe, and as such, it should not be used to extrapolate or infer a general error rate for the organization. Remember, our example is 30 charts per provider for 600 providers. Of those 2,400 claims a provider files per year, there are some 4,000 line items or individual procedures. Therefore, the results of your sample of 18,000 are being inferred to a universe of some 2.4 million individual procedures (600 providers times 4,000 line items each), of which representation is inadequate at best. If you really feel the need to report something, then stick to the facts. I figure that, as you walk through the process, someone has to have defined the reason for the audit, and that reason should extend beyond the fact that it is required under your compliance plan: we expend the time and resources necessary in order to improve the effectiveness of our billing function. Whatever you decide to do, avoid adding more risk to an already risky situation.
And that’s the world according to Frank.
About the Author
Frank Cohen is the director of analytics and business intelligence for DoctorsManagement. He is a healthcare consultant who specializes in data mining, applied statistics, practice analytics, decision support, and process improvement. Mr. Cohen is also a member of the National Society of Certified Healthcare Business Consultants (NSCHBC.ORG).