March 2, 2016

E&M Error Rate Reported at Nearly 40 Percent

By Frank Cohen

For those who are younger than I am (74 percent of the population in the United States), the title of this article may not evoke much in the way of strange memories. But for those of my generation, it should bring to mind images of Maxwell Smart, Agent 86 for the spy agency CONTROL.

“Missed it by that much” was perhaps the most popular catchphrase spoken by the character on the popular television series “Get Smart,” and it always followed some near miss. The first use came in the episode titled “The Day Smart Turned Chicken,” when an agent attempted to jump from a window into a mattress truck and missed, hitting the pavement.

Smart, having witnessed this, turns and utters those famous words for the first time, leaving a lasting impression on people like me. But I digress…

I have spoken about the federal Fraud Prevention System in the past, discussing the different predictive models used to identify claims that meet the government’s criteria for fraud and abuse. For physician practices, there are five categories that are reviewed as part of those models:

  1. E&M code utilization (99201-99499)
  2. Non-E&M code frequency utilization
  3. RVU utilization at risk
  4. Modifier utilization
  5. Time

Even though E&M codes do not contribute as much as non-E&M codes to recoupments, they do garner a lot of attention. In total, E&M codes make up about 1 percent of all of the codes in the physician fee schedule database. Within Medicare Part B, however, E&M codes account for some 18 percent of claim frequency and almost 38 percent of all payments. Now, I am not a certified coder, so far be it from me to opine on the methods required to select the proper E&M code. Nonetheless, as a statistician and compliance risk analyst, I have successfully opined on many occasions regarding how even a difference of one level in an E&M code could result in serious financial damage.

Without getting into the specifics, suffice it to say that E&M codes cover the different categories of patient encounters, such as office visits, hospital visits, critical care, home visits, etc. And for many of these categories, there are several codes that could be applied depending upon the nebulous and oft-confusing guidelines, which include one set published in 1994 (called the 1995 guidelines) and one set published in 1996 (called the 1997 guidelines). In general, both sets of guidelines recognize seven components that are used in defining the levels of E&M services. They are:

  • History
  • Examination
  • Medical decision-making
  • Counseling
  • Coordination of care
  • Nature of presenting problem
  • Time

The first three in the list are called the “key” components and are most often used in selecting the proper level of E&M code within certain categories, such as office, hospital, emergency room, and nursing home visits, to name a few. It goes without saying that coding for an E&M visit is often more complex than the diagnosis of the patient’s condition. In fact, I calculated that, in order to code for a single visit, a provider has to go through some 1,600 decision points. The bottom line is that E&M coding is way more subjective than it should be and far more elastic than one might expect for a system tied so closely to quantitative metrics, such as the resource-based relative value scale (RBRVS).  

There are at least three national studies that I have reviewed that put the rate of disagreement for office visits in the range of 40 to 50 percent. Recently, I conducted my own study to see how my results would compare. I got together a mix of 40 coders and auditors, some certified and some not, with a wide range of years of experience (I wasn’t trying to control for these variables). My goal was to gauge, within the general community, how variable E&M coding is, on average. Each coder was given three hours to select a code for 18 charts consisting of levels 3, 4, and 5 for new and established office visits. In reviewing the charts, I had them select what they thought would be the proper code for both documentation and medical necessity. In general, I was conducting what is called an inter-rater reliability study, with a couple of tweaks to make it more applicable to E&M coding. The code that was selected by the coder was compared against the reference code, which was selected by a trained coding auditor and then re-checked to ensure that there was consensus.
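For readers who want to see the mechanics, here is a minimal sketch of how a comparison like this can be tallied. It is not the actual analysis from the study: the function names and the four sample selections are hypothetical, and it assumes the E&M level can be read from the last digit of an office visit code (as with 99203 through 99205 and 99213 through 99215).

```python
from collections import Counter


def level(code: str) -> int:
    """Return the E&M level, taken here as the last digit of the code (99214 -> 4)."""
    return int(code[-1])


def disagreement_by_distance(selected, reference):
    """Tally how many levels each selected code sits from its reference code.

    selected  -- codes chosen by coders, one per (coder, chart) pair
    reference -- reference codes, aligned position by position with `selected`
    Returns a dict mapping level distance -> share of all selections.
    """
    if len(selected) != len(reference):
        raise ValueError("selected and reference must be the same length")
    distances = Counter(
        abs(level(s) - level(r)) for s, r in zip(selected, reference)
    )
    total = len(selected)
    return {d: n / total for d, n in sorted(distances.items())}


# Hypothetical data, for illustration only (not taken from the study).
reference = ["99213", "99214", "99215", "99204"]
selected = ["99213", "99213", "99213", "99204"]

rates = disagreement_by_distance(selected, reference)
overall = 1 - rates.get(0, 0.0)  # share of selections that differ from the reference
print(rates)
print(f"overall disagreement: {overall:.1%}")
```

A distance of 0 means the coder matched the reference code; distances of 1 and 2 correspond to one-level and two-level disagreements.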

The results were quite interesting, although not unexpected. Overall, the rate of disagreement for all coders and all charts was 38.7 percent. This means that, if I had 100 coders in a room and I put a vignette up on the screen, about 39 of them would disagree with the other 61 as to the proper E&M code level. That’s not too far from the other studies – and it’s quite scary, actually. The way I interpret this is that there is a general error rate for E&M coding of almost 40 percent. I can’t think of any other industry that would allow that high an error rate to exist within its processes. What was of particular interest was the disagreement rate for one level and two levels from the reference code. The one-level rate of disagreement was 34.15 percent, which accounts for 88.2 percent of the total, and the two-level rate of disagreement was 4.3 percent, or 11.1 percent of the total. There was only one reported case of a three-level disagreement, which is less than the margin of error, and as such was disregarded.
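The shares quoted above can be checked with simple arithmetic. The sketch below uses only the percentages reported in this article; the figure of 720 total selections is an assumption on my part (40 coders times 18 charts, with every coder completing every chart).

```python
# Rates reported in the article, in percent of all selections.
overall = 38.7      # total disagreement with the reference code
one_level = 34.15   # off by exactly one level
two_level = 4.3     # off by exactly two levels

# Share of all disagreements attributable to each distance.
share_one = one_level / overall * 100
share_two = two_level / overall * 100

# Assumption: 40 coders x 18 charts, all completed, gives 720 selections,
# so a single three-level miss is a very small fraction of the whole.
total_selections = 40 * 18
three_level = 1 / total_selections * 100

print(f"one-level share of disagreements: {share_one:.1f}%")    # 88.2%
print(f"two-level share of disagreements: {share_two:.1f}%")    # 11.1%
print(f"three-level rate, single case:    {three_level:.2f}%")  # 0.14%
```

Under that assumption, the lone three-level case works out to roughly one-seventh of one percent of all selections.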

If you think about it, E&M coding is so unique that it really needs to be treated much differently than the other code categories. Here’s a question I always like to ask coders: how many E&M codes are there? Wait for it … the answer is one. There is only one E&M code: the code that you assign for an encounter. Unlike most other codes, E&M codes are mutually exclusive, and that alone is reason for special treatment. Here’s another great question: what is the difference between a 99213 and a 99214? Wait for it … the answer (again) is one. That’s right, the difference between a 99213 and a 99214 is one, and it is way too easy to effect that change.

In general, then, it is my professional opinion (yes, professional, not personal) that the overall disagreement rate among a trained cohort of coding professionals (at any level) is so high as to negate the validity of any audit finding that involves only one level of disagreement. I have used this argument in many post-audit extrapolation mitigation cases and have, in fact, won a few. I believe that if the data is presented properly and in a compelling manner, any reasonable person would be hard-pressed to disagree. The bottom line?

If you are involved in an audit and are subject to overpayment demands based on a one-level difference for a given E&M visit code, don’t give up without a fight. After all, you have the full power of the statistical community behind you!

And if by chance you don’t prevail, you can always say “missed it by that much!”

And that’s the world according to Frank.

About the Author

Frank Cohen is the director of analytics and business intelligence for DoctorsManagement, a Knoxville, Tenn. consulting firm. Mr. Cohen specializes in data mining, applied statistics, practice analytics, decision support, and process improvement. He can be

Contact the Author

fcohen@drsmgmt.com

Comment on this Article

editor@racmonitor.com
