Part 4 of Statistical Sampling by Bruce Truitt

Posted December 2010 on  newsletter archives

Ah, sweetness!  One more election season has faded to blessed oblivion.  Congratulations on surviving another chance to elect gods then suffer the shenanigans of mere mortals. 

As an alternative, our last episode (more properly epizoodie) thrust Howard the Duck, a/k/a “El Pato Vato” or “Duck D-u-u-u-de,” into the political spotlight.  “Hey, why not a duck?” he squawked.  “We’ve had turkeys for over 200 years!” (See our last installment at:

But recall that exit polls gave our fearless fowl 48% of the vote, plus-or-minus 3%.  These results told him he would win with 51% (48% + 3%) or lose with 45% (48% - 3%).  This less-than-useless information left our comic-book candidate in suspended animation.  Go ahead and groan, it’s OK.
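For the arithmetically inclined, Howard’s dilemma can be sketched in a few lines of Python.  This is only an illustration: the function name `poll_interval` and the 50% winning threshold are assumptions added here, not part of the original poll.

```python
def poll_interval(share, margin):
    """Return the (low, high) range implied by a poll result and its margin of error."""
    return share - margin, share + margin

# Howard's exit poll: 48% of the vote, plus-or-minus 3%.
low, high = poll_interval(0.48, 0.03)

# Assumption for illustration: in a two-way race, winning takes more than 50%.
threshold = 0.50
if low > threshold:
    verdict = "win"
elif high < threshold:
    verdict = "lose"
else:
    verdict = "too close to call"

print(f"{low:.0%} to {high:.0%}: {verdict}")  # 45% to 51%: too close to call
```

Because the range straddles the threshold, the poll cannot separate victory from defeat, which is exactly why Howard is stuck.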

Following fantastic fits of fulminating frustration, Howard hired a statistics consultant, though ‘twas tough to find one who spoke plain English, much less Mandarin duck.  (BTW, do you know how to recognize an extroverted statistician?  Find the one looking down at your shoes.)

Howard’s numbers nerd said the only way to know if the 48% was dead-on correct was to poll all voters.  “All we have is 95% confidence that your true share of the votes is between 45% and 51%.”  At this, Howard fumed, “Why did I pay for these polls!  I’m in hock up to my bill and still don’t know if I should keep pressing the feathers out there or just fly south!”  Well, our winged warrior has flummoxed his flipper on a fundamental fact of sampling:

  • We must estimate variation to calculate sample size
  • We must sample to estimate this variation

In other words, Howard has to estimate his share of the vote in order to calculate the sample size needed to estimate his share of the vote.  This is a duck and egg sorta thing, at which Howard honked and headed for Florida, thus his absence as a write-in in Alaska.  I did get him to draft Alfred E. Neuman as VP, though.  After all, “What, Me Worry?” deserves serious consideration as a campaign slogan, n’est-ce pas?

So, how do we estimate the variation needed to calculate sample size via “The Formula”?

Sample Size = (Confidence x Variation) / Precision

The general answer is simple:

  • Probe
  • History
  • Criteria

That is, we can do a sniff test, like Howard’s exit poll (comprising at least 30 items, the why for which we’ll handle in a later missive), rely on past experience, or look for a “thou shalt” in law, rule, regulation, norm, standard, policy, procedure, professional practice, etc.  We then plug the resulting value into “The Formula,” calculate our sample size, and hope that the variation we find (the duck) is generally akin to the variation we expected (the egg).
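“The Formula” is conceptual shorthand.  One common textbook counterpart for estimating a mean sets n = (z x s / E)², where z stands in for confidence, s for variation (the standard deviation), and E for precision (the margin of error).  The sketch below assumes a simple random sample from a large population and ignores any finite-population correction; the function name, the probe values, and the ±5 margin are all made up for illustration.

```python
import math
from statistics import NormalDist, stdev

def sample_size_for_mean(variation, precision, confidence=0.95):
    """Textbook sample size for estimating a mean: n = (z * s / E)^2, rounded up.

    variation  -- estimated standard deviation s (from probe, history, or criteria)
    precision  -- margin of error E, in the same units as the data
    confidence -- two-sided confidence level, e.g. 0.95
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 at 95%
    return math.ceil((z * variation / precision) ** 2)

# Hypothetical probe of 30 invoice amounts (made-up numbers, illustration only).
probe = [100 + 7 * (i % 11) for i in range(30)]
s_probe = stdev(probe)  # the "egg": variation estimated from the probe

# The "duck": sample size for a +/- 5 margin at 95% confidence.
n = sample_size_for_mean(s_probe, precision=5)
print(n)
```

If the variation found in the full sample turns out much larger than the probe suggested, the achieved precision will be worse than planned, which is the duck failing to resemble the egg.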

Or we can rely on the oldest basis for management assertion – the “Wild A#! Guess,” a/k/a the WAG. 

This sow’s ear sometimes comes in a silk purse called “subjective probability.”   Then, we also have the SWAG, or “Scientific Wild A#! Guess.”  Both of these are windage lacking verifiable substantiating data, but, hey, it’s better than nothing, and we gotta use something to calculate our sample size!  FYI, unrestrained use of the term “windage” guarantees endearment at family reunions, staff meetings, and, most definitely, exit conferences.

OK.  We have an idea of where variation comes from, but how much is too much?  Is there a standard for deviation?  Only your oxymoron knows for sure, but Lou Rawls gave us the answer in 1976 … “You’ll Never Find.”

Yet, though no stated norm exists, practical guidelines do:

  • On the Normal Curve (All Hail The Mighty Bell Curve!), one standard deviation on either side of the mean (“average” for those of us west of the Mississippi) covers about 34% of the data, or about 68% for both sides together.
  • A standard deviation greater than the mean means (Aha!  Spellcheck didn’t snag that redundant repetition) the data are getting really spread out, which tells you to whack before you quack – toss the outliers or group (stratify) the data.  If help is needed, please contact our Ph.D. data experts, Drs. Hacken Whack, Sly Sindice, and Ed Itenfer Getit, or, of course, Tony Soprano, the true Whackmaster.
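The “standard deviation greater than the mean” rule of thumb is easy to check before sampling.  A minimal sketch follows, assuming positive-valued data such as dollar amounts; the function name `too_spread_out` and both data sets are invented for illustration.

```python
from statistics import mean, stdev

def too_spread_out(values):
    """True when the sample standard deviation exceeds the mean."""
    return stdev(values) > mean(values)

tidy = [1, 2, 3, 4, 5]       # mean 3, standard deviation about 1.6
lumpy = [1, 1, 1, 1, 100]    # one outlier drags the deviation past the mean

print(too_spread_out(tidy), too_spread_out(lumpy))  # False True
```

A `True` here is the cue to look at outliers or stratify before plugging the variation into a sample size calculation.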

And, for all you Latin lovers out there, nota bene (I’ve waited months to use that): the singular of “strata” is “stratum”!  Just like “data” and “datum.”

Well, then, what about standards for “The Formula’s” other two pieces – confidence and precision?

Regarding confidence, you’ll be glad to know that neither the Yellow Book, Green Book, Red Book, Grey Book, nor Pink Book (yes, they all really exist) criterionize.  Hey, if “impact” can be a verb, so can “criteria.”  Frankly, I am waiting on the Clear Book, which I hear is due out with the next Kelley Blue Book.

I found some salvation in the “AICPA Sampling Guide” (The Guide), a great example of what happens when statisticians, accountants, and lawyers collide with way too much coffee.  Seriously, though, while its Dostoyevsky-like density mimics that of the “Big Boy” bomb, it is an extremely well-written and valuable volume, despite occasionally high hypnagogics.

In mining “The Guide” I found that, as with my “standard” deviation quest, no criterion confidence exists.  While referencing confidence levels from as low as 50% (!) up to 95% and 99%, it posited no norm.  So, as before, we turn to professional practicality which tells us that, especially if an assertion about the population is sought, auditors do not work at confidence levels below 90%.  Makes sense, right?  After all, “auditor” starts with “A,” and no auditor wants a grade less than an “A.”  That’s good enough for me!

The identical scenario obtains for precision, a/k/a “margin of error,” née Marge Innovera of Car Talk fame.  While “The Guide” maxes out precision at 20%, it avoids any stake-in-the-ground standard.  But, again, practice helps out:

  • Auditors rarely use precisions over 10%, especially if a population pronouncement is needed.  And remember that this 10% creates a 20%-wide window since it means “plus-or-minus 10%,” a fact that hobbled Howard’s hopes.
  • As a rule, the more important the audit, the smaller the margin of error.
  • The smaller the margin of error, the more the audit costs.

And, of course, by logical extension:

  • The more the audit costs, the more work you gotta do.
  • The more work you do, the older you get.
  • The older you get, the sooner you retire.
  • The sooner you retire, the happier you are.
  • The happier you are, the longer you live.

Hoohah!  Sampling saves lives!  Sorry ‘bout that.  Got carried away.  Please ignore the last ten seconds of your life.

At any rate:

  • If you are sampling non-critical controls or compliance and/or need to estimate a population error rate, use a margin of error of at most 10%.
  • If bad findings mean people die or write big checks, 10% aresn’t gonna cut it.  A five-percent precision is a useful maximum in such cases.  “Aresn’t” is my contribution to the lexicon and likely the only triple negative in English given its fusion of “aren’t,” “ain’t,” and “isn’t.”
  • The more money on the table, the more you move toward the 3% often used in Harris and Gallup polls and in Medicare and Medicaid work.
  • If you will face God, the Devil, and F. Lee Bailey in court (buena suerte, vato!), you might go to 1%, though this generates ginormously egantic sample sizes.
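To see how fast sample sizes balloon as the margin shrinks, here is a rough sketch using the usual textbook formula for estimating a rate, n = z² x p(1 - p) / E².  It assumes a simple random sample from a very large population, 95% confidence, and the worst-case rate of 50%; the function name and these defaults are choices made for illustration, not anything prescribed by “The Guide.”

```python
import math
from statistics import NormalDist

def sample_size_for_rate(precision, confidence=0.95, rate=0.5):
    """Textbook sample size for estimating a rate, rounded up.

    precision  -- margin of error as a fraction, e.g. 0.03 for plus-or-minus 3%
    confidence -- two-sided confidence level
    rate       -- expected rate; 0.5 is the worst case (largest variation)
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * rate * (1 - rate) / precision ** 2)

sizes = {m: sample_size_for_rate(m) for m in (0.10, 0.05, 0.03, 0.01)}
for margin, n in sizes.items():
    print(f"+/- {margin:.0%}: {n}")
# +/- 10%: 97
# +/- 5%: 385
# +/- 3%: 1068
# +/- 1%: 9604
```

Precision enters the formula squared, so tightening the margin from 10% to 1% multiplies the sample by roughly 100.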

Hmm.  It looks like your old Audit Manager was right – it all depends on the objective.

OK.  Enough at the firehose fountain.  Let’s wrap it up.  Here is your tchotchke:

Sample Size = [Confidence (≥ 90%) x Variation (≤ Mean)] / Precision (≤ 10%)

So, in the end, there is nothing standard about deviation, confidence, or precision.  And you thought auditing was a precise profession?!?  Silly goose!  But, hey, our professional sacrament – auditor judgment – remains inviolate.  Yay!

Naturally, this chat aresn’t done.  Tune in next time for more statistical lies, rumors, and half-truths (is that redundant?) when we consider the fundamental lemmas of our honorable profession:

  • Only Errors Exist
  • One Man’s Errors Are Another Man’s Data

Bruce Truitt has 25+ years' experience in applied statistics and government auditing, with particular focus on quantitative methods and reporting in health and human services fraud, waste, and abuse. His tools and methods are used by public and private sector entities in all 50 states and 33 foreign countries and have been recognized by the National State Auditors Association for Excellence in Accountability.

He also teaches the US Government Auditor's Training Institute's "Practical Statistical Sampling for Auditors" course, is on the National Medicaid Integrity Institute's faculty, and taught Quantitative Methods in Saint Edward's University's Graduate School of Business.

Bruce holds a Master of Public Affairs from the LBJ School of Public Affairs, as well as master's degrees in Foreign Language Education and Russian and East European Studies from The University of Texas at Austin.
