Too often when attempting to measure healthcare processes or outcomes we struggle to collect the data we need. Maybe we couldn’t anticipate the barriers we would encounter, or maybe we just didn’t observe as many events as we thought we would. When this happens, we’re often left with small samples or data that are different from what we had hoped for, and we’re not sure what can (or should) be done with them.
All is not lost! There are often several options in these situations, but you have to know how to proceed, and you need to be careful when communicating the results of analyses from these types of datasets. When trying to use small samples or “not-quite-what-you-had-hoped” data, there are three things to keep in mind:
1. Create thoughtful, effective graphs. One of the advantages of having few data is that you can display a lot of information on a single graph, and sometimes tell the entire story. For example, if we are tracking Clostridium difficile infections (CDIs) and have only a handful of data points, a simple graph of the points over time looks like this:
This tells part of the story, including that there seems to be a decrease in CDI rates starting in late 2014. But it displays only two characteristics of our data: the rate and the time. With this relatively small dataset, we could easily add more information to the graph without confusing the reader or cluttering the display, and doing so would help to tell a more complete story of the data.
To demonstrate this, examine the figure below where we’re representing 7 pieces of information at the same time:
In addition to the rate (1) and the time (2), we’ve differentiated between before and after some event (3) (e.g., a training session, a new guideline, etc.) with the vertical red line and shape of the point (circle vs square), added the mean rate before (4) and after (5) the event as dotted lines, color-coded the points to represent some dichotomous characteristic (6) (this could be a patient characteristic like gender, or perhaps an indication of where the patient initially presented [ED vs inpatient], or any number of things), and finally have called attention to a point of interest (7) using an arrow (perhaps upon further investigation there is something unusual about this point we want the reader to know).
With just one or two good graphs, you can construct an effective narrative to demonstrate to the reader important patterns in the data and what the results tell you.
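As a rough illustration of how such a layered graph might be built, here is a minimal Python sketch. Everything in it is an assumption made for the example: the rates are invented (not the data behind the figures above), the names `RECORDS`, `EVENT_INDEX`, `HIGHLIGHT_INDEX`, `before_after_means`, and `plot_rates` are hypothetical, and the plotting step assumes the matplotlib library is installed.

```python
from statistics import mean

# Hypothetical quarterly CDI rates; illustrative values only, not real data.
# Each record is (period, rate, setting); "setting" stands in for any
# dichotomous characteristic, such as ED vs inpatient.
RECORDS = [
    ("2013-Q3", 8.0, "ED"),
    ("2013-Q4", 9.0, "Inpatient"),
    ("2014-Q1", 7.0, "ED"),
    ("2014-Q2", 8.0, "Inpatient"),
    ("2014-Q3", 8.0, "ED"),
    ("2014-Q4", 5.0, "Inpatient"),
    ("2015-Q1", 4.0, "ED"),
    ("2015-Q2", 6.0, "Inpatient"),
    ("2015-Q3", 5.0, "ED"),
    ("2015-Q4", 5.0, "Inpatient"),
]
EVENT_INDEX = 5      # assumed: the event falls just before the 6th point
HIGHLIGHT_INDEX = 7  # assumed: the point we want to call out


def before_after_means(rates, event_index):
    """Return the mean rate before the event and the mean rate after it."""
    return mean(rates[:event_index]), mean(rates[event_index:])


def plot_rates(records, event_index, highlight_index, path="cdi_rates.png"):
    """Draw the layered graph and save it to `path` (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so the rest runs without it

    periods = [period for period, _, _ in records]
    rates = [rate for _, rate, _ in records]
    before, after = before_after_means(rates, event_index)
    colors = {"ED": "tab:blue", "Inpatient": "tab:orange"}
    event_x = event_index - 0.5  # draw the event line between two points

    fig, ax = plt.subplots(figsize=(8, 4))
    for i, (_, rate, setting) in enumerate(records):
        # Marker shape shows before/after the event; color shows the setting.
        marker = "o" if i < event_index else "s"
        ax.scatter(i, rate, marker=marker, color=colors[setting], zorder=3)
    ax.axvline(event_x, color="red")
    # Dotted lines for the mean rate before and after the event.
    ax.hlines(before, -0.5, event_x, colors="gray", linestyles="dotted")
    ax.hlines(after, event_x, len(rates) - 0.5, colors="gray", linestyles="dotted")
    # Arrow calling attention to one point of interest.
    ax.annotate(
        "point of interest",
        xy=(highlight_index, rates[highlight_index]),
        xytext=(highlight_index, rates[highlight_index] + 1.5),
        arrowprops={"arrowstyle": "->"},
        ha="center",
    )
    ax.set_xticks(range(len(periods)))
    ax.set_xticklabels(periods, rotation=45, ha="right")
    ax.set_ylabel("CDI rate (hypothetical units)")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `plot_rates(RECORDS, EVENT_INDEX, HIGHLIGHT_INDEX)` would write the figure to a PNG file. Keeping the before/after means in their own small function makes it easy to quote the same numbers in the text that the dotted lines show on the graph.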
2. Consider using non-parametric tests. To explain what non-parametric tests are, let’s start by describing parametric tests, which you have almost certainly encountered even if you didn’t know them by that name. Parametric tests are things like t-tests, z-tests for proportions, and ANOVA. They are called “parametric” because they rely on an assumption that the data follow a certain distribution, one defined by “parameters” that are typically estimated during the process of performing the test. Non-parametric tests, by contrast, don’t require that sort of assumption, so they are a good alternative when you have few data that may not follow the distribution required for the parametric tests to apply (e.g., data that are not normally distributed).
Why do we make a distinction between parametric and non-parametric tests, and why do we typically choose one over the other? Because when we have enough data to ensure that the distribution-based assumptions hold, parametric tests are generally more powerful than non-parametric tests. That means they are better able to detect a difference or an effect when there actually is one. But, as we said, parametric tests also rely on the distribution assumptions, and when small sample sizes limit our ability to assess the validity of those assumptions, we need to consider the non-parametric alternatives.
Luckily, there are a variety of non-parametric methods to test for differences between groups, perform ANOVA-like analyses, and even produce confidence intervals (through a method called “bootstrapping”). Common non-parametric tests include the Wilcoxon rank-sum test (also known as the Mann-Whitney U test), the Wilcoxon signed-rank test for paired data, and the Kruskal-Wallis test. An advantage of these tests is that they are often relatively intuitive, so even non-statisticians can understand exactly what the test is doing. For example, the Wilcoxon rank-sum test compares two independent samples by ranking all of the observations in both samples together and then looking at the sum of the ranks in each sample to see how different they are. Even if you only have 10 or 12 data points, you may be able to run some of these tests to check for statistical significance.
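To make the rank-sum idea concrete, here is a minimal Python sketch using only the standard library. It ranks the pooled observations, sums the ranks for the first sample, and then gets an exact two-sided p-value by enumerating every possible way the ranks could have been split between the two groups. The function names `average_ranks` and `rank_sum_test` are hypothetical, and the enumeration is only practical for samples about as small as the ones discussed here; with more data, a statistics package (for example, SciPy’s `scipy.stats.mannwhitneyu`) would be the more sensible choice.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(sample_a, sample_b):
    """Return (rank sum of sample_a, exact two-sided p-value)."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    n_a = len(sample_a)
    observed = sum(ranks[:n_a])
    # Rank sum we would expect for sample_a if the groups did not differ.
    expected = n_a * (len(ranks) + 1) / 2
    observed_gap = abs(observed - expected)

    # Count the splits of the pooled ranks that are at least as extreme
    # as the split we actually observed.
    extreme = 0
    total = 0
    for combo in combinations(ranks, n_a):
        total += 1
        if abs(sum(combo) - expected) >= observed_gap - 1e-9:
            extreme += 1
    return observed, extreme / total
```

As a usage example with made-up numbers, `rank_sum_test([8, 9, 7, 10, 11], [5, 4, 6, 3, 2])` compares two completely separated groups of five. The first group holds ranks 6 through 10, so its rank sum is 40, and only 2 of the 252 possible splits are that extreme, giving a p-value of about 0.008.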
3. Be transparent with the data and precise with your language. Any time you have a small sample, your audience will be skeptical. If you’re publishing a paper or report, consider making the entire dataset available to the readers (de-identified, of course). Then, if they want to, they can run their own analyses to check your work (this almost never happens, but providing the data sends the message that you have nothing to hide and assures the reader that you’re not trying to mislead them or misrepresent the data). Precision with your language is essential so that you do not inappropriately generalize your results beyond the scope of your data. It can be tough, and readers may be particularly sensitive. Statements like, “These results suggest that treatment X can be effective…” may receive pushback because the reader may assume you’re trying to generalize your small-sample results too much, and you may have to be more cautious: “Within this dataset, treatment with X produced a statistically significant improvement in…”
Whenever we discuss results of health-related data, we should think about clinical significance as well as statistical significance. That is, in large samples, sometimes small differences will be statistically significant, even if they don’t represent much difference in clinical states or outcomes. A correct and complete interpretation of the data should acknowledge this discrepancy. For small samples, the opposite could be true: large and clinically significant differences may not achieve statistical significance, and it will be important to keep that in mind when interpreting your results. Again: be transparent with your approach and interpretation. If you find the results to be interesting or compelling, it’s likely that others will, too – you just need to be careful to not lead (or mislead) the reader.
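One way to keep the size of a difference in view alongside its statistical significance is to report the difference together with an interval, which the bootstrapping mentioned earlier can provide. The sketch below is a simple percentile bootstrap in standard-library Python; the function name `bootstrap_diff_ci`, its default settings, and the example numbers are all assumptions made for illustration, and with very small samples such intervals should be read as rough guides rather than precise bounds.

```python
import random
from statistics import mean


def bootstrap_diff_ci(before, after, n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean(after) - mean(before).

    Returns (observed difference, lower bound, upper bound).
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resampled_before = rng.choices(before, k=len(before))
        resampled_after = rng.choices(after, k=len(after))
        diffs.append(mean(resampled_after) - mean(resampled_before))
    diffs.sort()

    # Cut the same number of resampled differences off each tail.
    k = int(round((1 - level) / 2 * n_resamples))
    lower = diffs[k]
    upper = diffs[n_resamples - 1 - k]
    return mean(after) - mean(before), lower, upper


# Hypothetical rates before and after some event (illustrative values only).
estimate, lower, upper = bootstrap_diff_ci(
    [8.0, 9.0, 7.0, 8.0, 8.0], [5.0, 4.0, 6.0, 5.0, 5.0]
)
print(f"difference = {estimate:.1f}, 95% interval = ({lower:.1f}, {upper:.1f})")
```

Reporting the estimate and its interval lets readers judge for themselves whether the range of plausible differences is clinically meaningful, rather than relying on a p-value alone.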
A Final Word
When you have a small dataset, you need to get creative. Having only a few data points forces you to get clarity on what your data say and what they don’t say, and that in and of itself is a story worth telling. So, if you find yourself with a dataset that isn’t quite what you’d hoped it would be, don’t toss it aside. Create thoughtful graphs, consider non-parametric tests, and be transparent when you describe the story it tells.