Managing missing data is an important step in the data analysis process. Missing data almost always occurs: participants drop out of studies or skip a few questions on a survey. What matters is whether the data is missing at random or whether there is a systematic pattern that can affect the results. For this reason, missing data is commonly categorized in three ways: MCAR (missing completely at random), MAR (missing at random, ignorable), and MNAR (missing not at random, unignorable). While there is no set standard for how much missing data can be tolerated, many suggest that less than 5% is acceptable.
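As a first check against the 5% rule of thumb, it can help to tabulate how much of each variable is missing. The sketch below is only an illustration: the variable names and values are hypothetical, `None` is assumed to mark a missing answer, and the helper `missing_rates` is invented for this example. Note that a missingness rate by itself says nothing about whether the data is MCAR, MAR, or MNAR.

```python
# Hypothetical survey data stored by variable; None marks a missing answer.
data = {
    "age":    [34, 29, 41, 52, 38, 27, 45, 33, 36, 48,
               31, 39, 44, 26, 50, 37, 42, 30, 35, 47],
    "income": [52, 48, None, 61, 55, 40, None, 47, 53, 66,
               45, 58, 60, 39, None, 54, 59, 43, 51, 63],
}

THRESHOLD = 0.05  # the "less than 5%" guideline; a rule of thumb, not a standard


def missing_rates(columns):
    """Return the fraction of missing (None) values for each variable."""
    return {
        name: sum(value is None for value in values) / len(values)
        for name, values in columns.items()
    }


rates = missing_rates(data)
for name, rate in rates.items():
    status = "within guideline" if rate < THRESHOLD else "needs a closer look"
    print(f"{name}: {rate:.1%} missing ({status})")
```

A variable flagged here is a prompt to investigate why the values are missing, not a verdict on how to handle them.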
Once it has been determined that unignorable missing data is present, the next step is to decide how to deal with it. The most common approach is simply to remove any case with missing data from the analysis, whatever the reason for the missingness. Programs such as SPSS and SAS will drop these cases automatically by default. One downside of this option, however, is the potential for a large loss of data, especially if the missing values are scattered randomly throughout the data set. A researcher can therefore choose instead to estimate the missing values, and there are several ways to do so. One way is to use prior knowledge of the literature to make an educated guess about what the value should be. This method can be biased by the researcher's beliefs about the study, and it relies on the assumption that values will not change over time, so other researchers prefer to estimate missing values with more mathematical approaches. These methods replace each missing value with the mean of the observed data (mean substitution) or with a predicted value from a regression. Estimating a missing value with a regression is more objective, but the estimate is likely to be more consistent with the other data points than a real score would have been. Finally, some researchers run their analyses with and without the missing data to determine whether the results differ. This option is usually highly recommended when a data set is small and the amount of missing data is large.
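The three mechanical options described above (removing incomplete cases, mean substitution, and regression-based substitution) can be compared side by side on a toy example. This is a minimal sketch under stated assumptions: the variables `hours` and `score` and all of their values are hypothetical, `None` marks a missing score, and the regression is a simple least-squares line with a single predictor, which is only one of many possible regression models.

```python
from statistics import mean, stdev

# Hypothetical data: hours studied (complete) and exam score (None = missing).
hours = [2, 4, 5, 7, 8, 10, 11, 13]
score = [55, 60, None, 70, 74, None, 85, 90]

# Option 1: remove incomplete cases and keep only fully observed pairs.
complete = [(h, s) for h, s in zip(hours, score) if s is not None]
observed = [s for _, s in complete]

# Option 2: mean substitution. Every missing score gets the observed mean.
observed_mean = mean(observed)
mean_filled = [s if s is not None else observed_mean for s in score]

# Option 3: regression substitution. Fit a least-squares line to the
# complete cases, then predict each missing score from hours.
xs = [h for h, _ in complete]
x_bar, y_bar = mean(xs), mean(observed)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in complete)
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar
reg_filled = [s if s is not None else intercept + slope * h
              for h, s in zip(hours, score)]

# Compare sample size, mean, and spread under each option.
results = {
    "case removal": observed,
    "mean substitution": mean_filled,
    "regression substitution": reg_filled,
}
for label, values in results.items():
    print(f"{label}: n={len(values)}, "
          f"mean={mean(values):.2f}, sd={stdev(values):.2f}")
```

Printing the summaries next to each other is a small version of the sensitivity check mentioned above: if the conclusions shift noticeably between options, the handling of missing data deserves more attention. Mean substitution leaves the mean unchanged but shrinks the standard deviation, and regression-predicted values fall exactly on the fitted line, which is what makes them look more consistent with the rest of the data than real scores would.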
Overall, there are many ways to deal with missing data, each with its own pros and cons to consider. It is up to the researcher to determine the pattern of missing data and to choose an appropriate solution for the study in question.