A data analysis plan describes how your data will be cleaned, transformed, and analyzed. Scientific research should be replicable, and for a study to be replicable you need to give the reader a roadmap of how you managed your data and conducted the analyses. Each of the following areas can be included in a data analysis plan.
Cleaning the Data
Cleaning the data involves identifying and removing univariate and multivariate outliers, dealing with missing data, and assessing normality.
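As a rough illustration of two of these cleaning steps, the Python sketch below drops missing values and flags univariate outliers by z-score. The scores are hypothetical, and both the listwise deletion and the cutoff of 3 standard deviations are assumptions made for the example, not rules from this plan. Multivariate outliers and normality checks are not covered here.

```python
import math
import statistics

def drop_missing(values):
    """Listwise deletion: keep only entries that are present and not NaN."""
    return [v for v in values if v is not None and not math.isnan(v)]

def flag_univariate_outliers(values, z_cutoff=3.0):
    """Return the values whose absolute z-score exceeds z_cutoff (assumed cutoff)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs((v - mean) / sd) > z_cutoff]

# Hypothetical scores with two missing entries and one extreme value.
raw_scores = [12.0, 14.0, None, 13.0, 15.0, 11.0, 14.0, 13.0,
              12.0, float("nan"), 15.0, 13.0, 14.0, 12.0, 95.0]

complete = drop_missing(raw_scores)
outliers = flag_univariate_outliers(complete)
cleaned = [v for v in complete if v not in outliers]

print("Removed as missing:", len(raw_scores) - len(complete))
print("Flagged as outliers:", outliers)
```

Whatever rules you actually use, the plan should state them explicitly so that a reader could repeat the same cleaning.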
When the data are not normally distributed, transforming them can be appropriate. Common transformations include the square root, logarithmic, and inverse transformations.
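To see what these transformations do, the following Python sketch applies each one to a small, hypothetical right-skewed sample and reports the skewness before and after. The data and the helper name `skewness` are assumptions for illustration; skewness is only one rough indicator of departure from normality.

```python
import math

def skewness(values):
    """Population skewness: 0 for a perfectly symmetric distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed, strictly positive data.
data = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 5.0, 8.0, 15.0, 40.0]

transforms = {
    "square root": [math.sqrt(v) for v in data],   # requires v >= 0
    "logarithmic": [math.log(v) for v in data],    # natural log; requires v > 0
    "inverse": [1.0 / v for v in data],            # requires v != 0; reverses the ordering
}

print(f"raw: skewness {skewness(data):.2f}")
for name, transformed in transforms.items():
    print(f"{name}: skewness {skewness(transformed):.2f}")
```

For this sample, the square root reduces the skew and the logarithm reduces it further. Which transformation is appropriate depends on the data, and the choice should be reported in the plan.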
Describing the Specific Statistical Tests to Examine Each of the Research Questions
The selection of statistical analyses is based on two things: how the hypothesis is stated in statistical language and the level of measurement of the variables.
Hypothesis
The way the researcher states the hypothesis affects the choice of data analysis test. Here are three examples of null hypotheses:
(Example 1) Variable A does not relate to Variable B.
Example 1 tends to be stated in correlation or chi-square language.
(Example 2) Variable A does not predict Variable B.
Example 2 is stated in regression language.
(Example 3) There are no differences on Variable A by Variable B.
Example 3 is stated in ANOVA or Mann-Whitney language.
How does one choose the precise data analysis test? Beyond whether the hypothesis is phrased in terms of differences, prediction, or relationship, the other consideration in test selection is the level of measurement of each variable.
Level of Measurement of the Variables to Select the Correct Data Analysis
In the hypotheses above, the level of measurement of the variables is a key factor in selecting the correct data analysis.
In example 1, if the variables are both categorical, the correct analysis would be a chi-square test, while if both variables are interval-level, a Pearson correlation would most likely be the correct analysis to examine relationships.
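The two statistics for Example 1 can be sketched in plain Python as follows. The counts and scores are hypothetical, and the function names are assumptions for illustration. The sketch computes only the test statistics; in practice a statistics package (for example, `scipy.stats.chi2_contingency` and `scipy.stats.pearsonr` in SciPy) would also supply p-values.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def pearson_r(x, y):
    """Pearson correlation coefficient between two interval-level variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Both variables categorical: hypothetical 2x2 table of counts.
counts = [[30, 10],
          [20, 40]]
chi2 = chi_square_statistic(counts)

# Both variables interval-level: hypothetical paired measurements.
variable_a = [1, 2, 3, 4, 5]
variable_b = [52, 55, 61, 64, 70]
r = pearson_r(variable_a, variable_b)

print(f"chi-square statistic: {chi2:.2f}")
print(f"Pearson r: {r:.3f}")
```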
In example 2, regression is the appropriate test (i.e., examining the influence of a variable on another variable). Linear regression is the correct analysis if the dependent variable is interval-level, logistic regression if the dependent variable is dichotomous, and multinomial logistic regression if the dependent variable has three or more categories.
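For the interval-level case, a linear regression with a single predictor can be sketched with ordinary least squares, as below. The data and the function name are hypothetical. Logistic and multinomial logistic regression are not shown, since they are normally fitted iteratively with a statistics package.

```python
def fit_simple_linear_regression(x, y):
    """Ordinary least squares with one predictor: returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: Variable A (predictor) and interval-level Variable B (outcome).
variable_a = [1, 2, 3, 4, 5]
variable_b = [52, 55, 61, 64, 70]

slope, intercept = fit_simple_linear_regression(variable_a, variable_b)
predicted_at_6 = intercept + slope * 6  # predicted Variable B when Variable A is 6

print(f"slope: {slope:.2f}, intercept: {intercept:.2f}")
```

Under the null hypothesis in Example 2, the slope would not differ from zero beyond what chance would explain. Testing that requires the standard error of the slope, which this sketch does not compute.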
In example 3, if the dependent variable is interval-level, an ANOVA is appropriate, whereas an ordinal dependent variable would lead one to select the Mann-Whitney test (when two groups are being compared).
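The selection rules for the three examples can be summarized as a small lookup table. The Python sketch below is only a restatement of the guidance above; the key names (such as "relationship" and "multicategory") and the function name are assumptions for illustration, and the table covers only the cases discussed here.

```python
# Keys are (how the hypothesis is phrased, level of measurement).
# For "relationship" the level describes both variables; for "prediction"
# and "differences" it describes the dependent variable.
TEST_BY_HYPOTHESIS_AND_LEVEL = {
    ("relationship", "categorical"): "chi-square test",
    ("relationship", "interval"): "Pearson correlation",
    ("prediction", "interval"): "linear regression",
    ("prediction", "dichotomous"): "logistic regression",
    ("prediction", "multicategory"): "multinomial logistic regression",
    ("differences", "interval"): "ANOVA",
    ("differences", "ordinal"): "Mann-Whitney test",
}

def select_test(hypothesis_type, level):
    """Look up the test suggested for a hypothesis phrasing and measurement level."""
    key = (hypothesis_type, level)
    if key not in TEST_BY_HYPOTHESIS_AND_LEVEL:
        raise ValueError(f"No rule in this sketch for {key}")
    return TEST_BY_HYPOTHESIS_AND_LEVEL[key]

print(select_test("relationship", "categorical"))
print(select_test("prediction", "dichotomous"))
print(select_test("differences", "ordinal"))
```

Combinations that are not listed raise an error rather than guessing, which mirrors the point that the test should follow from both the hypothesis and the level of measurement.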
Putting the Data Analysis All Together
In the data analysis plan, address the data cleaning and transformation procedures first, then discuss the specific data analysis tests to be conducted. Be sure to state each hypothesis in the form that matches your aim: to examine relationships, to predict, or to examine differences on one variable by another.