Inferential Statistics with R
Chi-Square
This lesson is called Chi-Square, part of the Inferential Statistics with R course.
Your Turn
Perform a chi-square test to examine how grade_class relates to live_on_campus. What is the p-value? Is there a relationship? If there is a significant difference, examine the standardized residuals and the observed/expected frequencies to determine which grade class is more or less likely to live on campus. Interpret the results.
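A minimal sketch of the workflow this exercise asks for, written with base R's chisq.test() rather than the janitor-based approach. The students data frame, its grade levels, and its randomly generated values are assumptions for illustration only, not the course data, so the numbers it produces say nothing about the real answer.

```r
# Toy data for illustration only; the course dataset is not reproduced here.
set.seed(1)
students <- data.frame(
  grade_class = factor(sample(c("freshman", "sophomore", "junior", "senior"),
                              200, replace = TRUE)),
  live_on_campus = factor(sample(c("yes", "no"), 200, replace = TRUE))
)

# Cross-tabulate the two factors, then run Pearson's chi-square test.
counts <- table(students$grade_class, students$live_on_campus)
result <- chisq.test(counts)

result$p.value   # p-value for the test of independence
result$observed  # observed cell counts
result$expected  # counts expected if the two variables were unrelated
result$stdres    # standardized residuals, one per cell
```

As a rough rule of thumb, cells whose standardized residual is beyond about 2 in absolute value are the ones that depart most from what independence would predict; the sign shows whether the cell has more or fewer cases than expected.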
Zach Tilton
March 9, 2022
Hi Dana, thanks for the great videos here. I have been using what I learned in this video a lot lately. However, I recently hit a roadblock when running a chi-square test on two variables in a data set I am working with. Both variables are factors, but one has a level I don't want to include in the analysis because it doesn't make sense and would bias my test. To explain, for this test I am looking at the relationship between evaluation report type and generic evaluation use, where report type has the levels "written", "oral", "both", or "none" (which is what I am attempting to filter out) and where generic use has the dichotomous levels "yes" or "no".
When I run the following code, the "none" level still shows up in the tabyl output, despite the fact I seem to have successfully filtered that level from the original dataframe. This prevents the test from working, though it does show the residuals.
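library(dplyr)    # filter() and %>%
library(janitor)  # tabyl() and chisq.test() for tabyl objects
library(broom)    # tidy()
# `evaluation_data` is a placeholder: the original data frame name did not survive posting.
report_type_yes <- evaluation_data %>% filter(report_type != "none")
report_type_x_generic_use <- report_type_yes %>% tabyl(report_type, generic_use, show_na = FALSE) %>% janitor::chisq.test()
tidy(report_type_x_generic_use)
report_type_x_generic_use$stdres
report_type_x_generic_use$observed
report_type_x_generic_use$expected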
Here are the outputs:
statistic  p.value  parameter  method
NaN        NaN      3          Pearson's Chi-squared test

(stdres)
report_type         no        yes
none               NaN        NaN
written       4.264175  -4.264175
oral          1.489306  -1.489306
both         -4.905978   4.905978

(observed)
report_type   no  yes
none           0    0
written       53  120
oral          10   23
both          33  235

(expected)
report_type         no        yes
none          0.000000    0.00000
written      35.037975  137.96203
oral          6.683544   26.31646
both         54.278481  213.72152
Any thoughts on what might be happening? I recognize this might be more of a data manipulation question, but related to this statistic, nonetheless. Let me know if anything isn't clear or you need more information. Many thanks in advance for responding to a long question.
Zach Tilton
March 9, 2022
Those initial pipe operators don't seem to have translated, but they are correct like the third one in this sample code chunk.
Dana Wanzer
March 17, 2022
Hey Zach! I know we spoke briefly offline about this, but I want to comment here for other students who might have similar questions. filter() does not drop factor levels; it only removes the rows, so the level is still there with no values in it.
One option would have been to use the droplevels() function to remove the "none" level.
Another option, which I think you used, is the fct_drop() function from the forcats package, which drops unused levels from a categorical variable.
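For anyone who wants to reproduce the issue, here is a minimal sketch in base R. The evals data frame is a hypothetical reconstruction built from the observed counts posted in the question, and table() with base chisq.test() stands in for the tabyl() pipeline, so treat it as an illustration of the behavior rather than the exact code used. The empty "none" row has expected counts of 0, which is what turns the statistic into NaN.

```r
# Hypothetical data frame rebuilt from the observed counts in the question.
# "none" is kept as a factor level with no rows, as it would be after filter().
evals <- data.frame(
  report_type = factor(rep(c("written", "oral", "both"), times = c(173, 33, 268)),
                       levels = c("none", "written", "oral", "both")),
  generic_use = factor(c(rep(c("no", "yes"), times = c(53, 120)),
                         rep(c("no", "yes"), times = c(10, 23)),
                         rep(c("no", "yes"), times = c(33, 235))))
)

# The unused level still produces an all-zero row, so the test returns NaN.
broken <- suppressWarnings(
  chisq.test(table(evals$report_type, evals$generic_use))
)
broken$statistic

# droplevels() removes unused levels from every factor column.
fixed <- droplevels(evals)
levels(fixed$report_type)

# Single-column alternative, if forcats is installed:
# evals$report_type <- forcats::fct_drop(evals$report_type)

result <- chisq.test(table(fixed$report_type, fixed$generic_use))
result$statistic
result$parameter  # degrees of freedom
result$p.value
result$stdres
```

Once the empty level is gone, the table is 3 x 2 and the degrees of freedom fall from 3 to 2, so the test and its residuals are computed only over the report types that actually occur.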