Inferential Statistics with R
Chi-Square
This lesson is called Chi-Square, part of the Inferential Statistics with R course.
Your Turn
Perform a chi-square test to examine how grade_class relates to live_on_campus. What is the p-value? Is there a relationship? If there is a significant difference, examine the standardized residuals and the observed/expected frequencies to determine which grade class is more or less likely to live on campus. Interpret the results.
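The steps in this exercise can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `survey` and all of its counts are made up (the course dataset is not reproduced here), and base R's `table()` and `chisq.test()` stand in for whichever functions the lesson itself uses.

```r
# Hypothetical data: 40 students per grade class, with made-up counts of
# who lives on campus. These are NOT the course dataset's values.
survey <- data.frame(
  grade_class = factor(rep(
    c("freshman", "sophomore", "junior", "senior"),
    times = c(40, 40, 40, 40)
  )),
  live_on_campus = factor(c(
    rep(c("yes", "no"), times = c(32, 8)),   # freshman
    rep(c("yes", "no"), times = c(24, 16)),  # sophomore
    rep(c("yes", "no"), times = c(16, 24)),  # junior
    rep(c("yes", "no"), times = c(8, 32))    # senior
  ))
)

# Cross-tabulate the two factors and run Pearson's chi-square test.
tab <- table(
  grade_class = survey$grade_class,
  live_on_campus = survey$live_on_campus
)
result <- chisq.test(tab)

result$p.value   # is there evidence of a relationship?
result$observed  # observed frequencies
result$expected  # frequencies expected if the variables were independent
result$stdres    # standardized residuals: which cells drive the result
```

When reading the residuals, a large positive value in a cell means more cases were observed there than independence would predict, and a large negative value means fewer.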
Zach Tilton
March 10, 2022
Hi Dana, thanks for the great videos here. I have been using what I learned in this video a lot lately. However, I recently hit a roadblock when running a chi-square test on two variables in a data set I am working with. Both variables are factors, but one has a level or value I don't want to include in the analysis because it doesn't make sense and would bias my test. To explain, for this test I am looking at the relationship between evaluation report type and generic evaluation use, where report type has the levels "written", "oral", "both", or "none" (which is what I am attempting to filter out) and where generic use has the dichotomous levels of "yes" or "no".
When I run the following code, the "none" level still shows up in the tabyl output, despite the fact I seem to have successfully filtered that level from the original dataframe. This prevents the test from working, though it does show the residuals.
report_type_yes <- ... %>% filter(report_type != "none")
report_type_x_generic_use <- report_type_yes %>% tabyl(report_type, generic_use, show_na = FALSE) %>% janitor::chisq.test()
tidy(report_type_x_generic_use)
report_type_x_generic_use$stdres
report_type_x_generic_use$observed
report_type_x_generic_use$expected
Here are the outputs:
statistic  p.value  parameter  method
NaN        NaN      3          Pearson's Chi-squared test

(stdres)
report_type         no        yes
none               NaN        NaN
written       4.264175  -4.264175
oral          1.489306  -1.489306
both         -4.905978   4.905978

(observed)
report_type   no  yes
none           0    0
written       53  120
oral          10   23
both          33  235

(expected)
report_type          no        yes
none           0.000000    0.00000
written       35.037975  137.96203
oral           6.683544   26.31646
both          54.278481  213.72152
Any thoughts on what might be happening? I recognize this might be more of a data manipulation question, but related to this statistic, nonetheless. Let me know if anything isn't clear or you need more information. Many thanks in advance for responding to a long question.
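One likely explanation for the output above: filtering removes the rows, but a factor keeps every level it was created with, so "none" survives as an empty level and yields a row of zeros whose expected counts are 0 (hence the NaN). The sketch below reproduces that behaviour in base R and shows `droplevels()` as one way to discard the unused level. The observed counts for written, oral, and both are taken from the post; the "none" counts, the object names, and the use of base R's `table()` in place of `tabyl()` are assumptions made for illustration.

```r
# Cell counts from the post, plus HYPOTHETICAL counts for "none"
# (the original data's "none" rows are not shown in the post).
counts <- data.frame(
  report_type = c("written", "written", "oral", "oral",
                  "both", "both", "none", "none"),
  generic_use = c("no", "yes", "no", "yes", "no", "yes", "no", "yes"),
  n           = c(53, 120, 10, 23, 33, 235, 4, 6)
)

# Expand the counts into one row per evaluation and convert to factors.
evals <- counts[rep(seq_len(nrow(counts)), counts$n),
                c("report_type", "generic_use")]
evals$report_type <- factor(evals$report_type,
                            levels = c("none", "written", "oral", "both"))
evals$generic_use <- factor(evals$generic_use, levels = c("no", "yes"))

# Filtering drops the rows, but the factor still remembers the level.
filtered <- subset(evals, report_type != "none")
levels(filtered$report_type)  # "none" is still listed

# The empty "none" row has expected counts of 0, so the statistic is NaN.
broken <- suppressWarnings(
  chisq.test(table(filtered$report_type, filtered$generic_use))
)

# droplevels() removes unused levels from every factor in the data frame.
fixed_data <- droplevels(filtered)
fixed <- chisq.test(table(fixed_data$report_type, fixed_data$generic_use))

fixed$p.value
fixed$stdres
```

In a dplyr pipeline the same idea would be to add `droplevels()` as a step right after `filter()`, before the cross-tabulation. With the empty row gone, the test has 2 degrees of freedom instead of 3.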