Sunday, May 8, 2016

Chi Square Analysis

Used to examine differences in the distributions of nominal (categorical) data. Developed by Karl Pearson in 1900.

Chi Square Goodness of Fit: used to evaluate whether observed data fit a theoretical or known distribution
Different variations of Chi Square Goodness of Fit:
Simple - 2 categories at a time (k=2)
Example - k=2 categories of flower color: 1) yellow flowers and 2) green flowers.
Ex: compare the observed distribution of individuals with yellow or green flowers with a hypothetical distribution of 3 yellow : 1 green (75% yellow, 25% green).
Complex - More than 2 categories at a time (k>2)
Example - k=4 categories of seeds: 1) yellow and smooth seeds, 2) yellow and wrinkled seeds, 3) green and smooth seeds, 4) green and wrinkled seeds.
Ex: compare the observed distribution of individuals in these categories with a theoretical distribution of 9:3:3:1 (9/16 yellow and smooth seeds, 3/16 yellow and wrinkled seeds, 3/16 green and smooth seeds, 1/16 green and wrinkled seeds).
Mechanism:

  1. Statistical hypotheses: simple statements that a population fits a theoretical or known distribution or it does not
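  2. Formula for Chi Square involves comparison of deviations between observed (O) and expected (E) frequencies: Chi Square = sum of (O - E)^2 / E over all k categories
  3. Compare observed Chi Square with critical value from a table of critical values, using df = k - 1; an observed value larger than the critical value means the data do not fit the distribution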
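The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the observed counts are made-up numbers for the 9:3:3:1 seed example, the function name `chi_square_gof` is hypothetical, and the critical values are the usual 0.05-level entries from a standard Chi Square table.

```python
# Critical values of the Chi Square distribution at alpha = 0.05,
# as found in a standard table (df -> critical value).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square_gof(observed, ratio):
    """Return (chi_square, df) for observed counts tested against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    # Expected count in each category = total count * that category's share of the ratio
    expected = [total * r / ratio_sum for r in ratio]
    # Sum of (O - E)^2 / E over all categories
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # k - 1 categories are free to vary
    return chi_sq, df


# Hypothetical counts, in the order: yellow smooth, yellow wrinkled,
# green smooth, green wrinkled.
observed_seeds = [315, 101, 108, 32]
gof_chi_sq, gof_df = chi_square_gof(observed_seeds, [9, 3, 3, 1])

# The data are consistent with 9:3:3:1 if the observed Chi Square
# stays below the critical value for this df.
fits_ratio = gof_chi_sq < CRITICAL_05[gof_df]
print(f"Chi Square = {gof_chi_sq:.3f}, df = {gof_df}, fits 9:3:3:1: {fits_ratio}")
```

The same function covers the simple k=2 case, for example `chi_square_gof([yellow, green], [3, 1])` with df = 1.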
Chi Square Contingency Analysis:

Evaluates whether the frequency of occurrence of one variable is independent of the frequencies of a second variable. In other words: is membership in one category influenced by membership in a second category?
Example: is hair color independent of gender, or would you expect more boys to have dark hair and more girls to have light hair?
Mechanism:

  1. Statistical hypotheses: simple statements that one variable is independent of a second variable or that it is influenced by it
  2. Use a contingency table with rows (r) and columns (c)
  3. Obtain expected values for each cell in the table: (row total x column total) / grand total
  4. Formula for Chi Square involves comparison of deviations between observed and expected frequencies, summed over all cells
  5. Use the number of rows and columns to calculate the df for the critical value: df = (r - 1)(c - 1)
  6. Compare observed Chi Square with critical value from a table of critical values
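As a rough sketch of the contingency mechanism, the Python below works through the hair color example. The counts are invented for illustration, the function name `chi_square_contingency` is hypothetical, and the critical values are the usual 0.05-level entries from a standard Chi Square table.

```python
# Critical values of the Chi Square distribution at alpha = 0.05,
# as found in a standard table (df -> critical value).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def chi_square_contingency(table):
    """Return (chi_square, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count for a cell = row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (obs - expected) ** 2 / expected

    # df = (rows - 1) * (columns - 1)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df


# Hypothetical counts: rows are boys, girls; columns are dark hair, light hair.
hair_by_gender = [
    [30, 20],
    [20, 30],
]
ct_chi_sq, ct_df = chi_square_contingency(hair_by_gender)

# Independence is retained only if the observed Chi Square
# stays below the critical value for this df.
looks_independent = ct_chi_sq < CRITICAL_05[ct_df]
print(f"Chi Square = {ct_chi_sq:.3f}, df = {ct_df}, independent: {looks_independent}")
```

With these invented counts every expected cell is 25, so the observed Chi Square exceeds the df = 1 critical value and the hypothesis of independence would be rejected at the 0.05 level.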


