Biostats BI 345 Blog : April 2016

Measures of Central Tendency

Arithmetic Mean or Average
Median- Middle datum of a sample

50% of data lies above the median
50% of data lies below the median

To find the median

Step 1- sort data
Step 2- determine whether n is even or odd (odd: the median is the middle value; even: the median is the mean of the two middle values)

Mode- datum that occurs most often in a sample

2 Step process

Sort the data
Conduct frequency analysis
Count the number of occurrences of each datum
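
The steps above for the median and the mode can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are made up for the example and are not part of the course notes.

```python
from collections import Counter

def mean(data):
    # Arithmetic mean: sum of the data divided by the sample size n.
    return sum(data) / len(data)

def median(data):
    s = sorted(data)              # Step 1: sort the data
    n = len(s)
    mid = n // 2
    if n % 2 == 1:                # Step 2: n is odd, take the middle datum
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2   # n is even, average the two middle data

def modes(data):
    counts = Counter(data)        # frequency analysis: occurrences of each datum
    top = max(counts.values())
    # A sample can have more than one mode, so return every value tied for most frequent.
    return sorted(value for value, count in counts.items() if count == top)

sample = [4, 7, 2, 7, 5, 9, 7, 4]      # hypothetical data
print(mean(sample), median(sample), modes(sample))   # 5.625 6.0 [7]
```

Returning a list from modes() is a design choice for the sketch, so that bimodal samples are reported honestly instead of picking one value arbitrarily.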

Measure of Variation

Used to assess the variation of data around the mean, median, or mode

Range

the numerical difference between the minimum and maximum values in a data set

Variance

calculated in squared units of the original data

population
sample

Standard deviation

measure of the spread of data around a mean

Standard error

describes dispersion of sample means around their population mean

Coefficient of variation

used to compare the amount of variation among samples with data that differ in magnitude

Variance

2 step process

Find SS (the sum of squared deviations from the mean)
Divide SS by df = n-1

use n-1=df
Conservative estimate of the population variance
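
A minimal Python sketch of the measures of variation listed above, assuming the usual sample formulas (SS divided by n-1, SE = SD / sqrt(n), CV = SD / mean as a percentage). The function names and the data are hypothetical.

```python
import math

def sum_of_squares(data):
    # SS: sum of squared deviations of each datum from the sample mean.
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data)

def sample_variance(data):
    # Step 1: find SS. Step 2: divide SS by df = n - 1.
    return sum_of_squares(data) / (len(data) - 1)

def describe_variation(data):
    n = len(data)
    m = sum(data) / n
    var = sample_variance(data)
    sd = math.sqrt(var)                    # back in the original units
    return {
        "range": max(data) - min(data),    # maximum minus minimum
        "variance": var,                   # squared units
        "sd": sd,
        "se": sd / math.sqrt(n),           # standard error of the mean
        "cv_percent": 100 * sd / m,        # unitless, comparable across magnitudes
    }

sample = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical data
for name, value in describe_variation(sample).items():
    print(name, round(value, 4))
```

Dividing SS by n instead of n-1 would give the population formula; the sketch uses n-1 because that is the version the notes describe.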

Normal Distribution

Used to determine the probability of obtaining random samples with different means
Many samples and populations contain data that fit a normal distribution
Can be used for basis of statistical testing

Characteristics

Bell Curve

most values near the middle datum or average of the sample
very few values near the upper and lower extremes

Data fit the formula of a normal distribution

Y= frequency of a value of x
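
The formula itself is not reproduced in these notes. Assuming it is the standard normal density, Y = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2)), with mu the mean and sigma the standard deviation, a small Python sketch shows the bell shape. The function name and the example parameters are illustrative only.

```python
import math

def normal_density(x, mu, sigma):
    # Height of the normal curve (Y) at a value x, for mean mu and SD sigma.
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

mu, sigma = 0.0, 1.0    # hypothetical parameters (standard normal)
for x in (-3, -2, -1, 0, 1, 2, 3):
    # Y is largest at the mean and falls off symmetrically toward both extremes.
    print(x, round(normal_density(x, mu, sigma), 4))
```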

Deviations from a Normal Distribution

Asymmetric Deviations

Skewed distributions

Skewed to the right= pos. skewed
Skewed to the left= neg. skewed

Mathematical Analysis of skewness

Kurtic Deviations

Platykurtic
Leptokurtic
Mathematical analysis of kurtosis
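
The notes do not give the formulas for skewness and kurtosis. One common choice, assumed here, is the moment-based pair g1 = m3 / m2^1.5 and g2 = m4 / m2^2 - 3, where mk is the average of the k-th powers of the deviations from the mean. Textbooks often use bias-corrected versions, so treat this Python sketch as an illustration with hypothetical names and data.

```python
def central_moment(data, k):
    # k-th central moment: average of (x - mean)^k over the sample.
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)

def skewness(data):
    # g1 > 0: skewed to the right (positive); g1 < 0: skewed to the left (negative).
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    # g2 < 0: platykurtic (flatter than normal); g2 > 0: leptokurtic (more peaked, heavier tails).
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

symmetric = [1, 2, 3, 4, 5]        # hypothetical data
right_skewed = [1, 1, 1, 2, 10]    # hypothetical data with a long right tail
print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(right_skewed))
```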

Student's t-test

Background

Developed by Gosset, working at the Guinness Brewery
Problems with the Normal Distribution
Gosset discovers the t distribution
Publishes it under an assumed name- "Student"

Calculation of Z scores requires knowledge of population parameters

Population Mean
Population Standard Deviation and Standard Error
Small samples do not provide reliable enough estimates of population parameters

Characteristics of the t distribution

Leptokurtic
As n and v (df=v=n-1) increase, the t distribution approaches a normal distribution

Types of Student's t tests

One-sample Student's t test
Two independent (unpaired) Samples Student's t test
Two dependent (paired) Samples Student's t test

One-sample Student's t test

Used to compare a population mean inferred from a sample with a hypothetical population mean

Two Independent (unpaired) Sample Student's t test

Used to compare two independent population means inferred from two samples (independent indicates that the values from both samples are numerically independent of each other- there is no correlation between corresponding values)

Two dependent (paired) Samples Student's t test

Used to compare two dependent population means inferred from two samples (dependent indicates that the values from both samples are numerically dependent upon each other- there is a correlation between corresponding values)
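
The three tests can be sketched as observed t statistics in Python. The notes do not spell out the formulas, so the standard textbook versions are assumed here: the independent test uses a pooled variance (which assumes equal population variances), and the paired test is a one-sample test on the pairwise differences. Function names and data are hypothetical.

```python
import math

def _mean(xs):
    return sum(xs) / len(xs)

def _variance(xs):
    # Sample variance: SS / (n - 1).
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def one_sample_t(sample, mu0):
    # t = (sample mean - hypothetical population mean) / SE, with df = n - 1.
    n = len(sample)
    se = math.sqrt(_variance(sample) / n)
    return (_mean(sample) - mu0) / se, n - 1

def independent_t(a, b):
    # Pooled-variance form; df = n_a + n_b - 2.
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * _variance(a) + (nb - 1) * _variance(b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (_mean(a) - _mean(b)) / se, df

def paired_t(a, b):
    # Corresponding values are linked, so test the differences against zero.
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]
    return one_sample_t(diffs, 0.0)

before = [1, 2, 3, 4, 5]    # hypothetical data
after = [2, 4, 5, 4, 7]     # hypothetical data
print(one_sample_t(before, 2))
print(independent_t(before, [3, 4, 5, 6, 7]))
print(paired_t(before, after))
```

Each function returns the observed t and its degrees of freedom; the decision still requires comparing t with a critical value from a t table at the chosen alpha level.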

Two variations of all Student's t tests

Two-tailed test
One-tailed test

Two-tailed test- evaluates whether a difference exists between 2 samples, not the direction of the difference

One-tailed test- evaluates whether a difference exists between 2 samples, and specifically evaluates the direction of the difference

ANOVA

Developed by Fisher
Studied Agriculture and crop output with different fertilizers
Needed test that could evaluate differences between three or more means
Why- problems with applying the Student's t test

One-way ANOVA

Examines one factor at a time, tests for differences among levels of the factor

Fixed Effects or Model I One-way ANOVA

the levels of the factor are specifically chosen by the investigator

Random effects or Model II One-way ANOVA

the levels of the factor are randomly chosen by the investigator

Mechanics of One-way ANOVA

Statistical hypothesis
Formulae
Critical values and decisions to reject/ not reject the null hypothesis

Formulae

Focus is on analysis of variance- a comparison between 2 types of variance
Numerator= among group variance (variation of the sample means around the grand mean)
Denominator= within group variance (sum of the variation within each sample-around each sample mean)
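
The ratio described above can be sketched in Python as F = MS among / MS within, assuming the standard one-way ANOVA sums of squares with df among = k-1 and df within = N-k. The function name and the three groups are made up for illustration.

```python
def one_way_anova(groups):
    k = len(groups)                                   # number of levels of the factor
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Among-group SS: sample means around the grand mean, weighted by sample size.
    ss_among = sum(len(g) * (m - grand_mean) ** 2
                   for g, m in zip(groups, group_means))
    # Within-group SS: each datum around its own sample mean, summed over samples.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_among, df_within = k - 1, n_total - k
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    return {"F": ms_among / ms_within, "df_among": df_among,
            "df_within": df_within, "ms_within": ms_within}

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]            # hypothetical data
result = one_way_anova(groups)
print(result)    # F is approximately 13 with df = (2, 6)
```

The observed F is then compared with a critical F for (df among, df within) at the chosen alpha level.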

Results of ANOVA

No difference between among group variance and within group variance
There is no difference among means
Stop all testing and write the results section
A difference exists between among group variance and within group variance
There is a difference among means
Follow with a multiple comparisons test to determine which means are different from each other

Example of an ANOVA

State the biological question
Translate into statistical hypotheses
State the alpha level
State the statistical test
State the assumptions of the test
Calculate the observed test statistic
Find the degrees of freedom and critical value
Compare the observed and critical values
Interpret the results

Multiple Comparisons Test


Only used if the results of an ANOVA yield a significant difference
ANOVA results only indicate that a difference exists among means, not where the difference is
Referred to as post hoc or a posteriori tests
Used after you know there is a significant difference from the ANOVA
Several types of multiple comparisons tests
Three broad categories

Generic multiple comparisons test
Control group test
Multiple contrasts tests

Generic Multiple comparisons test

evaluate all possible pairs/combinations of means
Tukey's HSD test
Student-Newman-Keuls (SNK) test

Control Group test

Evaluate differences between experimental group versus the control group
Dunnett's test

Multiple Contrasts tests

can be used like the traditional tests mentioned above to evaluate differences among pairs of means, but is better used to evaluate homogeneous groups of means against other such groups or individual means
Scheffé's test

Basic mechanics of Multiple Comparisons Tests

Observed Test Statistic
SE=standard error
Statistical Hypotheses
A and B represent any pair of means
Pairwise comparisons

arranged in order from largest to smallest
Calculate observed test statistic for each comparison

Enclosure Rule

If two means are not different from each other, then all means in between them are also not different from each other

Similarities among different Multiple Comparisons Tests

All tests involve pairwise comparisons of means
Rank order the means for comparisons
Calculate an observed q value similar to the t test and z scores
Compare with a critical value and reject or do not reject the null hypothesis for each pairwise comparison
Use the enclosure rule in all tests

Differences among different Multiple Comparisons Tests

how the means are rank ordered
most tests are two-tailed, but control group tests can be one-tailed
the SE term differs among tests

Mechanics

Arrange statistical hypotheses
Calculate test statistics= observed q
Decision rules and critical values
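
As a rough Python sketch of these mechanics, the observed q for each pair can be computed as (mean A - mean B) / SE. The SE term differs among tests; the version assumed here is the Tukey-style SE = sqrt(MS within / n) for equal sample sizes, with MS within taken from the ANOVA. The function name, group labels, and numbers are hypothetical, and critical q values still have to come from a studentized range table.

```python
import math
from itertools import combinations

def pairwise_q(means, ms_within, n_per_group):
    # Assumed SE for equal group sizes (Tukey-style); other tests use a different SE.
    se = math.sqrt(ms_within / n_per_group)
    # Rank order the means from largest to smallest.
    ranked = sorted(means.items(), key=lambda item: item[1], reverse=True)
    results = [(name_a, name_b, (mean_a - mean_b) / se)
               for (name_a, mean_a), (name_b, mean_b) in combinations(ranked, 2)]
    # Test the widest difference first, which is the order the enclosure rule relies on.
    results.sort(key=lambda r: r[2], reverse=True)
    return results

means = {"A": 2.0, "B": 3.0, "C": 6.0}     # hypothetical sample means
for name_a, name_b, q in pairwise_q(means, ms_within=1.0, n_per_group=3):
    print(f"H0: mean {name_a} = mean {name_b}, observed q = {q:.3f}")
```

Each observed q is compared with the critical q for that test; if the pair with the widest difference is not significant, the enclosure rule means the pairs between them need not be tested.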

Test

Tukey's HSD
SNK
Dunnett
Scheffé

Linear Regression

Tests for significant relationship between 2 variables
Defines each variable

Y- dependent variable
X- independent variable
Y varies in response to changes in X

Defines functional mathematical relationship

Y=bX+a

Used for prediction
Relationship is a functional dependence
The magnitude of a dependent variable (Y) is dependent on magnitude of an independent variable (X)
Functional dependence is a mathematical relationship that can be quantified
Linear Regression equation used to describe the mathematical relationship

Y=a+bX or
Y=bX+a

Y or dependent or criterion or response variable
X independent or predictor or regressor variable
a= Y-intercept (value of Y where X=0)
b= slope or regression coefficient
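
The notes do not say how a and b are estimated. Assuming the usual least-squares fit, b = sum((X - mean X)(Y - mean Y)) / sum((X - mean X)^2) and a = mean Y - b * mean X, a small Python sketch looks like this. The function names and data are hypothetical.

```python
def fit_line(xs, ys):
    # Least-squares estimates of slope b and intercept a for Y = a + bX.
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx                 # slope or regression coefficient
    a = mean_y - b * mean_x       # Y-intercept (value of Y where X = 0)
    return a, b

def predict(a, b, x):
    # Use the fitted equation for prediction.
    return a + b * x

xs = [1, 2, 3, 4]                 # hypothetical independent variable
ys = [3, 5, 7, 9]                 # hypothetical dependent variable
a, b = fit_line(xs, ys)
print(a, b, predict(a, b, 5))     # 1.0 2.0 11.0
```

A positive b indicates a positive relationship, a negative b a negative one, and b near zero no linear relationship; whether b differs significantly from zero still needs a formal test.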

Functional dependence is a mathematical relationship that can be quantified

Positive, negative, and no relationship

Monday, April 11, 2016