Statistics

Descriptive

Graphs, charts, frequency count, mean, median, mode, range, variability, standard deviation

 

Analytical: these studies use a statistical test to give a p value

Correlation studies: positive correlation, negative correlation, no correlation

Association: causal, non-causal

 

Descriptive Statistics

Measures of central tendency

 

Sample average (arithmetic mean): the sum of the measurements divided by the number of measurements

Find the mean of      15.3  17.8  13.2  14.0  19.9  14.9
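
Once you have worked the mean out by hand, a short Python sketch such as the one below can be used to check it. The function name arithmetic_mean is only an illustrative choice; the measurements are the ones listed in the exercise.

```python
# Measurements copied from the exercise above.
measurements = [15.3, 17.8, 13.2, 14.0, 19.9, 14.9]


def arithmetic_mean(values):
    """Return the sum of the values divided by how many there are."""
    return sum(values) / len(values)


mean_value = arithmetic_mean(measurements)
print(f"mean = {mean_value:.2f}")
```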

 

Median, the value below which half the measurements in a sample lie and above which the other half lie.

Find the median of         15.3  17.8  13.2  14.0  19.9  14.9  19.7  2.1  14.97  21.0

(Hint - try making up a line plot)

If you have an even number of measurements take the mean of the two middle values.

Find the median of    12.0  15.3  17.8  13.2  14.0  19.9  14.9  19.7  2.1  14.97  21.0
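
The two cases (odd and even numbers of measurements) can be sketched in Python as follows. This is a minimal illustration for checking hand working, not the only way to do it; the names median, ten_values and eleven_values are made up for the example, and the data are the two lists from the exercises.

```python
def median(values):
    """Return the middle value of the sorted data.

    With an even number of values, return the mean of the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


# The two data sets from the exercises above.
ten_values = [15.3, 17.8, 13.2, 14.0, 19.9, 14.9, 19.7, 2.1, 14.97, 21.0]
eleven_values = [12.0] + ten_values

print("even number of values:", median(ten_values))
print("odd number of values: ", median(eleven_values))
```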

 

Mode, the result that occurs most often

What is the mean number of legs owned by humans in the UK?

Find the mode of the following data

12.0  15.3  17.8  13.2  14.0  12.0  14.9  19.7  2.1  14.97  21.0
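
Counting how often each result occurs can be sketched in Python with the standard library Counter. The helper name modes is an assumption for this example; it returns a list because a data set can have more than one most frequent value.

```python
from collections import Counter

# Data copied from the exercise above.
data = [12.0, 15.3, 17.8, 13.2, 14.0, 12.0, 14.9, 19.7, 2.1, 14.97, 21.0]


def modes(values):
    """Return every value that occurs the greatest number of times."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


print("mode(s):", modes(data))
```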

 

Data dispersion describes the spread of the results, i.e. the scatter of the data.

Range = (highest measurement) - (lowest measurement)

Find the range of the following data

a.  10.3  10.7  11.6  14.8  14.7  14.9 

b.  12.6  12.7  12.8  12.9  13.2  13.3

c.  10.3  12.9  13.0  13.1  13.2  14.9

 

What is the difference between sample a and c?

Does the range indicate this difference?
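
A possible way to check the three ranges in Python is sketched below. The dictionary samples and the helper value_range are illustrative names only; the numbers are samples a, b and c from the exercise.

```python
# Samples a, b and c from the exercise above.
samples = {
    "a": [10.3, 10.7, 11.6, 14.8, 14.7, 14.9],
    "b": [12.6, 12.7, 12.8, 12.9, 13.2, 13.3],
    "c": [10.3, 12.9, 13.0, 13.1, 13.2, 14.9],
}


def value_range(values):
    """Range = (highest measurement) - (lowest measurement)."""
    return max(values) - min(values)


ranges = {name: value_range(values) for name, values in samples.items()}
for name, result in ranges.items():
    print(f"sample {name}: range = {result:.1f}")
```

Comparing the printed values for samples a and c is a useful starting point for the two questions above.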

 

Sample standard deviation

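The standard deviation gives a sort of average distance of the measurements from the sample mean.

However, the average of the signed deviations from the mean is always zero, because the negative and positive deviations cancel out. For example, for deviations of -1.0, 0 and 1.0:

$$\frac{-1.0 + 0 + 1.0}{3} = 0$$

So the deviations from the mean are squared, which gets rid of the minus signs.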
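
The cancelling can be seen in a small Python sketch. The three values 9.0, 10.0 and 11.0 are hypothetical, chosen only so that the deviations from the mean come out as -1.0, 0 and 1.0.

```python
# Hypothetical measurements whose deviations from the mean are -1.0, 0 and 1.0.
values = [9.0, 10.0, 11.0]

mean_value = sum(values) / len(values)

# Signed deviations: negative and positive distances cancel out.
deviations = [x - mean_value for x in values]
mean_deviation = sum(deviations) / len(deviations)

# Squared deviations: every term is zero or positive, so nothing cancels.
squared_deviations = [d ** 2 for d in deviations]

print("deviations:        ", deviations)
print("mean deviation:    ", mean_deviation)
print("squared deviations:", squared_deviations)
```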

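Therefore the standard deviation from the mean is described as:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

where x is each measurement, x̄ is the sample mean and n is the number of measurements in the sample.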

Find the SD of the following data

a.  10.3  10.7  11.6  14.8  14.7  14.9 

b.  12.6  12.7  12.8  12.9  13.2  13.3

c.  10.3  12.9  13.0  13.1  13.2  14.9

(Hint, make up columns under these headings)

x                                  x̄                                  x - x̄                               (x - x̄)²

 

 

 

The larger the SD, the greater the average distance of a data point from the mean. Because the SD uses every measurement rather than only the highest and lowest, it is a more sensitive measure of scatter than the range.
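
The column method can be sketched in Python as below. This assumes the usual sample formula, with n - 1 rather than n under the square root; if your course uses n, change the divisor accordingly. The helper name sample_sd is made up for the example, and the result is cross-checked against statistics.stdev from the standard library, which also divides by n - 1.

```python
import math
import statistics

# Samples a, b and c from the exercise above.
samples = {
    "a": [10.3, 10.7, 11.6, 14.8, 14.7, 14.9],
    "b": [12.6, 12.7, 12.8, 12.9, 13.2, 13.3],
    "c": [10.3, 12.9, 13.0, 13.1, 13.2, 14.9],
}


def sample_sd(values):
    """Sample standard deviation, assuming an n - 1 divisor."""
    n = len(values)
    mean_value = sum(values) / n
    # One entry per row of the (x - mean) squared column.
    squared_deviations = [(x - mean_value) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / (n - 1))


for name, values in samples.items():
    by_hand = sample_sd(values)
    library = statistics.stdev(values)
    print(f"sample {name}: SD = {by_hand:.3f} (statistics.stdev: {library:.3f})")
```

Samples a and c have the same highest and lowest values, so comparing their SDs shows what the range alone cannot.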

 

Frequency distribution

The number of times each event occurs is counted or the data are grouped and the frequency of each group reported.

Normally reported as a histogram or line graph.
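
Grouping data and counting the frequency of each group can be sketched in Python as follows. The group width of 5 and the names BIN_WIDTH and grouped_frequencies are assumptions for this illustration; the data are borrowed from the mode exercise, and the rows of stars are only a rough text stand-in for a histogram.

```python
from collections import Counter

# Data borrowed from the mode exercise earlier in the handout.
data = [12.0, 15.3, 17.8, 13.2, 14.0, 12.0, 14.9, 19.7, 2.1, 14.97, 21.0]

BIN_WIDTH = 5  # assumed group width, chosen only for illustration


def grouped_frequencies(values, width):
    """Count how many values fall in each group [lower, lower + width)."""
    counts = Counter(int(x // width) * width for x in values)
    return dict(sorted(counts.items()))


frequencies = grouped_frequencies(data, BIN_WIDTH)

# Crude text histogram: one star per measurement in the group.
for lower, count in frequencies.items():
    print(f"{lower:>2} to under {lower + BIN_WIDTH:<2} | {'*' * count}")
```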

 

Normal distribution

What is meant by normal distribution?

 

Skewness

Distribution curves may show skewness. In a positively skewed distribution most values are bunched towards the left and a long tail extends to the right, i.e. the mean lies to the right of the median, e.g. global income.
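
The effect of a long right-hand tail on the mean and median can be sketched in Python. The income figures below are entirely hypothetical and serve only to produce a positively skewed sample.

```python
import statistics

# Hypothetical incomes (arbitrary units): most are low, one is very high.
incomes = [10, 12, 13, 15, 18, 22, 30, 45, 120]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"mean   = {mean_income:.1f}")
print(f"median = {median_income}")
print("mean is to the right of the median:", mean_income > median_income)
```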

 

Percentile

A percentile represents the percentage of cases a score exceeds, i.e. a score at the 90th percentile exceeds all but 10% of cases. The median is the 50th percentile.
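
Taking the definition literally, the percentage of cases a score exceeds can be sketched in Python as below. The function name percentile_rank and the ten scores are hypothetical; note that statistical packages use several slightly different percentile conventions, so this is only one simple reading.

```python
def percentile_rank(score, cases):
    """Percentage of cases that the given score exceeds."""
    below = sum(1 for case in cases if case < score)
    return 100 * below / len(cases)


# Hypothetical scores: the whole numbers 1 to 10.
scores = list(range(1, 11))

print("score 10 exceeds", percentile_rank(10, scores), "% of cases")
print("score 6 exceeds ", percentile_rank(6, scores), "% of cases")
```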

 

Univariate and bivariate

Univariate statistics consider one variable at a time

More commonly in research the relationship between two variables is considered (bivariate statistics)

In theory there are an infinite number of "variates"

 

Inferential Statistics

To what degree are sets of variables associated? E.g. poor correlation, moderate correlation, strong correlation.

In what way does association indicate causality?

What is meant by the following terms? Also what does each indicate?

Positive correlation

Negative correlation

No correlation
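
One common way to put a number on these three terms is the Pearson correlation coefficient r, which runs from -1 through 0 to +1. The handout does not name a particular coefficient, so the choice of Pearson's r is an assumption here, and all of the data below are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data sets chosen to show each pattern.
x = [1, 2, 3, 4, 5]
y_rises = [2, 4, 6, 8, 10]   # y goes up as x goes up
y_falls = [10, 8, 6, 4, 2]   # y goes down as x goes up
y_flat = [3, 1, 2, 1, 3]     # no overall linear trend

print("positive correlation:", pearson_r(x, y_rises))
print("negative correlation:", pearson_r(x, y_falls))
print("no correlation:      ", pearson_r(x, y_flat))
```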

 

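P = Probability

The p value is the probability of seeing an association at least as strong as the one observed if chance alone were at work, i.e. if there were no real association between the two sets of variables. It does not measure how strong the association is.

1                      =          such a result is always expected by chance (no evidence of association)

0.5                   =          expected by chance 1 time in 2

0.25                 =          expected by chance 1 time in 4

0.1                   =          expected by chance 1 time in 10

0.05                 =          expected by chance 1 time in 20

0.01                 =         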

Association is significant at p ≤ 0.05

Association is highly significant at p ≤ 0.01

P values are reached by using an inferential statistical test.
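
As a rough illustration of how a test arrives at a p value, the Python sketch below uses a permutation test: it shuffles one variable many times and counts how often chance alone gives a correlation at least as strong as the one observed. The permutation test is only one possible inferential test and is not named in this handout; the data, the function names and the choice of 2000 shuffles are all assumptions made for the example.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, shuffles=2000, seed=0):
    """Estimate how often shuffled data correlate at least as strongly."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    at_least_as_strong = 0
    for _ in range(shuffles):
        rng.shuffle(shuffled)  # break any real link between x and y
        if abs(pearson_r(xs, shuffled)) >= observed:
            at_least_as_strong += 1
    # Adding 1 to both counts keeps the estimate above zero.
    return (at_least_as_strong + 1) / (shuffles + 1)


# Hypothetical paired measurements with a strong upward trend.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

p_value = permutation_p_value(x, y)
print(f"r = {pearson_r(x, y):.3f}, estimated p = {p_value:.4f}")
```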

 

Critiquing Criteria for Statistics

Perhaps the most important application of this knowledge will be to allow you to critique published research analytically. This can be done by addressing the following points.

Are appropriate descriptive stats used?

What are the levels of measurement?

Is the sample size adequate?

What descriptive statistics are given?

Are the descriptive statistics used appropriate to the measurement of each variable?

Are there summary statistics for the major variables?

Is there enough information to allow you to judge the results?

Are results clearly and completely stated?

Are statistics, tables and graphs consistent with the text?