7 Tests for Nominal Data

(PSY206) Data Management and Analysis

Author

Md Rasel Biswas

7.1 Nominal Data

Nominal data are also known as categorical data
Categories have no natural order
Numbers used for coding are labels only, not quantities

Examples:

Sex: Male / Female
Smoking status: Smoker / Non-smoker
School type: Private / State

Nominal data are sometimes called qualitative data, but this is different from qualitative research.

7.2 Dichotomous Variables

A dichotomous variable is a special case of nominal data
It has only two categories

Examples:

Yes / No
Alive / Dead
Disease present / Disease absent

Every dichotomous variable is nominal, but not every nominal variable is dichotomous.

7.3 Descriptive Statistics for Nominal Data

For nominal data, the following are meaningless:

Mean
Median
Standard deviation

The only appropriate summaries are:

Frequencies (counts)
Percentages

Recommended displays:

Frequency tables
Bar charts

Histograms should NOT be used for nominal data.
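
As a rough illustration of the summaries above, the following Python sketch builds a frequency table with counts and percentages. The responses and the function name `frequency_table` are invented for this example and are not part of the lecture dataset.

```python
from collections import Counter

def frequency_table(values):
    """Return (category, count, percentage) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(cat, n, 100 * n / total) for cat, n in counts.most_common()]

# Hypothetical responses, invented purely for illustration
school_type = ["State", "Private", "State", "State", "Private",
               "State", "State", "Private", "State", "State"]

for category, count, percent in frequency_table(school_type):
    print(f"{category:<8} {count:>3} {percent:5.1f}%")
```

The same counts are what a bar chart would display, one bar per category.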

7.4 Chi-square Test: Overview

The chi-square test is designed for nominal data
It compares observed frequencies with expected frequencies

Two main types:

Goodness-of-fit chi-square
Multidimensional chi-square (also called the chi-square test of independence)

7.5 Goodness-of-Fit Chi-square

Purpose:

To test whether an observed distribution differs from what is expected by chance

Examples:

Do children prefer toys A, B, and C equally in a play-therapy setting?
Do smokers choose Brand A and Brand B equally often?
Are different coping strategies (avoidance, problem-focused, emotion-focused) used equally often by stressed students?

Null hypothesis:

Categories occur in the expected (usually equal) proportions

This test is used less frequently in psychology.
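
To make the logic concrete, here is a minimal Python sketch of a goodness-of-fit calculation for the toy-preference question. The counts are invented, the function name `goodness_of_fit` is hypothetical, and the p value uses the closed form that holds only when df = 2; in practice SPSS would compute all of this.

```python
import math

def goodness_of_fit(observed, expected_props=None):
    """Return (chi_square, df) for a list of observed category counts.

    expected_props defaults to equal proportions across categories.
    """
    k = len(observed)
    n = sum(observed)
    if expected_props is None:
        expected_props = [1 / k] * k
    expected = [n * p for p in expected_props]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi_square, k - 1

# Hypothetical counts of children choosing toy A, B, C (invented)
toy_choices = [20, 10, 15]
chi_square, df = goodness_of_fit(toy_choices)

# With df = 2 the chi-square upper-tail probability is exactly exp(-x / 2)
p_value = math.exp(-chi_square / 2)
print(f"chi-square({df}, N = {sum(toy_choices)}) = {chi_square:.2f}, p = {p_value:.3f}")
```

With 45 children and three toys, each toy is expected to be chosen 15 times under the null hypothesis; the statistic grows as the observed counts move away from 15.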

7.6 Multidimensional Chi-square

This is the most commonly used chi-square test in psychology and public health.

It can be viewed as:

A test of association, or
A test of difference between groups

Research question:

Are two nominal variables independent?

7.7 Examples of Research Questions

Is smoking status associated with income level?
Is treatment received associated with survival status?
Is gender associated with help-seeking behaviour (Yes/No)?
Is exposure to trauma associated with PTSD diagnosis?
Is school type associated with exam anxiety level (High/Low)?

A significant result indicates association, not causation.

7.8 Assumptions of Chi-square Test

To use chi-square, the following must hold:

Variables are nominal
Data are frequency counts
Categories are mutually exclusive
Observations are independent

Repeated measures data violate independence.

7.9 Contingency Table

Data are summarized using an r × c contingency table (r rows by c columns)

Example: 2 × 2 table

	High income	Low income	Total
Smokers	10	20	30
Non-smokers	35	35	70
Total	45	55	100

7.10 Expected Frequencies

If variables are independent: \[ E = \frac{(\text{Row total}) (\text{Column total})}{\text{Grand total}} \]

Chi-square compares observed (O) and expected (E) counts

Large differences between O and E lead to a larger chi-square value.
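
The comparison of O and E is usually made with the Pearson statistic, which sums the squared differences relative to the expected counts over all cells:

\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]

The sketch below applies this to the 2 × 2 smoking-by-income table from 7.9. The function names are hypothetical, no continuity correction is applied, and the p value uses a closed form that holds only for df = 1.

```python
import math

def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_statistic(table):
    """Pearson chi-square: sum of (O - E)^2 / E over all cells."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Observed counts from the 2 x 2 table in 7.9
# rows: Smokers, Non-smokers; columns: High income, Low income
observed = [[10, 20],
            [35, 35]]

expected = expected_counts(observed)  # [[13.5, 16.5], [31.5, 38.5]]
chi_square = chi_square_statistic(observed)

# For df = 1 the upper-tail probability equals erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi_square / 2))
print(expected)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.3f}")
```

For this table the statistic is about 2.36 on 1 degree of freedom, below the .05 critical value of 3.84, so these illustrative counts would not indicate a significant association.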

7.11 Example Data

In this lecture, we will use a sample psychology dataset to demonstrate the chi-square test of independence. Eighty young women completed an eating questionnaire, which allowed them to be classified as having either a high or a low tendency toward anorexia (1 = high, 2 = low); participants with high scores are at greater risk of developing anorexia. The dataset also includes several nominal background variables: cultural background (1 = Asian, 2 = Caucasian, 3 = Other), employment status of each woman's mother (1 = Full-time, 2 = None, 3 = Part-time), and type of school attended (1 = Comprehensive, 2 = Private).

Download data: (In Excel) (In SPSS Format)

Previous research has suggested that the incidence of anorexia is higher among girls attending private schools than state schools, and higher among girls whose mothers are not in full-time employment. In addition, the incidence seems to be higher in Caucasian girls than non-Caucasian girls.

We therefore hypothesised that there would be an association between these factors and the classification on the eating questionnaire. To test this hypothesis, we conducted a series of chi-square analyses.
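
Before any chi-square analysis, the coded values have to be cross-tabulated. The Python sketch below shows the idea using the value labels from the dataset description. The six records are invented and are NOT the real data, and the names `SCHOOL`, `ANOREXIA`, and `crosstab` are hypothetical.

```python
from collections import Counter

# Value labels taken from the dataset description
SCHOOL = {1: "Comprehensive", 2: "Private"}
ANOREXIA = {1: "High", 2: "Low"}

def crosstab(records):
    """Count how often each (school, anorexia) label pair occurs in coded records."""
    return Counter((SCHOOL[s], ANOREXIA[a]) for s, a in records)

# Hypothetical (school code, anorexia code) pairs, NOT the real data
records = [(1, 2), (2, 1), (1, 2), (2, 1), (1, 1), (2, 2)]

table = crosstab(records)
for school in SCHOOL.values():
    row = [table[(school, level)] for level in ANOREXIA.values()]
    print(f"{school:<14} High={row[0]} Low={row[1]}")
```

The numeric codes serve only as dictionary keys here, which reflects the point that codes for nominal data are labels, not quantities.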

7.12 SPSS: Multidimensional Chi-square

Menu path:

Analyze → Descriptive Statistics → Crosstabs

Steps:

Put one variable in Rows
Put the other variable in Columns
Click Statistics → select Chi-square
Click Cells → select Observed, Expected, Row %, Column %

SPSS Syntax

CROSSTABS
  /TABLES=var1 BY var2
  /STATISTICS=CHISQ PHI
  /CELLS=COUNT EXPECTED ROW COLUMN.

7.13 Interpreting SPSS Output

When Crosstabs options are selected properly, each cell reports:

Observed Count: Actual number of cases in the cell
Expected Count: Number expected if variables were independent
Row %: Percentage within the row category
Column %: Percentage within the column category
Total %: Percentage of the full sample (shown only if Total is also selected under Cells)

Always describe results using row or column percentages, not raw counts alone.
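
As an illustration of the difference between row and column percentages, this Python sketch computes both for the smoking-by-income table from 7.9. The function names are hypothetical; SPSS produces the same figures when Row and Column are selected under Cells.

```python
def row_percentages(table):
    """Each cell as a percentage of its row total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]

def column_percentages(table):
    """Each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / total for cell, total in zip(row, col_totals)]
            for row in table]

# rows: Smokers, Non-smokers; columns: High income, Low income
observed = [[10, 20],
            [35, 35]]

for label, row in zip(["Smokers", "Non-smokers"], row_percentages(observed)):
    print(label, [f"{p:.1f}%" for p in row])
```

Row percentages answer "what share of smokers have a high income?" (33.3%, against 50.0% of non-smokers), whereas column percentages answer "what share of high-income people smoke?" (22.2%). Which one to report depends on which variable defines the groups being compared.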

7.14 Reporting Chi-square Results

Standard Reporting Format

χ²(df, N = sample size) = value, p = value

Example (non-significant):

There was no association between mother’s employment status and anorexia tendency: χ²(2, N = 80) = 0.29, p = .862.

Example (significant):

There was a significant association between school type and anorexia tendency: χ²(1, N = 80) = 28.19, p < .001.

Always follow this with a description of the pattern observed in the contingency table.
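
The reporting format can be captured in a small helper. This is a sketch only: the function names are hypothetical, and the p value passed for the significant example is an arbitrary number below .001, since the source reports only p < .001.

```python
def format_p(p):
    """APA-style p value: no leading zero, 'p < .001' for very small values."""
    if p < 0.001:
        return "p < .001"
    return "p = " + f"{p:.3f}".lstrip("0")

def report_chi_square(chi_square, df, n, p):
    """Build a 'χ²(df, N = n) = value, p = value' string."""
    return f"χ²({df}, N = {n}) = {chi_square:.2f}, {format_p(p)}"

# Values from the two reporting examples above
print(report_chi_square(0.29, 2, 80, 0.862))
# 0.0000001 is an arbitrary stand-in for "smaller than .001"
print(report_chi_square(28.19, 1, 80, 0.0000001))
```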

7.15 Degrees of Freedom

\[ df = (r - 1)(c - 1) \]

r = number of rows
c = number of columns

Example:

2 × 2 table → df = 1
2 × 3 table → df = 2
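
As a worked check against the example dataset: mother's employment status has three categories and anorexia tendency has two, while school type and anorexia tendency have two categories each. Applying the formula (which variable is placed in rows does not affect the result):

\[ df_{\text{employment}} = (3 - 1)(2 - 1) = 2, \qquad df_{\text{school}} = (2 - 1)(2 - 1) = 1 \]

These values agree with the degrees of freedom shown in the reporting examples in 7.14.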

7.16 Effect Size: Phi and Cramer’s V

Phi (φ): used for 2 × 2 tables
Cramer’s V: used for larger tables

In the Crosstabs → Statistics dialog box, select "Phi and Cramer’s V" (a single checkbox). SPSS reports both values: read Phi for 2 × 2 tables and Cramer’s V for larger tables.

Interpretation is similar to correlation coefficients.

Interpretation Guidelines (rule of thumb):

0.10 → small association
0.30 → moderate association
0.50 → strong association

Report an effect size even when the chi-square result is statistically significant: significance alone does not show how strong the association is.
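
Both measures can be derived from the chi-square value and the sample size. The sketch below uses the standard formula for Cramer's V, which reduces to the magnitude of Phi for a 2 × 2 table. The function name `cramers_v` is hypothetical, and the inputs are the values from the reporting examples in 7.14.

```python
import math

def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi-square / (N * min(r - 1, c - 1))).

    For a 2 x 2 table min(r - 1, c - 1) = 1, so V equals the magnitude of phi.
    """
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi_square / (n * k))

# School type (2 categories) by anorexia tendency (2 categories)
v_school = cramers_v(28.19, 80, 2, 2)
# Mother's employment (3 categories) by anorexia tendency (2 categories)
v_employment = cramers_v(0.29, 80, 3, 2)
print(f"school type: {v_school:.2f}, mother's employment: {v_employment:.2f}")
```

By the rule-of-thumb guidelines above, the school-type value (about 0.59) would count as a strong association, while the mother's-employment value (about 0.06) falls below even the small threshold.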

7.17 Small Expected Frequencies

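Chi-square results are unreliable when expected counts are too small. A common rule of thumb: in a 2 × 2 table every expected count should be at least 5; in larger tables no more than 20% of cells should have an expected count below 5.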

Solutions:

For 2 × 2 tables: use Fisher’s Exact Test
For larger tables: use Exact option in SPSS

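Always check the footnote under the Chi-Square Tests table in the SPSS output: it reports how many cells have an expected count less than 5 and the minimum expected count.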
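
For readers curious about what Fisher's Exact Test does, here is a minimal Python sketch for a 2 × 2 table. It treats the row and column totals as fixed and sums the hypergeometric probabilities of every table that is no more likely than the observed one, which is one common convention for the two-sided p value. The counts are invented and the function name is hypothetical; in practice the value would be read from the SPSS output.

```python
from math import comb

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p value for a 2 x 2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (the small tolerance guards against floating-point ties)
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))

# Hypothetical small-sample table (invented counts)
small_table = [[3, 1],
               [1, 3]]
print(f"p = {fisher_exact_two_sided(small_table):.3f}")
```

Because the p value is computed exactly from the possible tables rather than approximated by the chi-square distribution, it does not depend on the expected counts being large.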

7.18 Summary

Nominal data require special handling
Use frequencies and percentages only
Chi-square tests association between nominal variables
Always check assumptions and expected counts
Association does not imply causation