4 Data Handling

(PSY206) Data Management and Analysis

Author

Md Rasel Biswas

3.1 Introduction

SPSS provides a range of commands to modify, manipulate, or transform data, collectively referred to as Data Handling commands.
These commands are particularly useful when working with large datasets containing numerous variables for each participant, such as survey or questionnaire data.
Questionnaires often include items (questions) that can be grouped into subscores, which can be calculated using Data Handling commands.
Data transformations, such as log transformations, can also be performed to reduce skewness, so that the data better meet the assumptions of the planned statistical analyses.
Another common use of these commands is to filter data to analyze specific groups of participants — for example, analyzing males and females separately, or excluding respondents who do not meet certain inclusion criteria.

Example Data File

To illustrate the use of these commands, we will use a small dataset based on a fictitious survey exploring people’s attitudes toward adoption.

This dataset includes:

Participant number
Demographic variables (age, sex, ethnicity, religious belief, and adoption experience)
Responses to 10 statements on adoption, measured on a 5-point Likert scale ranging from Strongly Agree (1) to Strongly Disagree (5)

The responses to the 10 statements are stored in the variables q1 to q10.

3.2 Sorting Data

Although the order of cases usually does not affect statistical analysis, sorting can make it easier to inspect and verify data. For example, sorting participants by sex and then by ethnicity can help detect data entry issues or compare group distributions.

In this example, we sort the data first by sex, and then within each sex by ethnicity.
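The same sort can also be written as SPSS command syntax, which is roughly what the Paste button in the Sort Cases dialog produces. This is a minimal sketch: the variable names sex and ethnicity are assumed to match the names in the example data file.

```spss
* Sort by sex first, then by ethnicity within each sex (A = ascending order).
SORT CASES BY sex (A) ethnicity (A).
```

Replacing (A) with (D) would sort the corresponding variable in descending order.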

3.3 Splitting Data

The Split File function allows SPSS to temporarily divide a dataset into groups, so that all subsequent analyses are performed separately for each group.

For instance, you may want to produce separate statistical outputs for male and female participants.

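To split a file, choose Data > Split File, select either Compare groups or Organize output by groups, move the grouping variable (for example, sex) into the Groups Based on box, and click OK.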

The difference between the two options is important:

Compare groups: presents the results for all groups together in the same output tables, which makes direct comparison easier.
Organize output by groups: generates separate output sections for each group.

We usually prefer Organize output by groups for clearer interpretation, but you should explore both options to understand their differences.
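For readers who prefer syntax, the two options correspond to the SEPARATE and LAYERED keywords of the SPLIT FILE command. The sketch below assumes a grouping variable named sex and uses a frequency table of q1 only as a placeholder analysis.

```spss
* The file must be sorted by the grouping variable before it is split.
SORT CASES BY sex.

* Organize output by groups: one separate block of output per group.
SPLIT FILE SEPARATE BY sex.

* Placeholder analysis, run once for each group while the split is active.
FREQUENCIES VARIABLES=q1.

* Compare groups: swap the SPLIT FILE line above for the following command.
* SPLIT FILE LAYERED BY sex.
```

Running the same analysis once under each keyword is a quick way to see how the two layouts differ.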

Undoing Split File

The Split File command remains active until you manually turn it off. You can check whether it is on by looking at the bottom-right corner of the Data View window. When Split File is active, SPSS displays a message like “Split by Sex.”

To disable it, open Data > Split File again and select Analyze all cases, do not create groups.
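In syntax, the split is switched off with a single command:

```spss
* Turn Split File off so that later analyses use the whole dataset again.
SPLIT FILE OFF.
```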

3.4 Selecting Cases

Sometimes you may wish to analyze only a subset of your data, such as respondents who have been adopted.
The Select Cases command allows you to temporarily exclude all other participants from analysis.

Split File analyzes all data but displays separate outputs by group.
Select Cases analyzes only the chosen subset, suppressing all other cases.

Use Select Cases when you want to restrict analysis to specific participants.

Selection Rules

You can define complex selection criteria using logical operators such as AND, OR, and NOT.
Rules can be typed directly or created using the on-screen calculator.

For example, to select only Chinese Christians with experience of adoption, the expression would be: religion = 3 and ethnicity = 3 and adopted > 0
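When the unselected cases are filtered out rather than deleted, the rule above corresponds roughly to the following syntax. It is a sketch modeled on what the Select Cases dialog pastes: filter_$ is the filter variable name SPSS generates by default, and the codes are taken from the expression in the text.

```spss
* Start from the full dataset.
USE ALL.

* filter_$ is 1 for cases that satisfy the rule and 0 otherwise.
COMPUTE filter_$ = (religion = 3 AND ethnicity = 3 AND adopted > 0).

* Analyze only the cases where filter_$ equals 1, and keep the rest in the file.
FILTER BY filter_$.
EXECUTE.
```

Because the filter only hides cases, the excluded participants stay in the data file and can be restored later.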

You can also create more advanced selection rules by combining logical conditions with the built-in functions available in the dialog box.

Reselecting All Cases

The selection remains in effect until you manually reset it.
To restore all participants, open the Select Cases dialog and choose All cases.
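The equivalent syntax, assuming the selection was applied as a filter rather than by deleting cases, is:

```spss
* Switch the filter off and return to analyzing every case.
FILTER OFF.
USE ALL.
EXECUTE.
```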

3.5 Recoding Values

Recoding is the process of changing the values of a variable — often to correct errors, merge categories, or prepare data for specific analyses.

For instance, if preliminary results show very few participants with adoption experience through “immediate family” or “other family,” these categories could be combined.

SPSS provides two main recode options:

Recode into Same Variables — replaces original values (riskier)
Recode into Different Variables — creates a new variable (safer and recommended)

Tip: Always use Recode into Different Variables to preserve the original data in case of mistakes.
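As a sketch of Recode into Different Variables in syntax, suppose that adopted uses the code 2 for "immediate family" and 3 for "other family". These codes, and the new variable name adopted_merged, are hypothetical and should be checked against the value labels in the data file.

```spss
* Hypothetical codes: merge 3 (other family) into 2 (immediate family).
* ELSE = COPY carries every other value over unchanged.
RECODE adopted (2, 3 = 2) (ELSE = COPY) INTO adopted_merged.
EXECUTE.
```

The original adopted variable is left untouched, so the recode can be checked against it with a crosstabulation and repeated if something went wrong.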

Conditional Recoding

You can also apply conditional recoding, where values are changed only if specific conditions are met — for example, recoding age values only for female participants.
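One way to express a conditional recode in syntax is to wrap RECODE in DO IF and END IF. Everything specific in this sketch is an assumption made for illustration: that female participants are coded sex = 2, that age is recorded in whole years, the cut-off at 30, and the new variable name age_group_f.

```spss
* Hypothetical coding: sex = 2 is assumed to mean female.
DO IF (sex = 2).
* Group ages into two bands, for female participants only.
RECODE age (LOWEST THRU 29 = 1) (30 THRU HIGHEST = 2) INTO age_group_f.
END IF.
EXECUTE.
```

Cases that do not meet the condition are left system-missing on the new variable.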

3.6 Computing New Variables

The Compute Variable command allows you to create new variables from existing ones.
This is useful when:

Summing item scores into total or subscale scores
Calculating averages
Applying mathematical transformations

In our example, the 10 questionnaire items can be combined into two subscales by summing or averaging specific variables (q1–q5, q6–q10).

SPSS also provides built-in functions, such as SUM(), MEAN(), and SD(), that simplify these computations.
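The subscale scores described above could be computed with syntax along these lines. The new variable names are hypothetical, the variable age is assumed to exist under that name, and the q1 TO q5 shorthand assumes the items sit next to each other in the data file.

```spss
* Subscale totals: the sum of the items in each subscale.
COMPUTE sub1_total = SUM(q1 TO q5).
COMPUTE sub2_total = SUM(q6 TO q10).

* Subscale mean for the first five items.
COMPUTE sub1_mean = MEAN(q1 TO q5).

* Example of a mathematical transformation: natural log of age.
COMPUTE log_age = LN(age).
EXECUTE.
```

SUM() and MEAN() use whatever valid values a participant has, whereas an expression such as q1 + q2 + q3 + q4 + q5 returns a missing value if any item is missing. Which behavior is appropriate depends on how missing responses should be treated.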

3.7 Counting Values

Sometimes we need to count how many times a particular response occurs across several variables.
For example, you may want to know how many times each participant selected “Strongly Agree (1)” across all 10 questionnaire items (q1–q10).

The Count Values within Cases function creates a new variable representing this count.
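In syntax this is the COUNT command. The sketch below uses a hypothetical name for the new variable and again assumes that q1 to q10 are adjacent in the data file.

```spss
* For each participant, count how many of q1 to q10 equal 1 (Strongly Agree).
COUNT n_strongly_agree = q1 TO q10 (1).
EXECUTE.
```

The new variable ranges from 0 (never chose Strongly Agree) to 10 (chose it for every item).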

Summary

In this chapter, you learned how to:

Sort and split datasets
Select specific cases for analysis
Recode and compute variables
Count responses across variables

These data handling skills are fundamental for data preparation and cleaning — an essential step before conducting any statistical analysis in SPSS.