Version info: Code for this page was tested in SPSS 20.
Logistic regression, also called a logit model, is used to model dichotomous outcome variables. In the logit model the log odds of the outcome is modeled as a linear combination of the predictor variables.
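In symbols, the model can be sketched as follows. The notation is introduced here for illustration and is not part of the original page: p is the probability that the outcome equals 1, x_1 through x_k are the predictor variables, and the β terms are the coefficients to be estimated.

```latex
\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k
\qquad\Longleftrightarrow\qquad
p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k)}}
```

The left-hand form is the log odds written as a linear combination. The right-hand form solves for p and shows that the predicted probability always lies between 0 and 1.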
Please note: The purpose of this page is to show how to use various data analysis commands. It does not cover all aspects of the research process which researchers are expected to do. In particular, it does not cover data cleaning and checking, verification of assumptions, model diagnostics and potential follow-up analyses.
Example 1: Suppose that we are interested in the factors that influence whether a political candidate wins an election. The outcome (response) variable is binary (0/1); win or lose. The predictor variables of interest are the amount of money spent on the campaign, the amount of time spent campaigning negatively and whether or not the candidate is an incumbent.
Example 2: A researcher is interested in how variables such as GRE (Graduate Record Exam scores), GPA (grade point average) and prestige of the undergraduate institution affect admission into graduate school. The response variable, admit/don’t admit, is a binary variable.
For our data analysis below, we are going to expand on Example 2 about getting into graduate school. We have generated hypothetical data, which can be obtained from our website by clicking on binary.sav. You can store this anywhere you like, but the syntax below assumes it has been stored in the directory c:\data. This dataset has a binary response (outcome, dependent) variable called admit, which is equal to 1 if the individual was admitted to graduate school, and 0 otherwise. There are three predictor variables: gre, gpa, and rank. We will treat the variables gre and gpa as continuous. The variable rank takes on the values 1 through 4. Institutions with a rank of 1 have the highest prestige, while those with a rank of 4 have the lowest. We start out by opening the dataset and looking at some descriptive statistics.
get file = "c:\data\binary.sav".
descriptives /variables=gre gpa.
frequencies /variables = rank.
crosstabs /tables = admit by rank.
Below is a list of some analysis methods you may have encountered. Some of the methods listed are quite reasonable while others have either fallen out of favor or have limitations.
Probit regression. Probit analysis will produce results similar to logistic regression. The choice of probit versus logit depends largely on individual preferences.
OLS regression. When used with a binary response variable, this model is known as a linear probability model and can be used as a way to describe conditional probabilities. However, the errors (i.e., residuals) from the linear probability model violate the homoskedasticity and normality of errors assumptions of OLS regression, resulting in invalid standard errors and hypothesis tests. For a more thorough discussion of these and other problems with the linear probability model, see Long (1997, pp. 38-40).
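The heteroskedasticity noted for the linear probability model can be seen directly. The following is a short sketch using notation that is not in the original page: p(x) is the conditional probability that the outcome y equals 1, and ε is the error, assuming the model for p(x) is correctly specified.

```latex
\begin{aligned}
\varepsilon &= y - p(x) =
\begin{cases}
1 - p(x) & \text{with probability } p(x) \\
-p(x) & \text{with probability } 1 - p(x)
\end{cases} \\[4pt]
\operatorname{E}(\varepsilon \mid x) &= p(x)\,\bigl(1 - p(x)\bigr) - \bigl(1 - p(x)\bigr)\,p(x) = 0 \\[4pt]
\operatorname{Var}(\varepsilon \mid x) &= p(x)\,\bigl(1 - p(x)\bigr)^2 + \bigl(1 - p(x)\bigr)\,p(x)^2 = p(x)\,\bigl(1 - p(x)\bigr)
\end{aligned}
```

Because p(x) changes with the predictors, the error variance cannot be constant. Because ε takes only two values for any given x, it cannot be normally distributed.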
Another method reverses the roles of the variables: the 0/1 outcome is turned into the grouping variable, and the former predictors are turned into outcome variables. This will produce an overall test of significance but will not give individual coefficients for each variable, and it is unclear to what extent each "predictor" is adjusted for the impact of the other "predictors."
Below we use the logistic regression command to run a model predicting the outcome variable admit, using gre, gpa, and rank. The categorical option specifies that rank is a categorical rather than continuous variable. The output is shown in sections, each of which is discussed below.
logistic regression admit with gre gpa rank /categorical = rank.
The first table above shows a breakdown of the number of cases used and not used in the analysis. The second table above gives the coding for the outcome variable, admit.
The table above shows how the values of the categorical variable rank were handled. There are terms (essentially dummy variables) in the model for rank=1, rank=2, and rank=3; rank=4 is the omitted category.
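If a different omitted category is preferred, the contrast subcommand can be added. The syntax below is a sketch rather than part of the original analysis. It fits the same model, but uses indicator coding with the first category of rank (rank=1) as the omitted one, and it also requests 95% confidence intervals for the exponentiated coefficients.

```spss
* Sketch: same model, with rank=1 as the reference category.
* The number in indicator() is the position of the category, not its value.
logistic regression admit with gre gpa rank
  /categorical = rank
  /contrast (rank) = indicator(1)
  /print = ci(95).
```

Changing the reference category does not change the overall fit of the model. It only changes which comparisons the rank coefficients represent.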
Empty or small cells: You should check for empty or small cells by doing a crosstab between categorical predictors and the outcome variable. If a cell has very few cases (a small cell), the model may become unstable or it might not run at all.
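For the example data, a check of this kind could look like the sketch below. It assumes the binary.sav dataset is already open, and it adds column percentages so that the share of admitted applicants within each rank is visible alongside the raw counts.

```spss
* Sketch: cell counts and within-rank percentages for admit by rank.
crosstabs
  /tables = admit by rank
  /cells = count column.
```

Any cell with a count of zero, or only a handful of cases, would be a reason to consider combining categories before fitting the model.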
New York: John Wiley & Sons, Inc.
Thousand Oaks, CA: Sage Publications.