A factor is a special type of vector, normally used to hold a categorical variable (such as smoker/nonsmoker, state of residency, or zip code) for use in many statistical functions. Such vectors have class “factor”. Factors are primarily used in Analysis of Variance (ANOVA) and in other situations where categories are needed. When a factor is used as a predictor variable, the corresponding indicator variables are created for it (more later).
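As a brief preview of the indicator variables mentioned above, the following is a minimal sketch using base R's `model.matrix()`. The `smoker` vector is hypothetical data invented for illustration, and the sketch assumes R's default treatment contrasts, under which the first level ("no") serves as the baseline.

```r
# Hypothetical categorical variable with two levels: "no" and "yes"
smoker <- factor(c("no", "yes", "no", "yes", "yes"))

# Build the design matrix a model-fitting function would use for this predictor.
# With default treatment contrasts there is an intercept column plus one
# indicator column, "smokeryes", which is 1 where smoker == "yes" and 0 otherwise.
X <- model.matrix(~ smoker)
X
```

A factor with k levels yields k - 1 indicator columns under these default contrasts, because the baseline level is absorbed into the intercept.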
A note of caution: factors in R often look like character vectors when printed, but their values are not shown in double quotes. R stores a factor as integer codes together with a set of level labels, so a factor will sometimes behave like a numeric vector.
# create the character vector
citizen <- c("uk", "us", "no", "au", "uk", "us", "us", "no", "au")
# convert to factor
citizenf <- factor(citizen)
citizen
[1] "uk" "us" "no" "au" "uk" "us" "us" "no" "au"
citizenf
[1] uk us no au uk us us no au
Levels: au no uk us
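The caution above about factors being stored as numbers can be checked directly. The sketch below recreates `citizenf` so that it runs on its own and then inspects the level labels and the underlying integer codes. The `agef` vector is hypothetical data added only to illustrate a common pitfall with number-like labels.

```r
# Recreate the factor from the example so this block is self-contained
citizen <- c("uk", "us", "no", "au", "uk", "us", "us", "no", "au")
citizenf <- factor(citizen)

levels(citizenf)      # level labels, sorted alphabetically by default
as.integer(citizenf)  # integer codes; each code indexes into levels(citizenf)
class(citizenf)       # "factor"
typeof(citizenf)      # "integer", the underlying storage type

# Pitfall: a factor whose labels look like numbers (hypothetical data)
agef <- factor(c("30", "25", "40", "25"))
as.numeric(agef)                # returns the codes, not the ages
as.numeric(as.character(agef))  # converts the labels, recovering the ages
```

Converting through `as.character()` first is the usual way to recover numeric values from such a factor, since `as.numeric()` alone returns the internal codes.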
# convert factor back to character vector
as.character(citizenf)