Describe structure of Data Sets and Importers
Description
The function codeplan() creates a data frame that describes the structure of an item list (a data.set object or an importer object), so that this structure can be stored and recovered. The resulting data frame has a dedicated print method that limits the output to one line per variable.
With setCodeplan(), an item list structure (as returned by codeplan()) can be applied to a data frame or data set. It is also possible to use an assignment like codeplan(x) <- value to similar effect.
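The store-and-recover workflow described above can be sketched as follows. This is a minimal illustration, not the package's prescribed procedure: the data are made up, and the use of base R's saveRDS() and readRDS() as the storage step is an assumption (any mechanism that preserves the data frame and its class attribute should serve).

```r
library(memisc)

# Illustrative data set with one labelled item (values are made up)
ds <- data.set(vote = c(1, 2, 3, 9, 1, 2))
ds <- within(ds, {
  description(vote) <- "Vote intention"
  labels(vote) <- c(Conservatives = 1, Labour = 2,
                    "Liberal Democrats" = 3, "Answer refused" = 9)
  missing.values(vote) <- 9
})

# Extract the structure as a codeplan
cp <- codeplan(ds)

# Store and recover it; saveRDS()/readRDS() are an assumption of this sketch
cp_file <- tempfile(fileext = ".rds")
saveRDS(cp, cp_file)
cp_restored <- readRDS(cp_file)

# Apply the recovered structure to a plain data frame with a matching column
raw <- data.frame(vote = c(2, 2, 9, 1, 3, 3))
restored <- setCodeplan(raw, cp_restored)
# The assignment form does the same job: codeplan(raw) <- cp_restored
```

The variable names in the data frame are assumed to match the name column of the codeplan, as they do in the Examples section.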
Usage
codeplan(x)
## S4 method for signature 'item.list'
codeplan(x)
## S4 method for signature 'item'
codeplan(x)
setCodeplan(x,value)
## S4 method for signature 'data.frame,codeplan'
setCodeplan(x,value)
## S4 method for signature 'data.set,codeplan'
setCodeplan(x,value)
## S4 method for signature 'data.set,NULL'
setCodeplan(x,value)
## S4 method for signature 'item,codeplan'
setCodeplan(x,value)
## S4 method for signature 'item,NULL'
setCodeplan(x,value)
## S4 method for signature 'atomic,codeplan'
setCodeplan(x,value)
## S4 method for signature 'atomic,NULL'
setCodeplan(x,value)
codeplan(x) <- value
Arguments
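x
-
for codeplan(x), an object that inherits from class "item.list", i.e. a "data.set" object or an "importer" object; it can also be an object that inherits from class "item". For setCodeplan(x,value), a data frame, a data set, an item, or an atomic vector, as listed in the method signatures under Usage.
value
-
an object as it would be returned by codeplan(x), or NULL.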
Value
If applicable, codeplan() returns a data frame with the additional S3 class attribute "codeplan". For arguments for which the relevant information does not exist, the function returns NULL. A returned data frame has the following variables:
name
-
The name of the item/variable in the item list or data set.
description
-
The description/variable label string of the item/variable.
annotation
-
code to recreate the annotation attribute,
labels
-
code to recreate the value labels,
value.filter
-
code to recreate the value filter attribute (a declaration of missing values, a range of valid values, or an enumeration of valid values),
mode
-
a character string that describes the storage mode, such as "character", "integer", or "numeric".
measurement
-
a character string with the measurement level: "nominal", "ordinal", "interval", or "ratio".
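Because a codeplan is an ordinary data frame with an extra class attribute, its variables can be inspected with the usual data frame tools. The following sketch assumes made-up data and only relies on the variable names listed above; it does not show output, since the exact strings depend on the package version.

```r
library(memisc)

# Illustrative data; item names and values are made up for this sketch
ds <- data.set(
  vote   = c(1, 2, 3, 9, 1, 2),
  income = c(1800, 2400, 950, 3100, 2750, 1200)
)
ds <- within(ds, {
  description(vote) <- "Vote intention"
  measurement(vote) <- "nominal"
  labels(vote) <- c(Conservatives = 1, Labour = 2,
                    "Liberal Democrats" = 3, "Answer refused" = 9)
  missing.values(vote) <- 9
  description(income) <- "Household income"
  measurement(income) <- "ratio"
})

cp <- codeplan(ds)

# Variables available in the codeplan
cp_columns <- names(cp)

# Measurement level per item, keyed by item name
levels_by_item <- setNames(as.character(cp$measurement), as.character(cp$name))

# Code string that would recreate the value labels of 'vote'
vote_labels_code <- cp$labels[cp$name == "vote"]
```

Extracting single columns with `$` sidesteps the dedicated print method, which is convenient when only one attribute of the items is of interest.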
Examples
Data1 <- data.set(
vote = sample(c(1,2,3,8,9,97,99),size=300,replace=TRUE),
region = sample(c(rep(1,3),rep(2,2),3,99),size=300,replace=TRUE),
income = exp(rnorm(300,sd=.7))*2000
)
Data1 <- within(Data1,{
description(vote) <- "Vote intention"
description(region) <- "Region of residence"
description(income) <- "Household income"
foreach(x=c(vote,region),{
measurement(x) <- "nominal"
})
measurement(income) <- "ratio"
labels(vote) <- c(
Conservatives = 1,
Labour = 2,
"Liberal Democrats" = 3,
"Don't know" = 8,
"Answer refused" = 9,
"Not applicable" = 97,
"Not asked in survey" = 99)
labels(region) <- c(
England = 1,
Scotland = 2,
Wales = 3,
"Not applicable" = 97,
"Not asked in survey" = 99)
foreach(x=c(vote,region,income),{
annotation(x)["Remark"] <- "This is not a real survey item, of course ..."
})
missing.values(vote) <- c(8,9,97,99)
missing.values(region) <- c(97,99)
})
cpData1 <- codeplan(Data1)
Data2 <- data.frame(
vote = sample(c(1,2,3,8,9,97,99),size=300,replace=TRUE),
region = sample(c(rep(1,3),rep(2,2),3,99),size=300,replace=TRUE),
income = exp(rnorm(300,sd=.7))*2000
)
codeplan(Data2) <- cpData1
codebook(Data2)
====================================================================================================
vote 'Vote intention'
----------------------------------------------------------------------------------------------------
Storage mode: double
Measurement: nominal
Missing values: 8, 9, 97, 99
Values and labels N Valid Total
1 'Conservatives' 48 35.3 16.0
2 'Labour' 44 32.4 14.7
3 'Liberal Democrats' 44 32.4 14.7
8 M 'Don't know' 35 11.7
9 M 'Answer refused' 47 15.7
97 M 'Not applicable' 44 14.7
99 M 'Not asked in survey' 38 12.7
Remark:
This is not a real survey item, of course ...
====================================================================================================
region 'Region of residence'
----------------------------------------------------------------------------------------------------
Storage mode: double
Measurement: nominal
Missing values: 97, 99
Values and labels N Valid Total
1 'England' 123 46.1 41.0
2 'Scotland' 99 37.1 33.0
3 'Wales' 45 16.9 15.0
99 M 'Not asked in survey' 33 11.0
Remark:
This is not a real survey item, of course ...
====================================================================================================
income 'Household income'
----------------------------------------------------------------------------------------------------
Storage mode: double
Measurement: ratio
Min: 315.228
Max: 17402.271
Mean: 2526.490
Std.Dev.: 1974.115
Remark:
This is not a real survey item, of course ...
# Note the difference between 'as.data.frame' and setting
# the codeplan to NULL:
Data2df <- as.data.frame(Data2)
codeplan(Data2) <- NULL
str(Data2)
'data.frame': 300 obs. of 3 variables:
$ vote : num 9 97 3 3 97 97 9 3 1 1 ...
$ region: num 1 3 2 3 2 3 1 2 3 2 ...
$ income: num 1721 598 750 971 3005 ...
str(Data2df)
'data.frame': 300 obs. of 3 variables:
$ vote : Factor w/ 3 levels "Conservatives",..: NA NA 3 3 NA NA NA 3 1 1 ...
..- attr(*, "label")= chr "Vote intention"
$ region: Factor w/ 3 levels "England","Scotland",..: 1 3 2 3 2 3 1 2 3 2 ...
..- attr(*, "label")= chr "Region of residence"
$ income: num 1721 598 750 971 3005 ...
..- attr(*, "label")= chr "Household income"
# Codeplans of individual survey items can also be queried and manipulated:
vote <- Data1$vote
str(vote)
Nmnl. item w/ 7 labels for 1,2,3,... + ms.v. num [1:300] 9 8 9 99 99 99 3 97 8 3 ...
cp.vote <- codeplan(vote)
codeplan(vote) <- NULL
str(vote)
num [1:300] 9 8 9 99 99 99 3 97 8 3 ...
codeplan(vote) <- cp.vote
vote
Item 'Vote intention' (measurement: nominal, type: double, length = 300)
[1:300] *Answer refused *Don't know *Answer refused *Not asked in survey *Not asked in survey ...