Title: | DDI with R |
---|---|
Description: | Useful functions for various DDI (Data Documentation Initiative) related inputs and outputs. Converts data files to and from DDI, SPSS, Stata, SAS, R and Excel, including user declared missing values. |
Authors: | Adrian Dusa [aut, cre, cph] |
Maintainer: | Adrian Dusa <[email protected]> |
License: | GPL (>= 3) |
Version: | 0.18.9 |
Built: | 2024-12-06 12:39:33 UTC |
Source: | https://github.com/dusadrian/DDIwR |
This function converts (or transfers) between R, Stata, SPSS, SAS, Excel and DDI XML files. Unlike the regular import / export functions from packages haven or rio, this function uses the DDI standard as an exchange platform and facilitates a consistent conversion of the missing values.
convert( from, to = NULL, declared = TRUE, chartonum = FALSE, recode = TRUE, encoding = "UTF-8", csv = NULL, ... )
from |
A path to a file, or a data.frame object |
to |
Character, the name of a software package or a path to a specific file |
declared |
Logical, return the resulting dataset as a declared object |
chartonum |
Logical, recode character categorical variables to numerical categorical variables |
recode |
Logical, recode missing values |
encoding |
The character encoding used to read a file |
csv |
Complex argument, see the Details section |
... |
Additional parameters passed to other functions, see the Details section |
When the argument to specifies a certain statistical package ("R", "Stata", "SPSS", "SAS", "XPT") or "Excel", the name of the destination file will be identical to the one in the argument from, with an automatically added software specific extension.
SPSS portable files (with the extension ".por") can only be read, not written.
The argument to can also be specified as a path to a specific file, in which case the software package is determined from its file extension. The following extensions are currently recognized: .xml for DDI, .rds for R, .dta for Stata, .sav for SPSS, .xpt for SAS, and .xlsx for Excel.
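The extension lookup described above can be pictured with a small base R sketch. The helper guess_software() is hypothetical and not part of DDIwR; treating extensions as case insensitive is also an assumption of this sketch, not documented behaviour.

```r
# Hypothetical helper, not part of DDIwR: map a file extension to the
# software name, following the list of extensions given above.
guess_software <- function(path) {
  known <- c(
    xml = "DDI", rds = "R", dta = "Stata",
    sav = "SPSS", xpt = "SAS", xlsx = "Excel"
  )
  # Assumption of this sketch: extensions are compared in lower case
  ext <- tolower(tools::file_ext(path))
  unname(known[ext])  # NA for extensions that are not in the list
}

guess_software("survey/test.sav")
guess_software("notes.txt")
```

An unrecognized extension yields NA here; how convert() itself reacts to one is not stated in this documentation.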
Additional parameters can be specified via the three dots argument ..., that are passed to the respective functions from packages haven and readxl. For instance, the function write_dta() has an additional argument called version when writing a Stata file.
The most important argument to consider is user_na, part of the function read_sav(). It defaults to FALSE in package haven, but package DDIwR uses it with the value TRUE. It can be deactivated by explicitly specifying user_na = FALSE in function convert().
The same three dots argument is used to pass additional parameters to other functions in this package, for instance exportCodebook() when writing to a DDI file. One of its arguments, embed (activated by default), can be used to control embedding the data in the XML file. Deactivating it will create a CSV file in the same directory, using the same file name as the XML file.
When converting from DDI, if the dataset is not embedded in the XML file, the CSV file is expected to be found in the same directory as the DDI Codebook, and it should have the same file name as the XML file. The path to the CSV file can be provided via the csv argument. Additional formal parameters of the function read.csv() can be passed via the same three dots ... argument. Alternatively, the csv argument can also be an R data frame.
When converting to DDI, if the argument embed is set to FALSE, users have the option to save the data in a separate CSV file (the default) or not to save the data at all, by setting csv to FALSE.
When writing a DDI .xml file, unique IDs are generated for all variables, if not already present in the attributes. These IDs are useful for newer versions of the DDI Codebook, for referencing purposes.
The argument chartonum signals recoding character categorical variables, and employs the function recodeCharcat(). This only makes sense when recoding to Stata, which does not allow allocating labels for anything but integer variables.
If the argument to is left as NULL, the data is (invisibly) returned to the R environment. Conversion to R, either in the working space or as a data file, will result (by default) in a data frame containing declared labelled variables, as defined in package declared.
The current version reads and creates DDI Codebook version 2.6, with future versions to extend the functionality for DDI Lifecycle versions 3.x and link to the future package DDI4R for the UML model based version 4. It extends the standard DDI Codebook by offering the possibility to embed a serialized version of the R dataset into the XML file containing the Codebook, within a notes child of the fileDscr component. This type of generated codebook is unique to this package and automatically detected when converting to another statistical software. This will likely be replaced with a time insensitive text version.
Converting to SAS is experimental, and it relies on the same package haven that uses the ReadStat C library. The safest way to convert, which at the same time consistently converts the missing values, is to export the data to a CSV file, create a setup file with the function setupfile(), and run the commands manually.
Converting data from SAS is possible, however reading the metadata is also experimental (the current version of haven only partially imports the metadata). Either specify the path to the catalog file using the argument catalog_file from the function read_sas(), or have the catalog file in the same directory as the data set, with the same file name and the extension .sas7bcat.
The argument recode controls how missing values are treated. If the input file has SPSS like numeric codes, they will be recoded to extended (a-z) missing types when converting to Stata or SAS. If the input has Stata like extended codes, they will be recoded to SPSS like numeric codes.
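The idea of pairing numeric codes with extended letters can be illustrated in base R. This is only a sketch of the mapping, not the code used by convert(); the codes and the variable x below are hypothetical.

```r
# Illustration only: pair SPSS-like numeric missing codes with Stata-like
# extended letters, one letter per distinct code.
spss_codes <- c(-91, -92, -99)          # hypothetical numeric missing codes
to_stata <- setNames(letters[seq_along(spss_codes)], spss_codes)

x <- c(1, 2, -91, 3, -99)               # hypothetical variable
is_missing <- x %in% spss_codes

# Extended missing type for each value (NA where the value is not missing)
stata_type <- rep(NA_character_, length(x))
stata_type[is_missing] <- to_stata[as.character(x[is_missing])]
stata_type
```

Reading the same lookup in the opposite direction gives the conversion from extended letters back to numeric codes. For real data, the function recodeMissings() documented further down is the one to use.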
The character encoding is usually passed to the corresponding functions from package haven. It can be set to NULL to fall back to the default in that package.
Converting to SPSS works with numerical and character labelled vectors, with or without labels. Date/Time variables are partially supported by package haven: either having such a variable with no labels and missing values, or if labels and missing values are declared the variable is automatically coerced to numeric, and users may have to make the proper settings in SPSS.
An invisible R data frame, when the argument to is NULL.
Adrian Dusa
DDI - Data Documentation Initiative, see the DDI Alliance website.
setupfile, getCodebook, declared
## Not run:
# Assuming an SPSS file called test.sav is located in the working directory
# The following command imports the file into the R environment:
test <- convert("test.sav")

# The following command will extract the metadata in a DDI Codebook and
# produce a test.xml file in the same directory
convert("test.sav", to = "DDI")

# The data may be saved separately from the DDI file, using:
convert("test.sav", to = "DDI", embed = FALSE)

# To produce a Stata file:
convert("test.sav", to = "Stata")

# To produce an R file:
convert("test.sav", to = "R")

# To produce an Excel file:
convert("test.sav", to = "Excel")
## End(Not run)
addChildren() adds one or more children to a standard DDI Codebook element (see makeElement), anyChildren() checks if an element has any children at all, hasChildren() checks if the element has specific children, indexChildren() returns the positions of the children among all containing children, and getChildren() extracts them. For attributes and content, there are dedicated functions to add*(), remove*() and change*().
addChildren(children, to, overwrite = TRUE, ...)
anyChildren(element)
getChildren(xpath, from, ...)
hasChildren(element, name)
indexChildren(element, name)
removeChildren(name, from, overwrite = TRUE, ...)
addContent(content, to, overwrite = TRUE)
changeContent(content, to, overwrite = TRUE)
removeContent(from, overwrite = TRUE)
addAttributes(attrs, to, overwrite = TRUE)
anyAttributes(element)
changeAttributes(attrs, from, overwrite = TRUE)
hasAttributes(element, name)
removeAttributes(name, from, overwrite = TRUE)
children |
A standard element of class |
to |
A standard element of class |
overwrite |
Logical, overwrite the original object in the parent frame. |
... |
Other arguments, mainly for internal use. |
element |
A standard element of class |
xpath |
Character, a path to a DDI Codebook element. |
from |
A standard element of class |
name |
Character, name(s) of specific child element / attribute. |
content |
Character, the text content of a DDI element. |
attrs |
A list of specific attribute names and values. |
Although an XML list generally allows for multiple contents, sometimes spread between the children elements, it is preferable to maintain a single content (possibly separated with carriage return characters for separate lines).
Attributes are unique, and can be changed by simply referring to their names. Elements, however, can be repeated. For instance, the element var describes variables within the dataDscr (data description) sub-element of the codeBook. There are as many var elements as there are variables in the dataset, so it is not possible to change a specific var element by referring to its name. For this purpose, it is useful to extract the positions of all var elements to iterate through, which is the purpose of the function indexChildren().
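Why positions are needed for repeated elements can be seen with a plain R list. The list below only mimics the shape of a dataDscr element and is not a real DDI element (those are built with makeElement()); whether indexChildren() counts positions in exactly this way is an assumption of the sketch.

```r
# Plain-list stand-in for a dataDscr element with repeated var children.
# All labels and the note are hypothetical.
dataDscr <- list(
  .extra = list(),
  var = list(labl = list("Age in years")),
  var = list(labl = list("Gender")),
  notes = list("hypothetical note")
)

# Positions of all children named "var" among all components
positions <- which(names(dataDscr) == "var")

for (i in positions) {
  # Change one specific var by position, since the name is not unique
  dataDscr[[i]]$labl <- list(toupper(dataDscr[[i]]$labl[[1]]))
}

dataDscr[[3]]$labl[[1]]
```

Referring to dataDscr$var would only ever reach the first var component, which is why the loop goes through positions instead.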
Future versions will allow deep manipulations of child elements using the xpath argument.
If there is more than one child, they should be grouped into a list.
An invisible standard DDI element. Functions any*() and has*() return a logical (vector).
Adrian Dusa
Create a DDI Codebook version 2.6, XML file structure.
exportCodebook(codeBook, to = "", OS = "", indent = 2, ...)
codeBook |
A standard element of class |
to |
either a character string naming a file or a connection open for writing ("" indicates output to the console) |
OS |
The target operating system, for the eol - end of line character(s) |
indent |
Indent width, in number of spaces |
... |
Other arguments, mainly for internal use |
The information object is a codeBook DDI element having at least two main children:
fileDscr, with the data provided as a sub-component named datafile
dataDscr, having as many components as the number of variables in the (meta)data.
For the moment, only DDI codebook version 2.6 is exported, and DDI Lifecycle is planned for future releases.
A small number of required DDI specific elements and attributes have generic default values, if not otherwise specified in the codeBook list object. For the current version, these are: monolang, xmlang, IDNo, titl, agency, URI (for the holdings element), distrbtr, abstract and level (for the otherMat element).
The codeBook object is exported as provided, and it is the user's responsibility to test its validity against the XML schema. Most of these arguments help create the mandatory element stdyDscr, which cannot be harvested from the dataset. If this element is not already present, providing any of these arguments via the three dots ... gate signals its automatic creation and inclusion, with the values provided.
Argument xmlang expects a two-letter ISO language code, for instance "en" to indicate English, or "ro" to indicate Romanian etc. The original DDI Codebook attribute is called xml:lang, which had to be renamed for this R function because a colon is not valid in an R argument name.
A logical argument monolang signals if the document is monolingual, in which case the attribute xmlang is placed a single time for the entire document in the codeBook element. For multilingual documents, xmlang should be placed in the attributes of various other (child) elements, for instance abstract, or the study title, name of the distributing institution, variable labels etc.
The argument OS can be either:
"windows" (default), or "Windows", "Win", "win",
"MacOS", "Darwin", "Apple", "Mac", "mac",
"Linux", "linux".
The end of line separator changes only when the target OS is different from the running OS.
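As a rough picture of what the OS argument selects, the sketch below maps the aliases listed above to the conventional end of line character(s). The helper eol_for() is hypothetical and not part of DDIwR, and the exact characters written by the package are assumed to follow the usual conventions (CR LF on Windows, LF on macOS and Linux).

```r
# Hypothetical helper, not part of DDIwR: conventional end-of-line
# character(s) for a target OS, accepting the aliases listed above.
eol_for <- function(OS = "windows") {
  os <- tolower(OS)
  if (os %in% c("windows", "win")) {
    "\r\n"
  } else if (os %in% c("macos", "darwin", "apple", "mac", "linux")) {
    "\n"
  } else {
    stop("Unknown target OS: ", OS)
  }
}

# Two lines joined with the Windows separator
paste(c("<codeBook>", "</codeBook>"), collapse = eol_for("Win"))
```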
The argument indent controls how many spaces will be used in the XML file, to indent the different sub-elements.
An XML file containing DDI Codebook version 2.6 metadata.
Adrian Dusa
https://ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/field_level_documentation.html
## Not run:
exportCodebook(codeBook, to = "codebook.xml")

# using a namespace
exportCodebook(codeBook, to = "codebook.xml", xmlns = "ddi")
## End(Not run)
Extract a list containing the variable labels, value labels and any available information about missing values.
getCodebook(from = NULL, encoding = "UTF-8", ignore = NULL, ...)
from |
A path to a file, or a data frame object |
encoding |
The character encoding used to read a file |
ignore |
Character, ignore DDI elements when reading from an XML file |
... |
Additional arguments for this function (internal use only) |
This function extracts the metadata from an R dataset, or alternatively it can read an XML file containing a DDI codebook version 2.6, or an SPSS or Stata file and returns a list containing the variable labels, value labels and information about the missing values.
If the input is a dataset, it will extract the variable level metadata (labels, missing values etc.). From a DDI XML file, it will import all metadata elements, the most expensive being the data description.
For the moment, only DDI Codebook is supported, but DDI Lifecycle is planned to be implemented.
An R list roughly equivalent to a DDI Codebook, containing all variables, their corresponding variable labels and value labels, and (if applicable) missing values if imported and found.
Adrian Dusa
x <- data.frame(
  A = declared(
    c(1:5, -92),
    labels = c(Good = 1, Bad = 5, NR = -92),
    na_values = -92
  ),
  C = declared(
    c(1, -91, 3:5, -92),
    labels = c(DK = -91, NR = -92),
    na_values = c(-91, -92)
  )
)

getCodebook(from = x)
catgry elements for a particular variable
Utility function to create the catgry elements, as well as all necessary sub-elements (e.g. catValu, labl, varFormat) along with their associated XML attributes.
makeCategories(metadata)
metadata |
A list of two or three components: |
A list of standard catgry DDI elements.
Adrian Dusa
notes element for the dataset.
Create the notes element to embed a serialized, gzip-ed version of the data in the fileDscr section of the codeBook.
makeDataNotes(data)
data |
An R dataframe. |
A standard notes DDI element.
Adrian Dusa
Creates a standard DDI element.
makeElement( name, children = NULL, attributes = NULL, content = NULL, fill = FALSE, ... )
name |
Character, a DDI Codebook element name. |
children |
A list of standard DDI codebook elements. |
attributes |
A vector of named values. |
content |
Character scalar. |
fill |
Logical, fill the element with arbitrary values for its mandatory children and attributes |
... |
Other arguments, see Details. |
The structure of a DDI element in R follows the usual structure of an XML node, as returned by the function as_list() from package xml2, with one additional (first) component named ".extra" to accommodate any other information that is not part of the DDI element.
In the DDI Codebook, most elements and their attributes are optional, but some are mandatory. In case of attributes, some become mandatory only if the element itself is present. The mandatory elements need to be present in the final version of the Codebook, to pass the validation against the XML schema.
By activating the argument fill, this function creates DDI elements containing all mandatory (sub)elements and (their) attributes, filled with arbitrary values that can be changed later on. Some recommended elements are also filled, as expected by the CESSDA Data Catalogue profile for DDI Codebook.
By default, the Codebook is assumed to have a single language for all elements. The argument monolang can be deactivated through the "..." gate, in which situation the appropriate elements will receive a default argument xmlang = "en". For other languages, that argument can also be provided through the "..." gate.
One such DDI Codebook element is the stdyDscr (Study Description), with the associated mandatory children, for instance title, ID number, distributor, citation, abstract etc.
The complete list of elements for which default values are added is: "IDNo", "titl", "titlStmt", "distrbtr", "distStmt", "holdings", "citation", "abstract", "stdyInfo", "stdyDscr", "prodDate", "software", "prodStmt", "docDscr" and "otherMat".
A standard list element of class "DDI" with reserved component names.
Adrian Dusa
addChildren, getChildren, showDetails
stdyDscr <- makeElement("stdyDscr", fill = TRUE)

# easier to extract with:
getChildren("citation/titlStmt/titl", from = stdyDscr)
Recodes a character categorical variable to a numerical categorical variable.
recodeCharcat(x, ...)
x |
A character categorical variable |
... |
Other internal arguments |
For this function, a categorical variable is something other than a base factor. It should be an object of class "declared", or an object of class "haven_labelled_spss", with a specific attribute called "labels" that stores the value labels.
A numeric categorical variable of the same class as the input.
Adrian Dusa
x <- declared(
  c(letters[1:5], -91),
  labels = c(Good = "a", Bad = "e", NR = -91),
  na_values = -91
)

recodeCharcat(x)
A function to recode all missing values to either SPSS or Stata types, uniformly (re)using the same codes across all variables.
recodeMissings( dataset, to = c("SPSS", "Stata", "SAS"), dictionary = NULL, start = -91, ... )
dataset |
A data frame |
to |
Software to recode missing values for |
dictionary |
A named vector, with corresponding Stata missing codes to SPSS missing values |
start |
Numeric, the value from which automatically generated SPSS like missing codes start (by default -91, see Details) |
... |
Other internal arguments |
When a dictionary is not provided, it is automatically constructed from the available data and metadata, using negative numbers starting from -91 and up to 27 letters starting with "a".
If the dataset contains mixed variables with SPSS and Stata style missing values, unless otherwise specified in a dictionary it uses other codes than the existing ones.
For the SPSS type of missing values, the resulting variables are coerced to a declared labelled format.
Unlike SPSS, Stata does not allow labels for character values. Character values and their labels cannot both be transported from SPSS to Stata, it is either one or the other. If labels are more important to preserve than original values (especially the information about the missing values), the argument chartonum replaces all character values with suitable, non-overlapping numbers and adjusts the labels accordingly.
If no labels are found in the metadata, the original values are preserved.
A data frame with all missing values recoded consistently.
Adrian Dusa
x <- data.frame(
  A = declared(
    c(1:5, -92),
    labels = c(Good = 1, Bad = 5, NR = -92),
    na_values = -92
  ),
  # labelled() and tagged_na() both come from package haven
  B = haven::labelled(
    c(1:5, haven::tagged_na('a')),
    labels = c(DK = haven::tagged_na('a'))
  ),
  C = declared(
    c(1, -91, 3:5, -92),
    labels = c(DK = -91, NR = -92),
    na_values = c(-91, -92)
  )
)

xrec <- recodeMissings(x, to = "Stata")

attr(xrec, "dictionary")

dictionary <- data.frame(
  old = c(-91, -92, "a"),
  new = c("c", "d", "c")
)
recodeMissings(x, to = "Stata", dictionary = dictionary)

recodeMissings(x, to = "SPSS")

dictionary$new <- c(-97, -98, -97)

recodeMissings(x, to = "SPSS", dictionary = dictionary)

recodeMissings(x, to = "SPSS", start = 991)

recodeMissings(x, to = "SPSS", start = -8)
Search function to return elements that contain a certain word or regular expression pattern.
searchFor( x, where = c("everywhere", "title", "description", "attributes", "examples"), ... )
x |
Character, either word(s) or a regular expression. |
where |
Character, in which section(s) to search for. |
... |
Other arguments to be passed to the grepl() function. |
Character vector of DDI element names.
Adrian Dusa
Creates a setup file, based on a list of variable and value labels.
setupfile( obj, file = "", type = "all", csv = NULL, recode = TRUE, OS = "", stringnum = TRUE, ... )
obj |
A data frame, or a list object containing the metadata, or a path to a data file or to a directory where such objects are located, for batch processing |
file |
Character, the (path to the) setup file to be created |
type |
The type of setup file, can be: "SPSS", "Stata", "SAS", "R", or "all" (default) |
csv |
The original dataset, used to create the setup file commands, or a path to the directory where the .csv files are located, for batch processing |
recode |
Logical, recode missing values to extended .a-.z range |
OS |
The target operating system, for the eol - end of line character(s) |
stringnum |
Logical, recode string variables to numeric |
... |
Other arguments, see Details below |
When a path to a metadata directory is specified for the argument obj, the next argument file is silently ignored and all created setup files are saved in a directory called "Setup Files" that (if not already found) is created in the working directory.
The argument file expects the name of the final setup file being saved on the disk. If not specified, the name of the object provided for the obj argument will be used as a filename.
If file is specified, the argument type is automatically determined from the file's extension, otherwise when type = "all", the function produces one setup file for each supported type.
If batch processing multiple files, the function will inspect all files in the provided directory, and retain only those with the extension .R or .r or DDI versions with the extension .xml or .XML (it will subsequently generate an error if the .R files do not contain an object list, or if the .xml files do not contain a DDI structured metadata file).
If the metadata directory contains a subdirectory called "data" or "Data", it will match the name of the metadata file with the name of the .csv file (their names have to be exactly the same, regardless of their extension).
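The name matching described above can be sketched in base R by comparing file names with their extensions removed. The file names below are hypothetical, and this is an illustration of the rule rather than the code used by setupfile().

```r
# Hypothetical metadata and data file names
meta_files <- c("survey2020.xml", "panel.R")
csv_files  <- c("panel.csv", "survey2020.csv", "other.csv")

# Names must be exactly the same once the extension is dropped
meta_names <- tools::file_path_sans_ext(meta_files)
csv_names  <- tools::file_path_sans_ext(csv_files)

# For each metadata file, the matching .csv file (NA if there is none)
matched <- csv_files[match(meta_names, csv_names)]
data.frame(metadata = meta_files, csv = matched)
```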
The csv argument can provide a data frame object produced by reading the .csv file, or a path to the directory where the .csv files are located. If the user doesn't provide something for this argument, the function will check the existence of a subdirectory called data in the directory where the metadata files are located.
In batch mode, the code starts with the argument delim = ",", but if the .csv file is delimited differently it will also try hard to find other delimiters that will match the variable names in the metadata file. At the initial version 0.1-0, the automatically detected delimiters include ";" and "\t".
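One way to picture this kind of delimiter detection is to split the header line with each candidate and keep the first one that reproduces the variable names from the metadata. The helper detect_delim() is hypothetical and is not the code used by setupfile().

```r
# Hypothetical helper, not part of DDIwR: pick the first delimiter whose
# header fields match the variable names from the metadata.
detect_delim <- function(csvfile, varnames, candidates = c(",", ";", "\t")) {
  header <- readLines(csvfile, n = 1)
  for (delim in candidates) {
    fields <- strsplit(header, delim, fixed = TRUE)[[1]]
    fields <- gsub('^"|"$', "", fields)   # drop surrounding quotes
    if (setequal(fields, varnames)) return(delim)
  }
  NA_character_                           # no candidate matched
}

# A small semicolon-delimited file written to a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("A;B;C", "1;2;3"), tmp)
detect_delim(tmp, c("A", "B", "C"))
```

The candidates are tried in the order given, starting with the comma, which mirrors the order mentioned in the text.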
The argument OS (case insensitive) can be either:
"Windows" (default), or "Win",
"MacOS", "Darwin", "Apple", "Mac",
"Linux".
The end of line character(s) changes only when the target OS is different from the running OS.
A setup file to complement the imported raw dataset.
Adrian Dusa
## Not run:
# IMPORTANT:
# make sure to set the working directory to a directory with
# read/write permissions
# setwd("/path/to/read/write/directory")

setupfile(codeBook)

# if the csv data file is available
setupfile(codeBook, csv="/path/to/csv/file.csv")

# generating a specific type of setup file
setupfile(codeBook, file = "codeBook.do") # type = "Stata" also works

# other types of possible utilizations, using paths to specific files
# an XML file containing a DDI metadata object
setupfile("/path/to/the/metadata/file.xml", csv="/path/to/csv/file.csv")

# or in batch mode, specifying entire directories
setupfile("/path/to/the/metadata/directory", csv="/path/to/csv/directory")
## End(Not run)
Describe what a DDI element is
showDetails(x, ...)
showDescription(x, ...)
showAttributes(x, name = NULL, ...)
globalAttributes()
showExamples(x, ...)
showRelations(x, ...)
showLineages(x, ...)
x |
Character, a DDI Codebook element name. |
... |
Other arguments, mainly for internal use. |
name |
Character, print only a specific element (name) |
All attributes having predefined values such as "(Y | N) : N" are mandatory if the element is used.
Adrian Dusa
showDetails("codeBook")
showAttributes("catgry")
showExamples("abstract")
showLineages("titl")
Attempts a minimal validation of a DDI Codebook element, by searching for mandatory elements and attributes.
testValid(element, monolang = TRUE)
element |
A standard element of class |
monolang |
Logical, the codebook file is monolingual |
This function currently attempts a minimal check for the most essential mandatory elements, such as the stdyDscr. A bare version of this element, filled with arbitrary default values, can be produced with the function makeElement(), activating its argument fill.
It also checks for chained expectations, that is element X is mandatory only
if the parent element is present.
Future versions will implement more functionality for recommended elements and attributes, with the intention to provide a 1:1 validation as offered by the "CESSDA Metadata Validator".
To ease the validation of the DDI Codebook XML files, the argument monolang is activated by default. This means a single attribute xmlang in the main codeBook element. For multi-language codebooks, an error is flagged if this argument is missing where appropriate.
A character vector of validation problems found.
Adrian Dusa
Update an XML file containing a DDI Codebook.
updateCodebook(xmlfile, with, ...)
xmlfile |
A path to a DDI Codebook XML document. |
with |
An R object containing a root |
... |
Other internal arguments. |
This function replaces entire Codebook sections. Any such section present in the R object will replace the corresponding section from the XML document.
Adrian Dusa