# Generating Rd files

## Roxygen process

There are three steps in the transformation from roxygen comments in your source file to human readable documentation:

1. You add roxygen comments to your source file.
2. roxygen2::roxygenise() converts roxygen comments to .Rd files.
3. R converts .Rd files to human readable documentation.

The process starts when you add specially formatted roxygen comments to your source file. Roxygen comments start with #' so you can continue to use regular comments for other purposes.

#' Add together two numbers
#'
#' @param x A number
#' @param y A number
#' @return The sum of \code{x} and \code{y}
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}

For the example above, this will generate man/add.Rd, which looks like:

% Generated by roxygen2 (3.2.0): do not edit by hand
\name{add}
\alias{add}
\title{Add together two numbers}
\usage{
add(x, y)
}
\arguments{
\item{x}{A number}

\item{y}{A number}
}
\value{
The sum of \code{x} and \code{y}
}
\description{
Add together two numbers
}
\examples{
add(1, 1)
add(10, 1)
}

Rd files are a special file format loosely based on LaTeX. You can read more about the Rd format in the R extensions manual. With roxygen2, there are few reasons to know about Rd files, so here I’ll avoid discussing them as much as possible, focussing instead on what you need to know about roxygen2.

When you use ?x, help("x") or example("x") R looks for an Rd file containing \alias{x}. It then parses the file, converts it into HTML and displays it. These functions look for an Rd file in installed packages. This isn’t very useful for package development, because you want to use the .Rd files in the source package. For this reason, we recommend that you use roxygen2 in conjunction with devtools: devtools::load_all() automatically adds shims so that ? and friends will look in the development package. Note, however, that this preview does not work with intra-package links. To preview those, you’ll need to install the package. If you use RStudio, the easiest way to do this is to click the “Build & Reload” button.

## Basics

Roxygen comments start with #'. Each documentation block starts with some text which defines the title, the description, and the details. Here’s an example showing what the documentation for sum() might look like if it had been written with roxygen:

#' Sum of vector elements
#'
#' sum returns the sum of all the values present in its arguments.
#'
#' This is a generic function: methods can be defined for it directly
#' or via the [Summary()] group generic. For this to work properly,
#' the arguments ... should be unnamed, and dispatch is on the
#' first argument.
sum <- function(..., na.rm = FALSE) {}

This introductory block is broken up as follows:

• The first sentence is the title: that’s what you see when you look at help(package = mypackage) and is shown at the top of each help file. It should generally fit on one line, be written in sentence case, and not end in a full stop.

• The second paragraph is the description: this comes first in the documentation and should briefly describe what the function does.

• The third and subsequent paragraphs go into the details: this is an (often long) section that comes after the argument description and should provide any other important details of how the function operates. The details are optional.

Note that you can use markdown formatting within roxygen blocks, as long as you have the following in the DESCRIPTION:

Roxygen: list(markdown = TRUE)

We’ll discuss that more in formatting. Also note the wrapping of the roxygen block. You should make sure that your comments are less than ~80 columns wide.

## Object specifics

Further details of roxygen2 depend on what you’re documenting. You use tags like @tag details. Tags must start at the beginning of a line, and the content of a tag extends to the start of the next tag (or the end of the block). The following sections describe the most commonly used tags for functions, S3, S4, and RC objects, and datasets.

### Functions

Functions are the most commonly documented objects. Most functions use three tags:

• @param name description describes the inputs to the function. The description should provide a succinct summary of the type of the parameter (e.g. a string, a numeric vector), and if not obvious from the name, what the parameter does. The description should start with a capital letter and end with a full stop. It can span multiple lines (or even paragraphs) if necessary. All parameters must be documented.

You can document multiple arguments in one place by separating the names with commas (no spaces). For example, to document both x and y, you can say @param x,y Numeric vectors.

• @examples provides executable R code showing how to use the function in practice. This is a very important part of the documentation because many people look at the examples before reading anything. Example code must work without errors as it is run automatically as part of R CMD check.

However, for the purpose of illustration, it’s often useful to include code that causes an error. \dontrun{} allows you to include code in the example that is never run. There are two other special commands. \dontshow{} is run, but not shown in the help page: this can be useful for informal tests. \donttest{} is run in examples, but not run automatically in R CMD check. This is useful if you have examples that take a long time to run. The options are summarised below.

| Command     | example | help | R CMD check |
|-------------|---------|------|-------------|
| \dontrun{}  |         | x    |             |
| \dontshow{} | x       |      | x           |
| \donttest{} | x       | x    |             |

Instead of including examples directly in the documentation, you can put them in separate files and use @example path/relative/to/package/root to insert them into the documentation.

• @return description describes the output from the function. This is not always necessary, but is a good idea if you return different types of outputs depending on the input, or you’re returning an S3, S4 or RC object.
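As a rough sketch of how the example-control commands described above can sit together in one @examples block, consider the following. The function safe_mean() and its body are hypothetical, made up purely for illustration.

```r
#' Mean of a numeric vector
#'
#' @param x A numeric vector.
#' @return A length-one numeric vector.
#' @examples
#' safe_mean(c(1, 2, 3))
#'
#' \dontshow{
#' # Run, but hidden from the help page: an informal test.
#' stopifnot(safe_mean(c(1, 3)) == 2)
#' }
#'
#' \donttest{
#' # Takes a while, so it is not run automatically in R CMD check.
#' safe_mean(runif(1e7))
#' }
#'
#' \dontrun{
#' # Errors by design, so it is never run.
#' safe_mean("a")
#' }
safe_mean <- function(x) {
  stopifnot(is.numeric(x))
  sum(x) / length(x)
}
```

Only the first call and the \donttest{} call appear as ordinary examples on the help page; the \dontrun{} code is shown but never executed.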

We could use these new tags to improve our documentation of sum() as follows:

#' Sum of vector elements
#'
#' sum() returns the sum of all the values present in its arguments.
#'
#' This is a generic function: methods can be defined for it directly
#' or via the [Summary] group generic. For this to work properly,
#' the arguments ... should be unnamed, and dispatch is on the
#' first argument.
#'
#' @param ... Numeric, complex, or logical vectors.
#' @param na.rm A logical scalar. Should missing values (including NaN)
#'   be removed?
#' @return If all inputs are integer and logical, then the output
#'   will be an integer. If integer overflow
#'   (<http://en.wikipedia.org/wiki/Integer_overflow>) occurs, the output
#'   will be NA with a warning. Otherwise it will be a length-one numeric or
#'   complex vector.
#'
#'   Zero-length vectors have sum 0 by definition. See
#'   <http://en.wikipedia.org/wiki/Empty_sum> for more details.
#' @examples
#' sum(1:10)
#' sum(1:5, 6:10)
#' sum(F, F, F, T, T)
#'
#' sum(.Machine$integer.max, 1L)
#' sum(.Machine$integer.max, 1)
#'
#' \dontrun{
#' sum("a")
#' }
sum <- function(..., na.rm = FALSE) {}

Indent the second and subsequent lines of a tag so that when scanning the documentation it’s easy to see where one tag ends and the next begins. Tags that always span multiple lines (like @examples) should start on a new line and don’t need to be indented.

### S3

S3 generics are regular functions, so document them as such. S3 classes have no formal definition, so document the constructor function. It is your choice whether or not to document S3 methods. You don’t need to document methods for simple generics like print(). If your method is more complicated, you should document it so people know what the parameters do. In base R, you can find documentation for more complex methods like predict.lm(), predict.glm(), and anova.glm().

Generally, roxygen2 will automatically figure out the generic associated with an S3 method. It should only fail if the generic and class are ambiguous. For example, is all.equal.data.frame() the equal.data.frame method for all, or the data.frame method for all.equal? If this happens, you can disambiguate with (e.g.) @method all.equal data.frame.
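A minimal sketch of that disambiguation follows. The generic to.string() and the class temp.reading are hypothetical names, chosen here only because both contain dots and so make the method name ambiguous.

```r
#' Convert an object to a string
#'
#' @param x Object to convert.
#' @param ... Other arguments passed on to methods.
#' @export
to.string <- function(x, ...) UseMethod("to.string")

# The name below could be split into generic and class in several ways,
# so @method states the generic and the class explicitly.
#' @method to.string temp.reading
#' @export
to.string.temp.reading <- function(x, ...) {
  paste0(format(unclass(x)), " degrees")
}
```

Calling to.string(structure(21.5, class = "temp.reading")) dispatches to the method above.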

### S4

S4 generics are also functions, so document them as such. Document S4 classes by adding a roxygen block before setClass(). Use @slot to document the slots of the class. Here’s a simple example:

#' An S4 class to represent a bank account.
#'
#' @slot balance A length-one numeric vector
Account <- setClass("Account",
  slots = list(balance = "numeric")
)

S4 methods are a little more complicated. Unlike S3, all S4 methods must be documented. You can document them in three places:

• In the class. Most appropriate if the corresponding generic uses single dispatch and you created the class.

• In the generic. Most appropriate if the generic uses multiple dispatch and you control it.

• In its own file. Most appropriate if the method is complex, or if the other two options don’t apply.

Use either @rdname or @describeIn to control where method documentation goes. See “Documenting multiple functions in the same file” below for more details.
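As one possible illustration of documenting a method in its generic, the sketch below uses @rdname to merge a method into the generic's topic. The deposit() generic is hypothetical, and the @rdname deposit line assumes that the generic's topic is named deposit.

```r
library(methods)

#' An S4 class to represent a bank account
#'
#' @slot balance A length-one numeric vector.
Account <- setClass("Account", slots = list(balance = "numeric"))

#' Deposit money
#'
#' @param account Object to deposit into.
#' @param amount A length-one numeric vector.
#' @return The updated account.
setGeneric("deposit", function(account, amount) standardGeneric("deposit"))

# Merge the method's documentation into the generic's topic
# instead of giving it a file of its own.
#' @rdname deposit
setMethod("deposit", "Account", function(account, amount) {
  account@balance <- account@balance + amount
  account
})
```

Because the method has the same arguments as the generic, it does not need @param tags of its own.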

### Datasets

Datasets are usually stored as .rdata files in data/ and not as regular R objects in the package. This means you need to document them slightly differently: instead of documenting the data directly, you quote the dataset’s name.

There are two additional tags that are useful for documenting datasets:

• @format, which gives an overview of the structure of the dataset.

• @source, which says where you got the data from, often a \url{}.

To show how everything fits together, the example below is an excerpt from the roxygen block used to document the diamonds dataset in ggplot2.

#' Prices of 50,000 round cut diamonds.
#'
#' A dataset containing the prices and other attributes of almost 54,000
#' diamonds. The variables are as follows:
#'
#' * price: price in US dollars (\$326--\$18,823)
#' * carat: weight of the diamond (0.2--5.01)
#' * ...
#'
#' @format A data frame with 53940 rows and 10 variables
#' @source <http://www.diamondse.info/>
"diamonds"

### Packages

As well as documenting every exported object in the package, you should also document the package itself. Relatively few packages do this, but it’s extremely useful because instead of just listing functions like help(package = pkgname) it organises them and shows the user where to get started.

Package documentation should describe the overall purpose of the package and point out the most important functions. The title and description are automatically inherited from the DESCRIPTION, so generally you should only need to supply additional details in @details.

Package documentation should be placed in pkgname.R. Here’s an example:

#' @details
#' The only function you're likely to need from roxygen2 is [roxygenize()].
#' Otherwise refer to the vignettes to see how to format the documentation.
#' @keywords internal
"_PACKAGE"

I recommend using @keywords internal for package documentation.

Some notes:

• Like for datasets, there isn’t an object that we can document directly. Use "_PACKAGE" to indicate that you are creating the package’s documentation. This will automatically add the correct aliases so that both ?pkgname and package?pkgname will find the package help. This also works if there’s already a function called pkgname().

• Use @references to point to published material about the package that users might find helpful.

Package documentation is a good place to list all options() that a package understands and to document their behaviour. Put them in a section called “Package options”, as described below.
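A sketch of what such a section might look like is given here. The package name mypkg, the option mypkg.verbose, and the helper mypkg_verbose() are all hypothetical.

```r
#' @details
#' Start with the functions listed in the package index.
#'
#' @section Package options:
#' mypkg.verbose: a logical scalar. If TRUE, functions print progress
#' messages. Defaults to FALSE.
#' @keywords internal
"_PACKAGE"

# Hypothetical helper that reads the option documented above,
# falling back to FALSE when the option is unset.
mypkg_verbose <- function() {
  isTRUE(getOption("mypkg.verbose", default = FALSE))
}
```

Keeping the default in one helper means the documented default and the behaviour are less likely to drift apart.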

## Sections

You can add arbitrary sections to the documentation for any object with the @section tag. This is a useful way of breaking a long details section into multiple chunks with useful headings. Section titles should be in sentence case and must be followed by a colon. Titles may only take one line.

#' @section Warning:
#' Do not operate heavy machinery within 8 hours of using this function.

To add a subsection, you must use the Rd \subsection{} command, as follows:

#' @section Warning:
#' You must not call this function unless ...
#'
#' \subsection{Exceptions}{
#'    Apart from the following special cases...
#' }

Sections with identical titles will be merged. This is especially useful in conjunction with the @rdname tag:

#' Basic arithmetic
#'
#' @param x,y numeric vectors.
#' @section Neutral elements:
#'   Addition: 0.
add <- function(x, y) x + y

#' @rdname add
#' @section Neutral elements:
#'   Multiplication: 1.
times <- function(x, y) x * y

For very long documentation files, you might consider using a vignette instead.

## Do repeat yourself

There is a tension between the DRY (do not repeat yourself) principle of programming and the need for documentation to be self-contained. It’s frustrating to have to navigate through multiple help files in order to pull together all the pieces you need. Roxygen2 provides several ways to avoid repeating yourself in code documentation, while assembling information from multiple places in one documentation file:

• Cross-link documentation files with @seealso and @family.
• Reuse parameter documentation with @inherit, @inheritParams, and @inheritSection.
• Document multiple functions in the same place with @describeIn or @rdname.
• Run arbitrary R code with @eval.
• Create reusable templates with @template and @templateVar.

### Cross-references

There are two tags that make it easier for people to navigate around your documentation: @seealso and @family.

@seealso allows you to point to other useful resources, either on the web, <http://www.r-project.org>, or in other documentation with [functionname()].

If you have a family of related functions, you can use the @family <family> tag to automatically add appropriate lists and interlinks to the @seealso section. Because it will appear as “Other <family>:”, the @family name should be plural (i.e., “model building helpers” not “model building helper”). You can make a function a member of multiple families by repeating the @family tag for each additional family. These will then get separate headings in the seealso section.

For sum, these components might look like:

#' @family aggregate functions
#' @seealso [prod()] for products, [cumsum()] for cumulative sums, and
#'   [colSums()]/[rowSums()] marginal sums over high-dimensional arrays.

### Inheriting documentation from other topics

You can inherit documentation from other functions in a few ways:

• @inherit source_function will inherit parameters, return, references, description, details, sections, and seealso from source_function().

• @inherit source_function return details will inherit selected components from source_function().

• @inheritParams source_function inherits just the parameter documentation from source_function().

• @inheritSection source_function Section title will inherit the single section called “Section title” from source_function().

All of these work recursively so you can inherit documentation from a function that has inherited it from elsewhere.

You can also inherit documentation from functions provided by another package by using pkg::source_function.
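For instance, a minimal sketch of @inheritParams might look like the following. The function add_scaled() and its argument k are hypothetical additions for illustration.

```r
#' Add together two numbers
#'
#' @param x A number.
#' @param y A number.
#' @return The sum of x and y.
add <- function(x, y) x + y

#' Add together two numbers, then scale the result
#'
#' @inheritParams add
#' @param k A number to multiply the sum by.
#' @return k times the sum of x and y.
add_scaled <- function(x, y, k = 1) k * add(x, y)
```

Here x and y are documented once, in add(), and only the new argument k needs its own @param tag.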

### Documenting multiple functions in the same file

You can document multiple functions in the same file by using either the @rdname or the @describeIn tag. It’s a technique best used with care: documenting too many functions in one place leads to confusion. Use it when all functions have the same (or very similar) arguments.

@describeIn is designed for the most common cases:

• documenting methods in a generic
• documenting methods in a class
• documenting functions with the same (or similar) arguments

It generates a new section, named either “Methods (by class)”, “Methods (by generic)” or “Functions”. The section contains a bulleted list describing each function, labelled so that you know what function or method it’s talking about. Here’s an example, documenting an imaginary new generic:

#' Foo bar generic
#'
#' @param x Object to foo.
foobar <- function(x) UseMethod("foobar")

#' @describeIn foobar Difference between the mean and the median
foobar.numeric <- function(x) abs(mean(x) - median(x))

#' @describeIn foobar First and last values pasted together in a string.
foobar.character <- function(x) paste0(x[1], "-", x[length(x)])

An alternative to @describeIn is @rdname. It overrides the default file name generated by roxygen and merges documentation for multiple objects into one file. This gives you complete freedom to combine documentation however you see fit. There are two ways to use @rdname. You can add documentation to an existing function:

#' Basic arithmetic
#'
#' @param x,y numeric vectors.
add <- function(x, y) x + y

#' @rdname add
times <- function(x, y) x * y

Or, you can create a dummy documentation file by documenting NULL and setting an informative @name.

#' Basic arithmetic
#'
#' @param x,y numeric vectors.
#' @name arith
NULL

#' @rdname arith
add <- function(x, y) x + y

#' @rdname arith
times <- function(x, y) x * y

### Evaluating arbitrary code

A new and powerful technique is the @eval tag. It evaluates code and treats the result as if it were literal roxygen tags. This makes it possible to eliminate duplication by writing functions.
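A small sketch of the idea follows, assuming a helper that returns one roxygen tag per element of a character vector. The names my_params() and foo() are illustrative only.

```r
# Hypothetical helper: returns roxygen tags as a character vector,
# one element per tag.
my_params <- function() {
  c(
    "@param x An integer vector.",
    "@param y A character vector."
  )
}

#' A title
#'
#' @eval my_params()
foo <- function(x, y) {
  invisible(NULL)
}
```

Any other function whose arguments have the same meaning can reuse my_params() rather than repeating the @param text.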

### Roxygen templates

Roxygen templates are R files that contain only roxygen comments and that live in the man-roxygen directory. Use @template file-name (without extension) to insert the contents of a template into the current documentation.

You can make templates more flexible by using template variables defined with @templateVar name value. Template files are run with brew, so you can retrieve values (or execute any other arbitrary R code) with <%= name %>.

Note that templates are parsed a little differently to regular blocks, so you’ll need to explicitly set the title, description and details with @title, @description and @details.
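As a hedged sketch, a template and a block that uses it might look like this. The file name man-roxygen/numeric-arg.R, the template variable arg, and the function double_it() are all hypothetical. The two parts would live in separate files and are shown together only for brevity.

```r
# --- Contents of man-roxygen/numeric-arg.R (hypothetical template) ---
#' @param <%= arg %> A numeric vector.

# --- Contents of a file in R/ that uses the template ---
#' Double a numeric vector
#'
#' @templateVar arg x
#' @template numeric-arg
double_it <- function(x) 2 * x
```

The template is written once and reused with a different @templateVar value wherever an argument has the same description.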

## Other tags

### Title, description, details

You can also use explicit @title, @description, and @details tags. This is unnecessary unless you want to have a multi-paragraph description.

### Keywords

Use @keywords keyword1 keyword2 ... to add standardised keywords. Keywords are optional, but if present, must be taken from the predefined list replicated in the keywords vignette.

Keywords are not very useful, except for @keywords internal. Using the internal keyword removes the function from the documentation index and is useful for functions aimed primarily at other developers, not typically users of the package.
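For example, an internal helper might be documented along these lines. The function is_scalar_number() is a hypothetical helper used only for illustration.

```r
#' Check that an input is a single number
#'
#' @param x Object to check.
#' @return TRUE if x is a length-one numeric vector, otherwise FALSE.
#' @keywords internal
is_scalar_number <- function(x) {
  is.numeric(x) && length(x) == 1
}
```

The help page still exists for developers who look it up by name; it is only left out of the documentation index.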

### Indexing

Two other tags make it easier for the user to find documentation:

• Aliases form the index that ? searches. Use @aliases space separated aliases to add additional aliases.

• @concept adds extra keywords that will be found with help.search().
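A brief sketch of both tags in use is shown below. The alias plus and the concept text are made-up values for illustration.

```r
#' Add together two numbers
#'
#' @param x,y Numeric vectors.
#' @aliases plus
#' @concept arithmetic helpers
add <- function(x, y) x + y
```

With this block, ?plus would lead to the same help page as ?add, and the concept text gives help.search() something extra to match.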

### Back references

The original source location is added as a comment to the second line of each generated .Rd file in the following form:

% Please edit documentation in ...

roxygen2 tries to capture all locations from which the documentation is assembled. For code that generates R code with Roxygen comments (e.g., the Rcpp package), the @backref tag is provided. This allows specifying the “true” source of the documentation, and will substitute the default list of source files.

Use one tag per source file:

#' @backref src/file.cpp
#' @backref src/file.h