Getting started

Colin Fay



Compute string distance the tidy way. Built on top of the ‘stringdist’ package.

Install tidystringdist

The development version is hosted on GitHub.

The stable version is available from CRAN.
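
As a sketch, the two installation routes could look like the following. The GitHub repository path and the use of the remotes package are assumptions, not taken from this page.

```r
# Stable release from CRAN
install.packages("tidystringdist")

# Development version from GitHub.
# Assumption: the repository lives at ColinFay/tidystringdist
# and the remotes package is already installed.
remotes::install_github("ColinFay/tidystringdist")
```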

tidystringdist basic workflow


First, you need to create a tibble with the combinations of words you want to compare. You can do this with the tidy_comb and tidy_comb_all functions. The first takes a base word and combines it with each element of a list or a column of a data.frame; the second builds all the possible pairs from a list or a column.

If you already have a data.frame with two columns containing the strings to compare, you can skip this part.
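
A minimal sketch of both helpers is shown here. The word vector is made up for illustration, and the argument names data and base are assumed from the package documentation.

```r
library(tidystringdist)

# Hypothetical input: a small character vector of words to compare
words <- c("Pairs", "Parsi", "London")

# Combine one base word with every element of the vector
# (argument names `data` and `base` are assumptions)
base_pairs <- tidy_comb(data = words, base = "Paris")

# Build every unordered pair of elements from the vector
all_pairs <- tidy_comb_all(words)

print(base_pairs)
print(all_pairs)
```

Both calls should return a two-column tibble, which is the shape the distance function expects.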


Once you have this data.frame, you can use tidy_stringdist to compute the string distances. This function takes a data.frame, the two columns containing the strings, and a stringdist method.

Note that if you’ve used the tidy_comb function to create your data.frame, you won’t need to set the column names.

By default, the call computes every method. To compute only a specific method, use the method argument.
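
The calls described above might look like the sketch below. The input data is invented, and the argument names v1, v2 and method are assumed to match the package's exported function.

```r
library(tidystringdist)

# Pairs built with tidy_comb_all, so the default column names apply
pairs <- tidy_comb_all(c("Paris", "Parsi", "London"))

# Default call: one distance column per available method
all_methods <- tidy_stringdist(pairs)

# Restrict the output to a single method, here Jaro-Winkler
jw_only <- tidy_stringdist(pairs, method = "jw")

# Hypothetical pre-existing data.frame with its own column names,
# passed unquoted through v1 and v2 (an assumption about the interface)
custom <- data.frame(
  a = c("kitten", "flaw"),
  b = c("sitting", "lawn"),
  stringsAsFactors = FALSE
)
custom_dist <- tidy_stringdist(custom, v1 = a, v2 = b, method = "lv")

print(custom_dist)
```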

Tidyverse workflow

The goal is to provide a convenient interface to work with other tools from the tidyverse.
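
As an illustration of that goal, the sketch below chains the package's functions with dplyr verbs. The word list is hypothetical, and the choice of the Jaro-Winkler method is arbitrary.

```r
library(tidystringdist)
library(dplyr)

# Build all pairs, compute one distance, then sort with dplyr
closest <- c("Paris", "Parsi", "London") %>%
  tidy_comb_all() %>%
  tidy_stringdist(method = "jw") %>%
  arrange(jw)

print(closest)
```

Because the result is an ordinary tibble, it can be passed on to filter, mutate, or a plotting function without conversion.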


Questions and feedback are welcome!