This function takes a data.frame and returns a simple summary of the columns and their content as a data.frame.

column_summaries(df, dates_as_char = TRUE)

Arguments

df

a data.frame object

dates_as_char

a logical value: whether to treat date columns as character

Value

a tibble with two columns, name and column_details. The name column is the column name. The column_details column is a list with five elements:

  • min: the minimum value (including for dates and character)

  • max: the maximum value (including for dates and character)

  • class: the class of the column

  • nrows: the number of rows

  • sample_values: up to five randomly-sampled values
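
To make the shape of column_details concrete, here is a minimal base R sketch that builds a similar per-column list. The helper summarise_column_sketch() and its n_sample argument are hypothetical names introduced for illustration; this is not the package's implementation. In particular, it assumes that min and max are reported as NA for unordered factors and that values are sampled from the distinct values of the column.

```r
# Hypothetical helper for illustration only; NOT the package's implementation.
summarise_column_sketch <- function(x, dates_as_char = TRUE, n_sample = 5) {
  cls <- class(x)[1]  # record the class before any conversion
  if (dates_as_char && inherits(x, "Date")) {
    x <- as.character(x)  # assumption: dates are then compared as strings
  }
  # min()/max() are not defined for unordered factors, so report NA there.
  orderable <- !is.factor(x) || is.ordered(x)
  vals <- unique(x)
  list(
    min = if (orderable) min(x, na.rm = TRUE) else NA,
    max = if (orderable) max(x, na.rm = TRUE) else NA,
    class = cls,
    nrows = length(x),
    # Assumption: sample without replacement from the distinct values.
    sample_values = vals[sample.int(length(vals), min(n_sample, length(vals)))]
  )
}

# One summary list per column, stored as a list column next to the names.
details <- lapply(iris, summarise_column_sketch)
sketch <- data.frame(name = names(iris), stringsAsFactors = FALSE)
sketch$column_details <- unname(details)
```

A plain data.frame with a list column stands in for the tibble here so that the sketch needs no packages beyond base R.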

Examples

data(iris)
cs = column_summaries(iris)
head(cs)
#> # A tibble: 5 × 2
#>   name         column_details  
#>   <chr>        <list>          
#> 1 Sepal.Length <named list [5]>
#> 2 Sepal.Width  <named list [5]>
#> 3 Petal.Length <named list [5]>
#> 4 Petal.Width  <named list [5]>
#> 5 Species      <named list [5]>
str(cs$column_details)
#> List of 5
#>  $ :List of 5
#>   ..$ min          : num 4.3
#>   ..$ max          : num 7.9
#>   ..$ class        : chr "numeric"
#>   ..$ nrows        : int 150
#>   ..$ sample_values: num [1:6] 5.6 4.8 5.2 7.7 7.3 6.6
#>  $ :List of 5
#>   ..$ min          : num 2
#>   ..$ max          : num 4.4
#>   ..$ class        : chr "numeric"
#>   ..$ nrows        : int 150
#>   ..$ sample_values: num [1:6] 3 3.7 2.2 2.7 2.9 2.6
#>  $ :List of 5
#>   ..$ min          : num 1
#>   ..$ max          : num 6.9
#>   ..$ class        : chr "numeric"
#>   ..$ nrows        : int 150
#>   ..$ sample_values: num [1:6] 1.5 4.2 4.6 5.8 1.3 1.9
#>  $ :List of 5
#>   ..$ min          : num 0.1
#>   ..$ max          : num 2.5
#>   ..$ class        : chr "numeric"
#>   ..$ nrows        : int 150
#>   ..$ sample_values: num [1:6] 1.6 0.3 1.5 2.1 0.1 2
#>  $ :List of 5
#>   ..$ min          : logi NA
#>   ..$ max          : logi NA
#>   ..$ class        : chr "factor"
#>   ..$ nrows        : int 150
#>   ..$ sample_values: Factor w/ 3 levels "setosa","versicolor",..: 3 2 1
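
Because column_details is a list column, individual fields have to be pulled out of each element. The sketch below shows one way to do that in base R. So that it runs without the package, it uses a hand-built stand-in (cs_like, a hypothetical name) filled with values copied from the output printed above; with the real result, cs would take its place.

```r
# Stand-in for `cs`, built by hand from values printed in the example output.
cs_like <- data.frame(
  name = c("Sepal.Length", "Sepal.Width", "Species"),
  stringsAsFactors = FALSE
)
cs_like$column_details <- list(
  list(min = 4.3, max = 7.9, class = "numeric", nrows = 150L),
  list(min = 2,   max = 4.4, class = "numeric", nrows = 150L),
  list(min = NA,  max = NA,  class = "factor",  nrows = 150L)
)

# Name the list so that details can be looked up by column name.
details <- setNames(cs_like$column_details, cs_like$name)
details[["Sepal.Width"]]$max

# Flatten one field across all columns into a named character vector.
classes <- vapply(details, function(d) d$class, character(1))
```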