R/column_summaries.R
column_summaries.Rd
This function takes a data.frame and returns a simple summary of the columns and their content as a data.frame.
column_summaries(df, dates_as_char = TRUE)
df: a data.frame object
dates_as_char: logical; whether to treat date columns as character
a tibble with two columns, name and column_details.
The name column holds the column name. The column_details column is a list column; each entry is a named list with five elements:
min: the minimum value (including for dates and character)
max: the maximum value (including for dates and character)
class: the class of the column
nrows: the number of rows in the column
sample_values: up to five randomly sampled values
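To make the shape of the return value concrete, here is a minimal base-R sketch that builds a comparable structure. It is not the package's implementation: the name sketch_column_summaries is hypothetical, it returns a plain data.frame rather than a tibble, and the handling of dates (converted to character before min and max are taken) and of factors (min and max reported as NA) are assumptions.

```r
# Hypothetical stand-in for illustration only; not the package's own code.
sketch_column_summaries <- function(df, dates_as_char = TRUE) {
  stopifnot(is.data.frame(df))

  details <- lapply(df, function(col) {
    # Record the class before any conversion (assumption).
    col_class <- class(col)[1]

    # Assumption: with dates_as_char = TRUE, dates are summarised as strings.
    if (dates_as_char && inherits(col, c("Date", "POSIXt"))) {
      col <- as.character(col)
    }

    # Assumption: factors (and all-NA columns) get NA for min and max.
    has_range <- !is.factor(col) && any(!is.na(col))

    list(
      min = if (has_range) min(col, na.rm = TRUE) else NA,
      max = if (has_range) max(col, na.rm = TRUE) else NA,
      class = col_class,
      nrows = length(col),
      # Draw at most five values without replacement.
      sample_values = col[sample.int(length(col), min(5L, length(col)))]
    )
  })

  # One row per input column; the details go into a list column.
  out <- data.frame(name = names(df), stringsAsFactors = FALSE)
  out$column_details <- unname(details)
  out
}

sketch <- sketch_column_summaries(iris)
str(sketch$column_details[[1]])
```

Because sample_values is drawn at random, call set.seed() first if the sampled values need to be reproducible.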
data(iris)
cs <- column_summaries(iris)
head(cs)
#> # A tibble: 5 × 2
#> name column_details
#> <chr> <list>
#> 1 Sepal.Length <named list [5]>
#> 2 Sepal.Width <named list [5]>
#> 3 Petal.Length <named list [5]>
#> 4 Petal.Width <named list [5]>
#> 5 Species <named list [5]>
str(cs$column_details)
#> List of 5
#> $ :List of 5
#> ..$ min : num 4.3
#> ..$ max : num 7.9
#> ..$ class : chr "numeric"
#> ..$ nrows : int 150
#> ..$ sample_values: num [1:6] 5.6 4.8 5.2 7.7 7.3 6.6
#> $ :List of 5
#> ..$ min : num 2
#> ..$ max : num 4.4
#> ..$ class : chr "numeric"
#> ..$ nrows : int 150
#> ..$ sample_values: num [1:6] 3 3.7 2.2 2.7 2.9 2.6
#> $ :List of 5
#> ..$ min : num 1
#> ..$ max : num 6.9
#> ..$ class : chr "numeric"
#> ..$ nrows : int 150
#> ..$ sample_values: num [1:6] 1.5 4.2 4.6 5.8 1.3 1.9
#> $ :List of 5
#> ..$ min : num 0.1
#> ..$ max : num 2.5
#> ..$ class : chr "numeric"
#> ..$ nrows : int 150
#> ..$ sample_values: num [1:6] 1.6 0.3 1.5 2.1 0.1 2
#> $ :List of 5
#> ..$ min : logi NA
#> ..$ max : logi NA
#> ..$ class : chr "factor"
#> ..$ nrows : int 150
#> ..$ sample_values: Factor w/ 3 levels "setosa","versicolor",..: 3 2 1
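Since column_details is a list column, individual fields have to be pulled out element by element. The sketch below shows one way to flatten the scalar fields into ordinary columns. To stay self-contained it uses a hand-built stand-in (cs_like) with the same shape as the output above; get_field is a hypothetical helper, not part of the package.

```r
# Hand-built stand-in shaped like the documented return value (assumption:
# a real result from column_summaries() could be used in its place).
cs_like <- data.frame(name = c("Sepal.Length", "Species"),
                      stringsAsFactors = FALSE)
cs_like$column_details <- list(
  list(min = 4.3, max = 7.9, class = "numeric", nrows = 150L,
       sample_values = c(5.6, 4.8)),
  list(min = NA, max = NA, class = "factor", nrows = 150L,
       sample_values = factor("setosa"))
)

# Hypothetical helper: extract one scalar field from every list element.
# `template` tells vapply() the expected type and length of each value.
get_field <- function(details, field, template) {
  vapply(details, function(x) x[[field]], template)
}

flat <- data.frame(
  name  = cs_like$name,
  class = get_field(cs_like$column_details, "class", character(1)),
  nrows = get_field(cs_like$column_details, "nrows", integer(1)),
  stringsAsFactors = FALSE
)
print(flat)
```

vapply() is used instead of sapply() so that a missing or oddly typed field raises an error rather than silently changing the result type.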