Compares 2 dataframes and outputs any differences.
diffdf(base, compare, keys = NULL, suppress_warnings = FALSE, strict_numeric = TRUE, strict_factor = TRUE, file = NULL, tolerance = sqrt(.Machine$double.eps), scale = NULL)
base | input dataframe |
---|---|
compare | comparison dataframe |
keys | vector of variables (as strings) that defines a unique row in the base and compare dataframes |
suppress_warnings | Do you want to suppress warnings? (logical) |
strict_numeric | Flag for strict numeric to numeric comparisons (default = TRUE). If False diffdf will cast integer to double where required for comparisons. Note that variables specified in the keys will never be casted. |
strict_factor | Flag for strict factor to character comparisons (default = TRUE). If False diffdf will cast factors to characters where required for comparisons. Note that variables specified in the keys will never be casted. |
file | Location and name of a text file to output the results to. Setting to NULL will cause no file to be produced. |
tolerance | Set tolerance for numeric comparisons. Note that comparisons fail if (x-y)/scale > tolerance. |
scale | Set scale for numeric comparisons. Note that comparisons fail if (x-y)/scale > tolerance. Setting as NULL is a slightly more efficient version of scale = 1. |
#> Warning: ‘-’ not meaningful for factorsx[1,2] <- 5 COMPARE <- diffdf( iris, x)#> Warning: #> There are rows in BASE that are not in COMPARE !! #> Not all Values Compared Equalprint( COMPARE )#> Differences found between the objects! #> #> A summary is given below. #> #> There are rows in BASE that are not in COMPARE !! #> First 10 of 149 rows are shown in table below #> #> =============== #> ..ROWNUMBER.. #> --------------- #> 2 #> 3 #> 4 #> 5 #> 6 #> 7 #> 8 #> 9 #> 10 #> 11 #> --------------- #> #> Not all Values Compared Equal #> All rows are shown in table below #> #> ================================= #> Variable No of Differences #> --------------------------------- #> Sepal.Length 1 #> Sepal.Width 1 #> Petal.Length 1 #> Petal.Width 1 #> Species 1 #> --------------------------------- #> #> #> All rows are shown in table below #> #> ============================================ #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> -------------------------------------------- #> Sepal.Length 1 5.1 <NA> #> -------------------------------------------- #> #> #> All rows are shown in table below #> #> =========================================== #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> ------------------------------------------- #> Sepal.Width 1 3.5 5 #> ------------------------------------------- #> #> #> All rows are shown in table below #> #> ============================================ #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> -------------------------------------------- #> Petal.Length 1 1.4 <NA> #> -------------------------------------------- #> #> #> All rows are shown in table below #> #> =========================================== #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> ------------------------------------------- #> Petal.Width 1 0.2 <NA> #> ------------------------------------------- #> #> #> All rows are shown in table below #> #> ========================================== #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> ------------------------------------------ #> Species 1 setosa <NA> #> ------------------------------------------ #>#> Differences found between the objects! #> #> A summary is given below. #> #> There are rows in BASE that are not in COMPARE !! #> First 10 of 149 rows are shown in table below #> #> =============== #> ..ROWNUMBER.. #> --------------- #> 2 #> 3 #> 4 #> 5 #> 6 #> 7 #> 8 #> 9 #> 10 #> 11 #> --------------- #> #> Not all Values Compared Equal #> All rows are shown in table below #> #> ================================= #> Variable No of Differences #> --------------------------------- #> Sepal.Length 1 #> Sepal.Width 1 #> Petal.Length 1 #> Petal.Width 1 #> Species 1 #> --------------------------------- #> #> #> All rows are shown in table below #> #> ============================================ #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> -------------------------------------------- #> Sepal.Length 1 5.1 <NA> #> -------------------------------------------- #> #> #> All rows are shown in table below #> #> =========================================== #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> ------------------------------------------- #> Sepal.Width 1 3.5 5 #> ------------------------------------------- #> #> #> All rows are shown in table below #> #> ============================================ #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> -------------------------------------------- #> Petal.Length 1 1.4 <NA> #> -------------------------------------------- #> #> #> All rows are shown in table below #> #> =========================================== #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> ------------------------------------------- #> Petal.Width 1 0.2 <NA> #> ------------------------------------------- #> #> #> All rows are shown in table below #> #> ========================================== #> VARIABLE ..ROWNUMBER.. BASE COMPARE #> ------------------------------------------ #> Species 1 setosa <NA> #> ------------------------------------------ #>#### Sample data frames DF1 <- data.frame( id = c(1,2,3,4,5,6), v1 = letters[1:6], v2 = c(NA , NA , 1 , 2 , 3 , NA) ) DF2 <- data.frame( id = c(1,2,3,4,5,7), v1 = letters[1:6], v2 = c(NA , NA , 1 , 2 , NA , NA), v3 = c(NA , NA , 1 , 2 , NA , 4) ) diffdf(DF1 , DF1 , keys = "id")#> No issues were found!# We can control matching with scale/location for example: DF1 <- data.frame( id = c(1,2,3,4,5,6), v1 = letters[1:6], v2 = c(1,2,3,4,5,6) ) DF2 <- data.frame( id = c(1,2,3,4,5,6), v1 = letters[1:6], v2 = c(1.1,2,3,4,5,6) ) diffdf(DF1 , DF2 , keys = "id")#> Warning: #> Not all Values Compared Equal#> Differences found between the objects! #> #> A summary is given below. #> #> Not all Values Compared Equal #> All rows are shown in table below #> #> ============================= #> Variable No of Differences #> ----------------------------- #> v2 1 #> ----------------------------- #> #> #> All rows are shown in table below #> #> ============================= #> VARIABLE id BASE COMPARE #> ----------------------------- #> v2 1 1 1.1 #> ----------------------------- #>diffdf(DF1 , DF2 , keys = "id", tolerance = 0.2)#> No issues were found!diffdf(DF1 , DF2 , keys = "id", scale = 10, tolerance = 0.2)#> No issues were found!# We can use strict_factor to compare factors with characters for example: DF1 <- data.frame( id = c(1,2,3,4,5,6), v1 = letters[1:6], v2 = c(NA , NA , 1 , 2 , 3 , NA), stringsAsFactors = FALSE ) DF2 <- data.frame( id = c(1,2,3,4,5,6), v1 = letters[1:6], v2 = c(NA , NA , 1 , 2 , 3 , NA) ) diffdf(DF1 , DF2 , keys = "id", strict_factor = TRUE)#> Warning: #> There are columns in BASE and COMPARE with different modes !! #> There are columns in BASE and COMPARE with different classes !!#> Differences found between the objects! #> #> A summary is given below. #> #> There are columns in BASE and COMPARE with different modes !! #> All rows are shown in table below #> #> ================================ #> VARIABLE MODE.BASE MODE.COMP #> -------------------------------- #> v1 character numeric #> -------------------------------- #> #> There are columns in BASE and COMPARE with different classes !! #> All rows are shown in table below #> #> ================================== #> VARIABLE CLASS.BASE CLASS.COMP #> ---------------------------------- #> v1 character factor #> ---------------------------------- #>diffdf(DF1 , DF2 , keys = "id", strict_factor = FALSE)#>#> No issues were found!