dataset_distance calculates Euclidean nearest-neighbour distances between two point datasets and summarizes these distances using the function given by fun, by default the mean.

dataset_distance(
  d1,
  d2,
  x_name = "x",
  y_name = "y",
  fun = mean,
  method = "euclidean",
  ...
)

Arguments

d1

a data.frame with (at least) columns with names given by x_name and y_name; these contain the x and y coordinates, respectively.

d2

a data.frame with the same structure as d1, containing the second set of points.

x_name

name of column in d1 and d2 containing the x coordinates of points.

y_name

name of column in d1 and d2 containing the y coordinates of points.

fun

function to be applied to the vector of nearest-neighbour distances from each point in d1 to its nearest point in d2.

method

type of distance metric to be used; only 'euclidean' is currently supported.

...

additional arguments to fun.

Value

The value returned by fun; typically (e.g., for mean) a numeric vector of length 1.
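
The dependence of the return value on fun can be sketched without the package. The snippet below is only an illustration: nn is a made-up stand-in for the vector of nearest-neighbour distances, and summarise_sketch is a hypothetical helper that assumes the distances are summarised as fun(distances, ...), which is how the fun and ... arguments are described above.

```r
# Made-up stand-in for the vector of nearest-neighbour distances.
nn <- c(0.2, 0.5, 1.3)

# Hypothetical helper: apply fun to the distances, forwarding extra arguments.
summarise_sketch <- function(nn, fun = mean, ...) fun(nn, ...)

summarise_sketch(nn)                 # default mean: numeric vector of length 1
summarise_sketch(nn, fun = max)      # largest distance: length 1
# Extra arguments (here probs) are passed on to fun; the result has length 2.
summarise_sketch(nn, fun = quantile, probs = c(0.5, 0.9))
```

With a function such as quantile, the result is no longer of length 1, which is why the return value is described in terms of fun.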

Details

Nearest-neighbour distances are calculated for each point in d1, resulting in a vector of length nrow(d1), and fun is applied to this vector.
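
The calculation described above can be sketched in base R. This is a hypothetical re-implementation for illustration only (the name nn_distance_sketch and its brute-force search are assumptions, not the package's source); it also shows that swapping d1 and d2 generally changes the result, because distances are measured from the points in d1.

```r
# Hypothetical sketch of the documented behaviour; not the package's code.
nn_distance_sketch <- function(d1, d2, x_name = "x", y_name = "y",
                               fun = mean, ...) {
  # For each point in d1, find the Euclidean distance to its nearest point in d2.
  nn <- vapply(seq_len(nrow(d1)), function(i) {
    dx <- d2[[x_name]] - d1[[x_name]][i]
    dy <- d2[[y_name]] - d1[[y_name]][i]
    sqrt(min(dx^2 + dy^2))
  }, numeric(1))
  # Summarise the nrow(d1) distances with fun, forwarding extra arguments.
  fun(nn, ...)
}

a <- data.frame(x = c(0, 1), y = c(0, 0))
b <- data.frame(x = c(0, 5), y = c(3, 0))

nn_distance_sketch(a, b)  # mean of c(3, sqrt(10))
nn_distance_sketch(b, a)  # mean of c(3, 4), i.e. 3.5
```

The brute-force search compares every point in d1 with every point in d2, which is fine for small datasets; a spatial index would be a natural replacement for larger ones.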

Examples

df <- data.frame(x = rnorm(100), y = rnorm(100))
dataset_distance(df, df) # each point is its own nearest neighbour, so the mean distance is 0
#> [1] 0