# Select random and unique elements from a vector


Say I have a simple vector with repeated elements:

``````
a <- c(1,1,1,2,2,3,3,3)
``````

Is there a way to randomly select one element from each group of repeated values? For example, one random draw, giving the indices of the elements to keep, would be:

``````
1,4,6 ## here I selected the first 1, the first 2 and the first 3
``````

And another:

``````
1,5,8 ## here I selected the first 1, the second 2 and the third 3
``````

I could do this with a loop over each repeated element, but I am sure there must be a faster way.

EDIT:

Ideally, the solution should also always select an element that occurs only once. For example, my vector could also be:

``````
b <- c(1,1,1,2,2,3,3,3,4) ## The number four is unique and should always be drawn
``````


Using base R's `ave`, we could do something like:

``````
unique(ave(seq_along(a), a, FUN = function(x) if(length(x) > 1) head(sample(x), 1) else x))
# 3 5 6

unique(ave(seq_along(a), a, FUN = function(x) if(length(x) > 1) head(sample(x), 1) else x))
# 3 4 7
``````

This generates an index for every element of `a`, groups the indices by the values of `a`, and then selects one random index in each group.
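To see why the outer `unique` is needed, it can help to look at what `ave` returns before it is collapsed. The sketch below assumes the vector `a` from the question; the `set.seed` call and the names `picked` and `idx` are additions for illustration only.

```r
a <- c(1, 1, 1, 2, 2, 3, 3, 3)
set.seed(42)  # assumption: fixed only so the illustration is repeatable

# ave() returns a vector as long as `a`. Each position holds the single
# index drawn for its group, so the drawn index repeats within a group.
picked <- ave(seq_along(a), a,
              FUN = function(x) if (length(x) > 1) head(sample(x), 1) else x)

length(picked) == length(a)  # TRUE

# unique() collapses the repeats to one index per distinct value of `a`.
idx <- unique(picked)
a[idx]                       # 1 2 3
```

Whatever indices are drawn, `a[idx]` contains each distinct value exactly once.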

Using the same logic with `sapply` and `split`:

``````
sapply(split(seq_along(a), a), function(x) if(length(x) > 1) head(sample(x), 1) else x)
``````

It would also work with `tapply`:

``````
tapply(seq_along(a), a, function(x) if(length(x) > 1) head(sample(x), 1) else x)
``````
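As a rough check against the edited question, the same split-and-sample logic can be wrapped in a small function and applied to `b`, where the value 4 occurs only once. The function name `pick_one_per_value` is hypothetical, and `vapply` with `sample(x, 1)` is used here as one possible variant of the calls above, not as the answer's exact code.

```r
# Hypothetical helper: returns one randomly chosen index per distinct value.
pick_one_per_value <- function(v) {
  groups <- split(seq_along(v), v)  # indices grouped by value
  idx <- vapply(groups,
                function(x) if (length(x) > 1) sample(x, 1) else x,
                integer(1))
  unname(idx)                       # drop the group names
}

b <- c(1, 1, 1, 2, 2, 3, 3, 3, 4)

# Repeat the draw 100 times; each column is one draw of four indices.
draws <- replicate(100, pick_one_per_value(b))

all(draws[4, ] == 9)       # the lone 4 sits at index 9 and is always chosen
all(b[draws[, 1]] == 1:4)  # a draw covers every distinct value once
```

Note that `sapply` and `tapply` return results named by group, which is why the sketch calls `unname` before returning the indices.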

We need to check the `length` (`if(length(x) > 1)`) because of this behaviour, documented in `?sample`:

> If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x.

Hence, when `sample()` is given a single number `n`, it samples from `1:n` rather than returning `n`, so we need to check the group's length first.
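The pitfall is easy to reproduce, and one way to sidestep it is to sample a position with `sample.int` and index into the vector. This is a minimal sketch: the seed and the helper name `safe_pick` are assumptions for illustration, although the examples in `?sample` show a similar indexing idiom.

```r
x <- 9       # a group holding the single index 9

# sample(x, 1) treats the lone number as an upper bound and draws from 1:9.
set.seed(1)  # assumption: arbitrary seed, only for repeatability
draws <- replicate(200, sample(x, 1))
range(draws) # typically includes values below 9

# Indexing with sample.int() always returns an element of x itself,
# whatever its length, so no explicit length check is needed.
safe_pick <- function(x) x[sample.int(length(x), 1)]  # hypothetical helper

safe_pick(9)        # 9
safe_pick(c(4, 5))  # either 4 or 5
```

With such a helper, the `if(length(x) > 1) ... else x` branch in the calls above could be replaced by a single call, at the cost of defining one extra function.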