Implement a unique function returning only the unique values in a vector. #940

@loiseaujc

Motivation

Recently, I ran into the problem of extracting the unique values from a vector (of any integer, real, or complex type, or possibly even character). Consider for instance the vector x = [1, 2, 3, 3, 4]. What I need is a function taking x as input and returning the vector y = [1, 2, 3, 4] as output. The interface for a real-valued vector could be as simple as

pure function unique(x, sorted) result(y)
     real(dp), intent(in) :: x(:)
     !! Array whose unique values need to be extracted.
     logical(lk), optional, intent(in) :: sorted
     !! Whether the output vector needs to be sorted or not (default .false. ?)
     real(dp), allocatable :: y(:)
     !! Vector containing only the unique values from x.
end function

The output vector could be sorted or not, depending on the user's choice. I know that there is no Fortran intrinsic function for that purpose, but I am not sure whether something like that is already available in stdlib. If it is, could anyone point me to the correct function?
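To make the proposed interface concrete, here is a minimal sketch of one possible implementation, not a proposal for the final stdlib code. It makes several assumptions that are not settled in this issue: dp is taken from iso_fortran_env (real64) and a default logical replaces logical(lk) so that the example is self-contained, values are compared with exact equality (no tolerance), the unsorted output keeps the order of first occurrence, and the module name unique_sketch is purely illustrative. The O(n**2) scan and the insertion sort are chosen for brevity, not performance.

```fortran
module unique_sketch
   ! Hypothetical module name; dp is an assumption (real64 from iso_fortran_env).
   use, intrinsic :: iso_fortran_env, only: dp => real64
   implicit none
   private
   public :: dp, unique

contains

   pure function unique(x, sorted) result(y)
      real(dp), intent(in) :: x(:)
      !! Array whose unique values need to be extracted.
      logical, optional, intent(in) :: sorted
      !! Whether to sort the output (assumed default: .false.).
      real(dp), allocatable :: y(:)
      !! Vector containing only the unique values from x.

      real(dp) :: buffer(size(x)), key
      logical :: do_sort
      integer :: i, j, n

      do_sort = .false.
      if (present(sorted)) do_sort = sorted

      ! Keep x(i) only if no equal value has been stored yet.
      ! Exact equality is used, so every NaN is kept as a distinct value.
      n = 0
      do i = 1, size(x)
         if (.not. any(buffer(1:n) == x(i))) then
            n = n + 1
            buffer(n) = x(i)
         end if
      end do

      ! Optional ascending insertion sort of the n retained values.
      if (do_sort) then
         do i = 2, n
            key = buffer(i)
            j = i - 1
            do while (j >= 1)
               if (buffer(j) <= key) exit
               buffer(j + 1) = buffer(j)
               j = j - 1
            end do
            buffer(j + 1) = key
         end do
      end if

      y = buffer(1:n)
   end function unique

end module unique_sketch

program demo
   use unique_sketch, only: dp, unique
   implicit none
   real(dp) :: x(5)

   x = [3.0_dp, 1.0_dp, 3.0_dp, 2.0_dp, 1.0_dp]
   print *, unique(x)                  ! order of first occurrence: 3, 1, 2
   print *, unique(x, sorted=.true.)   ! ascending order: 1, 2, 3
end program demo
```

Whether real values should be compared exactly or within a tolerance, and whether the unsorted result should guarantee first-occurrence order, are design questions this sketch leaves open.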

Prior Art

  • MATLAB provides the unique function.
  • Python has the built-in set type, which can be constructed from a list and keeps only its unique elements (without preserving their order).
  • NumPy has np.unique, which returns the sorted unique elements of an array.
  • @jacobwilliams provides an integer-based implementation on his blog.

Additional Information

Both the MATLAB and NumPy implementations cover a relatively large set of cases (1D arrays, multidimensional arrays, different types, etc.) and return values (the unique elements, the corresponding indices, the indices needed to reconstruct the original array from this unique set, etc.).

I don't know whether all of these cases need to be covered, at least as a starting point. I would recommend starting with the simplest one (i.e. a vector as input and a vector containing its unique elements as output), as this is probably the most common situation in which a unique function is needed. That would include integer, real, complex and character 1D arrays.
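As an illustration of how several types could share the single name unique, the sketch below uses a generic interface with two specific procedures, one for default integers and one for character arrays. All names here (unique_generic_sketch, unique_int, unique_char) are hypothetical, only the unsorted case is shown, and the real and complex variants are omitted for brevity. Note that Fortran defines equality but no ordering for complex values, so a sorted option for complex input would need an explicit convention.

```fortran
module unique_generic_sketch
   ! Hypothetical module and procedure names, for illustration only.
   implicit none
   private
   public :: unique

   ! One generic name resolving to a type-specific procedure.
   interface unique
      module procedure unique_int
      module procedure unique_char
   end interface unique

contains

   pure function unique_int(x) result(y)
      integer, intent(in) :: x(:)
      integer, allocatable :: y(:)
      integer :: buffer(size(x)), i, n

      ! Keep each value the first time it is seen.
      n = 0
      do i = 1, size(x)
         if (.not. any(buffer(1:n) == x(i))) then
            n = n + 1
            buffer(n) = x(i)
         end if
      end do
      y = buffer(1:n)
   end function unique_int

   pure function unique_char(x) result(y)
      character(len=*), intent(in) :: x(:)
      character(len=len(x)), allocatable :: y(:)
      character(len=len(x)) :: buffer(size(x))
      integer :: i, n

      ! Same scan as above; character comparison ignores trailing blanks.
      n = 0
      do i = 1, size(x)
         if (.not. any(buffer(1:n) == x(i))) then
            n = n + 1
            buffer(n) = x(i)
         end if
      end do
      y = buffer(1:n)
   end function unique_char

end module unique_generic_sketch

program demo_generic
   use unique_generic_sketch, only: unique
   implicit none

   print *, unique([1, 2, 3, 3, 4])        ! integer version is selected
   print *, unique(["ab", "cd", "ab"])     ! character version is selected
end program demo_generic
```

The two bodies differ only in their declarations, which suggests that a library implementation would probably generate them from a single template rather than write each by hand.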

Labels

  • idea (proposition of an idea and opening an issue to discuss it)
  • topic: algorithms (searching and sorting, merging, ...)
  • topic: container (abstract data structures and containers)
