Duplicate Items In An Array

June 14, 2016

Today’s exercise is in two parts, first a commonly-seen programming exercise and then a variant on it; the origin of the exercise is certainly someone’s homework, but since school is out for the year it doesn’t matter that we do the exercise today.

First, write a program that, given an array of integers in unsorted order, finds the single duplicate number in the array. For instance, given the input [1,2,3,1,4], the correct output is 4.

Second, write a program that, given an array of integers in unsorted order, finds all of the multiple duplicate numbers in the array. For instance, given the input [1,2,3,1,2,4,1], the correct output is [1,2,1].

Your task is to write the two programs that find duplicates in an array. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

7 Comments »

7 Responses to “Duplicate Items In An Array”

James Curtis-Smith said
June 14, 2016 at 9:14 AM
Pretty much 1-liners for both of these in Perl – using core perl (+List::Util)

Part 1 – find (only) first entry…
```
use feature qw(say); 
use List::Util qw(first);
say first { exists $X{$_} || !($X{$_}=1) } @ARGV;
```
Part 1 – find all entries

[sourceode lang=”perl”]
use feature qw(say);
say join q( ), grep { exists $X{$_} || !($X{$_}=1) } @ARGV;
[/sourcecode]
John Cowan said
June 15, 2016 at 12:16 AM
If the numbers are bounded and small, you can use a bit vector, which takes less actual space than a hash table.

Daniel said

June 15, 2016 at 11:31 AM

Here’s a solution in C. It uses GLib for its hash table.

#include <stdio.h>
#include <stdlib.h>
#include <glib.h>

// return code indicates if duplicate was found
int FirstDuplicate(int arr[], size_t n, int* duplicate) {
  GHashTable* hash = g_hash_table_new(g_int_hash, g_int_equal);
  for (size_t i = 0; i < n; i++) {
    if (g_hash_table_contains(hash, arr + i)) {
      *duplicate = arr[i];
      return 1;
    } else {
      g_hash_table_insert(hash, arr + i, NULL);
    }
  }
  return 0;
}

void Duplicates(int arr[],
                size_t n,
                int* duplicates,
                size_t* duplicates_n) {
  GHashTable* hash = g_hash_table_new(g_int_hash, g_int_equal);
  for (size_t i = 0; i < n; i++) {
    if (g_hash_table_contains(hash, arr + i)) {
      duplicates[(*duplicates_n)++] = arr[i];
    } else {
      g_hash_table_insert(hash, arr + i, NULL);
    }
  }
}

int main(int argc, char* argv[]) {
  int a[] = {1,2,3,1,4};
  size_t a_n = sizeof(a)/sizeof(a[0]);

  int duplicate = 0;
  if (FirstDuplicate(a, a_n, &duplicate)) {
    printf("duplicate: %d", duplicate);
  } else {
    printf("no duplicate");
  }

  printf("\n\n");

  int b[] = {1,2,3,1,2,4,1};
  size_t b_n = sizeof(b)/sizeof(b[0]);

  int* duplicates = malloc(sizeof(int) * (b_n-1));
  size_t duplicates_n = 0;

  Duplicates(b, b_n, duplicates, &duplicates_n);

  if (duplicates_n > 0) {
    printf("duplicates: [");
    for (size_t i = 0; i < duplicates_n; i++) {
      if (i > 0) {
        printf(",");
      }
      printf("%d", duplicates[i]);
    }
    printf("]");
  } else {
    puts("no duplicates");
  }

  printf("\n");

  free(duplicates);
  return 0;
}

Here’s the output:

duplicate: 1

duplicates: [1,2,1]

The problem says “the correct output is 4”. I should seemingly say 1 (there’s only one 4, but there are two 1s).

gdejohn said

June 15, 2016 at 8:47 PM

import static java.util.Arrays.stream;

import java.util.Arrays;
import java.util.HashSet;
import java.util.OptionalInt;
import java.util.function.IntPredicate;

public class Duplicates {
    public static OptionalInt firstDuplicate(int... ints) {
        return stream(ints)
              .filter(((IntPredicate) new HashSet<>()::add).negate())
              .findFirst();
    }

    public static int[] duplicates(int... ints) {
        return stream(ints)
              .filter(((IntPredicate) new HashSet<>()::add).negate())
              .toArray();
    }

    public static void main(String[] args) {
        System.out.println(firstDuplicate(1, 2, 3, 1, 4).getAsInt());         // 1
        System.out.println(Arrays.toString(duplicates(1, 2, 3, 1, 2, 4, 1))); // [1, 2, 1]
    }
}

Globules said

June 16, 2016 at 4:32 AM

A Haskell version, favouring conciseness over efficiency.

import Data.List

-- Return Just x if and only if x is the only duplicate element.
singleDup :: Ord a => [a] -> Maybe a
singleDup xs = case filter ((>1) . length) . group . sort $ xs of
                 [x:_] -> Just x
                 _     -> Nothing

-- Return the duplicate elements of the list in the order they appear.  The
-- first occurrence is omitted.
dups :: Eq a => [a] -> [a]
dups = concat . snd . mapAccumL step []
  where step ds x = if x `elem` ds then (ds, [x]) else (x:ds, [])

main :: IO ()
main = do
  print $ singleDup [1, 2, 3, 1, 4 :: Int]
  print $ dups [1, 2, 3, 1, 2, 4, 1 :: Int]

$ ./dups 
Just 1
[1,2,1]

ijp said
June 18, 2016 at 5:29 AM
Incidentally, the first part has been an exercise on here before (albeit with an upper bound on the array elements) way back in 2011. https://programmingpraxis.com/2011/09/23/array-duplicates/
Mayank said
June 25, 2016 at 4:15 AM
A good problem

S	M	T	W	T	F	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30

Programming Praxis