Self-Organizing Lists

November 6, 2015

There are many applications where you need to keep a small number of items, a set, and check for membership in the set, and there are many algorithms for doing that: a linked list keeps the items in random order to be retrieved by linear search, an ordered array keeps the items in sequence to be retrieved by binary search, a hash table performs some kind of math on the item, various kinds of trees can store and search the items, and so on.

Today’s exercise looks at a simple algorithm that is appropriate for search in a set that isn’t too big, doesn’t change too often, and isn’t searched too many times, where the definition of “too” depends on the needs of the user. The idea is to store the items of the set in a linked list, search the list on request, and each time an item is found, move it to the front of the list. The hope is that frequently-accessed items will stay near the front of the list, so that instead of an average search that requires inspection of half the list, frequently-accessed items are found much more quickly.

Your task is to write functions that maintain a set as a self-organizing list: you should provide functions to insert and delete items from the set as well as to search within the set. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

5 Comments »

5 Responses to “Self-Organizing Lists”

Informatimago said

November 6, 2015 at 3:59 AM



#|

There is no list ADT in lisp, so it's not possible to mutate a list in
general (it's not possible to transform the symbol NIL which
represents the empty list into a CONS cell, or vice-versa,  a CONS
cell into the NIL symbol!).  We have two solutions to deal with
the mutation of list arguments:

1- introduce an ADT (eg. wrap the list in a structure),

2- use macros for the operations that can "mutate" a list into an
   empty list or vice-versa.

Since the option 1 is exemplified in the scheme solution, I'll
implement option 2.

INSERTF element list                                                Macro

    Inserts the ELEMENT into the place LIST, at the head.
    (This expands merely to a PUSH macro call).


DELETEF element list &key test key ...                              Macro

    Deletes the ELEMENT from the place LIST.
    (We use the obvious definition with all the &key arguments to the
    DELETE function)


SOL-FIND element list &key test key                                 Function

    Finds the designated ELEMENT in the LIST and returns it, while
    moving it into the first position of the LIST.  The LIST is
    mutated.  If the ELEMENT is found, a second value T is returned,
    otherwise two NIL values are returned. 
    
    Since this operation doesn't change the NULL status of the LIST
    argument, it can be implemented as a function mutating the LIST.
    The trick  is to keep the first CONS cell of the LIST in place,
    and to swap its CAR with the element found.


     list                 prev
       |                   |
       v                   v
     [a|*]-->[b|*]--> …  [p|*]-->[x|*]-->[s|*]-->

                               +-----------+
                               |           |
                               |           v
     [a|*]-+ [b|*]--> …  [p|*]-+ [x|*]-+ [s|*]--> 
           |   ^                  ^    |
           |   |                  |    |
           +---|------------------+    |
               +-----------------------+

|#


(defun sol-find (element list &key (test 'eql) (key 'identity))
  (if list
      (values (loop
                :for prev :on (cons nil list)
                :when (and (cdr prev) (funcall test element (funcall key (cadr prev))))
                  :do (unless (eq (cdr prev) list)
                        (rotatef (cddr prev) (cdr list) (cdr prev))
                        (rotatef (car list) (cadr list)))
                      (return (car list))
                :finally (return nil))
              t)
      (values nil nil)))


(defmacro insertf (list element)
  `(push ,element ,list))


;; We use define-modify-macro to define the DELETEF macro.
;; However, define-modify-macro expects that the modified place be the
;; first argument, so we have to swap the two first arguments of the
;; DELETE function:
(declaim (inline delete/swapped-arguments))
(defun delete/swapped-arguments (sequence item &rest keyword-arguments)
  (apply #'delete item sequence keyword-arguments))
(define-modify-macro deletef (item &rest remove-keywords)
  delete/swapped-arguments
  "Modify-macro for DELETE. Sets place designated by the first argument to
the result of calling DELETE with ITEM, place, and the REMOVE-KEYWORDS.")

#|

(let ((list (list 1 2 3 4 5 6)))
  (values (sol-find 16 list :test (function =) :key (lambda (x) (* x x)))
          list))
--> 4
    (4 1 2 3 5 6)

(let ((list (list 1 2 3 4 5 6)))
  (values (sol-find 6 list)
          list))
--> 6
    (6 1 2 3 4 5)

(let ((list (list 1 2 3 4 5 6)))
  (values (sol-find 1 list)
          list))
--> 1
    (1 2 3 4 5 6)


(let ((list nil))
  (insertf list 5)
  (insertf list 4)
  (insertf list 3)
  (insertf list 2)
  (insertf list 1)
  (print list)
  (sol-find 3 list)
  (print list)
  (deletef list 2)
  (print list)
  (sol-find 5 list)
  (print list))

prints:
    (1 2 3 4 5) 
    (3 1 2 4 5) 
    (3 1 4 5) 
    (5 3 1 4)

--> (5 3 1 4)

|#

mcmillhj said

November 6, 2015 at 10:06 PM

Standard ML implementation that returns a new set for every operation, so it is not very efficient but it was interesting to write with the supplied properties.

datatype 'a set = Nil | Cons of 'a * 'a set
exception EmptySet

fun merge s1 s2 = let
    fun merge' Nil s = s
      | merge' s Nil = s
      | merge' (Cons(h,Nil)) s = Cons(h, s) 
      | merge' (Cons(h,t))   s = merge' t (Cons(h, s))
in
    merge' s1 s2
end

fun reverse s = let
    fun reverse' Nil acc = acc
      | reverse' (Cons(h,t)) acc = reverse' t (Cons(h,acc))
in
    reverse' s Nil
end

fun add s e = let
    fun add' Nil e = Cons(e, Nil)
      | add' (Cons(h,Nil)) e = if h = e then s else merge (reverse s) (Cons(e,Nil))  
      | add' (Cons(h,t))   e = if h = e then s else add' t e
in
    add' s e
end

fun remove s e = let
    fun remove' Nil e acc = raise EmptySet
      | remove' (Cons(h,Nil)) e acc = if h = e then reverse acc else s
      | remove' (Cons(h,t))   e acc = if h = e then merge acc t else remove' t e (Cons(h,acc))
in
    remove' s e Nil
end

fun search s e = let
    fun search' Nil e acc = raise EmptySet
      | search' (Cons(h,Nil)) e acc = if h = e then Cons(h, (reverse acc)) else s
      | search' (Cons(h,t))   e acc = if h = e then Cons(h, (merge acc t)) else search' t e (Cons(h,acc))
in
    search' s e Nil
end

val int_set = Cons (5, Cons (6, Cons (7, Cons (8, Cons (9, Nil)))))
(* adds to back of set:
> add int_set 10;
val it = Cons (5, Cons (6, Cons (7, Cons (8, Cons (9, Cons (10, Nil)))))) : int set

operations can be chained:
> remove (add int_set 10) 10;
val it = Cons (5, Cons (6, Cons (7, Cons (8, Cons (9, Nil))))) : int set

searching moves elements to the front of the set:
> search (search int_set 9) 8;
val it = Cons (8, Cons (9, Cons (5, Cons (6, Cons (7, Nil))))) : int set

removing does not modify positions of set elements:
> remove int_set 7;
val it = Cons (5, Cons (6, Cons (8, Cons (9, Nil)))) : int set

matthew said

November 6, 2015 at 10:37 PM

Here’s a cute solution using C++ references to ensure in-place updates – since we can’t assign to reference variables it’s nice to use recursion & rely on modern compilers to turn the tail calls into a loop. It’s easier to think about if we have separate operations to remove a node from a list, and to put one in. There is something appealing about code like this, that looks functional but is, in fact, deeply imperative.

#include <iostream>

typedef int T;
struct List {
  List(T v) : value(v), next(NULL) {}
  T value;
  List *next;
};

List *insert(List *&p, List *q) {
  q->next = p; p = q;
  return p;
}

List *remove(List *&p) {
  List *t = p; p = p->next;
  return t;
}

List *lookup(List *&head, List *&p, T value) {
  if (p == NULL) return NULL;
  else if (p->value == value) return insert(head,remove(p));
  else return lookup(head,p->next,value);
}

List *lookup(List *&head, T value) {
  if (head == NULL) return NULL;
  else if (head->value == value) return head;
  else return lookup(head,head->next,value);
}

List *insert(List *&head, int n) {
  return insert(head, new List(n));
}

void print(List *p) {
  while(p) {
    std::cout << p->value << " ";
    p = p->next;
  }
  std::cout << "\n";
}

int main() {
  List *head = NULL;
  print(insert(head,3));
  print(insert(head,4));
  print(lookup(head,3));
  print(lookup(head,3));
  print(insert(head,5));
  print(lookup(head,3));
  print(lookup(head,4));
  print(lookup(head,5));
}

fisherro said

November 13, 2015 at 3:53 PM

My first attempt was an exercise in creating my own linked list using std::unique_ptr to handle the memory allocations/deallocations.

#include <memory>
#include <iostream>

template<typename T>
class List {
public:
    void add(T t)
    {
        auto node = std::make_unique<Node>();
        node->t = t;
        node->next = std::move(head);
        head = std::move(node);
    }

    bool contains(T t)
    {
        Node* parent = nullptr;
        Node* node = head.get();
        while (node) {
            if (t == node->t) {
                if (parent) {
                    //Move node to front.
                    //If List->n1->n2->n3->0; n2 is node; n1 is the parent.
                    std::unique_ptr<Node> node_ = std::move(parent->next);
                    //Now: List->n1->0; n2->n3->0
                    parent->next = std::move(node->next);
                    //Now: List->n1->n3->0; n2->0
                    node->next = std::move(head);
                    //Now: List->0; n2->n1->n3->0
                    head = std::move(node_);
                    //Now: List->n2->n1->n3->0
                }
                return true;
            }
            parent = node;
            node = node->next.get();
        }
        return false;
    }

    void dump(std::ostream& out = std::cout) const
    {
        out << "List->";
        Node* node = head.get();
        while (node) {
            out << node->t << "->";
            node = node->next.get();
        }
        out << "null\n";
    }

private:
    struct Node {
        T t;
        std::unique_ptr<Node> next;
    };

    std::unique_ptr<Node> head;
};

int main()
{
    List<int> list;
    for (int n = 5; n > 0; --n) list.add(n);
    list.dump();
    list.contains(4);
    list.dump();
}

fisherro said

November 13, 2015 at 3:58 PM

My second attempt uses std::forward_list. At first, I thought no standard algorithms would help with List::contains. (At least not without being less efficient.) Then I noticed std::adjacent_find could do the job. I don’t think I’m abusing it here.

#include <memory>
#include <iostream>
#include <forward_list>

template<typename T>
class List {
public:
    void add(T t) { list.push_front(t); }

    bool contains(T t)
    {
        if (list.front() == t) return true;
        auto i = std::adjacent_find(list.begin(), list.end(),
                [t](auto a, auto b) {
                    return b == t;
                });
        if (list.end() == i) return false;
        list.erase_after(i);
        list.push_front(t);
        return true;
    }

    void dump(std::ostream& out = std::cout) const
    {
        out << "List->";
        std::copy(list.begin(), list.end(),
                std::ostream_iterator<T>(std::cout, "->"));
        out << "null\n";
    }

private:
    std::forward_list<T> list;
};

int main()
{
    List<int> list;
    for (int n = 5; n > 0; --n) list.add(n);
    list.dump();
    list.contains(4);
    list.dump();
}

S	M	T	W	T	F	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Programming Praxis