Man Or Boy

September 16, 2016

During the development of Algol 60, Donald Knuth devised a nasty test of recursion:

There are quite a few ALGOL60 translators in existence which have been designed to handle recursion and non-local references properly, and I thought perhaps a little test-program may be of value. Hence I have written the following simple routine, which may separate the man-compilers from the boy-compilers:

begin 
 real procedure A(k, x1, x2, x3, x4, x5);
 value k; integer k;
 begin
  real procedure B;
  begin 
   k := k - 1; 
   B := A := A(k, B, x1, x2, x3, x4)
  end;
  if k ≤ then A : = x4 + x5 else B
 end
 outreal(A(10, 1, -1, -1, 1, 0))
end

This uses nothing known to be tricky or ambiguous. My question is: What should the answer be? Unfortunately, I don’t have to a man-compiler myself, and so I was forced to try hand calculations. My conjecture (probably wrong) is that the answer will be:

73 - 119 - 177 + 102 = - 121

I’d be very glad to know the right answer.

Your task is to write a program that computes the right answer. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Advertisement

Pages: 1 2

8 Responses to “Man Or Boy”

  1. Milbrae said

    Maybe you’d like to check this out: https://www.rosettacode.org/wiki/Man_or_boy_test

  2. Informatimago said
    #|
    
    It is not a test of recursion, but a test of call-by-name.
    
    You have to be careful to interpret the name of the functions in
    Algol: depending on whether they are "left-values" or "right-values",
    they will denote an assignment to a variable that will be the function
    result, or they denote a (possibly recursive) function call.
    
    Thus:
    
           B := A := A(k, B, x1, x2, x3, x4)
           ^    ^    ^    ^
           |    |    |    |
           |    |    |    +--- function call.
           |    |    +-------- function call.
           |    +------------- assignment to result variable
           +------------------ assignment to result variable
    
    
           if k ≤ then A : = x4 + x5 else B
                       ^                  ^
                       |                  |
                       |                  +---- function call
                       +----------------------- assignment to result variable
    
    All the arguments to A, but the first one which is declared to be
    by-value, are passed by-name.  This requires the creation of a thunk
    (macro byname), and the evaluation of that thunk when the value is
    eventually needed (function deref).
    
    |#
    
    
    (defun deref (x)
      (if (functionp x)
          (deref (funcall x))
          x))
    
    (defmacro byname (x)
      `(lambda () ,x))
    
    (labels ((A (k x1 x2 x3 x4 x5)
               (let ((a :uninitialized-a))
                 (labels ((b ()
                            (let ((b :uninitialized-b))
                              (decf k)
                              (setf b (setf a (a k (byname (b)) (byname x1) (byname x2) (byname x3) (byname x4))))
                              b)))
                   (if (<= k 0)
                       (setf a (+ (deref x4) (deref x5)))
                       (b)))
                 a)))
      (a 10 1 -1 -1 1 0))
    
    ;; --> -67
    
  3. namako said

    This goes up to 27…so I suppose I am a man compiler.

    ;For ml64, Windows x86-64
    ;ml64 <filename>.asm /Cp /link /ENTRY:ENTRY /SUBSYSTEM:CONSOLE kernel32.lib user32.lib /STACK:<stacksize>
    ;Got up to 27 with a /STACK:10737418240 (10GB), 13 with the defaul 1MB stack.
    
    COMMENT ~
    Some hideous ALGOL60 by Knuth:
    
    begin
    real procedure A(k, x1, x2, x3, x4, x5);
    value k; integer k;
    begin
    real procedure B;
    begin
    k := k - 1;
    B := A := A(k, B, x1, x2, x3, x4)
    end;
    if k <= 0 then A : = x4 + x5 else B
    end
    outreal(A(10, 1, -1, -1, 1, 0))
    end
    
    Algol uses "pass by name", so there's a little bit of nastiness to be had here. I'm not actually 
    sure what the second B in B := refers to.
    
    After some thinking, it looks like it refers to B in the frame when it was put in there.
    Which is a little annoying.
    
    So...I'll pass structures containing pointers to functions that take that structure -_-'.
    struct S{
    	int (*fun)(S*);
    	...
    };
    Then things should be sensible again.
    ~
    
    _TEXT SEGMENT
    
    ;int32_t A(int32_t, void *, void *, void *, void *, void *); 
    ;void* parameter: some POD with a pointer to an int32_t function taking a pointer to that POD as parameter.
    ;A's Stack plan
    ;x5				104
    ;x4				96
    ;(x3) r9		88
    ;(x2) r8		80
    ;(x1) rdx		72
    ;(k) rcx		64 
    ;return			56 ;allocate 56
    ;A				48
    ;frame for B	40
    ;pointer for B	32
    ;call space		24
    ;call space		16
    ;call space		8
    ;call space		0
    A PROC PUBLIC FRAME
    	sub rsp,56
    	.ALLOCSTACK 56
    	.ENDPROLOG
    	cmp ecx,0	;check k
    	jle sum
    	;On this branch the shadow space needs populating.
    	mov [rsp+64],rcx
    	mov [rsp+72],rdx
    	mov [rsp+80],r8
    	mov [rsp+88],r9
    	mov [rsp+40],rsp	;save the frame pointer for B
    	lea rax,B
    	mov [rsp+32],rax	;save B's address for B
    	lea rcx,[rsp+32]
    	call B				;call B on that pointer
    	mov eax,[rsp+48]	;get the return value out
    	jmp done
    sum:
    	mov rcx,[rsp+96]	
    	call QWORD PTR [rcx];call x4
    	mov [rsp+48],eax	;save return
    	mov rcx,[rsp+104]	
    	call QWORD PTR [rcx];call x5
    	add eax,[rsp+48]	;sum to return.
    done:
    	add rsp,56
    	ret
    A ENDP
    
    ;B's stack plan
    ;free			88
    ;free			80
    ;free			72
    ;(pointer)rcx	64
    ;return			56
    ;rbx			48
    ;p5				40
    ;p4				32
    ;p3				24
    ;p2				16
    ;p1				8
    ;p0				0
    B PROC PUBLIC FRAME
    	push rbx
    	.PUSHREG rbx
    	sub rsp,48
    	.ALLOCSTACK 48
    	.ENDPROLOG
    
    	mov rbx,[rcx+8]			;get the saved frame pointer for A
    	dec DWORD PTR [rbx + 64];k <- k-1 in that frame
    
    	movdqu xmm0,[rbx+88]
    	movdqa [rsp+32],xmm0	;x3 and x4
    	mov r9,[rbx+80]			;x2
    	mov r8,[rbx+72]			;x1
    	mov rdx,rcx				;B
    	mov rcx,[rbx+64]		;k
    	call A					;A(k,B,x1,x2,x3,x4)
    	mov [rbx+48],eax		;save the result (as in the original code)
    
    	add rsp,48
    	pop rbx
    	ret
    B ENDP
    _TEXT ENDS
    
    ;Business to actually do things.
    EXTERN __imp_ExitProcess:PROC
    EXTERN __imp_GetStdHandle:PROC
    EXTERN __imp_WriteFile:PROC
    EXTERN __imp_wsprintfA:PROC 
    
    CONST SEGMENT
    ALIGN 16
    pf0		dq f0
    pf1		dq f1
    pfm1	dq fm1
    
    format_string	db	"%d -> %d",0dh,0ah,0
    len_format_string = $-format_string
    end_string 		db	"Stack overflow!"
    len_end_string = $ - end_string
    CONST ENDS
    
    _BSS SEGMENT
    output_string	db	1024 dup (?)
    _BSS ENDS
    
    _TEXT SEGMENT
    
    f0:		xor eax,eax
    		ret
    f1:		mov eax,1
    		ret
    fm1:	mov eax,-1
    		ret
    
    EXTERN __imp_RtlUnwind:PROC
    EHANDLER PROC PRIVATE FRAME
    		sub rsp,8
    		.ALLOCSTACK 8
    		.ENDPROLOG
    		
    		;check what we can do with the exception
    		cmp DWORD PTR [rcx+4],0
    		jne do_nothing	;it's not something I can deal with unless it's continuable.		
    		cmp DWORD PTR [rcx],0c00000fdh 	;check for a stack overflow
    		jne do_nothing ;I can't do anything with it
    		
    		;so...a stack overflow is to be had!
    		;It could come from windows, but that would be crazy.
    		
    		mov r10,rdx	;save the exception frame
    		mov r9,0
    		mov r8,rcx
    		lea rdx,[ehandler_safe_position]
    		mov rcx,r10
    		call QWORD PTR [__imp_RtlUnwind]
    		
    		xor rax,rax
    		add rsp,8
    		ret
    		
    do_nothing:
    		mov rax,1
    		add rsp,8
    		ret
    EHANDLER ENDP 
    
    ENTRY PROC PUBLIC FRAME:EHANDLER
    		sub rsp,56
    		.ALLOCSTACK 56
    		.ENDPROLOG
    		
    		mov rcx,-11
    		call QWORD PTR [__imp_GetStdHandle]
    		mov [rsp+80],rax
    		
    		mov QWORD PTR [rsp+88],0
    next:
    		lea r11,[pf0]
    		mov [rsp+40],r11
    		lea r10,[pf1]
    		mov [rsp+32],r10
    		lea r9,[pfm1]
    		lea r8,[pfm1]
    		lea rdx,[pf1]
    		mov rcx,[rsp+88]			;here's k!
    		call A
    		
    		mov r9,rax
    		mov r8,[rsp+88]
    		lea rdx,[format_string]
    		lea rcx,[output_string]
    		call QWORD PTR [__imp_wsprintfA]
    				
    		mov QWORD PTR [rsp+32],0
    		lea r9,[rsp+64]
    		mov r8,rax
    		lea rdx,[output_string]
    		mov rcx,[rsp+80]
    		call QWORD PTR [__imp_WriteFile]
    
    		inc QWORD PTR [rsp+88]
    		jmp next
    		
    ehandler_safe_position::	
    		mov QWORD PTR [rsp+32],0
    		lea r9,[rsp+64]
    		mov r8,len_end_string
    		lea rdx,[end_string]
    		mov rcx,[rsp+80]
    		call QWORD PTR [__imp_WriteFile]
    	
    		xor rcx,rcx
    		call QWORD PTR [__imp_ExitProcess]
    		
    		add rsp,40
    		ret
    ENTRY ENDP
    _TEXT ENDS
    END
    
  4. Jussi Piitulainen said

    I modified the Algol 60 program by hand until GNU MARST would compile it: declared the xs (as integers), removed the space from inside an assignment operator, replaced the proper inequality symbol with its ASCII representation, added the 0 to compare against (huh?), and inserted a couple of semicolons. And added a “channel” where the result is printed: 1 for stdout.

    MARST processes Algol 60 according to a Modified Report, 1976, which applies a Supplement to the Revised Report. (Scheme would later follow its Revised Report, 1978, with a Revised Revised Report.)

    begin
    real procedure A(k, x1, x2, x3, x4, x5);
    value k; integer k;
    integer x1, x2, x3, x4, x5;
    begin
    real procedure B;
    begin
    k := k – 1;
    B := A := A(k, B, x1, x2, x3, x4)
    end;
    if k <= 0 then A := x4 + x5 else B
    end;
    outreal(1, A(10, 1, -1, -1, 1, 0))
    end

    MARST translates Algol 60 to C. MARST agrees with those who say that the answer should be -67.

    $ ./marst knuth.alg -o knuth.c
    $ gcc knuth.c -L .libs -lalgol -lm -o knuth
    $ LD_LIBRARY_PATH=.libs/ ./knuth
    -67 $

    MARST comes with a converter from earlier forms of the language. I haven’t tried the converter. Was that 0 really optional at some stage?

    I’m also wondering why the result should be declared real. This is clearly integer arithmetic.

  5. Jussi Piitulainen said

    I object to the eating of the indentations. Happily they weren’t all that crucial this time.

  6. kernelbob said

    This was a fun problem. Thanks!

    @namako, that was beautiful.

    Here are two solutions. The first, in Python 3, reads a lot like the original. The trickiest part was having B find the right instance of k.

    #!/usr/bin/env python3
    
    from itertools import count
    import sys
    
    def name(x):
        def v():
            return x
        return v
    
    def A(k, x1, x2, x3, x4, x5):
        k = [k[0]]
    
        def B():
            k[0] -= 1
            return A(k, B, x1, x2, x3, x4)
        if k[0] <= 0:
            return x4() + x5()
        else:
            return B()
    
    sys.setrecursionlimit(2**30)
    
    for k in count():
        print(k, A([k], name(1), name(-1), name(-1), name(1), name(0)))
    

    And here is one in C. C is really badly suited for this process, though this C programs runs faster than Our Gracious Host’s solution in Ikarus Scheme. Most Scheme compilers have to search the environment for the nonlocal variable k, I think, while the C program increments k in O(1) time.

    I tried to follow the spirit of Algol 60, using thunks and explicit stack frames to evaluate parameters with call-by-name semantics. If I limit the process’s stack size to 16 GB, I can also evaluate k=27. (My workstation only has 16 GB, so going higher is horribly slow as the virtual memory system thrashes.)

    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>
    
    typedef struct thunk thunk;
    
    typedef int getter(thunk *);
    typedef void setter(thunk *, int);
    
    struct thunk {
        void *frame;
        getter *get;
        setter *set;
    };
    
    static void set_constant(thunk *p, int new_value)
    {
        assert(false && "can't assign to constant");
    }
    
    typedef struct A_frame {
        int A;
        int k;
        thunk x1;
        thunk x2;
        thunk x3;
        thunk x4;
        thunk x5;
    } A_frame;
    
    int A(int k, thunk *x1, thunk *x2, thunk *x3, thunk *x4, thunk *x5);
    int B(A_frame *);
    
    static int get_B(thunk *p)
    {
        return B(p->frame);
    }
    
    static void set_B(thunk *p, int new_value) {
        assert(false && "can't assign to B");
    }
    
    int B(A_frame *ap) {
        ap->k -= 1;
        thunk t1 = { ap, get_B, set_B };
        return ap->A = A(ap->k, &t1, &ap->x1, &ap->x2, &ap->x3, &ap->x4);
    }
    
    int A(int k, thunk *x1, thunk *x2, thunk *x3, thunk *x4, thunk *x5)
    {
        A_frame f;
        if (k <= 0) {
            f.A = x4->get(x4) + x5->get(x5);
        } else {
            f.k = k;
            f.x1 = *x1;
            f.x2 = *x2;
            f.x3 = *x3;
            f.x4 = *x4;
            f.x5 = *x5;
            B(&f);
        }
        return f.A;
    }
    
    static int get_plus1(thunk *p)
    {
        return +1;
    }
    
    static int get_minus1(thunk *p)
    {
        return -1;
    }
    
    static int get_zero(thunk *p)
    {
        return 0;
    }
    
    int main()
    {
        int k = 0;
        while (true) {
    
            thunk t1 = { NULL, get_plus1,  set_constant };
            thunk t2 = { NULL, get_minus1, set_constant };
            thunk t3 = { NULL, get_minus1, set_constant };
            thunk t4 = { NULL, get_plus1,  set_constant };
            thunk t5 = { NULL, get_zero,   set_constant };
            printf("k=%d, A=%d\n", k, A(k, &t1, &t2, &t3, &t4, &t5));
            k++;
        }
    }
    
  7. kernelbob said

    One more note. The first time I used Algol-60 was in the MIT introductory CS class, 6.031, in 1979. The last time I used Algol-60 was also in 6.031 in 1979. (-:

  8. Jussi Piitulainen said

    I’m stll puzzled at Knuth declaring those procedures “real”. They seem to work fine as “integer” (-67, using GNU MARST to compile).
    (And I’m hoping to have it indented as intended this time.)

    begin
        comment replaced real with integer procedures;
        comment why where they real?;
    
        integer procedure A(k, x1, x2, x3, x4, x5);
        value k; integer k;
        integer x1, x2, x3, x4, x5;
        begin
            integer procedure B;
            begin 
               k := k - 1; 
               B := A := A(k, B, x1, x2, x3, x4)
            end;
    
            if k <= 0 then A := x4 + x5 else B
        end;
    
        outstring(1, "The correct (integer!) answer is ");
        outreal(1, A(10, 1, -1, -1, 1, 0));
        outstring(1, ".\n")
    end
    

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: