Man Or Boy

September 16, 2016

During the development of Algol 60, Donald Knuth devised a nasty test of recursion:

There are quite a few ALGOL60 translators in existence which have been designed to handle recursion and non-local references properly, and I thought perhaps a little test-program may be of value. Hence I have written the following simple routine, which may separate the man-compilers from the boy-compilers:
begin 
 real procedure A(k, x1, x2, x3, x4, x5);
 value k; integer k;
 begin
  real procedure B;
  begin 
   k := k - 1; 
   B := A := A(k, B, x1, x2, x3, x4)
  end;
  if k ≤ then A : = x4 + x5 else B
 end
 outreal(A(10, 1, -1, -1, 1, 0))
end
This uses nothing known to be tricky or ambiguous. My question is: What should the answer be? Unfortunately, I don’t have to a man-compiler myself, and so I was forced to try hand calculations. My conjecture (probably wrong) is that the answer will be:
73 - 119 - 177 + 102 = - 121
I’d be very glad to know the right answer.

Your task is to write a program that computes the right answer. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.

Posted by programmingpraxis

Filed in Exercises

8 Comments »

8 Responses to “Man Or Boy”

Milbrae said
September 16, 2016 at 1:26 PM
Maybe you’d like to check this out: https://www.rosettacode.org/wiki/Man_or_boy_test

Informatimago said

September 17, 2016 at 8:08 PM

#|

It is not a test of recursion, but a test of call-by-name.

You have to be careful to interpret the name of the functions in
Algol: depending on whether they are "left-values" or "right-values",
they will denote an assignment to a variable that will be the function
result, or they denote a (possibly recursive) function call.

Thus:

       B := A := A(k, B, x1, x2, x3, x4)
       ^    ^    ^    ^
       |    |    |    |
       |    |    |    +--- function call.
       |    |    +-------- function call.
       |    +------------- assignment to result variable
       +------------------ assignment to result variable


       if k ≤ then A : = x4 + x5 else B
                   ^                  ^
                   |                  |
                   |                  +---- function call
                   +----------------------- assignment to result variable

All the arguments to A, but the first one which is declared to be
by-value, are passed by-name.  This requires the creation of a thunk
(macro byname), and the evaluation of that thunk when the value is
eventually needed (function deref).

|#


(defun deref (x)
  (if (functionp x)
      (deref (funcall x))
      x))

(defmacro byname (x)
  `(lambda () ,x))

(labels ((A (k x1 x2 x3 x4 x5)
           (let ((a :uninitialized-a))
             (labels ((b ()
                        (let ((b :uninitialized-b))
                          (decf k)
                          (setf b (setf a (a k (byname (b)) (byname x1) (byname x2) (byname x3) (byname x4))))
                          b)))
               (if (<= k 0)
                   (setf a (+ (deref x4) (deref x5)))
                   (b)))
             a)))
  (a 10 1 -1 -1 1 0))

;; --> -67

namako said

September 18, 2016 at 1:07 AM

This goes up to 27…so I suppose I am a man compiler.

;For ml64, Windows x86-64
;ml64 <filename>.asm /Cp /link /ENTRY:ENTRY /SUBSYSTEM:CONSOLE kernel32.lib user32.lib /STACK:<stacksize>
;Got up to 27 with a /STACK:10737418240 (10GB), 13 with the defaul 1MB stack.

COMMENT ~
Some hideous ALGOL60 by Knuth:

begin
real procedure A(k, x1, x2, x3, x4, x5);
value k; integer k;
begin
real procedure B;
begin
k := k - 1;
B := A := A(k, B, x1, x2, x3, x4)
end;
if k <= 0 then A : = x4 + x5 else B
end
outreal(A(10, 1, -1, -1, 1, 0))
end

Algol uses "pass by name", so there's a little bit of nastiness to be had here. I'm not actually 
sure what the second B in B := refers to.

After some thinking, it looks like it refers to B in the frame when it was put in there.
Which is a little annoying.

So...I'll pass structures containing pointers to functions that take that structure -_-'.
struct S{
	int (*fun)(S*);
	...
};
Then things should be sensible again.
~

_TEXT SEGMENT

;int32_t A(int32_t, void *, void *, void *, void *, void *); 
;void* parameter: some POD with a pointer to an int32_t function taking a pointer to that POD as parameter.
;A's Stack plan
;x5				104
;x4				96
;(x3) r9		88
;(x2) r8		80
;(x1) rdx		72
;(k) rcx		64 
;return			56 ;allocate 56
;A				48
;frame for B	40
;pointer for B	32
;call space		24
;call space		16
;call space		8
;call space		0
A PROC PUBLIC FRAME
	sub rsp,56
	.ALLOCSTACK 56
	.ENDPROLOG
	cmp ecx,0	;check k
	jle sum
	;On this branch the shadow space needs populating.
	mov [rsp+64],rcx
	mov [rsp+72],rdx
	mov [rsp+80],r8
	mov [rsp+88],r9
	mov [rsp+40],rsp	;save the frame pointer for B
	lea rax,B
	mov [rsp+32],rax	;save B's address for B
	lea rcx,[rsp+32]
	call B				;call B on that pointer
	mov eax,[rsp+48]	;get the return value out
	jmp done
sum:
	mov rcx,[rsp+96]	
	call QWORD PTR [rcx];call x4
	mov [rsp+48],eax	;save return
	mov rcx,[rsp+104]	
	call QWORD PTR [rcx];call x5
	add eax,[rsp+48]	;sum to return.
done:
	add rsp,56
	ret
A ENDP

;B's stack plan
;free			88
;free			80
;free			72
;(pointer)rcx	64
;return			56
;rbx			48
;p5				40
;p4				32
;p3				24
;p2				16
;p1				8
;p0				0
B PROC PUBLIC FRAME
	push rbx
	.PUSHREG rbx
	sub rsp,48
	.ALLOCSTACK 48
	.ENDPROLOG

	mov rbx,[rcx+8]			;get the saved frame pointer for A
	dec DWORD PTR [rbx + 64];k <- k-1 in that frame

	movdqu xmm0,[rbx+88]
	movdqa [rsp+32],xmm0	;x3 and x4
	mov r9,[rbx+80]			;x2
	mov r8,[rbx+72]			;x1
	mov rdx,rcx				;B
	mov rcx,[rbx+64]		;k
	call A					;A(k,B,x1,x2,x3,x4)
	mov [rbx+48],eax		;save the result (as in the original code)

	add rsp,48
	pop rbx
	ret
B ENDP
_TEXT ENDS

;Business to actually do things.
EXTERN __imp_ExitProcess:PROC
EXTERN __imp_GetStdHandle:PROC
EXTERN __imp_WriteFile:PROC
EXTERN __imp_wsprintfA:PROC 

CONST SEGMENT
ALIGN 16
pf0		dq f0
pf1		dq f1
pfm1	dq fm1

format_string	db	"%d -> %d",0dh,0ah,0
len_format_string = $-format_string
end_string 		db	"Stack overflow!"
len_end_string = $ - end_string
CONST ENDS

_BSS SEGMENT
output_string	db	1024 dup (?)
_BSS ENDS

_TEXT SEGMENT

f0:		xor eax,eax
		ret
f1:		mov eax,1
		ret
fm1:	mov eax,-1
		ret

EXTERN __imp_RtlUnwind:PROC
EHANDLER PROC PRIVATE FRAME
		sub rsp,8
		.ALLOCSTACK 8
		.ENDPROLOG
		
		;check what we can do with the exception
		cmp DWORD PTR [rcx+4],0
		jne do_nothing	;it's not something I can deal with unless it's continuable.		
		cmp DWORD PTR [rcx],0c00000fdh 	;check for a stack overflow
		jne do_nothing ;I can't do anything with it
		
		;so...a stack overflow is to be had!
		;It could come from windows, but that would be crazy.
		
		mov r10,rdx	;save the exception frame
		mov r9,0
		mov r8,rcx
		lea rdx,[ehandler_safe_position]
		mov rcx,r10
		call QWORD PTR [__imp_RtlUnwind]
		
		xor rax,rax
		add rsp,8
		ret
		
do_nothing:
		mov rax,1
		add rsp,8
		ret
EHANDLER ENDP 

ENTRY PROC PUBLIC FRAME:EHANDLER
		sub rsp,56
		.ALLOCSTACK 56
		.ENDPROLOG
		
		mov rcx,-11
		call QWORD PTR [__imp_GetStdHandle]
		mov [rsp+80],rax
		
		mov QWORD PTR [rsp+88],0
next:
		lea r11,[pf0]
		mov [rsp+40],r11
		lea r10,[pf1]
		mov [rsp+32],r10
		lea r9,[pfm1]
		lea r8,[pfm1]
		lea rdx,[pf1]
		mov rcx,[rsp+88]			;here's k!
		call A
		
		mov r9,rax
		mov r8,[rsp+88]
		lea rdx,[format_string]
		lea rcx,[output_string]
		call QWORD PTR [__imp_wsprintfA]
				
		mov QWORD PTR [rsp+32],0
		lea r9,[rsp+64]
		mov r8,rax
		lea rdx,[output_string]
		mov rcx,[rsp+80]
		call QWORD PTR [__imp_WriteFile]

		inc QWORD PTR [rsp+88]
		jmp next
		
ehandler_safe_position::	
		mov QWORD PTR [rsp+32],0
		lea r9,[rsp+64]
		mov r8,len_end_string
		lea rdx,[end_string]
		mov rcx,[rsp+80]
		call QWORD PTR [__imp_WriteFile]
	
		xor rcx,rcx
		call QWORD PTR [__imp_ExitProcess]
		
		add rsp,40
		ret
ENTRY ENDP
_TEXT ENDS
END

Jussi Piitulainen said
September 18, 2016 at 1:40 PM
I modified the Algol 60 program by hand until GNU MARST would compile it: declared the xs (as integers), removed the space from inside an assignment operator, replaced the proper inequality symbol with its ASCII representation, added the 0 to compare against (huh?), and inserted a couple of semicolons. And added a “channel” where the result is printed: 1 for stdout.

MARST processes Algol 60 according to a Modified Report, 1976, which applies a Supplement to the Revised Report. (Scheme would later follow its Revised Report, 1978, with a Revised Revised Report.)

begin
real procedure A(k, x1, x2, x3, x4, x5);
value k; integer k;
integer x1, x2, x3, x4, x5;
begin
real procedure B;
begin
k := k – 1;
B := A := A(k, B, x1, x2, x3, x4)
end;
if k <= 0 then A := x4 + x5 else B
end;
outreal(1, A(10, 1, -1, -1, 1, 0))
end

MARST translates Algol 60 to C. MARST agrees with those who say that the answer should be -67.

$ ./marst knuth.alg -o knuth.c
$ gcc knuth.c -L .libs -lalgol -lm -o knuth
$ LD_LIBRARY_PATH=.libs/ ./knuth
-67 $

MARST comes with a converter from earlier forms of the language. I haven’t tried the converter. Was that 0 really optional at some stage?

I’m also wondering why the result should be declared real. This is clearly integer arithmetic.
Jussi Piitulainen said
September 18, 2016 at 1:43 PM
I object to the eating of the indentations. Happily they weren’t all that crucial this time.

kernelbob said

September 19, 2016 at 4:45 AM

This was a fun problem. Thanks!

@namako, that was beautiful.

Here are two solutions. The first, in Python 3, reads a lot like the original. The trickiest part was having B find the right instance of k.

#!/usr/bin/env python3

from itertools import count
import sys

def name(x):
    def v():
        return x
    return v

def A(k, x1, x2, x3, x4, x5):
    k = [k[0]]

    def B():
        k[0] -= 1
        return A(k, B, x1, x2, x3, x4)
    if k[0] <= 0:
        return x4() + x5()
    else:
        return B()

sys.setrecursionlimit(2**30)

for k in count():
    print(k, A([k], name(1), name(-1), name(-1), name(1), name(0)))

And here is one in C. C is really badly suited for this process, though this C programs runs faster than Our Gracious Host’s solution in Ikarus Scheme. Most Scheme compilers have to search the environment for the nonlocal variable k, I think, while the C program increments k in O(1) time.

I tried to follow the spirit of Algol 60, using thunks and explicit stack frames to evaluate parameters with call-by-name semantics. If I limit the process’s stack size to 16 GB, I can also evaluate k=27. (My workstation only has 16 GB, so going higher is horribly slow as the virtual memory system thrashes.)

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct thunk thunk;

typedef int getter(thunk *);
typedef void setter(thunk *, int);

struct thunk {
    void *frame;
    getter *get;
    setter *set;
};

static void set_constant(thunk *p, int new_value)
{
    assert(false && "can't assign to constant");
}

typedef struct A_frame {
    int A;
    int k;
    thunk x1;
    thunk x2;
    thunk x3;
    thunk x4;
    thunk x5;
} A_frame;

int A(int k, thunk *x1, thunk *x2, thunk *x3, thunk *x4, thunk *x5);
int B(A_frame *);

static int get_B(thunk *p)
{
    return B(p->frame);
}

static void set_B(thunk *p, int new_value) {
    assert(false && "can't assign to B");
}

int B(A_frame *ap) {
    ap->k -= 1;
    thunk t1 = { ap, get_B, set_B };
    return ap->A = A(ap->k, &t1, &ap->x1, &ap->x2, &ap->x3, &ap->x4);
}

int A(int k, thunk *x1, thunk *x2, thunk *x3, thunk *x4, thunk *x5)
{
    A_frame f;
    if (k <= 0) {
        f.A = x4->get(x4) + x5->get(x5);
    } else {
        f.k = k;
        f.x1 = *x1;
        f.x2 = *x2;
        f.x3 = *x3;
        f.x4 = *x4;
        f.x5 = *x5;
        B(&f);
    }
    return f.A;
}

static int get_plus1(thunk *p)
{
    return +1;
}

static int get_minus1(thunk *p)
{
    return -1;
}

static int get_zero(thunk *p)
{
    return 0;
}

int main()
{
    int k = 0;
    while (true) {

        thunk t1 = { NULL, get_plus1,  set_constant };
        thunk t2 = { NULL, get_minus1, set_constant };
        thunk t3 = { NULL, get_minus1, set_constant };
        thunk t4 = { NULL, get_plus1,  set_constant };
        thunk t5 = { NULL, get_zero,   set_constant };
        printf("k=%d, A=%d\n", k, A(k, &t1, &t2, &t3, &t4, &t5));
        k++;
    }
}

kernelbob said
September 19, 2016 at 4:50 AM
One more note. The first time I used Algol-60 was in the MIT introductory CS class, 6.031, in 1979. The last time I used Algol-60 was also in 6.031 in 1979. (-:

Jussi Piitulainen said

September 21, 2016 at 9:12 AM

I’m stll puzzled at Knuth declaring those procedures “real”. They seem to work fine as “integer” (-67, using GNU MARST to compile).
(And I’m hoping to have it indented as intended this time.)

begin
    comment replaced real with integer procedures;
    comment why where they real?;

    integer procedure A(k, x1, x2, x3, x4, x5);
    value k; integer k;
    integer x1, x2, x3, x4, x5;
    begin
        integer procedure B;
        begin 
           k := k - 1; 
           B := A := A(k, B, x1, x2, x3, x4)
        end;

        if k <= 0 then A := x4 + x5 else B
    end;

    outstring(1, "The correct (integer!) answer is ");
    outreal(1, A(10, 1, -1, -1, 1, 0));
    outstring(1, ".\n")
end

Programming Praxis

Man Or Boy

September 16, 2016

8 Responses to “Man Or Boy”

Leave a comment

Categories

Archives

Archives

Programming Praxis

Man Or Boy

September 16, 2016

Share this:

Related

8 Responses to “Man Or Boy”

Leave a comment

Categories

Archives

Archives