Concatenate N N Times

May 10, 2016

One solution is to treat the number as a string, replicate the characters then convert to a number:

(define (f n)
  (undigits (flatten (iterate n append (digits n)))))

> (f 12)
121212121212121212121212

That looks simple but is actually fairly complicated, requiring several functions from the Standard Prelude.

Concatenated numbers form one of the Smarandache sequences, and are given by the formula $\frac{n(10^{n D(n)}-1}{10^{D(n)}-1}$ , where D(n) is the number of digits in n:

(define (d n) ; number of digits in n
  (let loop ((n n) (k 0))
    (if (zero? n) k
      (loop (quotient n 10) (+ k 1)))))

(define (f n) ; concatenate n n times
  (quotient
    (* n (- (expt 10 (* n (d n))) 1))
    (- (expt 10 (d n)) 1)))

> (d 12)
2
> (f 12)
121212121212121212121212

That’s both simpler and faster than the string solution. You can run the program at http://ideone.com/kwRLZi. Besides the MathWorld page referenced above, this exercise is also discussed at A000461.

Posted by programmingpraxis

Filed in Exercises

10 Comments »

10 Responses to “Concatenate N N Times”

James Curtis-Smith said
May 10, 2016 at 9:56 AM
Tried this in perl – as perl has the built in operator… the concatenation is about 40% faster then the numeric method – and works for larger values of n….

informatimago said

May 10, 2016 at 10:03 AM

Simple iterative solution O(n):

(defun self-cat-|O(n)| (n &optional (base 10))
  (let ((factor (expt base (ceiling (log n base)))))
    (loop :repeat n
          :for result := n :then (+ (* result factor) n)
          :finally (return result))))

(list (self-cat-|O(n)| 7)
      (self-cat-|O(n)| 12)
      (self-cat-|O(n)| 1234))
--> (7777777 121212121212121212121212 1234…1234)

Notice that it performs n multiplications and n additions that quickly enough are bignum operations.

This is typically an operation that can be reduced to O(logn), by noticing that when n is even:

  f(n)=f(n/2)|f(n/2)

and when n is odd:

  f(n)=f(⸤n/2⸥)|f(⸤n/2⸥)|f(1)

Therefore we can write:

(defun self-cat-|O(logn)| (n &optional (base 10))
  (labels ((cat (factor hi lo) (+ (* factor hi) lo))
           (rep (n nxn factor)
             (cond
               ((= 1 n)   nxn)
               ((evenp n) (rep (/ n 2)
                               (cat factor nxn nxn)
                               (* factor factor)))
               (t         (cat factor
                               (rep (truncate  n 2)
                                    (cat factor nxn nxn)
                                    (* factor factor))
                               nxn)))))
    (rep n n (expt base (ceiling (log n base))))))

(list (self-cat-|O(n)| 7)
      (self-cat-|O(n)| 12)
      (self-cat-|O(n)| 35)
      (self-cat-|O(logn)| 7)
      (self-cat-|O(logn)| 12)
      (self-cat-|O(logn)| 35))
--> (
     7777777
     121212121212121212121212
     3535353535353535353535353535353535353535353535353535353535353535353535
     7777777
     121212121212121212121212
     3535353535353535353535353535353535353535353535353535353535353535353535)

informatimago said

May 10, 2016 at 11:02 AM

I doubt James tested his perl string version on non-single digit numbers; going thru strings is 6 times slower than the iterative arithmetic solution, and 1000 times slower than the dichotomy arithmetic solution:

(defun self-cat-string (n)
  (read-from-string (format nil "~V@{~A~:*~}" n n)))

(progn
  (time (progn (self-cat-string 1234) nil))
  (time (progn (self-cat-|O(n)| 1234) nil))
  (time (progn (self-cat-|O(logn)| 1234) nil)))

(progn (self-cat-string 1234) nil)
took 12,088 microseconds (0.012088 seconds) to run.
      1,798 microseconds (0.001798 seconds, 14.87%) of which was spent in GC.
During that period, and with 4 available CPU cores,
      9,303 microseconds (0.009303 seconds) were spent in user mode
      3,327 microseconds (0.003327 seconds) were spent in system mode
 10,542,560 bytes of memory allocated.
 33 minor page faults, 0 major page faults, 0 swaps.
(progn (|SELF-CAT-O(n)| 1234) nil)
took 3,842 microseconds (0.003842 seconds) to run.
       897 microseconds (0.000897 seconds, 23.35%) of which was spent in GC.
During that period, and with 4 available CPU cores,
     2,745 microseconds (0.002745 seconds) were spent in user mode
     1,336 microseconds (0.001336 seconds) were spent in system mode
 2,582,768 bytes of memory allocated.
(progn (|SELF-CAT-O(logn)| 1234) nil)
took 77 microseconds (0.000077 seconds) to run.
During that period, and with 4 available CPU cores,
     78 microseconds (0.000078 seconds) were spent in user mode
      3 microseconds (0.000003 seconds) were spent in system mode
 27,024 bytes of memory allocated.

informatimago said
May 10, 2016 at 11:09 AM
I meant the factors for 1234.

Converting the number to a string is O(logn) divisions; concatenating the strings can be made O(n^2) memory accesses (the length of the string is n*n=n^2); reading the result is O(n^2) multiplications+additions. Total, the string version is O(n^2) multiplications, while the simple iterative is only O(n) multiplication and the dichotomic one is O(logn) multiplications.

matthew said

May 10, 2016 at 5:20 PM

Here’s a couple of Python solutions, one short and snappy using strings, one using bignum arithmetic. The string solution is faster for smaller n, but with bignums winning for larger (>200) n – by n = 100000, bignums are 15 times faster, probably converting megabyte strings to ints doesn’t scale well.

import sys
import math
import timeit

def f1(n): return int(str(n)*n)

def f2(n):
    k = 1 + int(math.log10(n))
    d = 10**k
    return (d**n-1)//(d-1)*n

def test(f,n): return lambda: f(n)

for i in range(1,1000): assert(f1(i) == f2(i))

n = int(sys.argv[1])
count = int(sys.argv[2])
print(timeit.timeit(test(f1,n),number=count))
print(timeit.timeit(test(f2,n),number=count))

Daniel said

May 13, 2016 at 11:59 PM

Here’s a solution in Java that iteratively reads an additional character until the parsed integer times its string length equals the length of the entire string. I assumed inputs are valid (e.g., 333 is valid, 332 is not but my program would also return 3 for that). At return time, I could check for validity but didn’t bother.

import java.math.BigInteger;

public class Praxis {
    private static String parseN(String nn) {
        int len = nn.length();
        for (int i = 1; i <= len; i++) {
            String str = nn.substring(0, i);
            BigInteger n = new BigInteger(str);

            int chars = n.multiply(BigInteger.valueOf(str.length())).intValue();

            // assumes input is valid.
            // could alternatively check in condition below.
            if (chars == len) {
                return n.toString();
            } else if (chars > len) {
                return ""; // error
            }
        }
        return ""; // error
    }

    public static void main(String[] args) {
        System.out.println(parseN("7777777"));
        System.out.println(parseN("121212121212121212121212"));
    }
}

Output:

7
12

Daniel said

May 14, 2016 at 12:03 AM

public static void main(String[] args) {
    System.out.println(parseN("1"));
    System.out.println(parseN("22"));
}

Output:

1
2

Daniel said
May 14, 2016 at 12:09 AM
> “Your task is to write a program that calculates the number that is concatenated to itself the number of times as the number is (that’s hard to say).”

I just looked at the solution and skimmed some other solutions. I think I interpreted the task slightly differently than others. It seems like the typical interpretation is that the input is N, and the output is N concatenated N times. I thought the input was N concatenated N times, and the output was N. I re-read the task and think it can be interpreted either way (its ambiguous, like many natural language sentences).

Chris Judge said

May 16, 2016 at 3:06 PM

Building the output in an array of chars is wicked fast. The following – in c# – completes in 1 millisecond with n = 100000:

        static string viaDoubling (string input) {
            int n = int.Parse(input);
            long blockLen = input.Length;
            long outLen = n * blockLen;
            char[] output = new char[outLen];

            input.CopyTo(0, output, 0, input.Length);

            while (blockLen < outLen / 2) {
                Array.Copy(output, 0, output, blockLen, blockLen);
                blockLen *= 2;
            }

            Array.Copy(output, 0, output, blockLen, outLen - blockLen);

            return new string(output);
        }

A math solution requires approximately 40000 milliseconds for the same n = 100000:

        static string viaMath(string input) {
            int n = int.Parse(input);
            int m = (int) Math.Pow(10, input.Length);
            BigInteger s = 0;
            
            for (int i = 0; i < n; i++) {
                s *= m;
                s += n;
            }

            return s.ToString();
        }

Sinan said

May 30, 2016 at 12:35 AM

I have two solutions both using python and does substring comparison although numeric comparisons are cool.

First version is kind of greedy approach

def find_repetitions(s):
    for i in range(1, len(s) / 2 + 1):
        frag = s[:i]
        frags = s.split(frag)
        if not any(frags):
            return frag, len(frags) - 1

Second version uses indexes to compare substrings – it resembles to dynamic programming solutions.

def find_repetitions2(s):
    x = len(s[:len(s) / 2 + 1])
    y = len(s)
    for i in range(1, x):
        for j in range(i, y, i):
            if i > y - j:
                break
            for k in range(i):
                if s[k] != s[j + k]:
                    break
            else:
                continue
            break
        else:
            return s[:i], y / i

Performance comparison, measured by ipython’s timeit builtin function.

First version seems to do better at first glance

In [_]: %timeit find_repetitions('xcv' * 1024)
10000 loops, best of 3: 113 µs per loop

In [_]: %timeit find_repetitions(string.letters * 1024)
100 loops, best of 3: 6.22 ms per loop


In [_]: %timeit find_repetitions2('xcv' * 1024)
1000 loops, best of 3: 599 µs per loop

In [_]: %timeit find_repetitions2(string.letters * 1024)
100 loops, best of 3: 7.19 ms per loop

But i suspected that second version would do better when repeated substring becomes much larger, and it indeed did perform better

In [_]: %timeit find_repetitions(''.join(random.choice(string.letters) for _ in range(500)) * 1024)
1 loops, best of 3: 387 ms per loop

In [_]: %timeit find_repetitions2(''.join(random.choice(string.letters) for _ in range(500)) * 1024)
10 loops, best of 3: 87.9 ms per loop

Programming Praxis