Sum
March 25, 2011
Today’s exercise implements the old Unix Sys V R4 sum command. The original sum calculated a checksum as the sum of the bytes in the file, modulo 2^16−1 (65535), as well as the number of 512-byte blocks the file occupied on disk. Called with no arguments, sum read standard input and wrote the checksum and file blocks to standard output; called with one or more filename arguments, sum read each file and wrote for each a line containing the checksum, file blocks, and filename. The GNU version of the program, which implements an additional version of the checksum that was part of BSD Unix, is available at ftp://alpha.gnu.org/gnu/coreutils/textutils-2.0.22.tar.gz in file sum.c; if you run it, be sure to give the −s argument to get the original Unix Sys V R4 version of the checksum.
Your task is to write a program that implements the original Unix Sys V R4 sum command. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
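As a rough illustration of the computation described above (not the suggested solution itself), here is a minimal Python 3 sketch. It assumes the whole input fits in memory, and the names sysv_sum and report are made up for this example.

```python
def sysv_sum(data):
    """Return (checksum, blocks) for a bytes object.

    checksum: sum of all byte values, modulo 2**16 - 1
    blocks:   number of 512-byte blocks, rounded up
    """
    checksum = sum(data) % 65535
    blocks = (len(data) + 511) // 512
    return checksum, blocks


def report(path):
    """Format one output line for a named file (hypothetical helper)."""
    with open(path, "rb") as f:  # binary mode, so bytes are summed as-is
        checksum, blocks = sysv_sum(f.read())
    return "{} {} {}".format(checksum, blocks, path)


# 'a' + 'b' + 'c' = 97 + 98 + 99 = 294, which fits in one block
print(sysv_sum(b"abc"))  # (294, 1)
```

Reading standard input when no filenames are given could be handled the same way by passing sys.stdin.buffer.read() to sysv_sum.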
I’m not sure that this is quite in the spirit of the site but:
$ /usr/local/bin/sum -s /usr/share/dict/words
19278 4858 /usr/share/dict/words
$ perl -le 'local $/; print unpack("%32W*", <>) % 65535' /usr/share/dict/words
19278

My Haskell solution (see http://bonsaicode.wordpress.com/2011/03/25/programming-praxis-sum/ for a version with comments):
import Data.Char
import System.Environment

checksum :: String -> String
checksum = (\(s,b) -> show s ++ " " ++ show (div (b + 511) 512)) .
           foldl (\(s,b) c -> (mod (s + ord c) 65535, b + 1)) (0,0)

main :: IO ()
main = getArgs >>= \args -> case args of
    [] -> interact checksum
    fs -> mapM_ (\f -> putStrLn . (++ ' ':f) . checksum =<< readFile f) fs

[…] today’s Programming Praxis exercise, our goal is to implement a unix checksum utility. Let’s get […]
#!/usr/bin/env python
import sys

def checksum(f):
    s = b = 0
    for c in f.read():
        s = (s + ord(c)) % 65535
        b += 1
    p = 0 if (b % 512 == 0) else 1
    return s, (b // 512) + p

def main(args=None):
    if args:
        for arg in args:
            with open(arg) as f:
                s, b = checksum(f)
                print "{0}\t{1}\t{2}".format(s, b, arg)
    else:
        s, b = checksum(sys.stdin)
        print "{0}\t{1}".format(s, b)
    return None

if __name__ == "__main__":
    main(sys.argv[1:])

I implemented a version in Factor (the short version is below):
: sum-file ( path -- )
    [ binary file-contents sum [ 65535 mod ] [ 512 / ceiling ] bi ]
    [ "%d %d %s\n" printf ] bi ;

The full version is here:
http://re-factor.blogspot.com/2011/03/sum.html
import fileinput

def report(checksum, bytecount, filename=''):
    # The exercise asks for the sum modulo 2^16 - 1 (65535);
    # masking with 0xffff would reduce modulo 65536 instead.
    return '{:5} {:5} {}'.format(checksum % 65535, (bytecount+511)/512, filename)

filename = None
for line in fileinput.input():
    if fileinput.isfirstline():
        if filename is not None:
            print report(checksum, bytecount, filename)
        filename = fileinput.filename()
        bytecount = 0
        checksum = 0
    bytecount += len(line)
    checksum += sum(ord(c) for c in line)
print report(checksum, bytecount, filename)

Whoops, misunderstood the requirements – this is a fixed version for Factor:
: sum-file. ( path -- )
    [ binary file-contents [ sum 65535 mod ] [ length 512 / ceiling ] bi ]
    [ "%d %d %s\n" printf ] bi ;

Here’s a go in Common Lisp using LOOP; I also had a bit of fun tweaking the reading of the file and inlining the calculation:
(defun sv4r-sum (file)
  "Calculate a Unix SV4R-style file checksum"
  (with-open-file (stream file :direction :input :element-type 'unsigned-byte)
    (loop :with buffer = (make-array 512 :element-type 'unsigned-byte)
          :and csum = 0
          :for pos = (read-sequence buffer stream)
          :while (plusp pos)
          :do (loop :for b :from 0 :to (1- pos)
                    :do (setf csum (mod (+ csum (svref buffer b)) 65535)))
          :counting buffer :into c
          :finally (return (values csum c)))))
You can never have too many Scheme solutions ;)
#!r6rs
(import (rnrs)
        (wak foof-loop) ;; I ♥ foof loop
        (srfi :48 intermediate-format-strings))

(define (print-sum port name)
  (let-values (((checksum num-blocks) (sum port)))
    (format #t "~a ~a ~a~%" checksum num-blocks name)))

(define (sum port)
  (loop ((for byte (in-port port get-u8))
         (for num-bytes (up-from 0))
         (for total (summing byte)))
    => (values (mod total (- (expt 2 16) 1))
               (ceiling (/ num-bytes 512)))))

(let ((args (cdr (command-line))))
  (if (null? args)
      (print-sum (standard-input-port) "")
      (for-each (lambda (file)
                  (call-with-port (open-file-input-port file)
                    (lambda (out) (print-sum out file))))
                args)))

#include <stdio.h>

void check_sum(FILE *fp);

int main(int argc, char *argv[])
{
    FILE *fp;

    if (argc == 1)
        check_sum(stdin);
    else
        while (argc-- > 1)
            if ((fp = fopen(*++argv, "rb")) == NULL) {
                fprintf(stderr, "couldn't open file %s\n", *argv);
                return 1;
            } else {
                check_sum(fp);
                fclose(fp);
            }
    return 0;
}

void check_sum(FILE *fp)
{
    unsigned int bytes_sum;   // Guaranteed to go at least to 65535
    unsigned int bytes_count;
    int next_byte;

    bytes_count = bytes_sum = 0;
    while ((next_byte = getc(fp)) != EOF) {
        bytes_sum = (bytes_sum + next_byte) % 65535;
        bytes_count++;
    }
    // Round up to whole 512-byte blocks; adding 1 unconditionally
    // would overcount files whose size is an exact multiple of 512.
    printf("%u %u\n", bytes_sum, (bytes_count + 511) / 512);
}
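The solutions in this thread round the block count up in a few different ways: adding 511 before dividing, taking a ceiling, or adding one only when there is a remainder. The small Python 3 sketch below checks that these agree, and that the tempting shortcut of always adding one does not. The function names are illustrative only.

```python
def blocks_add_511(n):
    """Round up by adding 511 before integer division."""
    return (n + 511) // 512


def blocks_remainder(n):
    """Round up by adding one block only when there is a remainder."""
    return n // 512 + (0 if n % 512 == 0 else 1)


def blocks_plus_one(n):
    """Shortcut that always adds one block (wrong at exact multiples of 512)."""
    return 1 + n // 512


# Print the three results side by side around the block boundaries.
for n in (0, 1, 511, 512, 513, 1024):
    print(n, blocks_add_511(n), blocks_remainder(n), blocks_plus_one(n))
```

The first two columns match for every size, while the third is one too high at 0, 512, and 1024 bytes.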