Trailing Comments
February 23, 2021
Today’s exercise is based on a blog entry that Reddit pointed me to. Sean is trying to write a program that finds lines in his Python programs that end with a trailing comment, which he considers bad style. His naive program improperly flags a line as containing a trailing comment when the comment marker is embedded in a quoted string:
x = "# This is fine"
"This \# is fine too"
x = "" # This is not
Sean gets a little bit sidetracked in his blog entry, and never quite solves the problem; I’m not convinced the code he writes is correct (though he seems to think it is), and I’m also not convinced that trailing comments are a bad thing. Nevertheless, the task makes an interesting exercise, especially when we also handle quoted escape sequences.
Your task is to write a program that identifies lines of code with trailing comments. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.
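One possible starting point, offered as a minimal sketch rather than as the suggested solution: scan each line character by character, tracking whether the scanner is inside a quoted string and skipping any character that follows a backslash inside a string. The function name `has_trailing_comment` is hypothetical. The sketch assumes each physical line is examined on its own, so triple-quoted strings that span lines are not handled, and it assumes that a line consisting only of a comment does not count as a trailing comment.

```python
def has_trailing_comment(line: str) -> bool:
    """Return True if `line` contains code followed by a `#` comment.

    Assumptions (not from the original post): one physical line at a time,
    no multi-line strings, and full comment lines are not "trailing".
    """
    quote = None        # the quote character we are currently inside, or None
    seen_code = False   # becomes True once any non-whitespace code is seen
    i = 0
    while i < len(line):
        ch = line[i]
        if quote is not None:
            if ch == "\\":
                i += 2          # skip the backslash and the escaped character
                continue
            if ch == quote:
                quote = None    # closing quote ends the string
        elif ch in "'\"":
            quote = ch          # opening quote starts a string
            seen_code = True
        elif ch == "#":
            return seen_code    # comment marker outside any string
        elif not ch.isspace():
            seen_code = True
        i += 1
    return False


# The three sample lines from the exercise text.
for sample in ['x = "# This is fine"',
               r'"This \# is fine too"',
               'x = "" # This is not']:
    print(has_trailing_comment(sample), sample)
```

Only the third sample line is reported as having a trailing comment. A tokenizer-based approach is more robust for real programs, since it also copes with strings that span several lines.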
@programmingpraxis, I believe the solution you posted considers full comment lines (those starting with a “#”, possibly preceded by whitespace) as trailing comments. For example, the line
# comment
would seemingly be counted as a trailing comment. Have I interpreted your code correctly, and should there be a distinction between 1) full comment lines and 2) lines that have code followed by a comment? I’ve assumed that a trailing comment detector would only detect the latter.
Here’s a solution in Python, which takes a Python file as input and outputs the line numbers with trailing comments. I’ve utilized Python’s built-in tokenize module. This accommodates comment markers in either single or double quoted strings, and also handles trailing comments that occur at the end of multi-line strings after the triple quotes.
import sys
import tokenize

assert len(sys.argv) == 2

with tokenize.open(sys.argv[1]) as f:
    line = -1
    tokens = tokenize.generate_tokens(f.readline)
    for token in tokens:
        # A trailing comment is a comment token that starts on the same line that the
        # preceding token ended on.
        if token.type == tokenize.COMMENT and token.start[0] == line:
            print(line)
        line = token.end[0]
Example usage:
x = '# this is not a trailing comment'
"this # is not part of a trailing comment"
x = "" # this is a trailing comment
# this is not (full comment line)
x = """
:-) # this multi-line string text is not part of a trailing comment
""" # this text is

$ python3 trailing_comments.py example.py
Output:
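The result for this example can be reproduced without creating any files. The sketch below wraps the same tokenize approach in a hypothetical helper, `trailing_comment_lines`, that reads the source from a string through `io.StringIO` instead of a file named on the command line. The helper name and the string-based input are assumptions made for illustration. The detection rule is unchanged: a comment token that starts on the line where the preceding token ended.

```python
import io
import tokenize


def trailing_comment_lines(source: str) -> list[int]:
    """Return the line numbers on which a comment follows another token."""
    result = []
    prev_end_line = -1   # line on which the previous token ended
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT and tok.start[0] == prev_end_line:
            result.append(tok.start[0])
        prev_end_line = tok.end[0]
    return result


# The example file from the comment above, held in a string.
EXAMPLE = '''x = '# this is not a trailing comment'
"this # is not part of a trailing comment"
x = "" # this is a trailing comment
# this is not (full comment line)
x = """
:-) # this multi-line string text is not part of a trailing comment
""" # this text is
'''

print(trailing_comment_lines(EXAMPLE))  # prints [3, 7]
```

Line 4 is not reported because the token before its comment ended on line 3. Line 7 is reported because the multi-line string token ends on the same line where the comment starts.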
A solution in Racket:
(define (trailing-comments path)
  (define shebang "#!")
  (let next ((lines (port->lines (open-input-file path)))
             (count 1)
             (out null))
    (if (null? lines)
        (display (~a "lines: "
                     (if (null? out)
                         "(none)"
                         (string-join (map number->string (reverse out))))))
        (let ((line (car lines)))
          (let ((chars (reverse
                        (string->list
                         (if (and (= count 1) (string-prefix? line shebang))
                             (string-trim line shebang #:right? #f)
                             line)))))
            (let ((icomm (index-of chars #\#))
                  (iquot (index-of chars #\")))
              (next (cdr lines)
                    (add1 count)
                    (if (and icomm (or (not iquot) (< icomm iquot)))
                        (cons count out)
                        out))))))))
Test source code file, trailing-comments.py:
#!/usr/bin/env python
# comments start with the octothorpe
x = "But having one in a string, like this: #, doesn't count."
"This one: \#, doesn't count either."
# user input
Fahrenheit = int(raw_input("Enter a temperature in Fahrenheit: "))
Celsius = (Fahrenheit - 32) * 5.0/9.0 # convert to celsius
print "Temperature:", Fahrenheit, "Fahrenheit = ", Celsius, " C"
Testing:
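For readers who do not use Racket, the per-line check in the Racket code can be approximated in Python. This is a sketch based on one reading of that code, not a verified translation: a line is flagged when its last `#` comes after its last double quote, and a leading `#!` on the first line is ignored. The function name `flagged_by_last_hash_rule` and the sample lines are hypothetical.

```python
def flagged_by_last_hash_rule(line: str, is_first_line: bool = False) -> bool:
    """Flag a line whose last '#' appears after its last double quote."""
    if is_first_line and line.startswith("#!"):
        line = line[2:]               # ignore the shebang marker itself
    last_hash = line.rfind("#")
    last_quote = line.rfind('"')      # -1 when the line has no double quote
    return last_hash != -1 and last_hash > last_quote


samples = [
    'x = "a # inside a string"',   # '#' before the closing quote
    'x = "" # trailing',           # '#' after the last quote
    '# full comment line',         # no quote at all
    "x = '# single quoted'",       # single quotes are not tracked
]
for s in samples:
    print(flagged_by_last_hash_rule(s), s)
```

Under this rule the last three samples are flagged. That means full comment lines are reported along with true trailing comments, and a `#` inside a single-quoted string is reported as well, so the rule is best treated as a quick heuristic.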
Here is my take on this using Julia 1.5: https://pastebin.com/afzBbauw
This solution works with a single line, but it’s not too difficult to adapt it to run on a whole program. Cheers