The following Python script searches through C code for division (/) or sqrt
and prints each matching line along with its line number. It skips C comments.
To use it, run: python find_divides.py filename.c
#!/usr/bin/python
"""find_divides.py
usage: python find_divides.py filename
"""
import re
import sys

def main():
    filename = sys.argv[1]
    text = open(filename).read()
    lines = text.splitlines()
    # prefix each line with its 1-based line number before comments are removed
    lines = ["%4d: %s" % (i, line) for (i, line) in enumerate(lines, 1)]
    text = "\n".join(lines)
    text = remove_comments_and_strings(text)
    for line in text.splitlines():
        if ("/" in line) or ("sqrt" in line):
            print(line)

def remove_comments_and_strings(text):
    """ remove c-style comments and ...

The Perl FAQ has an entry, "How do I use a regular expression to strip C style comments from a file?" Since I've switched to Python, I've adapted the Perl solution to Python. This regular expression was created by Jeffrey Friedl and later modified by Fred Curtis. I'm not certain, but it appears to use the "unrolling the loop" technique described in Chapter 6 of Mastering Regular Expressions.
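As a rough illustration of the idea (not the Friedl/Curtis regular expression itself, which is not reproduced in this excerpt), here is a minimal sketch that uses the well-known unrolled-loop pattern for C block comments and a separate alternative for double-quoted strings, so that comment markers inside string literals are left alone. The names BLOCK_COMMENT, STRING_LITERAL and strip_block_comments are made up for this example, and the sketch ignores // comments and character literals.

```python
import re

# Unrolled-loop pattern for a C block comment: /* ... */
BLOCK_COMMENT = r"/\*[^*]*\*+(?:[^/*][^*]*\*+)*/"
# Double-quoted string literal, allowing backslash escapes.
STRING_LITERAL = r'"(?:\\.|[^"\\])*"'

# Group 1 captures a comment, group 2 captures a string literal.
C_TOKEN_RE = re.compile("(%s)|(%s)" % (BLOCK_COMMENT, STRING_LITERAL))

def strip_block_comments(text):
    """Replace each /* ... */ comment with one space; keep strings as-is."""
    def replace(match):
        if match.group(1) is not None:
            return " "            # a comment: collapse it to a single space
        return match.group(2)     # a string literal: return it unchanged
    return C_TOKEN_RE.sub(replace, text)

if __name__ == "__main__":
    sample = 'a = b /* x / y */ / c; s = "/* not a comment */";'
    print(strip_block_comments(sample))
```

Matching strings in the same pass is what keeps a "/*" inside a literal from being treated as the start of a comment. Note that collapsing a multi-line comment to one space also drops its newlines.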
remove_comments.py:

import re
import sys

def remove_comments(text):
    """ remove c-style comments.
        text: blob of text with comments (can include newlines)
        returns: text with comments removed
    """
    pattern = r"""
        ## --------- COMMENT ...

I am trying to search through various text and highlight certain search terms within that text using HTML markup. As an example, if I take a paragraph of text from Paul Prescod's essay, I would like to highlight the search terms "lisp", "python", "perl", "java", and "C", each in a different color. My first attempt at this problem looked something like:
for sentence in re.split(r"[?.]\s+", text):
    match = re.search(r"\blisp\b", sentence, re.I)
    if match:
        color = 'red'
    else:
        match = re.search(r"\bpython\b", sentence, re.I)
        if match:
            color = 'blue'
        else:
            match = re.search ...

Fredrik Lundh wrote a good article called Using Regular Expressions for Lexical Analysis which explains how to use Python regular expressions to read an input string and group characters into lexical units, or tokens. The author's first group of examples read in a simple expression, "b = 2 + a*10", and output strings classified as one of three token types: symbols (e.g. a and b), integer literals (e.g. 2 and 10), and operators (e.g. =, +, and *). His first three examples use the findall method and his fourth example uses the undocumented scanner method from the re module. Here ...
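To make the three token types concrete, here is a small sketch of the findall approach. It is my own illustration rather than code from Lundh's article, and the names TOKEN_RE and tokenize are hypothetical. Each alternative in the pattern gets its own group, so exactly one group is non-empty for every match.

```python
import re

# One group per token type: symbol, integer literal, operator.
# Leading whitespace is skipped; \S keeps whitespace from being
# classified as an operator.
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z_]\w*)|(\d+)|(\S))")

def tokenize(expr):
    """Return a list of (token_type, text) pairs for a simple expression."""
    tokens = []
    for symbol, integer, operator in TOKEN_RE.findall(expr):
        if symbol:
            tokens.append(("symbol", symbol))
        elif integer:
            tokens.append(("integer", integer))
        elif operator:
            tokens.append(("operator", operator))
    return tokens

if __name__ == "__main__":
    for token in tokenize("b = 2 + a*10"):
        print(token)
```

With several groups in the pattern, findall returns one tuple per match, which is why the loop can unpack three values and test which one is filled in.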
I often process text line by line using the splitlines() method with a for loop. This works great most of the time. Sometimes, however, the text is not neatly divisible into lines, or I need to match multiple items per line. This is where the re module's finditer function can help. finditer returns an iterator over all non-overlapping matches for the regular expression pattern in the string. (See docs.) It is ...
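As a quick illustration of matching several items per line, here is a minimal sketch that pulls key=value pairs out of a block of text with finditer. The pattern, the sample text, and the names PAIR_RE and find_pairs are invented for this example.

```python
import re

# Matches simple key=value pairs such as "a=1".
PAIR_RE = re.compile(r"(\w+)=(\w+)")

def find_pairs(text):
    """Return (key, value, offset) for every key=value pair in text."""
    return [(m.group(1), m.group(2), m.start())
            for m in PAIR_RE.finditer(text)]

if __name__ == "__main__":
    sample = "a=1 b=2\nc=3"
    for key, value, offset in find_pairs(sample):
        print("%s -> %s (at offset %d)" % (key, value, offset))
```

Because finditer yields match objects rather than plain strings, each result carries its groups and its position in the original text, and line boundaries do not matter.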
I'm Eliot and this is my notepad for programming topics such as Python, Django, Ubuntu, Emacs, etc.