From: Alin Pilkington (AlinP@AOL.COM)
Date: 14 Sep 98, 04:47 EST
Subject: Parsing for C function
I'm trying to parse a C file and extract the *names* of the functions
which are defined in the file. I am using as a starting point the regular
expression pattern that is the default in the wonderful text editor
Alpha. The pattern is:
^([^ \t\(#\r/@].*[ \t]+)?\*?([A-Za-z0-9~_]+(<[^>]*>)?::[-A-Za-z0-9~_+=
<>\|\*/]+|[A-Za-z0-9~_]+)[ \t\r]*\(
and I don't claim to have a clue as to how it works. This pattern is used
within an Alpha proc called
C++::parseFuncs {}
that in turn uses a built-in Alpha function (search) to parse for the
function name. My problems appear to be many, but here is my approach (and
results):
Step 1) read the file in one big block
Results:
a) This step goes ok
Step 2) apply the pattern above and use the Tcl command regexp to find
the function name
Results:
a) First, the Tcl command regexp doesn't like the pattern, and I have to
go in and add backslashes to the special characters like [] so the Tcl
interpreter doesn't view them as part of a command. I think this is only
necessary because the pattern contains things like '\t' and '\r', which must
be substituted. In Alpha, this pattern was delimited with {} to avoid
that problem, but it can't be passed into regexp that way, can it? I
thought substitution had to be done on the pattern before passing
it into regexp.
b) Second, after I mangle things up with all of my backslashes, I get an
error from regexp saying I have nested +?*
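
To make Steps 1 and 2 concrete, here is a minimal sketch of the same
approach in Python rather than Tcl, so it says nothing about the Tcl
quoting problem described above. It assumes the two wrapped pattern lines
join with no character between them. The names FUNC_PATTERN,
function_names and SAMPLE are made up for illustration. Note that the
pattern accepts any unindented line that looks like "name(", so prototypes
and column-zero calls would also be reported; treat the output as
candidates, not confirmed definitions.

```python
import re

# The Alpha pattern, copied from the two wrapped lines above.
# ASSUMPTION: the line break inside the pattern hides no extra character.
FUNC_PATTERN = re.compile(
    r"^([^ \t\(#\r/@].*[ \t]+)?\*?([A-Za-z0-9~_]+(<[^>]*>)?::[-A-Za-z0-9~_+="
    r"<>\|\*/]+|[A-Za-z0-9~_]+)[ \t\r]*\(",
    re.MULTILINE,  # let ^ match at the start of every line, not just the file
)

# Hypothetical stand-in for "read the file in one big block" (Step 1).
SAMPLE = """#include <stdio.h>

static int add(int a, int b)
{
    return a + b;
}

char *get_name(void)
{
    return "x";
}

int main(void)
{
    return add(1, 2);
}
"""


def function_names(source):
    """Return the second capture group (the name) of every match (Step 2)."""
    return [m.group(2) for m in FUNC_PATTERN.finditer(source)]


if __name__ == "__main__":
    for name in function_names(SAMPLE):
        print(name)
```

The second parenthesized group is the one that holds the function name;
the first optional group swallows the return type and any qualifiers in
front of it.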
I'm going to post this to two mailing lists and go running off to Borders
bookstore to try and find a good book on regular expressions so I can
try and decipher the pattern above. Any help from a kind-hearted soul
on how to extract the names of C functions defined in a file would be
greatly appreciated!
Many Thanks...
Alin Pilkington