Thanks to perl(1), we can use a non-greedy match to remove several
C89-style comments on the same line.  And by splitting (and slightly
modifying) the other sed(1) call, we can handle multi-line comments that
start on the same line where the previous one ends.
Now the only remaining issue is nested comments (one comment style
opening inside the other), but those are rare, so let's ignore them
for now.
$ cat com.c
/* let's see */ int foo/*how about here?*/; // meh
int bar; /* Let's see
how this
behaves */ int baz; /* like this?
like that! */ int qwe; // asdasd
$ sed_rm_ccomments <com.c
int foo;
int bar;
int baz;
int qwe;
Here's what can and will be problematic:
// /*
int foo;
/* */
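The failure can be reproduced with a small sketch like the one below.
It copies the patched function from this commit and assumes GNU sed(1)
and perl(1) are available; the 'out' variable and the messages printed
are only for illustration.

```shell
# Copy of the patched sed_rm_ccomments() from this commit.
sed_rm_ccomments()
{
	perl -p -e 's%/\*.*?\*/%%g' \
	|sed -E '\%/\*%, \%\*/% {\%(\*/|/\*)%!d}' \
	|sed -E '\%/\*% {s%/\*.*%%; n; s%.*\*/%%;}' \
	|sed -E '\%/\*% {s%/\*.*%%; n; s%.*\*/%%;}' \
	|sed 's%//.*%%';
}

# The '/*' inside the '//' comment opens a range that never sees a
# closing '*/', because perl already removed the complete '/* */'.
out=$(printf '%s\n' '// /*' 'int foo;' '/* */' | sed_rm_ccomments)

if printf '%s\n' "$out" | grep -q 'int foo'; then
	echo 'declaration kept'
else
	echo 'declaration lost'
fi
```

Here the declaration is dropped, since the range expression treats
every following line as part of an unterminated comment.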
Signed-off-by: Alejandro Colomar <alx@kernel.org>
# C
# sed_rm_ccomments() removes C comments.
-# It can't handle multiple comments in a single line correctly,
-# nor mixed or embedded //... and /*...*/ comments.
+# It can't handle mixed //... and /*...*/ comments.
# Use as a filter (see man_lsfunc() in this file).
sed_rm_ccomments()
{
- sed 's%/\*.*\*/%%' \
- |sed -E '\%/\*%,\%\*/%{\%(\*/|/\*)%!d; s%/\*.*%%; s%.*\*/%%;}' \
+ perl -p -e 's%/\*.*?\*/%%g' \
+ |sed -E '\%/\*%, \%\*/% {\%(\*/|/\*)%!d}' \
+ |sed -E '\%/\*% {s%/\*.*%%; n; s%.*\*/%%;}' \
+ |sed -E '\%/\*% {s%/\*.*%%; n; s%.*\*/%%;}' \
|sed 's%//.*%%';
}