Coder's Guild Mailing List

Re: Pattern matching with unix/linux

Posted by Peter Palfrader on 2000-02-26

Hi Benjamin!

The Swiss Army chainsaw, a.k.a. Perl, is your friend.


how about this one:
(file is the input you gave)

weasel@marvin:~$ perl -e '
> $/=undef;
> $txt=<>;
> while ($txt=~/(\(.*?\))/gs) {
>  $tmp = $1;
>  $tmp=~s/[\s\n\r]+/ /g;
>  print $tmp,"\n";
> }; ' < file
( b c d e f )
( b )
( c d e )
( b )
( d )
weasel@marvin:~$ 

(You can write it on a single line, but it would not look good in
email.)



Another solution uses tr together with sed. The problem (with either
sed or me) was that sed always wants to process its input line by line,
so an expression spanning several lines never matched; tr joins the
lines first:

weasel@marvin:~$ cat file | tr '\n' ' ' | sed -e '
> s/^[^(]*(/(/g;
> s/)[^)]*$/)/g;
> s/)[^(]*(/) (/g;
> s/  */ /g'; echo
( b c d e f ) ( b ) ( c d e ) ( b ) ( d )
weasel@marvin:~$ 

Note that sed prints the result on a single line with no trailing
newline, hence the echo at the end.
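
As an aside on the line-by-line problem: sed can be talked into joining
the lines itself by looping over its N command, which appends the next
input line to the pattern space. The following is only a sketch of that
idiom, not what was used above; join_lines is a made-up name and the
sample input is invented.

```shell
# Sketch only: join_lines is a hypothetical helper, sample input is invented.
join_lines() {
  # :a      label to jump back to
  # $!{...} on every line except the last: append the next line (N), loop (ba)
  # after the last line is appended, turn the embedded newlines into spaces
  sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/\n/ /g' "$@"
}

printf 'a ( b c\nd e f ) a\n( b )\n' | join_lines
# prints: a ( b c d e f ) a ( b )
```

This holds the whole input in memory, just like the tr pipeline does in
effect, so it only makes sense for files of moderate size.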


This one might be better for you:
weasel@marvin:~$ cat file | tr '\n' ' ' | sed -e '
> s/^[^(]*(/(/g;
> s/)[^)]*$/)/g;
> s/)[^(]*(/)\
> (/g;
> s/  */ /g'; echo
( b c d e f )
( b )
( c d e )
( b )
( d )
weasel@marvin:~$ 

The third substitution ends in a backslash followed immediately by a
newline character. If you don't want this, insert an underscore "_" or
something like it instead and pipe sed's output through tr '_' '\n'.
If you find a way to insert a newline without a real newline in the
code, please tell me.
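
The underscore variant could look roughly like this. It is a sketch
under the assumption that "_" never occurs in the data; split_groups is
a made-up name and the sample input is invented. (GNU sed should also
accept \n in the replacement part, but that is an extension and may not
work with other seds.)

```shell
# Sketch only: split_groups is a hypothetical helper, sample input is invented.
# Assumes the data never contains "_", which serves as a stand-in for newline.
split_groups() {
  tr '\n' ' ' |
    sed -e 's/^[^(]*(/(/' \
        -e 's/)[^)]*$/)/' \
        -e 's/)[^(]*(/)_(/g' \
        -e 's/  */ /g' |
    tr '_' '\n'
  echo   # supply the final newline
}

printf 'a ( b c\nd e f ) a\n( b ) a\n' | split_groups
# prints:
# ( b c d e f )
# ( b )
```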


Hope that helps.

-- 
Weasel                            http://www.cosy.sbg.ac.at/~ppalfrad/
PGP/GPG encrypted messages preferred. See my site or finger -l ppalfrad
          Yes means No and No means Yes. Delete all files [Y]?