grep problem

ralph.lindenfeld · April 8, 2008, 6:07pm

Hi Folks-

I’ve read through the postings here and elsewhere, but can’t seem to get a handle on this. I confess I’m a UNIX noob, for starters.

I’d like to search a text file for strings between xml tags. For starters, I’m trying to get the script to find a string in a file (in this case, my name: Ralph).


set f to choose file
set theresult to (do shell script "grep Ralph " & (quoted form of POSIX path of f))

That doesn't work, nor does this:


set f to choose file
set theresult to (do shell script "tr '\\r' '\\n' > " & quoted form of POSIX path of f & " | grep Ralph")

any help to point me in the right direction would be appreciated.

thanks!

Ralph

James_Nierodzik · April 8, 2008, 6:15pm

Hmm, your first example should work and does in fact do so for me. What is it doing for you?

StefanK · April 8, 2008, 6:30pm

Hi,

@James

grep/awk only work reliably if the text file has LF line endings; they fail with CR line endings.

@Ralph

try this


set f to choose file
set theresult to (do shell script "/usr/bin/tr '\\r' '\\n' < " & (quoted form of POSIX path of f) & " | /usr/bin/grep Ralph")
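
For anyone who wants to see the line-ending issue outside AppleScript, here is a minimal shell sketch. The sample file and its contents are made up for illustration; only the `tr` and `grep` usage comes from the thread.

```shell
# Hypothetical sample file: three "lines" separated by CR only (classic Mac endings).
sample=$(mktemp)
printf 'first line\rRalph was here\rlast line\r' > "$sample"

# grep splits on LF, so the whole CR-delimited file counts as one matching line.
whole_file_hits=$(grep -c Ralph "$sample")

# Read the file with "<" and translate CR to LF before grep sees it.
# (Writing "> file" at this point instead would truncate the file.)
matches=$(tr '\r' '\n' < "$sample" | grep Ralph)
echo "$matches"   # prints: Ralph was here

rm -f "$sample"
```

Without the `tr` step, grep would print the entire file as a single matching line, which is probably why the whole file can end up in the result.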

ralph.lindenfeld · April 8, 2008, 6:37pm

Strangely enough, it’s ‘working’ on my end now as well. That is, it’s dumping the entire contents of the file into the results pane. Before it was completely erasing the contents of the file. Now I need to figure out how to extract text between tags and step through the file sequentially.

-Ralph

mark_hunte · April 8, 2008, 7:48pm

grep is case sensitive.

It has an option to ignore case

grep -i

set f to choose file
set theresult to (do shell script "grep -i Ralph " & (quoted form of POSIX path of f))
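
A quick way to check the difference in a shell, sketched with a made-up sample file:

```shell
# Hypothetical sample file with the name in different cases.
sample=$(mktemp)
printf 'ralph in lowercase\nRALPH in uppercase\nsomeone else\n' > "$sample"

# Case-sensitive search: no line contains exactly "Ralph".
# grep exits non-zero when nothing matches, hence the "|| true".
sensitive=$(grep -c Ralph "$sample" || true)

# -i ignores case, so both name lines match.
insensitive=$(grep -ci Ralph "$sample")

echo "$sensitive $insensitive"   # prints: 0 2
rm -f "$sample"
```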

ralph.lindenfeld · April 8, 2008, 8:30pm

To pull a string out from between two tags it seems like I should be using the ‘sed’ command.

the following code generates “” from text between <findthistag> and </findthattag>

Eventually I will need to figure out how to escape the “/”, since in reality the tags will be identical except for the “/”… but one thing at a time.


set f to choose file
set theresult to (do shell script "sed -n '/findthistag/,/findthattag/p' " & (quoted form of POSIX path of f) & " > begin_end_blocks")

any help would be greatly appreciated!

-Ralph
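
On the question of escaping the “/”: a rough shell sketch of two common options, assuming a hypothetical `<note>` tag and a made-up sample file. Inside an AppleScript `do shell script` string, each backslash would additionally need to be doubled, as in the other examples in this thread.

```shell
# Hypothetical sample file; the <note> tag name is only for illustration.
sample=$(mktemp)
printf '%s\n' 'header' '<note>' 'text inside' '</note>' 'footer' > "$sample"

# Option 1: escape the slash inside the /.../ address.
block=$(sed -n '/<note>/,/<\/note>/p' "$sample")

# Option 2: use a different delimiter; \# starts an address delimited by #.
block_alt=$(sed -n '\#<note>#,\#</note>#p' "$sample")

# Drop the first and last lines (the tag lines) to keep only what sits between them.
inner=$(printf '%s\n' "$block" | sed '1d;$d')
echo "$inner"   # prints: text inside

rm -f "$sample"
```

This only works when the opening and closing tags sit on their own lines; sed ranges are line-based, so tags sharing a line with the text need a different approach.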

ralph.lindenfeld · April 9, 2008, 6:00pm

Folks-

After all this grepping, a coworker steered me in the direction of XSLT, pointing out that there are commands specifically designed to parse XML. I’ve downloaded both the Smile and LateNight OSAX files. Does anyone have a preference as to which one is better/faster/more stable?

thanks,

Ralph

Bruce_Phillips · April 9, 2008, 10:26pm

Grep example:

set test to "the following code generates \"\" from text between\n\n<findthistag> and </findthattag>\n\neventually."

do shell script "echo " & quoted form of result & " | /usr/bin/grep --only-matching '>[^<]\\+' | /usr/bin/colrm 1 1"
set example to result
--> " and "

I would try the Late Night OSAX or System Events.
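
The pipeline from the grep example above can also be tried directly in a shell. This sketch swaps in `[^<][^<]*` for `[^<]\+` and `cut -c2-` for `colrm 1 1`; those substitutions are assumptions made for portability, not what the original post used.

```shell
# One line of the test string from the example above.
line='<findthistag> and </findthattag>'

# -o prints only the matched part: a ">" followed by at least one non-"<" character.
# cut -c2- then drops the leading ">".
example=$(printf '%s\n' "$line" | grep -o '>[^<][^<]*' | cut -c2-)

printf '[%s]\n' "$example"   # prints: [ and ]
```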