Search through text (multiline pattern)

I’m looking for a terminal command that searches a text file for a given multiline string and tells me whether or not it found it.

Let’s say that I want to find the string:

In a text file like:

I tried the sed command, but it searches for and replaces the string and echoes nothing back (at least I couldn’t find out how to make it do so)…

I also tried grep (and its counterpart fgrep), which works nicely for a single-line pattern: it returns 0 if it doesn’t find the string and another number if it does. Here is the code I’m using.

But when I use it with a multiline pattern, it just returns 0 whether or not the file contains the string.

I think I could do this with the awk command, but I can’t find the proper way to search for a multiline string. I don’t want to use a regex, since I don’t know in advance the string the user needs to search for, and I would prefer a command I can run straight from the terminal, so that I can call it through “do shell script” (though I suppose I could also create a temporary file with the awk program).

I’m sure there are tons of ways to do this job with ease.



Model: MacBook Intel Core 2 Duo
AppleScript: 2.3 (118) AppleScript 2.1.1
Browser: Safari 531.21.10
Operating System: Mac OS X (10.6)

Hello Giovanni

Grep’s “-q” option means “quiet” (do not write anything to standard output; exit immediately with zero status if any match is found, even if an error was detected).

Try something like:
Option -i (ignore case distinctions in both the PATTERN and the input files):
/usr/bin/grep -i 'my pattern' myFile.txt

Option -E (interpret PATTERN as an extended regular expression):
/usr/bin/grep -Ei 'my pattern' myFile.txt
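
For the single-line case, the exit status of grep -q can be tested directly. Here is a minimal sketch; the temporary file and its contents are made up for illustration, and -F is added on the assumption that the search string should be taken literally rather than as a regex:

```shell
# Hypothetical sample file, created only for this illustration.
haystack=$(mktemp "${TMPDIR:-/tmp}/haystack.XXXXXX")
printf 'Last morning I was searching\nfor strings\n' > "$haystack"

# -q prints nothing; -F treats the pattern as a literal string.
# The exit status is 0 when a match is found and 1 when there is none.
if grep -qF 'for strings' "$haystack"; then
    found=yes
else
    found=no
fi

if grep -qF 'no such text' "$haystack"; then
    missing=yes
else
    missing=no
fi

echo "found=$found missing=$missing"
rm -f "$haystack"
```

Note that when a command run through “do shell script” exits with a non-zero status, AppleScript raises an error, so the "not found" case would need a try block or a trailing "|| true".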

Are the lines of your file delimited by a line feed?

grep treats each line of a multi-line pattern as a separate pattern (in all modes: -F, -G, -E) and reports a match if any one of them matches. So it may look like it works, but it will also report success for repeated lines, missing lines, and out-of-order lines. grep is highly line oriented.
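
A small sketch of that pitfall (the file contents are invented for the demonstration): the two pattern lines are present in the file, but out of order and not adjacent, and grep still reports a match.

```shell
# Hypothetical sample file: both pattern lines occur, but in the wrong
# order and separated by another line.
haystack=$(mktemp "${TMPDIR:-/tmp}/haystack.XXXXXX")
printf 'the string\nsomething else\nThis is\n' > "$haystack"

needle='This is
the string'

# grep splits the pattern at the newline and looks for either line on
# its own, so it succeeds even though the two-line block is not there.
if grep -qF "$needle" "$haystack"; then
    grep_says=match
else
    grep_says=nomatch
fi

echo "$grep_says"
rm -f "$haystack"
```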

Here is a quick and dirty way to search for literal, multiline strings with Perl. The 'dirty' part is that this implementation reads in the entire contents of each file instead of doing the line-by-line (or character-by-character) scan that is possible. It works like grep -l in that it prints out the names of files that match and exits with 0 if any matched (1 otherwise). The 'pattern' is matched literally (it is not a regular expression).

set needle to "This is
the string
that i want to search"
set haystack to "Last morning I was searching
for strings
and I was thinking
This is
the string
that i want to search"

do shell script "printf %s " & quoted form of haystack & " | " & "perl -e '$p=shift;undef $/;$\\=qq(\\n);$c=0;while($t=<>){if($t=~/^\\Q$p\\E$/m){$c++;print $ARGV}};exit 0 if $c;exit 1' " & quoted form of needle
--> "-" because stdin matches

Model: iBook G4 933
AppleScript: 1.10.7
Browser: Safari 4.0.5 (4531.22.7, r55974)
Operating System: Mac OS X (10.4)

Thanks chrys and thanks clemhoff,

I think I will use the perl option; it works nicely and suits my needs. Reading long text files may be a bit time-consuming, so I will add a progress bar :)

In my application I’d prefer to read the file directly from the terminal (without passing it through a variable inside my app), so I changed your code a bit.
I’m posting the code here in case it is useful to somebody, as yours was to me.

set needle to "This is
the string
that i want to search"
set unixPath to "text.txt"
do shell script "cat " & quoted form of unixPath & " | " & "perl -e '$p=shift;undef $/;$\\=qq(\\n);$c=0;while($t=<>){if($t=~/^\\Q$p\\E$/m){$c++;print $ARGV}};exit 0 if $c;exit 1' " & quoted form of needle
--> "-" because stdin matches

Best Regards


Sorry, I misread the topic.

With a small adjustment to what I wrote in my last post, the script works well when it reads the file directly with the cat command:

set needle to "<head>This is
the string
that i want to search" -- A string with an html tag
set unixPath to "text.txt"

set upHeadOrig to "This is
the string
that i want to search"
set j to 1
set theOrigReplace to {"<head>" & (upHeadOrig as Unicode text), "<meta meta>", "<mota mota>"}
do shell script "cat " & quoted form of unixPath & " | " & "perl -e '$p=shift;undef $/;$\\=qq(\\n);$c=0;while($t=<>){if($t=~/^\\Q$p\\E$/m){$c++;print $ARGV}};exit 0 if $c;exit 1' " & quoted form of item j of theOrigReplace

I was wrong: it works perfectly, but it only finds a string that matches complete lines,

so that if my file is:

And the search pattern is:

The perl script does not find anything in the file.

Is there a way to avoid this?

Thanks in advance


I solved the problem by initializing an NSString with the contents of the text file and then searching inside it:

set theOrigReplace to {"I always search for stuff like this", "or that"}
set theOrigPath to "/I Store my Files always here/likethisone.txt"
set j to 1 -- index of the item to search for, as in the earlier script
-- Read the whole file into an NSString
set MyBestText to NSString's alloc()'s initWithContentsOfFile_(theOrigPath)
-- The returned range has length 0 when the string is not found
set myresult to MyBestText's rangeOfString_(item j of theOrigReplace)
set myalready to |length| of myresult
if myalready is not 0 then
	display dialog " I FOUND YOU!!!"
else
	display dialog "WHERE ARE YOU?!"
end if

It is a bit time- and resource-consuming, but it works nicely.

Thanks to everybody.

Giovanni Medici

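Remove the ^ meta-character from the regular expression in the script. It anchors the pattern to the beginning of a line; removing it lets your first line start matching in the middle of a line. The $ at the end likewise anchors the pattern to the end of a line, so remove it as well if the last line of your pattern may end before the end of a line.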

I see that you have gone to an AppleScriptObjC variation that is quite similar in overall nature (read the entire file, search for the string). Both probably have similar performance characteristics. The AppleScriptObjC version might end up being faster, since there is no overhead from the extra processes (shell, perl, etc.).