Curl | RegEx

I am trying to get the number following n= and the valid through date for only those pages that have content other than


For this URL:
http://reg.e.staples.com/c/s/tagfrm/hBOJnz1B7SHGCB8ca0OCYWUNv53/coupon.html?n=9820

I would want:
9820
10/27/12

This gets the job done but it is ugly:

set xxx to do shell script "curl 'http://reg.e.staples.com/c/s/tagfrm/hBOJnz1B7SHGCB8ca0OCYWUNv53/coupon.html?n=[9820-9823]' | grep -Eo 'coupon.html\\?n=[0-9]*|through [0-9][0-9]?/[0-9][0-9]?/[0-9][0-9][0-9]?[0-9]?' | grep -B 1 through | sed 's/[^0-9\\/]//g' | grep '[0-9]'"

Can anyone (cough Nigel) help me clean this up?

I don’t know if you’d regard this as more “beautiful”: :wink:

set xxx to do shell script "curl 'http://reg.e.staples.com/c/s/tagfrm/hBOJnz1B7SHGCB8ca0OCYWUNv53/coupon.html?n=[9820-9823]' | sed -En '/.*coupon.html\\?n=([0-9]*)$/ {s::\\1: ; h ; } ; /.*through ([0-9]{1,2}\\/[0-9]{1,2}\\/[0-9]{2,4}).*/ {s::\\1: ; x ; G ; p ; } ;'"
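
In case it helps with decoding, the script leans on sed's hold space. Here is a rough annotated sketch of the same logic as a plain shell script, spread over several lines and using "/" instead of ":" as the s delimiter. The sample text is made up for illustration (the example.com lines are only an assumption about the shape of the input, not real output from the coupon pages), and "extract" is just a name chosen for this sketch:

```shell
# Hypothetical sample input: one line ending in "coupon.html?n=<number>"
# per page, and a "through <date>" line only on pages that have a coupon.
sample='--_curl_--http://example.com/coupon.html?n=9820
<p>Valid through 10/27/12 only.</p>
--_curl_--http://example.com/coupon.html?n=9821
<p>Sorry, this offer is unavailable.</p>'

extract() {
  sed -En '
    # Line ending in coupon.html?n=<digits>: reduce it to the digits
    # (the empty regex in s reuses the address regex) and hold them (h).
    /.*coupon\.html\?n=([0-9]*)$/ {
      s//\1/
      h
    }
    # Line containing "through <date>": reduce it to the date, swap it
    # with the held number (x), append the date after it (G), print (p).
    /.*through ([0-9]{1,2}\/[0-9]{1,2}\/[0-9]{2,4}).*/ {
      s//\1/
      x
      G
      p
    }
  '
}

result=$(printf '%s\n' "$sample" | extract)
printf '%s\n' "$result"
```

Because -n suppresses automatic printing and p only runs on a "through" line, a page without a date prints nothing; its number is simply overwritten in the hold space when the next page's number arrives.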

Thank you, Nigel! Before I sit down and try to decode this, I have a quick question: are these two equivalent?

set xxx to "abcdefg"
set yyy to do shell script "echo " & quoted form of xxx & " | sed -e 's/abc/cat /' -e 's/cat/dog/'"
set zzz to do shell script "echo " & quoted form of xxx & " | sed 's/abc/cat /;  s/cat/dog/'"

Yes. :slight_smile:
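
To spell that out a little: sed joins multiple -e expressions into a single script, one command per line, and for a command like s a semicolon separates commands in the same way. A small shell-only sketch of the same comparison, using the variable names from the question:

```shell
xxx='abcdefg'

# Two separate -e expressions.
yyy=$(printf '%s\n' "$xxx" | sed -e 's/abc/cat /' -e 's/cat/dog/')

# One script with the commands separated by a semicolon.
zzz=$(printf '%s\n' "$xxx" | sed 's/abc/cat /;  s/cat/dog/')

printf '%s\n' "$yyy" "$zzz"
```

Both variables end up holding "dog defg". The one caveat I would keep in mind is that commands which take free text or a file name (a, i, c, r, w) do not stop at a semicolon, so for those it is safer to use a separate -e or a newline.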