I am trying to get the number following n= and the valid through date for only those pages that have content other than
For this URL:
http://reg.e.staples.com/c/s/tagfrm/hBOJnz1B7SHGCB8ca0OCYWUNv53/coupon.html?n=9820
I would want:
9820
10/27/12
This gets the job done but it is ugly:
set xxx to do shell script "curl 'http://reg.e.staples.com/c/s/tagfrm/hBOJnz1B7SHGCB8ca0OCYWUNv53/coupon.html?n=[9820-9823]' | grep -Eo 'coupon.html\\?n=[0-9]*|through [0-9][0-9]?/[0-9][0-9]?/[0-9][0-9][0-9]?[0-9]?' | grep -B 1 through | sed 's/[^0-9\\/]//g' | grep '[0-9]'"
Can anyone (cough Nigel) help me clean this up?
I don’t know if you’d regard this as more “beautiful”:
set xxx to do shell script "curl 'http://reg.e.staples.com/c/s/tagfrm/hBOJnz1B7SHGCB8ca0OCYWUNv53/coupon.html?n=[9820-9823]' | sed -En '/.*coupon.html\\?n=([0-9]*)$/ {s::\\1: ; h ; } ; /.*through ([0-9]{1,2}\\/[0-9]{1,2}\\/[0-9]{2,4}).*/ {s::\\1: ; x ; G ; p ; } ;'"
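For anyone decoding the one-liner, here is a rough annotated sketch of the same sed script run directly in a shell (so with single backslashes instead of the doubled ones AppleScript needs). The sample text, the example.com URL and the extract function name are made up for illustration. It assumes the curl output contains a line ending in coupon.html?n=<number> ahead of each page, which is what the first pattern appears to rely on.

```shell
# Hypothetical stand-in for the curl output. The real page markup is not shown
# in the thread; the separator lines are an assumption about what precedes
# each page when a URL range is fetched.
sample='--_curl_--http://example.com/coupon.html?n=9820
<p>Coupon valid through 10/27/12 in stores.</p>
--_curl_--http://example.com/coupon.html?n=9821
<p>Nothing to see here.</p>'

# extract is an illustrative name, not something from the original posts.
extract() {
  sed -En '
    # A line ending in coupon.html?n=<digits>: reduce it to the digits and
    # store them in the hold space (h). The empty pattern in s::\1: reuses
    # the address regex, and ":" is only the delimiter.
    /.*coupon.html\?n=([0-9]*)$/ {
      s::\1:
      h
    }
    # A line containing "through <date>": reduce it to the date, swap it with
    # the held number (x), append the date after the number (G), then print.
    /.*through ([0-9]{1,2}\/[0-9]{1,2}\/[0-9]{2,4}).*/ {
      s::\1:
      x
      G
      p
    }
  '
}

result=$(printf '%s\n' "$sample" | extract)
printf '%s\n' "$result"
```

With the sample above this should print 9820 and 10/27/12 on separate lines. The second page produces nothing because it has no "through" date, so the p command never runs for it.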
Thank you, Nigel! Before I sit down and try to decode this, I have a quick question. Are these two equivalent?
set xxx to "abcdefg"
set yyy to do shell script "echo " & quoted form of xxx & " | sed -e 's/abc/cat /' -e 's/cat/dog/'"
set zzz to do shell script "echo " & quoted form of xxx & " | sed 's/abc/cat /; s/cat/dog/'"
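For a simple pair of substitutions like this, both forms should behave the same: sed runs the commands in order against each line, whether they arrive as separate -e expressions or as one semicolon-separated script. A minimal shell sketch to check it, with variable names chosen here purely for illustration:

```shell
# Same input as in the AppleScript example.
input='abcdefg'

# Two separate -e expressions, applied in order to each line.
with_e=$(printf '%s\n' "$input" | sed -e 's/abc/cat /' -e 's/cat/dog/')

# One script holding the same two commands, separated by a semicolon.
with_semicolon=$(printf '%s\n' "$input" | sed 's/abc/cat /; s/cat/dog/')

# Print both results so they can be compared by eye.
printf '%s\n%s\n' "$with_e" "$with_semicolon"
```

Both lines come out as "dog defg". The semicolon form is not always interchangeable, though: some commands (a, i, c, r, w and labels, depending on the sed) read everything up to the end of the line as their argument, so separate -e expressions or newlines are the safer choice once those are involved.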