Counting Special characters with Grep

Hello all
I am writing a script to count occurrences of all printable characters, including symbols, in a text file, then pipe the results into Excel.
Everything goes well until I reach the symbols. I have figured out almost all of them; backslash and “space” are the last to resist counting.
I have tried both the regular form with the escape backslash and the “quoted form of”, without any result.

Example(s) of my mess below.
Does anyone have a clue what I am doing wrong?

Test text: 5 spaces, 3 slashes, 3 backslashes.
b a
b  a
a  b
\\\
///

set the Space_count to do shell script "grep -o '[:space:]' " & Posix_File_Name & " | wc -w"
-- returns a count of 3; seems to count lines
set the Space_count to do shell script "grep -o " & quoted form of "\\ " & " " & Posix_File_Name & " | wc -w"
-- returns a count of 0

A (very nice) script was written by Bruce Phillips (included below), but there is a glitch in the counting of spaces and backslashes.
If a line contains consecutive spaces, they are counted as one; ditto for backslashes,
i.e.
\ = 1
\\ = 1
\\\ = 1

http://macscripter.net/viewtopic.php?pid=108998#p108998

display dialog "Count occurrences of this text:" default answer "\\" --escape backslash
set pattern to text returned of result

choose file with prompt "Search for text in files in this folder:" --modified from original "Choose folder >> Choose file"
set inputFolder to POSIX path of result

do shell script "/usr/bin/grep --count --no-filename --fixed-strings " & ¬
	quoted form of pattern & " " & ¬
	quoted form of inputFolder & "* " & ¬
	" | /usr/bin/grep --invert-match '^0$'" & ¬
	" | /usr/bin/python -c 'from sys import stdin
matches = 0
for line in stdin:
	matches += int(line)
print matches'"
set matches to result as integer
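
A note on why the counts above come out the way they do, with a sketch of one possible fix. Outside a bracket expression, '[:space:]' is just a list of the characters :, s, p, a, c and e, so the first attempt finds the three letters "a" in the test text. The second attempt does find the spaces, but wc -w counts words, and lines made only of spaces contain none. Finally, grep --count reports matching lines, not matches, which is why consecutive characters on one line count as 1. The sketch below assumes a POSIX shell and a grep that supports -o (GNU and BSD grep both do); count_char is a made-up helper name, not part of any script in this thread.

```shell
#!/bin/sh
# count_char CHAR FILE: print how many times CHAR occurs in FILE.
# (count_char is an illustrative name, not from the thread.)
count_char() {
    # -F treats CHAR as a literal string, so a backslash needs no regex escaping.
    # -o prints every match on its own line; wc -l then counts matches, not lines.
    # tr strips the padding some wc implementations add.
    grep -o -F -- "$1" "$2" | wc -l | tr -d ' '
}

# A sample matching the stated test text: 5 spaces, 3 slashes, 3 backslashes.
sample=$(mktemp)
printf '%s\n' 'b a' 'b  a' 'a  b' '\\\' '///' > "$sample"

echo "spaces:             $(count_char ' ' "$sample")"
echo "slashes:            $(count_char '/' "$sample")"
echo "backslashes:        $(count_char '\' "$sample")"
# For comparison: --count (-c) reports the number of lines containing a space.
echo "lines with a space: $(grep -c -F -- ' ' "$sample")"

rm -f "$sample"
```

From AppleScript, the character would be passed with quoted form of, so a single backslash is written "\\" in the AppleScript source and reaches grep unchanged.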

Try:

set xxx to read "/Users/Allthegoodnamesaregone/Desktop/test.txt"
tell application "ASObjC Runner"
	look for space in xxx
	look for "/" in xxx
	look for "\\\\" in xxx
end tell

If I got you right, you are looking for simple characters, which do not need grep patterns.

Plain vanilla AppleScript should be OK for that:

set f to choose file
set fh to open for access f
set t to read fh
close access fh

set od to AppleScript's text item delimiters
set AppleScript's text item delimiters to "\\"
set r to (count text items of t) - 1
set AppleScript's text item delimiters to od

r

This counts all occurrences of backslash.

Since text item delimiters can be a list, a basic whitespace counter is possible as well:

set AppleScript's text item delimiters to {" ", tab, return, character id 10}

Jürgen

Hi, John. This will of course error unless Allthegoodnamesaregone knows about “ASObjC Runner” and has it installed. :)

Hello! :)

This is a way with Bash:

This can be called with the do shell script command, and you get the total number of special characters as the return value.

I think you should save it as UTF-8 without a BOM and make it executable with chmod u+x thefile.sh before you run it. I'm not totally sure that is required, but it is what I would do.

[code]#!/bin/bash
# Prints the total number of spaces, backslashes and slashes in the file "$1".

# File size in bytes before and after deleting those three characters.
a=$(wc -c < "$1")
b=$(tr -d ' \\/' < "$1" | wc -c)
res=$((a - b))
echo "$res"[/code]
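
A variation on the script above, offered as a sketch rather than a replacement: instead of deleting the characters and comparing sizes, tr -cd can delete everything except one set of characters, and wc -c counts what is left. That gives a separate count per character and needs no temporary file. count_only is a made-up helper name, and the counts are bytes, so this assumes single-byte characters such as space, slash and backslash.

```shell
#!/bin/sh
# count_only SET FILE: count the bytes in FILE that belong to the tr set SET.
# (count_only is an illustrative name, not from the thread.)
count_only() {
    # -c complements SET and -d deletes, so only the bytes in SET survive.
    tr -cd "$1" < "$2" | wc -c | tr -d ' '
}

# A sample matching the stated test text: 5 spaces, 3 slashes, 3 backslashes.
sample=$(mktemp)
printf '%s\n' 'b a' 'b  a' 'a  b' '\\\' '///' > "$sample"

echo "spaces:      $(count_only ' ' "$sample")"
# tr reads \\ as one literal backslash.
echo "backslashes: $(count_only '\\' "$sample")"
echo "slashes:     $(count_only '/' "$sample")"
# The POSIX class also counts tabs and the newline at the end of each line.
echo "whitespace:  $(count_only '[:space:]' "$sample")"

rm -f "$sample"
```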

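A single backslash is written \\ in AppleScript source.
With quoted form, that backslash reaches unix as \.
Without quoting, the shell uses a backslash as an escape character, so the command needs \\ (\\\\ in AppleScript source) to deliver one backslash.
The same holds for backslash-n: quoted, the command contains \n; unquoted, it needs \\n.
Three backslashes are written \\\\\\ in AppleScript source.
Quoted, they reach unix as \\\.
Unquoted, the command needs \\\\\\ (twelve backslashes in AppleScript source) to deliver \\\.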

When you pipe, stdin is never quoted; only the arguments. That means that the notation in theString is correct and a count of 3 is correct as well.
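
The quoted versus unquoted behaviour can be checked directly in a shell. A small sketch, assuming a POSIX sh; count_backslashes is a made-up helper name. It uses printf '%s' rather than echo because some echo implementations interpret backslash escapes themselves, which would blur the comparison.

```shell
#!/bin/sh
# Count the backslashes that actually arrive on stdin.
# (count_backslashes is an illustrative name, not from the thread.)
count_backslashes() {
    tr -cd '\\' | wc -c | tr -d ' '
}

# Single-quoted, which is what AppleScript's "quoted form of" produces:
# the shell passes the three backslashes through untouched.
quoted=$(printf '%s' '\\\' | count_backslashes)

# Unquoted: the shell uses each backslash to escape the next character,
# so six typed backslashes arrive as three.
unquoted=$(printf '%s' \\\\\\ | count_backslashes)

printf 'quoted: %s, unquoted: %s\n' "$quoted" "$unquoted"
```

Both counts come out as 3, even though the unquoted command contains twice as many backslashes.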

set theString to "b a a a
b  a
a  b
\\\\\\\\\\\\
///"

--return a list in which non-printable characters appear by name, e.g. char 32 = sp and char 10 = nl
return do shell script "echo " & quoted form of theString & " | od -A n -t a | egrep -o '[^[:space:]]{1,}' | sort -n | uniq -c | sort -n -k 1 | awk '{printf(\"%s	%d\\n\", $2, $1)}'"
--return a list of decimal byte values with their counts
return do shell script "echo " & quoted form of theString & " | od -A n -t u1 | egrep -o '[[:digit:]]{1,}' | sort -n | uniq -c | awk '{printf(\"%d	%d\\n\", $2, $1)}'"
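
One caveat with the od pipelines above: by default od replaces repeated identical output lines with a single *, so a long run of the same character (16 or more identical bytes in a row, aligned to an output line) can be undercounted. The -v option turns that off. Below is a sketch of the decimal variant as a standalone shell function that reads a file instead of echoing a string; byte_counts is a made-up name, and the output is tab-separated so it can be saved as a .tsv file and opened in Excel. It counts bytes, so multi-byte characters show up as several values.

```shell
#!/bin/sh
# byte_counts FILE: print "decimal byte value<TAB>count" for every byte in FILE,
# sorted by byte value. (byte_counts is an illustrative name, not from the thread.)
byte_counts() {
    # od: -A n drops the offset column, -t u1 prints one unsigned decimal per byte,
    #     -v prints every line instead of collapsing repeats into "*".
    # awk: tally each field, then print value and count separated by a tab.
    od -A n -v -t u1 "$1" |
        awk '{ for (i = 1; i <= NF; i++) n[$i]++ }
             END { for (v in n) printf "%d\t%d\n", v, n[v] }' |
        sort -n
}

# Usage: sh byte_counts.sh test.txt > counts.tsv
if [ -n "${1:-}" ]; then
    byte_counts "$1"
fi
```

In the output, 32 is space, 47 is slash, 92 is backslash and 10 is the newline.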

Thanks to both of you,
As for “ASObjC Runner”, it is in my “To Google Next” box.

Thanks

Thanks to all for your input & help