Hello all
I am writing a script to count occurrences of all printable characters, including symbols, in a text file, and then pipe the results into Excel.
Everything goes well until you reach the symbols. I have figured out almost all of them, backslash and “space” being the last to resist counting.
I have tried both the regular form with the escape backslash and the “quoted form of” without any result.
Example(s) of my mess are below.
Does anyone have a clue what I am doing wrong?
Test text: 5 spaces, 3 slashes, 3 backslashes.
b a
b a
a b
\\\
///
set the Space_count to do shell script "grep -o '[:space:]' " & Posix_File_Name & " | wc -w"
-- returns a count of 3; seems to count lines
set the Space_count to do shell script "grep -o " & quoted form of "\ " & " " & Posix_File_Name & " | wc -w"
-- returns a count of 0
A (very nice) script was written by Bruce Phillips (included below), but there is a glitch when counting spaces and backslashes.
Spaces: if a line contains consecutive spaces, they are counted as one; ditto for backslashes, i.e.
\ = 1
\\ = 1
\\\ = 1
display dialog "Count occurrences of this text:" default answer "\\" -- escaped backslash
set pattern to text returned of result
choose file with prompt "Search for text in files in this folder:" --modified from original "Choose folder >> Choose file"
set inputFolder to POSIX path of result
do shell script "/usr/bin/grep --count --no-filename --fixed-strings " & ¬
quoted form of pattern & " " & ¬
quoted form of inputFolder & "* " & ¬
" | /usr/bin/grep --invert-match '^0$'" & ¬
" | /usr/bin/python -c 'from sys import stdin
matches = 0
for line in stdin:
matches += int(line)
print matches'"
set matches to result as integer
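The glitch described above looks like a consequence of grep's --count option, which reports the number of matching lines rather than the number of matches, so a line with several spaces or backslashes adds only 1. A minimal sketch of one way around it, run directly in a shell rather than through do shell script: let grep -o print every match on its own line and count those lines. The function name count_occurrences and the sample file are made up for illustration.

```shell
#!/bin/sh
# count_occurrences PATTERN FILE   (hypothetical helper name)
# Prints how many times the literal string PATTERN occurs in FILE.
count_occurrences() {
    # -o prints each match on its own line, -F treats the pattern as a
    # literal string (so a backslash needs no escaping), wc -l counts the lines.
    # tr strips the padding that some wc implementations add.
    grep -o -F -e "$1" -- "$2" | wc -l | tr -d ' '
}

# Throwaway sample file: five spaces, three backslashes, three slashes.
sample=$(mktemp)
printf '%s\n' 'b  a' 'b a' 'a  b' '\\\' '///' > "$sample"

count_occurrences ' ' "$sample"   # prints 5
count_occurrences '\' "$sample"   # prints 3
count_occurrences '/' "$sample"   # prints 3
rm -f "$sample"
```

From AppleScript the same pipeline should work with quoted form of applied to both the pattern and the POSIX path.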
set xxx to read "/Users/Allthegoodnamesaregone/Desktop/test.txt"
tell application "ASObjC Runner"
look for space in xxx
look for "/" in xxx
look for "\\\\" in xxx
end tell
If I understand you correctly, you are looking for single characters, which do not need grep patterns.
Plain vanilla AppleScript should be OK for that:
set f to choose file
set fh to open for access f
set t to read fh
close access fh
set od to AppleScript's text item delimiters
set AppleScript's text item delimiters to "\\"
set r to (count text items of t) - 1
set AppleScript's text item delimiters to od
r
This counts all occurrences of the backslash.
Since text item delimiters can be a list, a basic whitespace counter is possible as well:
set AppleScript's text item delimiters to {" ", tab, return, character id 10}
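For comparison, the same "single characters, no patterns" idea can be sketched in the shell with tr instead of grep: delete every byte that is not in the wanted set and count what remains. This is only an illustration; count_char and the sample file are hypothetical names, and the count is in bytes, so it suits single-byte characters such as space, tab, slash and backslash.

```shell
#!/bin/sh
# count_char SET FILE   (hypothetical helper name)
# Prints the number of bytes in FILE that belong to the tr set SET.
count_char() {
    # -c complements the set and -d deletes, so only bytes in SET survive;
    # wc -c counts them and the last tr strips wc's padding.
    tr -cd "$1" < "$2" | wc -c | tr -d ' '
}

# Throwaway sample file: three spaces, three backslashes, three newlines.
sample=$(mktemp)
printf '%s\n' 'b  a' 'a b' '\\\' > "$sample"

count_char ' ' "$sample"         # spaces only: prints 3
count_char '\\' "$sample"        # tr reads \\ as one backslash: prints 3
count_char ' \t\r\n' "$sample"   # space, tab, return, linefeed: prints 6
rm -f "$sample"
```

The last call mirrors the delimiter list {" ", tab, return, character id 10} used above.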
\ is written in AS as \\
\\ in AS is \\ in unix when it's in quoted form
\\ in AS is \\\\ in unix when it's not quoted
\\n in AS is \\n in unix when it's in quoted form
\\n in AS is \\\\n in unix when it's not quoted
\\\ is written in AS as \\\\\\
\\\\\\ in AS is \\\\\\ in unix when it's in quoted form
\\\\\\ in AS is \\\\\\\\\\\\ in unix when it's not quoted
When you pipe, stdin is never quoted; only the arguments are. That means that the notation in theString is correct, and a count of 3 is correct as well.
set theString to "b a a a
b a
a b
\\\\\\\\\\\\
///"
-- returns a list in which non-printable characters appear by name, for example char 32 = sp and char 10 = nl
return do shell script "echo " & quoted form of theString & " | od -A n -t a | egrep -o '[^[:space:]]{1,}' | sort -n | uniq -c | sort -n -k 1 | awk '{printf(\"%s %d\\n\", $2, $1)}'"
-- returns a list of decimal byte values and their counts
return do shell script "echo " & quoted form of theString & " | od -A n -t u1 | egrep -o '[[:digit:]]{1,}' | sort -n | uniq -c | awk '{printf(\"%d %d\\n\", $2, $1)}'"
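To see why the twelve backslashes typed in theString can end up as a count of 3, here is a small sketch of the last layer only. It assumes, and this is my assumption rather than something shown above, that the echo used by do shell script interprets backslash escapes; printf's %b is used here to imitate that behaviour portably, and %s to show the uninterpreted data. count_backslashes is a hypothetical helper name.

```shell
#!/bin/sh
# count_backslashes   (hypothetical helper name)
# Prints the number of backslash bytes arriving on stdin.
count_backslashes() {
    tr -cd '\\' | wc -c | tr -d ' '
}

# Six backslashes: what twelve typed in an AppleScript string literal become.
# Single quotes keep the shell from touching any of them.
raw='\\\\\\'

# %s passes the argument through untouched.
printf '%s\n' "$raw" | count_backslashes   # prints 6

# %b interprets backslash escapes, so every \\ pair collapses to one backslash.
printf '%b\n' "$raw" | count_backslashes   # prints 3
```

If the assumption about echo holds, replacing echo with printf '%s' in the commands above should make the counts reflect the characters actually in the string.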