Getting a list of numbers out of a text file?

I have a text file like this for example:
One two three 1234 four five 890 six 56 seven eight nine.

Now, I know that with this:
set theFile to choose file
set wordList to read (theFile) using delimiter {" "}
I can get a list of words like this:
One
two
three
1234
four
five
890
six
56
seven
eight
nine.
…but how can I get a list of just the numbers in the text file, like this:
1234
890
56

Kari


--set str to read (choose file)
--
--> "One two three 1234 four five 890 six 56 seven eight nine."

-- debugging:
set str to "One two three 1234 four five 890 six 56 seven eight nine."

set tokens to BreakAtNumbers(str)
--
--> {"One two three ", "1234", " four five ", "890", " six ", "56", " seven eight nine."}

set nums to {}

-- numbers are at even indexes:
--
repeat with i from 2 to tokens's length by 2
	set nums's end to tokens's item i
end repeat

nums --> {"1234", "890", "56"}


property kcAsc0 : ASCII character 0

on BreakAtNumbers(str)
	set k to kcAsc0 -- used as sentinel, must not be in str
	set astids to AppleScript's text item delimiters
	try
		repeat with i from 0 to 9
			set AppleScript's text item delimiters to i as string
			set str to str's text items
			set AppleScript's text item delimiters to k & i & k
			set str to str as string
		end repeat
		set AppleScript's text item delimiters to k & k
		set str to str's text items
		set AppleScript's text item delimiters to ""
		set str to str as string
		set AppleScript's text item delimiters to k
		set str to str's text items
	on error e number n from f to t partial result p
		set AppleScript's text item delimiters to astids
		error e number n from f to t partial result p
	end try
	set AppleScript's text item delimiters to astids
	return str
end BreakAtNumbers

Some better comments:


--	It's just faster to save a special character to a
--	property at compile-time than it is to call a
--	scripting addition command at runtime:
--
property kcAsc0 : ASCII character 0

on BreakAtNumbers(str)

	-- str == "abc 123 def"

	set k to kcAsc0 -- pretend == "~"

	set astids to AppleScript's text item delimiters
	try
		repeat with i from 0 to 9

			set AppleScript's text item delimiters to i as string
			set str to str's text items
			--
			--> loop 1: { "abc ", "23 def" }

			set AppleScript's text item delimiters to k & i & k
			set str to str as string
			--
			--> loop 1: "abc ~1~23 def"

		end repeat

		--> "abc ~1~~2~~3~ def"

		set AppleScript's text item delimiters to k & k -- "~~"
		set str to str's text items
		--
		--> { "abc ~1", "2", "3~ def" }

		set AppleScript's text item delimiters to ""
		set str to str as string
		--
		--> "abc ~123~ def"

		set AppleScript's text item delimiters to k
		set str to str's text items
		--
		--> { "abc ", "123", " def" }

	on error e number n from f to t partial result p
		set AppleScript's text item delimiters to astids
		error e number n from f to t partial result p
	end try
	set AppleScript's text item delimiters to astids
	return str
end BreakAtNumbers
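
For comparison, here is a minimal sketch of a plain character scan that collects runs of digits. It is not the delimiter technique used above, and it will be slower on long strings, but it may be easier to follow. The handler name DigitRuns and its variable names are made up for illustration.

```applescript
-- Hypothetical helper: returns each run of consecutive digits in str as a string.
on DigitRuns(str)
	set nums to {}
	set digits to "" -- the run of digits collected so far
	repeat with c in str's characters
		set ch to c's contents
		if ch is in "0123456789" then
			set digits to digits & ch
		else if digits is not "" then
			-- a non-digit ends the current run
			set nums's end to digits
			set digits to ""
		end if
	end repeat
	-- keep a run that reaches the end of the string
	if digits is not "" then set nums's end to digits
	return nums
end DigitRuns

DigitRuns("One two three 1234 four five 890 six 56 seven eight nine.")
--> {"1234", "890", "56"}
```

Like BreakAtNumbers, this treats any run of digits as a number, so "12abc" would yield "12".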

Arthur,

Thank you for your response. A few questions:

If the variable kcAsc0 had “a real self-explanatory name”, what would it be?

-- str == "abc 123 def"
What’s this def?

set k to kcAsc0 -- pretend == "~"
What is this pretend? ASCII character 0 is null, and ~ is ASCII character 126.

Kari

I’m sorry. I guess I was being a bit, um, pedantic.

The ‘k’ prefix is a convention in Macintosh programming (or at least it used to be). It indicates that a variable is a “constant,” which I use simply to mean that the property should be treated as read-only. I use ‘c’ to indicate a ‘character’, as opposed to a multi-character string ‘s’, an array or list ‘a’, a boolean ‘b’, etc. I am in the habit of using such conventions in particular situations, but I probably shouldn’t when posting to this list.

kcAsc0 == constant character ASCII 0

The point was simply to create a “sentinel”, that is, a value that is not likely to exist in your data. When you can insert a unique character into a string, it can often be used to speed up certain kinds of string processing. The point of placing it into a script property instead of a variable was to avoid using the “ASCII character” command during the execution of the script (it’s quite slow to repeatedly call a scripting addition command). A property’s initial value is evaluated when the script is compiled, not when the script is run.
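
Since the trick only works if the sentinel is absent from the data, a guard along these lines could be run first. This is only a sketch: IsSentinelFree is a made-up handler name, and kcAsc0 is the same property as in the script.

```applescript
property kcAsc0 : ASCII character 0 -- evaluated once, when the script is compiled

-- Hypothetical guard: true when str does not already contain the sentinel,
-- so it is safe to hand to a sentinel-based handler such as BreakAtNumbers.
on IsSentinelFree(str)
	return str does not contain kcAsc0
end IsSentinelFree

IsSentinelFree("abc 123 def") --> true
IsSentinelFree("abc" & kcAsc0 & "def") --> false
```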

Um… d, e, and f are the characters that follow a, b, and c in the alphabet. :wink:

I was just showing an example string, like yours:
“One two three 1234 four five 890 six 56 seven eight nine.”

I just wanted to show how the null character was being used in the script with the example string, such that when my comment showed:

--> loop 1: "abc ~1~23 def"

you would know to interpret it as:

--> loop 1: "abc " & ascii character 0 & "1" & ascii character 0 & …

It looks like I was being more helpful without comments… :wink:
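
If it helps when debugging, the null characters can be made visible by swapping them for "~", so that an intermediate string looks like the ones in the comments. A rough sketch follows; ShowSentinel is a hypothetical name, not part of the script above.

```applescript
property kcAsc0 : ASCII character 0

-- Hypothetical debugging aid: returns a copy of str with every
-- null character replaced by "~".
on ShowSentinel(str)
	set astids to AppleScript's text item delimiters
	try
		set AppleScript's text item delimiters to kcAsc0
		set parts to str's text items
		set AppleScript's text item delimiters to "~"
		set shown to parts as string
	on error e number n
		-- restore the delimiters before passing the error along
		set AppleScript's text item delimiters to astids
		error e number n
	end try
	set AppleScript's text item delimiters to astids
	return shown
end ShowSentinel

ShowSentinel("abc " & kcAsc0 & "1" & kcAsc0 & "23 def")
--> "abc ~1~23 def"
```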

Ok, I get it!

Thank you very much!

Kari