Cleaning data out of a text file?

It was in the script I posted in #78 and again in #99 🙂

And yes it’s always in the same location…

Sorry to be a pain…

You’re not.

I am just being lazy

Do you have a sample of the file?
Or is it the same as a converted one?

LOL

Chuckling away…

The file is flat ASCII and just a list of IP addresses like:

1.9.21.100
1.9.46.177
1.20.91.197
1.20.97.214
1.20.106.165
1.23.144.150
1.30.242.33
1.30.243.191
1.34.164.204
1.34.177.7
1.34.196.227
1.34.220.216
1.47.79.218
1.55.249.92
1.59.55.187
1.109.60.10
1.109.149.149
1.109.156.58
1.109.207.108
1.119.12.115
1.160.26.243
1.162.99.242
1.168.93.10
1.171.76.226
1.171.171.60
1.178.3.11
1.186.45.250

I’ve emailed you a copy of an old list.

Here is the new version that also checks the “deny-ip-list.txt” file


use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

global IPsExisted

on run
	try
		set theFiles to choose file of type {"txt"} with multiple selections allowed
	on error
		return
	end try
	open theFiles
end run

on open theFiles
	local atid, dText, aFile, cFile, ipList, startTime
	set atid to text item delimiters
	set text item delimiters to {".txt"}
	set IPsExisted to read (((path to home folder as text) & "Library:Mobile Documents:com~apple~CloudDocs:_Daron Files:NAS:Deny List:deny-ip-list.txt") as alias)
	--set IPsExisted to paragraphs of IPsExisted -- left IPsExisted as text rather than list, sped up search
	repeat with aFile in theFiles
		try
			set cFile to open for access aFile --with write permission
		on error
			display alert "Uh-oh! Error opening file…" giving up after 10
			return false
		end try
		try
			set dText to read cFile from 1 as text
		on error
			set dText to false
			display alert "File Empty!" giving up after 10
		end try
		close access cFile
		if class of dText is not boolean then
			set ipList to parseDosIPs(dText)
			set dText to ((text items 1 thru -2 of (aFile as text)) as text)
			combSort(ipList)
			saveIPs(ipList, dText)
		end if
	end repeat
	set text item delimiters to atid
end open

on parseDosIPs(dosText)
	local atid, IPv4, tmp, ipRef
	script D
		property IPs : {}
		property dosList : paragraphs of dosText
	end script
	set ipRef to a reference to IPsExisted
	--set dosText to paragraphs of dosText
	set atid to text item delimiters
	set text item delimiters to {"remote] from ", "] from source: ", "LAN access", ", ", ":"} -- {"from source: ", ", "}
	repeat with i in D's dosList
		set tmp to contents of i
		if tmp ≠ "" then
			set tmp to text items of tmp
			if (count tmp) > 2 then
				set IPv4 to item 3 of tmp --word 1 of 
				if IPv4 does not start with "192.168.1." then -- trailing dot so 192.168.10.x etc. are not skipped too
					if (IPv4 is not in D's IPs) then
						if (IPv4 is not in ipRef) then
							set end of D's IPs to IPv4
						end if
					end if
				end if
			end if
		end if
	end repeat
	set text item delimiters to atid
	return D's IPs
end parseDosIPs

on combSort(aList) -- FASTEST
	local sf, i, j, cc, ns, js, gap, pgap, sw -- ns means No Swap
	if class of aList is not list then return false
	script mL
		property nlist : aList
	end script
	set sf to 1.7
	set cc to count mL's nlist
	set gap to cc div sf
	repeat until gap = 0
		repeat with i from 1 to gap
			set js to cc - gap
			repeat until js < 1 -- repeat each gap until no more swaps
				set ns to gap
				repeat with j from i to js by gap
					if (item j of mL's nlist) > (item (j + gap) of mL's nlist) then
						set sw to (item j of mL's nlist)
						set (item j of mL's nlist) to (item (j + gap) of mL's nlist)
						set (item (j + gap) of mL's nlist) to sw
						set ns to j
					end if
				end repeat
				set js to ns - gap
			end repeat
		end repeat
		set pgap to gap
		set gap to gap div sf
		if gap = 0 then -- no while using as integer
			if pgap ≠ 1 then set gap to 1
		end if
	end repeat
end combSort

on saveIPs(pList as list, pPath as string)
	local cFile, cEOF
	set cFile to pPath & "_IP_CLEANED.txt"
	try
		set cFile to open for access cFile with write permission
	on error
		display alert "Uh-oh! Error opening file…" giving up after 10
		return false
	end try
	set atid to text item delimiters
	set text item delimiters to linefeed
	try
		set cEOF to (get eof cFile) + 1
		write (pList as text) & linefeed to cFile as text starting at cEOF
	on error
		display alert "Error! Can't write to IP_CLEANED file…" giving up after 10
	end try
	set text item delimiters to atid
	close access cFile
	return true
end saveIPs

EDIT Oops, had to remove some test timing code

Sorry, I got an error:

If it helps the prior script set the path via:

set theExistingFile to "/Users/dbrewood/Library/Mobile Documents/com~apple~CloudDocs/_Daron Files/NAS/Deny List/deny-ip-list.txt"

you need to use HFS paths, not POSIX paths
HFS Paths are separated by colons “:” not slashes “/”
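
For reference, AppleScript can convert between the two forms itself. A minimal sketch, assuming a made-up path under /Users/Shared (the file does not need to exist for the conversion to work):

```applescript
-- Hypothetical example path, not taken from the script above.
set posixPath to "/Users/Shared/example.txt"

-- POSIX file builds a file reference from a slash-separated path;
-- coercing that to text gives the colon-separated HFS form,
-- which starts with the startup disk's name.
set hfsPath to (POSIX file posixPath) as text

-- POSIX path of converts an HFS path back to slashes.
set backToPosix to POSIX path of hfsPath
```

Note that `as alias` only succeeds when the item really exists, so an HFS path that points at the wrong folder fails at that line.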

I had already fixed the Path but you had already downloaded the script.
(you’re too fast for me)

Try downloading it again

Hmm, means nothing to me alas. Okay I’ve re-downloaded… I tried the script again and I get a new error:

My Bad.

I left some remnants of test timing code

All better, I Hope.

Try now

Yee haaa! It works! I hate to say it but…

The IP addresses sort like:

Instead of:

That is not sorted numerically by the first IP group. Does that make sense? It’s easier to work with the raw data (sometimes I have to) if it is:

Sorry…

It is currently sorting Alphabetically, not numerically, since it is text not numerics.

It can be done, but it isn’t easy: we first have to convert each IPv4 address into a list of integers
and then modify combSort to sort on all 4 octets.
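
To make the difference concrete, here is a minimal sketch using two addresses from the sample list. The handler names `ipToOctets` and `ipComesBefore` are made up for this illustration and are not part of the script above. The last few lines try AppleScript's `considering numeric strings` as a possible shortcut; treat that as something to test on your own system rather than a guaranteed fix.

```applescript
-- Split a dotted-quad string into a list of four integers.
on ipToOctets(ipText)
	set atid to AppleScript's text item delimiters
	set AppleScript's text item delimiters to {"."}
	set parts to text items of ipText
	set AppleScript's text item delimiters to atid
	set octets to {}
	repeat with p in parts
		set end of octets to (contents of p) as integer
	end repeat
	return octets
end ipToOctets

-- True when address a should come before address b numerically.
on ipComesBefore(a, b)
	set oa to ipToOctets(a)
	set ob to ipToOctets(b)
	repeat with k from 1 to 4
		if (item k of oa) < (item k of ob) then return true
		if (item k of oa) > (item k of ob) then return false
	end repeat
	return false -- identical addresses
end ipComesBefore

-- Plain text comparison: "1" sorts before "2", so 1.109.x wins.
set asText to ("1.109.60.10" < "1.20.91.197") -- true
-- Octet comparison: 109 is greater than 20.
set asNumbers to ipComesBefore("1.109.60.10", "1.20.91.197") -- false

-- Possible shortcut: compare runs of digits by value.
considering numeric strings
	set asNumericStrings to ("1.109.60.10" < "1.20.91.197")
end considering
```

The octet approach is the one the multi-key sort needs, since it compares one integer per key.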

I have a version of combSort that sorts on multiple items in a list.
I’ll find it

Will the end user have BBEdit installed?

Ah ha right, I can live with it if I have to, it’s not a problem 🙂
I don’t want to put you to so much trouble…

I do, and even better, TextMate. 🙂

Ideally though I’d prefer not to use another tool. The original script did sort in the required manner. But if this is a preferred way to go with things…

Here is a version that will sort IPs numerically, not alphabetically.

I found my multi-item sort version of combSort

use AppleScript version "2.4" -- Yosemite (10.10) or later
use scripting additions

global IPsExisted

on run
	try
		set theFiles to choose file of type {"txt"} with multiple selections allowed
	on error
		return
	end try
	open theFiles
end run

on open theFiles
	local atid, dText, aFile, cFile, startTime, tmp
	script F
		property ipList : missing value
	end script
	set atid to text item delimiters
	set IPsExisted to read (((path to home folder as text) & "Library:Mobile Documents:deny-ip-list.txt") as alias)
	--set IPsExisted to paragraphs of IPsExisted -- left IPsExisted as text rather than list, sped up search
	set text item delimiters to {linefeed}
	set IPsExisted to linefeed & ((paragraphs of IPsExisted) as text) & linefeed -- gets rid of returns, only linefeeds left
	repeat with aFile in theFiles
		try
			set cFile to open for access aFile --with write permission
		on error
			display alert "Uh-oh! Error opening file…" giving up after 10
			return false
		end try
		try
			set dText to read cFile from 1 as text
		on error
			set dText to false
			display alert "File Empty!" giving up after 10
		end try
		close access cFile
		if class of dText is not boolean then
			set F's ipList to parseDosIPs(dText)
			set text item delimiters to {".txt"}
			set dText to ((text items 1 thru -2 of (aFile as text)) as text)
			set text item delimiters to {"."}
			set c to count of (text items in item 1 of F's ipList)
			repeat with j from 1 to count F's ipList
				set tmp to text items of item j of F's ipList
				repeat with i from 1 to c
					set item i of tmp to (item i of tmp) as integer
				end repeat
				set item j of F's ipList to tmp
			end repeat
			combSortM(F's ipList, {1, 2, 3, 4})
			--=ipList
			repeat with j from 1 to count F's ipList
				set item j of F's ipList to item j of F's ipList as text
			end repeat
			saveIPs(F's ipList, dText)
		end if
	end repeat
	set text item delimiters to atid
end open

on parseDosIPs(dosText)
	local atid, IPv4, ipRef, IPs, i
	script D
		property dosList : paragraphs of dosText
		property tmp : missing value
	end script
	set ipRef to a reference to IPsExisted
	set IPs to ""
	--set dosText to paragraphs of dosText
	set atid to text item delimiters
	set text item delimiters to {"remote] from ", "] from source: ", "LAN access", ", ", ":"} -- {"from source: ", ", "}
	repeat with i in D's dosList
		set IPv4 to contents of i
		if IPv4 ≠ "" then
			set D's tmp to text items of IPv4
			if (count D's tmp) > 2 then
				set IPv4 to linefeed & (item 3 of D's tmp) & linefeed --word 1 of 
				if IPv4 does not start with (linefeed & "192.168.1.") then
					if (IPv4 is not in IPs) then
						if (IPv4 is not in ipRef) then
							set IPs to IPs & IPv4 --set end of D's IPs to IPv4
						end if
					end if
				end if
			end if
		end if
	end repeat
	set text item delimiters to linefeed & linefeed
	set IPs to text items of (text 2 thru -2 of IPs) -- removes leading and trailing linefeed
	set text item delimiters to atid
	return IPs
end parseDosIPs

on combSortM(aList, sortList) -- FASTEST
	local i, j, cc, ns, js, gap, pgap, c, sw, sf, comp -- ns means No Swap
	script mL
		property nlist : aList
		property sList : {}
		property oList : {}
	end script
	set sf to 1.7 -- shrink factor
	set cc to count mL's nlist
	repeat with j in sortList
		if j > 0 then -- if positive, sort ascending
			set end of mL's sList to (contents of j)
		else -- if negative,sort descending
			set end of mL's sList to -(contents of j)
		end if
		set end of mL's oList to (j > 0)
	end repeat
	
	set gap to cc div sf
	repeat until gap = 0
		repeat with i from 1 to gap
			set js to cc - gap
			repeat until js < 1 -- repeat each gap until no more swaps
				set ns to gap
				repeat with j from i to js by gap
					set comp to false
					repeat with k from 1 to count of mL's sList -- do multiple comparisons (k, not i: reusing i clobbered the outer loop's index)
						set c to item k of mL's sList
						if (item c of item j of mL's nlist) < (item c of item (j + gap) of mL's nlist) then
							if not (item k of mL's oList) then set comp to true -- descending key: smaller item must move later
							exit repeat
						else if (item c of item j of mL's nlist) > (item c of item (j + gap) of mL's nlist) then
							if (item k of mL's oList) then set comp to true -- ascending key: larger item must move later
							exit repeat
						end if
					end repeat
					if comp then -- do the swap
						set sw to (item j of mL's nlist)
						set (item j of mL's nlist) to (item (j + gap) of mL's nlist)
						set (item (j + gap) of mL's nlist) to sw
						set ns to j
					end if
				end repeat
				set js to ns - gap
			end repeat
		end repeat
		set pgap to gap
		set gap to gap div sf
		if gap = 0 then -- no while using as integer
			if pgap ≠ 1 then set gap to 1
		end if
	end repeat
end combSortM

on saveIPs(pList as list, pPath as string)
	local cFile, cEOF
	set cFile to pPath & "_IP_CLEANED.txt"
	try
		set cFile to open for access cFile with write permission
	on error
		display alert "Uh-oh! Error opening file…" giving up after 10
		return false
	end try
	set atid to text item delimiters
	set text item delimiters to linefeed
	try
		set cEOF to (get eof cFile) + 1
		write (pList as text) & linefeed to cFile as text starting at cEOF
	on error
		display alert "Error! Can't write to IP_CLEANED file…" giving up after 10
	end try
	set text item delimiters to atid
	close access cFile
	return true
end saveIPs

EDIT I added another linefeed to the end of the text read from deny-ip-list.txt, so that every IP is enclosed in linefeeds. This prevents a text search from finding, for example, “41.2.3.4” inside “141.2.3.4”.
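
The effect is easy to check in isolation. A minimal sketch with made-up addresses (the variable names are illustrative and not from the script):

```applescript
-- Pretend deny list: two addresses, one per line.
set denyText to "141.2.3.4" & linefeed & "8.8.8.8"

-- Bare substring search: "41.2.3.4" is found inside "141.2.3.4".
set nakedHit to ("41.2.3.4" is in denyText) -- true, a false positive

-- Wrap the whole list and each search term in linefeeds instead.
set wrapped to linefeed & denyText & linefeed
set wrappedHit to ((linefeed & "41.2.3.4" & linefeed) is in wrapped) -- false
set realHit to ((linefeed & "8.8.8.8" & linefeed) is in wrapped) -- true
```

The leading linefeed matters as much as the trailing one: without it the first line of the list could never match.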

EDIT I moved the ipList into a script object to speed up the parsing
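
The speed-up comes from a common AppleScript idiom: items of a list held in a script object's property are reached by reference, which is usually much faster on long lists than working through a plain local variable. A minimal sketch, with `buildSquares` as a made-up handler for illustration:

```applescript
on buildSquares(n)
	script S
		property nums : {} -- the list lives in the script object
	end script
	repeat with i from 1 to n
		set end of S's nums to i * i -- append via the property reference
	end repeat
	return S's nums
end buildSquares

set squares to buildSquares(5)
```

On a handful of items the difference is not noticeable; it tends to show up once a list runs to thousands of entries, as a log file easily can.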

Sorry, but I got an error again:

Fixed it, I replaced the line above with this one:

set IPsExisted to read (((path to home folder as text) & "Library:Mobile Documents:com~apple~CloudDocs:_Daron Files:NAS:Deny List:deny-ip-list.txt") as alias)
   --set IPsExisted to paragraphs of IPsExisted -- left IPsExisted as text rather than list, sped up search

That is perfect now! 🙂

Many many thanks…

Okay, being a ‘data freak’ I did a compare of the output data from the original script I used (where I had to strip out manually the 192.168.1.xxx IP addresses) and the new script. There was a difference of 5 IP addresses, the new script output didn’t contain 5 addresses.
The IP addresses which were not picked up were from these lines:

Any ideas at all as to why these were missed?

Sorry about this…

Weird.

Of the sample you just gave only half were in the sample file you gave me.

But when I added them, they showed up for me.

Are any of them in the deny-ip-list?

p.s. I made some small edits to the script above.

Much weirdness indeed. I’ve just run the script (the updated one) over the latest log file (and today’s deny list) and this time the resultant data ties up. Very strange indeed.

I’ve sent copies of both to you via email…

Oh, I did get the alias error again so had to modify the script, changing:

set IPsExisted to read (((path to home folder as text) & "Library:Mobile Documents:deny-ip-list.txt") as alias)
	--set IPsExisted to paragraphs of IPsExisted -- left IPsExisted as text rather than list, sped up search

to

set IPsExisted to read (((path to home folder as text) & "Library:Mobile Documents:com~apple~CloudDocs:_Daron Files:NAS:Deny List:deny-ip-list.txt") as alias)
--set IPsExisted to paragraphs of IPsExisted -- left IPsExisted as text rather than list, sped up search

I was wondering about these.

Do you want just the first?

(I didn’t get the email)