
moonman239

Cancelled
Original poster
Mar 27, 2009
1,541
32
I have a string of items that I need to separate according to their respective categories.

Here's my code so far:

Code:
on searchForSubstring(theString, theSubstring)
	try
		set oldDelims to AppleScript's text item delimiters
		set AppleScript's text item delimiters to theSubstring
		set itemsOfString to every text item of theString
		set indexes to {}
		set theIndex to 0
		repeat with X from 1 to ((count of itemsOfString) - 1)
			set theIndex to theIndex + (length of item X of itemsOfString) + 1
			copy theIndex to end of indexes
		end repeat
		set AppleScript's text item delimiters to oldDelims
		return indexes
	on error errMsg
		log errMsg
	end try
end searchForSubstring

to switchText from t to r instead of s
	set d to text item delimiters
	set text item delimiters to s
	set t to t's text items
	set text item delimiters to r
	tell t to set t to item 1 & ({""} & rest)
	set text item delimiters to d
	t
end switchText

on separateItemsIntoCategories(theContent, theCategories, categoryID)
	set newContent to theCategories
	log (count of theCategories)
	repeat with X from 1 to (count of theCategories)
		log X
		set categoryDelimiters to (item X of theCategories) & categoryID
		set categoryIndex to item 1 of searchForSubstring(theContent, categoryDelimiters)
		set newContent to switchText from newContent to "ijkl" instead of categoryIndex
	end repeat
	set oldDelims to AppleScript's text item delimiters
	set AppleScript's text item delimiters to "ijkl"
	return every text item of newContent
end separateItemsIntoCategories
try
	set theCourses to {"Appetizers", "Breakfast", "Entrees", "Soups and Salads", "Desserts"}
	set theCuisines to {"American", "Mediterranean", "Mexican", "Asian"}
	set thefile to POSIX file "/Users/Montana/Desktop/recipes.txt"
	set fref to (open for access thefile)
	set theContent to read fref
	close access fref
	set contentByCourses to separateItemsIntoCategories(theContent, theCourses, ":")
on error msg
	log msg
end try

It's a huge block of code, but I figured I'd post everything that might be relevant.
Here's the problem I'm having. After calling "separateItemsIntoCategories", AppleScript tells me it "can't get item 1 of {}." I can't figure out why there's a blank list. Nothing's wrong with the file, and nothing's wrong with the searchForSubstring handler, as I have tested that with different parameters.
 

chown33

Moderator
Staff member
Aug 9, 2009
10,740
8,416
A sea of green
Post a test file of sample data for "recipes.txt". It should show the error, i.e. it should not parse successfully.

I'd run the posted code here, but without data, and no comments as to what the data should look like, it's too time-consuming to invent sample data.
 

moonman239

The file looks something like this:

Appetizers:

(A bunch of recipes here)

Breakfast:

(Another set of recipes here)

etc.

The script should give me something like this:
{"(All the appetizer recipes)","(All the breakfast recipes)",...}
 

chown33

I did the following on OS versions 10.6.8 and 10.8.0. You didn't give your OS version, so your results may differ.


I started by making this test data file:

Filename: "vikings.txt"
Code:
# Before the first category.

Appetizers:
crunchy frog
crackers and spam

Breakfast:
egg and spam
egg, bacon, and spam
spam, egg, spam, spam, bacon and spam

Entrees:
barbecued spam with pineapple and spam
Lobster Thermidor aux crevettes and spam
lutefisk

Next, I changed the file's path in your script and ran it. It failed.

I then tried to figure out what your code was doing, by adding log statements at step-wise points. The resulting program was this (see separateItemsIntoCategories in particular):
Code:
on searchForSubstring(theString, theSubstring)
	try
		set oldDelims to AppleScript's text item delimiters
		set AppleScript's text item delimiters to theSubstring
		set itemsOfString to every text item of theString
		set indexes to {}
		set theIndex to 0
		repeat with X from 1 to ((count of itemsOfString) - 1)
			set theIndex to theIndex + (length of item X of itemsOfString) + 1
			copy theIndex to end of indexes
		end repeat
		set AppleScript's text item delimiters to oldDelims
		return indexes
	on error errMsg
		log errMsg
	end try
end searchForSubstring

to switchText from t to r instead of s
	set d to text item delimiters
	set text item delimiters to s
	set t to t's text items
	set text item delimiters to r
	tell t to set t to item 1 & ({""} & rest)
	set text item delimiters to d
	t
end switchText

on separateItemsIntoCategories(theContent, theCategories, categoryID)
	set newContent to theCategories
	log (count of theCategories)
	repeat with X from 1 to (count of theCategories)
		log " ----- item " & X & " -----"
		set categoryDelimiters to (item X of theCategories) & categoryID
		log categoryDelimiters
		set categoryIndex to item 1 of searchForSubstring(theContent, categoryDelimiters)
		log categoryIndex
		set newContent to switchText from newContent to "ijkl" instead of categoryIndex
		log newContent
	end repeat
	--set oldDelims to AppleScript's text item delimiters
	--set AppleScript's text item delimiters to "ijkl"
	--return every text item of newContent
end separateItemsIntoCategories

try
	set theCourses to {"Appetizers", "Breakfast", "Entrees", "Soups and Salads", "Desserts"}
	set theCuisines to {"American", "Mediterranean", "Mexican", "Asian"}
	
	set myPath to "/Volumes/TWork/Trials/a-script/moon/vikings.txt"
	
	set thefile to POSIX file myPath
	set fref to (open for access thefile)
	set theContent to read fref
	close access fref
	
	set contentByCourses to separateItemsIntoCategories(theContent, theCourses, ":")
	
on error msg
	log msg
end try

When run, it generates a fair amount of output in the Event Log subview of AppleScript Editor.app. I won't paste it here, because you can generate the output yourself, after editing the myPath variable's pathname string to your actual file pathname.

The logged output made no sense to me. It doesn't seem to have any relationship to what it should be doing, which is splitting text at delimiter words. Instead of trying to figure it out and make it work as given, I more or less started over from the requirements and the sample data file.
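
For what it's worth, one way the posted searchForSubstring handler can hand back an empty list is when the substring never occurs in the string, for example a category that isn't in the file. The sketch below reproduces just that loop shape with made-up sample text (sampleText and the "Desserts:" delimiter are illustrative, not taken from your file). I haven't confirmed that this is what happens with your actual data.

```applescript
-- Illustrative sample text; it contains no "Desserts:" header.
set sampleText to "Appetizers: crunchy frog"

set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to "Desserts:"
-- With no match, the result is a one-item list holding the whole string.
set pieces to every text item of sampleText
set AppleScript's text item delimiters to oldDelims

-- Same loop bounds as searchForSubstring: (count - 1) is 0, so the body never runs.
set indexes to {}
repeat with i from 1 to ((count of pieces) - 1)
	copy i to end of indexes
end repeat

-- indexes is still {}, so asking for its first item raises an error.
try
	set firstIndex to item 1 of indexes
on error errMsg
	log errMsg
end try
```

If that is the cause, checking the count of the returned list before taking item 1 would avoid the error.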


Firstly, I don't understand why you're going to the trouble of writing parsing and separating handlers when AppleScript's built-in text item delimiters should be able to do the job.

The text item delimiters property has always been a list of delimiters. Before 10.6, only the first string in that list was used, meaning there could be only a single delimiter. As of 10.6, the full list is used. This means you can give it a complete list of delimiters, and text items will split the text at every occurrence of any item in that list.

Your variable theCourses is already a list, and each string in that list is almost a delimiter. Simply append a colon to each string and the result is the desired delimiter. Put those in a list and boom.

Based on this simple analysis of the requirements and data, I wrote this:
Code:
try
	set theCourses to {"Appetizers", "Breakfast", "Entrees", "Soups and Salads", "Desserts"}
	
	-- Build filePath using Posix form; show its info.
	set _home to POSIX path of (path to home folder)
	set _path to "/TWork/Trials/a-script/moon/"
	set filePath to POSIX file (_home & _path & "vikings.txt")
	log filePath & " -- " & (info for filePath)
	
	-- Assemble delimiter list, giving each string a colon at end.
	set delimList to {}
	repeat with course in theCourses
		set end of delimList to (course & ":")
	end repeat
	log delimList
	
	-- Read entire file at once into content.
	set contentFile to (open for access filePath)
	set content to read contentFile
	close access contentFile
	log content
	
	-- Split content into substrings.
	set _was to AppleScript's text item delimiters
	set AppleScript's text item delimiters to delimList
	set contentList to text items of content
	set AppleScript's text item delimiters to _was
	
	-- Log individual parts.
	repeat with part in contentList
		log part
	end repeat
	
	-- Result is in contentList, a list of strings.
	return contentList
	
on error msg
	log msg
	
end try

When run, this produces the desired output, with the text before the first delimiter ("Appetizers:") appearing in the first element of the resulting list.

To eliminate the first item, assuming it's not wanted, you could write additional code to remove it, or make a new list containing every item except the 1st.
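
A minimal sketch of the second option, using a made-up three-item list in place of the real contentList. AppleScript's "rest of" gives every item of a list except the first, and it gives an empty list when there is only one item.

```applescript
-- Stand-in for the list produced by the script above (values are illustrative).
set contentList to {"# Before the first category.", "crunchy frog", "egg and spam"}

-- Everything except the first item.
set recipesOnly to rest of contentList
-- recipesOnly is now {"crunchy frog", "egg and spam"}
```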


I am certainly no expert at AppleScript. "Somewhat competent" would be a better description. I simply applied basic programming principles to solving the problem, starting with basic AppleScript reference docs:
https://developer.apple.com/library...ptlangguide/conceptual/ASLR_fundamentals.html

That reference says that only the first element of text item delimiters is used, but I was pretty sure that's no longer true. I can't recall where I read that, nor in which OS version it changed, so I can't cite an article or URL for it [see EDIT below]. So to test whether it was true or not, I wrote this:
Code:
try
	-- Test list-of-delims vs. single delim
	set delims to {":", ",", ";"}
	
	set content to "abc: def, ghi; jkl; mno, pqr, stu: vwx: yz"
	log "content -- " & content
	
	-- Split content into substrings.
	set _was to AppleScript's text item delimiters
	set AppleScript's text item delimiters to delims
	set contentList to text items of content
	set AppleScript's text item delimiters to _was
	log contentList
	
	-- Log individual parts.
	repeat with part in contentList
		log part
	end repeat
	
on error msg
	log msg
end try

Since this produced the expected results, I was fairly sure I could apply the same approach to data from a file, using longer strings as delimiters.

If it hadn't worked, I would have used a different approach, but that wasn't necessary so I won't expand on it further.


In the course of researching and writing the above scripts, I came across Smile, which is like an extended version of AppleScript. I did not test it, because I have little interest in using AppleScript for solving problems like this. I'll just point you to it, since you said before that you like doing things like this in AppleScript.
http://www.satimage.fr/software/en/smile/index.html

Among other things, it appears to have better interactivity than AppleScript Editor, which may prove more useful than other features. Debugging by log statements alone is primitive, at best. At worst, it's completely impossible.
http://www.satimage.fr/software/en/smile/interface/as_shell.html

Smile also has XML parsing features, and a DOM document model. I mention this in regard to your earlier XML parsing adventures.


Finally, you also previously wrote that you had web-dev experience, so you might look at JavaScript OSA:
http://www.latenightsw.com/freeware/JavaScriptOSA/index.html

It's basically JavaScript with the ability to send inter-process events (i.e. AppleEvents, i.e. AppleScript events). Since JavaScript has somewhat better split/join/search/replace capabilities when compared to AppleScript, you may find it more useful than reinventing all those foundational functions.

Again, I have not tried this, because I have no interest in using JavaScript for these kinds of text-parsing tasks.


If I were doing this, I might use 'awk' and/or bash, or a combination thereof. But since you already said you didn't want to learn another language, I mention this only as a potential future reference.


EDIT

I found a reference for the change to the text item delimiters list. See the 10.6 AppleScript Release Notes:
http://developer.apple.com/library/mac/#releasenotes/AppleScript/RN-AppleScript/RN-10_6/RN-10_6.html

Under the heading "Other Enhancements":
"When getting the text items of a string, all the values in text item delimiters are considered. Previous versions only considered the first item in the list."
 

moonman239

Well, just about every AppleScript page on the Internet is from like a few years ago. As soon as I finish this post, I'm going to make a new thread where members can come correct old information from the Web.
 

moonman239

Hallelujah, it works! Thanks a lot!

Just for reference, here's my new code:

Code:
on separateItemsIntoCategories(theContent, theCategories, categoryID)
	set oldDelims to AppleScript's text item delimiters
	set theDelims to {}
	repeat with X from 1 to count of theCategories
		set delim to (item X of theCategories) & categoryID
		copy delim to end of theDelims
	end repeat
	set AppleScript's text item delimiters to theDelims
	set categorizedContent to rest of (every text item of theContent)
	set AppleScript's text item delimiters to oldDelims
	return categorizedContent
end separateItemsIntoCategories
try
	set theCourses to {"Appetizers", "Breakfast", "Entrees", "Soups and Salads", "Desserts"}
	set theCuisines to {"American", "Mediterranean", "Mexican", "Asian"}
	set thefile to POSIX file "/Users/Montana/Desktop/recipes.txt"
	set fref to (open for access thefile)
	set theContent to read fref
	close access fref
	set contentByCourses to separateItemsIntoCategories(theContent, theCourses, ":")
	set item1 to item 1 of contentByCourses
	display alert item1
on error msg
	log msg
end try

For those who haven't already figured it out, categoryID denotes a character that follows the category name, so the script knows when it should make a new item in the list.
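
One caveat with splitting on the whole delimiter list at once: the items come back in file order, and nothing records which category each one belongs to, so a category missing from the file shifts everything after it. A rough sketch of one way around that is below. The handler name contentForCategory and the sample text are hypothetical, not from the code above, and it assumes Mac OS X 10.6 or later, since it relies on a list of delimiters.

```applescript
-- Hypothetical helper: returns the text that follows one category header,
-- or missing value if that header does not appear in theContent.
on contentForCategory(theContent, theCategories, categoryName, categoryID)
	-- Build the full delimiter list, e.g. {"Appetizers:", "Breakfast:", ...}.
	set allDelims to {}
	repeat with X from 1 to count of theCategories
		copy ((item X of theCategories) & categoryID) to end of allDelims
	end repeat
	
	set oldDelims to AppleScript's text item delimiters
	set foundText to missing value
	try
		-- First split on this category's own header only.
		set AppleScript's text item delimiters to {categoryName & categoryID}
		set parts to text items of theContent
		if (count of parts) > 1 then
			-- Keep what follows the header, up to the next header of any category.
			set AppleScript's text item delimiters to allDelims
			set foundText to text item 1 of (item 2 of parts)
		end if
	end try
	-- Restore the delimiters whether or not anything went wrong above.
	set AppleScript's text item delimiters to oldDelims
	return foundText
end contentForCategory

-- Illustrative data: there is no "Breakfast:" section here.
set sample to "Appetizers:crunchy frog" & linefeed & "Entrees:lutefisk"
set courses to {"Appetizers", "Breakfast", "Entrees"}
set entreeText to contentForCategory(sample, courses, "Entrees", ":")
set breakfastText to contentForCategory(sample, courses, "Breakfast", ":")
-- entreeText is "lutefisk"; breakfastText is missing value
```

Calling it once per category keeps each result tied to its category name, at the cost of scanning the text several times.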
 