Jessica Lares

macrumors G3
Original poster
So I have about 90 more entries of my walks to go through. I did a few manually, but I bet I could do this automatically with AppleScript or something.

These are the bits of the files I want:

<durationString>30'11"</durationString>
<distanceString>0.96 mi</distanceString>
<pace>31'30" / mi</pace>

I just want a little script that pulls out that middle data in a way so I could copy and paste that into a spreadsheet manually. So the end result would be:

30'11" 0.96 mi 31'30" / mi

Which would automatically format itself to filling three cells that I could clean up so it can be used to calculate averages in total. I'd put the date in separately.

A snippet or link to a tutorial would be appreciated. Something that dumps it all into a plain text file would be fine.

MacUser2525

Suspended
A simple bash script will do you. I took your data above and put it in an .xml file then used the following commands on it.

Code:
MacUser2525:~$ nano /Volumes/Sea_To_Do/working/nike.xml
MacUser2525:~$ grep duration /Volumes/Sea_To_Do/working/nike.xml
<durationString>30'11"</durationString>

MacUser2525:~$ grep duration /Volumes/Sea_To_Do/working/nike.xml | cut -d ">" -f 2
30'11"</durationString
MacUser2525:~$ grep duration /Volumes/Sea_To_Do/working/nike.xml | cut -d ">" -f 2 | tr -d \<\/durationString
30'11"

MacUser2525:~$ grep distance /Volumes/Sea_To_Do/working/nike.xml | cut -d ">" -f 2 | tr -d \<\/distanceString
0.96 m
MacUser2525:~$ grep pace /Volumes/Sea_To_Do/working/nike.xml | cut -d ">" -f 2 | tr -d \<\/pace
31'30"  mi
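A side note on the output above: tr -d deletes every listed character wherever it occurs, not the literal string, which is why "0.96 mi" comes out as "0.96 m". Below is a minimal sketch of a sed-based alternative, assuming each element sits on one line as in the sample from the first post. The helper name extract_tag and the temporary sample file are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample file holding the three lines from the first post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<durationString>30'11"</durationString>
<distanceString>0.96 mi</distanceString>
<pace>31'30" / mi</pace>
EOF

# extract_tag TAG FILE: print the text between <TAG> and </TAG>.
# Assumes the opening and closing tag are on the same line.
extract_tag() {
    sed -n "s|.*<$1>\(.*\)</$1>.*|\1|p" "$2"
}

duration=$(extract_tag durationString "$sample")
distance=$(extract_tag distanceString "$sample")
pace=$(extract_tag pace "$sample")

# One tab-separated line, ready to paste into three spreadsheet cells.
printf '%s\t%s\t%s\n' "$duration" "$distance" "$pace"
rm -f "$sample"
```

Because sed removes the tags as whole strings, the "mi" is left intact and can be cleaned up later in the spreadsheet if needed.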

Now in a bash script.

Code:
MacUser2525:~$ nano /Volumes/Sea_To_Do/working/nike.sh
#!/bin/bash
grep duration /Volumes/Sea_To_Do/working/nike.xml | cut -d ">" -f 2 | tr -d \<\/durationString
grep distance /Volumes/Sea_To_Do/working/nike.xml | cut -d ">" -f 2 | tr -d \<\/distanceString
grep pace /Volumes/Sea_To_Do/working/nike.xml | cut -d ">" -f 2 | tr -d \<\/pace

MacUser2525:~$ chmod +x /Volumes/Sea_To_Do/working/nike.sh

Run the bash script.

Code:
MacUser2525:~$ /Volumes/Sea_To_Do/working/nike.sh
30'11"
0.96 m
31'30"  mi

It outputs what you need; a simple redirect on the end of each grep expression will put that into a file for you. The values will each be on a separate line, so we may as well change that, and the script is limited to the one hard-coded nike.xml file, which needs changing as well. Since I am thinking from the wording of your question that you have an individual file for each walk, we may as well use a for loop to process each file and get the required information too.

Code:
#!/bin/bash

for i in $(ls /Volumes/path/to/nike/xml/*.xml); do
# pull each value out of its tag; the extra m and i in the tr sets also drop the "mi" unit
duration=$(grep duration "$i" | cut -d ">" -f 2 | tr -d \<\/durationString)
distance=$(grep distance "$i" | cut -d ">" -f 2 | tr -d \<\/distanceStringm)
pace=$(grep pace "$i" | cut -d ">" -f 2 | tr -d \<\/pacemi)
# write the header row only when nike.tsv does not exist yet
if [ -f nike.tsv ] ; then
printf "$duration\t$distance\t$pace\n" >> nike.tsv
else
printf "Duration\tDistance\tPace\n" >> nike.tsv
printf "$duration\t$distance\t$pace\n" >> nike.tsv
fi
done

This would give you nike.tsv, a tab-separated values file you can import, in the directory the script is run from in Terminal. The script will fail if there are spaces in the file names, so those need to be removed before running it. Also, the path (the "ls /Volumes/path/to/nike/xml/*.xml" part) needs to be changed to the directory containing the .xml files in your setup. Here is the output of it run on a few files I made here containing the same data as you posted.

Code:
MacUser2525:~$ /Volumes/Sea_To_Do/working/nike.sh
MacUser2525:~$ cat nike.tsv
Duration	Distance	Pace
30'11"	0.96	31'30"
30'11"	0.96	31'30"
30'11"	0.96	31'30"
30'11"	0.96	31'30"

As you can see, I eliminated the mi in both spots that had it, as it is likely to remain constant all the time, so you know what it is supposed to be anyway.

Edit: NameChanger is an easy-to-use program to remove the spaces if necessary.

http://www.mrrsoftware.com/MRRSoftware/NameChanger.html

Comment

subsonix

macrumors 68040
Here is an alternative, if you navigate to the folder where your xml files are located in the terminal.

Code:
for i in *.xml ; do grep -oP "[^(</*(durationString|distanceString|pace)>)][0-9\'\"mi./\s]+" "$i" | tr '\n' ';' ; echo ; done > nike+.csv

Then

Code:
open nike+.csv -a Numbers

It will create a csv file from your xml files and open it in Numbers, (which I do believe still supports csv). You can also import this manually. This is all quite brittle and may work, or not.

Comment

Jessica Lares

macrumors G3
Original poster
Thank you very much! The filenames look like "2013-01-07 17;45;54.xml", so I guess I'm either going to have to rename them to take out the space, yes (and are the ; a problem too?)? Just putting them in " and " doesn't work? Pretty sure I can just use Automator to take care of that anyway. Pretty sure I did that once already.

Could I have all of them write to the same file, and add a line every time? I would add the date field to the first part of </*(durationString|distanceString|pace)>.

Comment

subsonix

macrumors 68040
The filename should not be a problem, the names are carried in the "$i" variable in the loop. I tried and used the naming style you showed here with the date and I had no problem with it. For the date field you can use the "$i" variable, and do that before the existing grep part. It's getting a bit ugly so perhaps it's better to add it to a script file if you need the date part as well.

Code:
for i in *.xml ; do echo "$i" | grep -oP "\d+-\d+-\d+" | tr '\n' ';' ; grep -oP "[^(</*(durationString|distanceString|pace)>)][0-9\'\"mi./\s]+" "$i" | tr '\n' ';' ; echo ; done > nike+.csv

BTW, everything should get added to the same file, afaik.
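To illustrate why the space in the file names is harmless with a glob and a quoted "$i", but not with an unquoted $(ls ...), here is a small sketch. The two file names are made up in the style shown above and live in a temporary directory.

```shell
#!/bin/bash
# Made-up file names in a scratch directory, in the naming style quoted above.
dir=$(mktemp -d)
touch "$dir/2013-01-07 17;45;54.xml" "$dir/2013-01-08 09;12;03.xml"

# Glob with a quoted "$i": each file name stays one word.
glob_count=0
for i in "$dir"/*.xml ; do
    [ -f "$i" ] && glob_count=$((glob_count + 1))
done

# Unquoted $(ls ...): the shell splits the listing on spaces,
# so each name arrives as two fragments that are not real files.
ls_count=0
for i in $(ls "$dir"/*.xml) ; do
    ls_count=$((ls_count + 1))
done

echo "glob: $glob_count items, ls: $ls_count items"
rm -rf "$dir"
```

The glob loop sees two files, while the ls loop sees four fragments. The ; in the names is not a problem in either case because it only has a special meaning when typed directly on the command line.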

Comment

MacUser2525

Suspended
The space matters; the ; does not. In fact we will use it to get your date: rename the space to ; so you end up with file names like this "2013-01-07;17;45;54.xml". The new version of the script is below.

Code:
#!/bin/bash

for i in $(ls /Volumes/path/to/nike/xml/*.xml); do
filename=$i
# delete the path characters, then keep everything before the first ;
date=$(echo $filename | tr -d \/Volumes\/path\/to\/nike\/xml\/ | cut -d ";" -f 1)
duration=$(grep duration "$i" | cut -d ">" -f 2 | tr -d \<\/durationString)
distance=$(grep distance "$i" | cut -d ">" -f 2 | tr -d \<\/distanceStringm)
pace=$(grep pace "$i" | cut -d ">" -f 2 | tr -d \<\/pacemi)
# write the header row only when nike.tsv does not exist yet
if [ -f nike.tsv ] ; then
printf "$date\t$duration\t$distance\t$pace\n" >> nike.tsv
else
printf "Date\tDuration\tDistance\tPace\n" >> nike.tsv
printf "$date\t$duration\t$distance\t$pace\n" >> nike.tsv
fi
done

You now need to replace the path two times in this script to get the required data.
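If editing the path in two places is a nuisance, here is a sketch of another way to get the date out of the file name. It assumes names in the renamed style above (date first, then ;). basename and the ${var%%pattern} expansion are standard; the sample path is only the placeholder from the script.

```shell
#!/bin/bash
# Placeholder path in the renamed style described above.
i="/Volumes/path/to/nike/xml/2013-01-07;17;45;54.xml"

# basename drops the directory part, whatever it is.
filename=$(basename "$i")

# ${filename%%;*} removes the longest suffix starting at the first ';',
# leaving only the date.
date=${filename%%;*}

echo "$date"
```

This does not depend on which characters appear in the directory name, whereas tr -d removes those characters from the date as well if the real path happens to contain digits or dashes.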

----------

subsonix said:

Code:
for i in *.xml ; do echo "$i" | grep -oP "\d+-\d+-\d+" | tr '\n' ';' ; grep -oP "[^(</*(durationString|distanceString|pace)>)][0-9\'\"mi./\s]+" "$i" | tr '\n' ';' ; echo ; done > nike+.csv

BTW, everything should get added to the same file, afaik.

Not with a single redirect (>). You need two (>>) to append to a file; a single one simply overwrites the existing file every time the loop runs.

Comment

subsonix

macrumors 68040
You're wrong.

I got the impression that there was a problem to this effect, I think we would need to know more to do anything about it.
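A quick way to check the three cases being argued about here is the sketch below. It uses throwaway files in a temporary directory; the file names are made up.

```shell
#!/bin/bash
dir=$(mktemp -d)

# Redirect on the loop itself: the file is opened once, before the first iteration.
for i in 1 2 3 ; do
    echo "$i"
done > "$dir/outer.txt"

# Single > inside the loop body: the file is truncated on every iteration.
for i in 1 2 3 ; do
    echo "$i" > "$dir/inner.txt"
done

# Double >> inside the loop body: every iteration appends.
for i in 1 2 3 ; do
    echo "$i" >> "$dir/append.txt"
done

# Count the lines that ended up in each file.
outer_lines=$(wc -l < "$dir/outer.txt" | tr -d ' ')
inner_lines=$(wc -l < "$dir/inner.txt" | tr -d ' ')
append_lines=$(wc -l < "$dir/append.txt" | tr -d ' ')
echo "outer: $outer_lines, inner: $inner_lines, append: $append_lines"
rm -rf "$dir"
```

Only the middle case loses data. With the redirect after done, a single > keeps every line from the run, although it does replace whatever was in the file from an earlier run.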

Comment

subsonix

macrumors 68040

I'm well aware of redirection, but the loop writes to the same file descriptor in one go, i.e. it's one operation. If you do not want to test this by actually creating two files, this confirms it as well.

Code:
for i in {1..10} ; do
echo $i
done > test

Comment

MacUser2525

Suspended
If your awareness is anything like your reading comprehension I doubt it. I said nothing about redirection outside the for loop, and I for one am going to use the safest, always-works-in-all-situations option every time.

Comment

subsonix

macrumors 68040
What are you talking about? You specifically attributed the error to the redirection from the loop, claiming that each iteration would overwrite the last one, which is wrong. Redirection taken out of this context makes no sense here; it doesn't explain this error. Adding a second '>' doesn't do anything in this case, the result is the same.

Comment

subsonix

macrumors 68040
Having looked at one of those nike+ xml files, I got the following working. It finds the tag names, then strips the xml tags and adds delimiters for the csv file.

I also noticed that there was a tag called <time> which contained a date and time so it seems like it can be added without having to use the file name, if the format is consistent with what I found.

Code:
for i in *.xml ; do grep -P "(time|durationString|distanceString|pace).+" "$i" | sed 's/<[^>]*>//g' | tr '\n' ';' ; echo ; done > nike+.csv

Code:
open nike+.csv -a Numbers

Comment

Jessica Lares

macrumors G3
Original poster
It throws this out 80+ times, but it does make the csv file, only it's just an empty table:

usage: grep [-abcDEFGHhIiJLlmnOoPqRSsUVvwxZ] [-A num] [-B num] [-C[num]]
[-e pattern] [-f file] [--binary-files=value] [--color=when]
[--context[=num]] [--directories=action] [--label] [--line-buffered]
[--null] [pattern] [file ...]

Comment

subsonix

macrumors 68040
Make sure you are copy/pasting that whole line in its entirety. That error message is from grep. I had no problem with it here, and I tried it again just to make sure. The only thing I can test it on is what you have given here, and another part from one of those xml files which I found online. Having said that, the error would not depend on the input, but on grep being used wrongly.

Comment

Jessica Lares

macrumors G3
Original poster
I've attached a screenshot. I'm sure I copied it correctly?

Attachments

• terminal.png
28.8 KB · Views: 76
Comment

subsonix

macrumors 68040
Ok, let's break it down to the bare minimum. If I just use the grep part, I get this:

Code:
Mac% for i in lastWorkout.xml ; do grep -P "(time|durationString|distanceString|pace)" "$i" ; done
<time>2006-09-05T12:53:57+01:00</time>
<durationString>58:58</durationString>
<distanceString>5.00 mi</distanceString>
<pace>11:47 min/mi</pace>

You can remove the '.+' part at the end of that regex btw, it's a leftover from the first line here.

Edit: If I remove the loop as well I get this:

Code:
Mac% grep -P "(time|durationString|distanceString|pace)" lastWorkout.xml
<time>2006-09-05T12:53:57+01:00</time>
<durationString>58:58</durationString>
<distanceString>5.00 mi</distanceString>
<pace>11:47 min/mi</pace>

Comment

Jessica Lares

macrumors G3
Original poster
The second loop doesn't work, and neither does that first version. When I take off the done part (just to see what happens), it just gives me:

Code:
>

Is there any way to check grep and see if it's even correctly set up? I did get it to echo hello.

Comment

subsonix

macrumors 68040
If you remove the done part, then it gives you a prompt to continue with the loop body. 'done' is what finishes the loop; this enables you to write a loop over several lines. It's unrelated to the grep issue. What happens if you try that last line that starts with 'grep'?

Code:
grep -P "(time|durationString|distanceString|pace)" lastWorkout.xml

(Change the name of the xml file to whatever name you are using).

Comment

Jessica Lares

macrumors G3
Original poster
Makes sense. That last line just gives me the usage parameters again.

Comment

subsonix

macrumors 68040
That's odd. What happens if you do "grep -V"? You may try to replace the "-P" with "-E"; the P option is for Perl regular expressions, but the odd thing is that it's mentioned in that usage message you get there.

Also, long shot, but if you copy the line above again and then do:

Code:
pbpaste | od -bc

I get this:

Code:
0000000 147 162 145 160 040 055 120 040 042 050 164 151 155 145 174 144
          g   r   e   p       -   P       "   (   t   i   m   e   |   d
0000020 165 162 141 164 151 157 156 123 164 162 151 156 147 174 144 151
          u   r   a   t   i   o   n   S   t   r   i   n   g   |   d   i
0000040 163 164 141 156 143 145 123 164 162 151 156 147 174 160 141 143
          s   t   a   n   c   e   S   t   r   i   n   g   |   p   a   c
0000060 145 051 042 040 154 141 163 164 127 157 162 153 157 165 164 056
          e   )   "       l   a   s   t   W   o   r   k   o   u   t   .
0000100 170 155 154
          x   m   l

Which is an octal dump, just to make sure there are no weird characters in there (characters up to 177 octal are valid ASCII).

Comment

Jessica Lares

macrumors G3
Original poster
grep (BSD grep) 2.5.1-FreeBSD

And yeah, the -E instead of the -P finally gives me the data.

Comment

subsonix

macrumors 68040
Interesting, I get "(GNU grep) 2.5.1". So, even though the same flag "-P" is supported, there seems to be an incompatibility. But with that out of the way, let's go back and try the previous version again, with the -E modification this time.

Code:
for i in *.xml ; do grep -E "(time|durationString|distanceString|pace)" "$i" | sed 's/<[^>]*>//g' | tr '\n' ';' ; echo ; done > nike+.csv

Btw, you can add a second "tr" in there to remove leading tabs if you have them in the xml.
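Put together, the -E version with the extra tr for leading tabs could look like the sketch below. It builds a made-up sample file in a temporary directory from the four lines shown earlier in the thread. Unlike the one-liner above, the pattern is anchored on the opening < and closing > of the tag, which is my own tightening so that other tags that merely contain "time" or "pace" are not picked up.

```shell
#!/bin/bash
# Made-up sample in a scratch directory, using the values quoted in this thread.
dir=$(mktemp -d)
{
    printf '\t<time>2006-09-05T12:53:57+01:00</time>\n'
    printf '\t<durationString>58:58</durationString>\n'
    printf '\t<distanceString>5.00 mi</distanceString>\n'
    printf '\t<pace>11:47 min/mi</pace>\n'
} > "$dir/lastWorkout.xml"

for i in "$dir"/*.xml ; do
    # grep -E (extended regex) is understood by both BSD and GNU grep.
    # sed strips the tags, the first tr drops the tabs,
    # and the second tr joins the fields with ;
    grep -E "<(time|durationString|distanceString|pace)>" "$i" |
        sed 's/<[^>]*>//g' |
        tr -d '\t' |
        tr '\n' ';'
    echo
done > "$dir/nike+.csv"

result=$(cat "$dir/nike+.csv")
echo "$result"
rm -rf "$dir"
```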

Comment

Jessica Lares

macrumors G3
Original poster
This is what the csv looks like.

And this is from the first cell:

10Step Workout2013-01-07T17:45:54+00:00181074230'11"1.54210.96 mi117418831'30" / mi81021390072.6MD477LL1.0.2 (37A20067)DCYJHBDRF0GP2013-01-07T17:45:54+00:002013-01-07T18:16:05-06:000

The octal dump looked okay BTW. Nothing went over.

Attachments

• Screen Shot 2013-12-30 at 11.18.00 PM.png
56.2 KB · Views: 109
Comment

subsonix

macrumors 68040
I think you'll need to attach one of those xml files here then, if you can. This is what I get in Numbers, with three xml files (all the same) which I found online here: http://blog.mattmecham.com/2006/09/05/ipod-training-data-under-the-hood/. It seems to only be a part of the file btw.

Edit: Actually, you can also try to remove the redirection to the .csv file (> nike+.csv) and see what the output looks like in the terminal; it should be something like this:

Code:
	2006-09-05T12:53:57+01:00;	58:58;	5.00 mi;	11:47 min/mi;
2006-09-05T12:53:57+01:00;	58:58;	5.00 mi;	11:47 min/mi;
2006-09-05T12:53:57+01:00;	58:58;	5.00 mi;	11:47 min/mi;
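The run-together cell quoted earlier has no ; between the values, which suggests grep saw the whole file as a single line. That is only a guess without the actual file. Under that assumption, here is a sketch that pulls out just the wanted elements with grep -o, which prints each match on its own line regardless of how the file is broken up. The one-line sample and the <other> element are made up.

```shell
#!/bin/bash
# Made-up one-line sample: the wanted elements with no line breaks between them.
dir=$(mktemp -d)
cat > "$dir/walk.xml" <<'EOF'
<run><durationString>30'11"</durationString><other>x</other><distanceString>0.96 mi</distanceString><pace>31'30" / mi</pace></run>
EOF

for i in "$dir"/*.xml ; do
    # -o prints only the matched part: the opening tag plus the text up to the next <.
    # sed then removes the opening tag, and tr joins the values with ;
    grep -o -E "<(durationString|distanceString|pace)>[^<]*" "$i" |
        sed 's/<[^>]*>//' |
        tr '\n' ';'
    echo
done > "$dir/nike+.csv"

result=$(cat "$dir/nike+.csv")
echo "$result"
rm -rf "$dir"
```

The -o flag appears in the BSD grep usage message posted above, so it should be available there as well.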

Attachments

• Skärmavbild 2013-12-31 kl. 06.40.36.png
8.9 KB · Views: 81
Comment

Jessica Lares

macrumors G3
Original poster
Comment