
SlugBlanket

macrumors regular
Original poster
Mar 5, 2011
130
7
Hi,

I'm trying to print the output of the top command but I only want certain columns printed in a certain order. The problem I'm having is that I can't seem to get awk to recognise the second field of the top command (no flags) as a single column when there is a space in the command name. It treats the space as being a field delimiter.

I'm stripping out the header info for the time being and getting the top 40 processes. It's the top 40 processes that I want, in the order PID, PPID, Command, State, UID, User, %CPU and Time.

Code:
top -l1 -u -o cpu -S | head -54 | tail -41 | awk '{ printf "%-6s %-6s %-16s %-8s %-3s %-13s %-5s %-8s\n", $1, $15, $2, $16, $17, $28, $3, $4 }'

The processes that are listed with a space in the name are "Google Chrome" btw.

Any help with this is much appreciated.
 

SlugBlanket

macrumors regular
Original poster
Mar 5, 2011
130
7
Have you looked at the top man page? It already supports this.

-Lee

I had, but it must have passed under my radar. Thank you, I will re-read it right now.

I would still like to know if it is possible with awk, as there are other situations where I may need to format data this way, i.e. get awk and printf to cope with a space inside a value when there is no obvious field delimiter.
 

chown33

Moderator
Staff member
Aug 9, 2009
10,739
8,415
A sea of green
I would still like to know if it is possible with awk, as there are other situations where I may need to format data this way, i.e. get awk and printf to cope with a space inside a value when there is no obvious field delimiter.

Take a closer look at your original data, the output of:
Code:
top -l1 -u -o cpu -S | head -54 | tail -41

What I notice right away is that 'top' is using spaces for aligning columns. The first char of each column-datum is vertically aligned to the first char of the header column-label. So if you strip off the header, you've eliminated the only thing that can definitively tell you where columns start.

So step 1 is to not strip the header. Step 2 is to parse the header char-by-char, to find out what the char-offset of each column-label is. Those numbers then tell you the char-offset of each column in the subsequent data lines.

There are any number of ways to iterate chars in a string using awk. One of the simplest is to iteratively get the substring starting at position 2, i.e. the remaining string after the 1st char. You can also iteratively use match(). split() isn't too useful, because you need to know the char-offsets of the space-to-nonspace transition which marks the start of a column-label.
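As a rough illustration of the header-offset idea, here is a minimal sketch using the match() variant. It assumes the first input line is the column-heading line, that no label contains a space, and that every datum starts at the same offset as its label. The function name `slice_columns` and the `|` output separator are made up for the example.

```shell
# Sketch: learn column offsets from the header line, then cut every line
# at those offsets. slice_columns is a hypothetical name.
slice_columns() {
  awk '
    NR == 1 {
      # Walk the header with match(), recording where each label starts.
      rest = $0; pos = 0; ncols = 0
      while (match(rest, /[^ ]+/)) {
        start[++ncols] = pos + RSTART        # offset within the full line
        pos += RSTART + RLENGTH - 1          # chars consumed so far
        rest = substr(rest, RSTART + RLENGTH)
      }
    }
    {
      out = ""
      for (i = 1; i <= ncols; i++) {
        # A column runs up to the next label; the last runs to end of line.
        width = (i < ncols) ? start[i + 1] - start[i] : length($0)
        field = substr($0, start[i], width)
        sub(/ +$/, "", field)                # drop the alignment padding
        out = out (i > 1 ? "|" : "") field
      }
      print out
    }
  '
}
```

Feeding it the pipeline from the first post, e.g. `top -l1 -u -o cpu -S | head -54 | tail -41 | slice_columns`, should keep "Google Chrome" together as one value, provided the first line that reaches awk really is the heading line and the data does line up under the labels.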
 

SlugBlanket

macrumors regular
Original poster
Mar 5, 2011
130
7
Take a closer look at your original data, the output of:
Code:
top -l1 -u -o cpu -S | head -54 | tail -41

What I notice right away is that 'top' is using spaces for aligning columns. The first char of each column-datum is vertically aligned to the first char of the header column-label. So if you strip off the header, you've eliminated the only thing that can definitively tell you where columns start.

So step 1 is to not strip the header. Step 2 is to parse the header char-by-char, to find out what the char-offset of each column-label is. Those numbers then tell you the char-offset of each column in the subsequent data lines.

There are any number of ways to iterate chars in a string using awk. One of the simplest is to iteratively get the substring starting at position 2, i.e. the remaining string after the 1st char. You can also iteratively use match(). split() isn't too useful, because you need to know the char-offsets of the space-to-nonspace transition which marks the start of a column-label.

Sorry, when I said I was stripping the header, I meant I was stripping out the non-column output that gives the load averages etc. (the head and tail parts of the command achieve this), NOT the column headings. The column headings I certainly want to keep, and indeed the first line of my output is the column headings :) I'm new to scripting and indeed macs so you have lost me at "iteratively get the substring..."

I thought that perhaps there was a simple error in my method or code that someone might point out.
 

chown33

Moderator
Staff member
Aug 9, 2009
10,739
8,415
A sea of green
I'm new to scripting and indeed macs so you have lost me at "iteratively get the substring..."
Please outline your programming experience. In particular, how well do you really know awk? Did you write the awk you posted, or copy it from somewhere?

"Iteratively" means "in a loop", because you'll need to do it more than once. "Get the substring" means the substr() function in awk. Read awk's man page, or any of the available awk references online.

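To make "substr() in a loop" concrete, here is a small sketch that walks a line one character at a time and prints each offset where a space is followed by a non-space, which is where a column label starts. It indexes with `substr($0, i, 1)` rather than repeatedly chopping off the first char; both are loops over substr(). The name `label_offsets` is invented for the example.

```shell
# Sketch: visit every character with substr() and report the offsets
# where a label begins. label_offsets is a hypothetical name.
label_offsets() {
  awk '{
    prev = " "; out = ""
    for (i = 1; i <= length($0); i++) {
      c = substr($0, i, 1)                 # the single character at offset i
      if (prev == " " && c != " ")         # space-to-nonspace transition
        out = out (out == "" ? "" : " ") i
      prev = c
    }
    print out
  }'
}
```

For a heading line such as the one made by `printf '%-6s%-17s%s\n' PID COMMAND '%CPU'`, this prints `1 7 24`.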
I thought that perhaps there was a simple error in my method or code that someone might point out.
There is a major strategic error: awk considers any whitespace to be a delimiter. So the whole strategy of using $1,$15,$2,$16,$17 etc. is fundamentally flawed, if the process name (or any other field) can contain whitespace.


There is one other approach that might work in a case like this. It only works if at most one column can have data with spaces in it. What you do is put that column last.

You still parse the header, so you know how many columns there are. Say it's 15. Then in all data lines, you know that fields 1 thru 15 are actual fields, and any fields that appear after 15 are actually part of column 15. For example, if column 15 was actually Google Chrome Helper, then field 15 in that line will be Google, field 16 is Chrome, field 17 is Helper. So fields 15 thru NF (the awk variable) are the actual process name with spaces between the words.

The obvious advantage: it doesn't require char-offsets or substrings. Obvious shortcoming: it only works when a single field contains spaces, and only when you can position it last on the line.
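A minimal sketch of the last-column approach, assuming the number of real columns is already known and passed in as an argument (in a fuller script it would come from counting the header's fields). The name `join_last_column` and the `|` separator are made up for the example.

```shell
# Sketch: treat fields ncols..NF as one column.
# $1 is the number of real columns (hypothetical parameter).
join_last_column() {
  awk -v ncols="$1" '{
    # Glue field ncols and everything after it back together.
    name = $ncols
    for (i = ncols + 1; i <= NF; i++) name = name " " $i
    # Emit the leading fields, then the rejoined last column.
    line = ""
    for (i = 1; i < ncols; i++) line = line $i "|"
    print line name
  }'
}
```

With three columns, a line like `123 1.5 Google Chrome Helper` comes out as `123|1.5|Google Chrome Helper`. Note that runs of spaces inside the name collapse to a single space, since awk has already split on them.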
 

SlugBlanket

macrumors regular
Original poster
Mar 5, 2011
130
7
Have you looked at the top man page? It already supports this.

-Lee

Thanks for pointing this out to me Lee. After a closer look at the man pages the following is giving me exactly what I need in the format I need:

Code:
top -l1 -stats pid,ppid,command,state,uid,user,cpu,time | head -54 | tail -41

With regard to doing this with awk, I like chown33's suggestion of looking at the column headings and using the position of the leading character as a field indicator, but my scripting prowess is still in its infancy (I can understand for and while loops, but substrings are where I meet my Waterloo), so I'll have to fight that battle another day :(

Thanks again to you guys.
 

SlugBlanket

macrumors regular
Original poster
Mar 5, 2011
130
7
Please outline your programming experience. In particular, how well do you really know awk? Did you write the awk you posted, or copy it from somewhere?

There is a major strategic error: awk considers any whitespace to be a delimiter. So the whole strategy of using $1,$15,$2,$16,$17 etc. is fundamentally flawed, if the process name (or any other field) can contain whitespace.

I wrote it myself. I used to be an HP-UX sysadmin but suffered a horrific accident 9 years ago. The medication I have been on for the last 7 years includes oral morphine and pregabalin. As a result of not working and of the medication, I have forgotten much of what I had learned, and I find it difficult to concentrate due to both pain and medication. I also suffer problems with memory as a result of my accident.

So as not to completely lose my skills/mind, and to occupy myself, I decided to buy a Mac. I needed a new PC, and a Mac allowed me to have a better experience AND to practice forgotten UNIX skills.

I lurked on this forum for many months until Lion was released, and have been a Mac user since July 27th.
 