Linked by Thom Holwerda on Tue 27th May 2008 13:08 UTC, submitted by Ward D
General Development
AWK is one of the most common UNIX tools to process text-based data in either files or datastreams. Written by Alfred Aho, Peter Weinberger, and Brian Kernighan, AWK "extensively uses the string datatype, associative arrays (that is, arrays indexed by key strings), and regular expressions." ComputerWorld interviewed Alfred Aho.
RE[2]: what an wonderful tool
by whartung on Tue 27th May 2008 22:11 UTC in reply to "RE: what an wonderful tool"
whartung Member since:
2005-07-06

There are other great tools, little tools that simply do their job, just to mention a few: sed, grep, cut.


AWK makes me hate cut.

Why oh why oh why can't cut compress whitespace just like AWK does? By default, AWK splits fields on runs of one or more whitespace characters.

1 2 <-- There's supposed to be several spaces here, but HTML eats them

is the same as

1 2

AWK treats the spanning white space as a single delimiter.

But oh no, not cut. Nope. If you use " " as a delimiter in cut, you'll get a field for every single space.

*sigh*

Even today, modern cut can't do that -- even as an option. So, I use AWK.
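
Since the forum eats the spaces, here is a minimal sketch of the difference, assuming a POSIX shell with awk and cut available. The sample line (a 1 and a 2 separated by three spaces) and the variable names are made up for illustration.

```shell
# Hypothetical sample: two fields separated by a run of three spaces.
line='1   2'

# awk: default field splitting treats the whole run as one delimiter.
awk_second=$(printf '%s\n' "$line" | awk '{ print $2 }')

# cut: every single space is a delimiter, so fields 2 and 3 are empty
# and the "2" only shows up as field 4.
cut_second=$(printf '%s\n' "$line" | cut -d ' ' -f 2)
cut_fourth=$(printf '%s\n' "$line" | cut -d ' ' -f 4)

printf 'awk field 2: [%s]\n' "$awk_second"
printf 'cut field 2: [%s]\n' "$cut_second"
printf 'cut field 4: [%s]\n' "$cut_fourth"
```

With cut, the field number you need depends on how many spaces happen to sit between the values, which is exactly the annoyance.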

Just a pet nit...

Edited 2008-05-27 22:15 UTC


RE[3]: what an wonderful tool
by malkia on Wed 28th May 2008 00:23 in reply to "RE[2]: what an wonderful tool"
malkia Member since:
2005-07-17

First, thanks for the nice AWK tips there. I should learn more AWK.

I'm still using only cut, sed, tr, etc. I work mostly on Windows (which I do not like much), but that's my job, so Cygwin to the rescue.

As for your example with cut and whitespace, here is one way to solve it:

cut --help | sed -r "s/[ ]+/ /g" | cut "-d " -f 2-

I guess sed can be used to collapse each run of spaces into a single space, as a preprocessing step before cut.
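
The same idea on a fixed input, as a rough sketch. It swaps in a basic regular expression (`[ ][ ]*`, one space followed by any number of spaces) so it does not depend on the `-r` flag; the sample text and variable names are hypothetical.

```shell
# Hypothetical sample with uneven runs of spaces between the words.
line='alpha    beta  gamma'

# Collapse every run of spaces into a single space.
squeezed=$(printf '%s\n' "$line" | sed 's/[ ][ ]*/ /g')

# Now cut sees exactly one delimiter between fields.
second=$(printf '%s\n' "$squeezed" | cut -d ' ' -f 2)

printf '%s\n' "$squeezed"
printf '%s\n' "$second"
```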


RE[3]: what an wonderful tool
by news7os on Wed 28th May 2008 13:13 in reply to "RE[2]: what an wonderful tool"
news7os Member since:
2008-05-28

You can "squeeze" spaces with tr (this might be a GNUism, I'm not sure), which is what I do:

echo "blah blah" | tr -s ' ' | cut -f2 -d' '
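
A small sketch of the squeeze, assuming a POSIX shell; the `-s` option is part of the POSIX specification for tr, so it should not be limited to GNU. The sample line is hypothetical and deliberately starts with spaces, because that is one case where this pipeline and awk disagree.

```shell
# Hypothetical sample: leading spaces, then two words separated by a run of spaces.
line='  foo   bar'

# tr -s squeezes each run of spaces down to one, but the leading
# space survives, so cut sees an empty field 1 and "foo" as field 2.
tr_cut_second=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 2)

# awk's default splitting ignores leading whitespace, so field 2 is "bar".
awk_second=$(printf '%s\n' "$line" | awk '{ print $2 }')

printf 'tr + cut: [%s]\n' "$tr_cut_second"
printf 'awk:      [%s]\n' "$awk_second"
```

So the tr approach works well for lines that start with a field, while indented lines shift every field number by one.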

Nice article. AWK! AWK! ;)


RE[4]: what an wonderful tool
by Doc Pain on Thu 29th May 2008 20:49 in reply to "RE[3]: what an wonderful tool"
Doc Pain Member since:
2006-10-08

You can "squeeze" spaces with tr (this might be a GNUism, I'm not sure), which is what I do:

echo "blah blah" | tr -s ' ' | cut -f2 -d' '


Then it would be BSDism, too, because it works in FreeBSD, as I have just checked.

Nice article. AWK! AWK! ;)


Nice sound, at least if Americans and Englishmen pronounce it correctly. :-) In Germany, awk is pronounced "ar way kar" letter-wise - shorter than "ay doubleyou kay" of course... "This is Ay Doubleyou Kay Radio 100.4 MHz, you're listening to Doctor Frasier Crane..." (see Focus Shift n - 1)... :-)

It's just sad HTML eats up all our pretty spaces so we can't demonstrate how nicely it works. It would be great to have <pre>...</pre> enabled here...


RE[3]: what an wonderful tool
by Doc Pain on Thu 29th May 2008 20:43 in reply to "RE[2]: what an wonderful tool"
Doc Pain Member since:
2006-10-08

AWK makes me hate cut.


As you pointed out correctly, there are cases when cut isn't the best tool. But that's the nature of a tool: use it for what it's good at, and don't use it when it creates more problems than simply using another tool.

cut is a good tool when
1. you just want one of n fields,
2. the field delimiter isn't a space or a tab and
3. you don't need to care about multiple spaces or tabs.

I remember a case where all three conditions were met: I needed a stupid script that would extract all the nicknames from my X-Chat log files, so I did - and don't try this at home, kids - the following stupidity:

cat ${LOGFILES} | grep "<" | grep ">" | grep -v "CTCP" | cut -d '<' -f 2 | cut -d '>' -f 1 | sort | uniq -d | xargs echo > nicklist.txt

After I entered it and saw that it worked, I thought that I would have been better off using awk... :-)
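
For comparison, a rough sketch of what an awk version might look like. It is not a drop-in replacement: it assumes each chat line has the shape `... <nick> message` with the `<` coming before any `>`, and the `sample_log` function with its log lines is invented here to stand in for the real log files.

```shell
# Hypothetical stand-in for the X-Chat logs; real log lines may look different.
sample_log() {
  printf '%s\n' \
    'May 27 22:11:01 <alice> hello' \
    'May 27 22:11:05 <bob> hi' \
    'May 27 22:11:09 <alice> anyone used awk?' \
    'May 27 22:11:12 * carol has joined' \
    'May 27 22:11:20 <bob> CTCP noise' \
    'May 27 22:11:30 <bob> sure'
}

# One awk call replaces the three greps and two cuts:
#   -F '[<>]'  splits each line on either angle bracket,
#   the pattern keeps lines containing both "<" and ">" but not "CTCP",
#   $2 is then the text between the first pair of brackets.
# sort | uniq -d keeps only nicks seen more than once, as in the original.
nicks=$(sample_log \
  | awk -F '[<>]' '/</ && />/ && !/CTCP/ { print $2 }' \
  | sort | uniq -d | xargs echo)

printf '%s\n' "$nicks"
```

With real files, the `sample_log |` part would go away and the log file names would be passed to awk as arguments.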


fernandotcl Member since:
2007-08-12

Use the "-F" flag to specify the separator in awk.

In my experience, awk can always replace cut with ease, but the opposite isn't always true. Still, cut is lighter on resources and also more readable (the operation you're performing seems more explicit to me).
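
A short sketch of both points, assuming a POSIX shell; the colon-separated sample record and the variable names are made up for illustration.

```shell
# Hypothetical passwd-style record with ":" as the separator.
record='root:x:0:0:root:/root:/bin/sh'

# Selecting fields 1 and 7: cut and awk -F give the same result.
with_cut=$(printf '%s\n' "$record" | cut -d ':' -f 1,7)
with_awk=$(printf '%s\n' "$record" | awk -F ':' '{ print $1 ":" $7 }')

# Reordering is where cut stops: it always emits fields in input order,
# no matter how the list after -f is written. awk prints them as asked.
cut_swapped=$(printf '%s\n' "$record" | cut -d ':' -f 7,1)
awk_swapped=$(printf '%s\n' "$record" | awk -F ':' '{ print $7 ":" $1 }')

printf '%s\n' "$with_cut" "$with_awk" "$cut_swapped" "$awk_swapped"
```

For the plain selection the cut line is arguably the easier one to read, which matches the point about it being more explicit.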
