Linked by pkrumins on Thu 19th Feb 2009 12:17 UTC
General Development If you have ever been interested in awk and sed Unix utilities, then you probably know about the awk1line.txt and sed1line.txt files that are floating around the Internet. Each file contains around 80 idiomatic sed and awk one-liners for performing various text modification tasks.
Michael
Member since:
2005-07-01

My dad is always singing the praises of Sed and Awk but that's not the only reason I don't like them ;)

They encourage these one-liners, which just means removing the formatting from code. They use regular expressions, something I consider should be avoided at all costs on account of their impenetrable syntax. My general feeling is that in the time it takes to figure out how to do anything with these tools, you could have just written a Python script to do it.

I'm sure they were great tools in their day but I really think Python trumps them, replacing all their functionality and throwing in maintainability to boot.

All that said, these are excellent, useful and well-written articles. If I ever find I'm forced to use these things, I shall be eternally grateful that something like this exists.

Reply Score: 1

Kroc Member since:
2005-11-10

Depends on the load you have to lift. If I wanted to perform a quick bucket-sort, then I’d use PHP, VBScript or any other non-compact scripting language. But if I had to make a syntax processor (like I have: http://camendesign.com/code/remarkable ), then there’s no way I’d do it without regex. I’d have to practically reimplement a hard-coded regex engine in the process of handling the byte-by-byte matching for all the use-cases.

Sure the likes of sed and awk are hard to use, but there’s wizards out there who can, and those of us who can’t -- it’s not ours to say that one tool is better than the other, when in the right hands.

Reminds me of this T-Shirt - http://www.thinkgeek.com/tshirts-apparel/unisex/frustrations/374d/?... - "Go away or I will replace you with a very small shell script."

Edited 2009-02-19 13:08 UTC

Reply Score: 1

massysett Member since:
2007-12-04

I've never considered awk hard to use. You can learn the basics in about thirty pages. Have you ever sat down and read "GAWK: Effective AWK Programming"?

http://www.gnu.org/software/gawk/manual/gawk.html

sed is not hard either. Python is great too, but for little quick jobs awk is great. awk can be just as maintainable if you write it cleanly (and I've seen python code that is a mess, because you can write messy Python too.) But for a one liner, who cares if it is maintainable? You use it once then throw it out!

Reply Score: 3

AtariFan Member since:
2009-01-15

This manual has 360 pages (PDF version). I would like to learn the basics of awk and have already bought a cheap German reference by O'Reilly.
Do you know a shorter tutorial?

Reply Score: 1

pkrumins Member since:
2008-05-07

Do you know a shorter tutorial?


Yep: http://www.grymoire.com/Unix/Awk.html

Reply Score: 2

spiderman Member since:
2008-10-23

First off, Python is not available everywhere, but awk and sed are (almost). Maybe Perl is better than Python, but anyway, that is not the point.
awk and sed scripts are not there to be maintained. They are not developer tools. They are administrator tools and they are there to make it easy for you to edit or search files quickly. sed and awk are used for one-shot commands in 95% of the cases. Once you have the result, there is no need to maintain the command at all.
How many lines do you have to write in python, just to open a file and read it? In sed or awk, that's 0. The file is open and parsed. There is no way you can make it faster in python.
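For comparison, Python's standard `fileinput` module offers something close to awk's implicit loop over input lines. The following is only a minimal sketch; the function names `number_lines` and `main` are made up for illustration.

```python
import fileinput


def number_lines(lines):
    """Yield each line prefixed with its 1-based number, like awk's NR."""
    for nr, line in enumerate(lines, start=1):
        yield f"{nr}: {line.rstrip(chr(10))}"


def main():
    # fileinput.input() iterates over the files named on the command line,
    # or over standard input when no file names are given.
    for out in number_lines(fileinput.input()):
        print(out)
```

Calling `main()` from a script gives roughly the behaviour of `awk '{ print NR ": " $0 }'`, at the cost of a few lines of setup.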

Edited 2009-02-19 14:22 UTC

Reply Score: 6

vivainio Member since:
2008-12-26

How many lines do you have to write in python, just to open a file and read it?


One. cont = open("file.txt").read().

Or

for line open("file.txt"): do_stuff_with(line)

If that means significant extra work for you, you need a harder problem domain ;-)

Reply Score: 3

RandomGuy Member since:
2006-07-30


for line open("file.txt"): do_stuff_with(line)

That should be:
for line in open("file.txt"): do_stuff_with(line)

If that means significant extra work for you, you need a harder problem domain

Agreed.

Reply Score: 2

pkrumins Member since:
2008-05-07


for line open("file.txt"): do_stuff_with(line)

That should be:

for line in open("file.txt"): do_stuff_with(line)


It's neither. It should be:

<pre>
with file("file.txt") as fd:
    for line in fd:
        do_stuff_with(line)
</pre>

Edited 2009-02-20 23:27 UTC

Reply Score: 1

vivainio Member since:
2008-12-26

It's neither. It should be: with file("file.txt") as fd:

Such pedantic diligence (the 'with' statement) is probably not appropriate in a thread that mentions awk/sed. The with statement is needed when you are worried about closing file handles in some future version of Python that may not do reference counting (which could mean that the file handle remains open until the next garbage collection cycle).

Not really something you need to worry about if you are mainly targeting your normal python installation. Furthermore, a linear script can just close() the filehandle without caring about exceptions (because it would just exit the process and close everything anyway).
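To make the trade-off concrete, here is a small sketch of what the with statement guarantees. It uses `open` rather than `file`, since the `file` builtin exists only in Python 2; the helper names `count_lines` and `demo` and the temporary file are assumptions made for the example.

```python
import os
import tempfile


def count_lines(path):
    """Count lines; the with block closes the file even if iteration raises."""
    with open(path) as fd:
        return sum(1 for _ in fd)


def demo():
    # Write a small throwaway file so the sketch is self-contained.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("one\ntwo\nthree\n")
    try:
        with open(tmp.name) as fd:
            first = fd.readline()
        # The handle is closed as soon as the block exits, without waiting
        # for reference counting or a garbage collection cycle.
        return first, fd.closed, count_lines(tmp.name)
    finally:
        os.remove(tmp.name)
```

For a throwaway script the shorter `for line in open("file.txt")` form behaves the same in practice on CPython; the with form only matters once the code lives longer than the shell session.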

Reply Score: 1

JMcCarthy Member since:
2005-08-12

Chinese (traditional/simplified) looks pretty incomprehensible to me, but evidently 1,000,000,000+ people manage to get by.

Without a doubt it's unwelcoming to those who don't understand it, but for those who do...

Reply Score: 4

pkrumins Member since:
2008-05-07

That's why I wrote the articles, to explain them.

It is unfortunate that only the first paragraph shows on the Hacker News website.

Click on http://www.osnews.com/story/21004/Awk_and_Sed_One-Liners_Explained... to find links to my articles!

Reply Score: 2

Googol Member since:
2006-11-24

No, there are not a billion people who can read Chinese, contrary to popular belief, and there is a reason for that. Guess why they had to 'invent' simplified on top of traditional? The same applies to all other fancy writing systems.

However, the point is that different tools serve different purposes and you may get along with a tool without knowing it inside out. I wish I knew any one of these to some extent, but absent an actual need, I cannot justify putting the required time into it. You will know better...

Reply Score: 1

appel Member since:
2007-12-29

Are you serious?!

Regular expressions are awesome, and definitely a core technology, even when using Python.

You are either a troll or completely clueless. soz.

Reply Score: 5

poundsmack Member since:
2005-07-13

http://xkcd.com/208/

and of course:

http://xkcd.com/353/

and for what it's worth; Python > Perl

:)

Reply Score: 4

Michael Member since:
2005-07-01

Regular expressions are very powerful but they cover a limited range of problems between the trivial (the title is the example given to trim whitespace) and the very complex.

These old UNIX tools were terrific in their day when they were the only way of doing things. But I think the fact that they continue to get so much air time has more to do with the fact that it's fun to play with them. They're like crossword clues.

Don't underestimate what you can do with Python. It's trivial to read in a text file and split it into an array using whatever separator you want. Modern scripting languages have tremendously powerful string handling functions and they do all this using real words. RE's are there if you need them but you very rarely do.
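As a rough illustration of that point, the sketch below splits colon-separated records with plain string methods and no regular expressions. The sample data and the name `split_records` are invented for the example.

```python
def split_records(text, sep=":"):
    """Split each non-empty line of text into a list of fields."""
    return [line.split(sep) for line in text.splitlines() if line]


sample = "root:x:0:0\ndaemon:x:1:1\n"
rows = split_records(sample)

# Take the first field of every record, roughly what
# awk -F: '{ print $1 }' would print.
names = [row[0] for row in rows]
```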

My work rate isn't limited by the speed at which I type, it's limited by the speed at which I think (that is, severely limited). I think better if I'm not having to translate everything via these arcane hieroglyphics.

I wish people were better at distinguishing between a genuinely held (and valid) opinion and a troll. It is only my opinion.

Reply Score: 1

abraxas Member since:
2005-07-07

These old UNIX tools were terrific in their day when they were the only way of doing things. But I think the fact that they continue to get so much air time has more to do with the fact that it's fun to play with them. They're like crossword clues.


I think it's because most people still think that bringing Python into the equation for simple formatting and extracting information from output is overkill. If I'm just writing a simple filter to be used between the output of one program and the input of another program on my own machine then I'm just going to use the shell, sed, and awk. If I want to distribute some kind of application that performs all the functions of my script I'm probably going to want to develop it in a language like Python or Perl.

Reply Score: 2

vivainio Member since:
2008-12-26

Modern scripting languages have tremendously powerful string handling functions and they do all this using real words. RE's are there if you need them but you very rarely do.

Depends on how you think (hammer & nails, anyone?). I always think of a regexp solution to any given problem (that I solve in Python) first - typically, I slurp in a string, run re.findall on it, then do a for loop over the resulting tuples.

Regular expressions have the advantage of being insanely fast, and very easy to work with. I agree that for the problems that are trivial enough to solve with awk/sed, regexs may be overkill - you can just do s.replace(), s.split() and s.join().
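A minimal sketch of both approaches, assuming a made-up input string of `name=age` pairs:

```python
import re

text = "alice=30 bob=25 carol=41"

# With two groups in the pattern, re.findall returns a list of tuples,
# one tuple per match.
pairs = re.findall(r"(\w+)=(\d+)", text)
ages = {name: int(age) for name, age in pairs}

# The plain string methods cover the simpler cases without any regex.
csv_line = ",".join(text.replace("=", ":").split())
```

The for loop over the tuples is the dict comprehension here; any per-match processing would go in its place.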

Reply Score: 1

yiyus Member since:
2006-02-27

If you want to convince me python is better than sed/awk for this kind of tasks you should write all the one-liners in python. Then I will listen to you.
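As a starting point for that comparison, here is a sketch of how three common awk idioms might look as Python functions. The function names are invented, and the equivalence is only approximate (whitespace-only lines, for instance, are handled slightly differently than in awk).

```python
def count_lines(lines):
    """Roughly: awk 'END { print NR }'"""
    return sum(1 for _ in lines)


def last_fields(lines):
    """Roughly: awk '{ print $NF }' (last whitespace-separated field)."""
    result = []
    for line in lines:
        fields = line.split()
        result.append(fields[-1] if fields else "")
    return result


def double_space(lines):
    """Roughly: awk '1; { print "" }' (blank line after every line)."""
    out = []
    for line in lines:
        out.extend([line.rstrip("\n"), ""])
    return out
```

Each awk version fits on one line; each Python version needs a function and a caller that supplies the lines, which is more or less the disagreement in this thread.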

Reply Score: 1

whartung Member since:
2005-07-06

I wouldn't really say that AWK and SED encourage these one liners, rather I would say they enable one liners.

When you work with these tools to the comfort level that you can just spit out these one liners, that's where this facility becomes a powerful command line tool rather than just a scripting language.

I don't think twice about just pounding out long pipelines in the shell, or short scripts. Similarly, if it can fit on one line, I'll consider doing the same with AWK or SED.

These are one off events that never last beyond perhaps the shell history.

Reply Score: 3

Delgarde Member since:
2008-08-19

They use regular expressions, something I consider should be avoided at all costs on account of their impenetrable syntax.


You obviously subscribe to the school of thought that says someone who uses regular expressions to solve a problem now has two problems. ;)

True to a degree - they tend to be overused, or made overly complicated by people who think that because they can read it, so will the next person who comes along. But they're also an extremely powerful tool, for which there really isn't any practical alternative - e.g. if you want to validate that a string matches a pattern, you can use a simple regex to do it, or you can write your own parser. And writing your own parser is almost never the right answer.
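A small sketch of the validation case in Python, assuming a hypothetical date-like format; the pattern and the name `looks_like_date` are illustrative only.

```python
import re

# Hypothetical format: four digits, two digits, two digits, separated by
# hyphens (e.g. 2009-02-19). It does not check that the month or day is
# in a sensible range.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def looks_like_date(s):
    """Return True only if the whole string matches the pattern."""
    return DATE_RE.fullmatch(s) is not None
```

Doing the same check by hand means testing the length, the separator positions and every digit individually, which is already a small parser.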

Reply Score: 4

PDF
by dbolgheroni on Thu 19th Feb 2009 13:19 UTC
dbolgheroni
Member since:
2007-01-18

I just liked it. It would be nice to have a PDF version with everything concatenated.

Thank you.

Reply Score: 1

RE: PDF
by pkrumins on Thu 19th Feb 2009 13:25 UTC in reply to "PDF"
pkrumins Member since:
2008-05-07

Yep. I will publish two free ebooks, one on Awk one-liners and one on Sed one-liners.

Ps. you can subscribe to my posts on my blog so that you don't miss them.

Reply Score: 4

RE: PDF
by abraxas on Sat 21st Feb 2009 18:35 UTC in reply to "PDF"
abraxas Member since:
2005-07-07

If you use GNOME just File->Print->Print to file.

Reply Score: 2

Comment by timefortea
by timefortea on Thu 19th Feb 2009 14:28 UTC
timefortea
Member since:
2006-10-11

Excellent.

Reply Score: 1

My Favorite
by fretinator on Thu 19th Feb 2009 16:54 UTC
fretinator
Member since:
2005-07-06

fretinator ~ $ sed "W+F rU $@y!n6"

Reply Score: 3

pdf version?
by maaxx on Thu 19th Feb 2009 17:02 UTC
maaxx
Member since:
2007-11-06

Very nice! But I'm going to wait for the PDF file that'll include all the tutorials (not that I don't like your blog or anything, hehe).

Reply Score: 1

great
by jwwf on Thu 19th Feb 2009 18:08 UTC
jwwf
Member since:
2006-01-19

Great contribution, thank you!

Reply Score: 3

RE: great
by Doc Pain on Thu 19th Feb 2009 20:23 UTC in reply to "great"
Doc Pain Member since:
2006-10-08

Just finished printing the two text files. Although sed and awk already belong to my usual daily tools, I like to learn something new every day. Many thanks!

Reply Score: 2

Awk and Sed One-Liners Explained?
by tupp on Thu 19th Feb 2009 18:14 UTC
tupp
Member since:
2006-11-12

If you have to explain one-liners, then they aren't funny.

Reply Score: 7

Benefits of Awk and Sed
by abraxas on Fri 20th Feb 2009 02:25 UTC
abraxas
Member since:
2005-07-07

It's been said before but you don't always have Python or Perl available and knowing Awk and Sed can be a life saver. They are also great tools for formatting output for input into other programs. An entire scripting language is overkill for that kind of work.

Reply Score: 2

Thank you for this
by alcibiades on Sat 21st Feb 2009 17:47 UTC
alcibiades
Member since:
2005-10-12

Many thanks for this - particularly for the cheat sheet which is very convenient, but the one liners are excellent and well commented too. We can always learn.

The nice thing about awk is, it's lightweight, reasonably fast, terse, and intuitive once you know it and are used to it. What it's good for, it's great for. I think of it a bit like an old pruning knife with a well sharpened, slightly worn blade and a handle polished with use, an Opinel for instance. It's safe, it's cheap, it's sharp, it just fits in the hand, and you hardly think about it any more. And you would be very sorry indeed if you ever lost it.

Reply Score: 2