Linked by Thom Holwerda on Mon 3rd Dec 2012 22:51 UTC
General Unix "Few tools are more indispensable to my work than Unix. Manipulating data into different formats, performing transformations, and conducting exploratory data analysis (EDA) is the lingua franca of data science. The coffers of Unix hold many simple tools, which by themselves are powerful, but when chained together facilitate complex data manipulations. Unix's use of functional composition eliminates much of the tedious boilerplate of I/O and text parsing found in scripting languages. This design creates a simple and succinct interface for manipulating data and a foundation upon which custom tools can be built. Although languages like R and Python are invaluable for data analysis, I find Unix to be superior in many scenarios for quick and simple data cleaning, idea prototyping, and understanding data. This post is about how I use Unix for EDA."
Thread beginning with comment 544050
RE[2]: But ....
by Hypnos on Tue 4th Dec 2012 02:15 UTC in reply to "RE: But ...."
Hypnos Member since:
2008-11-19

I'd like to learn of a use case where the best choice of programming language is actually m4, and not because you already have a pile of bash scripts duct taped together.

Reply Parent Score: 3

RE[3]: But ....
by Soulbender on Tue 4th Dec 2012 03:20 in reply to "RE[2]: But ...."
Soulbender Member since:
2005-08-18

I'd like to learn of a use case where the best choice of programming language is actually m4


Sendmail?
Nah, just kidding. Sendmail's use of m4 is also awful.

Reply Parent Score: 2

RE[4]: But ....
by moondevil on Tue 4th Dec 2012 12:35 in reply to "RE[3]: But ...."
moondevil Member since:
2005-07-08

Don't be mean to Sendmail; after all, it allowed some companies to live mainly off Sendmail configuration projects. ;)

Reply Parent Score: 2

RE[3]: But ....
by Neolander on Tue 4th Dec 2012 07:22 in reply to "RE[2]: But ...."
Neolander Member since:
2010-03-08

I'd like to learn of a use case where the best choice of programming language is actually m4, and not because you already have a pile of bash scripts duct taped together.

A mad sysadmin's dream company network, where everything runs some bare-bones variant of UNIX and the home partition is mounted in noexec mode?

Not sure if bash would still agree to run shell scripts in the latter case, though. And even if it spontaneously would, the mad sysadmin might well have patched it by hand so that it fails instead. After all, he can patch everything he wants since he never updates anything anyway.

Edited 2012-12-04 07:29 UTC

Reply Parent Score: 3

RE[4]: But ....
by moondevil on Tue 4th Dec 2012 12:34 in reply to "RE[3]: But ...."
moondevil Member since:
2005-07-08

Having worked with commercial UNIX systems that were pretty close to System V, I am not sure I want to repeat the experience.

Reply Parent Score: 2