Linked by Thom Holwerda on Mon 3rd Dec 2012 22:51 UTC
General Unix "Few tools are more indispensable to my work than Unix. Manipulating data into different formats, performing transformations, and conducting exploratory data analysis (EDA) is the lingua franca of data science. The coffers of Unix hold many simple tools, which by themselves are powerful, but when chained together facilitate complex data manipulations. Unix's use of functional composition eliminates much of the tedious boilerplate of I/O and text parsing found in scripting languages. This design creates a simple and succinct interface for manipulating data and a foundation upon which custom tools can be built. Although languages like R and Python are invaluable for data analysis, I find Unix to be superior in many scenarios for quick and simple data cleaning, idea prototyping, and understanding data. This post is about how I use Unix for EDA."
Thread beginning with comment 544060
RE[4]: But ....
by Hypnos on Tue 4th Dec 2012 03:46 UTC in reply to "RE[3]: But ...."

To be fair, they let you run your old logger in parallel -- all you have to do is change your tried-and-true tools to use their new super-duper interface:

Are they not merciful?

Reply Parent Score: 6

RE[5]: But ....
by Delgarde on Tue 4th Dec 2012 05:00 in reply to "RE[4]: But ...."

More than that - not only are they not stopping you from running a traditional logger if you want one, they're providing fancy tools designed specifically for running complex queries over the binary log format.

In short, they're actually making it easier to parse logging data.

Reply Parent Score: 5

RE[6]: But ....
by Hypnos on Tue 4th Dec 2012 05:07 in reply to "RE[5]: But ...."

I can see how this is useful for corporate deployments with many, many machines. But it's overkill to include by default for 99% of Linux users. It should at most be a plugin.

Reply Parent Score: 3

RE[5]: But ....
by Soulbender on Tue 4th Dec 2012 05:13 in reply to "RE[4]: But ...."

Ok, not as horrible as I initially thought. syslog is a pretty shitty logging system, that I can agree with.
I would have liked to see the example use more structured data though and not a freeform "user blah logged in" message.
Don't see why this should be a feature of systemd though and not a standalone system. The arguments for forcing systemd on us are pretty bogus. Tightly integrated? Yeah, right. Good thing we don't have message passing technologies these days or something.
Also, what's with being so defensive about UUIDs? (I don't mind them, it's just fascinating how big a deal it seems to be.)

As a big fan of Upstart (and daemontools/runit/etc.) I think it's about time the abomination known as SysV init is abandoned (along with runlevels), and in that respect systemd is a step forward. I kinda wish it didn't try to weasel in everywhere though. (A GNOME dependency? WTF?)

Reply Parent Score: 4

RE[6]: But ....
by Hypnos on Tue 4th Dec 2012 05:18 in reply to "RE[5]: But ...."

I've been pretty happy with OpenRC on Gentoo, though it does depend on /sbin/init :

Reply Parent Score: 2

RE[6]: But ....
by gan17 on Tue 4th Dec 2012 11:37 in reply to "RE[5]: But ...."

"Don't see why this should be a feature of systemd though and not a standalone system"

Because it's made by Poettering. Everything he makes wants to take over your system and eat your brains.

Reply Parent Score: 5