"Few tools are more indispensable to my work than Unix. Manipulating data into different formats, performing transformations, and conducting exploratory data analysis (EDA) is the lingua franca of data science. The coffers of Unix hold many simple tools, which by themselves are powerful, but when chained together facilitate complex data manipulations. Unix's use of functional composition eliminates much of the tedious boilerplate of I/O and text parsing found in scripting languages. This design creates a simple and succinct interface for manipulating data and a foundation upon which custom tools can be built. Although languages like R and Python are invaluable for data analysis, I find Unix to be superior in many scenarios for quick and simple data cleaning, idea prototyping, and understanding data. This post is about how I use Unix for EDA."
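To make the "chained together" idea concrete, here is a minimal sketch of the kind of pipeline the quote describes. The sample records and the `data` and `summary` variable names are made up for illustration and do not come from the linked post; the tools themselves (`cut`, `sort`, `uniq`, `awk`) are standard.

```shell
# Hypothetical sample data: one "user,status" record per line.
data='alice,ok
bob,error
carol,ok
dave,ok
erin,error'

# Frequency table of the second column, most common value first:
#   cut     keeps only the status field
#   sort    groups identical values so uniq can count them
#   uniq -c prefixes each distinct value with its count
#   sort -rn orders by that count, descending
#   awk     swaps the columns to "value count"
summary=$(printf '%s\n' "$data" \
    | cut -d, -f2 \
    | sort \
    | uniq -c \
    | sort -rn \
    | awk '{print $2, $1}')

printf '%s\n' "$summary"
```

Each stage reads text on standard input and writes text on standard output, so no stage needs its own file handling or parsing code.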
James Hague: "But all the little bits of complexity, all those cases where indecision caused one option that probably wasn't even needed in the first place to be replaced by two options, all those bad choices that were never remedied for fear of someone somewhere having to change a line of code... They slowly accreted until it all got out of control, and we got comfortable with systems that were impossible to understand." Counterpoint by John Cook: "Some of the growth in complexity is understandable. It's a lot easier to maintain an orthogonal design when your software isn't being used. Software that gets used becomes less orthogonal and develops diagonal shortcuts." If there's ever been a system in dire need of a complete redesign, it's UNIX and its derivatives. A mess doesn't even begin to describe it (for those already frantically reaching for the comment button, note that this applies to other systems as well).
Finally something really interesting to talk about. If you've used UNIX or any of its derivatives, you've probably wondered why there's /bin, /sbin, /usr/bin, /usr/sbin in the file system. You may even have a rationalisation for the existence of each and every one of these directories. The thing is, though - all these rationalisations were thought up after these directories were created. As it turns out, the real reasoning is pretty damn straightforward.
"One of the fun examples among all the copyright fuss is the extreme example of copyright claims made by AT&T some time in the 1980s. It's the /bin/true program. This is a 'dummy' library program whose main function is to make it easy to write infinite loops (while true do ...) in shell scripts. The 'true' program does nothing; it merely exits with a zero exit status. This can be done with an empty file that's marked executable, and that's what it was in the earliest unix system libraries. Such an empty file will be interpreted as a shell script that does nothing, and since it does this successfully, the shell exits with a zero exit status. But AT&T's lawyers decided that this was worthy of copyright protection." Three blank lines. Copyrighted. You can't make this stuff up.
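The empty-file trick is easy to try. The sketch below is an illustration, not the historical AT&T file: the `mytrue` name and the scratch directory are invented, and it assumes `mktemp -d` is available and that the temporary directory is not mounted `noexec`.

```shell
# Create an empty file in a scratch directory and mark it executable.
tmpdir=$(mktemp -d)
: > "$tmpdir/mytrue"
chmod +x "$tmpdir/mytrue"

# Run it. The file has no recognisable executable format, so the shell
# falls back to interpreting it as a (do-nothing) shell script.
"$tmpdir/mytrue"
status=$?
echo "exit status: $status"

# Use it as a loop condition, as in "while true". The loop would run
# forever, so this sketch breaks out after three iterations.
count=0
while "$tmpdir/mytrue"; do
    count=$((count + 1))
    if [ "$count" -ge 3 ]; then
        break
    fi
done
echo "iterations: $count"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```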
Way back in 2002, MIT decided it needed to start teaching a course in operating system engineering. As part of this course, students would write an exokernel on x86, using Sixth Edition Unix (V6) and John Lions' commentary as course material. This, however, posed problems.
The groundbreaking work he did with Ken Thompson led to the operating system behind everything from set-top boxes to the iPhone, but who sings the praises of the late Dennis Ritchie?
Twitter is currently buzzing about the death of Dennis Ritchie, the visionary creator of UNIX and C, among other things. We hope it's just a false rumor. Story developing, we will be updating. Update: Unfortunately, it seems to be confirmed. Rob Pike, co-creator of the Plan 9 and Inferno OSes, who worked with Ritchie in the past and currently works on Google's Go language, posted this.
"The proc filesystem is a special filesystem found on most UNIX-based systems. It holds a great deal of information, in ASCII format, most of which is not very friendly to the average user. I've made a list of some of the files I find to be of most use."
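As a rough illustration of what reading these files looks like, the sketch below prints the first line of three entries. The choice of files is an assumption (they are common on Linux and are not necessarily the ones on the quoted author's list), and the layout of /proc differs between systems, so each file is checked before it is read.

```shell
# Files picked for illustration only; availability varies by system.
checked=0
for f in /proc/version /proc/loadavg /proc/uptime; do
    if [ -r "$f" ]; then
        # The entries are plain text, so ordinary tools can read them.
        printf '%s: %s\n' "$f" "$(head -n 1 "$f")"
    else
        printf '%s: not available on this system\n' "$f"
    fi
    checked=$((checked + 1))
done
```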
Today, Ken Thompson and Dennis Ritchie, the two Bell Labs scientists who began creating the Unix operating system in 1969, have been named as winners of the 2011 Japan Prize for information and communications.
What would have happened if the ST had run a BSD-based UNIX rather than TOS and GEM? "To run Unix effectively we needed some hardware that was very fast, that was simple enough to put into a minor spin of the ST’s memory controller with little project risk, and that would still provide some kind of memory relocation and protection. The ability to have separate address spaces to isolate processes would be good, too."
Good news: the UNIX copyrights owned by Novell will not fall into the hands of Microsoft as part of the IP purchase by Redmond. "Novell will continue to own Novell's UNIX copyrights following completion of the merger as a subsidiary of Attachmate," states John Dragoon, Chief Marketing Officer at Novell. Yeppers.
I had the pleasure earlier this month of attending a demo day at HP's Cupertino campus to commemorate the ten-year anniversary of the Superdome server, see what's new in the high-end server market, and learn about what's going on with HP-UX.
"Hewlett-Packard is rolling out Update 5 for the HP-UX Unix operating system that runs its Itanium and PA-RISC lines of Integrity and HP 9000 servers, keeping to its pattern of two updates per year for its flagship operating system. As has been the case with the prior HP-UX updates, the changes are important to existing HP-UX shops, but they're probably not going to cause a stampede of buyers for HP-UX systems. It's no different with the updates to IBM's AIX or Sun Microsystems' Solaris Unixes."
Do you know what to do when your UNIX network slows to a crawl and file transfers or connections to services suddenly come to a stop? How do you diagnose the issues and work out where in your network the problems lie? This article looks at some quick methods for finding and identifying performance issues and the steps to start resolving them.
"The computer world is notorious for its obsession with what is new - largely thanks to the relentless engine of Moore's Law that endlessly presents programmers with more powerful machines. Given such permanent change, anything that survives for more than one generation of processors deserves a nod. Think then what the Unix operating system deserves because in August 2009, it celebrates its 40th anniversary. And it has been in use every year of those four decades and today is getting more attention than ever before."
"Earlier this year, people in many places wrote about the 40th anniversary of the moment Ken Thompson sat down and started to work on UNIX (which is actually in August). In fact, UNIX celebrates another birthday this year, even though on a slightly smaller scale. In July 1974, exactly 35 years ago, Dennis Ritchie and Ken Thompson published the first version of their seminal paper The UNIX Time-Sharing System in the Communications of the ACM."
Gary Anthes offers an overview of the history of Unix, forty years after Ken Thompson banged out the first version in assembly language for a wimpy DEC PDP-7 minicomputer, spending one week each on the operating system, a shell, an editor, and an assembler. Also included in the package are a year-by-year timeline of its evolution and profiles of Unix giants David Korn, Rick Rashid, and Gordon Bell.
You can find out a lot about your network by using a variety of different tools. Understanding the layout of your network, where packets are going, and what people are doing is important. This tutorial examines techniques for monitoring the traffic and content of your UNIX network and how to read and diagnose problems on your network.
Even though the old-world UNIX operating systems, like IRIX and HP-UX, have been steadily losing ground to Linux for a long time now, they do still get updated and improved. HP-UX 11i v3 is supposed to get update 4 tomorrow, with a host of new features that won't excite you if you're used to Linux, but they're still pretty useful for HP-UX users.
Take a look at some systems that enable you to trace the execution of applications and work out what they are doing without having to make any modifications to the source code, and even without having to stop and restart the application. See how with tracing alone, you can find and diagnose problems with just a few commands.