Linked by Thom Holwerda on Sat 11th May 2013 21:41 UTC
"Windows is indeed slower than other operating systems in many scenarios, and the gap is worsening." That's one way to start an insider explanation of why Windows' performance isn't up to snuff. Written by someone who actually contributes code to the Windows NT kernel, the comment on Hacker News, later deleted but reposted with permission on Marc Bevand's blog, paints a very dreary picture of the state of Windows development. The root issue? Think of how Linux is developed, and you'll know the answer.
RE[12]: Too funny
by satsujinka on Tue 14th May 2013 07:08 UTC in reply to "RE[11]: Too funny"

Your argument seems very confused to me, but maybe I'm misunderstanding you.

I'm going to drop the indexing discussion after this, because I haven't studied the topic enough to explain how a database does indexing. However, if we take file=table and line=row, then I would imagine we can cache rows and mark them (inside the cache) with their table. But as I said, I don't know what databases do, so this is just my guess. Also, I'm not convinced that a log database would have performance issues, as there's really only 1 record type and logs don't cross-reference each other much.

Moving back to the top:

if you were to build your own custom SQL database over top file system primitives, it's unlikely to be as flexible or accessible as an SQL database

The bold part is what you're missing, and it is why you're contradicting yourself. You are literally saying that an SQL database is less flexible and accessible than an SQL database. The backend is totally unimportant for non-performance considerations.

The thing is, once you have data in a database, you wouldn't ever have a need to use the standard text tools to access the data since they're largely inferior to SQL (unless of course you didn't know SQL).

See, but there are reasons why you might not want to use a query engine. You list a trivial one (one that a professional system admin, at least, should try to overcome, but not everyone is a professional system admin). Here are some more reasons:
* Because I want to verify that the query engine is returning the correct results. (Query engines have bugs too!)
* Because writing out a full query is more work than grepping for some keyword. (I'm lazy.)
* Because log files shouldn't exist in some magical land separate from all my other files (e.g. off in SQL land while all of my other files are in CLI land; this can also be read as "CLI is what I reach for first".)
* Because I don't want to have to hunt down a database driver just to pick some things out of my logs from within my program.
* Or from the other side of the fence, because I don't want to have to hunt down a database driver to write some logs for my program.
* Because I want to pipe my results out to some other program (this is more a comment on most SQL query engines than a real limitation.)
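
To make a few of those points concrete, here is a rough sketch of picking entries out of a CSV log with nothing but the Python standard library: no database driver, and the matches go to stdout so they can be piped elsewhere. The sample log, its field layout (timestamp, level, message) and the helper name grep_log are all made up for illustration.

```python
import csv
import io
import sys

# Hypothetical log contents; the field layout (timestamp, level, message)
# is an assumption for this sketch, not something from a real system.
SAMPLE_LOG = """\
2013-05-14T07:00:01,INFO,service started
2013-05-14T07:00:05,ERROR,disk quota exceeded
2013-05-14T07:00:09,INFO,request served
"""


def grep_log(lines, keyword):
    """Yield parsed rows in which any field contains keyword (plain substring match)."""
    for row in csv.reader(lines):
        if any(keyword in field for field in row):
            yield row


if __name__ == "__main__":
    # Write matches back out as CSV so another program can consume them.
    writer = csv.writer(sys.stdout, lineterminator="\n")
    for row in grep_log(io.StringIO(SAMPLE_LOG), "ERROR"):
        writer.writerow(row)
```

In practice plain grep does the same job in one line; the point is only that nothing beyond file primitives is needed from inside a program either.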

because it lacks alot of the more advanced features a database can normally provide

And what "advanced" features would apply to a log? There's only 1 record type. CSV provides sufficient capabilities to handle that.

Consider Wikipedia's CSV page:
CSV formats are best used to represent sets or sequences of records in which each record has an identical list of fields. This corresponds to a single relation in a relational database, or to data (though not calculations) in a typical spreadsheet.

Does this not sound exactly like what an entry in a log file is?
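
As a minimal sketch of what I mean, assuming a made-up field list (timestamp, level, source, message), every log entry is one row with the same fields, which is exactly the single-relation case that quote describes. The helper names append_entry and read_entries are hypothetical.

```python
import csv
import io

# Assumed field list for this sketch; a real log would pick its own.
FIELDS = ["timestamp", "level", "source", "message"]


def append_entry(stream, entry):
    """Write one log entry as a single CSV row with the fixed field order."""
    csv.DictWriter(stream, fieldnames=FIELDS, lineterminator="\n").writerow(entry)


def read_entries(stream):
    """Read every row back as a dict keyed by the same fixed fields."""
    return list(csv.DictReader(stream, fieldnames=FIELDS))


log = io.StringIO()  # stands in for a log file opened in append mode
append_entry(log, {"timestamp": "2013-05-14T07:08:00", "level": "INFO",
                   "source": "httpd", "message": "request served"})
# A comma inside a field is quoted by the csv module, so the row stays intact.
append_entry(log, {"timestamp": "2013-05-14T07:08:02", "level": "ERROR",
                   "source": "httpd", "message": "disk quota exceeded, retrying"})

log.seek(0)
entries = read_entries(log)
```

One record type, one row per entry, and the only "schema" is the list of field names.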
