Linked by Thom Holwerda on Sat 11th May 2013 21:41 UTC
"Windows is indeed slower than other operating systems in many scenarios, and the gap is worsening." That's one way to start an insider explanation of why Windows' performance isn't up to snuff. Written by someone who actually contributes code to the Windows NT kernel, the comment on Hacker News, later deleted but reposted with permission on Marc Bevand's blog, paints a very dreary picture of the state of Windows development. The root issue? Think of how Linux is developed, and you'll know the answer.
RE[7]: Too funny
by Alfman on Mon 13th May 2013 06:21 UTC in reply to "RE[6]: Too funny"
Alfman Member since:
2011-01-28

cdude,

"The database is named filesystem. Binary dumps for logs? That's more stupid then a binary config registry."

I'm finding it peculiar that you'd bring up a "named filesystem" database given that it doesn't apply to logfiles.


With a database, each record exists and is manipulated independently from all other records. You cannot use file system level operators (cd, ls, find, etc) to query log files or manipulate individual records. In order to get the same level of granularity that a database gives us, you'd have to store each "record" or log event in a separate file. Another major difference is that the database records can be indexed such that queries will only read in the records that match the criteria. A text log file on the other hand has no indexes and needs to be fully scanned.
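A minimal sketch of the contrast described above, using Python's built-in sqlite3 module. The sample events, the table name `log`, and the index name `idx_log_level` are made up for illustration. The text version has to split every line to answer a query, while the database can be asked how it plans to run the same query.

```python
import sqlite3

# Hypothetical sample events; the field layout is an assumption.
EVENTS = [
    ("2013-05-13 06:00:01", "INFO", "service started"),
    ("2013-05-13 06:00:07", "ERROR", "disk quota exceeded"),
    ("2013-05-13 06:01:12", "INFO", "user login"),
    ("2013-05-13 06:02:30", "ERROR", "connection reset"),
]

# Text log: one line per event and no index, so a query reads every line.
text_log = "\n".join(" ".join(e) for e in EVENTS)

def scan_text(log, level):
    """Full scan: split every line and keep the ones at the given level."""
    hits = []
    for line in log.splitlines():
        date, time, lvl, msg = line.split(" ", 3)
        if lvl == level:
            hits.append((date + " " + time, msg))
    return hits

# Database: each event is an independent record, and level is indexed.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE log (ts TEXT, level TEXT, msg TEXT)")
db.execute("CREATE INDEX idx_log_level ON log (level)")
db.executemany("INSERT INTO log VALUES (?, ?, ?)", EVENTS)

QUERY = "SELECT ts, msg FROM log WHERE level = ? ORDER BY ts"
indexed_hits = db.execute(QUERY, ("ERROR",)).fetchall()

# Ask SQLite how it would run the query; the last column is a description.
plan = [row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + QUERY, ("ERROR",))]

# A single record can be changed without rewriting its neighbours.
db.execute(
    "UPDATE log SET msg = ? WHERE ts = ?",
    ("connection reset by peer", "2013-05-13 06:02:30"),
)
updated = db.execute(
    "SELECT msg FROM log WHERE ts = ?", ("2013-05-13 06:02:30",)
).fetchone()[0]
```

Both approaches return the same rows on four events; the difference only matters as the log grows, because the scan cost rises with the file size while the indexed lookup touches only matching records.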


Text processing programs like sed/grep/cut/sort/etc are great tools, but SQL is far more powerful for advanced analytics.
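As a rough illustration of what "advanced analytics" can mean here, the sketch below counts errors per host and averages their latency in one query. The table layout and sample rows are hypothetical. A grep/cut/sort pipeline can produce the counts, but the average would typically need an extra scripting step.

```python
import sqlite3

# Hypothetical events: (host, level, response time in ms).
ROWS = [
    ("web1", "ERROR", 900),
    ("web1", "INFO", 120),
    ("web2", "ERROR", 450),
    ("web1", "ERROR", 700),
    ("web2", "INFO", 80),
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE log (host TEXT, level TEXT, ms INTEGER)")
db.executemany("INSERT INTO log VALUES (?, ?, ?)", ROWS)

# Errors per host with average latency, host with the most errors first.
report = db.execute(
    """
    SELECT host, COUNT(*) AS errors, AVG(ms) AS avg_ms
    FROM log
    WHERE level = 'ERROR'
    GROUP BY host
    ORDER BY errors DESC
    """
).fetchall()

for host, errors, avg_ms in report:
    print(host, errors, avg_ms)
```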

Edit: Also, the Windows registry sucks, no disagreement there. But it's not right to put all databases in the same boat as regedit. The registry has a huge gap in analytical power and structure compared to any real database.

Edited 2013-05-13 06:37 UTC

Score: 4

RE[8]: Too funny
by satsujinka on Mon 13th May 2013 07:00 in reply to "RE[7]: Too funny"
satsujinka Member since:
2010-03-11

While cdude was being ostentatious, he does have a point. In that, technically, a file system is a graph database...

More importantly, there's no reason why you can't implement a database on top of text files. There might be some performance penalty from the difference in size between a human-readable word and a machine word, but most other issues (e.g. indexing) are just a matter of translating from the binary representation to what those bytes actually meant.
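One way to picture an index layered on a plain text log is a side table of byte offsets, sketched below. This is only an assumption about what such a scheme might look like: the log format and the helper names `build_index` and `lookup` are hypothetical, and an in-memory buffer stands in for a real file.

```python
import io

# Hypothetical log content; a real log would be a file opened in binary mode.
LOG = io.BytesIO(
    b"06:00:01 INFO service started\n"
    b"06:00:07 ERROR disk quota exceeded\n"
    b"06:01:12 INFO user login\n"
    b"06:02:30 ERROR connection reset\n"
)

def build_index(f):
    """Map each level to the byte offsets of the lines that carry it."""
    index = {}
    f.seek(0)
    while True:
        offset = f.tell()
        line = f.readline()
        if not line:
            break
        level = line.split(b" ", 2)[1].decode()
        index.setdefault(level, []).append(offset)
    return index

def lookup(f, index, level):
    """Seek straight to matching lines instead of scanning the whole file."""
    out = []
    for offset in index.get(level, []):
        f.seek(offset)
        out.append(f.readline().decode().rstrip("\n"))
    return out

index = build_index(LOG)
errors = lookup(LOG, index, "ERROR")
```

The log itself stays human-readable; the cost is that the index has to be built once with a full scan and kept in step with every append.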

Of course, with semi-structured text that has little embedded metadata (e.g. syslog's log files), getting adequate performance would be hard. However, I was already suggesting adding checksum metadata, so it's not really a stretch to imagine that I'm okay with adding whatever other metadata is necessary.

Score: 3

RE[9]: Too funny
by Alfman on Mon 13th May 2013 15:19 in reply to "RE[8]: Too funny"
Alfman Member since:
2011-01-28

satsujinka,

"In that, technically, a file system is a graph database...Skipping over that and more importantly, there's no reason why you can't implement a database on top of text files. Perhaps, there might be some performance penalty due to the size of a human word and a machine word. But most other issues (i.e. indexing) are just a matter of translating from binary to what that byte actually meant."

I realize all of this. A file system *is* a type of database, and anything with a simple key-value mapping would fit it naturally. Moreover, you could re-implement just about any other type of advanced data structure on top of it. However, you'd be reinventing the wheel and would probably end up with something that is slower, less flexible, and less accessible than SQL.


For SQL users, the actual data format is mostly irrelevant, other than for performance and integrity reasons. MySQL has a text database engine, but it isn't as good as the other engines and lacks indexing.
http://dev.mysql.com/doc/refman/5.1/en/csv-storage-engine.html

Generally speaking once you've got the data in a structured database you'll never want to revert to the text processing tools again (sed/grep/cut/etc). The main reason to convert back to text form is for data interchange with others, not for querying or manipulation.



"Of course, with semi-structured text that has little embedded meta-data (i.e. syslog's logfiles,) getting adequate performance would be hard. However, I was already suggesting adding checksum meta-data; so it's not really a stretch to imagine that I'm okay with adding whatever other necessary meta-data."

I'm not sure how much security is gained by checksumming, since an attacker with sufficient access to manipulate the logs would likely also have sufficient access to manipulate the checksums. This would be true whether the logs are binary or text.
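The concern can be shown with a small sketch, assuming a scheme where each log line is stored next to an unkeyed SHA-256 digest. That scheme, the sample lines, and the helper names are assumptions for illustration, not something proposed in the thread in this exact form.

```python
import hashlib

def checksum(line):
    """Unkeyed digest of a single log line."""
    return hashlib.sha256(line.encode()).hexdigest()

def write_log(lines):
    """Store each line alongside its checksum, as a text log might."""
    return [(line, checksum(line)) for line in lines]

def verify(log):
    """True if every stored digest matches its line."""
    return all(checksum(line) == digest for line, digest in log)

log = write_log([
    "06:00:07 login root from 10.0.0.5",
    "06:02:30 sudo rm -rf /var/tmp",
])

# Naive tampering: only the text is edited, so verification catches it.
forged_line = "06:00:07 login alice from 10.0.0.5"
naive = [(forged_line, log[0][1]), log[1]]

# An attacker with write access to the log also recomputes the checksum,
# and the forged log verifies cleanly.
forged = [(forged_line, checksum(forged_line)), log[1]]

print(verify(log), verify(naive), verify(forged))
```

The checksum still catches accidental corruption and careless edits; it just does not stop someone who can rewrite both the line and its digest.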

Score: 4