Although you can now run a number of Linux distros natively on Windows 10, this integration has been a little tricky when it comes to handling filename case, as Linux is case sensitive and Windows is not.
In order to overcome this limitation, starting with the Windows 10 April 2018 Update (version 1803), NTFS includes a new flag that you can enable on a per-folder basis allowing the file system to treat files and folders as case sensitive.
I’m sure there are countless technical reasons as to why case sensitive is the preferred route to go, but is there a case to be made for case insensitivity being simpler and less confusing to use?
If all you have are Western languages maybe, but “case” is not a universal language construct. It’s meaningless to do a case-insensitive compare of Arabic, for example.
Magical “do what I want, not what I say” behavior in a filesystem is already dangerous enough; having that behavior hinge on the character set is even worse.
Edit: I read too fast. You’re asking for good reasons for genuine case-insensitivity in file system operations. I can’t think of any.
Edited 2018-05-29 23:59 UTC
bartgrantham,
On the other hand, I generally feel that it is bad practice for multiple files to be in a directory differing only by letter case. Same thing with URLs for that matter. Prior to tab completion and GUIs, having to type file names in the right case (i.e. via command line or ftp, etc.) was a major annoyance of unix. “Do what I want and not what I say” would generally lean towards case insensitivity IMHO.
The overwhelming majority of the time I want my text searches to match both upper and lower case letters. I find it annoying that unix text editors use binary search by default. Maybe it’s just me?
One man’s trash is another man’s treasure.
I absolutely love the fact that I can search with or without case sensitivity on pretty much every editor I use under Linux as it allows me to more easily filter out noise results.
Anyway, on any decent editor in Linux, it is a click away and can be made default. By the way, the KDE file manager has an option to do the same.
Also, I have an old habit of naming files with some characteristics with case consideration. I love it because, on directories with many files, it helps sort them the way I like.
For example, all music tracks I have are always {artist}-{title}[-{differentiator}], with initials capitalized (and I move track number, album, genre and other information to ID3/metadata tags). For songs with lyrics files, I convert things to lower case. I do it because the names of the files never get too long and also because I can filter them very fast.
I do similar things in my projects.
acobar,
Possibly. I mostly use command line tools from ssh server connections. I’ve already encountered other scenarios where changing the defaults helps (like fixing broken vim auto indentation), but it’s just a pain to have to do so over and over again across so many servers. I need a way to auto deploy my preferences when I connect, but that could confuse other users though, haha. “What the hell, I’ve done the same thing on these two servers and the results are different!”
Not just you; I also think case insensitive is a better fit to most user interactions… But for example for some reason Rockbox (free software firmware/OS for mp3 players, rockbox.org ) is by default case sensitive when sorting files, possibly because devs are on Linux… luckily, in this case the behaviour can be toggled.
That’s no excuse for not having support for it in general. However, over the years, I’ve come to accept that complaining about Windows’ lack of support for lots of things doesn’t matter much. But sometimes I still get angry about dozens of small things (like this particular example) that would’ve made life on Windows easier. Yes, there’s some support for this right now, but come on, we had to wait until 2018 for this? They can add dozens of complicated features, but this was “planned” for 2018? It’s a “fun” OS where the support of such things rises to the level of news.
I believe the two main reasons are:
* Mac and Windows are both case-insensitive by default (lacking this and not using UFS in OS X, assuming that is still an option like it used to be), so it’s more “normal” to be case insensitive.
* Case doesn’t distinguish names: “Nicholas Cage” and “NICHOLAS CAGE” are clearly the same thing in meaning.
I’m not saying those are good reasons, but those are the reasons I hear the most.
Or, you could just as well say both are really dumb for not supporting it from day 1.
Mac always did support it, through UFS. I’ve had Panther installed using UFS in the past, and it worked fine. I’m sure there are applications out there that would not have worked, but I did not notice any.
Just some food for thought :
2006 https://www.hanselman.com/blog/SubversionCasesensitivityProblems.asp…
2011 https://superuser.com/questions/266110/how-do-you-make-windows-7-ful…
2014 https://github.com/owncloud/client/issues/1348
2014 https://github.com/syncthing/syncthing/issues/430
2014 https://stackoverflow.com/questions/20969987/dropbox-unicode-encodin…
2016 https://www.endpoint.com/blog/2016/01/07/file-names-same-except-for
2017 https://www.dropboxforum.com/t5/Syncing-and-uploads/Suddenly-have-ma…
Looks like an all-time problem. And you also have UTC to handle.
That isn’t universally true… I know it’s a nitpick, but “kb” (kilobit) and “KB” (kilobyte) mean two entirely different things. There are probably other examples regarding abbreviations, although you can blame most of them on the metric system.
Or what about “Polish” and “polish”? Those are two entirely different words…
Just saying, there are actually a few corner cases where the case of letters completely changes the meaning of a word or phrase. Not many at all, and probably not enough to matter, but they do exist.
https://en.wikipedia.org/wiki/Capitonym
ps. As noted in link above, in German all nouns (not just proper nouns) are capitalized, so this is actually a very common thing to run into in German.
Edited 2018-05-30 06:22 UTC
galvanash,
You should be right about kb and kB, but realistically we are so inconsistent in practice that assuming the author’s intended meaning based on letter case is not a sure thing. I don’t know if you realize it or not, but you took the liberty of arbitrarily flipping the case of the “K” to write “KB”, which is technically a wrong SI unit.
The pedantic interpretation of “mb” would be “milli-bits”, although it’s far more likely the author actually means megabytes and didn’t bother with letter casing. The incorrect use of letter cases is so prevalent that we can’t really make assumptions based on the case of a letter. We can blame the users, vendors, and even technical users for using the wrong case, but I think it’s futile and in hindsight it was a bad idea to have upper and lower case of “B” to represent bytes and bits.
“Begoña Fernández”, “Begoña Fernandez”, “BEGOÑA FERNANDEZ” and “BEGOÑA FERNÁNDEZ” are the same person too. Also “BEGONA FERNANDEZ”, which is very wrong but enforced, for example, by airlines. And rules start not being so obvious.
In Spanish, diacriticals are often omitted in upper-case letters, though for the sake of correctness they shouldn’t be. It is very bad style to omit them from lower-case letters, but still it is sometimes done in the computer world because so many computer systems choke on them. And what do you do with the “ñ”? It is considered a real letter, not an “n” with a “~”. Oh, and sorts should always be both case and diacritical-insensitive (“ñ” is between “n” and “o”).
And that is just in Spanish, which is quite a simple case; surely Czech, Hungarian or Vietnamese make things vastly more complex. Not to mention Arabic, or other languages with more than two cases. Maybe, in some, case can change meaning. Case insensitivity is a complex subject, not just a matter of subtracting 0x20 from each char.
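To make the 0x20 point concrete, here is a minimal Python sketch. The helper name ascii_upper is made up for illustration, and Python's built-in str.upper() is only standing in for "a Unicode-aware mapping", not for what any particular file system does. Subtracting 0x20 is only valid for ASCII a-z; applied blindly to other characters it lands on unrelated code points, and a proper mapping can even change the length of the string.

```python
def ascii_upper(text: str) -> str:
    """Naive uppercasing: subtract 0x20 from ASCII lowercase letters only."""
    return "".join(
        chr(ord(ch) - 0x20) if "a" <= ch <= "z" else ch
        for ch in text
    )

name = "begoña fernández"
naive = ascii_upper(name)    # the accented letters are left in lower case
proper = name.upper()        # Unicode-aware mapping handles them

# Blind subtraction outside ASCII lands on an unrelated character:
blind_czech = chr(ord("č") - 0x20)   # not the upper-case form of č
real_czech = "č".upper()

# And some mappings are not even one-to-one:
sharp_s = "ß".upper()                # becomes two characters

print(naive, proper, blind_czech, real_czech, sharp_s)
```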
And in Unicode you can write letters like ö in 2 different ways: as NFD (2 codepoints) or NFC (1 codepoint). Are they the same file name? For crappy (=basically all) American and UK systems I have to write my last name as Panzenboeck instead of Panzenböck. Is that the same file name?
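A small Python sketch of the NFC/NFD point, using the standard unicodedata module. The surname is taken from the comment above; everything else is illustrative. A file system that compares raw bytes sees two different names, while one that normalizes before comparing sees a single name.

```python
import unicodedata

# Same visible name, two encodings of the o-umlaut.
nfc = unicodedata.normalize("NFC", "Panzenb\u00f6ck")  # one code point, U+00F6
nfd = unicodedata.normalize("NFD", "Panzenb\u00f6ck")  # "o" + combining U+0308

same_bytes = nfc.encode("utf-8") == nfd.encode("utf-8")
same_after_normalizing = unicodedata.normalize("NFC", nfd) == nfc

print(len(nfc), len(nfd), same_bytes, same_after_normalizing)
```

The transliterated spelling “Panzenboeck” is a third, unrelated byte sequence; no Unicode normalization form maps it back to the umlaut.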
There isn’t only the differentiation between normalizations (normal forms, whatever), but also the fact that o != о. But that’s a whole different problem.
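For anyone who can't see the difference between the two letters above, a quick Python check (a sketch using only the standard library) shows that one is Latin and the other Cyrillic, and that neither normalization nor case folding maps one onto the other:

```python
import unicodedata

latin_o = "o"            # U+006F
cyrillic_o = "\u043e"    # looks identical in most fonts

for ch in (latin_o, cyrillic_o):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Even the most aggressive standard normalization plus case folding
# keeps the two letters distinct.
still_different = (
    unicodedata.normalize("NFKC", cyrillic_o).casefold()
    != unicodedata.normalize("NFKC", latin_o).casefold()
)
```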
This makes life easier for me. Unison and Syncthing are great sync tools but this is one issue that was never really solved because of the underlying filesystem case handling.
Simplest answer is it’s not always ideal to tie your file systems to a specific locale, and without doing that, case folding is virtually impossible (and you have to do case folding if you want a case-insensitive but case-preserving file system). Even when you DO tie to a specific locale, the rules are often complex and sometimes ambiguous.
You don’t always name your files, sometimes someone else did. For example, take these three words:
Maße
MASSE
Masse
In case-insensitive English, the first word is distinct from the others, but the last two are equal. However, in Swiss German, the first and third are equivalent, but in German the first and second are (but not the third)…
It’s easy to say “just use the user’s locale”, but that often leads to naming issues between their files and files that come from 3rd parties. This problem doesn’t happen in case-sensitive file systems, because all 3 of the above are distinct.
Of course this isn’t the most common problem in the world, but it IS a problem that case-sensitive file systems handily avoid.
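As a rough illustration of how the choice of folding rule decides which of those three names collide, here is a Python sketch. The group_by helper is made up for this post, and str.lower / str.casefold are just two readily available folding rules standing in for "different locales"; they are not the rules of any real file system.

```python
names = ["Maße", "MASSE", "Masse"]

def group_by(key, items):
    """Group items whose key() values are equal, preserving input order."""
    groups = {}
    for item in items:
        groups.setdefault(key(item), []).append(item)
    return list(groups.values())

exact = group_by(lambda s: s, names)    # case-sensitive: compare as written
simple = group_by(str.lower, names)     # simple lowercasing leaves ß alone
full = group_by(str.casefold, names)    # full case folding turns ß into "ss"

for label, groups in (("exact", exact), ("lower", simple), ("casefold", full)):
    print(label, groups)
```

Three rules, three different answers to "which of these are the same file", which is exactly the decision a case-insensitive file system has to bake in.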
I personally don’t find case-sensitive file systems confusing, but I understand the argument…
Edited 2018-05-30 01:11 UTC
Too bad I commented before I voted your comment up.
I actually came here to argue in favor of case-insensitivity, but your post completely changed my mind. My opinion was definitely Western-centric, and specifically English-centric.
So, thanks for that.
I think that is a good argument for case preserving file naming, but not for case sensitivity.
By all means, filesystems should preserve the case that the user chose, but I think the case for case sensitive filesystems is really a case of the tail wagging the dog. Case sensitivity causes more problems than it solves in my opinion.
mkone,
+1 for bringing this up. Case preserving is obviously very useful, yet it doesn’t imply case sensitivity. I find it bad practice to store multiple files differing only in letter case. It creates confusion and doesn’t seem like something I’d want to do intentionally. The main problem is that case sensitivity can get complex in other languages. A file system that supports unicode suddenly has to deal with that. It makes it hard to find an ideal solution.
The point is that a case-insensitive/case-preserving file system must make a choice as to which locale it wants to function in, because at that point case folding becomes a mandatory operation just in order to store a file by name. Having file naming and uniqueness rules be different for different users would be a nightmare.
Case-sensitive file systems don’t have to deal with this. They can take locale into account for things like searching and sorting, but those things are in a sense UI operations, so the file system can just present them to each user according to their chosen locale (and operate in a case-sensitive manner under the hood). It’s in the fundamental process of naming things where this becomes a challenge to deal with across locales, and case-sensitive file systems can happily not bother.
Case-insensitive/case-preserving file systems have to function in a system locale, because naming things (and the requisite case folding required) has impact beyond the current user on a multi-user system, and it is an operation that has to be performed even when there is no user other than the system itself (on behalf of a process for example).
What happens when a user uploads a file to a web server? Do you know what locale they operate in? Do you know what locale the user who will next look at the file operates in? Sure, you can just name the file whatever you want when stored and ignore or override the uploader’s filename on store (and that is what most systems actually do), but I’m just pointing out the challenges involved for case-insensitive file systems. All a case-sensitive file system has to do is see if a file with the exact same name exists, it doesn’t have to deal with locale until later (if it chooses to)…
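A toy Python model of that difference, purely as a sketch: the Directory class and its key parameter are invented for illustration, and str.casefold is an arbitrary stand-in for whatever system-wide folding rule a real case-insensitive file system would have to pick.

```python
class Directory:
    """Toy directory: names must be unique under a chosen key function."""

    def __init__(self, key=lambda name: name):
        self._key = key        # identity key means exact (case-sensitive) match
        self._entries = {}     # folded key -> name as originally written

    def create(self, name: str) -> None:
        k = self._key(name)
        if k in self._entries:
            raise FileExistsError(f"{name!r} clashes with {self._entries[k]!r}")
        self._entries[k] = name   # case-preserving: keep the original spelling

    def names(self):
        return list(self._entries.values())

sensitive = Directory()                     # no folding, no locale needed
insensitive = Directory(key=str.casefold)   # must commit to one folding rule

for d in (sensitive, insensitive):
    d.create("Names.txt")

sensitive.create("names.TXT")               # fine: a different name
try:
    insensitive.create("names.TXT")         # rejected: folds to the same key
    clash = None
except FileExistsError as exc:
    clash = str(exc)
```

The case-sensitive directory never calls a folding function at all, so the locale question simply never comes up at naming time.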
Again, I’m not trying to make a strong argument either way. I’m just pointing out there are factors involved most people don’t think about.
Edited 2018-05-30 15:22 UTC
galvanash,
It certainly is easiest to stick with a binary comparison, and in the interest of simplicity it seems like the best way to go. However I am still bothered by some of the consequences, like URLs being case sensitive.
Case sensitive file systems are generally the reason why URLs are case sensitive on linux servers even though I think ideally they should not be.
Works:
http://www.osnews.com/story/30418/UTC_is_enough_for_everyone_Right_
Broken:
http://www.osnews.com/Story/30418/UTC_is_enough_for_everyone_Right_
Edited 2018-05-30 17:43 UTC
And the following is OK:
http://www.osnews.com/story/30418/UTC_IS_ENOUGH_FOR_EVERYONE_RIGHT_
That is just ridiculous in my opinion. I know this is an edge case, but it does not make any sense at all that you can’t capitalise “story” but you can capitalise just about any other part of the URL.
As you might have guessed from that, the portion between the initial / and the ?, #, or end of the URL can be any string and making it look like a hierarchical path is merely convention.
The “story/” is a literal character-sequence match, the first capture group, which only matches one or more digits, is used as a primary key for a database lookup and the second capture group, which matches / followed by zero or more of anything, is probably ignored, aside from being filled from a post slug column when generating links in order to make more human-friendly URLs.
The following URLs also work:
http://www.osnews.com/story/30418
http://www.osnews.com/story/30418/THIS_CAN_BE_ANYTHING!
http://www.osnews.com/story/30418/C:\WINDOWS\SYSTEM\BLANK.SCR
(Though Firefox does a \ to / normalization on it before loading it.)
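The routing described above can be sketched in Python like this. To be clear, the pattern is my reconstruction from the description (literal “story/”, a digits-only capture group, an optional “anything” group); the site's real rewrite rule is not known to me, and story_id is a made-up helper.

```python
import re

# Hypothetical pattern reconstructed from the description above.
STORY_ROUTE = re.compile(r"^/story/(\d+)(/.*)?$")

def story_id(path: str):
    """Return the numeric story id if the path matches the route, else None."""
    m = STORY_ROUTE.match(path)
    return int(m.group(1)) if m else None

for path in (
    "/story/30418",
    "/story/30418/THIS_CAN_BE_ANYTHING!",
    "/story/30418/UTC_IS_ENOUGH_FOR_EVERYONE_RIGHT_",
    "/Story/30418/UTC_is_enough_for_everyone_Right_",
):
    print(path, "->", story_id(path))
```

Compiling the same pattern with re.IGNORECASE would make “/Story/…” work too, which is why this is a routing choice rather than a file system limitation.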
Edited 2018-05-30 18:49 UTC
That’s the thing though… URLs are not (necessarily) case sensitive, the web servers are. It’s not even really an issue with the file system being case-sensitive, it’s just that the web server doesn’t bridge the gap properly. Apache fixed this ages ago, and most modern web servers handle it fine (at least optionally). It very slightly impacts the response time on the first request to a static file, but once it has been served once it’s pretty much invisible from a performance perspective.
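Roughly what “bridging the gap” could look like, as a Python sketch. This is not Apache's code; resolve is a hypothetical helper that takes a directory listing as a plain list so it can be tried without touching a real file system. It is only similar in spirit to modules such as Apache's mod_speling.

```python
def resolve(requested, listing):
    """Map a requested file name onto a real one, ignoring case as a fallback."""
    if requested in listing:              # fast path: exact match, no scan needed
        return requested
    wanted = requested.casefold()
    matches = [name for name in listing if name.casefold() == wanted]
    # Only bridge the gap when the answer is unambiguous.
    return matches[0] if len(matches) == 1 else None

print(resolve("Logo.PNG", ["index.html", "logo.png"]))
print(resolve("readme", ["README", "ReadMe"]))
```

The slow path scans the whole listing, which is where the first-request cost comes from; a server would presumably cache the result afterwards.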
Again, I have no issue whatsoever with how Windows and OSX handle case-insensitivity. I get it, it works fine in most cases, and it is less confusing for users in many regards. Just playing devil’s advocate.
Edited 2018-05-30 19:03 UTC
galvanash,
There may not be a one-size-fits-all solution; being able to configure it is probably the best that we can hope to do. While it’s not a file system, I like the way MySQL handles it, allowing each table and column to set character sets and collation independently in whatever ways are needed by the application.
Edited 2018-05-30 23:06 UTC
Fair enough. I expected it would have less of an impact, even on that large a directory… I just never run into this problem though, static file serving has become almost a non-issue in my work (i.e. it so rarely happens that it just doesn’t matter much). With CDNs caching everything, my servers rarely even see them to be honest, and when they do I only have a few hundred files to scan at most. But yeah, I can see it becoming an issue with millions of files…
Edited 2018-05-31 00:26 UTC
galvanash,
This is more powerful than one might realize at first glance because you can index the result of almost any function, including those you write! It would be awesome to have a similarly powerful indexing capability inside of file systems!
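To show what indexing the result of a function can buy you, here is a small sketch using SQLite from Python. This assumes the comment is about expression (functional) indexes in a database; SQLite is my choice of stand-in, not necessarily the database being discussed, and the table and index names are invented. Note that SQLite's built-in lower() only folds ASCII letters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT NOT NULL)")
# Index the *result* of lower(name) rather than the raw column. Making it
# UNIQUE gives a case-preserving but (ASCII) case-insensitive name table.
conn.execute("CREATE UNIQUE INDEX files_lower_name ON files (lower(name))")

conn.execute("INSERT INTO files VALUES ('Names.txt')")
try:
    conn.execute("INSERT INTO files VALUES ('names.TXT')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True    # collides with 'Names.txt' under lower()

found = conn.execute(
    "SELECT name FROM files WHERE lower(name) = lower(?)", ("NAMES.TXT",)
).fetchall()
print(rejected, found)
```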
When I said “at the file system level” I just meant that it was not a property of the volume as a whole, it can be toggled per file/directory. On the other hand, it is my understanding that on OSX/APFS case-sensitivity is actually a volume property that gets baked in on format.
Anyway, don’t disagree with anything. For a good example of the pain required to do proper case folding/unicode normalization, read up on the development of APFS. This guy’s blog covers a lot of it and how it has slowly progressed in the last couple of years to being usable.
https://eclecticlight.co/2017/07/05/high-sierra-and-filenames-apple-…
https://eclecticlight.co/2017/04/06/apfs-is-currently-unusable-with-…
https://eclecticlight.co/2017/04/07/apfs-and-macos-10-13-many-apps-a…
Edited 2018-05-31 03:21 UTC
If the kernel-mode case insensitivity setting is off, then the Win32 API is fully case sensitive. It is basically a big myth that Windows is not case sensitive. The Windows NT line of operating systems is case sensitive, but before this new Windows 10 feature you had to turn case insensitivity on or off system-wide, or use the NT native functions directly. The reason the NT native functions could handle case sensitivity is that if you had created files with case insensitivity off and there were duplicates, i.e. hello.txt and Hello.txt in the same directory, then with insensitivity on the two files would not both be readable by Win32 applications.
There is a common mistake by Windows application developers of incorrectly presuming the Windows API is always in case insensitive mode, and this causes problems when you are attempting to tweak a Windows server for performance.
Really from a performance point of view you don’t want application code demanding case insensitivity. For user created documents there might be a reason for case insensitivity.
oiaohm,
There’s a lot more interesting tidbits, but suffice it to say that NTFS does support case insensitivity at the file system level! It’s not merely something added on top of a case sensitive file system. If you read my earlier posts it should become clear why adding case insensitive interface on top of a case sensitive file system doesn’t work too well.
Edited 2018-05-31 14:37 UTC
None of what you quoted is really written to the NTFS disc by Windows; it is just the way the open source implementation has decided to do it.
The NTFS file system itself, being the disc format, is case sensitive. The ordering of file entries on disc puts capitals ahead of lower case, so Abc comes ahead of abc. This matters: if you turn case insensitivity on in Windows after having created Abc and abc with it off, the result is that “Abc” opens and “abc” is hidden. Turn it back off and both files appear again. Windows NTFS case insensitivity basically opens the first file in the index that would match if the file name were all caps.
If your NTFS driver changes the ordering of things depending on whether it is case insensitive or case sensitive, it is broken. This does bring out different bugs.
oiaohm,
Your statements were proven wrong. If you have another point to make, then make it; otherwise I’m not interested in arguing over facts.
Edited 2018-05-31 15:30 UTC
Hi, I used to work on the Windows NTFS driver.
oiaohm’s comments are roughly correct. NTFS collates indexes case insensitively, then applies case sensitive matching after case insensitive matching. This means that “a” comes before “B” in a directory, for example. And I’d wholeheartedly agree with his comment that “having to rewrite the indexs just because you have turned case sensitivity on or off is just pure insanity”; obviously if you can change a registry key and reboot, and your system does in fact boot successfully, then that tells you about how NTFS collation here works. You didn’t need to reformat or rewrite all the indexes, because the current (insensitive) indexes are correct for case sensitive mode. If other implementations do something different for case sensitive behavior, they won’t interop well with Windows. Every index collation needs to go through $Upcase.
There are two things I’d disagree with oiaohm on though.
First, I don’t think the system will be more efficient in case sensitive mode. The above behavior implies it will be somewhat less efficient, because every lookup is always insensitive first (since the trees are collated insensitively), but if a case sensitive lookup is requested there’s additional work to look through the case insensitive matches for a case sensitive match. In case insensitive mode, any match is a valid match, so there’s no need to perform additional case sensitive compares.
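Here is how I would model the two-step lookup described above in Python. This is a toy sketch, not NTFS code: the Index class is invented, str.upper() stands in for the volume's $Upcase table, and a sorted list with bisect stands in for the on-disk B-tree.

```python
from bisect import bisect_left, bisect_right

class Index:
    """Toy directory index collated by upcased name."""

    def __init__(self, names):
        # Collate by upcased name; names equal ignoring case end up adjacent.
        self.names = sorted(names, key=lambda n: (n.upper(), n))
        self._keys = [n.upper() for n in self.names]

    def lookup(self, name, case_sensitive=False):
        key = name.upper()
        lo = bisect_left(self._keys, key)     # step 1: always insensitive
        hi = bisect_right(self._keys, key)
        candidates = self.names[lo:hi]
        if not case_sensitive:
            # Any insensitive match is a valid match.
            return candidates[0] if candidates else None
        # Step 2: extra exact compare, needed only for case-sensitive opens.
        return name if name in candidates else None

index = Index(["B", "a", "hello.txt", "Hello.txt"])
print(index.names)
print(index.lookup("HELLO.TXT"))
print(index.lookup("hello.txt", case_sensitive=True))
```

In this model the case-sensitive path does strictly more work than the insensitive one, and the collation order never depends on which mode a given open asks for.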
Second, the behavior of obcaseinsensitive is a poorly understood mess. Pre-XP, the NT API is case sensitive by default, but individual opens request case insensitive behavior via OBJ_CASE_INSENSITIVE. Win32 generally requests insensitive behavior, unless FILE_FLAG_POSIX_SEMANTICS is specified, where it requests case sensitive behavior. obcaseinsensitive, added in XP, overrides the NT API and forces _all_ opens to be case insensitive. So changing the registry key doesn’t make the system case sensitive; it makes NT callers using default options case sensitive and it allows Win32 applications to request case sensitive semantics. After changing the key, most Win32 applications will be case insensitive, and the behavior of NT callers is as good as random.
Connecting these back to the earlier point though, note that case sensitive behavior is a per-open request. That also tells you that directory collation order isn’t going to change as a result of case sensitivity.
malxau,
Yes I agree that the filename index is always collated with case insensitivity. The thing about NTFS is that it is a super-set of both case sensitive and case insensitive file systems. Can we all agree on that?
One of the consequences of this is that unlike purely case sensitive file systems like ext3, the physical organization of NTFS indexes on disk is dependent upon unicode international case mappings. Different versions of windows have updated this mapping over time to accommodate unicode additions. This is NOT a property of case sensitive file systems. Ext3 doesn’t even consider what characters mean according to unicode, it’s just bytes. The NTFS indexes on the other hand have a hard dependency on the meaning of letters from the unicode standard. Future letter code points would require updates to the mapping used by NTFS, Agreed?
This is why I am not going to agree with oiaohm that NTFS is a case sensitive file system in the same sense that ext3 or other file systems are. NTFS is a hybrid whose disk structure depends on letter cases.
Edit:
I never had access to the windows source code like you, so I’d be curious if you have any opinions on how the open source NTFS driver got things right or wrong? I’m guessing all the open source code came about through reverse engineering.
Edited 2018-05-31 21:55 UTC
I have the opposite problem, I don’t want to look at this particular piece of open source code, which makes comparison difficult. What I’m hearing though, which your earlier posts appeared to imply, is that it’s not applying $Upcase from the volume at all and is assuming the current system Unicode table matches the collation order on the volume. This would be wrong/dangerous, but without carefully looking at the code, I’m not certain that’s what’s happening.
malxau,
The fuse NTFS implementation also loads the $upcase map on existing volumes, which makes sense for data integrity, although it means we have to reformat NTFS volumes to incorporate any unicode updates.
So it won’t “contaminate” you?
This is not a unique feature of case insensitive file systems. You do find case sensitive file systems ordering their look-ups based on many different rules.
There is a downside to this, and the downside explains why it’s not a popular feature. When Unicode mappings get updated or ordering rules change, the data on disc can now be wrong. Please note that FAT, which is a pure case insensitive file system, does not do ordering. Ordering is a look-up optimisation.
Most of the Unix/Linux/BSD file systems don’t bother with look-up optimisations that can change over time, due to the trouble they can bring. Please note I said most; you do see rare ones that have done ordering based on Unicode and other things.
There was an attempt to add Unicode ordering and Unicode case insensitivity to XFS; this was dropped due to the multiple levels of breakage.
As people have stated, depending on what language you use, whether a letter in Unicode is uppercase or lowercase does in fact change. If you are writing a file system to have a constant disc format between different users with different languages, you cannot do case insensitivity with Unicode.
The upper/lower case letter categories of the Unicode standard are only a rough guide. Yes, in the Unicode standard it is mostly a matter of whether this is a large written char (uppercase) or a small written char (lowercase), not whether that makes any sense for the language the person is using. So when you have a language where the small written char has a different meaning to the large written char, the Unicode upper/lower case screws you over.
Windows NTFS usage does not magically fix the breakage.
https://www.fileformat.info/info/unicode/category/Ll/list.htm
This is the lower case list of unicode. Have a good look at how many times you have a letter “a” with a different unicode number. Yes even case folding upper and lower you are still left with a headache and a half.
Lot of the problem starts with the introduction of the printing press and the standard char-sets. This resulted in upper chars being abused as lower in particular languages mostly to avoid the book production house requiring to make a unique set of letters for different languages. So a historic short cut comes forwards today.
oiaohm,
NTFS clearly is not a pure case sensitive file system nor is it a pure case insensitive file system. It is a hybrid built around unicode case mappings with properties of both. If you want to disagree over semantics then so be it, let’s agree to disagree.
Edited 2018-06-01 02:15 UTC
There is an older undocumented flag, present from the first version of Windows NT through 2000, to force case insensitivity off. It is still an on/off flag. From XP on it is provided as standard, in a different location in the registry just to be fun. Yes, it was a Win32 subsystem flag.
It is in fact required, because there is a nice little issue with case insensitivity: the upper and lower case mapping was based on the locale setting and not written to early NTFS, so if you had a drive from a Windows install with a different country/locale setting, you could have problems seeing the files. The workaround for this on NT was always to turn everything case sensitive. In XP the workaround was formalised. But I do agree that what happens with the Windows implementation of case sensitivity and case insensitivity can be highly random at times.
You do see overhead on NTFS when applications are stupid and don’t keep their case constant. Yes, case insensitivity is on by default, but the fast path is always the case sensitive look-up. If an application has its file names wrong, it will cost you performance in case insensitive mode and fail when you are in case sensitive mode. This is why people developing applications should really check in case sensitive mode; if their program bites it, there is a problem.
I don’t believe this is true and this thread has gone into a lot of detail as to why. A file called “a” is collated before “B”. If an open for a file on disk that is called “a” arrives as “a”, it still needs to be upcased to “A” to locate the file in the index. The fact that case matches won’t make the index lookup any faster, because the index lookup isn’t based on the original case of the file. The only time a case sensitive compare needs to occur is to open a file in case sensitive mode – ie., this is purely a performance tax on case sensitive operations. I’d really love to see a benchmark that shows case sensitive being faster, because all the code I’ve seen will not behave that way.
I could believe there is other code in the universe that has such a “fast path” (eg. a case sensitive hash table based cache) which causes case mismatch to trigger a slow path, but that is not code in NTFS.
That’s why I love discussions like this on osnews… I felt like I knew a bit about how NTFS worked, only to find out my knowledge only scratched the surface.
Thanks for all the info everyone.
I am possibly quite ignorant about this, but I wasn’t suggesting that the filesystem shouldn’t be case sensitive “behind the scenes”. Only that the user shouldn’t have to deal with case sensitivity in the filesystem that they are exposed to. I would add that even programmers should not have to deal with case sensitive filesystems. Therefore, having a file called “Names.txt” in a folder should preclude having a file called “names.TXT” in the same folder – the OS should prevent that. Therefore the uniqueness rules would be the same for all users. I think users can handle the OS telling them that they can’t use that filename because there is already another file with the same (or a very similar) name.
The problem people keep covering is that different locales have different rules for which glyphs form lowercase-uppercase pairs, and, among other things, different users on the same system may have selected different locales.
The most famous example being that, in most locales, i and I are a lowercase-uppercase pair but, in Turkish, they aren’t because Turkish adds a dotless “i” and a dotted “I”.
Imagine if a Turkish person had “i.txt” and “I.txt” and they were allowed, because they’re not lowercase/uppercase variations on each other in Turkish, then you log in with another locale.
…or do you think Turkish people should be confusingly forced to obey another language’s case-folding rules?
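A minimal Python sketch of the Turkish example, under loud assumptions: fold_default and fold_turkic are hypothetical helpers, the first using Python's locale-independent casefold() and the second adding only the two Turkic special cases (I pairs with dotless ı, dotted İ pairs with i). Real locale-aware folding has more rules than this.

```python
def fold_default(name: str) -> str:
    """Locale-independent Unicode case folding."""
    return name.casefold()

# Assumed Turkic rule: "I" lowers to dotless U+0131, dotted U+0130 lowers to "i".
_TURKIC = str.maketrans({"I": "\u0131", "\u0130": "i"})

def fold_turkic(name: str) -> str:
    """Simplified Turkic folding: apply the two special cases, then fold."""
    return name.translate(_TURKIC).casefold()

a, b = "i.txt", "I.txt"
collide_default = fold_default(a) == fold_default(b)
collide_turkic = fold_turkic(a) == fold_turkic(b)
print(collide_default, collide_turkic)
```

Under the default rule the two names are the same file; under the Turkic rule they are two files. A shared file system has to pick one answer for everybody.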
It’s not about Turkish people being forced to obey another language’s case folding rules. It’s about a computer making certain things unambiguous for the user. Again, I am not an expert in this, but I can’t see why you couldn’t have glyph replacements for a Turkish locale so that the dotted i is the i we know and love in the “west”, while the dotless i is a different letter completely. However, in that case, the dotted i is never capitalised to the dotless I and the dotless i is never “lower cased” to the dotted i.
But then how do you implement your case-insensitive filesystem semantics if two different users of the same filesystem are using locales with contradictory interpretations of characters in filenames?
https://superuser.com/questions/266110/how-do-you-make-windows-7-ful…
The reality is that all versions of Windows from the NT line using NTFS have had the ability to turn case sensitivity on by turning case insensitivity off.
It is fun with Wine to see how many Windows applications were built and only tested with case insensitivity on.
Yes, the fun problem is that not all Windows programs are able to work on case sensitive file systems, due to programmer coding issues. Some of these can be nice never-ending loops like the following:
1) Create file “abc”
2) Check for file existence but using “ABC” file does not exist
3) Since “file does not exist” was returned, attempt to create file “abc”, which fails with an error because “abc” exists. Go to 2.
Yes, there are Windows programs out there that are that stupid. Of course they should have failed quality control, if quality control was done with case insensitivity off. Maybe this change will get all applications tested with case insensitivity off.
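The loop in those three steps can be replayed in a few lines of Python. This is a simulation only: the directory is a plain set, run and the two existence checks are made-up names, and the loop is capped so it terminates.

```python
def run(exists, max_attempts=3):
    """Replay the three steps with a pluggable existence check."""
    files = {"abc"}                          # step 1: create "abc"
    for attempt in range(1, max_attempts + 1):
        if exists(files, "ABC"):             # step 2: check using the wrong case
            return f"found on attempt {attempt}"
        # Step 3: "not found", so the program tries to create "abc" again,
        # which fails because it already exists, then goes back to step 2.
    return "gave up (would loop forever)"

def sensitive(files, name):
    return name in files

def insensitive(files, name):
    return name.casefold() in {f.casefold() for f in files}

print(run(insensitive))
print(run(sensitive))
```

The bug is invisible as long as the existence check is case insensitive, which is why it only shows up when someone tests with case sensitivity on.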
It isn’t really fair to blame Windows programmers.
While NTFS is case-sensitive, Win32 is not. If you want case-sensitivity in your file handling, you have to step outside of Win32 and deal with NT system calls, which means your software wouldn’t be guaranteed to be portable across Windows versions.
While this isn’t really an issue now since everything is NT, this wasn’t the case when Win9x was around, and might not always be the case.
Although prior to Win95 there was no such thing as a lowercase filename: all filenames were entirely in uppercase. So rather than introduce sensible filename support, they did a kludge, which has resulted in further problems down the line.
No, long case-preserving filenames in Windows existed before 9x – specifically, Windows NT 3.1
And, what problems has it caused, other than “It isn’t the same as other OSs?” Because, while not being the same is an inconvenience, it isn’t the same as being a problem, at least a problem that falls squarely on Microsoft.
Inconvenience IS a problem.
Apparently, you have your own definition of a “problem”?
What’s inconvenient for one person is convenient for another.
I want to know about any actual problems other than “It isn’t the way my preferred system does things.”
As in, what actual problems exist by the nature of being case-insensitive that aren’t merely “It isn’t case sensitive”
I find case sensitivity an inconvenience. From my perspective, the problem exists in Unix, and the Windows way of doing things solves that problem.
By that logic, random OS crashes are merely an inconvenience. You know, they may be an inconvenience for you, but for me they provide a nice break to go make some coffee.
I have always assumed a case-sensitive filesystem to be a dumb thing from a user’s perspective. As a user, one will always get annoyed by case-sensitivity in file/folder names. Maybe it’s important for programmers, but as an end-user I feel it’s stupid to assume two file names are different just because of case even though they contain exactly the same string of letters. Does the meaning of the word change if you capitalize the first letter? No? Does it change if you write it all in CAPS? No? Then why the hell would you think otherwise in filenames?
Let’s take 3 cases of folder name:
1. “My Vacation Pictures”
2. “my vacation pictures”
3. “My vacation pictures”
Who could ever think these should be 3 separate folders??? That’s just… Dumb.
Edited 2018-05-30 05:48 UTC
It’s all about freedom. Some are used to capitalization, hence more comfortable with case sensitivity.
But they cannot cope with spaces in file/folder names and replace them with whatever else (dots, underscores, etc.). It’s like the endianness war.
I don’t care at all about case in file names; as you put it, WTF is the same as wtf. The meaning is not in the filename, it’s in its content.
Again, only programmers/admins/geeks care about the things you mentioned. I am confident that the majority of regular consumers don’t give a rat’s ass about case sensitivity and would be very baffled if Windows (or Mac) FS became case-sensitive by default. It’s just counterintuitive and you need to learn/be conditioned to be OK with it.
Actually, that’s because shell script has such a screwy approach to quoting (especially the “split apart by default” handling of variables), some other Unixy things inherited that mistake via the system(3) function for executing subprocesses via strings rather than arrays, and, if you don’t account for that, you find mysterious errors in the darndest places.
(eg. My mother uses LyX as a “looks great by default” way to typeset books. She was getting a mysterious error message. On a hunch, I tried replacing the spaces in the filename with underscores. Sure enough, somewhere deep in the maze of LaTeX scripts and config files, there was a spot that couldn’t handle filenames with spaces.)
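A small Python sketch of the string-versus-array difference. `typeset-tool` is a made-up command name and nothing is executed; `shlex.split` is used only as a stand-in for the POSIX shell’s word splitting.

```python
import shlex

filename = "my book.lyx"

# String-style command, as handed to system(3): the shell splits on the space,
# so the program sees two arguments instead of one file name.
unquoted = shlex.split("typeset-tool " + filename)

# Quoting the name before building the string keeps it in one piece.
quoted = shlex.split("typeset-tool " + shlex.quote(filename))

# Array-style command (the list form accepted by subprocess.run):
# no splitting happens at all, so no quoting is needed.
as_list = ["typeset-tool", filename]

if __name__ == "__main__":
    print(unquoted)
    print(quoted)
    print(as_list)
```

Any script in the chain that builds the string form and forgets the quoting step produces the kind of mysterious failure described above.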
Edited 2018-05-30 09:03 UTC
Well, system(3) is inherited from C. The UNIXy way is with posix_spawn.
ARE YOU TELLING ME THAT NO ADDITIONAL INFORMATION CAN EVER BE RELAYED BY CHANGING THE CASE? SO THIS COMMENT DOESN’T SEEM LIKE I’M SHOUTING AT FULL VOLUME INTO YOUR EAR DRUMS? STRANGE. OK. HAVE A NICE DAY, AND DON’T FORGET TO GET YOUR PET SPAYED OR NEUTERED.
Case insensitive doesn’t mean you can’t use upper case. It means that if that is a filename, then you can’t also have a file with the name “are you telling me that no additional information can ever be relayed by changing the case? so this comment doesn’t seem like i’m shouting at full volume into your ear drums? strange. ok. have a nice day, and don’t forget to get your pet spayed or neutered” in the same folder.
So now you can have a single filesystem where some locations are case sensitive and some are not… And no doubt lots of software which is not case sensitive and will break when it encounters a case sensitive location.
Poor design, resulting in massive complexity which leads to bugs and security problems.
Every Windows version ever.
But Windows deals pretty well with spaces in file/folder names while Linux doesn’t, because reasons (see the comment a bit above).
That is also true… I am always laughing when I run into another person with this stigma of “never use spaces in file names”. I always use spaces in file names and never had a problem in my life.
Linux itself deals well with spaces in filenames, as does most of the infrastructure.
Heck, the vast majority of my stuff deals perfectly well when some kind of encoding or copy-paste goof results in filenames containing newlines, bytestrings that aren’t valid UTF-8, etc.
It’s just that, as with case-sensitivity on Windows, there are some UNIX/Linux applications which rely on old APIs and aren’t properly tested with spaces.
Edited 2018-05-30 14:55 UTC
I tend to manage my system such that it doesn’t matter. Here are some of my general rules:
1. Don’t use spaces in file names.
2. Don’t have files that differ only by the case of the name.
3. Have a standard for naming things, just like you do in programming (don’t you??). For instance, when do you capitalize the first letter, do you use camel case, or underscores to separate names, etc. Keep it consistent.
4. Avoid anything but letters, numbers and underscores.
5. When putting a date stamp in a file name, always put yyyy_mm_dd so the files sort in a natural order.
I’ve followed these rules since my early days of computing just so I could avoid these kinds of issues. It has served me well.
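Rule 5 is easy to check with a few lines of Python. The `report` prefix and the dates are made up for the example; the point is only that a plain string sort of year-first names is also a chronological sort, which is not true of day-first names.

```python
from datetime import date


def stamped(prefix: str, day: date) -> str:
    # Year first, then month, then day, all zero-padded.
    return f"{prefix}_{day:%Y_%m_%d}.txt"


days = [date(2018, 5, 30), date(2017, 12, 31), date(2018, 1, 15)]

# Plain string sort of year-first names: oldest file comes first.
ymd_names = sorted(stamped("report", d) for d in days)

# Same dates written day-first: the string sort is no longer chronological.
dmy_names = sorted(f"report_{d:%d_%m_%Y}.txt" for d in days)

if __name__ == "__main__":
    print(ymd_names)
    print(dmy_names)
```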
This one is approaching the stage of archetypal myth where everyone knows it’s “wrong” but no one knows why any more. Ask any regular non-techie computer user why they are avoiding spaces and/or national letters in filenames and the only thing they will be able to tell you is that using spaces in filenames “will make bad things happen”.
I know why. I want to make my life easier. I know I can use spaces on all major OS’s. But I am a programmer. I have specifically had issues with scripts and code that didn’t always quote the spaces correctly. So don’t say I don’t know why. I do. Just say it doesn’t matter to you.
Maybe you should read my post again. Especially the sentence where I write “non-techie”.
Not sure about the world you are living in, but here we consider a programmer to be a pretty “techie” type.
Good point.
The biggest thing to understand here is that filesystems were originally case insensitive because the operating systems they were designed for were case insensitive. In other words, the VFS layer (or in some older cases, the input layer) itself did case folding, so the filesystem didn’t need to store letters in differing cases.
Proper case folding is hard, however, just like Unicode normalization, and is very dependent on the language being used. The only reason those older systems had no issues is that they weren’t localized: they were English only, so they could get away with just ignoring bit 6 (mask 0x20) if bit 7 (mask 0x40) is set and the low 5 bits are between 1 and 26, counting bits from 1. In modern systems, though, it leads to some rather complicated processing on every system call interacting with the filesystem, and the only practical reason that it’s still kept around is for compatibility.
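That bit trick is small enough to write out. A minimal Python sketch, assuming 7-bit ASCII input and numbering bits from 1 so that bit 6 is the mask 0x20 and bit 7 is the mask 0x40; the function names are made up for the example.

```python
def ascii_fold(byte: int) -> int:
    """Fold one 7-bit ASCII code to upper case using only bit tests."""
    low5 = byte & 0x1F
    # Letters live in the 0x40 block with the low 5 bits in 1..26;
    # 0x20 is the only bit that differs between "a" and "A".
    if byte & 0x40 and 1 <= low5 <= 26:
        return byte & ~0x20
    return byte


def ascii_fold_name(name: str) -> str:
    # Raises UnicodeEncodeError for anything outside ASCII, which is
    # exactly the limitation the old English-only systems had.
    folded = bytes(ascii_fold(b) for b in name.encode("ascii"))
    return folded.decode("ascii")


if __name__ == "__main__":
    print(ascii_fold_name("ReadMe_1.txt"))
```

Punctuation such as `[`, `{`, `_` and the backtick sits in the same block but has low 5 bits of 0 or above 26, so the range check leaves it alone.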
Of course, if you want to go with ‘normalizing’ input, why not just store a language tag with every file and directory name, and accept all the different phonetizations and transliterations too? I mean, I’m sure it would be useful to be able to treat the Japanese word 富士山 the same as any of: ふじさん, Fujisan, Huzisan, and Huzisan (it’s the same in Nihon-shiki and Kunrei-shiki romanizations), and that’s not even including wāpuro rōmaji, IPA phonetization, and non-romanized transliterations (using, for example, Cyrillic or Hangul).
The majority of file systems are in fact case sensitive on disc.
https://lwn.net/Articles/754508/
There are a few rare ones, like XFS created with “case insensitive” enabled, where the disc format itself is modified.
https://en.wikipedia.org/wiki/Comparison_of_file_systems
Do note the “Case sensitivity” and “Case preservation” columns.
Most file systems that are case insensitive on disc also don’t have case preservation, so all file names have to be stored in a single case. For example, every file name could be upper case, and that would be stated in the file system specification; this is the case for the ODS-2 file system and others. This keeps case insensitive processing simple: create a file, normalise the file name, done.
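The difference between the “Case sensitivity” and “Case preservation” properties can be sketched with a toy directory in Python. The `Directory` class is hypothetical, and plain `str.upper()` stands in for whatever normalisation a real file system specifies.

```python
class Directory:
    """Toy directory model; not any real file system's on-disc format."""

    def __init__(self, case_sensitive: bool, case_preserving: bool):
        self.case_sensitive = case_sensitive
        self.case_preserving = case_preserving
        self._entries = {}  # lookup key -> name as stored

    def _key(self, name: str) -> str:
        # Assumption: upper-casing is the whole normalisation step.
        return name if self.case_sensitive else name.upper()

    def create(self, name: str) -> None:
        key = self._key(name)
        if key in self._entries:
            raise FileExistsError(name)
        # Non-preserving systems store only the normalised form.
        self._entries[key] = name if self.case_preserving else key

    def listing(self) -> list:
        return sorted(self._entries.values())


if __name__ == "__main__":
    dos_like = Directory(case_sensitive=False, case_preserving=False)
    dos_like.create("readme.txt")
    print(dos_like.listing())
```

A case insensitive but case preserving directory keeps the name as typed and still refuses a second name that differs only by case, while the non-preserving variant needs nothing beyond the normalise-on-create step.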
Yes, there are true case insensitive file systems out there. The reality is that most operating systems started out case sensitive. Of course, the biggest exception was MS-DOS with the FAT file system, which was case insensitive and lacked case preservation.
But it’s fewer than 30 operating systems in total that had a case insensitive file system as their native file system, with most of them being from Microsoft or clones of Microsoft.
So it turns out that a file system being case insensitive is a rarity in operating system history.
It’s a rarity though simply because the OS did case folding. For almost all applications in existence other than some really low level forensics and data recovery stuff, what matters is whether the driver for that particular filesystem does case-folding (or the VFS layer does it), not whether or not the on-disk format is case insensitive or not (strictly speaking, the on-disk format is never case insensitive unless an encoding is used that doesn’t provide case differences, only the driver is).
There are two other things to keep in mind though:
* You’re not accounting for really old legacy stuff that used the 1963 version of ASCII, which did not have lower case letters, as well as other systems which used similarly case insensitive and non-case-preserving encodings. Handling of those encodings is why DNS names and the scheme and host parts of URIs are case insensitive (though most people don’t know this because almost all browsers in existence case-fold those parts to lowercase), and the same goes for almost all other case-insensitive protocols.
* It doesn’t really matter if only 30 or so OSes were case insensitive, because the ones that were case insensitive were the important ones, and therefore they had the most influence on modern systems. In particular, the original Mac OS, MS-DOS and derivatives are all case insensitive, and they have largely shaped modern client system development. The only case sensitive system that had a huge impact on modern system designs was UNIX (I mean, VMS could be counted, but it’s functionally dead and most of its modern influence was on Windows).
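As a quick check of which parts of a URI actually get case-folded, Python’s standard `urllib.parse` can be used. The URL below is made up for the example; the sketch shows only what this one library normalises, not what every browser or server does.

```python
from urllib.parse import urlsplit

parts = urlsplit("HTTP://WWW.Example.COM/Docs/ReadMe.TXT")

# The scheme and the host name are folded to lower case...
scheme = parts.scheme
host = parts.hostname

# ...while the path is passed through untouched, since the server
# behind it may well sit on a case sensitive file system.
path = parts.path

if __name__ == "__main__":
    print(scheme, host, path)
```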