Researchers have produced a collision in iOS’s built-in hash function, raising new concerns about the integrity of Apple’s CSAM-scanning system. The flaw affects the hashing system, called NeuralHash, which allows Apple to check for exact matches of known child-abuse imagery without possessing any of the images or gleaning any information about non-matching pictures.
On Tuesday, a GitHub user called Asuhariet Ygvar posted code for a reconstructed Python version of NeuralHash, which he claimed to have reverse-engineered from previous versions of iOS. The GitHub post also includes instructions on how to extract the NeuralMatch files from a current macOS or iOS build.[…]
Once the code was public, more significant attacks were quickly discovered. A user called Cory Cornelius produced a collision in the algorithm: two images that generate the same hash. If the findings hold up, it will be a significant failure in the cryptography underlying Apple’s new system.
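To make "collision" concrete, here is a minimal sketch of a toy perceptual hash. It is not NeuralHash, which is built on a neural network. It is a simple average-brightness hash, and `average_hash`, `image_a` and `image_b` are hypothetical names invented for this illustration. The point is only that two different images can reduce to the same short fingerprint.

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        # Shift in one bit per pixel, most significant bit first.
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


# Two different 2x2 grayscale "images" (made-up values, 0-255).
image_a = [[10, 200], [220, 30]]
image_b = [[0, 180], [255, 60]]

hash_a = average_hash(image_a)
hash_b = average_hash(image_b)

# The pixel data differs, yet both reduce to the same fingerprint: a collision.
print(image_a != image_b, hash_a == hash_b)
```

Perceptual hashes are designed to tolerate small changes to an image, so some collisions are expected by design. The concern in the article is that they can be produced on purpose.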
American tech media and bloggers have been shoving the valid concerns aside ever since Apple announced this new backdoor into iOS, and it’s barely been a week and we already see major tentpoles come crashing down. I try not to swear on OSNews, but there’s no other way to describe this than as a giant clusterfuck of epic proportions.
Yes, there is very little coverage, but expecting more from the media is probably not realistic.
I expect most tech journalists to be conflicted. They of course have an affinity for our common interests. However, if Apple says this is a good thing and a few random people say it is not, they will probably choose the safer option.
Even if individuals have good intentions, there is an implicit bias toward their favorite companies.
In game journalism, this is perceived as being “bribed”: https://www.reddit.com/r/Games/comments/sdp2p/are_game_reviewers_actually_bribed/. It is not bribery, but there is an “access” issue. I remember seeing some outlets blacklisted for negative reviews. The same could be said of any outlet covering a tech firm. A “Microsoft Insider”, for example, is unlikely to heavily criticize Microsoft and remain an insider.
Again, this probably happens at a subconscious level and is not done with bad intent.
It’s something I’ve always wondered.
Why do we place such faith in standard hash functions for uniqueness?
The first time I really thought about it was when I was first shown how Git works. It basically says that two files are equal if their hashes are the same. Yes, we all operate with full knowledge that hash collisions in real life are rare. Yet they still happen. It puzzled me why the identity of a file didn’t include some actual data about the file (file size, maybe name…). Apparently, it was always thought that the bits needed to store that data would be better spent making the hash bigger.
To me, making something like file size a part of the hash cuts off entire vectors of attack: for example, you can’t remove or add data to a file to make the hashes the same. Storing the filename helps cut off any accidental collisions that might screw up the repo, at the cost that file renames might be stored as a ‘new’ object.
In this case, it seems Apple has at least designed it to expect collisions. But the details of the secondary validation system are not really known, so who knows what it really does.
Like I said, I know relying on hashes for uniqueness is something we depend on in the modern age. There’s just a part of my brain that doesn’t want to accept it 100 percent. Linus would probably have a field day with me 🙂