One bad Apple

Thom Holwerda 2021-08-08 Apple 24 Comments

Dr. Neal Krawetz, one of the leading experts in the area of computer forensics research, digital photo analysis, and related topics, has penned a blog post in which he takes apart Apple’s recent announcement and the technology behind it.

He actually has a lot of experience with the very problem Apple is trying to deal with, since he is the creator of FotoForensics, and files CSAM reports to the National Center for Missing and Exploited Children (NCMEC) every day. In fact, he files more reports than Apple, and knows all the ins and outs of all the technologies involved – including reverse-engineering Microsoft’s PhotoDNA, the perceptual hash algorithm NCMEC and Apple are using.

The reason he had to reverse-engineer PhotoDNA is that NCMEC refused to countersign the NDAs they wanted Krawetz to sign, and eventually stopped responding to his requests altogether. Krawetz is one of the more prolific reporters of CSAM (number 40 out of 168 in total in 2020). According to him, PhotoDNA is not as sophisticated as Apple’s and Microsoft’s documentation and claims make it out to be.

Perhaps there is a reason that they don’t want really technical people looking at PhotoDNA. Microsoft says that the “PhotoDNA hash is not reversible”. That’s not true. PhotoDNA hashes can be projected into a 26×26 grayscale image that is only a little blurry. 26×26 is larger than most desktop icons; it’s enough detail to recognize people and objects. Reversing a PhotoDNA hash is no more complicated than solving a 26×26 Sudoku puzzle; a task well-suited for computers.

The other major component of Apple’s system, an AI perceptual hash called a NeuralHash, is problematic too. The experts Apple cites have zero background in privacy or law, and while Apple’s whitepaper is “overly technical”, it “doesn’t give enough information for someone to confirm the implementation”.

Furthermore, Krawetz “calls bullshit” on Apple’s claim that there is a 1 in 1 trillion error rate. After a detailed analysis of the numbers involved, he concludes:

What is the real error rate? We don’t know. Apple doesn’t seem to know. And since they don’t know, they appear to have just thrown out a really big number. As far as I can tell, Apple’s claim of “1 in 1 trillion” is a baseless estimate. In this regard, Apple has provided misleading support for their algorithm and misleading accuracy rates.

Krawetz also takes aim at the step where Apple manually reviews possible CP material by sending it from the device in question to Apple itself. After discussing this with his attorney, he concludes:

The laws related to CSAM are very explicit. 18 U.S. Code § 2252 states that knowingly transferring CSAM material is a felony. (The only exception, in 2258A, is when it is reported to NCMEC.) In this case, Apple has a very strong reason to believe they are transferring CSAM material, and they are sending it to Apple — not NCMEC.

It does not matter that Apple will then check it and forward it to NCMEC. 18 U.S.C. § 2258A is specific: the data can only be sent to NCMEC. (With 2258A, it is illegal for a service provider to turn over CP photos to the police or the FBI; you can only send it to NCMEC. Then NCMEC will contact the police or FBI.) What Apple has detailed is the intentional distribution (to Apple), collection (at Apple), and access (viewing at Apple) of material that they strongly have reason to believe is CSAM. As it was explained to me by my attorney, that is a felony.

This whole thing looks, feels, and smells like a terribly designed system that is not only prone to errors, but also easily exploitable by people and governments with bad intentions. It also seems to be highly illegal, making one wonder why Apple would put this out in the first place. Krawetz hints at why Apple is building this system earlier in his article:

Apple’s devices rename pictures in a way that is very distinct. (Filename ballistics spots it really well.) Based on the number of reports that I’ve submitted to NCMEC, where the image appears to have touched Apple’s devices or services, I think that Apple has a very large CP/CSAM problem.

I think this might be the real reason Apple is building this system.

24 Comments

2021-08-08 10:48 pm

Alfman verbose=1
Perhaps there is a reason that they don’t want really technical people looking at PhotoDNA. Microsoft says that the “PhotoDNA hash is not reversible”. That’s not true. PhotoDNA hashes can be projected into a 26×26 grayscale image that is only a little blurry. 26×26 is larger than most desktop icons; it’s enough detail to recognize people and objects. Reversing a PhotoDNA hash is no more complicated than solving a 26×26 Sudoku puzzle; a task well-suited for computers.

This is also a major problem with biometric hashes. The fact that these kinds of applications need lossy matching by design significantly reduces their cryptographic strength, to the point where brute forcing hash inputs is completely feasible. Fuzzy hashing can be useful in many applications, but I am in agreement with this article that the security aspects tend to be overstated. Some vendors have a financial interest in having us overlook the limitations.

What is the real error rate? We don’t know. Apple doesn’t seem to know. And since they don’t know, they appear to have just thrown out a really big number. As far as I can tell, Apple’s claim of “1 in 1 trillion” is a baseless estimate. In this regard, Apple has provided misleading support for their algorithm and misleading accuracy rates.

Who knows what math Apple used, but for the sake of argument, even if we assume the technology does work as advertised, it creates a slippery slope where governments can start to exploit the technology for data mining, tracking dissidents, and curbing freedom of speech. I find the potential for abuse extremely worrisome. Moreover, with government gag orders we won’t necessarily know when our privacy has been invaded and to what end. Remember that the companies involved in the NSA PRISM program leaked by Edward Snowden all initially denied their involvement. And the government gave AT&T immunity for warrantless wiretapping before that.

2021-08-09 7:15 am

oiaohm
The problem here is that Apple’s answer of 1 in a trillion is not a useful answer.

It does not tell you the range of error rates across the generated hashes. You could get a 1 in a trillion average error rate while some hashes collide far more often than that and others far less often. It still averages out at 1 in a trillion.

Also, 1 in a trillion does not sound that great when over 1 trillion unique photos were taken in 2016, and more than that have been taken in each year since. At 1 in a trillion, you should expect about one hash collision among just the photos taken in any single year since 2016, before counting computer-generated images and other sources. Every year since then basically adds another collision. And every hash generated with that same error rate comes with collisions of its own.
https://blog.mylio.com/how-many-photos-will-be-taken-in-2021-stats/
Yes, even in 2020, with all the lockdowns, we still took over 1.12 trillion unique digital photos. If you divide that by a population of 7.9 billion, the average is 151 each. If you restrict that to developed countries (1.3 billion), it is 923 pictures each.
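
As a rough check of the per-person arithmetic above, here is a minimal Python sketch. The photo totals and population figures are the ones quoted in the comment; the helper name `photos_per_person` is made up for illustration. The quoted 151 and 923 appear to correspond to a total of about 1.2 trillion photos rather than 1.12 trillion.

```python
def photos_per_person(total_photos: float, population: float) -> float:
    """Average number of photos per person (hypothetical helper)."""
    return total_photos / population


WORLD = 7.9e9      # world population quoted in the comment
DEVELOPED = 1.3e9  # developed-country population quoted in the comment

# Compare the 1.12 trillion total with the 1.2 trillion that the
# quoted per-person figures seem to be based on.
for total in (1.12e12, 1.2e12):
    print(
        f"{total:.2e} photos: "
        f"{photos_per_person(total, WORLD):.0f} per person worldwide, "
        f"{photos_per_person(total, DEVELOPED):.0f} per person in developed countries"
    )
```

Either way the order of magnitude is the same: somewhere between two and three photos per day for every person in the developed world.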

I know that in the last 12 months I did not take 923 pictures. The human population has some serious shutterbugs. That is two and a half pictures a day to keep up with that rate.

Please also note that a lot of those trillion photos per year will be taken on smart phones of people taking pictures of their own children.

On its face, 1 in a trillion sounds like a good number. But we are talking about photos, and that figure of over a trillion photos per year is without screenshots, pictures extracted from video, new drawings, memes….

The reality is that we are under a decade away from 2 trillion photos a year taken by digital cameras.
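
The point about volume can be made concrete with a small Python sketch. It assumes, as the comment does, that “1 in a trillion” is an independent per-photo false match rate; that is an assumption made for illustration, not a documented property of Apple’s system, and the function name is made up. Under that assumption, a trillion photos give an expected one false match and roughly a 63% chance of at least one.

```python
import math


def prob_at_least_one(p: float, n: float) -> float:
    """P(at least one false match) in n independent trials with rate p.

    Computes 1 - (1 - p)**n via log1p/expm1 so that a tiny p is not
    lost to floating-point rounding.
    """
    return -math.expm1(n * math.log1p(-p))


P = 1e-12  # assumed per-photo false match rate

for n in (1e12, 1.12e12, 2e12):
    print(
        f"{n:.2e} photos: expected {P * n:.2f} false matches, "
        f"P(at least one) = {prob_at_least_one(P, n):.3f}"
    )
```

The takeaway is that a per-item error rate only means something once it is multiplied by the number of items it is applied to.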

2021-08-09 11:47 am

Anonymous
In the days of film most people didn’t have cameras or only took pictures for special events like holidays. Typically they would have 1-3 rolls of film, each containing anywhere from a dozen to 36 frames. Relative to income it was expensive. People forget that until the benefits of modern technology, including in agriculture, took off, even in the 1970s or 1980s food would take up a third of most people’s available income. Today a person can rattle off 100-300 images virtually cost free, which is around the level a pro photographer with a corporate client would shoot for a magazine photoshoot. Multiply that by how many people have a camera today (nearly everybody) and you can imagine the total number of photographs taken per year has risen by multiple orders of magnitude. Not everybody is taking huge numbers of photos at a time, but the number of photos has still risen by orders of magnitude.

Also “back in the day” not everyone who could afford a camera could afford a movie camera, whether 8mm or 16mm and their variants. Forget 32mm let alone 64mm unless you were a professional in the biz. Unlike digital, even with some daft software-imposed limits, you only had a limited amount of reel time and could use it only once.

Film also has a narrow exposure range. You have to have different film for different light conditions, whether exposure or colour temperature. Do you have the right film on you? A flash bulb or flashgun? Gel filters? Lights?

Most people either didn’t know or didn’t care about any of this outside of enthusiasts and photography clubs or professionals. If they did they would typically hire a professional and even then it was for one off events like weddings. Today almost every digital camera once they hit 5 megapixels was good enough and now cameras with ISO in the range of tens of thousands just do it all for you. They won’t help you be a better artist but they are very forgiving. It’s almost impossible to take a bad picture.

A lot of those pictures will be forgotten about or lost or deleted, so the practical number of photos taken is effectively modest. In the real world it’s probably what an upper middle class person with plenty of disposable income would have created back in the day. That’s no longer just the one well-off person everyone knew. Today that’s everyone, so it’s still a huge rise in photographic output.

2021-08-09 12:27 pm

oiaohm
–Today almost every digital camera once they hit 5 megapixels was good enough and now cameras with ISO in the range of tens of thousands just do it all for you. They won’t help you be a better artist but they are very forgiving. It’s almost impossible to take a bad picture.–

I disagree that it is almost impossible to take a bad picture. A lot of the photo taking my mother does proves it is very possible to take a bad photo. But since she has at least ten 128 GB SD cards with her whenever she has her camera, taking 1000 pictures to get one good one is not a problem. Each SD card is something like 4,368 shots in raw, or the equal of about 128 rolls of 36-exposure film, which would be basically impractical to carry. Please note my mother likes shooting things like jet fly-bys at airshows; this is just for fun in retirement. It is scary when a person like this has been out for one day and has filled all the SD cards they had with them.

Ten SD cards fit in the space you would need to carry a single 36-exposure roll of old film.

I don’t take 923 photos a year, but my mother alone makes up for many people. No way am I giving her an iPhone any time soon.

Yes, a lot of the trillion photos per year get deleted; this is true. But a lot get uploaded to the cloud before being deleted.

The volume of photos created is a lot larger than many would presume. Having a dedicated device for photos may become the more recommended option, so that all those junk photos don’t end up scanned, lowering the risk of a false positive. A lot of people take a 128 GB smartphone on holiday and fill the storage; note that this is approximately 20,000 photos in JPEG, almost as many as five 128 GB SD cards of raw photos.

Digital cameras have turned some people into quite insane shutterbugs: because of how many photos they can take, they stop caring about the ratio of bad to good, and if they have to take 100-1000 photos to get one good one they will. With old film you could not, in most cases, take 1000 photos to get one, because the cost of the film and processing prevented it.

2021-08-09 3:17 pm

Anonymous
@oiaohm

Today almost every digital camera once they hit 5 megapixels was good enough and now cameras with ISO in the range of tens of thousands just do it all for you. They won’t help you be a better artist but they are very forgiving. It’s almost impossible to take a bad picture.

I disagree on almost impossible to take a bad picture. There is a lot of photo taking my mother does that proves its very possible to take a bad photo.

Please parse what I said properly. Artistic merit and technical merit are two separate things. While most photos are of dubious to no artistic merit, it is very difficult to fail with a modern camera. Set a modern camera to “auto” and the picture will be perfectly exposed. With a larger-sensor camera such as a DSLR you can be in appalling light conditions and the picture will still come out perfectly exposed.

A lot of “gear heads” cannot take a photo to save their lives. Women tend to be more motivated by collecting memories.

“Pre-visualisation” is key to taking many great pictures. That said even a “snap” can have merit.
2021-08-09 3:46 pm

oiaohm
— While most photos are of dubious to no artistic merit it is very difficult to fail with a modern camera.–

HollyB, I read it right. There are a lot of ways to take a ruined photo today. The sensor whites out because you pointed it too close to the sun. The image is mangled by rain or oil on the lens; think of eating greasy food and then using your phone to take a photo. Part of a hand covers something like 50 percent of the picture; people manage to do this with a mobile phone, normally while the phone is on its way to the floor, so it is commonly the last photo before a new screen. You get the back of the head of the person in front of you because they moved as you clicked. And the list goes on.

There is a long list of ways to fail with even the most basic camera. Yes, most of these come from being stupid as a camera operator. Think about people holding their mobile phone up in mid-air and clicking away; they are really using the pot-luck method.

With a modern camera of 5 megapixels or greater and a little due care, yes, I will give you that it is rare to take a bad photo. The reality is that a lot of people don’t use digital cameras that way; instead they want to shoot as fast as possible so they don’t miss what they want in the photo, which leads to the machine-gun method.

When I disagreed I was not talking about artistic photos; I was talking about photos in general. Over the last few years I have run classes teaching people to clean up their mobile phones and deal with SD cards from cameras. These are not for pro photographers; they are for people who don’t give a rat’s about artistic photos.

When they do clean up, there are also quite a few photos from mobile phones that I can only guess are of the inside of a pocket.

The best camera on earth will not make you an artist. The best camera on earth will not prevent you from using it stupidly and getting a completely garbage picture.

There is only so forgiving a camera can be.

Please note that when I said a bad photo I meant one that is totally unusable for what you wanted, as in the person you wanted in the photo is not in it at all, or you cannot see them because you messed up the lens. One problem my mother ran into was taking photos at a museum through glass: only about 1 in 40 photos was usable, because a slightly off angle produced reflected glare. The person who was there with my mother had a high-end smartphone and was having the same problem. The funny part was that the human eye, looking at the display, could not see anything wrong. Yes, the cameras are good, but high ISO does cause some interesting issues; there is such a thing as being too sensitive.

The reality is that there are lots and lots of ways to take a bad photo. Some will be down to the sensor; most will be the human having done something horrible to the camera or phone.
2021-08-09 4:35 pm

Anonymous
@oiaohm

I forgot how obtuse you can be… I am very very obviously talking about deliberate artistic merit of a picture and cameras technical abilities. I’m not talking about anything else.

You can pre-visualise or you can just shove camera in at random and it will take a near perfect picture of whatever you can see through the lens reflections and sticky fingers and all. I have no interest whatsoever in general conversations about some people’s bad habits and stupidities. I really don’t. That’s not a direction of conversation I want to go in so you’re on your own.
2021-08-09 6:34 pm

oiaohm
–You can pre-visualise or you can just shove camera in at random and it will take a near perfect picture of whatever you can see through the lens reflections and sticky fingers and all.–
https://www.imaging-resource.com/news/2015/03/03/should-you-clean-your-lens

HollyB, no: a dirty lens makes the picture look more and more like it is badly out of focus or lacking contrast, depending on whether it is grease (out of focus) or dirt and dust (lack of contrast). If you have a dirty-lens photo next to a clean-lens photo of the same subject, you will be deleting the dirty-lens photo. A dirty-lens picture looks wrong, to the point that you will be disappointed in the photo. Pre-visualisation makes no difference here: you take the picture, it looks out of focus or cloudy, you clean the lens and take the photo again. People who don’t know about this problem might take 5 to 10 photos before they rub the lens on their shirt, take another photo, and get a good one.

HollyB, watching people with mobile phones taking photos on security cameras is funny at times, because from their actions you know what the problem was, and in a lot of cases you know they have no clue what they did to finally get a good photo. Yes, they have brute-forced the problem. This is a side effect of the cheap price of digital photos. A person would learn about these issues with a film camera, because film is too expensive to waste 5 to 10 photos, and the long lag meant you would not know you had a problem until you had missed your chance to take the shot.

HollyB, this is a field I know well. It is really simple to take a bad photo: mistreat your camera’s optics. This way of taking a bad photo predates digital cameras. Digital cameras let people brute-force these problems, and brute-forcing has resulted in some people taking a lot more photos.

HollyB, it is really easy to take a bad photo. The quality of the sensor does not save you if your treatment of the optical part of the camera is subpar. The funny part these days is how many people brute-force it just by taking more photos.

HollyB, you are not alone; there are a lot of people who think their mobile phone camera is good enough and don’t notice that they take a photo, get annoyed, rub the back of the phone on their shirt or pants, and then retake the photo. Yes, they get into the habit of doing this.
2021-08-10 1:20 am

Anonymous
@oiaohm

HollyB watching people with mobiles phones taking photos on security cameras at times in funny because by their action you know what the problem was and you know that they have no clue what they have done to get a good photo in a lot of cases.

So your day job is minimum wage seat filler looking at security cameras?

HollyB this field I know well.

Oh here we go again. Just like you know operating systems very well, I suppose?

Sorry but as per discussions about operating systems I’m walking away from you and leaving you to your own devices. You’re too much brain damage to deal with.
2021-08-10 4:02 am

oiaohm
–So your day job is minimum wage seat filler looking at security cameras?–
No, there is more than one reason why you might be there. I am a security camera installer as one of my day jobs. It is funny how many people are not aware that when you do this you have normally deployed other cameras to look for people mapping out where the cameras are.

HollyB, people always presume that a person sitting at a security camera desk is lowly paid. That is only true after installation and calibration are done, and when repairs are not being performed. Most cases of people noticing the camera positions and wanting to photograph them happen in the deployment stage. At that stage you are normally making a list of possible future thieves for the lowly paid security people to watch for. People incorrectly presume that while cameras are being installed, no cameras are active covering the work area. These systems are also set up so the boss at home can look in on the security cameras, including the one in the security control room. The lowly paid sometimes wonder how they got fired for sleeping on the job, because the camera in the security room is not connected to their monitors.

Really think about it: taking pictures of security camera positions does not need a high-quality photo. Yet people’s mobile phone lenses are in such a mess that they have to make multiple attempts. A little bit of planning ahead would go a long way here.

My job does involve working with cameras. At times it involves watching people use cameras to take pictures of cameras while I am doing other things, like having a person walk past the camera to double-check coverage (installation and alignment). For about every 100 people mapping out cameras, only 1 will have planned ahead and have their gear in order; the other 99 will basically brute-force it, to the point that at more than one camera location they will have to take many shots and do a cleaning action that is like a super big neon sign.

Yes, more people have cameras, but the same number of people as before know how to use them properly to get usable results without the brute-force method.

I have run basic phone introduction courses for the elderly as well. Again, it is common to find the brute-force method.

2021-08-09 2:39 pm

Alfman verbose=1
oiaohm,

Also 1 in a trillion does not sound that great when there were over 1 trillion unique photos taken in 2016. Yes more than that were taken in years since. 1 in a trillion with photos is almost 100 percent sure that you do have at least 1 hash collision in just taken photos in a year since 2016 without the computer and other sources. Of course every year since 2018 you are basically adding another collision. Yes each hash generated for things with the same error rate results in it having collisions.

Without details, we can’t put Apple’s numbers into context. They’re not very meaningful. The only sure thing is that they have to make the compromise between false positives and false negatives that is inherent to these kinds of algorithms.

The other thing to consider is that attackers may try to intentionally generate collisions (as opposed to accidentally detecting collisions), which could increase the rate of false positives by several orders of magnitude.
2021-08-09 9:17 pm

mkone
I am not sure about how well the technology works and definitely conflicted about its necessity. But I do think a 1 in 1 trillion error rate would (should?) probably be acceptable, particularly if any positive matches would be subject to human review. It’s all well and good saying 1 in 1 trillion is too much, but you have much worse odds for pretty much anything else in life.

I don’t think technology needs to be perfect to be OK.

2021-08-10 12:02 am

oiaohm
mkone, that is the problem: most people don’t understand how to work out what error rate is acceptable.

Take Wintergatan’s Marble Machine X. It is only a machine for playing music; it is already getting to an error rate better than 1 in 80,000 with the marbles, and that is still not good enough for a musical instrument.

The problem here is that 1 in 1 trillion seems like a lot. It is the same as the Wintergatan developer thinking that 99% was good enough. One human on stage could not cope with a 99% success rate; even at 1 in 80,000, on a dark stage that might mean a human getting hurt. 1 in a million could still be questionable, as he is the only main human (the hit-by-a-bus problem). And this is just for a music machine.

You need to take the error rate and work out how many error events you are likely to have to deal with. The first problem is that 1 in a trillion is per hash. We know there are over a billion known photos in the CSAM database; remember that this database has been collected over many years. If every one of those images gets its own unique hash, that 1 in a trillion drops to 1 in 1000.

Then run that 1 in a trillion error rate against the fact that we make more than 1 trillion unique images per year.

The average person in the developed world is taking almost 1000 photos per year. On that basis alone you will have hits to deal with for basically the whole population of the earth. This is without allowing for a clashing image ending up in a meme or something and being duplicated a few million times.

mkone, my problem is that when you do the maths, I don’t see how you will be able to keep performing human review at 1 in a trillion. Even Apple says it will not, because it knows the error rate is going to be too high to do this. So it will only review a person once they pass some magical threshold number of hash hits.

Please note I said we take 1 trillion photos worldwide; I did not say how many images are stored.

https://blog.mylio.com/how-many-photos-will-be-taken-in-2021-stats/
Yes, this is an interesting read. In 2021 we have over 7 trillion photos archived online.
At a 1 in 1000 error rate on that, you are not going to have enough human reviewers. One in a trillion means that, with just the existing pool of images archived online, each hash will have at least 6 other matches that are not the target.

Yes, the number archived online is expanding at over 10 percent per year.

–you have much worse odds for pretty much anything else in life.–
That is not the case. It comes about because you are ignoring the number of events. Think about a basic die with 6 sides: the odds of rolling a 6 are 1 in 6, are they not? Now roll that die 6 times, and the odds that at least one of those rolls is a 6 are 66.51%; that is 6 events. The number of events with photographic images per year is insane. Once you allow for how many images exist (as individual events) that hashes have to be created for, and the size of the pool of images, 1 in a trillion means each hash has odds of being right of less than 1 in 7. Then allow for how many hashes need to be created, and the odds of you as a person being flagged incorrectly are like 1 in 1000. The reality is that you would want an error rate at least a million times better than 1 in a trillion in this use case.
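
The dice figure can be checked directly. A minimal Python sketch (the function name is made up for illustration): the chance of at least one six in n rolls of a fair die is 1 - (5/6)^n, which for six rolls comes to about 66.51%.

```python
from fractions import Fraction


def at_least_one_six(rolls: int) -> Fraction:
    """Exact probability of at least one six in `rolls` fair die rolls."""
    return 1 - Fraction(5, 6) ** rolls


p = at_least_one_six(6)
print(p, f"= {float(p):.2%}")
```

The same formula, with a much smaller per-event probability and a much larger number of events, is what drives the argument about photo volumes.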

Please also note that MD5, which is classed as broken, has a 1.47*10^-29 error rate. Your dependable hashes in the computer world are at a septillion or beyond (a quadrillion on the long scale), as in 10 to the power of 24, not the 10 to the power of 12 that Apple’s 1 in a trillion is. The hash Apple is saying they will use is a very poor quality hash.

2021-08-10 7:45 am

dsmogor
Why can’t they just invent a perceptual hash that uses more data, say 1k plus some redundancy to be resistant to basic image manipulation, and then run a crypto hash over the result?

2021-08-10 2:19 pm

Alfman verbose=1
dsmogor,

Why can’t they just invent a perceptual hash that uses more data, say 1k plus some redundancy to be resistant to basic image manipulation, and then run a crypto hash over the result?

A normal strong cryptographic hash cannot be used due to the fuzzy matching requirements of the problem. You could technically use it, but any alteration of the source material would result in the hash no longer matching, rendering cryptographic strength hashes rather useless for fuzzy matching purposes.
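
To illustrate the difference, here is a toy Python sketch. It uses SHA-256 from the standard library as the cryptographic hash and a simple “average hash” (one bit per pixel, set when the pixel is brighter than the image mean) as a stand-in for a perceptual hash. This is not PhotoDNA or NeuralHash, and the 4×4 grayscale pixel values are made up.

```python
import hashlib


def average_hash(pixels: list[int]) -> list[int]:
    """Toy perceptual hash: 1 where a pixel is brighter than the mean."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]


def hamming(a: list[int], b: list[int]) -> int:
    """Number of bit positions in which two hashes differ."""
    return sum(x != y for x, y in zip(a, b))


# Made-up 4x4 grayscale image, flattened row by row (values 0-255).
original = [10, 200, 30, 220, 15, 210, 25, 230,
            12, 205, 35, 215, 18, 225, 28, 235]

# The "same" image after a tiny change to a single pixel.
altered = list(original)
altered[0] += 3

sha_original = hashlib.sha256(bytes(original)).hexdigest()
sha_altered = hashlib.sha256(bytes(altered)).hexdigest()

print("SHA-256 digests match:", sha_original == sha_altered)
print("average-hash bit difference:",
      hamming(average_hash(original), average_hash(altered)))
```

The cryptographic digests no longer match after the one-pixel change, while the toy perceptual hash is unchanged, which is exactly the tolerance that makes fuzzy hashes easier to collide with.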

You are right about the redundancy; it’s why, for example, when you set up a fingerprint on your phone you input a multitude of scans to catch input variations. The fuzzy matching algorithm needs to allow for transformations, noise, different pressures, minute changes, etc. The analog domain is very imprecise; you could log in a million times with the raw data never once being exactly duplicated. Fuzzy matching and cryptographic hashing are kind of conflicting goals. Reducing false negatives implies making it easier to produce collisions.

Insofar as it applies to matching digital image files, we need to ask how exact are the copies expected to be and how much variation to allow for? Consider the effects of trans-coding using lossy compression, cropping/reframing, resizing, overlays/watermarks, etc. What’s the appropriate balance between false negatives and false positives? These questions are intrinsic to all fuzzy matching algorithms. A lot of this has been studied in the context of copyright infringement detection.

2021-08-09 2:47 am

Anonymous
The other major component of Apple’s system, an AI perceptual hash called a NeuralHash, is problematic too. The experts Apple cites have zero background in privacy or law, and while Apple’s whitepaper is “overly technical”, it “doesn’t give enough information for someone to confirm the implementation”.

I don’t necessarily agree with Louis Rossmann’s politics, and certainly not on everything, but his discussion with Daniel is actually very good for a general audience and examines the technical, privacy and human issues. One of the points is that, for all the technical brilliance of the staff at Apple, they are building a broken system because they have no expertise in the human rights and privacy issues, which involve a bigger picture and knock-on effects, i.e. they didn’t weigh the outcome.

https://www.youtube.com/watch?v=9ZZ5erGSKgs
Daniel Smullen is a Software Engineering Ph.D. candidate whose main areas of interest include privacy, security, machine learning, autonomous systems, software architecture, and the Internet of Things. Read for yourself: he talks the talk & walks the walk on privacy.

Perhaps there is a reason that they don’t want really technical people looking at PhotoDNA. Microsoft says that the “PhotoDNA hash is not reversible”. That’s not true. PhotoDNA hashes can be projected into a 26×26 grayscale image that is only a little blurry. 26×26 is larger than most desktop icons; it’s enough detail to recognize people and objects. Reversing a PhotoDNA hash is no more complicated than solving a 26×26 Sudoku puzzle; a task well-suited for computers.

There may be UK case law on this. Quite some years ago “pixelisation” was used by the media and others to conceal the faces of victims and innocent bystanders in pictures. Research then indicated that there was enough detail that merely squinting would allow someone to perceive a recognisable image of someone’s face. From that point on “pixelisation” used a much coarser grid. I forget what the numbers are but someone somewhere will have a citation for this.

I’m no expert on the maths behind this, but when I worked on high-performance graphics applications, upscaling was trivial even with basic API functions. Then there are other developers who produced more sophisticated upscaling and smoothing algorithms for console emulators. The latest AI software can essentially add detail buried deep in the maths and data of an image.

2021-08-09 12:33 pm

oiaohm
–There may be UK case law on this. Quite some years ago “pixelisation” was used by the media and others to conceal the faces of victims and innocent bystanders in pictures. Research then indicated that there was enough detail that merely squinting would allow someone to perceive a recognisable image of someone’s face. From that point on “pixelisation” used a much coarser grid. I forget what the numbers are but someone somewhere will have a citation for this.–

https://www.youtube.com/watch?v=g-N8lfceclI&t=227s
If you are not using enough pixelisation, Pixel2Style2Pixel can give you a basic range of possible faces it could be. That goes beyond images that you could squint at and guess. Remember that you get multiple different pixelisations in a video feed, so it is possible to take something like 15 frames of a pixelised person and solve down to basically one person, if the camera moved the right way to give you key features. Again, more frames mean more data points, and the more you can zone in.

2021-08-09 9:33 am

Bill Shooter of Bul Platinum Prime
Basically Apple didn’t work with people who are experts in the field, then launched something with some flaws in process and technology and managed to offend a bunch of people through that and their messaging. Everyone reasonable agrees what should happen, they just need to get on the same page and stop rage blogging at each other.

Right now the level of scrutiny the courts give “scientific evidence” in the US court system is sad, very sad. Forensic experts will testify to complete BS pseudoscience like bite marks identifying a person. I’m not saying we shouldn’t do our best to make sure a system like this doesn’t tag an innocent person, but we also maybe shouldn’t clutch pearls and act as if the system is perfect now if not for what Apple is doing to our otherwise flawless courts.

2021-08-10 10:34 am

Adurbe
I think that’s a misrepresentation. Apple clearly did engage with, and indeed hired, a number of experts in the field in building and designing the solution. This included technologists, legal experts and other expertise. What they didn’t do, nor can ever be expected to do, was consult with or hire every expert or self-proclaimed expert in the field.

What is often being conflated in the discussion is the validity of the technology and the politics that apply to its usage:

The technology Apple (and Microsoft) have created works.

There is an error rate, as with any system. The politics of the issue is: 1. Is that error rate acceptable in this context / use case? 2. Should it be applied universally? 3. Who decides the threshold or receives the information flagged (e.g. gov, law enforcement, etc.)?

Dr. Neal Krawetz, for example, has applied his expertise to suggest the technology isn’t suitably robust to perform the task; however, by his own admission he hasn’t actually seen the implementation. He has reverse-engineered a solution and drawn correlations from that. An OSNews equivalence is saying Windows can’t run all .exe applications because WINE can’t.

2021-08-10 1:32 pm

Alfman verbose=1
Adurbe,

What is often being conflated in the discussion is the validity of the technology and the politics that apply to its usage:

The technology Apple (and Microsoft) have created works.

Sure fuzzy hashing works, but it may not be as perfect as they’re trying to portray it and it doesn’t invalidate the privacy concerns that consumers may have.

There is an error rate, as with any system. The politics of the issue is: 1. Is that error rate acceptable in this context / use case? 2. Should it be applied universally? 3. Who decides the threshold or receives the information flagged (e.g. gov, law enforcement, etc.)?

This focuses on the “how”, but it is kind of missing an important point over whether we should. We’d be making the transition from a system of legal rights where consumers have a right to privacy and law enforcement needs a warrant to one where our property is backdoored for dragnet operations. Once we cross into this territory where our property can be scanned for legal compliance without any kind of warrant, we’re not going to be able to put the cat back into the bag. Authoritarian countries like China must already be salivating at this development because we already know they will exploit the technology for their political ends and Apple will comply because of profits. Western politicians bent on institutionalizing their moral code will have both the technology and the legal precedent to expand on.

Dr. Neal Krawetz, for example, has applied his expertise to suggest the technology isn’t suitably robust to perform the task; however, by his own admission he hasn’t actually seen the implementation. He has reverse-engineered a solution and drawn correlations from that. An OSNews equivalence is saying Windows can’t run all .exe applications because WINE can’t.

It’s not a great comparison though. Also, just because an OS is proprietary doesn’t mean outsiders can’t make observations about its limitations and push back against dubious claims. Think about it this way: if a scientist makes dubious claims but does not reveal their data or methods such that others can reproduce the same experiment (a lot of anti-gravity and free-energy videos come to mind here), then do we dismiss what we know about science in order to take their claims at face value? Or do we show healthy skepticism and push the importance of the scientific method?

It’s fine for Apple to publish numbers, but we still need more data to put it in context before it can be taken too seriously.
2021-08-10 8:01 pm

oiaohm
–An OSNews equivalence is saying Windows can’t run all .exe applications because WINE can’t.–
Adurbe, except that statement is not false. Wine can still run Win16 applications as part of its core code; Windows 10 cannot. Wine has ABI features Windows has removed, so Wine can absolutely run applications Windows cannot. Wine also knows that its support is incomplete.
https://test.winehq.org/data/
Wine in fact maintains a test suite, a stack of exes, that they also run on Windows. Yes, that test suite tells us that some applications that run on Vista will not run on Windows 10.

A reverse-engineered solution tells you a lot about what is possible.

Adurbe, there are some basic numbers that should have you worried.
MD5: 1.47*10^-29 error rate. That is a corrected number due to an error in the design; it used to be 10 to the power of -39.
SHA256: 8.64e-78 error rate.
The hash Apple is talking about using: 1e-12 error rate. 1 in a trillion is not a good number when it comes to hashes.
SHA256 long form error rate 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000864
The error rate Apple has claimed:
0.000000000001
There is quite a bit of a difference. The reality, once you know the maths, and even if I presume Apple’s maths are perfect, is that every hash has odds of about 1-in-7 accuracy against the number of images currently archived online, because there are over 7 trillion unique images online before photo modification.

The reality here is that you would want Apple claiming at least twice as many zeros: instead of the error rate of 1 in a trillion (1e-12) that Apple has claimed, you would want a 1e-24 error rate. That way a false positive would be rare.
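
The arithmetic behind this can be written out as a rough sketch. It assumes, as the comment does, that the quoted error rate applies to each image comparison and that there are 7 trillion unique images online; both figures are taken from the comment, not measured. The function name is made up for illustration.

```python
from fractions import Fraction

def expected_false_matches(images, error_rate):
    """Expected number of unrelated images that collide with one hash."""
    return images * error_rate

IMAGES_ONLINE = 7 * 10**12  # the comment's estimate of unique images online

for error_rate in (Fraction(1, 10**12), Fraction(1, 10**24)):
    expected = expected_false_matches(IMAGES_ONLINE, error_rate)
    print(f"error rate {float(error_rate):.0e}: {float(expected):.0e} expected false matches per hash")
```

Under those assumptions a 1e-12 rate gives about 7 expected false matches per hash, which is where the "7 times" figure comes from, while a 1e-24 rate gives about 7 in a trillion.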

If you want this done to stop the abuse of children, you don’t want police wasting 7 times the resources they need to get the people doing the wrong thing. 1 in a trillion (1e-12) sounds good, but we are talking about hashes for file matching, and in this field the numbers you are dealing with are huge. A trillion is small, to the point of insanely small. A 1 in a trillion (1e-12) hash collision error rate is basically the equivalent of the real-world odds that you got out of bed without killing yourself: a collision is basically almost absolutely sure to happen.

The estimated number of atoms in the observable universe is 10 to the power of 78 to 10 to the power of 82. This is a surprise to a lot of people: a good hash for detecting items will have an error rate that is like 1 divided by the number of atoms in the observable universe.
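
As a quick check on the scale involved, here is a small sketch that assumes an idealised hash where every digest is equally likely, so a single random guess matches a given n-bit digest with probability 2^-n. Under that assumption the SHA256 figure quoted above falls out directly; the corrected MD5 figure in the comment is not reproduced by this simple model.

```python
def single_guess_match_probability(bits):
    """Chance that one random guess equals a given digest of `bits` bits."""
    return 2.0 ** -bits

md5_ideal = single_guess_match_probability(128)     # 128-bit digest
sha256_ideal = single_guess_match_probability(256)  # 256-bit digest
perceptual_claim = 1e-12                            # the rate quoted in the comment

print(f"ideal 128-bit digest: {md5_ideal:.2e}")
print(f"ideal 256-bit digest: {sha256_ideal:.2e}")
print(f"claimed rate is {perceptual_claim / sha256_ideal:.1e} times larger than the 256-bit figure")
```

The 256-bit figure is of the same order as 1 divided by the low estimate of atoms in the observable universe (10 to the power of 78), which is the comparison the comment is making.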

2021-08-10 6:02 pm

Strossen
“I think that Apple has a very large CP/CSAM problem.
I think this might be the real reason Apple is building this system.”

Exactly. Apple doesn’t want to host photos of children being raped on its servers. That’s a good thing right?

Apple is trying to do something to stop hosting photos of children being raped. This is version one of that system. It won’t be perfect but Apple can improve the system in the future. Let’s hope that they succeed. Blocking photos of children being raped is a good thing right?

What’s the alternative – do nothing?

2021-08-10 8:11 pm

oiaohm
–Exactly. Apple doesn’t want to host photos of children being raped on its servers. That’s a good thing right?–

I am not too sure of this statement. Do note that Apple is talking about only investigating when a person passes a particular threshold. A low-accuracy hash will generate a huge number of false positives, and because you run out of manpower you have to set a high threshold. This means a person with a small collection of photos of children being raped will be able to host it on Apple’s own servers and slide through the net, written off as false positives when in reality it is real child porn.
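
The trade-off described here can be sketched with a binomial model. Everything numeric below is a made-up assumption for illustration (library size, per-image false match rate, thresholds); none of it comes from Apple’s documentation. The sketch only shows the shape of the argument: raising the threshold makes it far less likely that an innocent library is flagged, and by construction any collection with fewer real matches than the threshold is not flagged either.

```python
import math

def prob_at_least(n, p, t):
    """P(X >= t) for X ~ Binomial(n, p), summing the upper tail in log space."""
    total = 0.0
    for k in range(t, n + 1):
        log_term = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log1p(-p)
        )
        term = math.exp(log_term)
        total += term
        if term < total * 1e-18:  # remaining terms no longer change the sum
            break
    return total

def is_flagged(real_matches, threshold):
    """An account is only reviewed once it reaches the threshold."""
    return real_matches >= threshold

LIBRARY_SIZE = 10_000     # hypothetical number of photos in an innocent library
FALSE_MATCH_RATE = 1e-6   # hypothetical per-image false match rate

for threshold in (1, 5, 30):
    chance = prob_at_least(LIBRARY_SIZE, FALSE_MATCH_RATE, threshold)
    print(f"threshold {threshold}: innocent library flagged with probability {chance:.3e}")
    print(f"  collection of {threshold - 1} real matches flagged: {is_flagged(threshold - 1, threshold)}")
```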

If I were a pedophile designing a system that people may believe will catch pedophiles but that is unlikely to catch the majority of pedophiles, I would design it exactly like the one Apple has put forward.

It’s one thing to give up privacy for something that works. It’s another thing to give up privacy for something so poorly designed that it’s not going to work successfully. Yes, doing this gets Apple the means to claim they did something: something that is not going to be properly effective.
2021-08-10 9:13 pm

Alfman verbose=1
Strossen,

Exactly. Apple doesn’t want to host photos of children being raped on its servers. That’s a good thing right?

Apple is trying to do something to stop hosting photos of children being raped. This is version one of that system. It won’t be perfect but Apple can improve the system in the future. Let’s hope that they succeed. Blocking photos of children being raped is a good thing right?

“Think of the children” can be used to justify any invasion of privacy. Why don’t we have Apple monitor all phone calls? After all they may be sex predators. Why not scan emails and texts for the same reason? Why not have mandatory car tracking to help investigate kidnapping? Why not have all personal transactions monitored to help catch payments for sex trafficking? Have Siri monitor people’s private homes 24×7 for child abuse inside while we’re at it. A lot of things are possible and can be defended under the exact same “think of the children” argument you are using: “Apple is trying to do something to stop children being abused at home….stopping kidnappers is a good thing right?”

The slippery slope inevitably follows: once a technology is deployed it will be subjected to mission creep, continuing to chip away at privacy. We should not nullify privacy rights lightly. Warrants exist for a good reason and throwing their concept away opens up avenues for abuse by authorities. It’s only a matter of time before privacy invading technology gets abused. Just think of the ways China would love to abuse such a system.

What’s the alternative – do nothing?

No, we should make it easier to report cases of abuse. But IMHO searches need to be backed by lawful warrants and probable cause. Also, despite our intentions, sometimes the intervention of authorities actually causes much more social harm than the offense itself. A simple act of “sexting” can land high school students who are coming of age on a sex offender registry, which can be far more traumatizing than the committed act. Rather than having victims of a crime, we end up with people becoming primarily victims of the law. Now it may be tempting to brush this aside as not Apple’s problem, the law is the law after all, but nevertheless it highlights that there are times when enforcing the law is actually worse than doing nothing.

None of this is intended to dismiss the real and serious cases of abuse, but I think we need to rebuff the notion that it’s a purely black and white issue.