Linked by Thom Holwerda on Fri 23rd Aug 2013 08:37 UTC

Pretty much for my entire career in Linux USB (eight years now?), we've been complaining about how USB device power management just sucks. We enable auto-suspend for a USB device driver, and find dozens of different USB devices that simply disconnect from the bus when auto-suspend is enabled.

For years, we've blamed those devices for being cheap, crappy, and broken. We talked about blacklists in the kernel, and ripped those out when they got too big. We've talked about whitelists in userspace, but not many distros have time to cultivate such lists.

It turns out it's not always the device's fault.

Fascinating bug.

The fix
by WereCatf on Fri 23rd Aug 2013 08:55 UTC

The temporary fix is apparently for Linux to wait 20ms by default instead of the previous 10ms, but xHCI is apparently supposed to emit an interrupt when the device is actually ready, rather than the host initiating anything. What struck me as odd is that this interrupt apparently goes completely ignored in Linux's USB stack, even though it's mentioned in the specs. Why? What's the rationale for implementing resume wrong in the first place? Will it actually be fixed properly now that it's out, or will the devs just stick to the 20ms workaround?
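
To make the contrast concrete, here is a minimal Python sketch of the two approaches. It is not kernel code and does not model the real xHCI interface: the function names, the simulated device, and the 17ms readiness time are all hypothetical. Only the 10ms and 20ms waits come from the discussion above.

```python
import threading

def resume_with_fixed_delay(device_ready_after_ms, host_wait_ms):
    """Fixed-timeout approach: the host waits a set time and then assumes
    the device is ready. Returns True only if that assumption holds."""
    return host_wait_ms >= device_ready_after_ms

def simulate_device(ready_event, ready_after_s):
    """Hypothetical device that signals readiness after ready_after_s seconds.
    The Event stands in for a 'device is ready' notification."""
    timer = threading.Timer(ready_after_s, ready_event.set)
    timer.start()
    return timer

def resume_with_event(ready_event, timeout_s):
    """Event-driven approach: the host blocks until the readiness
    notification arrives, or gives up after timeout_s seconds."""
    return ready_event.wait(timeout_s)

if __name__ == "__main__":
    # Assumed device that needs 17ms to recover (made-up figure).
    print("10ms wait ok:", resume_with_fixed_delay(17, 10))
    print("20ms wait ok:", resume_with_fixed_delay(17, 20))

    ready = threading.Event()
    t = simulate_device(ready, 0.017)
    print("event-driven ok:", resume_with_event(ready, 1.0))
    t.join()
```

With a fixed delay, any device slower than the chosen value still fails, which is why bumping the number only moves the problem. Waiting on a notification, with a generous timeout as a safety net, does not depend on guessing the right delay.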

Score: 4

RE: The fix
by Fergy on Fri 23rd Aug 2013 09:08 in reply to "The fix"

Will it actually be fixed properly now that it's out, or will the devs just stick to the 20ms work-around?

From the article:
This patch is not the "real fix" for solving the issues with the USB core, and I despise fixing things by tweaking timeout values, so I'll have to work on a real fix tomorrow. But at least there's a light at the end of the tunnel for USB device power management.

Score: 8

RE[2]: The fix
by osvil on Fri 23rd Aug 2013 10:16 in reply to "RE: The fix"

What is kind of fun is that devices that somehow worked on other systems were dismissed as faulty. Usually, if something works for others but not on your system, the first suspicion should fall on your system. It could be that the device is faulty and the problems only arise in your context, but over my career I've found that in many cases it is the other way around.

It is like bugs in compilers. I've come across some, for sure. But most times I thought it was a compiler error, it was actually bad code on my side.

Score: 6