Multimedia, AV Archive
VLC media player, the popular open-source software developed by nonprofit VideoLAN, has topped 6 billion downloads worldwide and teased an AI-powered subtitle system. The new feature automatically generates real-time subtitles — which can then also be translated in many languages — for any video using open-source AI models that run locally on users’ devices, eliminating the need for internet connectivity or cloud services, VideoLAN demoed at CES. ↫ Manish Singh at TechCrunch
VLC is choosing to throw users who rely on subtitles for accessibility or translation reasons under the bus. Using speech-to-text and even “AI” as a starting point for a proper accessibility expert or translator is fine, and can greatly reduce the workload. However, as anyone who works with STT and “AI” translation software knows, their output is highly variable and wildly unreliable, especially once English isn’t involved. Dumping the raw output of these tools onto people who rely on closed captions and subtitles to even be able to view videos is not only lazy, it’s deeply irresponsible and demonstrates a complete lack of respect and understanding. I was a translator for almost 15 years, with two university degrees on the subject to show for it. This is obviously a subject close to my heart, and the complete and utter lack of respect and understanding from Silicon Valley and the wider technology world for proper localisation and translation has been a thorn in my side for decades. We all know about bad translations, but it goes much deeper than that – with Silicon Valley’s utter disregard for multilingual people drawing most of my ire.
Despite about 60 million people in the US alone using both English and Spanish daily, software still almost universally assumes you speak only one language at all times, often forcing fresh installs for something as simple as changing a single application’s language, or not even allowing autocorrect on a touch keyboard to work with multiple languages simultaneously. I can’t even imagine how bad things are for people who, for instance, require closed-captions for accessibility reasons. Imagine just how bad the “AI”-translated Croatian closed-captions on an Italian video are going to be – that’s two levels of “AI” brainrot between the source and the ears of the Croatian user. It seems subtitles and closed captions are going to be the next area where technology companies are going to slash costs, without realising – or, more likely, without giving a shit – that this will hurt users who require accessibility or translations more than anything. Seeing even an open source project like VLC jump onto this bandwagon is disheartening, but not entirely unexpected – the hype bubble is inescapable, and a lot more respected projects are going to throw their users under the bus before this bubble pops. …wait a second. Why is VLC at CES in the first place?
A new major release, FFmpeg 7.0 “Dijkstra”, is now available for download. The most noteworthy changes for most users are a native VVC decoder (currently experimental, until more fuzzing is done), IAMF support, or a multi-threaded ffmpeg CLI tool. This release is not backwards compatible, removing APIs deprecated before 6.0. The biggest change for most library callers will be the removal of the old bitmask-based channel layout API, replaced by the AVChannelLayout API allowing such features as custom channel ordering, or Ambisonics. Certain deprecated ffmpeg CLI options were also removed, and a C11-compliant compiler is now required to build the code. ↫ FFmpeg website
I don’t think many of us directly interface with FFmpeg, but we’re most likely all using it one way or another. Even Microsoft (here’s the referenced bug report).
A few days ago, my former coworker Evan Hahn posted “The world’s smallest PNG”, an article walking through the minimum required elements of the PNG image format. He gave away the answer in the very first line: However (spoilers!) he later points out that there are several valid 67-byte PNGs, such as a 1×1 all-white image, or an 8×1 all-black image, or a 1×1 gray image. All of these exploit the fact that you can’t have less than one byte of pixel data, so you might as well use all eight bits of it. Clever! However again…are we really limited to one byte of pixel data? ↫ Jordan Rose You know where this is going.
The smallest PNG file is 67 bytes. It’s a single black pixel. Here’s what it looks like, zoomed in 200×: The rest of this post describes this file in more detail and tries to explain how PNGs work along the way. There’s a big twist at the end, if that excites you. But I hope you’re just excited to learn about PNGs. ↫ Evan Hahn I know way too much about PNGs now, information I won’t ever need but am glad to have.
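For the curious, the file layout the article walks through (signature, then IHDR, IDAT and IEND chunks) can be sketched in a few lines of Python. This is a rough sketch under my own assumptions, not Evan Hahn's code: the helper names `chunk` and `tiny_black_png` are made up for illustration, and the exact file size depends on how the local zlib build compresses the two-byte scanline.

```python
import struct
import zlib

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC-32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_black_png() -> bytes:
    """Return a 1x1, 1-bit grayscale PNG whose single pixel is black."""
    # IHDR: width, height, bit depth, color type (0 = grayscale),
    # compression method, filter method, interlace method.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 1, 0, 0, 0, 0)
    # One scanline: filter-type byte 0, then one byte holding the pixel
    # in its top bit (0 = black); the remaining seven bits are padding.
    scanline = b"\x00\x00"
    idat = zlib.compress(scanline, 9)
    return (
        PNG_SIGNATURE
        + chunk(b"IHDR", ihdr)
        + chunk(b"IDAT", idat)
        + chunk(b"IEND", b"")
    )


if __name__ == "__main__":
    png = tiny_black_png()
    print(len(png), "bytes")
```

Everything except the compressed pixel data is fixed overhead: 8 bytes of signature, 25 for IHDR, and 12 each for the IDAT and IEND wrappers, so 57 bytes before a single pixel is stored.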
When Jean-Baptiste Kempf joined École Centrale Paris as a student in 2003, he was tasked with helping run the university’s computer network. It included an unusual project: student-run open-source software that had been running on a couple of university servers for seven years. To students, the project was known as “Network 2000.” To the rest of the world, it was VLC media player. Kempf—now the president of VLC’s parent organization, the nonprofit VideoLAN—is the person who helped guide VLC’s journey from student project to ubiquitous software. (VideoLAN Client, the original name for the project, is where VLC gets its name.) On the surface, he’s laid-back, casual, and frank, though that belies a steely determination. As the person overseeing the project and its team, he sets the tone for VLC as a whole. VLC is one of those quintessential pieces of software. An outstanding application.
Karl Bode at Techdirt: But we’ve also noted that, ironically, the glut of video choices–more specifically the glut of streaming exclusivity silos–risks driving users back to piracy. Studies predict that every broadcaster and their uncle will have launched their own direct-to-consumer streaming platform by 2022. Most of these companies are understandably keen on locking their own content behind exclusivity paywalls, whether that’s HBO Now’s Game of Thrones, or CBS All Access’s Star Trek: Discovery. But as consumers are forced to pay for more and more subscriptions to get all of the content they’re looking for, they’re not only getting frustrated by the growing costs (defeating the whole point of cutting the cord), they’re frustrated by the experience of having to hunt and peck through an endlessly shifting sea of exclusivity arrangements and licensing deals that make it difficult to track where your favorite show or film resides this month.
With all kinds of series and IPs moving around from company to company these days, it’s getting impossible to keep track of where and how to watch both new and old series. It used to be quite simple – Netflix and your local streaming service for us Europeans – and you’d be pretty well set. Maybe add in HBO for Game of Thrones – usually one person in your group of friends had HBO here in Europe – and everything was covered. Now, though, things are rapidly falling apart in countless different silos, each at anywhere between €5-10/month, which is becoming unjustifiable. Piracy is definitely going to make a major comeback if this continues.
Estrada had promised a demonstration of a remarkable new instrument, one that had changed the whole way he made music. Two walls of the room were dedicated to racks of synthesizers — row after row of buttons and knobs and unwieldy wiring, a veritable museum of advanced technology spanning decades and costing thousands of dollars. Estrada ignored all of it. Instead, he plucked a small device from the spot where it was hanging from a hook. It looked like the exploded innards of a calculator, with a splat of knobs and buttons. There was no keyboard. Estrada plugged it into a set of speakers, held it in both hands and hunched over it slightly, as if handling a phone while texting, and began to play. He punched the buttons, and a rapid-fire sequence of clicks began to repeat. Then he twisted one of the knobs, and the clicks deepened into a more hollow sound, like that of a kick drum. More button punches, more knob twists, more sounds: a spacey high-hat, a background static roar, a tonal burst that altered slightly and quickly became a repeated phrase. Suddenly there was more than a beat; there was a little song. Technology changes and democratises in more ways than one, and no industry is safe from its effects. What a fascinating story and device.
As a little side-project, I have been working on putting the artificial neural networks of AI Gigapixel to the test and having them upscale another favorite thing of mine… Star Trek: Deep Space Nine (DS9). Just like Final Fantasy 7, of which I am upscaling the backgrounds, textures, and videos in Remako mod, DS9 was also relegated to a non-HD future. While the popular Original Series and The Next Generation were mostly shot on film, the mid 90s DS9 had its visual effects shots (space battles and such) shot on video. While you can rescan analog film at a higher resolution, video is digital and can’t be rescanned. This makes it much costlier to remaster this TV show, which is one of the reasons why it hasn’t happened. Fascinating methodology, and the results speak for themselves. Amazing work.
The entire experience of vinyl helps to create its appeal. Vinyl appeals to multiple senses—sight, sound, and touch—versus digital/streaming services, which appeal to just one sense (while offering the delight of instant gratification). Records are a tactile and a visual and an auditory experience. You feel a record. You hold it in your hands. It’s not just about the size of the cover art or the inclusion of accompanying booklets (not to mention the unique beauty of picture disks and colored vinyl). A record, by virtue of its size and weight, has gravitas, has heft, and the size communicates that it matters. Anyone who says vinyl sounds objectively better – using the same amplifier and speaker hardware as modern media – can hardly be taken seriously, but that doesn’t mean vinyl can’t sound subjectively better. When it comes to older music from the ’60s and ’70s, I enjoyed listening to it on vinyl records (I don’t have a record player at this moment), but that had nothing to do with sound quality, and everything to do with the more archaic, unique experience of listening to a vinyl record.
Magic Lantern is a software enhancement that offers increased functionality to the excellent Canon DSLR cameras. We have created an open framework, licensed under GPL, for developing extensions to the official firmware. Magic Lantern is not a “hack”, or a modified firmware, it is an independent program that runs alongside Canon’s own software. Each time you start your camera, Magic Lantern is loaded from your memory card. Our only modification was to enable the ability to run software from the memory card. ML is being developed by photo and video enthusiasts, adding functionality such as: HDR images and video, timelapse, motion detection, focus assist tools, manual audio controls, and much more.
What a fascinating project. I knew you could put custom ROM images on digital cameras, but this seems like a far safer and less warranty-breaking way of extending and improving the functionality of your camera.
«System Beeps» is a music album in shape of an MS-DOS program that features original music composed for PC Speaker using the same basic old techniques like ones found in classic PC games. It follows the usual retrocomputing demoscene formula — take something rusty and obsolete, and push it to eleven — and attempts to reveal the long hidden potential of this humble little sound device. You can hear it in action and form an opinion on how successful this attempt was at Bandcamp, or in the video below. The following article is an in-depth overview of the original PC Speaker capabilities and making of the project, for those who would like to know more. What an amazing work of art, and I love the detailed description of how it was made using nothing but the PC speaker. This article is quite detailed, and the project itself is released under the CC-BY license.
VLC 4.0 is on the way, and the VLC developers have listed what they have in store for this major new release. The most obvious new user-facing feature is brand new user interfaces for each platform the media player supports, such as KDE, Gnome, Windows, macOS, and more. Work on the new VLC 4.0 user-interface is progressing, there will be GNOME and KDE adaptations, support for both server-side and client-side decorations, and great support for Wayland as well as X11 — including support for macOS, Windows, etc. With VLC 4.0, they intend to gut out support for Windows XP/Vista as well as bumping the macOS, iOS, and Android requirements. On the Linux front, they intend to require OpenGL acceleration for this media player. There’s no information yet on when this new release will be made available.
Given its appearance in one form or another in all but the cheapest audio gear produced in the last 70 years or so, you'd be forgiven for thinking that the ubiquitous VU meter is just one of those electronic add-ons that's more a result of marketing than engineering. After all, the seemingly arbitrary scale and the vague "volume units" label make it seem like something a manufacturer would slap on a device just to make it look good. And while that no doubt happens, it turns out that the concept of a VU meter and its execution has some serious engineering behind it that belies the really simple question it seeks to answer: how loud is this audio signal?
I love analog VU meters, and I'm kind of sad regular, non-professional music equipment has done away with them entirely.
In a previous blogpost we talked about the Opus codec, which offers very low bitrates. Another codec seeking to achieve even lower bitrates is Codec 2.
Codec 2 is designed for use with speech only, and although the bitrates are impressive the results aren’t as clear as Opus, as you can hear in the following audio examples. However, there is some interesting work being done with Codec 2 in combination with neural networks (WaveNets) that is yielding great results.
I designed and built a Canon EF Mount for my Game Boy Camera. The GBC has a sensor size of about 3.6mm² which seems equivalent to a 1/4" sensor. This gives the GBC a crop factor of about 10.81. With my 70-200 f4 mounted on a 1.4x extender, this gives me a max equivalent focal distance of about 200x1.4x10.81=3,026.8mm.
I always wanted a Game Boy Camera when I was a kid. It still looks like magic to me today.
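The equivalent focal length quoted above is simple multiplication, and it can be checked with a short Python sketch. The function name `equivalent_focal_length_mm` is my own, and the crop factor of 10.81 is taken as given from the quote rather than derived from the sensor size.

```python
def equivalent_focal_length_mm(
    focal_length_mm: float, extender: float, crop_factor: float
) -> float:
    """Full-frame equivalent: lens focal length x extender x crop factor."""
    return focal_length_mm * extender * crop_factor


# Numbers from the quote: a 70-200mm lens at its long end, a 1.4x extender,
# and the stated (not derived here) crop factor of about 10.81.
result = equivalent_focal_length_mm(200, 1.4, 10.81)
print(f"{result:.1f} mm")  # prints 3026.8 mm
```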
FFmpeg 4.0 has been released, and it's a major one. Since this particular subject matter - and its changelog - is way beyond the scope of my capabilities, I'll just leave you with the generic description of the project (in case you live under a rock).
FFmpeg is the leading multimedia framework, able to decode, encode, transcode, mux, demux, stream, filter and play pretty much anything that humans and machines have created. It supports the most obscure ancient formats up to the cutting edge. No matter if they were designed by some standards committee, the community or a corporation. It is also highly portable: FFmpeg compiles, runs, and passes our testing infrastructure FATE across Linux, Mac OS X, Microsoft Windows, the BSDs, Solaris, etc. under a wide variety of build environments, machine architectures, and configurations.
64kB intros, 64k for short, are like demos but with an added arbitrary limitation on the size: they must fit entirely within a single binary file of no more than 65536 bytes. No extra assets, no network, no extra libraries: the usual rule is that it should run on a freshly installed Windows PC with up to date drivers.
This is crazy.
Apple joining the Alliance for Open Media is a really big deal. Now all the most powerful tech companies - Google, Microsoft, Apple, Mozilla, Facebook, Amazon, Intel, AMD, ARM, Nvidia - plus content providers like Netflix and Hulu are on board. I guess there's still no guarantee Apple products will support AV1, but it would seem pointless for Apple to join AOM if they're not going to use it: apparently AOM membership obliges Apple to provide a royalty-free license to any "essential patents" it holds for AV1 usage.
It seems that the only thing that can stop AOM and AV1 eclipsing patent-encumbered codecs like HEVC is patent-infringement lawsuits (probably from HEVC-associated entities).
I can barely believe this is still a thing, and that it seems like a positive outcome.
Nilay Patel on the further disappearance of the headphone jack, and its replacement, Bluetooth:
To improve Bluetooth, platform vendors like Apple and Google are riffing on top of it, and that means they’re building custom solutions. And building custom solutions means they’re taking the opportunity to prioritize their own products, because that is a fair and rational thing for platform vendors to do.
Unfortunately, what is fair and rational for platform vendors isn’t always great for markets, competition, or consumers. And at the end of this road, we will have taken a simple, universal thing that enabled a vibrant market with tons of options for every consumer, and turned it into yet another limited market defined by ecosystem lock-in.
This is exactly what's happening, and it is turning something simple and straightforward - get headphones, plug them into literally every single piece of headphones-enabled audio equipment made in the last 100 years, and have them work - into an incompatibility nightmare. And this incompatibility nightmare is growing and getting worse, moving beyond just non-standard Bluetooth; you can't use Apple Music with speakers from Google or Amazon, and Spotify doesn't work on the Apple Watch.
Removing the headphone jack was a user-hostile move when Apple did it, and it's still a user-hostile move when Google does it.
About four years ago, we shared our plans for playing premium video in HTML5, replacing Silverlight and eliminating the extra step of installing and updating browser plug-ins.
Since then, we have launched HTML5 video on Chrome OS, Chrome, Internet Explorer, Safari, Opera, Firefox, and Edge on all supported operating systems. And though we do not officially support Linux, Chrome playback has worked on that platform since late 2014. Starting today, users of Firefox can also enjoy Netflix on Linux. This marks a huge milestone for us and our partners, including Google, Microsoft, Apple, and Mozilla that helped make it possible.
It wasn't that long ago we barely dared to imagine HTML5 video taking over from Flash and Silverlight.