VLC media player, the popular open-source software developed by nonprofit VideoLAN, has topped 6 billion downloads worldwide and teased an AI-powered subtitle system.
The new feature automatically generates real-time subtitles — which can then also be translated in many languages — for any video using open-source AI models that run locally on users’ devices, eliminating the need for internet connectivity or cloud services, VideoLAN demoed at CES.
↫ Manish Singh at TechCrunch
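For context on what such a feature involves: local speech-to-text models of the kind VideoLAN describes typically emit timestamped text segments, which the player then has to render as subtitles. A minimal sketch of turning such segments into the standard SRT subtitle format — assuming Whisper-style `(start, end, text)` segments; the function names are mine, not VLC's:

```python
# Sketch: convert timestamped transcription segments into SRT subtitle text.
# Assumes Whisper-style segments: (start_seconds, end_seconds, text) tuples.

def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    """Render (start, end, text) segments as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}\n"
        )
    return "\n".join(blocks)

print(srt_timestamp(3661.5))  # 01:01:01,500
```

The formatting step is the easy part; the contested part, as discussed below, is the quality of the recognized and translated text that goes into it.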
VLC is choosing to throw users who rely on subtitles for accessibility or translation reasons under the bus. Using speech-to-text and even “AI” as a starting point for a proper accessibility expert or translator is fine, and can greatly reduce the workload. However, as anyone who works with STT and “AI” translation software knows, their output is highly variable and wildly unreliable, especially once English isn’t involved. Dumping the raw output of these tools onto people who rely on closed captions and subtitles to even be able to view videos is not only lazy, it’s deeply irresponsible and demonstrates a complete lack of respect and understanding.
I was a translator for almost 15 years, with two university degrees on the subject to show for it. This is obviously a subject close to my heart, and the complete and utter lack of respect and understanding from Silicon Valley and the wider technology world for proper localisation and translation has been a thorn in my side for decades. We all know about bad translations, but it goes much deeper than that – with Silicon Valley’s utter disregard for multilingual people drawing most of my ire. Despite about 60 million people in the US alone using both English and Spanish daily, software still almost universally assumes you speak only one language at all times, often forcing fresh installs for something as simple as changing a single application’s language, or not even allowing autocorrect on a touch keyboard to work with multiple languages simultaneously.
I can’t even imagine how bad things are for people who, for instance, require closed captions for accessibility reasons. Imagine just how bad the “AI”-translated Croatian closed captions on an Italian video are going to be – that’s two levels of “AI” brainrot between the source and the ears of the Croatian user.
It seems subtitles and closed captions are going to be the next area where technology companies are going to slash costs, without realising – or, more likely, without giving a shit – that this will hurt users who require accessibility or translations more than anything. Seeing even an open source project like VLC jump onto this bandwagon is disheartening, but not entirely unexpected – the hype bubble is inescapable, and a lot more respected projects are going to throw their users under the bus before this bubble pops.
…wait a second. Why is VLC at CES in the first place?
I’m not sure I see a scenario where someone would lose their job because their client thinks “I’ll tell everyone to read the video with VLC’s local transcript/translation feature instead of paying a translator”. This is not intended for companies but for end users, and it’s not meant to run as a service but as a local feature on a case-by-case basis. Even an obscure video content producer will keep advertising Google’s version of this feature by keeping their videos on YouTube before telling their audience to use VLC.
I know and understand your views on this matter, but they may have inflated the importance of this feature release compared to its real value.
On the contrary, I think that accessibility-wise it’s a good step (imperfect, for sure), if only for transcription.
I might of course be wrong, and the future will tell.
I think this is great. All local and open source? That’s amazing for personal content that doesn’t have captions.
There are wider implications, sure, but VLC isn’t aiming this at stopping humans from doing this work. It’s for personal use, and the wider industry is going to head in this direction anyway, with or without a trustworthy project like VLC.
If OSS doesn’t add capabilities like this, people will stop using it. Good for VLC!
Don’t worry, society will renormalize after the Great Reset.
“Croation” ? or “Croatian” as in, the subdialect of new shtokavian ? 🙂
Honestly, it’s merely meant to be a tool, to fill a gap for those who don’t have a real/official subtitle at hand… as you say, STT is less than good, AI or not, and it’s obvious enough… I’d say, as long as it’s clearly marked as automated and states that it’s far from precise, we’re fine… there will be enough of a public at large, for content that needs it, to ask for a real translation that requires quality and finesse.
But yeah, VideoLAN at CES is kinda weird 🙂 (go go Epitech! I do remember getting drunk in the parking lot they squatted under the school to hang out… it was the birthplace of a few amazing projects!)
“VLC is choosing to throw users who rely on subtitles for accessibility or translation reasons under the bus.”
What is this utter nonsense you wrote? If subtitles exist in the media, it’s not like VLC is going to not display them. Stop trying to shoot the messenger…
So, the introduction of “AI” has improved the quality of content on the internet? What makes you think the introduction of “AI” in subtitling/closed captioning is going to be any different?
I swear to god, pattern recognition is a lost art.
Absolutely. Subtitles will actually reflect the intent of the original content now, instead of being platforms for translators to inject bias. If translations aren’t good enough right now… give it a few more months. Translation as a job has been obsolete for a decade anyway in all but very demanding scenarios.
People can get better, faster, cheaper subs and translations today than ever.
That’s rich coming from you.
When real subtitles are absent, it’s better than nothing, Thom. I often use the automatic translation of X/Twitter for posts written in languages that are completely foreign to me, and most of the time the translation is good. Things have moved on from the days of Babelfish, Thom.
I’ve used tools far, far more advanced than Google Translate or ChatGPT. They are not even remotely capable of doing what is needed for proper accessibility, but they will still replace actually educated specialists. If you don’t think this is going to make subtitling and closed captioning worse for everyone, I would once again like to point you to the web. Has the introduction of “AI” improved the quality of web content?
If no, why do you think it’s going to be different for accessibility?
An educated specialist who requires a $70k payroll and still makes mistakes and injects biases, versus nearly free AI translation… 99.9999% of people and companies are going to pick the cheaper option and deal with the usually very minor flaws.
AI models can also generate translations at much lower latency. Why wait a month or more for a bad translation that fails to convey meaning, when I can get a good translation in milliseconds in a dozen languages?
I think many people will just have to change jobs… teaching languages isn’t going away, but translation is dead.
I don’t have a dog in this fight, but I will say that AI is absolutely vulnerable to bias, and this has been proven time and again with ChatGPT and other content-scraping AI tools. It’s literally garbage in/garbage out, it can only learn from what it scrapes and the Internet at large is biased, period. Anything built by humans with emotions and opinions is subject to bias, including AI. The difference between a human translator and an AI translator is that the human is perceptive enough to realize they might have projected their own bias or opinion and is capable of correcting themselves, or submitting their work to another human editor for review and correction. Currently there is no AI out there that can self-review like that, because they don’t have the ability to reason, they just copy and paste in an efficient way. It’s why you see warnings everywhere AI is used that implore you to fact check and review the results, because they are often wrong.
https://www.ohchr.org/en/stories/2024/07/racism-and-ai-bias-past-leads-bias-future
https://time.com/5520558/artificial-intelligence-racial-gender-bias/
Sure, it’s vulnerable, but it’s also fixable at scale… human biases are not easily compensated for at scale, especially when the entire industry is systemically biased.
I don’t even care about some random translator’s personal biases… but it becomes a problem when they start injecting them into their work.
Quality releases will still use professional translators. Non-quality releases (and Google services) have moved to automatic translation years ago when Google Translate became good enough. I have the language in my Google Account set to Greek, so I know this happened years before ChatGPT.
What quality releases?
Subtitles on Netflix/Blu-ray are terrible right now (I suspect it is already AI, plus a quick skim by someone whose native language is not the language being subtitled).
We are already there.
Thom, web content and movie subtitles are completely different things. With AI and web content — I’m all with you without any moment of hesitation.
But VLC and movie subtitles… just imagine a recent example (I’m European, not American, just for the background): I’m learning Spanish, and I’m good enough to listen to podcasts or radio where people tend to speak more clearly and distinctly. But with *some* movies with fast speech, where there are absolutely no subtitles provided on the DVD (speaking of older movies), I’m just struggling, so this kind of automatic AI subtitling would really be helpful. Even if lousy – my mind is flexible enough to compensate 😀
And mind you – even before you say anything – we are talking here about the cases where I don’t expect anyone (even non-professional!) creating the subtitles for some old niche movies.
VLC is basically integrating for everyone something that Linux users in the know have had for a while with a program called LiveCaptions.
A lot of anime these days is being AI-translated because those so-called educated specialists have allowed their personal bias to mess with translations. I am not kidding about this. Japanese anime firms were more than willing to pay human translators until those translators were found to be changing characters’ sexes and so on just because it fitted the translator’s point of view. Yes, some of the translators doing this were dumb enough to admit to it on Twitter, now X.
https://boundingintocomics.com/anime/resurfaced-video-jamie-marchi-proudly-touting-wild-changes-made-to-panty-stocking-english-dub-you-want-it-to-be-different-and-original/
Yes, one such example, and it’s not the only one…
From my point of view, we need local AI translation, preferably with multiple different engines. Not that human translation cannot be better, but to give a baseline so you can see directly when the human translator is not doing their job anywhere near properly.
Also, about this “proper accessibility” claim: I have been deaf myself, and boy, when I got my hearing back and rewatched a few movies that I had originally watched with subtitles, I straight up noticed that the subtitles and the spoken text were totally out of alignment. Yes, that explained some of the disagreements I had with some of my friends at the time. When I showed them that the subtitles and the audio were out of alignment, they were like “what the hell” – and this was before AI.
With experience, a person can learn to read AI-translated text while allowing for how the AI commonly screws up, and without the years of training needed to perform the translation themselves. So AI can improve accessibility. The idea that proper accessibility needs educated specialists is false. Indeed, one of the funniest books of all time is by a human who was not an educated specialist, who did a translation without knowing enough and screwed it up so badly it’s a joke – even the title is mangled: “English As She Is Spoke”. You can work out what the person was attempting to go for, even though it’s wrong. I can also see some people intentionally running something through a known-poor AI translator just to see how funnily it screws up the translation – there is comedy here. That’s why I hope VLC’s implementation allows choosing AI translation engines, including bad ones.
The big problem is that there have been no direct legal punishments for intentional mistranslation – something that can feed those who (1) cannot hear or (2) cannot speak the language fraudulent information without them being aware.
I would say AI has made some areas better and other areas worse. In a lot of ways, I want to see the court cases against AI firms whose AI contains absolutely provable false information start also being applied to so-called educated specialists who are proven to intentionally mistranslate stuff.
On Windows I’ve never had a reason to use anything other than MPC-HC (actively maintained by clsid2). Old school interface, intuitive controls, can fetch subtitles from Podnapisi or OpenSubtitles, no plugins required. The GOAT.
Interest rates (the price of money) have been too low/cheap for too long. A lot of businesses were created that can’t exist in a higher interest rate environment. We are now seeing those businesses get caught out with no revenue, and they are rushing to automate and cut costs, or jack up prices and try to gouge their way out of it. This is one symptom of that. Enshittification is another. The market is broken, and it will reset to a new equilibrium, but the process is going to be painful; I expect ~50% of software companies are going to go bust. AI isn’t living up to the hype, and at some point the market will catch up. Engineers already know.
> A lot of businesses were created that can’t exist in a higher interest rate environment.
You mean, like Greece, the US, Japan, or any other nation in debt at >120% of their GDP?
Bring it, raise interest rates and see what will be left 🙂
Mark my words: outside of Africa (population growth >3%) you will never see high interest rates again. Because interest is NOT the cost of money, but actually the yield of expansion. And there is no growth anymore when populations are declining.
I see this as positive news, tbh. A lot of content isn’t, and never will be, translated. And this gives a “good enough” option for those who need it.
An example I might use this for is sports streams, where the commentary isn’t always in my language of choice.
I recently watched a rugby game between Georgia and Japan, but it wasn’t available with commentary in any of the 3 languages I can speak/understand. This would allow me to get an idea of what the commentary team were saying!
This is the dumbest article I’ve read this year.
This seems to me to be precisely the thing that technology is good for: taking something that was once super expensive and making it accessible to all. I don’t think anyone suggests that it should be used to translate Harry Potter, but it can certainly be useful to transcribe and translate the odd video that no one would ever pay for a translation of.
Heck, it can even be a good tool for a professional transcriber and translator to generate a reasonable first draft, making even professional transcription and translation cheaper.
What’s the world coming to if a tech site can no longer celebrate a new technology?
@Thom is correct, Google translate and similar products have been dumbing down the world for well over a decade now, and AI generated subtitles are even worse. I can tell that no one but @Thom in these comments has any real understanding of the topic. When it comes to knowledge transfer, “good enough” translation, or 60-80% accuracy, means you are filling your mind with 20-40% false ideas and are thereby actively making yourself less intelligent.
Like it or not, it’s here to stay. Professional translation companies now use AI to accelerate the process, and your job is mostly sorting out the garbage it comes up with, as that’s faster than translating the whole thing. This VLC thing, however, I don’t care much about, as Japanese content sources tend to have zero subtitles in the original source, and this kind of AI on YouTube has been extremely helpful for looking up native words. Actual translation is a different matter, however. The way it works is still fundamentally flawed. Things like subject words dropped from sentences because the listener and speaker both already know them seem common in every language, but a translator tool tends to go haywire on those sentences.
To be fair, that’s a difficult thing even for humans.
When translating our screencasts, the most work-intensive part is correcting the overzealous translations of terms that should not be translated (when a word does have a proper translation, but should not be translated because only the original English term is used in that context).
For example, our software comes with a few pre-configured default roles: “Officer”, “Reporter”, “Manager”, etc. So when explaining those roles, they always get translated by DeepL (unless manually flagged). Human translators would likely make the same mistake, though.
Thom,
I’m not sure whether you’ve checked Google’s Android keyboard (Gboard), but it does exactly this. You can choose to enable multiple languages at the same time, and it will autocorrect and also work with swipe for both of them. (I have tried up to two, as like you I’m bilingual. However it might, I mean might, work for more than two as well.)
And… while I was working at Google News, supporting multiple language sources was one of our explicit goals. I would often receive news in two languages, as the system was able to identify that for me.
I would really recommend giving the tech community a chance before giving up on them altogether.
And to be fair, I checked the latest news on this (as I no longer use Android)
https://www.reddit.com/r/gboard/comments/1cpyd15/why_has_multiple_languages_behaviour_got_so_bad/
Seems like there are some bugs and regressions… My guess is them switching to a new model without proper testing. For that point (breaking perfectly working functionality), unfortunately I cannot say anything good.
sukru,
This has piqued my curiosity…what are you using and why?
Alfman,
Nothing special. My work gave me an iphone. And i realized I can’t carry two phones. So I sold my Samsung.
Sorry I don’t have a more interesting answer.
sukru,
Oh I see. No strings attached? Sounds better than the company that forced me to buy a second phone because their VPN didn’t accommodate LineageOS. I even asked them to provide a phone, but they declined. Meh.
You’ve been so nice on OSNews, but you know that being an iPhone user entitles you to the perk of looking down on others in social circles, haha.
https://www.androidheadlines.com/2024/09/52-percent-us-android-mocked-iphone.html
Alfman,
Of course there are strings attached 🙂 . But a reasonable leeway to do my own stuff. Basically not too much to require a second phone.
Thanks.
I hope I won’t start looking down on users. But I’ve been known to criticize phone manufacturers for some of their bad choices.
“Thom,” you really need to relax about AI. It’s here to stay, and it’s only going to get better. We all have to deal with it.
As for the subtitles, as a speaker of Serbocroatian, I will tell you that translations from various languages are just fine most of the time, and I will take AI translation over no translation any day.
In the end, no one is forced to enable this, and you know it.
SME here, building financial reporting software for developing/frontiers markets with English/French/Portuguese/Bahasa Indonesia/Thai speakers.
Screencast (Not native English!) –> Whisper Transcribe –> DeepL –> Google TTS works absolutely great for us and the users.
Of course the quality depends on the language pairs (e.g. Thai translation is not as good as French or German). But when testing it against my own mother tongue, the result was outstanding.
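That kind of chained workflow can be sketched as a sequence of stages. This is a toy illustration only: the transcribe, translate, and synthesize stages below are stubs standing in for Whisper, DeepL, and Google TTS, and all the function names are mine:

```python
# Toy sketch of a transcribe -> translate -> TTS pipeline like the one
# described above. Each stage is a stub; a real setup would call Whisper,
# DeepL, and Google TTS respectively.

def transcribe(audio: str) -> str:
    # Stand-in for Whisper: pretend the audio was already recognized as text.
    return audio.lower().strip()

def translate(text: str, target: str) -> str:
    # Stand-in for DeepL: a tiny lookup table instead of a real MT model.
    table = {("hello world", "fr"): "bonjour le monde"}
    return table.get((text, target), text)

def synthesize(text: str) -> bytes:
    # Stand-in for Google TTS: encode the text instead of producing audio.
    return text.encode("utf-8")

def pipeline(audio: str, target_lang: str) -> bytes:
    return synthesize(translate(transcribe(audio), target_lang))

print(pipeline("Hello World ", "fr"))  # b'bonjour le monde'
```

The point of the structure is that each stage can be swapped independently, which is also where quality varies: as noted above, the translation stage is the weakest link for some language pairs.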
If it supports other languages like YouTube does then I will finally be able to watch my “stolen” copy of the 2009 Italian documentary “Videocracy” for which I could not find English or Polish subtitles anywhere.
What a bizarre tirade.
VLC doesn’t sell translation services, they “sell” a video player, the argument about saving the costs of professionally made translations is misplaced at best.
An AI-made translation from Croatian (not “Croation”. Did you forget to turn on the spell checker? Or do you leave it purposefully off because it steals the job of a human proofreader?) to Italian is still immensely better than no translation at all, which is the most probable situation, so who’s “throwing users who rely on subtitles for accessibility or translation reasons under the bus” here? VLC developers who are making AI translations easily available for their users, or you advocating that when there aren’t human-made subtitles available users should just suck it up and not understand a word of what they’re watching?
Also, I don’t know which software you use that forces you to reinstall just to change language or which touch keyboard doesn’t have a multi-language spell checker, but it doesn’t seem to me they’re the norm.
Like Thom said, this hit a nerve because he worked as a translator in the past. I personally don’t find this negative for VLC. Yes, it is negative for translators, but it was also negative for them that there were already people making closed captions, subtitles, and media subtitle translations for free as a community effort.
Thom, give it a rest. I have liked OSNews since the Eugenia days, and I also like your content overall. The article about writing a kernel in less than 1000 LOC was just beautiful, and I don’t know any other site digging up such gems.
But your crusade on AI appears bitter and makes me sad. Do yourself a favor, step back and reflect and look for opportunities instead of being bitter about spilled milk.
Cheers and good luck.
> I was a translator for almost 15 years, with two university degrees on the subject to show for it.
“Dignity and an empty sack is worth … the sack”
109th rule of acquisition
One could generalize: did the internet (and social media) make the quality of mass media and public debate better? It looks like progress is not monotonic after all.