Internet Archive
SeaweedFS is a simple and highly scalable distributed file system. There are two objectives: to store billions of files!, to serve the files fast! SeaweedFS started as an Object Store to handle small files efficiently. Instead of managing all file metadata in a central master, the central master only manages volumes on volume servers, and these volume servers manage files and their metadata. This relieves concurrency pressure from the central master and spreads file metadata into volume servers, allowing faster file access (O(1), usually just one disk read operation). There is only 40 bytes of disk storage overhead for each file’s metadata. It is so simple with O(1) disk reads that you are welcome to challenge the performance with your actual use cases. ↫ SeaweedFS’s GitHub page It’s Apache-licensed and the code is, as usual, on GitHub.
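To make the quoted design a bit more concrete, here is a toy sketch of the split it describes. This is not SeaweedFS's actual code: the class names, method names and the in-memory "disk" are all assumptions made for illustration. The master only knows which server hosts which volume, while each volume server keeps its own index from file key to offset and size, so a read is one dictionary lookup followed by one contiguous read.

```python
from dataclasses import dataclass, field


@dataclass
class VolumeServer:
    """Hypothetical volume server: one append-only blob per volume plus an index."""
    data: dict[int, bytearray] = field(default_factory=dict)
    # (volume_id, file_key) -> (offset, size); this is the per-file metadata
    index: dict[tuple[int, int], tuple[int, int]] = field(default_factory=dict)

    def write(self, volume_id: int, key: int, payload: bytes) -> None:
        blob = self.data.setdefault(volume_id, bytearray())
        self.index[(volume_id, key)] = (len(blob), len(payload))
        blob.extend(payload)  # append-only, existing offsets never move

    def read(self, volume_id: int, key: int) -> bytes:
        offset, size = self.index[(volume_id, key)]  # O(1) lookup
        return bytes(self.data[volume_id][offset:offset + size])  # one contiguous read


@dataclass
class Master:
    """Hypothetical master: maps volumes to servers and knows nothing about files."""
    volumes: dict[int, VolumeServer] = field(default_factory=dict)

    def locate(self, volume_id: int) -> VolumeServer:
        return self.volumes[volume_id]


server = VolumeServer()
master = Master(volumes={3: server})
server.write(3, key=42, payload=b"hello")
server.write(3, key=43, payload=b"world")
print(master.locate(3).read(3, 43))
```

The point of the sketch is that the master's table grows with the number of volumes, not the number of files.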
Another month, another pile of improvements to Servo, the rendering engine written in Rust, originally a Mozilla project. This month the proof-of-concept browser UI got forward and backward buttons, making this bare-bones UI just a tiny bit more usable. Of course, the vast majority of changes and improvements are all focused on the actual rendering engine, which makes sense because Servo definitely isn’t ready for any prime time use – nor is anyone claiming it is. I’m incredibly curious to see where Servo goes in the future.
In a major move addressing European regulations, Meta will soon give users in the EU, EEA, and Switzerland significantly more control over how their data is used across Facebook and Instagram. The changes, set to begin rolling out in the coming weeks, aim to comply with the Digital Markets Act (DMA). ↫ Omer Dursun at Neowin You’ll be able to unlink Facebook’s various services – such as Instagram and Facebook’s main social network thing – and you’ll be able to use Facebook Messenger as a standalone service without needing to have a Facebook account. Sadly, there’s no word on WhatsApp. This only applies to people in the EU/EEA. Americans need not apply.
Made to run natively on all modern operating systems and browsers, Ruffle brings Flash content back to life with no extra fuss. ↫ Ruffle website It’s using Rust and WASM, making it supposedly safer than the real Flash Player ever was, and of course, it’s open source too. Their most recent progress report details just how far along this project already is.
We show that content on the web is often translated into many languages, and the low quality of these multi-way translations indicates they were likely created using Machine Translation (MT). Multi-way parallel, machine generated content not only dominates the translations in lower resource languages; it also constitutes a large fraction of the total web content in those languages. We also find evidence of a selection bias in the type of content which is translated into many languages, consistent with low quality English content being translated en masse into many lower resource languages, via MT. Our work raises serious concerns about training models such as multilingual large language models on both monolingual and bilingual data scraped from the web. ↫ Brian Thompson, Mehak Preet Dhaliwal, Peter Frisch, Tobias Domhan, Marcello Federico As a translator myself, this is entirely unsurprising. Translating is a craft, a skill, and much like with any other craft, you get what you pay for. If you pay your translator(s) a good rate, you get a good translation. If you pay your translator(s) a shit rate, you get a shit translation. If you pay nothing, you get nothing. I’m definitely seeing more and more people in my industry integrate machine translations, but so far, it’s not been an actual issue – I have no qualms about accepting a job where I take a machine-translated text and whip it into shape and turn it into a human-readable, quality translation… As long as people pay me a reasonable rate for it. Working from a machine translation is often quicker and easier, so the going rate obviously reflects that. The quality of machine translations is absolutely atrocious, however, and the idea of relying on it for texts other people – customers, clients, employees, etc. – are actually supposed to read and work from is terrifying. 
Google Translate is an effective tool for personal use, but throwing, I don’t know, your product’s manual at it and dumping the unedited result onto your customers is borderline criminal. Pay nothing, get nothing.
Netscape Composer was my first introduction to web development. As a kid, I created my first web pages using it. Those pages never made it online, but I proudly carried them around on a floppy disk to show them off on family members’ and friends’ computers. This is likely how I got the understanding that websites are just made of files. Using Netscape Composer also taught me basic web vocabulary, such as “page” and “hyperlink”. Of course, the web landscape has evolved immensely since then. I was curious to try out that dated software again and see what its limitations were, and what the code it produces looks like from a 2024 perspective. The first thing I needed was a goal. I decided to try and reproduce the home page of my personal website as closely as the application allowed it. That seemed like a sensible aim as my website has a rather minimalistic design, with very little that should be completely out of reach for an antiquated tool. ↫ Pier-Luc Brault What a fun exercise.
NetSurf, the small and efficient browser for RISC OS, Haiku, AmigaOS 4, and obscure platforms you’ve probably never heard of like “Linux” and “macOS” has seen a new release – version 3.11. NetSurf is written in C and has its own browser engine – it’s not based on any of the big browser engines, such as Google’s Chromium or Firefox’s Gecko/Quantum. NetSurf 3.11 features improved page layout with CSS flex support. It also features many other optimisations and enhancements. ↫ NetSurf’s official website It’s an obvious upgrade for everyone who uses NetSurf, since if you’re using NetSurf, odds are the platform you’re using it on doesn’t really offer many alternatives.
Pink-haired Aitana Lopez is followed by more than 200,000 people on social media. She posts selfies from concerts and her bedroom, while tagging brands such as hair care line Olaplex and lingerie giant Victoria’s Secret. Brands have paid about $1,000 a post for her to promote their products on social media—despite the fact that she is entirely fictional. Aitana is a “virtual influencer” created using artificial intelligence tools, one of the hundreds of digital avatars that have broken into the growing $21 billion content creator economy. ↫ Christina Criddle for Ars Technica While there’s a ton of questions to be asked about where, exactly, this could lead, and what “AI” will mean for especially women having their likeness recreated as “AI” avatars for people to sleaze over, or worse, the concept of having “AI” influencers doing fairly mundane and harmless things like promote a brand or show some fake photos of their apartments seems fairly benign and even interesting and beneficial to me. Of course, I say this with all the caveats that this is incredibly early days, we have no idea if there are any shady businesses behind these new “AI” influencers, and so on, and so forth. We’ve all seen what technology such as this can be used for, and it ain’t pretty.
Advertisements are a part of our lives, including our digital ones. They are in the websites we browse, the search results we receive, and the online news we read. Tired of receiving so many ads, some users try to avoid them by installing an adblocker. But is this a legal practice? Is using adblockers an act of restricting market autonomy, or do they help achieve user freedom? Imagine a scenario where website owners hold copyright over their websites, including whatever ads they place, and could effectively sue for copyright infringement if users were to remove or suppress ads when visiting these websites. This hypothetical situation would enable any website copyright holder to use the legal system to stop any ordinary user on the internet who tries to bypass these ads. This would lead to an internet where unsolicited information and advertisements are imposed on users. Fortunately, recent court decisions have at least prevented this hypothetical from becoming a reality in Germany. ↫ FSFE Good. My position has always been clear: your computer, your rules. Block ads to your heart’s content. Even on OSNews – block away if you want. There are far better ways to support us, anyway (Patreon, Ko-Fi, Liberapay, merch).
A prominent disinformation scholar has accused Harvard University of dismissing her to curry favor with Facebook and its current and former executives in violation of her right to free speech. Joan Donovan claimed in a filing with the Education Department and the Massachusetts attorney general that her superiors soured on her as Harvard was getting a record $500 million pledge from Meta founder Mark Zuckerberg’s charitable arm. ↫ Joseph Menn for The Washington Post This is why “voting with your wallet” is such an empty platitude, usually used by corporatists trying to absolve corporations of misdeeds and shift the blame to us, mere consumers. How on earth can we regular folks vote with our wallet when someone like Zuckerberg can just buy the entire “election” without blinking?
A team of researchers primarily from Google’s DeepMind systematically convinced ChatGPT to reveal snippets of the data it was trained on using a new type of attack prompt which asked a production model of the chatbot to repeat specific words forever. Using this tactic, the researchers showed that there are large amounts of personally identifiable information (PII) in OpenAI’s large language models. They also showed that, on a public version of ChatGPT, the chatbot spit out large passages of text scraped verbatim from other places on the internet. So not only are these things cases of mass copyright infringement, they also violate countless privacy laws. Cool.
Our nightly example browser, servoshell, is now easier to navigate, accepting URLs without http:// or https:// both in the location bar and on the command line, and should no longer lock up when run with --no-minibrowser. Local paths can also be given on the command line, and are still preferred when the path points to a file that exists. Work is now underway to improve our embedding story and prepare Servo for integration with Tauri, starting with precompiled ANGLE for faster initial builds, better support for offscreen rendering, and support for multiple webviews. These changes haven’t landed yet, but once they do, apps will be able to open, move, resize, and interleave Servo with other widgets. I’m curious what the future will bring to Servo. It seems under very active development, but it’s not part of any of the main browser projects. Let’s hope they can keep up the momentum so that it can grow into a viable alternative. Because lord do we need one.
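The input handling described in that quote can be sketched roughly as follows. This is only an illustration of the stated rules, not servoshell's code (which is written in Rust): the function name is made up, and defaulting to https:// for scheme-less input is an assumption, since the quote does not say which scheme is tried first.

```python
from pathlib import Path
from urllib.parse import urlsplit


def resolve_input(text: str) -> str:
    """Turn location-bar or command-line input into a URL (hypothetical helper)."""
    path = Path(text)
    # Rule from the quote: a path that points to an existing file is preferred.
    if path.is_file():
        return path.resolve().as_uri()
    # Input that already carries a known scheme is passed through untouched.
    if urlsplit(text).scheme in ("http", "https", "file"):
        return text
    # Assumption: scheme-less input such as "servo.org" gets https:// prepended.
    return "https://" + text


print(resolve_input("servo.org"))
print(resolve_input("http://example.com/"))
```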
The PARC facility also is known for the invention of Ethernet, a networking technology that allows high-speed data transmission over coaxial cables. Ethernet has become the standard wired local area network around the world, and it is widely used in businesses and homes. It was honored this year as an IEEE Milestone, a half century after it was born. Truly one of the success stories of the technology world. Sure, those first Ethernet cables and accessories have changed a lot over the decades, but we’re still using it to this day, and we’ll be using it for many more decades to come.
Facebook has unveiled the prices it’s going to charge European users who want to have an ad-free experience on Facebook and Instagram. People in these countries will be able to subscribe for a fee to use our products without ads. Depending on where you purchase, it will cost €9.99/month on the web or €12.99/month on iOS and Android. Regardless of where you purchase, the subscription will apply to all linked Facebook and Instagram accounts in a user’s Accounts Center. As is the case for many online subscriptions, the iOS and Android pricing take into account the fees that Apple and Google charge through respective purchasing policies. Until March 1, 2024, the initial subscription covers all linked accounts in a user’s Accounts Center. However, beginning March 1, 2024, an additional fee of €6/month on the web and €8/month on iOS and Android will apply for each additional account listed in a user’s Accounts Center. That’s a high price to pay to read your racist uncle’s rants and see the heavily photoshopped photos of some random influencer peddling vitamin pills.
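For anyone wondering what this adds up to, here is a small worked calculation using only the prices quoted above. The function name and the reading that the first account is covered by the base price are assumptions made for the example.

```python
from decimal import Decimal

# Prices in euros per month, taken from the announcement quoted above.
BASE = {"web": Decimal("9.99"), "mobile": Decimal("12.99")}
EXTRA = {"web": Decimal("6"), "mobile": Decimal("8")}


def monthly_cost(platform: str, linked_accounts: int,
                 after_march_2024: bool = True) -> Decimal:
    """Monthly fee for an ad-free subscription (hypothetical helper).

    Assumes the base price covers the first account and that each further
    linked account costs extra only from March 1, 2024 onwards.
    """
    extra_accounts = max(linked_accounts - 1, 0) if after_march_2024 else 0
    return BASE[platform] + EXTRA[platform] * extra_accounts


print(monthly_cost("web", 2))     # one Facebook plus one Instagram account
print(monthly_cost("mobile", 3))
```

Under these assumptions, one Facebook and one Instagram account bought on the web come to €15.99 a month from March 2024.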
MicroTCP is a TCP/IP network stack I started building as a learning exercise while attending the Computer Networking course at the Università degli Studi di Napoli Federico II. It’s just a hobby project and is intended to just be a minimal, yet complete, implementation. At this moment MicroTCP implements ARP (RFC 826, complete), IPv4 (no fragmentation), ICMP (minimum necessary to reply to pings) and TCP (complete but not stress-tested). Note that “complete” should not be intended as “fully compliant” but just as a measure of progress on all of the major features. For instance, it’s complete enough to handle HTTP traffic on a local network. People like this usually end up writing a simple operating system, so it’s interesting to see a TCP/IP stack instead. While clearly a hobby project, small, portable TCP/IP stacks can potentially be useful for very specific use cases, like bringing connectivity to ancient operating systems or other small hobby projects.
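To give an idea of how little "minimum necessary to reply to pings" can be, here is a sketch of that one piece in Python. It is not taken from MicroTCP; the function names are made up for the example. It covers only the ICMP message itself: the Internet checksum (the one's complement sum described in RFC 1071) and turning an echo request into an echo reply. The surrounding IPv4 and Ethernet handling is left out.

```python
import struct

ICMP_ECHO_REQUEST = 8
ICMP_ECHO_REPLY = 0


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def make_echo_reply(request: bytes) -> bytes:
    """Build an ICMP echo reply for a raw ICMP echo request (hypothetical helper)."""
    msg_type, _code, _checksum, ident, seq = struct.unpack("!BBHHH", request[:8])
    if msg_type != ICMP_ECHO_REQUEST:
        raise ValueError("not an echo request")
    payload = request[8:]  # the payload is echoed back unchanged
    header = struct.pack("!BBHHH", ICMP_ECHO_REPLY, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REPLY, 0, checksum, ident, seq) + payload


request = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, 0x1234, 1) + b"ping"
reply = make_echo_reply(request)
# A message with a correct checksum field sums to zero when checked again.
print(internet_checksum(reply) == 0)
```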
Do you know that the modern web browser can access real musical instruments? With the help of Web MIDI API, we can create a web application that can access MIDI devices connected to our computer. In this article, I will explain how I use Google Sheets as a music sequencer for composing and playing ambient music with a hardware synthesizer. Next thing you’ll tell me browsers have an API for gamepads and joysticks connected through the game port.
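Whatever sits on top, a sequencer like this ultimately has to produce raw MIDI messages. As a rough illustration, written in Python and not the article's code, this is what the bytes of a note-on and note-off message look like. In the browser, bytes like these would be handed to a MIDI output obtained through the Web MIDI API. The helper names and the example chord are assumptions.

```python
def note_on(note: int, velocity: int = 100, channel: int = 0) -> bytes:
    """Build a 3-byte MIDI note-on message (status 0x90 plus channel)."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15, note and velocity 0-127")
    return bytes([0x90 | channel, note, velocity])


def note_off(note: int, channel: int = 0) -> bytes:
    """Build a 3-byte MIDI note-off message (status 0x80 plus channel)."""
    if not (0 <= channel <= 15 and 0 <= note <= 127):
        raise ValueError("channel must be 0-15, note 0-127")
    return bytes([0x80 | channel, note, 0])


# Hypothetical "spreadsheet row": a C major chord as MIDI note numbers.
row = [60, 64, 67]
messages = [note_on(n) for n in row] + [note_off(n) for n in row]
print([m.hex() for m in messages])
```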
A browser(/web) engine essentially takes in a URL(/etc) and gives you it rendered into a window for you to view and interact with. <shadow> does this too, almost entirely from scratch, made in JS. It runs in your browser! Node backend soon™ too? The host browser(/etc) is only used for networking (fetch) and renderer backend (<canvas>). I feel like I have opinions, but I can’t express them. This is equal parts genius and madness.
So you ended up with this JavaScript quirk where it was possible to create unique URLs that ran a bit of JavaScript on whatever page you happened to be looking at. It could even make changes to that page. Move things around. Replace words. Open links. And pretty early on, people realized that these JavaScript URLs were also bookmarkable, just like any other URL. And, crucially, easily shareable as links. I had almost forgotten about these things.
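For those who have indeed forgotten: a bookmarklet is just a snippet of JavaScript packed into a javascript: URL. A minimal sketch of a helper that does the packing, written in Python, could look like this. The function name and the example snippet are made up, and percent-encoding every reserved character is a conservative choice rather than a requirement.

```python
from urllib.parse import quote


def make_bookmarklet(js_source: str) -> str:
    """Wrap a JavaScript snippet into a bookmarkable javascript: URL (hypothetical helper)."""
    # The function wrapper keeps the snippet's variables out of the page's scope.
    wrapped = "(function(){" + js_source + "})();"
    # Percent-encode everything that is not an unreserved URL character.
    return "javascript:" + quote(wrapped, safe="")


url = make_bookmarklet('document.body.style.background = "papayawhip";')
print(url)
```

Saving the printed URL as a bookmark and clicking it on any page runs the snippet on that page.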
Meta is preparing to charge EU users a $14 monthly subscription fee to access Instagram on their phones unless they allow the company to use their personal information for targeted ads. The US tech giant will also charge $17 for Facebook and Instagram together for use on desktop, said two people with direct knowledge of the plans, which are likely to be rolled out in coming weeks. The move comes after discussions with regulators in the bloc who have been seeking to curb the way big tech companies profit from the data they get from their users for free, which would be a direct attack on the way groups such as Meta and Google generate their profits. Is anyone really stupid enough to think that even if you pay, Facebook won’t monetise your behaviour anyway? Sure, you might not see ads, but paying customer or not, your data is still going to be used for literally everything else Facebook does. I hope people don’t fall for this nonsense.
Twenty years ago, a group of friends shot a Matrix fan film on a limited budget. Sharing their creation with the rest of the world initially appeared to be too expensive, but then they discovered a new technology called BitTorrent. Fast forward two decades and their “Fanimatrix” release is the oldest active torrent that’s still widely shared today. That’s amazing. When reading the headline, I assumed it’d be some copyrighted blockbuster – not something the creators actually wanted to share via BitTorrent.