Today, we are launching a technical preview of GitHub Copilot, a new AI pair programmer that helps you write better code. GitHub Copilot draws context from the code you’re working on, suggesting whole lines or entire functions. It helps you quickly discover alternative ways to solve problems, write tests, and explore new APIs without having to tediously tailor a search for answers on the internet. As you type, it adapts to the way you write code—to help you complete your work faster. Sounds like a cool and useful feature, but this does raise some interesting questions about the code it generates. Sure, generated code might be entirely new, but what about possible cases where the code it “generates” is just taken from the existing projects the AI was trained on? The AI was trained on open source code available on GitHub, including a lot of code licensed under, for instance, the GPL. GitHub says in the Copilot FAQ: GitHub Copilot is a code synthesizer, not a search engine: the vast majority of the code that it suggests is uniquely generated and has never been seen before. We found that about 0.1% of the time, the suggestion may contain some snippets that are verbatim from the training set. Here is an in-depth study on the model’s behavior. Many of these cases happen when you don’t provide sufficient context (in particular, when editing an empty file), or when there is a common, perhaps even universal, solution to the problem. We are building an origin tracker to help detect the rare instances of code that is repeated from the training set, to help you make good real-time decisions about GitHub Copilot’s suggestions. That 0.1% may not sound like a lot, but that’s misleading – another way to put it is that out of every 1000 suggestions Copilot makes, 1 is copy/pasted code someone has written and selected a license for, and that license must, of course, be respected. 
On top of that, it’s hard to argue that code generated from a set of existing open source code doesn’t constitute a derivative work, and is thus covered by the copyright that open source licenses are based on. I am not a lawyer, so I’m not going to argue Copilot is definitively a massive GPL violation, but as a layman, on the face of it, it definitely feels like a tool that’s going to strip a lot of code of its license, without the consent or permission of the code’s authors.
General Development Archive
Getting started with developing applications for a mobile platform can be a challenging task, especially when it comes to building and testing the application on the mobile device itself. The Librem 5 makes its application development workflow extremely simple. Among other things, you can develop applications on-device, which is something sorely missing from other platforms.
Rust developers have repeatedly raised concerns about an unaddressed privacy issue over the last few years. Rust has rapidly gained momentum among developers, for its focus on performance, safety, safe concurrency, and for having a similar syntax to C++. StackOverflow’s 2020 developer survey ranked Rust first among the “most loved programming languages.” However, for the longest time developers have been bothered by their production builds leaking potentially sensitive debug information. I’ll leave this one for you folks to figure out, but from a layman’s perspective, it looks like a really dumb thing to keep paths from the developer’s machine like this in compiled binaries? At least after countless years, the Rust developers seem committed to fixing it, finally.
Well, I’ll not tell a long story, how I debug, but come directly to the bug mentioned in the title. I tracked its existence down to BASIC 2.0 as used in the VIC-20, C64 and the early PET/CBM series, and it seems that it was never detected, documented or fixed. It is related to temporary strings, the stack of descriptors for temporary strings, which has a size of 3, and the so-called “garbage collection”, which in reality doesn’t collect garbage but defragments string storage. Fixing an ancient bug like this must be a weirdly satisfying experience.
The great unicorn of software development is to have one language and framework that enables devs to code an app once and run it on any operating system and any type of device. Flutter has been aiming to do this since its inception, and today it gets quite a bit closer to that goal with the announcement of Flutter 2. The latest major update brings major enhancements for mobile platforms, adds support for desktop, and massively extends its capabilities on the web, among other things. Does anyone here have experience with Flutter? It seems like it’s gaining some steam judging by the increase in news stories about it recently.
This is the heart of the conflict: Rust (and many other modern, safe languages) use LLVM for its relative simplicity, but LLVM does not support either native or cross-compilation to many less popular (read: niche) architectures. Package managers are increasingly finding that one of their oldest assumptions can be easily violated, and they’re not happy about that. But here’s the problem: it’s a bad assumption. The fact that it’s the default represents an unmitigated security, reliability, and reproducibility disaster. I’m sure this will go down well.
Go 1.16 has been released. The new embed package provides access to files embedded at compile time using the new //go:embed directive. Now it is easy to bundle supporting data files into your Go programs, making developing with Go even smoother. You can get started using the embed package documentation. Carl Johnson has also written a nice tutorial, “How to use Go embed”. Go 1.16 also adds macOS ARM64 support (also known as Apple silicon). Since Apple’s announcement of their new arm64 architecture, we have been working closely with them to ensure Go is fully supported; see our blog post “Go on ARM and Beyond” for more. More details can be found in the release notes.
After two tweets that I made last week, playing around with UEFI and Rust, some people asked me to publish a blog post explaining how to create a UEFI application written fully in Rust and to demonstrate the whole testing environment. So today’s objective is to create a UEFI application in Rust that prints out the memory map filtered by usable memory (described as conventional memory by the UEFI specification). But before getting to work, let’s review some concepts first. uefi-rs is a Rust wrapper for UEFI.
Petit FatFs is a sub-set of the FatFs module for tiny 8-bit microcontrollers. It is written in compliance with ANSI C and is completely separated from the disk I/O layer. It can be incorporated into tiny microcontrollers with limited memory, even if the RAM size is less than the sector size. A full-featured FAT file system module is also available here. Fascinating little project.
This release enables quite a lot of new things to appear in const fn, and adds two new standard library APIs and one feature useful for library authors. See the detailed release notes to learn about other changes not covered by this post. Well, not much for me to add.
Texas Instruments has long made graphing calculators beloved by school-goers and programmers alike. The calculators are simple, compact computing systems, and entire communities have formed over the years to celebrate the devices’ broad programming capabilities. All that’s about to change. Texas Instruments is pulling support for C-based and assembly-based programs on both the TI-84 Plus CE — the most popular calculator for sideloading — and the TI-83 Premium CE, its French sibling. The latest firmware for each completely removes the capability and leaves users with no way to roll back to previous versions of the firmware. Way back when I was in high school, I used to write my own TI-83 programs to… Well, to cheat on tests. These devices were a brand new addition to the education system at the time, and teachers had no clue what we as students were doing with them. One of my best friends and I also bought a communication cable for them so we could share stuff and play multiplayer games together in the back of class. Removing stuff like this is a terrible idea.
Shells have been around forever and, for better or for worse, haven’t changed much since their inception. Until NuShell appeared to reinvent shells and defy our muscle memory. It brought some big changes, which include rethinking how pipelines work, structured input/output, and plugins. We wanted to learn more about NuShell so we interviewed both of its creators: Jonathan Turner and Yehuda Katz.
This release makes great progress in the C++20 language support, both on the compiler and library sides, some C2X enhancements, various optimization enhancements and bug fixes, several new hardware enablement changes and enhancements to the compiler back-ends and many other changes. There is even a new experimental static analysis pass. GCC is already 33 years old. That’s one heck of a legacy.
Colorado — like most states and territories across the country — is experiencing record unemployment numbers. But the state’s unemployment system is built on aging software running on a decades-old coding language known as COBOL. Over the years, COBOL programmers have aged out of the workforce, forcing states to scramble for fluent coders in times of national crisis. A survey by The Verge found that at least 12 states still use COBOL in some capacity in their unemployment systems. Alaska, Connecticut, California, Iowa, Kansas, and Rhode Island all run on the aging language. According to a spokesperson from the Colorado Department of Labor and Employment, the state was actually only a month or two away from “migrating into a new environment and away from COBOL,” before the COVID-19 pandemic hit. Are you one of the already 17 million people laid off in the US, losing what little health insurance you had in the process, and now you can’t even apply for unemployment assistance because some baby boomer coded the damn system in COBOL? Time to lift yourself up by the bootstraps and learn the wonders of COBOL!
The Cidco MailStation is a series of dedicated e-mail terminals sold in the 2000s as simple, standalone devices for people to use to send and receive e-mail over a dialup modem. While their POP3 e-mail functionality is of little use today, the hardware is a neat Z80 development platform that integrates a 320×128 LCD, full QWERTY keyboard, and an internal modem. After purchasing one (ok, four) on eBay some months ago, I’ve learned enough about the platform to write my own software that allows it to be a terminal for accessing BBSes via its modem or as a terminal for a Unix machine connected over parallel cable. A year-old story, but come on, this is timelessly cool.
Dropbox is a big user of Python. It’s our most widely used language both for backend services and the desktop client app (we are also heavy users of Go, TypeScript, and Rust). At our scale (millions of lines of Python), the dynamic typing in Python made code needlessly hard to understand and started to seriously impact productivity. To mitigate this, we have been gradually migrating our code to static type checking using mypy, likely the most popular standalone type checker for Python. (Mypy is an open source project, and the core team is employed by Dropbox.) This post tells the story of Python static checking at Dropbox, from the humble beginnings as part of my academic research project, to the present day, when type checking and type hinting are a normal thing for numerous developers across the Python community. It is supported by a wide variety of tools such as IDEs and code analyzers. I recently came across an article complaining about Python’s dynamic typing and couldn’t quite believe this was still the case. As it turns out, nowadays there is indeed a standardized way to write type annotations and to type-check prior to runtime using mypy, all the while being driven forward by the good folks at Dropbox (which includes Python’s Benevolent Dictator for Life Guido van Rossum). This article provides a fascinating insider insight into the history of type-checking in Python and how it evolved in symbiosis with Dropbox’s codebase.
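For readers who have not seen annotated Python, here is a minimal sketch of what the type hints discussed above look like. The function names and data are made up for illustration and are not taken from the Dropbox article. The annotations are ignored when the program runs; a checker such as mypy reads them separately, before runtime.

```python
from typing import Dict, Optional


def find_user_id(users: Dict[str, int], name: str) -> Optional[int]:
    """Return the id stored for name, or None if the user is unknown."""
    return users.get(name)


def describe(user_id: int) -> str:
    """Format an id; the annotation says None is not an acceptable argument."""
    return f"user #{user_id}"


# Hypothetical sample data, only for this sketch.
users = {"alice": 1, "bob": 2}
found = find_user_id(users, "alice")

# Calling describe(found) directly is what a static checker would flag,
# because found may be None. The explicit check narrows Optional[int] to int.
label = describe(found) if found is not None else "unknown user"
print(label)
```

Without the `is not None` check the script would still run for this input, which is exactly the kind of latent bug static checking is meant to surface before it reaches production.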
I have been curious about data compression and the Zip file format in particular for a long time. At some point I decided to address that by learning how it works and writing my own Zip program. The implementation turned into an exciting programming exercise; there is great pleasure to be had from creating a well oiled machine that takes data apart, jumbles its bits into a more efficient representation, and puts it all back together again. Hopefully it is interesting to read about too. This article explains how the Zip file format and its compression scheme work in great detail: LZ77 compression, Huffman coding, Deflate and all. It tells some of the history, and provides a reasonably efficient example implementation written from scratch in C. One for the ages. Articles like this don’t get written every day.
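To give a rough feel for the LZ77 step mentioned above, here is a toy sketch in Python. It is not the article’s implementation (that one is written in C), and it is heavily simplified: it emits plain (distance, length, next character) triples with a brute-force search and an arbitrary 255-character window, whereas real Deflate uses a 32 KiB window and goes on to encode its output with Huffman codes.

```python
from typing import List, Tuple

# (distance back into the output, match length, literal that follows the match)
Token = Tuple[int, int, str]


def lz77_compress(data: str, window: int = 255) -> List[Token]:
    """Greedy, brute-force LZ77: find the longest earlier match at each step."""
    tokens: List[Token] = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        for j in range(max(0, i - window), i):
            length = 0
            # Stop one character early so a literal always follows the match.
            while (i + length < len(data) - 1
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        tokens.append((best_dist, best_len, data[i + best_len]))
        i += best_len + 1
    return tokens


def lz77_decompress(tokens: List[Token]) -> str:
    """Replay the tokens, copying one character at a time so that
    matches overlapping the current position still work."""
    out: List[str] = []
    for dist, length, literal in tokens:
        for _ in range(length):
            out.append(out[-dist])
        out.append(literal)
    return "".join(out)


text = "abracadabra abracadabra"
tokens = lz77_compress(text)
print(len(text), "characters ->", len(tokens), "tokens")
print(lz77_decompress(tokens) == text)
```

Repetitive input collapses into a handful of back-references, which is the redundancy a Huffman stage would then pack into fewer bits.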
Under hard time pressure, the ground had to quickly figure out what was wrong and devise a workaround. What they came up with was the most brilliant computer hack of the entire Apollo program, and possibly in the entire history of electronic computing. To explain exactly what the hack was, how it functioned, and the issues facing the developers during its creation, we need to dig deep into how the Apollo Guidance Computer worked. Hold onto your hats, Ars readers—we’re going in. Amazing story.
This release represents a year of development effort and over 7,400 individual changes. It contains a large number of improvements that are listed in the release notes below. The main highlights are: builtin modules in PE format, multi-monitor support, an XAudio2 reimplementation, and Vulkan 1.1 support. Wine allows me to run virtually any Windows game I use on Linux – including League of Legends, my most-played game – so it’s a pretty amazing tool in my book. Since many people no longer directly interact with Wine, using it through tools like Steam’s compatibility tools or Lutris instead, it’s easy to forget just how important a project Wine really is.
This article demonstrates some of Magit’s most essential features in order to give you an impression of how the interface works. It also hints at some of the design principles behind that interface. But this is only the tip of the iceberg. Magit is a complete interface to Git, which does not limit itself to the “most essential features everyone needs”. I hope that this article succeeds in demonstrating how Magit’s focus on work-flows allows its users to become more effective Git users. Here we concentrate on some essential work-flows, but note that more advanced features and work-flows have been optimized with the same attention to detail. If you would rather concentrate on the big picture, then read the article Magit the magical Git interface instead (or afterwards). As a non-developer, I have no idea if this is a useful tool, but I do like the idea of it.