Google’s keynote at the RISC-V Summit was all about bold proclamations, though. Lars Bergstrom, Android’s director of engineering, wants RISC-V to be seen as a “tier-1 platform” in Android, which would put it on par with Arm. That’s a big change from just six months ago. Bergstrom says getting optimized Android builds on RISC-V will take “a lot of work” and outlined a roadmap that will take “a few years” to come to fruition, but AOSP started to land official RISC-V patches back in September.
Another vote of confidence for RISC-V.
It’s good news and RISC-V has the potential to be a great & accessible architecture for owners. Here’s hoping it can actually be what many of us had hoped ARM would be. :-/
It’s nice that RISC-V is an open architecture, but unfortunately there’s a risk that manufacturers could ruin it with proprietary add-ons and restrictions that keep owners from benefiting.
And here is the role that google could play in standardizing the ecosystem. Not dissimilar to the one Microsoft played in the x86 world.
Microsoft had very little role in standardising the x86 world. Most of that was down to IBM, ironically.
“100% IBM PC compatible” was largely the gold standard that manufacturers wanted to achieve in the ’80s. And this was driven because much of the overlying software was written to target the x86 IBM PC running PC-DOS. Microsoft would happily sell MS-DOS to any x86 vendor, compatible or not, but the market was ultimately driven by the software houses and manufacturers, who wanted to keep IBM PC compatibility.
That’s a bit revisionist. IBM very much wanted to keep the IBM PC under lock and key like most of their past product offerings; the courts simply disagreed with them and asserted that Compaq’s clean-room reverse-engineered BIOS was acceptable under copyright law, which was the catalyst for creating the PC clone market. If IBM had had any choice about it, there would be no such market because they’d never have been interested in marketing outside of businesses paying top dollar for support.
Edit: I might note, IBM tried and failed to reassert control with the Personal System/2, which included many proprietary new features, but ultimately failed in the market for that very reason. About the only parts that survived were its more compact keyboard and mouse ports.
anevilyak,
Well, both can technically be true. IBM wanted to control the platform and can still be credited with producing the PC standard The123king is talking about.
I agree, and we are all so lucky for this because otherwise we probably wouldn’t have generic & interoperable x86 PC standards. Today’s courts are much less reliable; after having crossed the line into APIs and such being considered copyrightable regardless of implementation all the way up to the supreme court, it could become harder to get away with reverse engineering & cloning new platforms going forward.
It is all opinion of course, but I would agree with the take that Microsoft was the major driver of x86 standardization, though it is a bit complicated.
What customers wanted was application compatibility. That meant MS-DOS ( as opposed to CP/M or something else ). Of course, MS-DOS was only possible because IBM failed to properly assert exclusivity for PC-DOS. IBM did not intend for the PC to be “open” though, and the BIOS was the other major element required for MS-DOS compatibility. MS-DOS became a standard, many computer manufacturers jumped on that bandwagon, and many ( at first ) did so without true “IBM compatibility” in mind.
I think the fate of the Tandy 2000 ( an MS-DOS and BIOS driven but not “IBM” compatible machine ) outlines the situation well.
https://en.wikipedia.org/wiki/Tandy_2000
The Tandy 2000 ran MS-DOS on an x86 chip and had a BIOS that was API compatible. In theory, DOS programs would run and indeed the ones that stuck to calling into the BIOS ran fine. That said, many programs bypassed the BIOS to call directly into the IBM hardware and that is where things went wrong. Given the lack of real DOS application compatibility in practice, Tandy and several other players realized that true “IBM compatibility” was the only way to participate in the MS-DOS market. The follow-up to the Tandy 2000 was the Tandy 1000 ( a true clone ).
So, IBM compatibility became a requirement for the software ( DOS ) compatibility that people wanted. It was really a means to an end though and I do think that MS-DOS application compatibility ( and later Windows ) is what drove standardization within the ecosystem.
In my view, if MS-DOS application writers had stuck with pure BIOS calls, there would have been more variety in the early PC universe and IBM might not have been the vendor calling the shots for as long as they did. Ironically, it is the fact that the IBM BIOS was “buggy” that gets credited with why so many applications went around the BIOS to talk directly to IBM hardware. The “inferior” product managed to define and dominate a market yet again.
Notably, when IBM attempted to evolve the PC with the PS/2, the industry did not follow. This is when “IBM compatible” became “PC compatible”, as the PC architecture started to evolve and diverge from IBM’s offerings ( e.g. EISA and later PCI as an alternative to Micro Channel ). The fact that applications continued to run on PCs from all vendors meant that there was little incentive to follow IBM’s lead on hardware. DOS / Windows compatibility is what mattered ( though Windows did not matter all that much until 1990 ).
tanishaj,
I don’t recall this. It could be before my time but do you have examples?
Oftentimes direct IO was done out of necessity because the BIOS simply lacked functionality. The BIOS didn’t support much hardware or asynchronous modes of operation, so the problem fell to OS and application developers to write their own drivers. And of course performance was far better with direct IO too: BIOS software interrupts were expensive, and a game programmer of that era would have wanted to get the BIOS out of the game loop for better control and performance.
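To make the tradeoff concrete, here is a minimal sketch of the two approaches, assuming a 16-bit DOS compiler such as Turbo C++ or Open Watcom (whose <dos.h> provides int86() and MK_FP()) and VGA mode 13h; it is illustrative only and will not build on a modern host:

```cpp
// Sketch only: assumes a 16-bit DOS compiler (Turbo C++ / Open Watcom)
// and VGA mode 13h (320x200, 256 colors).
#include <dos.h>

// The "portable" way: ask the BIOS to plot a pixel via software interrupt 10h.
void putpixel_bios(int x, int y, unsigned char color) {
    union REGS r;
    r.h.ah = 0x0C;        // BIOS video service: write graphics pixel
    r.h.al = color;
    r.h.bh = 0;           // display page 0
    r.x.cx = x;           // column
    r.x.dx = y;           // row
    int86(0x10, &r, &r);  // interrupt overhead on every single pixel
}

// The way games actually did it: write straight into the frame buffer
// that mode 13h maps at segment 0xA000, bypassing the BIOS entirely.
void putpixel_direct(int x, int y, unsigned char color) {
    unsigned char far *vga = (unsigned char far *)MK_FP(0xA000, 0);
    vga[y * 320 + x] = color;   // one far memory write, no BIOS involved
}
```

The second version is also the one that only works on hardware laid out exactly like IBM’s, which is the compatibility point being discussed above.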
Yes,
BIOS was very useful for learning what a sector is, how the VGA layout works, or learning to write simple programs. Anything serious though needed low level access.
DOS was great for this purpose. It basically loaded programs and got out of the way. You did not need to worry about filesystems, or anything else, unless you wanted to. And then there were extensions like DPMI, XMS, EMS which standardized some of the system management.
And it came with two powerful tools to begin programming: gwbasic and, more importantly, debug.com. The latter was an integrated assembler, disassembler, and “REPL” hacking tool for x86.
(I know; you, or any other DOS programmer, already know all this).
Nothing after that came close. Windows required much more specialized tools to even write “hello world”. Linux was okay; hello.asm is not too difficult (int 0x80 helps). But system programming meant kernel programming; it was no longer easy to directly access hardware.
The Tandy 1000 was a clone of the IBM PCjr, and therefore not a true clone of the IBM PC. In fact, it had superior sound and graphics. As for EISA, it never saw much application in home PCs; before PCI there was VLB (the VESA Local Bus).
Then Google has to clearly define what ISA extensions are required for Android.
jgfenix,
Yes, of course. But unless they also mandate the absence of anti-features we might well end up with devices that are tightly controlled by vendors, with owners not being able to treat the OS as generic & interoperable. This would rope consumers into all the same vendor lock-in cons we experience with android devices today. For contrast, think about PC hardware: it typically continues to be functional and upgradable long past the point when the original vendor loses interest in supporting it.
Theoretically, but there aren’t that many ARM SoCs used in mobile phones (Qualcomm, Apple and MediaTek, and that’s about it), so I don’t see that changing for RISC-V. In fact, I don’t think many companies would want to both design RISC-V hardware (with extensions) and the necessary software. Apple is the only one doing it currently, while Samsung et al just buy 3rd-party SoCs.
That said, it’s nice that Google wants to support RISC-V, but RISC-V CPUs are still pretty slow, not least because it’s basically a toy architecture based on ancient RISC CPUs. I don’t see that changing much.
jalnl,
Honestly my reaction to this would be: Good! Give us the hardware and the specs and keep the manufacturers’ mitts out of the software. The less they have to do with software, the better off we’ll be.
ARM used to be in that boat as well. If someone put the resources into optimizing it and producing it on a modern fab, RISC-V would become more attractive. Of course who knows if/when that will happen.
Someone correct me if I’m wrong, but I think the issue with RISC-V isn’t so much the process, but the actual core design itself. I think it will compete with ARM in the areas that ARM originally came to dominate (small devices with low power and compute requirements) long before it gets competitive with ARM in smartphones or other more demanding uses.
I read some comments on Tom’s Hardware (one from a guy with expertise in Arm) saying that RISC-V lacked high-performance instructions (whatever those are) and that in some loops it needed double the instructions that Arm did. In one case RISC-V did with five instructions what Arm did with one. The guy’s opinion was that RISC-V’s designers wanted to rely on micro-operation fusion to improve performance, but he thought that was problematic and that it was better to have the more capable instructions.
It was also mentioned that Alibaba’s chip used proprietary instructions to improve performance.
jgfenix,
I agree, instruction density is important. Previously I’ve tested the density of x86 and ARM and found that ARM has a small advantage (around 5%) in terms of density. This is on top of the benefits of simplified instruction fetching. I’d like to play around with RISC-V too, but so far early adopters have to pay through the nose for relatively immature CPU tech.
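For a rough sense of how one might compare density in practice (not necessarily how the test above was done): compile the same program for each target and total the bytes of its executable sections. A minimal sketch, assuming 64-bit little-endian ELF binaries and Linux’s <elf.h>:

```cpp
// Rough code-density probe: total the bytes of all executable sections in an
// ELF binary. Build the same program for x86-64, aarch64 and riscv64 and
// compare the totals. Assumes 64-bit little-endian ELF; minimal error checks.
#include <elf.h>
#include <cstdint>
#include <cstdio>
#include <vector>

int main(int argc, char **argv) {
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s <elf-file>\n", argv[0]);
        return 1;
    }
    std::FILE *f = std::fopen(argv[1], "rb");
    if (!f) { std::perror("fopen"); return 1; }

    Elf64_Ehdr eh;
    if (std::fread(&eh, sizeof eh, 1, f) != 1) { std::fclose(f); return 1; }

    std::vector<Elf64_Shdr> sections(eh.e_shnum);
    std::fseek(f, (long)eh.e_shoff, SEEK_SET);
    if (std::fread(sections.data(), sizeof(Elf64_Shdr), eh.e_shnum, f)
            != eh.e_shnum) { std::fclose(f); return 1; }
    std::fclose(f);

    std::uint64_t code_bytes = 0;
    for (const Elf64_Shdr &s : sections)
        if (s.sh_flags & SHF_EXECINSTR)   // sections that hold machine code
            code_bytes += s.sh_size;

    std::printf("%s: %llu bytes of machine code\n",
                argv[1], (unsigned long long)code_bytes);
    return 0;
}
```

Running it over the same source built with gcc for each architecture (and, for RISC-V, with and without the compressed extension) gives a crude but repeatable comparison.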
I suspect it would be possible to create a highly optimized superscalar pipeline for RISC-V where simple instructions can run in parallel and there isn’t a big performance bottleneck for running them in sequence. I’d like to look more at the specifics here though; do you have a link?
Adding proprietary instructions could help for code they write, but are these proprietary instructions supported by standard compilers like gcc or llvm? If not it’s hard to imagine that many applications actually benefit from them.
I think you mean this discussion:
https://news.ycombinator.com/item?id=24958423 (and the linked article).
Yes, RISC-V seems to be disadvantaged in high-performance workloads due to needing more instructions to represent simple operations. One important advantage of Apple’s ARM chips was the M1’s longer look-ahead queues; RISC-V starts with a negative score there.
Some other criticisms can be addressed by extensions (like lack of atomic instructions). But as mentioned here, it will cause possible fragmentation.
Maybe we can expect a “modern” RISC-V dialect, which will work in multi-core applications, and be optimized for longer look ahead queue depths. After all necessity is the mother of invention.
sukru,
Thanks for the link. I don’t think there was direct conscientious that it would necessarily be slower after optimizing the pipeline; however, there was concern over instruction density and the lack of certain math flags, which could lead to some algorithms being sub-optimal. It’d be interesting to take a closer look at that.
They talked about a compressed variation in this link, which could more than make up for the instruction density differences, but at the expense of more expensive & complex prefetch.
https://people.eecs.berkeley.edu/~krste/papers/waterman-ms.pdf
It’s interesting to dig more into this tradeoff, though my gut feeling is that it would probably be better to start with instructions that fit the problem space in the first place rather than try and compensate for instruction deficiencies. The base instruction set needs to hit the right balance up front, and it’s possible RISC-V didn’t hit that balance.
They could technically do something akin to ARM’s Thumb extensions, but those didn’t pan out long term and were dropped from 64-bit ARM.
Are you talking about a specific implementation or the ISA itself? While an ISA needs to be relatively permanent, the implementation of it can always be optimized. The M1’s instruction decoder clearly benefits from ARM’s RISC ISA, but I would think that RISC-V could do just as well as ARM. Obviously it’s a different problem if the instruction density isn’t there, but in terms of a simple decoder I honestly would have expected RISC-V to do well. Can you elaborate?
Atomics are relatively rare, so performance-wise it probably doesn’t matter how the ISA represents them, but you are right that fragmentation could be a major problem. It would not be ideal for such things to remain unresolved prior to mass production.
I am not following what that has to do with the ISA beyond the ability to decode it easily?
conscientious -> consensus
Ah I need to be more careful using spell checking since FF’s recommendations are often the wrong word!
Alfman,
Easy decoding and unnecessarily long instruction sequences are the main problem. 🙂 At least that is what I gathered looking into the discussions on RISC-V.
It seems to be heavily optimized for cheaper implementation (low-power devices). The same was true of the JVM (back in the day, we went through a hardware implementation of it within a semester of a CS class, but I remember almost nothing anymore). Java was supposed to run on refrigerators (the Internet of Things, back in the day).
https://en.wikipedia.org/wiki/Oak_%28programming_language%29
Over time the bytecode and JVM implementations became more optimized, and they can now compete with native C++, given a sufficient amount of RAM.
sukru,
I think ease of decoding was a priority for RISC-V though and I have no reason to suspect they didn’t succeed here. I haven’t seen evidence that RISC-V is disadvantaged in the decode stage.
https://fraserinnovations.com/risc-v/risc-v-instruction-set-explanation/
https://developer.arm.com/documentation/ddi0602/2022-12/Base-Instructions
Of course ease of decoding may not matter if the instruction density is bad. I haven’t tested it myself but from other accounts it may be a problem.
I’m not sure I can get behind this sort of generalization. Generally I don’t see why an ISA that’s good for performance isn’t also good for low power devices too. You want high density and ease of decoding in either case, no? Either way it doesn’t hurt.
Implementation-wise things obviously start to diverge with massive caches and longer speculative pipelines, which is a tradeoff between power and performance. But at the ISA level it’s harder to see an obvious tradeoff IMHO.
I guess if we’re talking about “refrigerators” or other appliances, an 8-bit micro-controller ISA might be more optimal. But the moment you start putting networking and android-like UI functionality into appliances, it hardly seems worthwhile to build out appliance-specific architectures. It may be overkill, but you could use something like a raspberry pi across the board, and standardization brings tons of benefits and cost savings over highly specialized components & systems.
Yes, it drives some C devs crazy, but there are times when JVM technology can actually rival C code. Hotspot optimization can provide the compiler with more information about active code paths. Also, sometimes there are performance advantages to using GC over alloc/free. Application servers are often well suited for GC since jobs are often completed before any garbage collection cycles have taken place, resulting in zero-cost garbage collection.
Alfman,
As a “not a hardware” person, I might not be conveying my thoughts correctly. (Or it is possible they are wrong). But, anyway….
For C++, “smart” pointers are actually not very efficient compared to a proper GC, and yes malloc/free has a measurable overhead.
Yet, languages can learn from each other. And modern C++ copies Java (ouch, very difficult admission).
One practice is using an “arena” for incremental allocation, like GC’ed languages do, and then destroying the entire group in one sweep. This brings the speed advantage of a generational mark-and-sweep GC without incurring the GC overhead itself:
https://developers.google.com/protocol-buffers/docs/reference/arenas
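As a toy illustration of the pattern (a hand-rolled sketch, not the protobuf implementation): a bump-pointer arena hands out memory by advancing an offset, and everything is released in one shot when the arena is destroyed.

```cpp
// Toy bump-pointer arena, just to illustrate the pattern; real protobuf
// arenas are far more sophisticated (block lists, destructor tracking,
// thread safety).
#include <cstddef>
#include <cstdio>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>

class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), used_(0) {}

    // Allocation is just a pointer bump: no per-object free, no heap metadata.
    void *Allocate(std::size_t bytes,
                   std::size_t align = alignof(std::max_align_t)) {
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start + bytes > capacity_) throw std::bad_alloc();
        used_ = start + bytes;
        return buffer_.get() + start;
    }

    // Construct an object inside the arena. For brevity this sketch only
    // supports trivially destructible types, since nothing runs destructors.
    template <typename T, typename... Args>
    T *Create(Args &&...args) {
        static_assert(std::is_trivially_destructible<T>::value,
                      "sketch does not track destructors");
        return new (Allocate(sizeof(T), alignof(T)))
            T{std::forward<Args>(args)...};
    }

private:
    std::unique_ptr<unsigned char[]> buffer_;  // one big block, freed at once
    std::size_t capacity_;
    std::size_t used_;
};

struct Point { int x, y; };

int main() {
    Arena arena(1 << 20);                  // grab 1 MiB up front
    for (int i = 0; i < 1000; ++i)
        arena.Create<Point>(i, i * 2);     // cheap, cache-friendly allocations
    std::printf("allocated 1000 Points with no individual frees\n");
}   // the arena (and every Point in it) is released here in one sweep
```

The win is exactly what is described above: each allocation is a pointer bump with good locality, and teardown is a single deallocation instead of thousands of individual frees.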
Anyway, back to RISC-V… The architecture getting traction is good. I am sure lessons learned from ARM will slowly carry over to that platform, and maybe one day we can even see a “Raspberry Vee”.
sukru,
“incurring the GC overhead itself” needs clarification. Many java algorithms would not incur any significant “GC” overhead either, depending on their use case.
But yes, in principle one can implement variations on GC in C/C++, and your arena reference is one such example. I recall such optimizations being done in DOS, and Pascal having a mechanism to do it too.
I don’t yet know how bad the instruction density is next to ARM or even x86, but assuming it isn’t too bad, I think it will be good to have an open hardware alternative. That said, I don’t know whether this actually translates into open hardware for consumers. For all of this hype about RISC-V’s open architecture, we might still end up with proprietary implementations/drivers/kernels leaving consumers no better off than before. If this happens, it’s harder to see an upside. I’ll hope for the best though.
Alfman,
The paper is a bit older, but offers a high-level peek at how much memory overhead a garbage collector needs before becoming efficient:
https://www.cs.utexas.edu/~mckinley/papers/mmtk-sigmetrics-2004.pdf
Everything of course depends on the algorithms used, the rate of allocation, and if the patterns match the expectations of those algorithms.
Anyway, at worst, given about 3x the RAM of a classic manual allocator, the garbage-collected version will run faster. Not only because it avoids costly destructor calls and fragmentation, but also because it can make much better use of cache locality. Given that RAM access is not O(1) (I think we had this discussion here before), but rather O(sqrt N) or O(log N) depending on your model, this is a great advantage.
Anyway, RAM is not free, but it makes a GC version much faster than old-style allocators (including smart pointers).
In this matter, we seem to agree a lot.
sukru,
Thanks for the link. I think those are reasonable figures for typical applications. I don’t normally choose GC languages based on performance characteristics. What’s far more important in practice are the benefits of managed memory. Having done many projects both ways, I’m extremely thankful to have safe languages that avoid memory corruption! The main case I’d avoid GC for is realtime processing where jitter itself is harmful.
If they provide upstream GNU/Linux device drivers for their RISC-V devices, then great. If not, then great for them, but it doesn’t really change all that much for me.
Geck,
I’d like to know as well, since this will determine whether anything will be fundamentally different. It’s important but not sufficient for google to commit to FOSS on its end; the SoC manufacturers (like Qualcomm) will need to as well, otherwise it could just be more of the same.
I do see this as a good opportunity, but at this stage it’s unclear whether actual implementations of RISC-V will be progressive for FOSS users in practice. Consider that even though the RISC-V spec is open, Nvidia uses RISC-V cores in their GPUs in a proprietary way that’s not open whatsoever. They can benefit from openness without passing any benefit on to owners. IMHO this would be a disappointing outcome.
It’s not like they have a choice. Do it or get out of the way for somebody else to do it instead. Google occupied the FOSS side of tech and presents itself as a FOSS proponent. If they are now backing out, fine; others will take over from there. All we got was some lame excuses in regards to ARM device drivers over the past decade or so. It’s time to move on. Progress can’t be stopped. On top of that, the algorithms that manipulate daily lives need to be opened up too.
Geck,
You can blame google as much as you like, but if you don’t blame those directly responsible for the proprietary drivers ending up in consumer hardware then you are barking up the wrong tree. We’re talking about driver code that google didn’t write, may not even have the source code for, and wouldn’t have the legal right to release even if they did have access under an NDA.
Sure, many of us want to move on for progress, but “progress can’t be stopped”? This is naive. The small set of corporations that produce cutting-edge chips are holding most of the cards; you can make a stink about what they’re dealing (all of us do), but at the end of the day it’s doubtful that you have the resources to go around them. Like it or not, a company like google may be your best shot at change. I don’t put blind faith in google, but without them your odds of successfully combating those responsible for proprietary drivers in consumer devices drop dramatically.
Excuses. Google is now in a position to produce a flagship GNU/Linux mobile device with upstream device drivers. There is nothing preventing them from making such a deal with chip manufacturers; there are enough chip manufacturers now interested in that. Not doing it is a choice, and as for why they don’t, let’s give it time, and we will see how long they can continue to stand in the way of that, and who is naive. They have built their entire empire on FOSS, so what they are doing now, in being against upstream device drivers, is rather shortsighted behavior. But it will happen, with or without them. On top of that, it’s time to adopt things like Wayland in Android and beyond. If they don’t want to do that anymore, somebody else will.
Geck,
You’re constantly blaming google for your woes, so what are you actually going to do without them? Tell us, because honestly blaming google seems to be your go-to excuse for almost everything that ails FOSS. Are you going to keep using them as an excuse every time this comes up? You keep saying things will happen and it will get done. I don’t think you are completely justified in blaming google, who’ve done more for FOSS than their competitors. Still, I wouldn’t necessarily have a problem with your ideas, but when it comes to execution you always come up short, and you never seem to propose any new plans that haven’t already been tried and failed to materialize.
So what is the plan? And no excuses this time!
This is all great, but keep in mind that 1) it’s mainly driven by Chinese manufacturers for their domestic market and 2) I don’t know of any RISC-V chips on the market that are completely open; for general-purpose mobile computing a whole host of SoC stuff for graphics, networking, etc. is needed, and it’s all proprietary AFAIK. Please someone jump in here if they know of any truly open RISC-V chip projects.
I think you’re right, it will be slightly more open than ARM, but in practice as a consumer, not that much different otherwise.
The promise with RISC-V is that it could be done, if somebody would do it, whereas with ARM it can’t be done, as the ARM license doesn’t allow it. So from this point of view RISC-V should be considered progress in regards to the things you ask for. Whether somebody will actually provide that, we’ll see. In my opinion, yes, over time it will happen.