Linked by David Adams on Wed 14th Dec 2011 16:01 UTC, submitted by fran
Internet & Networking PHP's popularity and simplicity made it easy for Facebook's developers to quickly build new features. But PHP's (lack of) performance makes scaling Facebook's site to handle hundreds of billions of page views a month problematic, so Facebook has made big investments in making it leaner and faster. The latest product of those efforts is the HipHop VM (HHVM), a PHP virtual machine that significantly boosts the performance of dynamic pages. And Facebook is sharing it with the world as open source.
RE: Database
by Alfman on Wed 14th Dec 2011 20:14 UTC in reply to "Database"
Alfman Member since:
2011-01-28

Well, the front end PHP servers are usually (always?) stateless, which means they can be scaled trivially by running clusters of mirrored web servers in parallel. The same sort of scalability is not nearly as trivial for databases, and for that reason they tend to be much more problematic.

That said, PHP is extremely inefficient. It's worse than Java or .NET by a factor of roughly 100 according to the "average" row in the following benchmark:

http://www.csharp-architect.com/images/benchmarksJan2009Final.gif

So, with some hand-waving, we'd expect a VM version of PHP to significantly reduce the number of PHP servers required to service a given load, and the excess servers could be redeployed as less heavily loaded database servers.

This makes a lot of sense for entities like Facebook and shared hosting providers, which run servers at max capacity.

Reply Parent Score: 2

RE[2]: Database
by Alfman on Wed 14th Dec 2011 20:40 in reply to "RE: Database"
Alfman Member since:
2011-01-28

Just correcting myself: PHP session data isn't stateless, but if load balancing drives each user to the same server on every request, it's not an issue. Otherwise, it's easy to store the session data on NFS.
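
To illustrate the "same server each request" idea, here is a minimal sketch of sticky routing by session ID. The server names and the `pick_server` helper are hypothetical and only stand in for whatever a real load balancer does; this is not how any particular product implements it.

```python
import hashlib

# Hypothetical pool of identical front-end servers (placeholder names).
SERVERS = ["web1", "web2", "web3"]


def pick_server(session_id: str, servers=SERVERS) -> str:
    """Deterministically map a session ID to one server in the pool."""
    # Hash the session ID so the same ID always yields the same index.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]
```

Calling `pick_server` twice with the same session ID returns the same server, so that server's local session files stay valid. With plain modulo hashing, changing the pool size remaps many sessions, which is where shared session storage becomes the safer choice.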

Reply Parent Score: 2

RE[2]: Database
by fran on Wed 14th Dec 2011 22:46 in reply to "RE: Database"
fran Member since:
2010-08-06

OK, I know this is not Stack Overflow, but is this a problem across all databases? For instance NoSQL, Couchbase, SQL, etc.
Isn't there also a type of caching plugin that makes PHP faster, and methods to curb output buffer flooding?

Reply Parent Score: 2

RE[3]: Database
by lucas_maximus on Wed 14th Dec 2011 23:38 in reply to "RE[2]: Database"
lucas_maximus Member since:
2009-08-18

There are things like memcached and other technologies ... but Facebook is built on MySQL, which is all right for running a blog or a small web store ... but once it gets serious you need a proper RDBMS.

Reply Parent Score: 1

RE[3]: Database
by Soulbender on Thu 15th Dec 2011 11:45 in reply to "RE[2]: Database"
Soulbender Member since:
2005-08-18

Part of the problem, and no small part, is that most web developers couldn't design a relational database if their life depended on it. Normalize? Wazzat? Just use incrementing serials always, everywhere. Relations? No, I'm single and do my database consistency in the code.

Edited 2011-12-15 11:45 UTC

Reply Parent Score: 2

RE[2]: Database
by Dr.Mabuse on Thu 15th Dec 2011 01:26 in reply to "RE: Database"
Dr.Mabuse Member since:
2009-05-19

That said, PHP is extremely inefficient. It's worse than Java or .NET by a factor of roughly 100 according to the "average" row in the following benchmark: ... (snip)


Most Java applications, in my experience, running in a web environment are horrible, simply horrible, memory hogs. I can think of many examples, commercial and internally developed that fit this description!

Maybe it's not Java's fault per se, but the toolkits, or the methodology behind them. I really don't know (or care), but I cannot hope to recall the number of times Tomcat has balked because it's run out of resources. Servers running this software nearly always need more RAM and more CPU than their PHP/Apache2 counterparts.

Maybe the VM powering PHP really is that much slower, but when considering the complete stack to deliver content to the web, PHP is a much better option if you actually care about reliability and, I think, user-perceived performance.

Just my 2 cents...

Reply Parent Score: 3

RE[3]: Database
by Alfman on Thu 15th Dec 2011 03:07 in reply to "RE[2]: Database"
Alfman Member since:
2011-01-28

Dr Mabuse,

"Most Java applications, in my experience, running in a web environment are horrible, simply horrible, memory hogs. I can think of many examples, commercial and internally developed that fit this description!"

I can sure vouch for this indirectly. I needed to work on some Java code with a special version of the Eclipse IDE, which consumed no less than 500MB of RAM. We thought something was wrong, but the vendor said it was normal and within specs. I needed more RAM installed in my employer-provided computer.

Of course, as you say, it may not indicate a problem with Java per se, but yikes...

Anyways, some people have done the benchmarks for memory too so we can speak a little more intelligently about it:

http://shootout.alioth.debian.org/u64q/benchmark.php?test=all&lang=...


"Maybe the VM powering PHP really is that much slower, but when considering the complete stack to deliver content to the web, PHP is a much better option if you actually care about reliability and, I think, user-perceived performance."

I think it's largely developer preference and skill.

Java's forced (checked) exceptions caused a lot of friction with developers who simply wanted exceptions to bubble up until they were caught or the program aborted. Without a Java IDE to insert code templates, calling exception-throwing functions was uniquely painful. Java invented a new problem that no other language had. It would have been much better handled as a compiler warning. This was the deciding factor in avoiding Java for my personal projects, despite wanting to use it for its other qualities.


PHP is a global heap of inconsistent functions with a long history of semantic incompatibility between versions. PHP's designers were clearly not qualified to build the language that would become the standard web platform for the internet. They can be credited with the development of anti-features such as "magic quotes" and "=== I really mean it" equality.

PHP's strong point is its online documentation; they've done an excellent job of making it easy to learn how to do things. I think many languages are far behind in the documentation department.

I would like to try other modern languages, but PHP's ubiquity at hosting providers keeps me coming back - it remains top dog because it's top dog.

Edited 2011-12-15 03:12 UTC

Reply Parent Score: 3

RE[3]: Database
by moondevil on Thu 15th Dec 2011 15:11 in reply to "RE[2]: Database"
moondevil Member since:
2005-07-08

Most Java applications, in my experience, running in a web environment are horrible, simply horrible, memory hogs. I can think of many examples, commercial and internally developed that fit this description!

Maybe it's not Java's fault per se, but the toolkits, or the methodology behind them. I really don't know (or care), but I cannot hope to recall the number of times Tomcat has balked because it's run out of resources. Servers running this software nearly always need more RAM and more CPU than their PHP/Apache2 counterparts.


Usually the fault is the programmers.

Most of the time when I see bad Java code, it is caused by developers that learned to program Java on the job while putting to production the first thing they managed to compile.

But this is not Java-specific; I see this a lot when we need to rescue projects done by wannabe developers, regardless of the programming language being used.

Reply Parent Score: 2

RE[2]: Database
by Laurence on Thu 15th Dec 2011 11:24 in reply to "RE: Database"
Laurence Member since:
2007-03-26

Well, the front end PHP servers are usually (always?) stateless, which means they can be scaled trivially by running clusters of mirrored web servers in parallel. The same sort of scalability is not nearly as trivial for databases, and for that reason they tend to be much more problematic.

That said, PHP is extremely inefficient. It's worse than Java or .NET by a factor of roughly 100 according to the "average" row in the following benchmark:

http://www.csharp-architect.com/images/benchmarksJan2009Final.gif

So, with some hand-waving, we'd expect a VM version of PHP to significantly reduce the number of PHP servers required to service a given load, and the excess servers could be redeployed as less heavily loaded database servers.

This makes a lot of sense for entities like Facebook and shared hosting providers, which run servers at max capacity.

I can't see any figures for PHP in that gif.

I'd also be interested to see PHP compared against CGI, mod_perl and Python.

Reply Parent Score: 2

RE[3]: Database
by Alfman on Thu 15th Dec 2011 16:40 in reply to "RE[2]: Database"
Alfman Member since:
2011-01-28

Laurence,

"I can't see any figures for PHP in that gif."

They're in the table up at the top. I don't really understand why they were omitted from the graph. Anyway, I posted another source with an interactive language-by-language comparison.

.NET and Java were fairly similar, Perl was often better than PHP, but interpreted languages across the board were at least an order of magnitude slower than native ones.

Ideally, all languages would have native JIT compilers so that performance would no longer be such a crucial factor between them.

Reply Parent Score: 2

RE[2]: Database
by tony on Thu 15th Dec 2011 18:51 in reply to "RE: Database"
tony Member since:
2005-07-06

Well, the front end PHP servers are usually (always?) stateless, which means they can be scaled trivially by running clusters of mirrored web servers in parallel. The same sort of scalability is not nearly as trivial for databases, and for that reason they tend to be much more problematic.

That said, PHP is extremely inefficient. It's worse than Java or .NET by a factor of roughly 100 according to the "average" row in the following benchmark:

http://www.csharp-architect.com/images/benchmarksJan2009Final.gif

So, with some hand-waving, we'd expect a VM version of PHP to significantly reduce the number of PHP servers required to service a given load, and the excess servers could be redeployed as less heavily loaded database servers.

This makes a lot of sense for entities like Facebook and shared hosting providers, which run servers at max capacity.


Those functions in the benchmark, how often is PHP required to do computationally complex operations? Most of the time it's rendering HTML and pulling data from or pushing into a database. Very simple stuff. We're not calculating Pi to the millionth digit or calculating jump coordinates with PHP. I can't imagine you see any difference in a basic web page with a DB back-end using PHP or compiled C.

And Java has always seemed slower to me in implementation. It's such a resource hog and, what's worse, as a sysadmin I have very little idea of what's going on inside the Java engine. Neither do the developers, it seems. So when stuff goes wrong, it goes way wrong.

Reply Parent Score: 2

RE[3]: Database
by Alfman on Thu 15th Dec 2011 20:56 in reply to "RE[2]: Database"
Alfman Member since:
2011-01-28

tony,

"Those functions in the benchmark, how often is PHP required to do computationally complex operations?"

Consider things like dynamically generated graphics (like captcha). If PHP is the wrong language for that, what is the right language? Is that language available in your hosting package?

Most PHP pages do very little at a time, like maintaining shopping carts and constructing SQL strings, but in aggregate the inefficiencies do add up, especially when the level of inefficiency is great.

I'm a little surprised that I'm the only soul here who seems to care about language efficiency. Oh well, it's a sign of the times.
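
One rough way to weigh "most pages do very little" against "the inefficiencies add up" is an Amdahl-style estimate: only the share of a request spent executing script code gets faster when the runtime does. The sketch below is an illustration under that assumption; the function name and any fractions fed into it are hypothetical, not measurements from the benchmarks linked in this thread.

```python
def overall_speedup(script_fraction: float, script_speedup: float) -> float:
    """Estimate the whole-request speedup when only script execution gets faster.

    script_fraction: share of request time spent running script code (0..1).
    script_speedup:  how many times faster the new runtime executes that code.
    """
    if not 0.0 <= script_fraction <= 1.0:
        raise ValueError("script_fraction must be between 0 and 1")
    if script_speedup <= 0.0:
        raise ValueError("script_speedup must be positive")
    # Time that is unaffected (database, network) plus the shrunken script time.
    remaining = (1.0 - script_fraction) + script_fraction / script_speedup
    return 1.0 / remaining
```

If a page spends almost all of its time waiting on the database, `script_fraction` is small and even a much faster runtime barely moves the total; if script execution dominates, the gain approaches `script_speedup`. Which case applies is something to measure per site.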

Reply Parent Score: 2

RE[2]: Database
by Lennie on Sat 17th Dec 2011 00:13 in reply to "RE: Database"
Lennie Member since:
2007-09-22

No large website uses stock PHP; they at least use a bytecode cache like APC/eAccelerator/XCache and a cluster of memcached servers to avoid hitting the database servers for read queries. Maybe a NoSQL solution like Redis or whatever.
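
The read-query caching pattern described here can be sketched roughly as follows. This is a minimal illustration that assumes a plain in-process dictionary as a stand-in for a memcached cluster; `ReadThroughCache` and `load_from_db` are hypothetical names, not part of any memcached client library.

```python
class ReadThroughCache:
    """Serve reads from a cache and fall back to the database on a miss."""

    def __init__(self, load_from_db):
        # load_from_db is a hypothetical callable that runs the real read query.
        self._load_from_db = load_from_db
        self._store = {}    # stand-in for the memcached cluster
        self.db_reads = 0   # counts how often the database was actually hit

    def get(self, key):
        if key in self._store:
            return self._store[key]          # cache hit: no database work
        self.db_reads += 1
        value = self._load_from_db(key)      # cache miss: query the database
        self._store[key] = value
        return value

    def invalidate(self, key):
        # Call after a write so the next read fetches fresh data.
        self._store.pop(key, None)
```

Repeated reads of the same key reach the database only once until the key is invalidated, which is the load reduction being described. A real deployment would also need expiry and a policy for keeping cached data consistent with writes.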

Reply Parent Score: 2

RE[3]: Database
by Alfman on Sat 17th Dec 2011 00:48 in reply to "RE[2]: Database"
Alfman Member since:
2011-01-28

Lennie,

"No large website uses stock PHP; they at least use a bytecode cache like APC/eAccelerator/XCache and a cluster of memcached servers to avoid hitting the database servers for read queries. Maybe a NoSQL solution like Redis or whatever."

You're probably right, but so what? Even small shared-hosting web sites could benefit from the higher performance.

Reply Parent Score: 2