HipHop - The talk of the day 
Today I read about HipHop, the talk of the day in the PHP community. It is basically a PHP to C++ compiler, developed by Facebook to have all benefits of compiled binaries (improved performance, more efficient memory management, less overhead), without having to train their PHP developers into C++ developers or do a big round of fire-and-hire. They open sourced it today.
Very cool idea. But I can't be the only one thinking "why doesn't PHP work like this in the first place?". Well, the answer is obvious. PHP might just not be the right tool for the job if you need the kind of performance Facebook wants. So Facebook thought of a way in which cheap (and apparently not so good) PHP developers, writing (somewhat) crappy code, combined with a ginormous code beautifier, actually produce superperformant binaries, as if they had highly educated C++ programmers on the job.
There's only one word for this. Brilliant. No sarcasm intended.
I'm very curious where HipHop will be in a year's time. I'm also very curious if HipHop 2.0 will compile into assembly code. And if HipHop 3.0 will compile directly into binaries. And if HipHop 4.0 will actually be a virtual machine that interprets bytecode files and comes with a JIT compiler. Oh wait...
Comments
You might wonder if a GJC compiler is not what you want for PHP, Python et al. Compiling to C++ seems like the wrong choice: the fact that C++ is OO doesn't make it a good intermediate language per se.
When I read the title, this came to mind:
http://www.youtube.com/watch?v=mQBllqvxMl4
I'm wondering why Facebook didn't first look at Quercus, the PHP-to-Java compiler from Caucho. Although the free version only interprets the PHP code, a commercial version is also available, and that one compiles the code into Java.
If a Java VM that compiles the bytecode into machine code is then used, the same effect might be achieved, but at lower cost. It is even possible that the end result would be more stable than the currently generated C++ code...
Skinkie, there is also Quercus which aims to translate PHP into Java bytecode at runtime and lets the JIT/HotSpot-magic of the JDK do its work. The performance-gains aren't super large since they still allow weak/dynamic typing.
Then again, you're not there yet once you have HipHop working on your code... You still need some runtime environment that can actually run your code (apparently Apache can't), while Quercus aims to be just another PHP runtime, rather than a translator to alternative code.
drm, I don't understand the semi-flames aimed at the PHP developers of Facebook. Even the very best PHP developers in the world can only do so much with PHP to achieve the best possible performance. If you want more performance, you can try to turn your (good) PHP programmers into good Java/C++ programmers and have them work long hours to convert (even more parts of) a large code base, or you can try to improve or circumvent the interpreter and keep the original code where possible. Facebook seems to have taken the latter road.
By the way, if you're not running at least a rack full of PHP servers, you're probably not going to need a specially optimized compiler plus runtime like HipHop. You'd do better to look really hard at rewriting the expensive parts of your PHP as (asynchronous) services, rather than trying to increase PHP performance with tools that make deployment and debugging even harder than they are now.
[Comment edited on Wednesday 03 February 2010 20:32]
"apparently Apache can't"
Thank God the world doesn't stop with one web server. Nevertheless, the most interesting thing I observed in the last month was that an actual web server written in PHP, with a reverse proxy in front of it, beat the performance of any FastCGI or mod_php setup. I thought the claim of LiteSpeed was that their 'undocumented' LS interface was good... this was mind-blowing.
This new C++ translator just shows that PHP is a programming language. And any programming language can be translated into any other language by applying a constant function to it.
[Comment edited on Wednesday 03 February 2010 21:11]
"drm, I don't understand the semi-flames to the PHP-developers of Facebook"
I don't intend to flame. As I understood it, they chose to build HipHop rather than refactor the current code to get the same performance increase... That rang some alarm bells for me. That's all.
quote: Skinkie
"This new C++ translator just shows that PHP is a programming language. And any programming language can be translated into any other language applying a constant function to it."
True, however, they've worked for two (!) years on this project, and this makes me wonder: wouldn't it be easier to retrain their developers, hire some new people, and switch to another language altogether? I mean sure, PHP's a nice language, and good enough for most things, but if you're investing time and a lot of money in making a round peg fit a square hole (because the square hole's bigger, thus faster), are you sure you should stick with the round peg?
Yes, analogies ftw. If they had good programmers, they could easily have spent the same two years rewriting their core components in Java, C++, C#, ASP.NET, or any language that performs better without losing (much of) the flexibility PHP offers, insofar as that still exists when working on a big, important project like Facebook (since I can imagine they'd have stringent rules and guidelines for writing code for that).
I highly doubt refactoring their current code will give them the same performance increase as HipHop 
"True, however, they've worked for two (!) years on this project, and this makes me wonder - wouldn't it be easier to retrain their developers, hire some new people, and switch to another language altogether?"
I agree with you here. The good choice is to code the heavily used parts and the locking parts in a low-level language, such as C. Just create a web server extension that handles all your posts and database updates. In my humble opinion, creating an application where required is better than making a web server plus application server that underperforms and has to cache everything to get performance.
"I mean sure, PHP's a nice language, and good enough for most things, but if you're investing time and a lot of money in making a round peg fit a square hole (because the square hole's bigger, thus faster), are you sure you should stick with the round peg?"
If they had really created a compiler from PHP to object code, that could have been a killer, but I guess you might be able to do that with the object code that PHP makes internally anyway.
[Comment edited on Wednesday 03 February 2010 22:34]
"I highly doubt refactoring their current code will give them the same performance increase as HipHop"
That all depends on the quality of the current PHP code, doesn't it?
YopY, three guys spent two years on the project (well, not exactly, it comes down to about four man-years). In all likelihood, those three guys could never have rewritten Facebook, or even just some core parts, in that time. Plus, when you're rewriting, you're not improving; you're doing work to make something you already have.
Plus, they have a development team of about 350 people. You can't retrain them all, and firing those who can't be retrained into C/Java/whatever developers and hiring new people to replace them isn't going to be very economical.
And for things like social networking websites, being able to adapt quickly is important. If you want short release cycles, you'll probably want to use something like PHP and not something like Java.
For people interested in a (whole) bunch of articles on HipHop for PHP, http://www.planet-php.net/search/hiphop has a nice (not so little) list.
I would have liked to see Facebook join the efforts to create an LLVM-based PHP JIT compiler rather than creating a source code transformer.
However, I must say Google's effort to create an LLVM-based Python interpreter (http://code.google.com/p/unladen-swallow/) isn't really working out either. The only reason their benchmarks look good is that they look at worst-case Python code and compare only against older versions of unladen-swallow. Still, I expect LLVM to have a promising future, especially if a lot of different projects use it for different purposes.
"Thank God the world doesn't stop with one webserver. Never the less the most interesting in the last month that I had observed was that an actual webserver written in PHP with a reverse proxy in front of it beat the performance of any fastcgi or mod_php setup."
Of course it doesn't. But then, afaik it is still the best web server option for PHP. To my knowledge, all the FastCGI setups are either slower or much less convenient to manage, apart from the loss of several useful features in various cases.
Do you have a link to the PHP-server you mention here?
"True, however, they've worked for two (!) years on this project, and this makes me wonder - wouldn't it be easier to retrain their developers, hire some new people, and switch to another language altogether?"
Their point was that that wouldn't have been easier. Besides, even if you did retrain those people, you'd lose the advantages PHP has (which they at least claim exist, and I agree), especially in the rapid release area.
"In my humble opinion creating an application, where required is better than making a webserver+application server that under performs and has to cache everything to get performance."
They already have that, a lot, as I understand it. PHP is already mostly a glue language between their various backend applications and (specialized) databases, plus a templating environment to display all the gathered data. But if you can make your normal web servers, of which they have a lot as well, run so much faster that you can decrease your investment in new machines at the scale they do... that is a very interesting project.
Exactly, Jory167. Facebook looked at this matter from a business point of view, not from a technical point of view like most other reactions on this blog.
It's just plain cheaper to use a small team (in the video there were 6 people) to develop something like HipHop than to retrain 350 programmers and reinvent wheels again. Facebook spent about 5 to 6 years developing their current codebase. Most likely they made a few frameworks to support the clustering and scaling of the site, etc.
To rewrite that in a different language would mean throwing away everything they have made up to now and starting over. Not impossible, but from a business point of view it means burning money.
I think it's a smart route they picked, but as stated in other posts about this on the web, it's not a route for the majority of websites out there. No matter how important an average PHP developer thinks his/her site is, it's no Facebook when it comes to scaling/performance. It's a fun route, from a developer's point of view, to see how well your current code could transform into C++, but it's useless for most of us.
Sidenote: It's going to be the talk of the week, clearly ... 
[Comment edited on Thursday 04 February 2010 15:09]