During the Blender Conference, Jeroen Bakker demonstrated an OpenCL-accelerated version of the Blender compositor. The result was a jaw-dropping increase in speed.
Jeroen Bakker writes:
This year at the Blender Conference, we demoed a new concept: the Blender OpenCL Compositor. OpenCL (Open Computing Language) is a technical standard for using multiple types of processing units (like the ones found on graphics cards) to perform computational tasks.
The test system is an i7 920 quad core (about 280 EUR) with a GT 220 (about 70 EUR!): a mid-range processor with a mid-range graphics card. The test contains several heavy nodes like the bokeh blur and the lens distortion. A bokeh blur of 50 by 50 pixels took 204 seconds on the CPU. On the GPU the same task took only 22 seconds! Watch the video to see it for yourself, or test drive it on your own system. This all sounds good, but there are some limitations.
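As a quick check of the figures quoted above, the speedup works out to roughly nine times. The snippet uses only the two timings from the test; the variable names are illustrative.

```python
# Timings quoted above for the 50 by 50 pixel bokeh blur.
cpu_seconds = 204.0
gpu_seconds = 22.0

# Ratio of CPU time to GPU time, and the share of time saved.
speedup = cpu_seconds / gpu_seconds
time_saved = 1.0 - gpu_seconds / cpu_seconds

print(f"speedup: {speedup:.1f}x")
print(f"time saved: {time_saved:.0%}")
```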
Not all systems are capable of running OpenCL, so you need a fall-back to the normal CPU path. The usual approach would be to implement two separate node systems: one for OpenCL, and one as the CPU fall-back. As developers don't want to maintain two compositor systems, we have designed a way to easily maintain both compositor implementations.
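One way to picture a single node definition with two execution paths is sketched below in Python. This is not the actual Blender design, which is not described in detail here; `Node`, `run_node`, and the `opencl_available` flag are hypothetical names, and the "OpenCL" path is simulated with an ordinary Python function so the sketch runs anywhere.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Pixel = float  # single-channel pixel, for brevity


@dataclass
class Node:
    """Hypothetical compositor node with one required and one optional path."""
    name: str
    cpu_kernel: Callable[[List[Pixel]], List[Pixel]]
    # Stand-in for a compiled OpenCL kernel; None means "CPU only".
    opencl_kernel: Optional[Callable[[List[Pixel]], List[Pixel]]] = None


def run_node(node: Node, pixels: List[Pixel],
             opencl_available: bool) -> Tuple[str, List[Pixel]]:
    """Run the OpenCL path when possible, otherwise fall back to the CPU."""
    if opencl_available and node.opencl_kernel is not None:
        return "opencl", node.opencl_kernel(pixels)
    return "cpu", node.cpu_kernel(pixels)


def brighten(pixels: List[Pixel]) -> List[Pixel]:
    # Same arithmetic for both paths, so results match on either device.
    return [min(1.0, p + 0.25) for p in pixels]


brightness = Node("Brightness", cpu_kernel=brighten, opencl_kernel=brighten)
image = [0.0, 0.5, 0.9]

print(run_node(brightness, image, opencl_available=True))
print(run_node(brightness, image, opencl_available=False))
```

The point of the sketch is that the node tree itself never needs to know which device ran a node; only the dispatcher does.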
The current compositor system uses a lot of computer memory and is designed with a 'shared memory model' (a development model where all processing units can access all needed memory). OpenCL does not support this model. Another limitation is the amount of memory that OpenCL can use: it is limited to 25% of the total memory of the graphics card, and the overhead of OpenCL itself also comes out of this 25%.
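To make the 25% figure concrete, the sketch below estimates how many pixels fit in that budget. The 1024 MB card size, the 16 bytes per pixel (four 32-bit float channels), the comparison frame size, and the function name are all assumptions for illustration; the memory size of the test card is not stated here.

```python
def opencl_pixel_budget(card_memory_mb: int, bytes_per_pixel: int = 16,
                        usable_fraction: float = 0.25) -> int:
    """Upper bound on pixels that fit in the usable share of card memory."""
    budget_bytes = int(card_memory_mb * 1024 * 1024 * usable_fraction)
    return budget_bytes // bytes_per_pixel


pixels = opencl_pixel_budget(1024)   # assumed 1024 MB card
frame_2k = 2048 * 1080               # assumed frame size for comparison
print(pixels, pixels // frame_2k)
```

This is an upper bound only: the OpenCL overhead mentioned above takes its share of the same budget, and a node presumably needs both its input and output buffers in memory at once.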
We want to make this stable and integrate it in a future Blender version. Some aspects of this system are:
- More direct feedback to the artist. The final picture will be calculated and shown to the user at the same time (the final image will be built in front of the user).
- Usage of multiple graphics cards at the same time. If it is not quick enough, just add a new graphics card.
- Introduction of camera socket types and the "input camera file" node.
We are looking for artists and financial support to help us with the system.
This experimental version is released to get feedback from the community and as a benchmark between different OpenCL implementations. Please have fun with it. Be warned that this release is only for demo purposes. Nodes that are using OpenCL are:
- Defocus node (only color images are supported; black-and-white images do not work)
- Blur node (only when the bokeh check-box is checked)
- Color Balance
- Lens distortion
- And some smaller nodes: Brightness, Rotate, Tonemap
This version is only available for Linux 64 bit. To start the OpenCL compositor you have to start Blender with the parameter "--enable-opencl". Some test blend files and the patch are included.
Video: Battle between CPU and GPU
Video: Speedup of the defocus node
Woah sweet! Looking forward to try this out :D
Please oh please let this be the primary compositor processing method as soon as possible!!!! Can't wait for an OSX build! OpenCL throughout Blender, where applicable, should be priority #1 (after bug fixes and feature match from 2.49b).
Wow, these new features just keep rolling in. Blender, I can't keep up with you anymore (^^)
Wow looking forward to trying this downloading now. :) have opencl all setup :) opencl particles and hopefully soon bullet will be nice too :)
Wow, I want this NOW!
It's funny how an i7 quadcore is considered to be a mid-range processor :)
This is not the only OpenCL project in Blender. Enja is working on an OpenCL implementation of particles: http://enja.org/2010/06/30/blender-game-engine-particles-in-the-mix/
Apologies for my ignorance, but can OpenCL be applied to the everyday Blender rendering processes, e.g. raytracing (especially with gloss < 1!) etc.? I never use any of the specialist compositing nodes mentioned above, but would still love superfast day-to-day rendering using my graphics card.
I am really looking forward to this feature improvement!!! This could increase the speed of your renderings!! Especially for animations.
Thanks for this, and hoping this will come as fast as possible :)
That's very impressive!
Does anyone know which types of graphics cards are supported?
Finally! This was something I've always been asking for in the forum for the render, but the compositor node, now that's really good news!
@Reaction: You can currently get a build of Blender 2.55 with Luxrender built in. If you enable hybrid mode and change the surface integrator to path tracing mode then you already have OpenCL accelerated rendering this very minute! Of course this is all still in heavy development and will only improve with time.
...times they are a changing fast :)
@Artorp: most modern video cards with an NVidia or ATI chipset work fine. Intel-based video cards are not supported (yet). You need a hardware driver with OpenCL support, which is standard for NVidia; for ATI you need to install a secondary driver (Stream).
Even my laptop is OpenCL compliant!
Clock speeds (GHz) have been at their limit for a long time already, but industry demands don't stand still.
sweet! how can artists contribute?
> Windows 64 bit
> Windows 32 bit
Apparently filiciss builds have a lot more features to offer and add-ons 'O.o`
This is very interesting, although it won't have any effect for me right now since I am working with my white 13" MacBook 'without' separate video RAM (not the best setup for working with 3D, but anyhow, that's the way it is).
I've been using Nuke on my system and it's awesomely fast and responsive even with very high resolution images like 2K or 4K. The way the images are drawn, i mean preview wise, is also very fast and great to use.
So I am hoping that Blender's compositor will get faster and more responsive during usage. I'd say OpenCL is one good way to achieve this, and at least for me it is a very appreciated development effort from you, so thanks (I won't always use this white MacBook, I strongly guess :)
@All, be aware that the Windows build on graphicall has only the rotate node benefiting from the OpenCL speedup. The other nodes only run on the CPU and therefore get no speedup.
Please have this in the next update of blender.
Man, what would an NVIDIA GTX460 make out of this...
Is there an example blend file to do the benchmark?
Was it me or was the CPU graph in the video NOT topping out ever? I think I only saw it reach 50% utilization. Nevertheless, OpenCL integration into Blender is highly welcomed! Thank you dev.!
Great news! Keep up the good work! Sometimes in the future I would also like to get my hands on OpenCL, projects like these motivate me towards that!
So, is it CPU vs. GPU? I thought that with OpenCL you could utilize BOTH (so CPU+GPU) for tasks like these...am I wrong?
yeah! use all cpu/gpu cores available.
A compositor with hybrid mode :) a must-have in upcoming Blender versions..
thx a lot and keep on
Also, let's remember Enjalot's ( http://enja.org ) work on particles using OpenCL in the Blender BGE. His video should still be available on YouTube for those who haven't seen it yet.
Where do I go again to suggest a feature request?
I'm waiting for the day that the Blender node tree includes/switches all or most of its physics to Bullet and supports .bullet file outputs, or features Bullet Physics as an add-on.
@Jogal: it is packed with the Linux 64 build; it is called sinteltest.blend
@BlenderBoy: the CPU version is the non-OpenCL Blender, so the comparison is between the current trunk and the OpenCL branch. In the current trunk only very complex compositor systems and the defocus node will utilize more of the CPU.
@Temaruk: good question! OpenCL only lets you write code that can run on a CPU or a GPU; it doesn't provide anything that automatically uses both at the same time. That still takes developer effort.
Going back to my comment about Bullet support. I learned recently it's got OpenCL support. Here's the Bullet Physics export for 2.49b build and bullet physics editor http://code.google.com/p/bullet-physics-editor/downloads/list
found > http://www.bulletphysics.org/mediawiki-1.5.8/index.php/Bullet_binary_serialization
Alright! Cool! Thanks!
I donated, but it was a measly amount, so I'm going to donate a much more appropriate amount soon.
Will your work include a wide gamut RGB default workspace and an option for 3D LUT support for import, view and export? Looking to import sources like xvYCC, RedSpace, RAW camera formats etc and in line with applications like Nuke, Premiere & AE CS5.
yCMS is a free 3D LUT generator and has an open specification, there's also OpenColorIO:
For details of yCMS open spec here:
It creates 96bit precision 3D LUTs.
As Blender is 32bit float, will this allow no clipping or desaturating of 'out of gamut' colors when image processing?
For example Oloneo, http://www.oloneo.com/:
An extract from their 'Press' page:
"At the core of Oloneo PhotoEngine is a fully real-time, 32-bit floating-point per channel (96-bit per pixel), ultra-wide gamut, full resolution and non-destructive image-editing engine. Oloneo's color model handles a range of colors that largely surpasses what printers or screens are capable of displaying today. Combined with the suppression of any color shifting and clipping, it guarantees photographers against any loss of image data during the HDR process. With a simple, intuitive and responsive user interface, all image controls work in full real-time at a rate of 1/20 second or less on a general-purpose desktop or laptop computer."
I've actually been working on an open-source implementation of viewing 3D LUTs and manipulating them in 3D space via Blender.
Details to follow...
Joshua, any news on this? This could be the solution to a big problem I've been trying to solve in development of a product.
?? I'd really love to chat with you about this, please email me at: zach (at) visualsupply dot co (not com)