Announcing CLFORTRAN

We are pleased to announce CLFORTRAN for GPGPU.

CLFORTRAN is a new and elegant Fortran module that allows integration of OpenCL with Fortran programs easier than ever.
Taking advantage of Fortran language features, it is written in pure Fortran – aka no C/C++ code is required to utilize the GPGPU.

CLFORTRAN is compatible with all major compilers: GNU, Intel and IBM, and supporting OpenCL 1.2 API.
In addition, it is provided as open source and licensed under LGPL, to allow scientific computing at massive scales and all supported vendors.

You may read more at CLFORTRAN.

Intel® Released the Xeon® Phi™ Processor (MIC)

Intel® have just introduced their Xeon® Phi™ processor to the market, targeting HPC and scientific computing. It is available for purchase and integration into existing systems/platforms/servers.

The new co-processor is a discrete device that runs an operating system of its own and functions as a fully functional computer (though being a co-processor).

Why should anyone be interested in Xeon® Phi™?

It is based on the most common x86 architecture, therefore porting existing code and algorithms should be the easiest possible.
One may also utilize OpenCL™ algorithms to take advantage of the high-parallelism of the Phi™ processor.

In addition, it features 60 cores, 8GB of internal memory (with 320 GB/s) and uses PCIe x16 slot to provide high performance bandwidth throughput.
With almost 1 TFLOPS of double precision, Phi™ is competent to very high end GPUs in the market today, but on some aspects, provides better performance and industrial matching than other vendors.

You can read more at:

Contact us for more details and projects regarding Intel® Xeon® Phi™ family of processors.

OpenCL.NET 1.1.44 Released

We are pleased to annouce the release of a new OpenCL.NET library supporting OpenCL™ 1.1 (revision 44) driver API by the Khronos group.

This release includes all API functions described by the standard and targets cross-platform HPC and GPGPU applications.
OpenCL.NET can be used to develop solutions under Windows, Linux or Mac, also integrating with Matlab to accelerate algorithms of different kinds.
All vendors are supported, including NVIDIA®, Intel® and AMD.

For more information: OpenCL.NET.

Vicodeo™ – Accelerated Video Decoding Library

Dear all,

We are glad to introduce a new library for video decoding, Vicodeo™, featuring accelerated performance for faster than real-time decoding of H.264, MPEG-2 (and more) video streams – in managed environments (.NET / Java).

Video processing nowadays has become a computing intensive task. Being able to accelerate decoding and various processing tasks, opens the door for many types of applications and usage of video in life, from: high-quality films, security/surveillance cameras, live events, video conversations over the web and much more.

Our library provides many capabilities beyond real-time (+) decoding of 1080p (Full HD) streams:

  • Codec support: H.264, MPEG-2, VC-1 and more
  • Color space conversion from YUV 4:2:0 to RGB (accelerated)
  • Integrated parser for elementary/transport streams and video packets
  • Simple integration with DirectX or OpenGL
  • Faster than real-time decoding for 1080p even on low-end platforms
  • Optional immediate decoding of frames, without buffering
  • And more!

For more information: Video Decoding.

OpenCL.NET 1.0.48 Released


We are happy to announce the availability of the so long waiting OpenCL.NET 1.0.48 library.

This version aligns with OpenCL 1.0.48 standard, and fully conforms with latest NVIDIA drivers for OpenCL (and as well on supported platforms).

In brief, this release of the standard added few API functions and modified some, to truly allow heterogeneous computing on a single system. An application can query for the existence of multiple computing devices on the system, also by different vendors (recognize the CPU and a GPU as compute resources) regardless of the vendor. Such that consuming different computing resources can be transparent.

For further details about standard features and changes please consult Khronos website.

For OpenCL.NET page and download, click here.

As always, you are invited to contact us at:

World Cloud Computing Summit 2009

The 2nd annual cloud computing summit is about to take place in Shfayim, Israel, between December 2-3, 2009.

Following last year success, the event will cover recent developments and progress in cloud technologies. Presenting with top-of-the-line companies active in this field, including (partial list): Amazon, Google, eBay, IBM, HP, Sun, RedHat and more.

Additional “hands-on” labs and workshops are offered during the event for participants that would like to learn more about cloud technologies and integration possibilities.

We are also presenting Hoopoe at the summit, for GPU Cloud Computing, and providing a workshop on GPU Computing in general and Hoopoe as well.

This event ends 2009 and symbolically the last decade, marking cloud computing as a major development that we are about to see more and more in the next years.

You are invited to join us during the event.

OpenCL.NET Released

Hello everyone,

We are happy to announce the immediate availability of OpenCL.NET for the public.
This library provides a .NET implementation and wrapping of the OpenCL interface for GPU computing (and general computing as well).

Currently, the library supports revision 1.0.43 of Khronos (being the latest version of the standard).

Users may test the library with NVIDIA released drivers for OpenCL, or on other architectures as OpenCL should be supported on (Intel, AMD CPU etc.).

The API in this release was adapted to be cross platform in mind, and code, using the new SizeT construct for transparent handling of 32/64 bit platforms.

In addition, there is only one version of the library conforming to all operating systems who support OpenCL, regardless of Windows, Linux or Mac.

For any question, request, bug report or else, please contact us at:

We hope you will find this library useful.

SizeT – .NET and native code


In this post I wanted to introduce you with a new construct we added to the latest release of CUDA.NET (2.3.6) and will be available with the published OpenCL.NET library.

The problem

.NET is a very fixed environment, defining well known types, such that an int is always 4 bytes long (32 bit) and a long is always 8 bytes long (64 bit).

This is not the case with native code, for developers of C/C++. Writing a program in 32 bit environment, will always yield 32 bit types, unless using specific directives to get 64 bit variables. When writing 64 bit programs, they do get access to 64 bit wide variables as primitives supported by the compiler.

This clearly creates a portability problem for code and applications written in 32 and 64 bit environments.

Another example, is pointer size, where in C/C++ environments, under 32 bit the pointer is 4 bytes wide (int) and under 64 bit systems it is 8 bytes wide (long). The .NET environment (through different languages) provides a simple construct to overcome this problem, namely the IntPtr object, which some of you may be familiar with.

Now, coming back to our domain, the runtime API (also the driver in a new CUDA 2.3 function) and OpenCL makes extensive use of the C/C++ size_t data type. This data type ensures for developers that under different environments they will get the maximum width of the supported data type, unsigned int for 32 bit systems and unsigned long for 64 bit systems.

Possible options

By means of the interoprating library (wrapper), such as CUDA.NET, it creates a problem, since the API should provide several versions of the function, one given an uint (to map to 32 bit with unsigned int C/C++ type), and ulong (to map against unsigned long in 64 bit C/C++ systems). Supplying such an interface to the user will have to force him a specific behavior and system, since in .NET, the uint is always 32 bit wide, and ulong is 64 bit wide, no matter what.

Another option can be to provide a unique, standalone interface, using the IntPtr object, since .NET takes care to make it 32 bit wide in 32 bit systems and 64 bit wide for 64 bit systems, dynamically, without user intervention.

But using the IntPtr and a very serious downside, it is not dynamic, once it’s value is set, it cannot be changed through simple arithmetic operators, like +,-,*,/ or else.

The solution

Exactly for this purpose we created the SizeT object (structure). First, it maps to the same name as it’s native counterpart (size_t) and second it provides the dynamic mechanisms we want for working with 32 or 64 bit systems transparently.

SizeT can serve just like any other basic primitive in .NET.
For example:

SizeT temp = 15;
uint value = (uint)temp;
ulong value2 = (ulong)temp;
temp = value;

Internally, the SizeT wraps the IntPtr object to provide the same dynamic capabilities under 32 and 64 bit platforms.
It can host the required .NET primitives (int, uint, long, ulong), so when programming, one will make a good habit for using the SizeT instead of other data types (working with the runtime CUDA API).

For OpenCL the interface was built from the first place to use SizeT in mind, as the OpenCL API uses only size_t data types for cross platform functions.

Security in Hoopoe

Security in cloud systems is always a major part of the system, and requires a great effort to deal with and develop.
It usually starts when users are given access to actual machines, so they can run applications using the operating system, whether it is Windows or Linux based.

Security models in Hoopoe

Hoopoe provides several features to overcome this problem.

Isolated user environment

Hoopoe provides each user with a unique, isolated, environment. This way, only the user can access its files and computations, using the specific mechanism provided by for file management and related operations.

Hiding the “metal”

Hoopoe hides the “metal” from the user, providing access only through a web service interface to communicate with the system.
Thus, the user is limited with the flexibility of the code it can run.
There is no direct access to machines, so the user is able to submit his task to Hoopoe for further processing of the system. After the submission point, the user waits for the task to finish, and copy the results back.

Independent data management

User data is managed by Hoopoe as files, either raw or compressed (using GZip).
A buffer is then read in a fully managed (.NET) environment, thus reducing the risk for malformed or “bad” files.

Running computations

Hoopoe is meant to run computations, and not serve as an operating system. By such, user tasks are compiled on demand for the platform it should be processed on (if 32/64 bit, or specific hardware support).

Computations are running on the GPU itself, and this is where the interaction with the GPU ends. Copying the relevant data, performing the computations and placing the results back in the appropriate buffer.