OpenCL™ Development


Introduction

As pioneers of GPU computing in Israel, our background with the OpenCL™ parallel computing platform goes back to its early days (2009), when GPU computing was still commonly called GPGPU. Since then, we have developed and ported hundreds of algorithms for typical GPU computing fields:

Image & Video Processing
Finance
Geophysics
Medical Imaging
Cloud Computing (on Amazon AWS EC2 and 3rd party providers)
DirectX / OpenGL Graphics Interoperability
And much more

With this expertise and experience, we can deliver GPU-based solutions for most industries and technologies.

Supported OpenCL hardware profiles include embedded (HPeC, low power), desktop / workstation, and server-side / cluster scale. Multi-GPU scenarios are supported in all profiles to achieve high-performance, asynchronous, low-latency processing.
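
As a minimal illustration of the multi-GPU scenarios mentioned above, the Python sketch below splits a one-dimensional workload into contiguous ranges, one per device. The function name `split_work` and the even-split policy are assumptions made for illustration only; they are not part of OpenCL or of any specific project, and a real system would typically also weigh device speed and memory.

```python
def split_work(total_items, num_devices):
    """Divide total_items into contiguous (offset, count) ranges, one per device.

    Hypothetical helper for illustration: any remainder is spread over the
    first devices so that range sizes differ by at most one item.
    """
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    if total_items < 0:
        raise ValueError("total_items must not be negative")

    base, extra = divmod(total_items, num_devices)
    ranges = []
    offset = 0
    for device_index in range(num_devices):
        # The first `extra` devices each take one additional item.
        count = base + (1 if device_index < extra else 0)
        ranges.append((offset, count))
        offset += count
    return ranges


if __name__ == "__main__":
    # Example: 10 work items distributed over 3 devices.
    for device_index, (offset, count) in enumerate(split_work(10, 3)):
        print(f"device {device_index}: offset={offset}, count={count}")
```

Each (offset, count) pair could then serve, for example, as the global work offset and global work size of a kernel launch on the corresponding device.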

OpenCL and the OpenCL logo are trademarks of Apple Inc.

Development Methodology

We have created and tuned dedicated software development methodologies over the years to achieve robust development cycles, reduce cost, and shorten development time for HPC projects involving GPUs.

Development stages follow the sequence below (high-level overview):

Requirements collection
System level analysis
Algorithm level analysis
Prototype development
Optimization and fine-tuning
Testing (system and unit level; also performed throughout the previous stages)
Integration
Deployment

Development processes can be performed in compliance with ISO 9000/9001 standards for organizations with strict quality assurance requirements.

Cloud Computing

Our OpenCL solutions can be developed for hosting on Amazon AWS EC2 HPC instances, which integrate NVIDIA Tesla family GPUs alongside CPU architectures.

The main benefits of using AWS services are better scalability and lower development costs, since there is no need to purchase high-end systems.

For production purposes, AWS can add value through a cost-effective pricing model and by offering higher acceleration and scalability.

For more information, see Amazon AWS EC2.

Other third-party cloud service providers can also be used on request.

Frameworks and Programming Languages

C / C++
Microsoft .NET (C#, Visual Basic, Python and more)
Java
FORTRAN
Python
MathWorks MATLAB

Operating Systems

Microsoft Windows:

Desktop: XP, Vista, 7
Server: 2003, 2008, 2008 R2, HPC Server 2008
Embedded editions of all the types above

Linux (most distributions):

RHEL
CentOS
Fedora
Ubuntu
And others…

Real-time embedded Linux
Wind River
Real-time (soft and hard) mode support in Windows/Linux editions
Android (2.0 and newer)
Apple Mac OS X
Support for 32-bit and 64-bit architectures

SDK Platforms

A comprehensive set of libraries is available with GPU-accelerated features and functionality. The list below briefly describes, without being exhaustive, the SDKs and capabilities we have gained experience with over the years.

NVIDIA CUDA® SDK – Image Processing Primitives
AMD APP – Fast Fourier Transform
Intel OpenCL – Basic Linear Algebra Subroutines
IBM OpenCL – Video decoding/encoding (H.264, MPEG-2, VC-1 etc.)

Graphics Interoperability

Integrating OpenCL into visualization systems is useful in many cases, for example to replace existing CPU-intensive processing stages or to provide additional acceleration where regular shaders are limited or too complex.

It is also possible to integrate OpenCL functionality where high-level graphics frameworks are used, such as OpenSceneGraph, Ogre, Unity, SlimDX and more.

Supported APIs (across multiple operating systems):

Microsoft DirectX 9, 10 and 11
OpenGL

Supported Computing Architectures

CPU

x86-based CPU architectures by Intel and AMD
ARM processors
DSP components

GPU

AMD
NVIDIA®:

NVIDIA® GeForce® (by different brands or manufacturers)
NVIDIA® Tesla® – Official GPU computing solution
NVIDIA® Quadro® – Official Visualization and computing solution
NVIDIA® QuadroPlex – Official Visualization and computing solution for cluster environments
NVIDIA® Tegra®, including ARM processor support
CUDA® on ARM – CARMA (based on ARM architecture)

Intel MIC

APU

AMD Fusion:

AMD Ontario, eOntario (Embedded G-Series)
AMD Llano
AMD Trinity, eTrinity (Embedded R-Series)

Intel Ivy Bridge