CUDA.NET 2.3.7 Released

Dear all,

We would like to announce the release of CUDA.NET 2.3.7.

This version addresses various issues with the runtime API and its types. The data types and structures were changed to comply with the native wrapper of the CUDA Runtime API, in order to support cross-platform environments running in 32-bit or 64-bit mode. The structures now support the SizeT structure we introduced in the previous CUDA.NET release.

Link to the download page.

Please send us your comments and feedback.

22 Replies to “CUDA.NET 2.3.7 Released”

Hornbydd says:

October 28, 2009 at 12:22 AM

Hello,

I was wondering if you could provide some sample projects that are for VB .Net 2005 or 2008 express edition? The examples are in C# and I cannot load them.

Duncan
hvh says:

January 20, 2010 at 2:30 PM

Hello,
I successfully downloaded, installed, compiled and ran CUDA.NET and the examples you have provided. When I run the simpleFFT example on my 64-Bit machine the computation returns an error in the L2-norm (about 0.7 – depending of course on the random values it uses). I was wondering if this error comes from the fact that I’m running simpleFFT on a 64-Bit machine. Is that possible?

Thanks a lot for providing CUDA.NET!

Hermann
1. moti_bot says:
  
  January 20, 2010 at 2:51 PM
  
  Hey,
  In fact, that is more likely related to the precision with which your GPU performs calculations.
  Computing FFTs involves the exponent, sine and cosine functions, which are known to be less exact on the GPU than on the CPU (they do not fully conform to the IEEE 754 standard). Still, you may expect an error of at most 4 ULP (units in the last place of the float/double representation).
  
  It is better to ask on the CUDA forums too, since this example is based on the CUDA SDK, so others may have experienced it through C/C++ or other environments.
  
  Regards,
  CASS :).
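
To make the two error measures in this exchange concrete, here is a minimal C++ sketch of a relative L2 error and a ULP distance for single-precision values. The function names relativeL2Error and ulpDistance are illustrative and are not part of CUDA.NET or the CUDA SDK. The L2 definition used here (norm of the difference divided by the norm of the reference) is one common convention, assumed rather than taken from the example's source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Relative L2 error: ||ref - out|| / ||ref|| (assumed convention).
// Falls back to the absolute norm of the difference if ref is all zeros.
double relativeL2Error(const std::vector<float>& ref, const std::vector<float>& out) {
    assert(ref.size() == out.size());
    double diffSq = 0.0, refSq = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const double d = static_cast<double>(ref[i]) - static_cast<double>(out[i]);
        diffSq += d * d;
        refSq += static_cast<double>(ref[i]) * static_cast<double>(ref[i]);
    }
    return refSq > 0.0 ? std::sqrt(diffSq / refSq) : std::sqrt(diffSq);
}

// Map a float's bit pattern to an integer that increases with the float's value.
// Negative floats are stored as sign plus magnitude, so they are mirrored below zero.
static std::int64_t orderedBits(float x) {
    std::int32_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits < 0 ? static_cast<std::int64_t>(INT32_MIN) - bits : bits;
}

// Number of representable floats between a and b (0 means identical values).
// NaN inputs are not handled in this sketch.
std::int64_t ulpDistance(float a, float b) {
    const std::int64_t d = orderedBits(a) - orderedBits(b);
    return d < 0 ? -d : d;
}
```

Since a float has a 24-bit significand, a difference of a few ULP corresponds to a relative error of roughly 1e-7 to 1e-6 per value, which is far below the 0.6 to 0.8 reported in this thread.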
hvh says:

January 20, 2010 at 3:40 PM

Hi,

If I run the simpleFFT example from the CUDA SDK (which is in essence the same code) I don’t get these big errors from computing the L2-norm between forward/backward FFTs. Btw, my GPU is an 8800GTS (G92).

Any more hints where this big error might come from? I don’t see any bugs in your simpleFFT example.

all the best,
Hermann
hvh says:

January 21, 2010 at 12:28 AM

Hello again,

I repeated the computation on a different (32-Bit) computer with an 8400GS card and got the same result as before: an L2-norm of around 0.6 to 0.8 from your example project simpleCUFFT based on CUDA.NET.

Regards,
Hermann
hvh says:

January 22, 2010 at 7:40 PM

Hello,
just as a follow-up to my previous messages: I found a bug in your simpleCUFFT example in the function PadData. Now I get correct results with a maximum error of 4 ULP.

Regards,
Hermann
1. moti_bot says:
  
  January 23, 2010 at 9:55 PM
  
  Hi Hermann,
  
  Thank you, this is great.
  Can you share with us the bug you found?
  We will fix it with the next release.
hvh says:

January 23, 2010 at 11:12 PM

Sure! I will send my suggestion to support@hoopoe-cloud.com, is that OK?

Regards,
Hermann
1. moti_bot says:
  
  January 23, 2010 at 11:15 PM
  
  Certainly yes!
Sergio says:

February 16, 2010 at 1:18 AM

Hi

I am using your latest version of CUDA.NET. Unfortunately, when I use any of the CUDARuntime methods I get:

Unable to load DLL ‘cudart’: The specified module could not be found.

Am I missing something?

Sergio
1. moti_bot says:
  
  February 16, 2010 at 8:11 PM
  
  Hi Sergio,
  
  The cudart.dll file is part of the NVIDIA CUDA Toolkit installation.
  The driver API works in any environment with NVIDIA drivers installed, but for the runtime API you need to install the Toolkit.
Sergio says:

March 4, 2010 at 9:32 PM

Thanks!

I have another quick question. Sometimes in CUDA it is recommended to use templates to get some performance speedup. How do I pass the parameter to the template in CUDA.NET?

For example, if my CUDA function is:

template __global__ static void map(…..

How do I pass that blocksize in CUDA.NET?
hvh says:

March 9, 2010 at 7:56 AM

I guess you have to use SetFunctionSharedSize – didn’t try it though…
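
Template arguments in CUDA C++ are fixed at compile time, so a host-side wrapper cannot supply them at launch. One common workaround, sketched here under assumptions, is to instantiate the template for each block size you need and expose each instantiation under a plain extern "C" name that can be looked up by string. The code below is a host-only C++ stand-in so that it runs without a GPU: mapBody, map_128, map_256 and kernelNameFor are hypothetical names, and in a real .cu file the wrappers would be __global__ kernels calling a __device__ function template.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Stand-in for the templated kernel body. BLOCK_SIZE is a compile-time constant,
// which is what would let a real kernel size shared memory or unroll loops.
// The multiplication is placeholder work, not a real algorithm.
template <unsigned int BLOCK_SIZE>
void mapBody(const float* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * static_cast<float>(BLOCK_SIZE);
    }
}

// One unmangled entry point per block size (hypothetical names).
// In a .cu file these would be declared extern "C" __global__.
extern "C" void map_128(const float* in, float* out, std::size_t n) {
    mapBody<128>(in, out, n);
}

extern "C" void map_256(const float* in, float* out, std::size_t n) {
    mapBody<256>(in, out, n);
}

// Pick the entry-point name for a block size chosen at run time.
std::string kernelNameFor(unsigned int blockSize) {
    switch (blockSize) {
        case 128: return "map_128";
        case 256: return "map_256";
        default:
            throw std::invalid_argument("no instantiation compiled for this block size");
    }
}
```

The string returned by kernelNameFor would then be passed to whatever function-lookup call the host wrapper provides. The trade-off is that every block size you want to use has to be compiled into the module ahead of time.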
sl says:

May 7, 2010 at 3:34 PM

Hi,

I have the same problem Sergio ran into a couple of months ago. I am trying to use the CUDA Runtime methods but get the error message:

Unable to load DLL ‘cudart’: The specified module could not be found.

I have installed the toolkit and actually have a DLL named “cudart32_30_14.dll”. Is it the wrong one? I copied it into the folder containing CUDA.NET.dll but it still doesn’t work.

Am I doing something wrong?

Thank you
1. moti_bot says:
  
  May 8, 2010 at 11:23 PM
  
  Actually no, you should look for a cudart.dll file under the bin directory of the CUDA installation.
  If none exists, you can copy the cudart32_30_14.dll file and name the copy cudart.dll.
SetecAstronomy says:

May 10, 2010 at 6:22 PM

Any idea when we might be able to expect a CUDA 3.0/3.1 compatible release?
easyUeltje says:

June 9, 2010 at 4:45 PM

Hello,
like hvh, I get an error as the result. It’s also an L2-norm error. hvh said that he got rid of this error by modifying the PadData method. Would you share with me how to do this?

Or, alternatively: do I need this PadData method for the FFT functionality?

Kind regards
Marc says:

June 18, 2010 at 10:25 AM

The Async example does not work:

cuda.CopyDeviceToHostAsync(d_a, a, stream);
Error: This method is not generic.

Rewriting the method to:
unsafe
{
    fixed (int* p = a)
    {
        cuda.CopyDeviceToHostAsync(d_a, new IntPtr(p), (uint)a.Length * 4, stream);
    }
}

always returns ErrorInvalidValue.

Any help is greatly appreciated, I need to do async calls.

Thank you very much!
Marc says:

June 18, 2010 at 10:26 AM

Whoops, the non-generic method which does not work is supposed to be:
cuda.CopyDeviceToHostAsync(d_a, a, stream);
Marc says:

June 18, 2010 at 3:50 PM

Turns out the async operation can only work with non-pageable host memory which has to be allocated by AllocHost first.

Strangely, though, I cannot use FreeHost to free this host memory; it always fails.
