Regular expression for Amazon S3 URL

Hello Everyone,

We recently added support for Amazon S3 storage services to Hoopoe. Following the previous article with our general account details, we want to share the regular expression we use to validate S3 URLs given as sources of data and files.

You may find more information about S3 naming conventions and requirements in the manuals available from http://aws.amazon.com/s3.

When submitting a task to Hoopoe with input/output sources from Amazon S3, you must specify the S3 URL of each resource. A simple resource URL looks like this:
https://test-bucket.s3.amazonaws.com/dir1/input.bin
In this example, the user's bucket that stores the object is called “test-bucket”, and the input file is “dir1/input.bin”, which is called the key of the object within the bucket.

This is a general form of S3 URL, which makes objects addressable over the internet.
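As a rough illustration of the bucket/key split described above, here is a minimal Python sketch. It is not Hoopoe's code: the function name `split_s3_url` and the constant `S3_SUFFIX` are made up for this example, and it only handles the `https://<bucket>.s3.amazonaws.com/<key>` form shown in this article.

```python
from urllib.parse import urlsplit

# Host suffix of the URL form used in this article (assumption: no other
# S3 endpoint styles need to be handled).
S3_SUFFIX = ".s3.amazonaws.com"


def split_s3_url(url):
    """Return (bucket, key) for an https://<bucket>.s3.amazonaws.com/<key> URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.scheme != "https" or not host.endswith(S3_SUFFIX):
        raise ValueError("unsupported S3 URL: " + url)
    bucket = host[: -len(S3_SUFFIX)]  # everything before ".s3.amazonaws.com"
    key = parts.path[1:]              # drop the single leading "/"
    if not bucket or not key:
        raise ValueError("missing bucket or key: " + url)
    return bucket, key


bucket, key = split_s3_url("https://test-bucket.s3.amazonaws.com/dir1/input.bin")
print(bucket, key)  # prints: test-bucket dir1/input.bin
```

Note that the "directory" part of the key is only a naming convention; the whole string after the host is the key.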

Regular Expression

We use a regular expression to validate every Amazon S3 URL submitted with a task to Hoopoe.

In .NET syntax (which most other regex flavors share), the expression is:
https://[a-z0-9][a-z0-9-.]*\.s3\.amazonaws\.com/[\w][\w\W]*

The following limitations apply:

  1. For DNS compatibility, bucket names must be lowercase and start with a letter or a number
  2. In S3, following DNS limitations, bucket names should not exceed 63 characters in length
  3. Object keys can be of variable length; they must start with a valid character, but any characters may follow, including “/” to denote paths (a file named “dir/input.bin” is located under the “dir” directory)
  4. In addition to the above, Hoopoe restricts S3 URLs to at most 256 characters in length
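The pattern alone does not express the two length limits, so a validator has to check them separately. The sketch below shows one way to combine them in Python. It is an illustration under assumptions, not Hoopoe's .NET implementation: the names `S3_URL_RE`, `MAX_BUCKET_LEN`, `MAX_URL_LEN` and `is_valid_s3_url` are invented here, the dots in the host are escaped so they match literally, and the hyphen is moved to the end of the character class to keep it unambiguous.

```python
import re

# Group 1 is the bucket name, group 2 is the object key.
S3_URL_RE = re.compile(
    r"https://([a-z0-9][a-z0-9.-]*)\.s3\.amazonaws\.com/(\w[\w\W]*)"
)
MAX_BUCKET_LEN = 63   # DNS label limit applied to bucket names
MAX_URL_LEN = 256     # Hoopoe's limit on the whole URL


def is_valid_s3_url(url):
    """Check the URL against the pattern and both length limits."""
    if len(url) > MAX_URL_LEN:
        return False
    match = S3_URL_RE.fullmatch(url)  # the whole string must match
    if match is None:
        return False
    return len(match.group(1)) <= MAX_BUCKET_LEN


print(is_valid_s3_url("https://test-bucket.s3.amazonaws.com/dir1/input.bin"))  # True
print(is_valid_s3_url("https://Test-Bucket.s3.amazonaws.com/dir1/input.bin"))  # False
```

Using `fullmatch` rather than a plain search means trailing or leading text around an otherwise valid URL is rejected, which is usually what a validator wants.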

If you find a mistake in the regular expression, whether it rejects valid URLs or is too permissive, please send us an email.
We also hope you find this information useful for your own purposes.

Amazon S3 Integration

Hi,

We are pleased to announce that we recently added support for Amazon S3 services, integrated into Hoopoe.

With Amazon S3, users get extended storage from Amazon Web Services (AWS) and can also communicate with other cloud systems, such as EC2, which offer a variety of processing capabilities.

Users who would like to use Amazon S3 can do so through a very intuitive interface: they specify the buckets and objects they use, following S3 semantics and terms. Hoopoe then offers bi-directional communication with S3, reading input data and writing computed results.

We will follow up with more articles presenting best practices for using Amazon S3 with Hoopoe.

As general information, users can use the following details to identify Hoopoe in Amazon S3.

Hoopoe Amazon S3 details:

  • E-mail Username: support@cass-hpc.com
  • Canonical User ID: 939155fee5acfced9622d4a7df63e8a1fd54a24290a81871fd7d20f43aa758dd

We highly encourage users to identify Hoopoe by the e-mail address above when adding an ACL record in Amazon S3.

For more information about Amazon S3: http://aws.amazon.com/s3/

Hoopoe Cloud.