Working with Amazon S3 (Simple Storage Service)


Amazon S3 (Simple Storage Service) is a cloud-based (internet-based) storage service. Amazon manages it in its data centers worldwide, and it provides a highly available, secure and scalable storage solution. The objects, or files, stored in S3 can be saved and retrieved through the REST interface or the APIs provided by Amazon.

The objects stored on S3 can be used in your web and mobile applications, websites and big data analytics, or for backup and archiving. S3 is cost effective because you get storage space on demand and pay according to usage. The prices for storing, archiving, retrieving and transferring data are listed on the Amazon S3 pricing page.

In the S3 storage architecture, objects are stored in logical containers called buckets. Buckets are tied to your Amazon account, and a virtually unlimited number of objects can be stored in a bucket. Once an object is stored in a bucket, it can be accessed through a unique URL. There are two URL styles for addressing buckets: virtual-hosted style and path style. The difference is that in the virtual-hosted style the bucket name is part of the domain name, whereas in the path style it is part of the URL path. You can see the examples below.

Virtual-hosted-style URL example: http://bucketname.s3.amazonaws.com

Path-style URL example: http://s3.amazonaws.com/bucketname
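To make the difference concrete, here is a small sketch that builds both URL styles for an object. The helper names (EscapeKey, VirtualHostedStyleUrl, PathStyleUrl) are my own and not part of the AWS SDK, and the sketch assumes the plain s3.amazonaws.com endpoint used in the examples above.

```csharp
using System;
using System.Linq;

// Hypothetical helper: escape each segment of the key but keep "/" separators.
string EscapeKey(string key) =>
    string.Join("/", key.Split('/').Select(segment => Uri.EscapeDataString(segment)));

// Virtual-hosted style: the bucket name is part of the host name.
string VirtualHostedStyleUrl(string bucket, string key) =>
    $"http://{bucket}.s3.amazonaws.com/{EscapeKey(key)}";

// Path style: the bucket name is the first segment of the path.
string PathStyleUrl(string bucket, string key) =>
    $"http://s3.amazonaws.com/{bucket}/{EscapeKey(key)}";

Console.WriteLine(VirtualHostedStyleUrl("bucketname", "Test Object Key"));
// prints http://bucketname.s3.amazonaws.com/Test%20Object%20Key
Console.WriteLine(PathStyleUrl("bucketname", "Test Object Key"));
// prints http://s3.amazonaws.com/bucketname/Test%20Object%20Key
```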

Access to buckets and objects is regulated by ACLs (access control lists). Each user in an AWS account can be granted permissions on buckets and objects.

In this post we will perform basic operations on buckets and objects with the .NET API. You can do the same from the AWS console, and some of it with the REST services. The difference between the .NET API and the REST API is that the REST APIs are low-level APIs and give us more control over the transfer of data. In later posts I will try to work with the REST APIs.

With the .NET SDK, buckets can be created using simple classes and methods in the Amazon.S3 namespace. The namespace becomes available in the project after referencing the AWSSDK assembly. The assemblies and namespaces are included by default when you create a project from the AWS templates.

1- Create a Bucket

In the code below, a bucket is created in S3 with a few lines of code. The first step is to create an AmazonS3Client object from AWSClientFactory. It is a generic client object through which any type of S3 request can be submitted.

The client's methods take a request object as a parameter. To create a bucket, a PutBucketRequest object is created and the name of the bucket is specified on it.

[Screenshot: code that creates the bucket]
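As a minimal sketch of this step, assuming a recent AWS SDK for .NET (version 3 or later, AWSSDK.S3 package), where the client is constructed directly instead of through AWSClientFactory and the calls are asynchronous, the code might look like this. It also assumes that credentials and region come from your default AWS configuration.

```csharp
using System;
using Amazon.S3;
using Amazon.S3.Model;

// Assumption: credentials and region are resolved from the default AWS configuration.
using var client = new AmazonS3Client();

// The request object carries the name of the bucket to create.
var request = new PutBucketRequest
{
    BucketName = "awss3-samples-abdul-rafay" // bucket name used in this post
};

PutBucketResponse response = await client.PutBucketAsync(request);
Console.WriteLine($"Create bucket returned HTTP status {response.HttpStatusCode}");
```

Bucket names are shared across all AWS accounts, so you will need to pick a name of your own when trying this.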

After the call, the output is as follows, and a bucket named “awss3-samples-abdul-rafay” is created.

[Screenshot: output after the bucket is created]

2- Write an object in the bucket

Following a similar pattern, a PutObjectRequest object is created and passed to the AmazonS3Client object.

The properties ContentBody, BucketName and Key are set on the PutObjectRequest object as shown below.

Please note that several other parameters (such as metadata) can be set, depending on the type and nature of the object to be stored.

[Screenshot: code that writes the object to the bucket]
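A minimal sketch of the same step, again assuming a recent AWS SDK for .NET (version 3 or later) and default credentials, could be written as follows. The content string is a placeholder of mine, since any text can be stored.

```csharp
using System;
using Amazon.S3;
using Amazon.S3.Model;

// Assumption: credentials and region are resolved from the default AWS configuration.
using var client = new AmazonS3Client();

var request = new PutObjectRequest
{
    BucketName = "awss3-samples-abdul-rafay",      // bucket created in step 1
    Key = "Test Object Key",                       // key used in this post
    ContentBody = "Sample content for the object"  // hypothetical content
};

PutObjectResponse response = await client.PutObjectAsync(request);
Console.WriteLine($"Put object returned HTTP status {response.HttpStatusCode}");
```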

The following is the output after the object with the key “Test Object Key” is created in the bucket.

[Screenshot: output after the object is created]

You can see the object created in the bucket, which can be browsed from AWS Explorer in VS2010.

[Screenshot: the object shown in AWS Explorer]

3- Retrieve an object from S3 and write its contents to a file

A GetObjectRequest object is passed to the S3 client, and the object stored earlier is retrieved by specifying the key and the bucket name. The key must be unique within a bucket: if another object is written with the same key, it overwrites the existing object. It is up to the application to ensure uniqueness and to handle object locking when objects are read and written.

[Screenshot: code that retrieves the object and writes it to a file]
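Here is a minimal sketch of the retrieval, under the same assumptions as before (a recent AWS SDK for .NET and default credentials). It saves the object on the desktop under the same name as its key.

```csharp
using System;
using System.IO;
using System.Threading;
using Amazon.S3;
using Amazon.S3.Model;

// Assumption: credentials and region are resolved from the default AWS configuration.
using var client = new AmazonS3Client();

var request = new GetObjectRequest
{
    BucketName = "awss3-samples-abdul-rafay",
    Key = "Test Object Key"
};

// Target file on the desktop, named after the object key.
string desktop = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
string filePath = Path.Combine(desktop, request.Key);

// The response wraps a stream, so it is disposed once the file has been written.
using GetObjectResponse response = await client.GetObjectAsync(request);
await response.WriteResponseStreamToFileAsync(filePath, false, CancellationToken.None);
Console.WriteLine($"Saved {request.Key} to {filePath}");
```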

Below is the output of the code: the object is read and saved to a file on the desktop with the same name.

[Screenshots: program output and the saved file on the desktop]

4- Listing objects in the bucket

A ListObjectsRequest object is passed to the S3 client with BucketName set.

[Screenshot: listing the objects in the bucket]
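A minimal sketch of the listing, assuming a recent AWS SDK for .NET and default credentials, might look like this. A single call returns only one page of results, so a bucket with many objects would need paging, which this sketch leaves out.

```csharp
using System;
using System.Collections.Generic;
using Amazon.S3;
using Amazon.S3.Model;

// Assumption: credentials and region are resolved from the default AWS configuration.
using var client = new AmazonS3Client();

var request = new ListObjectsRequest
{
    BucketName = "awss3-samples-abdul-rafay"
};

ListObjectsResponse response = await client.ListObjectsAsync(request);

// Guard against a null list, which some SDK versions return for an empty bucket.
foreach (S3Object entry in response.S3Objects ?? new List<S3Object>())
{
    Console.WriteLine($"{entry.Key} ({entry.Size} bytes)");
}
```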

5- Deleting object

A DeleteObjectRequest object is passed to the S3 client with BucketName and Key set.

[Screenshot: deleting the object]
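The delete call follows the same request and response pattern. This is a minimal sketch under the same assumptions as the earlier ones (a recent AWS SDK for .NET and default credentials).

```csharp
using System;
using Amazon.S3;
using Amazon.S3.Model;

// Assumption: credentials and region are resolved from the default AWS configuration.
using var client = new AmazonS3Client();

var request = new DeleteObjectRequest
{
    BucketName = "awss3-samples-abdul-rafay",
    Key = "Test Object Key"
};

DeleteObjectResponse response = await client.DeleteObjectAsync(request);
Console.WriteLine($"Delete object returned HTTP status {response.HttpStatusCode}");
```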

The object is deleted from the bucket.

[Screenshot: the bucket after the object is deleted]

Why cloud computing?


The origin of the term “cloud computing” is unknown, but the concept of the cloud has been around as long as the internet. I have seen diagrams in which a cloud depicts the internet or a grid of sophisticated computer networks. Back in the old web development days I used to deploy websites and their databases on publicly available remote web and database hosting servers, which shared resources between the different websites and databases they hosted. The concept of cloud computing is the same, but it goes beyond websites and databases and has a lot more to offer.

Today the concept of the cloud has evolved into an abstract remote network of computers providing processing, storage and networking capabilities that can be accessed via the internet. Cloud computing refers to the use of resources that are dynamically allocated on demand through remote calls. The concept is analogous to SOA (service-oriented architecture), which provides abstraction and reusability of processing, execution and business logic accessed remotely. Cloud computing goes beyond this and provides infrastructure (IaaS), platforms (PaaS) and software (SaaS) as a service, which can be reused, shared, dynamically allocated and called remotely.

SOA was a paradigm shift in distributed computing away from conventional peer-to-peer communication. It provides business benefits such as lower cost and shorter time to market, and technological benefits such as reusability of business functions, less connectivity management, less support overhead and less effort in IT change management. I won’t go into the details of how SOA provides these benefits. You can read more about SOA if you have questions about it.

Similarly, cloud computing is a paradigm shift in both the business and the technology model. Cloud computing has private, public and hybrid deployment models. I will focus on public cloud deployment and discuss the factors that benefit the business and technology of an organization.

Changing the Business Model:

First, let’s take a look at how cloud computing fits into and changes the business model. By business model I mean the shifts in cost, payment, calculation and business focus for an organization. If an organization adopts public cloud computing, it will be renting resources rather than owning them. This affects the organization’s balance sheet by shifting from a CAPEX to an OPEX model: the amortization of resources over the years is replaced by operating expenses, which is analogous to hiring a taxi rather than owning a car. In addition, avoiding the cost of owning the resources enables the organization to invest its capital in its core business. For example, instead of spending your money on buying and maintaining a car, you can use it for other core activities and keep hiring a taxi. This gives you a pay-as-you-go model instead of paying upfront, which helps increase the organization’s liquidity.

This also transfers technology risks such as upgrades, downtime and maintenance issues to the cloud hosting company. In our car example, if you hire a taxi you don’t care about its maintenance, breakdowns or accidents. You just make sure that whenever you want to travel you can do so without any hiccups.

So this ensures business continuity without owning the risks or their mitigation.

There is marketing hype that cloud computing helps organizations save costs and achieve economies of scale. I would say this is possible, but in reality we have to look at a lot of factors. If the cost of renting outweighs the cost of owning, then we are losing money on cloud computing and the economies of scale are an illusion. So an organization adopting cloud computing has to do its own analysis to justify this point. Normally, predicting or formulating the case for adopting cloud computing versus owning the resources is a challenge and requires due diligence, vision and planning. In an upcoming blog post I plan to write about some use cases of cloud computing and the genres of applications that, due to their nature, are good candidates for hosting in the cloud and are widely accepted in the industry as better hosted in the cloud than on premises.
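The rent-versus-own comparison can be sketched with a few lines of arithmetic. All the figures below are made up purely for illustration and are not real prices; the point is only that the answer depends on how long the resources are needed.

```csharp
using System;

// All figures are hypothetical, chosen only to illustrate the comparison.
const decimal purchasePrice = 12000m; // upfront cost of owned hardware
const decimal monthlyUpkeep = 150m;   // power, maintenance and support when owning
const decimal monthlyRent = 500m;     // pay-as-you-go bill for the same workload

for (int months = 12; months <= 48; months += 12)
{
    decimal owning = purchasePrice + monthlyUpkeep * months;
    decimal renting = monthlyRent * months;
    string cheaper = renting < owning ? "renting" : "owning";
    Console.WriteLine($"{months} months: own {owning}, rent {renting}, cheaper: {cheaper}");
}
```

With these assumed numbers renting is cheaper at 12 and 24 months, while owning is cheaper from 36 months onward, which is exactly the kind of crossover an organization has to look for in its own analysis.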

Changing the Technology model:

Another important factor is elasticity, the dynamic allocation of resources based on demand. The beauty of cloud computing is that you pay for what you use and can scale up, scale down or deallocate resources for any duration at any point in time. This is very important for companies with significant variation in customer base, business cycles and peak business hours. Let’s take the example of a new dynamic website that has attracted people and needs to scale up continuously because of growth in usage and customer base. Moreover, it is used to the maximum during 7-8 peak hours, while the rest of the time the resources sit idle.

In contrast, if we use fixed on-premises hardware, it would be very difficult to keep scaling up the hardware, which involves downtime and results in unsatisfied customers. It would also waste resources when the business cycle is in a recession or during non-peak hours. For example, take a website selling tickets for the FIFA World Cup. This website is a good candidate for hosting in the cloud because most customers would log in during the daytime in the region where the World Cup is to be hosted, and the volume of customers can vary. Also, after some time the website will be retired; in that case deallocating resources from the cloud is easier than liquidating a lot of hardware.

Summing up, the factors mentioned in this article can push an organization to adopt cloud computing and shift its business and technology paradigm. There are a lot of myths and misconceptions about cloud computing concerning privacy, data security and the re-architecting of applications, which I hope to write about in upcoming posts.
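To put the peak-hour example into rough numbers, here is a small sketch that compares server-hours per day for fixed hardware sized for the peak against elastic capacity that shrinks outside peak hours. The server counts are hypothetical; only the 8 peak hours come from the example above.

```csharp
using System;

const int peakServers = 10;    // hypothetical capacity needed during peak hours
const int offPeakServers = 2;  // hypothetical capacity needed the rest of the day
const int peakHours = 8;       // peak window from the website example
const int hoursPerDay = 24;

// Fixed hardware must be sized for the peak and runs all day.
int fixedServerHours = peakServers * hoursPerDay;

// Elastic capacity follows demand: full size at peak, reduced otherwise.
int elasticServerHours =
    peakServers * peakHours + offPeakServers * (hoursPerDay - peakHours);

double idleShare = 1.0 - (double)elasticServerHours / fixedServerHours;

Console.WriteLine($"Fixed: {fixedServerHours} server-hours per day");
Console.WriteLine($"Elastic: {elasticServerHours} server-hours per day");
Console.WriteLine($"Idle share of fixed capacity: {idleShare:P0}");
```

With these assumed figures the fixed setup runs 240 server-hours a day against 112 for the elastic one, so a bit more than half of the fixed capacity sits idle.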