S3 Authentication

Authenticating requests to S3 Object Storage

Introduction

Amazon Web Services (AWS) offers the Simple Storage Service (S3), which can be used to store and protect data. This kind of storage is referred to as “Object Storage”, which differs subtly from the “Block Storage” used by the disk storage that we are most familiar with. One aspect of object storage is that one part of a file cannot be edited incrementally, which means that applications that rely on growing files are difficult to support.

Amazon offers a suite of SDKs that allow application developers to interact with S3. These SDKs eventually translate the application code into HTTP requests to the Amazon service. This post explains how these requests are formed and secured.

S3 storage

Amazon Web Services provides a browser-accessible console that allows S3 storage to be configured. A user can set up a “bucket”, which is a repository for objects that are going to be managed together.

The console shows the properties of each object including a download URL.

https://brians3test.s3.us-west-2.amazonaws.com/020706ac4b5c16e8342686f8c7202755_2022-03-26.mxf

However, entering the URL in a browser, or in an application that can create HTTP GET requests (e.g., Postman), will produce the following response from the service.

<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>DFRDTKYF2AYGH45J</RequestId>
<HostId>85LeuZsHrtW8rdwrlx3WVS6l+4NOt4tjsy+Vl0Apc9E2h/SRIS8e3rJHkndVKeXBPTKPpL12lGg=</HostId>
</Error>

This message indicates that the request has been denied for security reasons, which makes sense because S3 provides a layer of security for the user's files.
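An application that talks to S3 directly will usually want to detect this response rather than show raw XML. The sketch below parses the error body shown above using Python's standard library; the function name parse_s3_error is made up for this example and is not part of any AWS SDK.

```python
import xml.etree.ElementTree as ET

# The error body returned by the unauthenticated GET request above.
ERROR_BODY = """<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>DFRDTKYF2AYGH45J</RequestId>
<HostId>85LeuZsHrtW8rdwrlx3WVS6l+4NOt4tjsy+Vl0Apc9E2h/SRIS8e3rJHkndVKeXBPTKPpL12lGg=</HostId>
</Error>"""


def parse_s3_error(body: str) -> dict:
    """Return the child elements of an <Error> document as a dictionary."""
    root = ET.fromstring(body)
    return {child.tag: (child.text or "") for child in root}


error = parse_s3_error(ERROR_BODY)
print(error["Code"], "-", error["Message"])
```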

HTTP GET Requests

Hypertext Transfer Protocol (HTTP) is an application layer protocol originally designed for communication between web servers and web browsers. Depending upon the operation required, different methods are specified in the request message. For example, a simple request that a browser might issue to a web server to retrieve a file might look like …

GET /gmail/ HTTP/1.1
Host: google.com
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7

The GET method indicates that this is a request for data. The lines of text that follow are called headers and consist of a collection of key-value pairs.
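To make the structure concrete, here is a small sketch that splits the request above into its method, target and headers. It is only an illustration (parse_request is a hypothetical helper, and it assumes the standard CRLF line endings); a real application would rely on an HTTP library.

```python
# The example request, with the CRLF line endings that HTTP/1.1 uses.
RAW_REQUEST = (
    "GET /gmail/ HTTP/1.1\r\n"
    "Host: google.com\r\n"
    "Accept-Language: en-us,en;q=0.5\r\n"
    "Accept-Encoding: gzip,deflate\r\n"
    "Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n"
    "\r\n"
)


def parse_request(raw: str):
    """Split a raw request into (method, target, version, headers)."""
    head = raw.split("\r\n\r\n", 1)[0]          # drop the (empty) body
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")    # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return method, target, version, headers


method, target, version, headers = parse_request(RAW_REQUEST)
print(method, target, version)
for name, value in headers.items():
    print(f"{name} = {value}")
```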

The Amazon S3 service requires that an Authorization header be added to the request to provide proof that the requester is authorized to retrieve the file.

GET /020706ac4b5c16e8342686f8c7202755_2022-03-26.mxf HTTP/1.1
Host: brians3test.s3.us-west-2.amazonaws.com
x-amz-content-sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
x-amz-date: 20220521T190125Z
Authorization: AWS4-HMAC-SHA256 Credential=AKIA5TRHZWO37NBK72XS/20220521/us-west-2/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=2849011c9030062b75621e63d3451dea3b098c46baa94b2b67aae22db793cc0d

The signature in this example is a hash that must be calculated using a secret that is shared between Amazon and the user. Amazon uses it to determine whether the user is authorized to issue this HTTP request.
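The Authorization header packs several pieces of information into one line. The following sketch pulls them apart, which may help when reading the rest of this post; parse_authorization is a name invented for this example, and the layout it expects is simply the one visible in the header above.

```python
AUTHORIZATION = (
    "AWS4-HMAC-SHA256 "
    "Credential=AKIA5TRHZWO37NBK72XS/20220521/us-west-2/s3/aws4_request, "
    "SignedHeaders=host;x-amz-content-sha256;x-amz-date, "
    "Signature=2849011c9030062b75621e63d3451dea3b098c46baa94b2b67aae22db793cc0d"
)


def parse_authorization(value: str) -> dict:
    """Break an AWS4-HMAC-SHA256 Authorization header into its parts."""
    algorithm, _, rest = value.partition(" ")
    fields = dict(part.strip().split("=", 1) for part in rest.split(","))
    # The credential is "<access key ID>/<date>/<region>/<service>/aws4_request".
    key_id, date, region, service, terminator = fields["Credential"].split("/")
    return {
        "algorithm": algorithm,
        "access_key_id": key_id,
        "date": date,
        "region": region,
        "service": service,
        "terminator": terminator,
        "signed_headers": fields["SignedHeaders"].split(";"),
        "signature": fields["Signature"],
    }


parts = parse_authorization(AUTHORIZATION)
for name, value in parts.items():
    print(f"{name}: {value}")
```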

Hashing

A hash is a calculation that attempts to produce a fixed length output from a variable length input. There are many algorithms that can be used to calculate a hash but the best ones provide values that are evenly distributed about the output space.

For example, suppose we have an arbitrarily long integer for which we want to calculate a 2-digit hash. We could square the number and extract the middle two digits as the hash, e.g., 45678 * 45678 = 2086479684. The middle two digits of 2086 47 9684 are 47. Using the middle two digits is better than using the last two, since more of the input digits contribute to the middle two digits than to the last two.
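The toy hash can be written in a few lines. This is just a sketch of the example above; the padding rule for squares with an odd number of digits is an assumption added so that a middle pair always exists.

```python
def middle_two_digit_hash(n: int) -> int:
    """Square n and return the middle two digits of the result."""
    digits = str(n * n)
    if len(digits) % 2:              # assumption: pad odd lengths with a leading zero
        digits = "0" + digits
    mid = len(digits) // 2
    return int(digits[mid - 1 : mid + 1])


print(45678 * 45678)                 # 2086479684
print(middle_two_digit_hash(45678))  # 47
```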

This method of creating a hash is not practical as the input strings get longer, and so other mathematical techniques have been developed. One such algorithm is called SHA256, and for the purposes of this post the details aren’t important. However, an important feature of this algorithm is that the input can be of any length but the output is always 256 bits (usually written as a hexadecimal string of 64 characters).

SHA256("The quick brown fox jumped over the lazy dog.") = 68b1282b91de2c054c36629cb8dd447f12f096d3e3c587978dc2248444633483
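Python's standard hashlib module can be used to see the fixed-length property in action. The helper name sha256_hex is just for this sketch. Note that the hash of an empty input is the same value that appears in the x-amz-content-sha256 header earlier in this post, which fits a GET request that carries no body.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA256 hash of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


empty = sha256_hex("")
large = sha256_hex("x" * 1_000_000)

print(empty)                    # the hash of an empty input
print(len(empty), len(large))   # both are 64 hexadecimal characters (256 bits)
```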

Hash-based Message Authentication Code (HMAC)

The SHA256 hash is the basis of an algorithm called HMACSHA256, which uses a secret to sign a message in a way that can be authenticated by the server. Conceptually, the original message is combined with the secret, the result is hashed, and the hash is transmitted to the server alongside the message as its signature. (The real HMAC construction mixes the secret in twice, using two nested hashes, rather than simply concatenating.) When the server receives the message, it performs the same operation with the shared secret; if the hash it calculates matches the signature, the server can presume that the sender knew the secret.
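The sign-then-verify exchange can be sketched with Python's standard hmac module. The secret and message below are made-up placeholders, and the function names are only for illustration.

```python
import hashlib
import hmac


def sign(secret: bytes, message: bytes) -> str:
    """Sender side: compute the HMACSHA256 signature of a message."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, message: bytes, signature: str) -> bool:
    """Server side: recompute the signature and compare in constant time."""
    expected = sign(secret, message)
    return hmac.compare_digest(expected, signature)


secret = b"example-shared-secret"          # placeholder, not a real key
message = b"GET /some/object"              # placeholder message
signature = sign(secret, message)

print(verify(secret, message, signature))              # True
print(verify(secret, b"GET /other/object", signature))  # False: message changed
print(verify(b"wrong-secret", message, signature))      # False: wrong secret
```

The secret itself never travels with the request; only the message and its signature do.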

Amazon S3 Secret Access Key

The Amazon AWS console provides an Identity and Access Management (IAM) service that allows the creation of access keys.

In this example, the access key has an ID of AKIA5TRHZWO3QGECXEHC and a secret access key of jvIMMNOMEI7MoskR2rblLUZlInHE4FAYSZiyeiKK. The secret access key is the secret that is shared between Amazon and the user that will be used to sign our requests.

Authorization

In order to authenticate our request, a string is built that includes the HTTP method, the path to the file that is to be retrieved, the headers that will be signed (including the date and time of the request) and the hash of the request body. This string is called the canonical request.

GET
/020706ac4b5c16e8342686f8c7202755_2022-03-26.mxf

host:brians3test.s3.us-west-2.amazonaws.com
x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
x-amz-date:20220521T190125Z

host;x-amz-content-sha256;x-amz-date
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
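A canonical request with this layout can be assembled as follows. This is a minimal sketch rather than a complete implementation: build_canonical_request is a hypothetical helper, and it assumes the path is already URI-encoded and that there are no query parameters.

```python
import hashlib


def build_canonical_request(method, path, query, headers, payload_hash):
    """Assemble a canonical request; returns (request, signed_headers)."""
    # Header names are lowercased and sorted; values are trimmed.
    normalized = sorted((name.lower(), value.strip()) for name, value in headers.items())
    canonical_headers = "".join(f"{name}:{value}\n" for name, value in normalized)
    signed_headers = ";".join(name for name, _ in normalized)
    request = "\n".join(
        [method, path, query, canonical_headers, signed_headers, payload_hash]
    )
    return request, signed_headers


EMPTY_BODY_HASH = hashlib.sha256(b"").hexdigest()   # a GET request has no body

request, signed_headers = build_canonical_request(
    "GET",
    "/020706ac4b5c16e8342686f8c7202755_2022-03-26.mxf",
    "",                                              # no query parameters
    {
        "Host": "brians3test.s3.us-west-2.amazonaws.com",
        "x-amz-content-sha256": EMPTY_BODY_HASH,
        "x-amz-date": "20220521T190125Z",
    },
    EMPTY_BODY_HASH,
)

print(request)
# The hash of the canonical request feeds into the next step.
print(hashlib.sha256(request.encode("utf-8")).hexdigest())
```

The empty lines in the canonical request are significant: one stands for the empty query string and one separates the headers from the list of signed header names.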

The SHA256 hash of the canonical request is a0a9a5cf50df5cd0469e53924938796c9e118076cd9ca0bee9449cd43954e598, and this is used in the construction of a new string called the string to sign.

AWS4-HMAC-SHA256
20220521T190125Z
20220521/us-west-2/s3/aws4_request
a0a9a5cf50df5cd0469e53924938796c9e118076cd9ca0bee9449cd43954e598
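The four lines of the string to sign are the algorithm name, the request timestamp, a "scope" built from the date, region and service, and the hash of the canonical request. A small sketch, using a hypothetical helper named build_string_to_sign and the values from this example:

```python
def build_string_to_sign(amz_date, region, service, canonical_request_hash):
    """Join the four lines that make up the string to sign."""
    # The scope uses only the date part (YYYYMMDD) of the timestamp.
    scope = f"{amz_date[:8]}/{region}/{service}/aws4_request"
    return "\n".join(["AWS4-HMAC-SHA256", amz_date, scope, canonical_request_hash])


string_to_sign = build_string_to_sign(
    "20220521T190125Z",
    "us-west-2",
    "s3",
    "a0a9a5cf50df5cd0469e53924938796c9e118076cd9ca0bee9449cd43954e598",
)
print(string_to_sign)
```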

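The string to sign is the message that is signed using the secret access key. The key is not used directly: a series of chained HMACSHA256 calculations first mixes the secret access key with the date, the region, the service name and the fixed string aws4_request to derive a signing key, and that key is then used to sign the message. The resulting signature looks like 2849011c9030062b75621e63d3451dea3b098c46baa94b2b67aae22db793cc0d. This is used as the Signature parameter in the Authorization header shown earlier, which also transmits the access key ID.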
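The chain of HMACSHA256 calculations is short when written out. The sketch below follows the AWS Signature Version 4 key derivation; the secret used here is a made-up placeholder, so the printed signature will not match the one in this post.

```python
import hashlib
import hmac


def _hmac(key: bytes, message: str) -> bytes:
    """One HMACSHA256 step, returning the raw 32-byte result."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_access_key, date, region, service) -> bytes:
    """Chain the secret through the date, region, service and a fixed string."""
    k_date = _hmac(("AWS4" + secret_access_key).encode("utf-8"), date)
    k_region = _hmac(k_date, region)
    k_service = _hmac(k_region, service)
    return _hmac(k_service, "aws4_request")


def sign_string(secret_access_key, date, region, service, string_to_sign) -> str:
    """Sign the string to sign with the derived key; returns hexadecimal."""
    key = derive_signing_key(secret_access_key, date, region, service)
    return hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).hexdigest()


STRING_TO_SIGN = (
    "AWS4-HMAC-SHA256\n"
    "20220521T190125Z\n"
    "20220521/us-west-2/s3/aws4_request\n"
    "a0a9a5cf50df5cd0469e53924938796c9e118076cd9ca0bee9449cd43954e598"
)
PLACEHOLDER_SECRET = "EXAMPLE-SECRET-ACCESS-KEY"   # hypothetical, not a real key

signature = sign_string(PLACEHOLDER_SECRET, "20220521", "us-west-2", "s3", STRING_TO_SIGN)
print(signature)
```

Because the date, region and service are baked into the signing key, a signature produced for one day, region or service is of no use for another.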

Signed URLs

Sometimes the owner of an S3 bucket would like to send a time-limited link to another person so that they can download the data. Such a link is called a signed URL. A signed URL is calculated in an almost identical way to that described above, but instead of the signature being embedded in an HTTP header, it is transmitted as one of the query parameters.

The canonical request is altered to include the query parameters that are intended to be transmitted and to remove some of the HTTP headers that won’t be transmitted. One of the query parameters specified is X-Amz-Expires, which limits how long the link remains valid, in seconds (86400 in this example, i.e., 24 hours).

GET
/020706ac4b5c16e8342686f8c7202755_2022-03-26.mxf
X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5TRHZWO37NBK72XS%2F20220521%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20220521T190125Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host
host:brians3test.s3.us-west-2.amazonaws.com

host
UNSIGNED-PAYLOAD
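The query string line is built by URI-encoding each parameter name and value and sorting the parameters by name. Here is a sketch using Python's urllib.parse.quote; canonical_query is a hypothetical helper name.

```python
from urllib.parse import quote


def canonical_query(params: dict) -> str:
    """URI-encode names and values, sort by name, and join with '&'."""
    encoded = sorted(
        (quote(name, safe=""), quote(str(value), safe=""))
        for name, value in params.items()
    )
    return "&".join(f"{name}={value}" for name, value in encoded)


query = canonical_query({
    "X-Amz-SignedHeaders": "host",
    "X-Amz-Expires": 86400,
    "X-Amz-Date": "20220521T190125Z",
    "X-Amz-Credential": "AKIA5TRHZWO37NBK72XS/20220521/us-west-2/s3/aws4_request",
    "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
})

canonical_request = "\n".join([
    "GET",
    "/020706ac4b5c16e8342686f8c7202755_2022-03-26.mxf",
    query,
    "host:brians3test.s3.us-west-2.amazonaws.com\n",   # the only signed header
    "host",
    "UNSIGNED-PAYLOAD",
])
print(canonical_request)
```

Passing safe="" matters here: by default quote leaves "/" untouched, whereas the credential in the query string needs each "/" written as %2F.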

The SHA256 hash of this canonical request is used in the construction of the string to sign.

AWS4-HMAC-SHA256
20220521T190125Z
20220521/us-west-2/s3/aws4_request
93474784c7110bfb8c354e6cf2b7651a2f3e03c8be182b5f1808b213c9fa4ac0

After the series of HMACSHA256 calculations, a new signature is created (3b9fc74d0a507ff02fc730c18f826cedf22bb70231e25ebac0075b72af345f1f) and used in the signed URL.

https://brians3test.s3.us-west-2.amazonaws.com/020706ac4b5c16e8342686f8c7202755_2022-03-26.mxf?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIA5TRHZWO37NBK72XS%2F20220521%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Date=20220521T190125Z&X-Amz-Expires=86400&X-Amz-SignedHeaders=host&X-Amz-Signature=3b9fc74d0a507ff02fc730c18f826cedf22bb70231e25ebac0075b72af345f1f

This link could be emailed, or otherwise transmitted, to anyone the creator would like to give access to the referenced file.
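Putting the pieces together, the whole signed URL calculation fits in one short function. This is a sketch under stated assumptions: presign_get is a hypothetical name, the path is assumed to be already URI-encoded, and the secret is a placeholder, so the signature it prints differs from the one above. In practice an SDK call such as boto3's generate_presigned_url would normally be used instead.

```python
import hashlib
import hmac
from urllib.parse import quote


def _hmac(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def presign_get(host, path, access_key_id, secret_access_key, region, amz_date, expires):
    """Return a signed URL for a GET request to an S3 object."""
    date = amz_date[:8]
    scope = f"{date}/{region}/s3/aws4_request"
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key_id}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires),
        "X-Amz-SignedHeaders": "host",
    }
    query = "&".join(
        f"{quote(k, safe='')}={quote(v, safe='')}" for k, v in sorted(params.items())
    )
    canonical = "\n".join(["GET", path, query, f"host:{host}\n", "host", "UNSIGNED-PAYLOAD"])
    canonical_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    to_sign = "\n".join(["AWS4-HMAC-SHA256", amz_date, scope, canonical_hash])

    # Derive the signing key by chaining the secret through the scope parts.
    key = ("AWS4" + secret_access_key).encode("utf-8")
    for part in (date, region, "s3", "aws4_request"):
        key = _hmac(key, part)
    signature = hmac.new(key, to_sign.encode("utf-8"), hashlib.sha256).hexdigest()

    return f"https://{host}{path}?{query}&X-Amz-Signature={signature}"


url = presign_get(
    host="brians3test.s3.us-west-2.amazonaws.com",
    path="/020706ac4b5c16e8342686f8c7202755_2022-03-26.mxf",
    access_key_id="AKIA5TRHZWO37NBK72XS",
    secret_access_key="EXAMPLE-SECRET-ACCESS-KEY",   # hypothetical placeholder
    region="us-west-2",
    amz_date="20220521T190125Z",
    expires=86400,
)
print(url)
```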

Excel Demonstration

The linked Excel spreadsheet demonstrates the calculations detailed in this post. The spreadsheet contains Windows-only macros that implement the SHA256-based algorithms.

Demo Excel Spreadsheet

Licensed under CC BY-NC-SA 4.0