Messing Around with Boto3: S3 Clients and Resources

[ aws  python  automation  etl  wwe  ]

So you’ve pip-installed boto3 and want to connect to S3. Should you create an S3 resource or an S3 client?

Googling code examples, you will find both being used. In this post, let's look at the difference between these two basic approaches to interacting with your AWS assets from boto3 and show a few examples of each.

First and Foremost: What’s What?

In short, a Boto3 resource is a high-level abstraction, whereas a client is more granular.

From the documentation on resources, we find:

Resources represent an object-oriented interface to Amazon Web Services (AWS). They provide a higher-level abstraction than the raw, low-level calls made by service clients.

The docs on clients tell us:

Clients provide a low-level interface to AWS whose methods map close to 1:1 with service APIs. All service operations are supported by clients. Clients are generated from a JSON service definition file.

Create an S3 Resource and Client

import boto3
s3r = boto3.resource('s3')
s3c = boto3.client('s3')

If you play around with the resource_buckets list built below, you will see that each item is a Bucket object. The client_buckets list, on the other hand, contains plain dictionary representations of the S3 buckets.

Resource

resource_buckets = list(s3r.buckets.all())
for bucket in resource_buckets:
    print(bucket.name)

Client

client_buckets = s3c.list_buckets()['Buckets']
for item in client_buckets: 
    print(item['Name'])

Upload a file to a bucket

Resource

fileName = 'someFile.txt'
bucketName = 'some-bucket-name'
# Stream the local file into the bucket as an object keyed by the file name
with open(fileName, 'rb') as file:
    s3r.Bucket(bucketName).put_object(Key=fileName, Body=file)

Client

# upload_file is a managed uploader: it splits up large files
# automatically and uploads the parts in parallel.
fileName = 'someFile.txt'
bucketName = 'some-bucket-name'
s3c.upload_file(fileName, bucketName, fileName)

Read a file from a bucket

Client

# get_object returns a dict; the object's contents are in the 'Body' stream
csv_file = s3c.get_object(Bucket='some-bucket-name', Key='some-filepath-name.csv')
csv_str = csv_file['Body'].read().decode('utf-8')
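
Resource

The resource side does not have a get_object method, but grabbing an Object handle gets you the same result. A minimal sketch, assuming the same bucket and key as above:

# Build an Object handle, then call get() and read the 'Body' stream
obj = s3r.Object('some-bucket-name', 'some-filepath-name.csv')
csv_str = obj.get()['Body'].read().decode('utf-8')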

Looking Forward

Something I tinkered with today but could not get to work is direct access to S3 using Pandas. This is definitely something I want to come back to.
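
For the record, here is a minimal sketch of the approach I was trying, assuming the s3fs package is installed so that Pandas can resolve s3:// paths (the bucket and key below are placeholders):

import pandas as pd

# Pandas hands s3:// URLs off to s3fs, which must be installed separately
df = pd.read_csv('s3://some-bucket-name/some-filepath-name.csv')
print(df.head())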

This article looks promising:

Written on December 18, 2017