What is the best way to transfer large files to Amazon S3?

April 25, 2020 / Eternal Team

This post explains how to upload large amounts of data to Amazon Simple Storage Service (S3).

AWS offers several tools and services for transferring on-premises data to AWS:

  1. AWS Storage Gateway.
  2. AWS DataSync.
  3. AWS Direct Connect.
  4. AWS Snowball Family.
  5. Amazon S3 Transfer Acceleration.
  6. AWS CLI.

This post focuses on two of them: S3 Transfer Acceleration and the AWS CLI.

But first, let's look at what a multipart upload is.

S3 supports multipart uploads for large files. For example, using this feature you can break a 5 GB upload into as many as 1,024 separate parts of 5 MB each and upload each one independently. Every part except the last must be at least 5 MB, and a single upload can have at most 10,000 parts. If the upload of a part fails, it can be restarted without affecting any of the other parts. Once all the parts are uploaded, you make one more call asking S3 to assemble them into the full object.
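
To make the part arithmetic concrete, here is a minimal shell sketch. It assumes binary units (1 MB = 1,048,576 bytes) and the S3 limit of 10,000 parts per upload. The function names part_count and min_part_size_mib are made up for this illustration and are not part of any AWS tool.

```shell
#!/bin/sh
# Number of parts needed to upload SIZE bytes in PART-byte chunks.
# Rounds up, because the last part may be smaller than the others.
part_count() {
  size=$1
  part=$2
  echo $(( (size + part - 1) / part ))
}

# Smallest whole-MiB part size that keeps a SIZE-byte upload within
# 10,000 parts, never going below the 5 MiB minimum part size.
min_part_size_mib() {
  size=$1
  mib=$((1024 * 1024))
  needed=$(( (size + 10000 * mib - 1) / (10000 * mib) ))
  if [ "$needed" -lt 5 ]; then
    needed=5
  fi
  echo "$needed"
}

GIB=$((1024 * 1024 * 1024))
MIB=$((1024 * 1024))

part_count $((5 * GIB)) $((5 * MIB))   # prints 1024
min_part_size_mib $((150 * GIB))       # prints 16
```

The second call shows why a very large file cannot use 5 MB parts: a 150 GB file would need more than 10,000 of them, so the part size has to grow.
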
Consider the following options for improving the performance of uploads and optimizing multipart uploads:

  • Enable Amazon S3 Transfer Acceleration.
  • Use the AWS CLI.

1) Enable Amazon S3 Transfer Acceleration

Amazon S3 Transfer Acceleration can provide fast and secure transfers over long distances between your client and Amazon S3. Transfer Acceleration uses Amazon CloudFront’s globally distributed edge locations.

Transfer Acceleration has additional charges, so be sure to review pricing.

If you want to see the transfer speeds for your use case, review the Amazon S3 Transfer Acceleration Speed Comparison tool.

How to use

There are several ways to enable Transfer Acceleration. The link below covers them, so you can pick the one that fits your requirements.

https://docs.aws.amazon.com/AmazonS3/latest/dev/transfer-acceleration-examples.html#transfer-acceleration-examples-console
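
As one possible route, Transfer Acceleration can be switched on from the AWS CLI. The sketch below assumes the CLI is already installed and configured with credentials that are allowed to change the bucket's settings. The function names are hypothetical wrappers, and the bucket name in the usage comment is the one from the upload example later in this post.

```shell
#!/bin/sh
# Turn on Transfer Acceleration for a bucket, then print the new status.
# Usage: enable_transfer_acceleration <bucket-name>
enable_transfer_acceleration() {
  bucket=$1
  aws s3api put-bucket-accelerate-configuration \
    --bucket "$bucket" \
    --accelerate-configuration Status=Enabled
  aws s3api get-bucket-accelerate-configuration --bucket "$bucket"
}

# Make `aws s3` commands in the default profile use the accelerate endpoint.
use_accelerate_endpoint() {
  aws configure set default.s3.use_accelerate_endpoint true
}

# Example:
#   enable_transfer_acceleration systut-data-test
#   use_accelerate_endpoint
```

Enabling acceleration on the bucket is not enough on its own: the client also has to send requests to the accelerate endpoint, which is what the second function sets up.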

Note: Transfer Acceleration does not support cross-Region copies using CopyObject.

2) Using the AWS CLI

The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.

You can install AWS CLI for any major operating system: macOS, Linux, or Windows.

You can also tune how the AWS CLI transfers data to Amazon S3, for example with the max_concurrent_requests and multipart_chunksize settings.
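
A minimal sketch of such tuning, to be run once the CLI is installed and configured as described in the steps that follow. The values are illustrative rather than recommendations (the CLI defaults are 10 concurrent requests and 8 MB for both the multipart threshold and chunk size), and the function name tune_s3_transfers is hypothetical.

```shell
#!/bin/sh
# Apply S3 transfer settings to the default AWS CLI profile.
# The settings are written to ~/.aws/config and affect later `aws s3` commands.
tune_s3_transfers() {
  # How many requests may be in flight at the same time.
  aws configure set default.s3.max_concurrent_requests 20
  # Files larger than this are uploaded as multipart uploads.
  aws configure set default.s3.multipart_threshold 64MB
  # Size of each part in a multipart upload.
  aws configure set default.s3.multipart_chunksize 16MB
}

# Example:
#   tune_s3_transfers
```

Higher concurrency and larger parts can help on fast links, at the cost of more memory and bandwidth on the client, so it is worth testing values against your own connection.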

How to use

The steps below assume a Linux operating system.

$ pip install awscli

Get your access keys

1) Open the IAM console.

2) Go to Users.

3) Click on your user name.

4) Go to the Security credentials tab.

5) Click Create access key.

6) You’ll see your access key ID. Click “Show” to reveal your secret access key, then download it and keep it safe.

Once the AWS CLI is installed, open a terminal and run the following commands.

  1. First, run aws configure to set up your account and press Enter (this is a one-time process).
  2. It will prompt for an AWS access key ID, secret access key, default region name, and default output format. Enter each value and press Enter.
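
Before starting a long upload, it can be worth checking that the configuration took effect. The sketch below uses two standard CLI commands; the wrapper name check_aws_setup is hypothetical.

```shell
#!/bin/sh
# Show the active profile, region, and (masked) keys, then ask AWS
# which identity the configured credentials belong to.
check_aws_setup() {
  aws configure list &&
  aws sts get-caller-identity
}

# Example:
#   check_aws_setup
```

If the second command returns an error instead of an account and user ARN, the access key or secret access key was most likely entered incorrectly.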

Uploading large files

Here, assume we are uploading a large 150 GB data file to s3://systut-data-test/store_dir/ (that is, the directory store_dir under the bucket systut-data-test), and that the bucket and directory already exist on S3.

The command is:

$ aws s3 cp ./150GB.data s3://systut-data-test/store_dir/

Once the upload starts, the CLI prints a progress message like this:
Completed 1 part(s) with … file(s) remaining

As the upload nears the end, the progress message looks like this:
Completed 9896 of 9896 part(s) with 1 file(s) remaining

After the file is uploaded successfully, the CLI prints a message like this:
upload: ./150GB.data to s3://systut-data-test/store_dir/150GB.data
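
To confirm that the object arrived, you can list it afterwards. This is a small sketch using the standard aws s3 ls command; the wrapper name verify_upload is hypothetical, and the path in the usage comment is the one from the example above.

```shell
#!/bin/sh
# List an uploaded object with a readable size and a summary line.
# Usage: verify_upload <s3-uri>
verify_upload() {
  aws s3 ls "$1" --human-readable --summarize
}

# Example:
#   verify_upload s3://systut-data-test/store_dir/150GB.data
```

The reported size should match the size of the local file.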

The AWS CLI can do much more. For the full documentation, see the AWS CLI Command Reference.
