Amazon S3 versus Amazon Glacier


As one of the leading technology companies, Amazon keeps expanding the services it offers its customers. You probably already know its storage service, Amazon S3. But there is one more storage service worth knowing: Amazon Glacier.

Both storage services handle data in a secure environment, and Amazon Glacier is in fact offered as one of the Amazon S3 storage classes. In this blog, we will compare the two services and look at how they differ. So, keep up with the blog because there is so much to learn!


What is Amazon S3?

  • Amazon S3 provides management features that help you organize your data and configure access controls to meet your specific business, organizational, and compliance requirements.
  • It is designed for durability and stores data for millions of applications used by companies worldwide.
  • You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web, for example through the AWS Management Console.
  • Amazon S3 stores data as objects within buckets.
    • An object consists of a file and metadata that describes that file.
    • To store an object in Amazon S3, you upload the file to a bucket. After uploading, you can set permissions on the object and its metadata.
    • Buckets are the containers for objects.
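
The object and bucket model described above can be sketched in a few lines of Python. This is only an illustration: the `S3Object` class, the bucket name, and the key are hypothetical, and the dictionary simply mirrors the keyword arguments that boto3's `put_object` call accepts.

```python
from dataclasses import dataclass, field


@dataclass
class S3Object:
    """An object as described above: a file plus metadata about that file."""
    key: str                      # the object's name inside its bucket
    body: bytes                   # the file contents
    metadata: dict = field(default_factory=dict)


def put_object_params(bucket: str, obj: S3Object) -> dict:
    """Build keyword arguments in the shape boto3's put_object accepts."""
    return {
        "Bucket": bucket,         # the container for the object
        "Key": obj.key,
        "Body": obj.body,
        "Metadata": obj.metadata,
    }


# Hypothetical object and bucket names, for illustration only.
report = S3Object("reports/2021/q1.csv", b"id,total\n1,42\n", {"department": "finance"})
params = put_object_params("example-bucket", report)

# With boto3 installed and credentials configured, the upload would be:
#   boto3.client("s3").put_object(**params)
```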

What is Amazon Glacier?

  • Amazon Glacier is a cost-efficient storage service that provides safe and durable storage for data archiving and backup.
  • It is optimized for data that is accessed infrequently and for which retrieval times of several hours are acceptable.
  • Customers can use Amazon Glacier to reliably store large or small amounts of data. It is designed to deliver high durability and comprehensive security, including compliance capabilities.
  • It gives customers the option to store data for as little as $1 per terabyte per month. To keep costs low for differing retrieval needs, Amazon S3 Glacier provides three options for access to archives. S3 Glacier Deep Archive provides two access options, ranging from 12 to 48 hours.
  • S3 Glacier lets customers offload the administrative burdens of operating and scaling storage to AWS. They do not have to worry about capacity planning, hardware provisioning, data replication, hardware failure detection and recovery, or time-consuming hardware migrations.

So far we have covered a brief overview of both Amazon S3 and Glacier. To make the differences clearer, we will now look at the features that set the two services apart.

What are the Features of Amazon S3?

Amazon S3 comes with features that help in organizing and managing data and in meeting compliance requirements.

1. Storage management and monitoring
  • Amazon S3 helps customers organize their data in ways that are valuable to their businesses and teams. All objects are stored in S3 buckets and can be organized with shared names known as prefixes.
  • You can also attach up to 10 key-value pairs, known as S3 object tags, to each object.
  • S3 Batch Operations makes it simple to manage your data in Amazon S3 at any scale.
  • You can also use S3 Batch Operations to run AWS Lambda functions over your objects and execute custom business logic, such as processing data or transcoding image files.
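
As a rough sketch of the tagging limit mentioned above, the helper below checks the 10-tag ceiling and produces a tag set in the shape boto3's `put_object_tagging` call expects for its `Tagging` argument. The function name and the tag values are made up for illustration.

```python
MAX_TAGS_PER_OBJECT = 10  # limit stated in the text above


def build_tag_set(tags: dict) -> dict:
    """Turn a plain dict into an S3 tag set, refusing more than 10 tags."""
    if len(tags) > MAX_TAGS_PER_OBJECT:
        raise ValueError(
            f"an object can carry at most {MAX_TAGS_PER_OBJECT} tags, got {len(tags)}"
        )
    return {"TagSet": [{"Key": key, "Value": value} for key, value in tags.items()]}


# Hypothetical tags, for illustration only.
tagging = build_tag_set({"project": "blog", "classification": "internal"})

# With boto3 available, the tags could then be applied with:
#   boto3.client("s3").put_object_tagging(
#       Bucket="example-bucket", Key="reports/2021/q1.csv", Tagging=tagging)
```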
2. Storage Analytics and Insights

S3 Storage Lens

  • S3 Storage Lens delivers organization-wide visibility into object storage usage and activity trends. It also makes actionable recommendations for improving cost-efficiency and applying data protection best practices.

S3 Storage Class Analysis

  • Amazon S3 Storage Class Analysis analyzes storage access patterns to help you decide when to transition the right data to the right storage class. It observes data access patterns so that you can determine when to move less frequently accessed storage to a lower-cost storage class.
3. Storage classes
  • Each S3 storage class supports a specific data access level at a corresponding cost or geographic location. You can store mission-critical production data in S3 Standard for frequent access, save costs by storing infrequently accessed data in S3 Standard-IA or S3 One Zone-IA, and archive data at the lowest costs in the archival storage classes S3 Glacier and S3 Glacier Deep Archive.
  • You can store data with changing or unknown access patterns in S3 Intelligent-Tiering, which automatically moves your data between two low-latency access tiers, one optimized for frequent access and one for infrequent access, as access patterns change.
4. Access management and security

Access management

To keep data safe in Amazon S3, by default users can access only the S3 resources they create. However, you can grant access to other users by using one or a combination of the following access management features:

  • AWS Identity and Access Management (IAM) for creating users and managing their respective access
  • Access Control Lists (ACLs) for making individual objects accessible to authorized users
  • Bucket policies for configuring permissions for all objects within a single S3 bucket
  • Query String Authentication for granting time-limited access to others with temporary URLs
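
To make the bucket policy option more concrete, here is a minimal sketch that builds a read-only policy document for every object in one bucket. The bucket name, account ID, and user name are placeholders, and the helper function is hypothetical. Only the JSON layout follows the standard IAM policy format.

```python
import json


def read_only_bucket_policy(bucket: str, principal_arn: str) -> str:
    """Return a bucket policy (as JSON) that lets one principal read all objects."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadForOnePrincipal",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                # "/*" applies the statement to every object in the bucket
                "Resource": f"arn:aws:s3:::{bucket}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder bucket and IAM user, for illustration only.
policy_json = read_only_bucket_policy(
    "example-bucket", "arn:aws:iam::123456789012:user/example-user"
)

# With boto3 available, the policy could be attached with:
#   boto3.client("s3").put_bucket_policy(Bucket="example-bucket", Policy=policy_json)
```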

Security

  • Amazon S3 offers flexible security features for blocking unauthorized users from accessing your data. It supports S3 Block Public Access, a security control that ensures S3 buckets and objects do not have public access. With a few clicks in the Amazon S3 Management Console, you can apply the S3 Block Public Access settings to all buckets within your AWS account or to specific S3 buckets.
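
The same Block Public Access settings can also be applied programmatically. The sketch below only builds the configuration with all four switches turned on. The helper name and the bucket name in the comment are assumptions for illustration.

```python
def block_public_access_settings() -> dict:
    """Configuration with all four S3 Block Public Access switches enabled."""
    return {
        "BlockPublicAcls": True,        # reject new public ACLs
        "IgnorePublicAcls": True,       # ignore public ACLs that already exist
        "BlockPublicPolicy": True,      # reject bucket policies that grant public access
        "RestrictPublicBuckets": True,  # restrict access to buckets with public policies
    }


settings = block_public_access_settings()

# With boto3 available, the settings could be applied to one bucket with:
#   boto3.client("s3").put_public_access_block(
#       Bucket="example-bucket", PublicAccessBlockConfiguration=settings)
```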
5. Data processing

S3 Object Lambda

  • S3 Object Lambda allows you to add your own code to S3 GET requests to modify and process data as it is returned to an application. It uses AWS Lambda functions to automatically process the output of a standard S3 GET request. With just a few clicks in the AWS Management Console, you can configure a Lambda function and attach it to an S3 Object Lambda Access Point. From that point forward, S3 automatically calls your Lambda function to process any data retrieved through the S3 Object Lambda Access Point and returns the transformed result to the application.
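
The flow above can be sketched as a small handler. This is a simplified illustration, not a production function: the upper-casing transformation, the `make_handler` wrapper, and the `RecordingClient` stand-in are all hypothetical and exist only so the sketch runs without AWS. In a real function, `fetch` would download the original object from the presigned URL in the event, and the client would be boto3's S3 client, which provides `write_get_object_response`.

```python
def transform(body: bytes) -> bytes:
    """Hypothetical transformation: upper-case the object's bytes."""
    return body.upper()


def make_handler(fetch, s3_client):
    """Build a Lambda-style handler from a download function and an S3 client."""
    def handler(event, context):
        ctx = event["getObjectContext"]
        original = fetch(ctx["inputS3Url"])       # download the original object
        s3_client.write_get_object_response(      # hand the result back to S3
            Body=transform(original),
            RequestRoute=ctx["outputRoute"],
            RequestToken=ctx["outputToken"],
        )
        return {"status_code": 200}
    return handler


class RecordingClient:
    """Local stand-in for the S3 client so the sketch runs without AWS."""
    def __init__(self):
        self.responses = []

    def write_get_object_response(self, **kwargs):
        self.responses.append(kwargs)


client = RecordingClient()
handler = make_handler(lambda url: b"hello from s3", client)
event = {
    "getObjectContext": {
        "inputS3Url": "https://example.invalid/original",
        "outputRoute": "route-1",
        "outputToken": "token-1",
    }
}
result = handler(event, None)
```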
6. Query in place

Amazon S3 has a built-in feature that queries data without needing to copy and load it into a separate analytics platform or data warehouse. That is to say, you can run big data analytics directly on your data stored in Amazon S3. Amazon S3 is also compatible with the following AWS analytics services:

  • Amazon Athena, which queries data in Amazon S3 without extracting and loading it into a separate service or platform.
  • Amazon Redshift Spectrum, which runs SQL queries directly against data at rest in Amazon S3 and is well suited to complex queries and large data sets.
7. Data transfer

AWS provides a portfolio of data transfer services so that there is a suitable solution for any data migration project. It offers:

  • Hybrid cloud storage. AWS Storage Gateway is a hybrid cloud storage service that lets you connect and extend your on-premises applications to AWS Storage.
  • Online data transfer. AWS DataSync efficiently transfers hundreds of terabytes and millions of files into Amazon S3, and it automates or removes many of the manual tasks involved.
  • Offline data transfer. The AWS Snow Family is built for use in edge locations where network capacity is constrained or nonexistent, and it provides storage and computing capabilities even in harsh environments.
8. Performance
  • Amazon S3 provides industry-leading performance for cloud object storage. It supports parallel requests, so you can scale S3 performance by the factor of your compute cluster.
  • Performance scales per prefix. That is to say, you can use as many prefixes as you need in parallel to achieve the required throughput. Amazon S3 supports at least 3,500 requests per second for adding data and 5,500 requests per second for retrieving data per prefix.
  • You do not need to randomize object prefixes to achieve this request rate.
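
Because the request rates above apply per prefix, a quick back-of-the-envelope calculation shows how throughput grows with parallel prefixes. This sketch assumes requests are spread evenly across prefixes, which real workloads may not achieve.

```python
WRITES_PER_SECOND_PER_PREFIX = 3_500  # figure quoted above for adding data
READS_PER_SECOND_PER_PREFIX = 5_500   # figure quoted above for retrieving data


def aggregate_request_rate(prefixes: int) -> dict:
    """Minimum aggregate request rates when load is spread evenly over prefixes."""
    return {
        "writes_per_second": prefixes * WRITES_PER_SECOND_PER_PREFIX,
        "reads_per_second": prefixes * READS_PER_SECOND_PER_PREFIX,
    }


rates = aggregate_request_rate(10)
print(rates)  # {'writes_per_second': 35000, 'reads_per_second': 55000}
```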

What are the key features of Amazon Glacier?

The features include:

1. Data Retrieval
  • Amazon S3 Glacier offers three retrieval options for archives to meet varying access time and cost requirements:
    • Expedited
    • Standard
    • Bulk retrievals
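
For objects stored in the S3 Glacier storage class, the retrieval tier is chosen when a restore is requested. The helper below is a small sketch that builds the `RestoreRequest` structure used by boto3's `restore_object` call. The function name, the bucket and key in the comment, and the seven-day availability window are illustrative assumptions.

```python
RETRIEVAL_TIERS = ("Expedited", "Standard", "Bulk")  # the three options listed above


def restore_request(tier: str, days_available: int) -> dict:
    """Build a restore request that keeps the restored copy for a number of days."""
    if tier not in RETRIEVAL_TIERS:
        raise ValueError(f"unknown retrieval tier: {tier!r}")
    return {
        "Days": days_available,
        "GlacierJobParameters": {"Tier": tier},
    }


request = restore_request("Bulk", 7)

# With boto3 available, the restore could be started with:
#   boto3.client("s3").restore_object(
#       Bucket="example-bucket", Key="archive/2019/logs.tar", RestoreRequest=request)
```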
2. Amazon S3 Glacier Select
  • Amazon S3 Glacier Select runs queries directly on the stored data without any need to retrieve the entire archive. It helps lower costs and uncover more insights from your archive data.
3. AWS Snowball and Direct Connect integration
  • AWS Snowball accelerates the movement of large amounts of data into and out of AWS by using portable storage devices for transport. AWS Direct Connect, on the other hand, makes it easy to establish a high-bandwidth, dedicated network connection from your premises to AWS, which you can use to transfer business-critical data directly from your data center into AWS.
4. Vault Lock
  • Vault Lock uses a lockable policy that helps you deploy and enforce compliance controls on individual S3 Glacier vaults. You can define controls like “Write Once Read Many” (WORM) and lock the policy from future edits. Once locked, the policy becomes unchangeable, and Amazon S3 Glacier enforces the prescribed controls to help achieve your compliance objectives.
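
As an illustration of what such a WORM-style control might look like, the sketch below builds a Vault Lock policy that denies archive deletion until an archive reaches a given age. The vault ARN and the one-year retention period are placeholders, the helper is hypothetical, and the policy should be treated as a starting point to check against the S3 Glacier documentation rather than a verified compliance control.

```python
import json


def worm_vault_lock_policy(vault_arn: str, retention_days: int) -> str:
    """Return a policy (as JSON) that denies deleting archives younger than the retention period."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "deny-delete-before-retention",
                "Principal": "*",
                "Effect": "Deny",
                "Action": "glacier:DeleteArchive",
                "Resource": [vault_arn],
                "Condition": {
                    "NumericLessThan": {
                        "glacier:ArchiveAgeInDays": str(retention_days)
                    }
                },
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder vault ARN, for illustration only.
lock_policy = worm_vault_lock_policy(
    "arn:aws:glacier:us-west-2:123456789012:vaults/examplevault", 365
)
```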
5. Access control
  • Amazon S3 Glacier uses AWS Identity and Access Management (IAM) to help you securely control access to AWS and your Amazon S3 Glacier data. With it, you can:
    • Create users in IAM
    • Assign individual security credentials and apply IAM policies to each Amazon S3 Glacier vault so that only the intended users can perform the permitted activities
6. Tagging support
  • Tags are labels that you define and associate with your vaults. Using tags adds filtering capabilities to operations like AWS cost reports.
7. Audit logs
  • Amazon S3 Glacier supports audit logging with AWS CloudTrail, which records Amazon S3 Glacier API calls for your account and delivers the log files to you. These log files provide insight into the actions performed on your Amazon S3 Glacier assets.
8. Vault access policies
  • Vault access policies manage access to your individual S3 Glacier vaults. You can define an access policy directly on a vault to grant vault access to users and business groups internal to your organization.
9. AWS software development kits (SDKs)
  • AWS SDKs are used for uploading and retrieving data in Amazon Glacier. Amazon S3 Glacier is supported by the AWS SDKs for Java, .NET, PHP, and Python (Boto). These SDKs provide libraries that wrap the underlying REST API and help in constructing requests and processing responses.
10. Protecting your data

By default, data stored in Amazon S3 Glacier is protected and can only be accessed by vault owners. Moreover, it encrypts your data at rest by default and supports secure data transit with SSL. Amazon S3 Glacier’s data protection features help protect your data from:

  • Logical and physical failures
  • Data loss from unintended user actions
  • Application errors
  • Infrastructure breakdown
11. Data durability and reliability
  • Amazon S3 Glacier provides a highly durable storage infrastructure built for long-term data archival. It is designed to provide average annual durability of 99.999999999% for an archive. The service redundantly stores data in multiple AWS Availability Zones (AZ) and on multiple devices within each AZ.
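
To get a feel for what 99.999999999% durability means, the following sketch computes the expected number of archives lost per year. It is a simplified reading that treats the figure as an independent annual survival probability for each archive. The 10 million archive count is just an example.

```python
from fractions import Fraction

# 99.999999999% expressed exactly, to avoid floating-point rounding
ANNUAL_DURABILITY = Fraction(99_999_999_999, 100_000_000_000)


def expected_annual_losses(archives: int) -> Fraction:
    """Expected number of archives lost in one year under the simplified model."""
    return archives * (1 - ANNUAL_DURABILITY)


losses = expected_annual_losses(10_000_000)
years_per_single_loss = 1 / losses

print(losses)                 # 1/10000
print(years_per_single_loss)  # 10000
```

Under these assumptions, storing 10 million archives would mean losing a single archive about once every 10,000 years on average.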
Amazon S3 Key Features | Amazon Glacier Key Features
Storage management and monitoring | Data Retrieval
Storage Analytics and Insights | Amazon S3 Glacier Select
Storage classes | AWS Snowball and Direct Connect integration
Access management and security | Vault Lock
Data processing | Access control
Query in place | Tagging support
Data transfer | Audit logs
Performance | Vault access policies
 | AWS software development kits (SDKs)
 | Protecting your data
 | Data durability and reliability

Moving on, the next sections cover the main use cases for Amazon S3 and Glacier.

Amazon Use cases


Amazon S3 Use Cases

1. Backup and restore

Use Amazon S3 and other AWS services, such as S3 Glacier, Amazon EFS, and Amazon EBS, to build scalable, durable, and secure backup and restore solutions and to augment or replace existing on-premises capabilities. AWS and APN partners can help you meet Recovery Time Objectives (RTO), Recovery Point Objectives (RPO), and compliance requirements. With AWS, you can back up data that is already in the AWS Cloud, or use AWS Storage Gateway, a hybrid storage service, to send backups of on-premises data to AWS.

2. Disaster recovery (DR)

Protect critical data, applications, and IT systems that are running in the AWS Cloud or in your on-premises environment without incurring the expense of a second physical site. With Amazon S3 storage, S3 Cross-Region Replication, and other AWS compute, networking, and database services, you can create DR architectures that recover quickly from outages caused by natural disasters, system failures, and human errors.

3. Archive

Archive data using S3 Glacier and S3 Glacier Deep Archive. These S3 storage classes retain objects long-term at the lowest rates. Simply create an S3 Lifecycle policy to archive objects throughout their lifecycles, or upload objects directly to the archival storage classes.

With S3 Object Lock, you can apply retention dates to objects to protect them from deletion and meet compliance requirements. S3 Glacier can restore archived objects in 1-5 minutes for expedited retrievals and in 3-5 hours for standard retrievals.
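
An S3 Lifecycle policy of the kind mentioned above could be sketched as follows. The rule ID, the `logs/` prefix, and the 90-day and 365-day thresholds are arbitrary examples, and the helper is hypothetical. The dictionary follows the layout that boto3's `put_bucket_lifecycle_configuration` call accepts.

```python
def archive_lifecycle(prefix: str, glacier_after_days: int, deep_archive_after_days: int) -> dict:
    """Lifecycle configuration that moves objects to S3 Glacier, then to Deep Archive."""
    if deep_archive_after_days <= glacier_after_days:
        raise ValueError("the Deep Archive transition must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


lifecycle = archive_lifecycle("logs/", 90, 365)

# With boto3 available, the policy could be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=lifecycle)
```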

4. Data lakes and big data analytics

Accelerate innovation by building a data lake on Amazon S3 and extracting valuable insights with query-in-place, analytics, and machine learning tools. Use S3 Access Points to configure access to your data, with specific permissions for each application or set of applications. You can also use AWS Lake Formation to create a data lake and to centrally define and enforce security, governance, and auditing policies.

The service gathers data across your databases and S3 resources, moves it into a new data lake in Amazon S3, and cleans it using machine learning algorithms.

5. Hybrid cloud storage

Set up private connectivity between Amazon S3 and your on-premises environment with AWS PrivateLink. You can provision private endpoints in a VPC to allow direct access to S3 from on-premises using private IPs from your VPC. AWS Storage Gateway allows you to connect and extend your on-premises applications to AWS Storage while caching data locally for low-latency access. You can also automate data transfers between on-premises storage, including S3 on Outposts, and Amazon S3 by using AWS DataSync.

Further, by using AWS Transfer Family, you can directly transfer files in and out of Amazon S3. This is a fully managed, simple, and seamless service that enables secure file exchanges with third parties using SFTP, FTPS, and FTP.

6. Cloud-native applications

Use AWS services and Amazon S3 to develop fast, cost-effective mobile and internet-based applications. Amazon S3 can store the development and production data shared by the microservices that make up cloud-native applications. You can upload any amount of data and access it from anywhere, which helps you deploy applications faster and reach more end users. Storing data in Amazon S3 also gives you access to the latest AWS developer tools, the S3 API, and services for machine learning and analytics to innovate and optimize your cloud-native applications.


Amazon Glacier Use Cases

1. Media Asset Workflow

Media assets such as video and news footage require durable storage. The Amazon S3 Glacier and S3 Glacier Deep Archive storage classes let you archive older media content affordably and then move it to Amazon S3 for distribution when needed.

2. Healthcare Information Archiving

To meet regulatory requirements, hospital systems need to retain petabytes of patient records for decades. The Amazon S3 Glacier and S3 Glacier Deep Archive storage classes let them archive patient record data reliably and securely at a very low cost.

3. Regulatory and Compliance Archiving

Financial services and healthcare organizations must keep regulatory and compliance archives for extended durations. To help meet objectives like SEC Rule 17a-4(f), Amazon S3 Object Lock lets you set compliance controls.

4. Scientific Data Storage

Research organizations generate, analyze, and archive vast amounts of data. With the Amazon S3 Glacier and S3 Glacier Deep Archive storage classes, they can avoid the complexities of hardware and facility management and capacity planning.

5. Digital Preservation

Libraries and government agencies face data-integrity challenges in their digital preservation efforts. To address this, Amazon S3 regularly performs systematic data integrity checks and is built to be automatically self-healing.

6. Magnetic Tape Replacement

On-premises or offsite tape libraries can lower storage costs, but they also require large upfront investments and expert-level maintenance. With the Amazon S3 Glacier and S3 Glacier Deep Archive storage classes, there are no upfront costs, and the cost and burden of maintenance are removed as well.

Amazon S3 Use Cases | Amazon Glacier Use Cases
Backup and Restore | Media Asset Workflow
Disaster Recovery (DR) | Healthcare Information Archiving
Archive | Regulatory and Compliance Archiving
Data lakes and big data analytics | Scientific Data Storage
Hybrid cloud storage | Digital Preservation
Cloud-native applications | Magnetic Tape Replacement

Amazon Pricing Comparison

Amazon S3 Pricing

Amazon's pricing policy is that you pay only for what you use, with no minimum fee. For Amazon S3, there are six cost components to consider when storing and managing your data:

1. Storage Pricing 

You pay for storing objects in your S3 buckets. The charges depend on your objects’ size, how long you store the objects during the month, and the storage class. In addition, you pay a monthly monitoring and automation fee per object stored in the S3 Intelligent-Tiering storage class, which covers monitoring access patterns and moving objects between access tiers.

S3 Standard – General purpose storage for any type of data, typically used for frequently accessed data
First 50 TB / Month | $0.023 per GB
Next 450 TB / Month | $0.022 per GB
Over 500 TB / Month | $0.021 per GB
Source: AWS
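
To show how the tiered S3 Standard prices above combine, here is a small worked calculation. It is a sketch that assumes 1 TB equals 1,024 GB and uses only the three price tiers from the table. Actual bills depend on Region and on the other cost components.

```python
from decimal import Decimal

GB_PER_TB = 1024  # assumption: binary terabytes

# (tier size in TB, price per GB-month); None marks the open-ended last tier
S3_STANDARD_TIERS = [
    (50, Decimal("0.023")),
    (450, Decimal("0.022")),
    (None, Decimal("0.021")),
]


def monthly_storage_cost(stored_gb: int) -> Decimal:
    """Monthly S3 Standard storage cost in USD for a given number of GB."""
    remaining = Decimal(stored_gb)
    total = Decimal("0")
    for size_tb, price_per_gb in S3_STANDARD_TIERS:
        if remaining <= 0:
            break
        if size_tb is None:
            tier_gb = remaining
        else:
            tier_gb = min(remaining, Decimal(size_tb * GB_PER_TB))
        total += tier_gb * price_per_gb
        remaining -= tier_gb
    return total


small = monthly_storage_cost(100)             # 100 GB, all in the first tier
large = monthly_storage_cost(60 * GB_PER_TB)  # 60 TB, spans the first two tiers
print(small)  # 2.300
print(large)  # 1402.880
```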
2. Requests and Data Retrieval Pricing

You pay for requests made against your S3 buckets and objects. S3 request costs depend on the request type and are charged by the number of requests. When you use the Amazon S3 console to browse your storage, you incur charges for GET, LIST, and other requests made to facilitate browsing.

3. Data Transfer Pricing

You pay for all bandwidth into and out of Amazon S3, except for the following:

  • Data transferred in from the internet
  • Data transferred between S3 buckets in the same AWS Region
  • Data transferred from an Amazon S3 bucket to any AWS service(s) within the same AWS Region as the S3 bucket
  • Data transferred out to Amazon CloudFront (CloudFront)
4. Data Management and Analytics Pricing

You pay for the storage management features and analytics that are enabled on your account’s buckets. S3 storage management and analytics are priced per feature, as shown in the table below.

S3 Inventory | $0.0025 per million objects listed
S3 Analytics Storage Class Analysis | $0.10 per million objects monitored per month
S3 Object Tagging | $0.01 per 10,000 tags per month
Source: AWS
5. S3 Replication pricing

For S3 Replication, you pay the S3 charges for:

  • Storage in the selected destination S3 storage classes
  • Storage for the primary copy
  • Replication PUT requests
  • Applicable infrequent access storage retrieval fees
    • For Cross-Region Replication (CRR), you also pay for inter-region Data Transfer OUT from S3 to each destination Region.
6. S3 Object Lambda pricing

When you use S3 Object Lambda, your S3 GET request invokes an AWS Lambda function that you define. This function processes your data and returns a processed object to your application. In the US East Region, you pay $0.0000167 per GB-second for the duration of your AWS Lambda function and $0.20 per 1M AWS Lambda requests. You also pay $0.0004 per 1,000 requests for all S3 GET requests invoked by your Lambda function, and a $0.005 per-GB fee for the data S3 Object Lambda returns to your application. S3 request and Lambda prices depend on the AWS Region, and on the duration and memory allocated to your Lambda function.
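
The four S3 Object Lambda charges above can be combined into a rough estimate. The sketch below uses the US East prices quoted in the text and assumes, for simplicity, that each Lambda invocation makes exactly one S3 GET request. The workload figures are invented for the example.

```python
from decimal import Decimal

PRICE_PER_GB_SECOND = Decimal("0.0000167")      # Lambda compute duration
PRICE_PER_MILLION_INVOCATIONS = Decimal("0.20")  # Lambda requests
PRICE_PER_1000_S3_GETS = Decimal("0.0004")       # S3 GET requests made by the function
PRICE_PER_GB_RETURNED = Decimal("0.005")         # data returned by S3 Object Lambda


def object_lambda_cost(requests, seconds_per_request, memory_gb, gb_returned) -> Decimal:
    """Estimated cost in USD, assuming one S3 GET per Lambda invocation."""
    requests = Decimal(str(requests))
    compute = requests * Decimal(str(seconds_per_request)) * Decimal(str(memory_gb)) * PRICE_PER_GB_SECOND
    invocations = requests / 1_000_000 * PRICE_PER_MILLION_INVOCATIONS
    s3_gets = requests / 1_000 * PRICE_PER_1000_S3_GETS
    returned = Decimal(str(gb_returned)) * PRICE_PER_GB_RETURNED
    return compute + invocations + s3_gets + returned


# Example workload: 1M requests, 0.1 s each at 0.5 GB memory, 100 GB returned
estimate = object_lambda_cost(1_000_000, 0.1, 0.5, 100)
# compute 0.835 + invocations 0.20 + S3 GETs 0.40 + data returned 0.50
```

For this example workload, the four components add up to $1.935.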

Amazon Glacier Pricing

As part of the AWS Free Usage Tier, you can retrieve 10 GB of your Amazon S3 Glacier data per month for free. The free allowance can be used at any time during the month and applies to Standard retrievals. The pricing options include:

1. Storage pricing
  • $0.004 per GB / Month
2. Retrieval pricing
Retrieval Time | Data Retrievals
Expedited | $0.03 per GB
Standard | $0.01 per GB
Bulk | $0.0025 per GB
Source: AWS
3. Retrieval request pricing
Time | Requests
Expedited | $10.00 per 1,000 requests
Standard | $0.03 per 1,000 requests
Bulk | $0.025 per 1,000 requests
Source: AWS
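
A retrieval is billed on both tables above: once per GB retrieved and once per request. The following sketch combines the two using the listed prices. It ignores the free tier allowance, and the 500 GB and 2,000 request figures are only an example.

```python
from decimal import Decimal

# tier: (price per GB retrieved, price per 1,000 retrieval requests)
RETRIEVAL_PRICES = {
    "Expedited": (Decimal("0.03"), Decimal("10.00")),
    "Standard": (Decimal("0.01"), Decimal("0.03")),
    "Bulk": (Decimal("0.0025"), Decimal("0.025")),
}


def retrieval_cost(tier: str, gb_retrieved: int, requests: int) -> Decimal:
    """Retrieval cost in USD: per-GB charge plus per-request charge."""
    per_gb, per_1000_requests = RETRIEVAL_PRICES[tier]
    return Decimal(gb_retrieved) * per_gb + Decimal(requests) / 1000 * per_1000_requests


# Retrieving 500 GB with 2,000 requests under each option
costs = {tier: retrieval_cost(tier, 500, 2000) for tier in RETRIEVAL_PRICES}
# Expedited: 15.00 + 20.00, Standard: 5.00 + 0.06, Bulk: 1.25 + 0.05
```

In this example, the same retrieval costs $35.00 as Expedited, $5.06 as Standard, and $1.30 as Bulk, which shows how strongly the choice of option affects cost.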
4. Provisioned expedited retrieval
  • $100.00 per Provisioned Capacity Unit
5. Request pricing
Request type | Pricing
UPLOAD Requests | $0.03 per 1,000 requests
6. Amazon S3 Glacier Select pricing

Amazon S3 Glacier Select runs queries directly on data stored in Amazon S3 Glacier without any need to retrieve the entire archive. Pricing for this feature depends on:

  • The total amount of data scanned
  • The amount of data returned by Amazon S3 Glacier Select
  • The number of Amazon S3 Glacier Select requests initiated

Final Words

We have now covered the major areas of both Amazon S3 and Glacier: their features, use cases, and pricing. Since Amazon Glacier is one of the storage classes of Amazon S3, the two services are already closely linked. As for the comparison, each service has its own role in providing data storage solutions: Amazon S3 for data you access regularly, and Glacier for low-cost, long-term archiving.
