Understanding DynamoDB
DynamoDB
- It is a fully managed NoSQL database service
- It provides fast and predictable performance with seamless scalability.
- Offload the administrative burdens of operating and scaling a distributed database
- It also offers encryption at rest
- Create database tables that can store and retrieve any amount of data
- Serve any level of request traffic.
- Scale up or down tables’ throughput capacity without downtime or performance degradation
- Use the AWS Management Console to monitor resource utilization.
- It also provides on-demand backup capability, with full backups of tables for long-term retention and archival.
- Tables, items, and attributes are the core components.
- A table is a collection of items.
- An item is a collection of attributes.
- Primary keys uniquely identify each item in a table.
- Secondary indexes provide more querying flexibility.
- DynamoDB Streams capture data modification events in DynamoDB tables.
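- A database is a collection of tables; the table is the highest-level structure. One write capacity unit (WCU) covers one write per second of up to 1 KB, and one read capacity unit (RCU) covers one strongly consistent read per second of up to 4 KB.
- Reads are either eventually consistent (cheaper, at half the RCU cost) or strongly (immediately) consistent.
- The hash key is also called the partition key.
- The range key is also called the sort key.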
- Partitions:
- Partitions are the underlying storage and processing nodes of DynamoDB. A table initially has one partition.
- One partition can store 10 GB and handle 3000 RCU and 1000 WCU.
- Data is distributed across partitions based on the hash/partition key, so a table can scale indefinitely with no decrease in performance.
- The allocated WCU and RCU are split between partitions.
- GSI/LSI:
- DynamoDB offers two data retrieval operations: Scan (reads the entire table) and Query (selects one or more items by partition key value).
- Index allows efficient queries
- Global Secondary Index – can be created anytime, can have alternative Partition & Sort Key, RCU and WCU are defined on GSI.
- GSIs support only eventually consistent reads.
- Local Secondary Index (LSI) – can only be created at the time of table creation; contains the partition key, the table's sort key, the new sort key, and any projected attributes.
- LSI storage concerns – beware of item collections: an item collection has a maximum size of 10 GB. If ItemCollectionSizeLimitExceededException appears, the answer is a table with an LSI whose item collection exceeded 10 GB.
- Streams & Replications:
- Streams: an ordered record of updates to a DynamoDB table. When a stream is enabled, it records changes to the table and stores them for 24 hours.
- AWS guarantees that each change appears in the stream exactly once, in near real time. A stream can be configured with one of 4 views:
- KEYS_ONLY (only key attributes are written to the stream)
- NEW_IMAGE (entire item POST update)
- OLD_IMAGE (entire item PRE update)
- NEW_AND_OLD_IMAGES (PRE and POST operation state)
- Use cases for streams: replication, triggers, games or large distributed apps with users worldwide, disaster recovery (DR), and Lambda functions triggered when items are added (for example, to perform analytics).
- Replication
- Not built in to DynamoDB. Create or select the table to be replicated, apply the CloudFormation (CFN) stack and wait, then get the location from the URL in the CFN output.
- Test a simple cross-region replication.
- Use SQS as a managed write buffer.
- Increase in RCU is dangerous.
- Add a prefix or suffix to partition key values to spread load more evenly across the key space.
- Buffer reads and writes with SQS and caching.
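The capacity-unit and partition figures in the notes above lend themselves to quick arithmetic. The following is a rough sketch under the assumptions stated in those notes (1 KB per WCU, 4 KB per RCU, eventually consistent reads at half cost, and per-partition limits of 10 GB, 3000 RCU, and 1000 WCU). The function names are illustrative, and DynamoDB's real partitioning behavior is internal and may differ from this estimate.

```python
import math

def write_capacity_units(item_size_bytes, writes_per_second):
    """WCU needed: each write consumes one unit per 1 KB of item size, rounded up."""
    return math.ceil(item_size_bytes / 1024) * writes_per_second

def read_capacity_units(item_size_bytes, reads_per_second, strongly_consistent=True):
    """RCU needed: one unit per 4 KB, rounded up; eventually consistent reads cost half."""
    units = math.ceil(item_size_bytes / 4096) * reads_per_second
    return units if strongly_consistent else math.ceil(units / 2)

def estimated_partitions(table_size_gb, rcu, wcu):
    """Rough partition estimate from the per-partition limits quoted in the notes."""
    by_size = math.ceil(table_size_gb / 10)             # 10 GB per partition
    by_throughput = math.ceil(rcu / 3000 + wcu / 1000)  # 3000 RCU / 1000 WCU per partition
    return max(by_size, by_throughput, 1)
```

For example, 10 writes per second of 1,500-byte items need 20 WCU, because each write rounds up to two 1 KB blocks.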
DynamoDB components
- Tables
- Similar to other database systems, DynamoDB stores data in tables.
- A table is a collection of data
- For example, a table called People could store personal contact information about friends, family, or anyone else of interest. You could also have a Cars table to store information about vehicles that people drive.
- Items
- Each table contains zero or more items
- An item is a group of attributes that is uniquely identifiable among all of the other items.
- In a People table, each item represents a person. For a Cars table, each item represents one vehicle.
- Items are similar in many ways to rows, or tuples in other database systems.
- There is no limit to the number of items you can store in a table.
- Attributes
- Each item is composed of one or more attributes.
- An attribute is a fundamental data element, something that does not need to be broken down any further.
- For example, an item in a People table contains attributes called PersonID, LastName, FirstName, and so on.
- Attributes are similar to fields or columns in other database systems.
The following diagram shows a table named People with some example items and attributes.
Note the following about the People table:
- Each item in the table has a unique identifier, or primary key, that distinguishes the item from all of the others in the table. In the People table, the primary key consists of one attribute (PersonID).
- Other than the primary key, the People table is schemaless, which means that neither the attributes nor their data types need to be defined beforehand. Each item can have its own distinct attributes.
- Most of the attributes are scalar, which means that they can have only one value. Strings and numbers are common examples of scalars.
- Some of the items have a nested attribute (Address). DynamoDB supports nested attributes up to 32 levels deep.
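To make these points concrete, here is a minimal sketch of two People items written as plain Python dictionaries. The attribute values and the Phone attribute are invented for illustration; only PersonID, LastName, FirstName, and Address come from the text. The helper functions are hypothetical (not part of any AWS SDK) and show one plausible way to count nesting levels against the 32-level limit.

```python
MAX_NESTING_DEPTH = 32  # nesting limit stated in the text

# Two items in the same table; apart from the primary key they need not share attributes.
people = [
    {"PersonID": 101, "LastName": "Smith", "FirstName": "Fred", "Phone": "555-4321"},
    {
        "PersonID": 102,
        "LastName": "Jones",
        "FirstName": "Mary",
        "Address": {"Street": "123 Main", "City": "Anytown", "State": "OH"},
    },
]

def nesting_depth(value):
    """Return how many levels of maps/lists a value contains (0 for scalars)."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0

def within_nesting_limit(item):
    """Check every attribute of an item against the nesting limit."""
    return all(nesting_depth(v) <= MAX_NESTING_DEPTH for v in item.values())
```

The first item has a Phone attribute and no Address, while the second has a nested Address map and no Phone, which is what "schemaless" means in practice.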
DynamoDB supports two different kinds of primary keys:
- Partition key – A simple primary key, composed of one attribute known as the partition key. The partition key's value is the input to an internal hash function, and the output of that function determines the partition (physical storage internal to DynamoDB) in which the item is stored. In a table that has only a partition key, no two items can have the same partition key value. The People table has a simple primary key (PersonID), so you can access any item in the table directly by providing its PersonID value.
- Partition key and sort key – A composite primary key, composed of two attributes: the first is the partition key and the second is the sort key. DynamoDB uses the partition key value as input to an internal hash function, and the output of that function determines the partition (physical storage internal to DynamoDB) in which the item is stored. All items with the same partition key value are stored together, in sorted order by sort key value.
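The hashing and sorting behavior described for the two key types can be modeled in a few lines. This is only a sketch: DynamoDB's hash function and partition count are internal, so MD5 and a fixed count of four partitions are stand-in assumptions here, and the sample items are invented.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # illustrative only; DynamoDB manages the partition count itself

def partition_for(partition_key):
    """Map a partition key value to a partition number using a stand-in hash (MD5)."""
    digest = hashlib.md5(str(partition_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

def store(items, partition_key, sort_key):
    """Group items by partition, then by partition key value, sorted by sort key."""
    partitions = defaultdict(lambda: defaultdict(list))
    for item in items:
        pk = item[partition_key]
        partitions[partition_for(pk)][pk].append(item)
    for groups in partitions.values():
        for group in groups.values():
            group.sort(key=lambda it: it[sort_key])
    return partitions

songs = [
    {"Artist": "No One You Know", "SongTitle": "My Dog Spot"},
    {"Artist": "No One You Know", "SongTitle": "Call Me Today"},
    {"Artist": "The Acme Band", "SongTitle": "Still in Love"},
]
layout = store(songs, "Artist", "SongTitle")
```

Both songs by the same artist land in the same partition and come back ordered by SongTitle, which is why a query on one partition key value can return its items already sorted.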
Secondary Indexes
- Can create one or more secondary indexes on a table.
- A secondary index lets you query the table using an alternate key, in addition to queries against the primary key.
- DynamoDB doesn't require indexes, but they give more flexibility when querying data.
- After you create a secondary index on a table, you can read data from the index in much the same way as you do from the table.
DynamoDB index types
- Global secondary index – Partition key and sort key that can be different from those on the table.
- Local secondary index – Same partition key as the table, but a different sort key.
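The difference between the two index types can be illustrated with an in-memory model. The sketch below is not how DynamoDB implements indexes; it simply re-groups the same invented items under different key choices. The `build_index` and `query` helpers and the sample Music data are hypothetical.

```python
from collections import defaultdict

def build_index(items, partition_key, sort_key):
    """Group items by partition_key and sort each group by sort_key.

    Items lacking either key attribute are left out of the index.
    """
    index = defaultdict(list)
    for item in items:
        if partition_key in item and sort_key in item:
            index[item[partition_key]].append(item)
    for group in index.values():
        group.sort(key=lambda it: it[sort_key])
    return dict(index)

def query(index, partition_value):
    """Return all items for one partition key value, in sort key order."""
    return index.get(partition_value, [])

music = [
    {"Artist": "A", "SongTitle": "Zebra", "Genre": "Rock", "Year": 2015},
    {"Artist": "A", "SongTitle": "Apple", "Genre": "Pop", "Year": 2018},
    {"Artist": "B", "SongTitle": "Mango", "Genre": "Rock", "Year": 2012},
]
table = build_index(music, "Artist", "SongTitle")  # base table keys
lsi = build_index(music, "Artist", "Year")         # local: same partition key, new sort key
gsi = build_index(music, "Genre", "Year")          # global: different partition and sort key
```

Querying `lsi` for artist "A" returns the same two songs as the base table but ordered by Year, while querying `gsi` for "Rock" returns songs from both artists.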
DynamoDB supports eventually consistent and strongly consistent reads.
- Eventually Consistent Reads – When you read data from a DynamoDB table, the response might not reflect the results of a recently completed write operation, so it might include some stale data. If you repeat your read request after a short time, the response should return the latest data.
- Strongly Consistent Reads – When you request a strongly consistent read, DynamoDB returns a response with the most up-to-date data, reflecting the updates from all prior write operations that were successful. A strongly consistent read might not be available if there is a network delay or outage. Strongly consistent reads are not supported on global secondary indexes (GSI).
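A toy model can show why an eventually consistent read may return stale data. This sketch assumes a single leader copy and a single lagging replica, which is a deliberate simplification and not a description of DynamoDB's internals. The `ToyTable` class is hypothetical; its `consistent_read` flag only mirrors the idea of choosing the read type per request.

```python
class ToyTable:
    """Toy model: one leader copy plus one replica that lags until replicate() runs."""

    def __init__(self):
        self.leader = {}
        self.replica = {}
        self.pending = []  # writes not yet copied to the replica

    def put(self, key, value):
        """A write is acknowledged once the leader has it."""
        self.leader[key] = value
        self.pending.append((key, value))

    def replicate(self):
        """Copy all pending writes to the replica (stands in for the passage of time)."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def get(self, key, consistent_read=False):
        """Strongly consistent reads use the leader; in this toy, other reads use the replica."""
        source = self.leader if consistent_read else self.replica
        return source.get(key)
```

Immediately after `put`, a read with `consistent_read=True` sees the new value while a default read may not; after `replicate()` both agree.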
Naming Rules
Tables, attributes, and other objects in DynamoDB must have names. Names should be meaningful and concise—for example, names such as Products, Books, and Authors are self-explanatory.
The following are the naming rules for DynamoDB:
- All names must be encoded using UTF-8, and are case-sensitive.
- Table names and index names must be between 3 and 255 characters long, and can contain only the following characters:
- a-z
- A-Z
- 0-9
- _ (underscore)
- - (dash)
- . (dot)
- Attribute names must be between 1 and 255 characters long.
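The naming rules listed above can be checked with a short validator. This is a minimal sketch that encodes only the rules as stated in this section; the function names are illustrative and not part of any AWS SDK.

```python
import re

# Allowed characters for table and index names: a-z, A-Z, 0-9, underscore, dash, dot.
_TABLE_OR_INDEX_NAME = re.compile(r"[A-Za-z0-9_.\-]{3,255}")

def is_valid_table_or_index_name(name):
    """True if the name is 3 to 255 characters long and uses only allowed characters."""
    return _TABLE_OR_INDEX_NAME.fullmatch(name) is not None

def is_valid_attribute_name(name):
    """True if the name is 1 to 255 characters long, per the rule listed above."""
    return 1 <= len(name) <= 255
```

Because names are case-sensitive, `People` and `people` both pass validation but refer to different tables.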
Reserved Words and Special Characters
DynamoDB has a list of reserved words and special characters. For a complete list of reserved words in DynamoDB, see Reserved Words in DynamoDB. Also, the following characters have special meaning in DynamoDB: # (hash) and : (colon). Although DynamoDB allows you to use these reserved words and special characters for names, we recommend that you avoid doing so because you have to define placeholder variables whenever you use these names in an expression.
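The placeholder mechanism mentioned above can be sketched as follows. In an expression, a name placeholder starts with `#` and a value placeholder starts with `:`. The helper below is hypothetical and only builds a parameter dictionary; it does not call AWS. `RESERVED_SAMPLE` is a tiny illustrative subset of the reserved-word list, and comparing case-insensitively is a cautious assumption of this sketch.

```python
RESERVED_SAMPLE = {"NAME", "STATUS", "YEAR"}  # small illustrative subset, not the full list

def build_filter(attribute, value):
    """Return filter parameters, aliasing the attribute name when it needs a placeholder."""
    needs_alias = (
        attribute.upper() in RESERVED_SAMPLE or "#" in attribute or ":" in attribute
    )
    # The value always goes through a ":" placeholder; it is passed through unchanged here.
    params = {"ExpressionAttributeValues": {":v": value}}
    if needs_alias:
        params["ExpressionAttributeNames"] = {"#a": attribute}
        params["FilterExpression"] = "#a = :v"
    else:
        params["FilterExpression"] = f"{attribute} = :v"
    return params
```

An attribute called `Year` needs the extra `ExpressionAttributeNames` entry, while one called `Genre` does not, which is the overhead the recommendation above is meant to avoid.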
Data Types
DynamoDB supports many different data types for attributes within a table. They can be categorized as follows:
- Scalar Types – A scalar type can represent exactly one value. The scalar types are number, string, binary, Boolean, and null.
- Document Types – A document type can represent a complex structure with nested attributes, such as you would find in a JSON document. The document types are list and map.
- Set Types – A set type can represent multiple scalar values. The set types are string set, number set, and binary set.
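In DynamoDB's low-level format, each attribute value is tagged with a short type descriptor (S, N, B, BOOL, NULL, L, M, SS, NS, BS). The sketch below maps Python values onto those descriptors for the three categories above. The function is a hypothetical illustration, not an SDK call, and its handling of floats and bytes is a simplifying assumption.

```python
def to_attribute_value(value):
    """Wrap a Python value in a DynamoDB-style type descriptor."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # check bool before int: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers are represented as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, bytes):
        return {"B": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    if isinstance(value, (set, frozenset)) and value:
        if all(isinstance(v, str) for v in value):
            return {"SS": sorted(value)}
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in value):
            return {"NS": sorted(str(v) for v in value)}
        if all(isinstance(v, bytes) for v in value):
            return {"BS": sorted(value)}
    raise TypeError(f"unsupported value: {value!r}")
```

Scalars map to a single tagged value, documents (list and map) recurse into their members, and sets must contain values of one scalar type.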
When you create a table or a secondary index, you must specify the names and data types of each primary key attribute (partition key and sort key). Furthermore, each primary key attribute must be defined as type string, number, or binary.
DynamoDB is a NoSQL database and is schemaless. This means that, other than the primary key attributes, you don’t have to define any attributes or data types when you create tables. By comparison, relational databases require you to define the names and data types of each column when you create a table.
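The point that only key attributes are declared up front can be seen in the shape of a table definition. The helper below is a hypothetical sketch that builds a dictionary in the shape of a CreateTable request; it does not contact AWS. The `Music`, `Artist`, and `SongTitle` names and the on-demand billing mode are illustrative choices.

```python
KEY_TYPES = {"S", "N", "B"}  # key attributes must be string, number, or binary

def table_definition(table_name, partition_key, sort_key=None):
    """Build a CreateTable-style definition; each key is a (name, type) tuple."""
    keys = [(partition_key, "HASH")]
    if sort_key is not None:
        keys.append((sort_key, "RANGE"))
    for (name, attr_type), _ in keys:
        if attr_type not in KEY_TYPES:
            raise ValueError(f"key attribute {name!r} must be S, N, or B, not {attr_type!r}")
    return {
        "TableName": table_name,
        # Only the key attributes are declared; all other attributes are schemaless.
        "AttributeDefinitions": [
            {"AttributeName": name, "AttributeType": attr_type} for (name, attr_type), _ in keys
        ],
        "KeySchema": [
            {"AttributeName": name, "KeyType": key_type} for (name, _), key_type in keys
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }

definition = table_definition("Music", ("Artist", "S"), ("SongTitle", "S"))
```

Assuming boto3 is installed and credentials are configured, a dictionary like this could be passed to a DynamoDB client as `create_table(**definition)`; non-key attributes such as Genre or Year never appear in it.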