A DynamoDB table is typically queried with access patterns of the following format:

SELECT * FROM TABLE WHERE Some_Attribute = 'some_value'

In this format, Some_Attribute is a partition key or part of an index.

Nonetheless, many of the same customers using DynamoDB also want to run aggregations and ad hoc queries against their data to measure the KPIs that matter to their business. Suppose we have a successful ecommerce application handling a high volume of sales transactions in DynamoDB. A typical ask for this data may be to identify sales trends as well as sales growth on a yearly, monthly, or even daily basis. These types of queries require complex aggregations over a large number of records.

A key pillar of AWS’s modern data strategy is the use of purpose-built data stores for specific use cases to achieve performance, cost, and scale. Deriving business insights by identifying year-on-year sales growth is an example of an online analytical processing (OLAP) query. These types of queries are suited for a data warehouse.

The goal of a data warehouse is to enable businesses to analyze their data quickly, so that they can gain valuable insights in a timely manner. Amazon Redshift is a fully managed, scalable cloud data warehouse. Building a performant data warehouse is non-trivial because the data needs to be highly curated to serve as a reliable and accurate version of the truth.

In this post, we walk through the process of exporting data from a DynamoDB table to Amazon Redshift. We discuss data model design for both NoSQL databases and SQL data warehouses. We begin with a single-table design as an initial state and build a scalable batch extract, load, and transform (ELT) pipeline to restructure the data into a dimensional model for OLAP workloads.

We use the example of a successful ecommerce store that allows registered users to order products from its website. A simple ERD (entity-relation diagram) for this application has four distinct entities: customers, addresses, orders, and products.
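As a rough illustration of the year-on-year sales growth question, the following Python sketch uses an in-memory SQLite database as a stand-in for a data warehouse. The `orders` table, its columns, the sample rows, and the helper names `yearly_sales` and `year_on_year_growth` are all hypothetical and are not taken from the application described in this post. A real Amazon Redshift query would run against the dimensional model, not this toy table.

```python
import sqlite3

# Hypothetical sample data: (order_id, order_date, amount).
# These rows and column names are assumptions made for illustration only.
ORDERS = [
    ("o1", "2021-03-14", 120.0),
    ("o2", "2021-11-02", 80.0),
    ("o3", "2022-01-20", 150.0),
    ("o4", "2022-07-09", 150.0),
    ("o5", "2023-05-30", 330.0),
]


def yearly_sales(conn):
    """Aggregate order amounts per calendar year (the OLAP-style step)."""
    rows = conn.execute(
        """
        SELECT substr(order_date, 1, 4) AS order_year,
               SUM(amount)              AS total_sales
        FROM orders
        GROUP BY order_year
        ORDER BY order_year
        """
    ).fetchall()
    return [(int(year), total) for year, total in rows]


def year_on_year_growth(sales):
    """Return (year, growth_fraction) for every year that has a prior year."""
    return [
        (year, (total - prev_total) / prev_total)
        for (_, prev_total), (year, total) in zip(sales, sales[1:])
    ]


# In-memory SQLite stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, order_date TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", ORDERS)

sales = yearly_sales(conn)
growth = year_on_year_growth(sales)
for year, fraction in growth:
    print(f"{year}: {fraction:+.1%}")
```

Unlike the key-based lookup shown earlier, this query has to scan and group every order row, which is why this kind of workload is usually moved to a data warehouse.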