SQS SNS Kinesis

Table of contents
  1. Simple Queuing Service (SQS)
    1. SQS Simplified:
    2. SQS Key Details:
    3. SQS Polling:
  2. Simple Workflow Service (SWF)
    1. SWF Simplified:
    2. SWF Key Details:
  3. Simple Notification Service (SNS)
    1. SNS Simplified:
    2. SNS Key Details:
  4. Kinesis
    1. Kinesis Simplified:
    2. Kinesis Key Details:

Simple Queuing Service (SQS)

SQS Simplified:

SQS is a web-based service that gives you access to a message queue that can be used to store messages while waiting for another service to process them. It helps in the decoupling of systems and the horizontal scaling of AWS resources.

SQS Key Details:

  • The point behind SQS is to decouple work across systems. This way, downstream services in a system can perform work when they are ready to rather than when upstream services feed them data.
  • In a hypothetical AWS environment running without SQS, Application A would pass Application B data regardless if Application B was ready to receive the info. With SQS however, there is an intermediary step where the data is stored temporarily in a buffer. It waits there until Application B pulls the temporarily stored data. SQS is not a push-based service so it is necessary for SQS to work in tandem with another service that queries it for information.
  • There are two types of SQS queues; standard and FIFO. Standard queues may be received out of order based on message size or however else the SQS queues decide to optimize. FIFO queues guarantees that the order of messages that went into the queue is the same as the order of messages that leave it.
  • Standard SQS queues guarantee that a message is delivered at least once and because of this, it is possible on occasion that a message might be delivered more than once due to the asynchronous and highly distributed architecture. With standard queues, you have a nearly unlimited number of transactions per second.
  • FIFO SQS queues guarantee exactly-once processing and is limited to 300 transactions per second.
  • Messages in the queue can be kept there from one minute to 14 days and the default retention period is 4 days.
  • Visibility timeouts in SQS are the mechanism in which messages marked for delivery from the queue are given a time frame to be fully received by a reader. This is done by temporarily making them invisible to other readers. If the message is not fully processed within the time limit, the message becomes visible again. This is another way in which messages can be duplicated. If you want to reduce the chance of duplication, increase the visibility timeout.
  • The visibility timeout maximum is 12 hours.
  • Always remember that the messages in the SQS queue will continue to exist even after the EC2 instance has processed it, until you delete that message. You have to ensure that you delete the message after processing to prevent the message from being received and processed again once the visibility timeout expires.
  • An SQS queue can contain an unlimited number of messages.
  • You cannot set a priority to the individual items in the SQS queue. If priority of messaging matters, create two separate SQS queues. The SQS queues for the priority message can be polled first by the EC2 Instances and once completed, the messages from the second queue can be processed next.

SQS Polling:

  • Polling is the means in which you query SQS for messages or work. Amazon SQS provides short-polling and long-polling to receive messages from a queue. By default, queues use short polling.
  • SQS long-polling: This polling technique will only return from the queue once a message is there, regardless if the queue is currently full or empty. This way, the reader needs to wait either for the timeout set or for a message to finally arrive. SQS long polling doesn’t return a response until a message arrives in the queue, reducing your overall cost over time.
  • SQS short-polling: This polling technique will return immediately with either a message that’s already stored in the queue or empty-handed.
  • The ReceiveMessageWaitTimeSeconds is the queue attribute that determines whether you are using Short or Long polling. By default, its value is zero which means it is using short-polling. If it is set to a value greater than zero, then it is long-polling.
  • Every time you poll the queue, you incur a charge. So thoughtfully deciding on a polling strategy that fits your use case is important.

Simple Workflow Service (SWF)

SWF Simplified:

SWF is a web service that makes it easy to coordinate work across distributed application components. SWF has a range of use cases including media processing, web app backend, business process workflows, and analytical pipelines.

SWF Key Details:

  • SWF is a way of coordinating tasks between application and people. It is a service that combines digital and human-oriented workflows.
  • An example of a human-oriented workflow is the process in which Amazon warehouse workers find and ship your item as part of your Amazon order.
  • SWF provides a task-oriented API and ensures a task is assigned only once and is never duplicated. Using Amazon warehouse workers as an example again, this would make sense. Amazon wouldn’t want to send you the same item twice as they’d lose money.
  • The SWF pipeline is composed of three different worker applications that help to bring a job to completion:
    • SWF Actors are workers that trigger the beginning of a workflow.
    • SWF Deciders are workers that control the flow of the workflow once it’s been started.
    • SWF Activity Workers are the workers that actually carry out the task to completion.
  • With SWF, workflow executions can last up to one year compared to the 14 days maximum retention period for SQS.

Simple Notification Service (SNS)

SNS Simplified:

Simple Notification Service is a pushed-based messaging service that provides a highly scalable, flexible, and cost-effective method to publish a custom messages to subscribers who wish to be informed about a certain topic.

SNS Key Details:

  • SNS is mainly used to send alarms or alerts.
  • SNS provides topics for high-throughput, push-based, many-to-many messaging.
  • Using Amazon SNS topics, your publisher systems can fan out messages to a large number of subscriber endpoints for parallel processing, including Amazon SQS queues, AWS Lambda functions, and HTTP/S webhooks. Additionally, SNS can be used to fan out notifications to end users using mobile push, SMS, and email.
  • You can send these push notifications to Apple, Google, Fire OS, and Windows devices.
  • SNS allows you to group multiple recipients using topics. A topic is an access point for allowing recipients to dynamically subscribe for identical copies of the same notification.
  • One topic can support deliveries to multiple endpoint types. When you publish to a topic, SNS appropriately formats copies of that message to send to whichever kind of device.
  • To prevent messages being lost, messages are stored redundantly across multiple AZs.
  • There is no long or short polling involved with SNS due to the instantaneous pushing of messages
  • SNS has flexible message delivery over multiple transport protocols and has a simple API.

Kinesis

Kinesis Simplified:

Amazon Kinesis makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. With Amazon Kinesis, you can ingest real-time data such as video, audio, application logs, website clickstreams, and IoT telemetry data for machine learning, analytics, and other applications. Amazon Kinesis enables you to process and analyze data as it arrives and respond instantly instead of having to wait until all your data is collected before the processing can begin.

Kinesis Key Details:

  • Amazon Kinesis makes it easy to load and analyze the large volumes of data entering AWS.
  • Kinesis is used for processing real-time data streams (data that is generated continuously) from devices constantly sending data into AWS so that said data can be collected and analyzed.
  • It is a fully managed service that automatically scales to match the throughput of your data and requires no ongoing administration. It can also batch, compress, and encrypt the data before loading it, minimizing the amount of storage used at the destination and increasing security.
  • There are three different types of Kinesis:
    • Kinesis Streams
      • Kinesis Streams works where the data producers stream their data into Kinesis Streams which can retain the data from one day up until 7 days. Once inside Kinesis Streams, the data is contained within shards.
      • Kinesis Streams can continuously capture and store terabytes of data per hour from hundreds of thousands of sources such as website clickstreams, financial transactions, social media feeds, IT logs, and location-tracking events. For example: purchase requests from a large online store like Amazon, stock prices, Netflix content, Twitch content, online gaming data, Uber positioning and directions, etc.
    • Kinesis Firehose
      • Amazon Kinesis Firehose is the easiest way to load streaming data into data stores and analytics tools. When data is streamed into Kinesis Firehose, there is no persistent storage there to hold onto it. The data has to be analyzed as it comes in so it’s optional to have Lambda functions inside your Kinesis Firehose. Once processed, you send the data elsewhere.
      • Kinesis Firehose can capture, transform, and load streaming data into Amazon S3, Amazon Redshift, Amazon Elasticsearch Service, and Splunk, enabling near real-time analytics with existing business intelligence tools and dashboards you’re already using today.
    • Kinesis Analytics
      • Kinesis Analytics works with both Kinesis Streams and Kinesis Firehose and can analyze data on the fly. The data within Kinesis Analytics also gets sent elsewhere once it is finished processing. It analyzes your data inside of the Kinesis service itself.
  • Partition keys are used with Kinesis so you can organize data by shard. This way, input from a particular device can be assigned a key that will limit its destination to a specific shard.
  • Partition keys are useful if you would like to maintain order within your shard.
  • Consumers, or the EC2 instances that read from Kinesis Streams, can go inside the shards to analyze what is in there. Once finished analyzing or parsing the data, the consumers can then pass on the data to a number of places for storage like a DB or S3.
  • The total capacity of a Kinesis stream is the sum of data within its constituent shards.
  • You can always increase the write capacity assigned to your shard table.

Back to top

Copyright © 2022-2023 Interview Docs Email: docs.interview@gmail.com