Amazon Kinesis makes it easy to collect, process and analyze data, streaming and real-time, so you can react quickly. It is capable of integrating real-time data, such as video, audio, application logs, website click streams, or IoT telemetry data.

Amazon Kinesis allows you to process and analyze data as it is received and react instantly instead of waiting for all data to be collected to start processing.

There are different Kinesis services depending on your need:

  • Amazon Kinesis Data Streams (KDS): continuously captures gigabytes of data per second and makes them available on Amazon S3 or AWS Lambda Functions
  • Amazon Kinesis Data Firehose: Reliably loads streaming data into data lakes, data stores and analytics services
  • Amazon Kinesis Data Analytics: transforms and analyzes live streaming data with SQL and Apache Flink
  • Amazon Kinesis Video Streams: securely streams video to AWS

Let’s look at these services in detail.

Amazon Kinesis Data Streams

General Architecture

  • It is composed of Shards whose number must be initiated at creation
  • The number of Shards defines the input and output flow of the Streams
  • Message retention can range from 1 to 365 Days
  • The messages are immutable, they cannot be erased which allows to replay the data if needed
Main components of Amazon Kinesis Data Streams
Main components of Amazon Kinesis Data Streams
  1. One or more Producers send Records to Kinesis Data Streams
  2. The Kinesis Data Streams is a set of Shards, a Shard being a single sequence of data in a stream
  3. In Write, each Shard supports up to 1,000 records/sec and up to 1 MB of data/sec
  4. In Reading, this depends on the Consumer Mode:
    • Shared: flow shared by ALL Consumers of 2 MB/sec by Shard
    • Enhanced: flow rate for EACH Consumer of 2 MB/sec by Shard

Records Fan-Out

Allocation of Records in the Shards
Allocation of Records in the Shards
  1. A Producer generates a Record with a Partition Key and content of 1MB max
  2. Based on the *Record** MD5 Hash, it is directed to one of the Shards
  3. The Record receives an additional field, the Sequence Number, indicating its order of passage in the Shard
  4. The Consumer receives the Record according to the delivery method Shared or Enhanced Fan-Out

Hot Partition

To avoid Shard receiving all Records, you must ensure that the Partition Key values are distributed!


  • Access control and permissions by IAM Policies
  • Encryption in transit by HTTPS and rest by KMS
  • Kinesis Data Streams can be accessed through a VPC Endpoint, ensuring private communication, without going through the Internet
  • Kinesis API calls are logged in CloudTrail

Kinesis Procucers

  • Producers send data to Kinesis Data Streams as Records
  • They may be:
    • An application
    • A Client Mobile or Desktop
    • An application using AWS SDK (Low Level API) or KPL (Kinesis Producer Library, High Level API with batch, compression, retries)
    • A Kinesis Agent installed on a server that sends, for example, logs

Using Batch with the PutRecords API reduces costs and increases throughput

ProvisionedThroughputExceed error

When a Shard receives more data than it can process, an error ProvisionedThroughputExceed is returned to the Producer.

In order to guard against this, it is necessary to:

  • Make sure to use a distributed value of Partition Key
  • Implement a Retry mechanism with an exponential backoff
  • Increase number of Shards


It is made of 3 elements:

  • The Partition Key: defined by the Producer, it determines in which Shard will pass the record (according to its Hash)
  • The Data Blob: the content of 1 MB message at most
  • The Sequence Number: defined in Kinesis Data Streams, it indicates the pass order in the Shard

Kinesis Consumers

  • Consumers receive data from Kenesis Data Streams as Records
  • They may be:
    • An application using the AWS SDK or the KCL API (for Kinesis Client Library)
    • A Lambda Function
    • Kinesis Data Firehose
    • Kinesis Data Analytics


  • The AWS SDK is a low-level API that requires more implementation than the API KCL of high-level that implements mechanisms of Retry, checking of Records processed, reacting to Resharding, …
  • These 2 APIs take into account the Shared and Enhanced modes, however note that, for KCL, only the V2 version supports them

Shared vs Enhanced

The table below summarizes the key differences between the Shared and Enhanced modes of a Consumer:

FeatureShared fan-outEnhanced fan-out
Read rate of a ShardA maximum of 2 MB/s, spread over all ConsumersA maximum of 2 MB/s, for each Consumer
Delivery modelPull on HTTP, uses GetRecordsKinesis Data Streams sends you records via HTTP/2, uses SubscribeToShard
Message spread timeAbout 200 ms on average per Consumer so only 5 calls/sec for the GetRecords APIAbout 70 ms on average whether there are 1 or 5 Consumers (limited to 5 Consumers but can be extended)
CostN/AAdditional cost for data recovery and use of Shards

Kinesis Client Library (KCL)

  • Help to consume and process Kinesis Data Streams by supporting many complex tasks associated with distributed cloud architecture
  • This is a Java library but can interface with other languages
  • Each Shard can be read by a KCL instance at most
  • Only version KCL 2.x supports Enhanced
  • Requires a table in Amazon DynamoDB to trace the Shards processed by each KCL instance and manage resumes
  • KCL instance can be EC2, On Premise or Elastic Beanstalk (as long as IAM accesses are properly configured)

Sizing of Shards

Shard Splitting

  • Increases Streaming Capacity (Throughput) by increasing the number of Shards (and the cost!)
  • Divides a Hot Shard

The operation consists of:

  • Creating 2 new Shards
  • Closing the old Shard and deleting it once its data has expired

This is a manual operation only (no Auto Scaling possible) and only 1 split per operation

Shard Merging

  • Decreases streaming capacity (throughput) and reduces costs
  • Merges 2 Cold Shards with low traffic

The operation consists of:

  • Creating a new Shard
  • Closing the old Shards and deleting them once their data has expired

This is a manual operation only (no Auto Scaling possible) and only 1 merge per operation

Kinesis Data Firehose

General pipeline architecture with Kinesis Data Firehose
General pipeline architecture with Kinesis Data Firehose

Kinesis Data Firehose allows data flows to be loaded into data lakes, data stores, and analytics services with transformations as required:

  • It is an AWS Service fully managed, serverless and with automatic sizing
  • Cost is a function of data passing through Kinesis Data Firehose
  • Loading quasi-real-time:
    • Within 60 seconds of being sent to service
    • As soon as the Buffer is filled (the size depends on the destination service and its configuration)
  • Supports many data formats, their conversion, transformation, compression
  • Supports custom transformations (on AWS Lambda Functions)
  • Can send data in error or for backup in a S3 Bucket


Kinesis Data StreamsKinesis Data Firehose
Large-scale Streaming ServiceStreamed Data Loading Service
Requires implementation of Producer/ConsumerFully managed
Real time (~200ms)“Near” real time (min of 60s)
Manual scalingAutomatic scaling
Retention of data from 1 to 365 days
Possible replay
No retention of datas
No replay possible

Kinesis Data Analytics

General pipeline architecture with Kinesis Data Analytics
General pipeline architecture with Kinesis Data Analytics

Amazon Kinesis Data Analytics allows you to query and analyze data flows in real time. It is useful for real-time analysis of time series, metrics or generation of dashboards:

  • Based on Apache Flink, it supports SQL, Java, Scala and Python applications
  • It is an AWS Service fully managed, serverless and with automatic sizing
  • Cost is function of the data passing through Kinesis Data Analytics
Jean-Jerome Levy

Written by

Jean-Jerome Levy

DevOps Consultant

Seasoned professional in the field of information technology, I bring over 20 years of experience from working within major corporate IT departments. My diverse expertise has played a pivotal role in a myriad of projects, marked by the implementation of innovative DevOps practices.