Data Pipeline Throughput Calculator

Calculate data pipeline throughput from records per second and record size. Estimate daily volume and bandwidth requirements.

Avg Throughput: 12.70 MB/s (peak: 38.09 MB/s)
Wire Size per Record: 1.3 KB (with JSON overhead)
Network Bandwidth: 106.5 Mbps (peak: 319.5 Mbps)
Daily Volume: 1,071.2 GB (peak day: 3,213.5 GB)
Monthly Volume: 31.38 TB (yearly: 376.6 TB)
End-to-End Latency: 600 ms (3 stages x 200 ms/stage)
Gbps Required: 0.106 Gbps (peak: 0.319 Gbps)
Yearly Storage: 376.6 TB (without compression or dedup)

Volume by Time Period

Period            Records           Volume
Per Second        10,000            12.70 MB
Per Minute        600,000           761.72 MB
Per Hour          36,000,000        44.63 GB
Per Day           864,000,000       1.05 TB
Per Month (30d)   25,920,000,000    31.38 TB
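
The rows above can be reproduced with a short sketch. The inputs are assumptions inferred from the displayed figures, not values stated on the page: 10,000 records/sec and a 1,024-byte record with a 1.3x JSON overhead factor. Units are binary (1 MB = 1,024² bytes), and the helper names are illustrative.

```python
# Assumed inputs, inferred from the displayed results (not stated by the page).
RECORDS_PER_SEC = 10_000
WIRE_BYTES_PER_RECORD = 1024 * 1.3  # 1 KB payload with an assumed 1.3x JSON overhead

PERIODS = {
    "Per Second": 1,
    "Per Minute": 60,
    "Per Hour": 3_600,
    "Per Day": 86_400,
    "Per Month (30d)": 86_400 * 30,
}


def human_bytes(n):
    """Format a byte count using binary units (1 KB = 1,024 bytes)."""
    for unit in ("B", "KB", "MB", "GB"):
        if n < 1024:
            return f"{n:.2f} {unit}"
        n /= 1024
    return f"{n:.2f} TB"


def volume_table(records_per_sec, bytes_per_record):
    """Return (period, record count, formatted volume) for each period."""
    rows = []
    for label, seconds in PERIODS.items():
        records = records_per_sec * seconds
        rows.append((label, records, human_bytes(records * bytes_per_record)))
    return rows


for label, records, volume in volume_table(RECORDS_PER_SEC, WIRE_BYTES_PER_RECORD):
    print(f"{label:<16} {records:>15,} {volume:>12}")
```

Under these assumptions the printed volumes match the table, which suggests the calculator reports binary units throughout.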

Pipeline Stage Latency

Stage 1: 200 ms cumulative
Stage 2: 400 ms cumulative
Stage 3: 600 ms cumulative

Capacity Planning

Metric               Avg            Peak (3x)
Throughput (MB/s)    12.70          38.09
Bandwidth (Mbps)     106.5          319.5
Bandwidth (Gbps)     0.106          0.319
Daily Volume         1,071.2 GB     3,213.5 GB
Records / Hour       36,000,000     108,000,000

Planning notes, formulas, and examples

About the Data Pipeline Throughput Calculator

Data pipelines move records from sources to destinations at rates ranging from hundreds to millions of records per second. Knowing the throughput, in both records/sec and bytes/sec, is critical for sizing infrastructure, provisioning network bandwidth, and planning storage capacity. A pipeline processing 10,000 records/sec at 1 KB each generates roughly 10 MB/sec, which is about 864 GB/day in decimal units.

This calculator converts records-per-second throughput into meaningful capacity metrics: MB/sec, GB/hour, GB/day, and TB/month. It helps you size Kafka clusters, plan network bandwidth, estimate storage requirements, and set realistic SLAs for data freshness.

Whether you're designing a new streaming pipeline on Kafka or Kinesis, or evaluating whether your existing pipeline can handle traffic growth, this calculator gives you the numbers you need for infrastructure planning.

When This Page Helps

Pipeline capacity mismatches cause data loss, backpressure, and stale analytics. This calculator translates records/sec into storage and bandwidth requirements so you can provision infrastructure correctly before traffic peaks.

How to Use the Inputs

  1. Enter the expected records per second.
  2. Enter the average record size in bytes.
  3. Review the throughput in MB/sec.
  4. Check the daily and monthly volume projections.
  5. Use the bandwidth requirement for network planning.
  6. Adjust for peak vs. average traffic with a multiplier.

Formula used

throughput_bytes_sec = records_per_sec × avg_record_bytes
daily_GB = throughput_bytes_sec × 86400 / 1024³
monthly_TB = daily_GB × 30 / 1024
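
As a minimal sketch, the formula translates directly into Python. The function name and the sample inputs (50,000 records/sec at 512 bytes) are hypothetical, chosen only for illustration; units are binary, as in the formula.

```python
def pipeline_volume(records_per_sec, avg_record_bytes):
    """Apply the formula above; returns (bytes/sec, GB/day, TB/month)."""
    throughput_bytes_sec = records_per_sec * avg_record_bytes
    daily_gb = throughput_bytes_sec * 86_400 / 1024**3
    monthly_tb = daily_gb * 30 / 1024  # 30-day month, as in the formula
    return throughput_bytes_sec, daily_gb, monthly_tb


# Hypothetical inputs: 50,000 records/sec, 512 bytes per record.
bytes_sec, daily_gb, monthly_tb = pipeline_volume(50_000, 512)
print(f"{bytes_sec / 1024**2:.2f} MB/sec, {daily_gb:.1f} GB/day, {monthly_tb:.2f} TB/month")
# prints: 24.41 MB/sec, 2059.9 GB/day, 60.35 TB/month
```

Because the divisors are powers of 1,024, the results are slightly smaller than a decimal (powers of 1,000) calculation of the same load.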

Example Calculation

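Result: 9.77 MB/sec avg; about 824 GB/day

10,000 records/sec × 1,024 bytes = 10,240,000 bytes/sec (9.77 MB/sec in the binary units the formula uses). Daily: 10,240,000 × 86,400 / 1024³ ≈ 824 GB. Monthly: 824 × 30 / 1024 ≈ 24.1 TB. With a 2× peak multiplier, provision for about 19.5 MB/sec and 1.6 TB/day of peak throughput. (In decimal units the same load is about 10.2 MB/sec and 885 GB/day.)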

Tips & Best Practices

  • Size for peak throughput, not average—traffic spikes can be 2–5× the average.
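  • For Kafka, assume about 10 MB/sec per partition as a conservative planning figure: divide total throughput (MB/sec) by 10 and round up for a minimum partition count.
  • Kinesis shards accept 1 MB/sec (or 1,000 records/sec) in and deliver 2 MB/sec out; plan shard count against whichever limit you hit first.
  • Allow for serialization overhead on top of the raw payload (rough figures: Protobuf ~5%, Avro ~10%, JSON ~40%). 20–30% headroom covers the binary formats; JSON needs more.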
  • Monitor consumer lag to detect when throughput exceeds processing capacity.
  • Use compression (snappy, lz4, zstd) to reduce network bandwidth by 40–70%.
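
The Kinesis tip can be sketched as a shard-count estimate. This is an illustration under stated assumptions, not an official sizing tool: the function name and the `peak_multiplier` and `consumers` parameters are hypothetical, every consumer is assumed to read the full stream from the shared 2 MB/sec per-shard read limit, and the 1,000 records/sec per-shard write limit comes from the AWS quotas rather than from this page.

```python
import math


def kinesis_shards(records_per_sec, record_bytes, peak_multiplier=1.0, consumers=1):
    """Estimate minimum shards from per-shard limits: 1 MB/sec in, 2 MB/sec out."""
    mb_in = records_per_sec * record_bytes * peak_multiplier / 1024**2
    mb_out = mb_in * consumers  # assumption: each consumer reads the whole stream
    by_write_bytes = math.ceil(mb_in / 1.0)
    by_read_bytes = math.ceil(mb_out / 2.0)
    # Per-shard write limit of 1,000 records/sec (AWS quota, not from this page).
    by_write_records = math.ceil(records_per_sec * peak_multiplier / 1000)
    return max(by_write_bytes, by_read_bytes, by_write_records, 1)


print(kinesis_shards(10_000, 1_024))                                   # 10
print(kinesis_shards(10_000, 1_024, peak_multiplier=2, consumers=3))   # 30
```

In the second call the read side dominates: three consumers at peak pull roughly 58.6 MB/sec, which needs 30 shards at 2 MB/sec each.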

Sizing Kafka Clusters

As a rule of thumb, each Kafka partition handles roughly 10–50 MB/sec. Divide your total throughput by the per-partition throughput to get the minimum partition count. Multiply total throughput by the replication factor to get aggregate broker disk write throughput. Add 30% headroom for traffic spikes.
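
A rough sketch of this sizing arithmetic follows. The function name and defaults are assumptions for illustration: 10 MB/sec per partition (the low end of the range above), a replication factor of 3, and 30% headroom. Real per-partition throughput depends on hardware, message size, and broker configuration, so measure before committing to a number.

```python
import math


def kafka_sizing(total_mb_sec, per_partition_mb_sec=10.0,
                 replication_factor=3, headroom=0.30):
    """Return (minimum partitions, aggregate broker disk write MB/sec)."""
    provisioned = total_mb_sec * (1 + headroom)  # throughput including headroom
    partitions = math.ceil(provisioned / per_partition_mb_sec)
    disk_write_mb_sec = provisioned * replication_factor  # every replica writes to disk
    return partitions, disk_write_mb_sec


# Hypothetical 100 MB/sec pipeline with the default assumptions.
partitions, disk_mb_sec = kafka_sizing(100)
print(f"{partitions} partitions, {disk_mb_sec:.0f} MB/sec aggregate disk writes")
```

The partition count is a floor for throughput only; consumer parallelism often pushes the practical number higher.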

Network Bandwidth Planning

Pipeline throughput directly consumes network bandwidth. A 100 MB/sec pipeline requires at least 1 Gbps of network capacity (100 MB/sec × 8 = 800 Mbps of data, plus protocol overhead). Cross-region replication doubles bandwidth requirements. Use compression to reduce wire size.
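
The conversion can be sketched as below. The 25% overhead default is an assumption picked so that the 100 MB/sec example lands on 1 Gbps; the page does not state an overhead figure. The function name is hypothetical, and MB here is decimal (1 MB/sec = 8 Mbps).

```python
def required_bandwidth_mbps(mb_per_sec, overhead=0.25, cross_region=False):
    """Return (data Mbps, provisioned Mbps) for a given MB/sec throughput."""
    data_mbps = mb_per_sec * 8                 # bytes to bits
    provisioned = data_mbps * (1 + overhead)   # assumed protocol overhead
    if cross_region:
        provisioned *= 2                       # replication sends the data a second time
    return data_mbps, provisioned


data, total = required_bandwidth_mbps(100)
print(f"{data} Mbps data, {total:.0f} Mbps provisioned")  # 800 Mbps data, 1000 Mbps provisioned
```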

Storage Capacity from Throughput

Daily volume = throughput × 86,400 seconds. Multiply by the retention period for total storage. A 50 MB/sec pipeline with 7-day retention needs 50 × 86,400 × 7 = 30,240,000 MB, or about 30.2 TB of raw storage in decimal units (~10 TB at roughly 3:1 compression).
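
The same retention arithmetic as a small sketch. The function name and the `compression_ratio` parameter are illustrative assumptions; the result uses decimal terabytes to match the 30.2 TB figure, and replication is not included.

```python
def retention_storage_tb(mb_per_sec, retention_days, compression_ratio=1.0):
    """Storage in decimal TB for a throughput held for `retention_days`."""
    raw_mb = mb_per_sec * 86_400 * retention_days
    raw_tb = raw_mb / 1_000_000  # decimal: 1 TB = 1,000,000 MB
    return raw_tb / compression_ratio


print(f"{retention_storage_tb(50, 7):.2f} TB raw")                            # 30.24 TB raw
print(f"{retention_storage_tb(50, 7, compression_ratio=3):.2f} TB at 3:1")    # 10.08 TB at 3:1
```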

Frequently Asked Questions

  • Typical record rates by use case: web analytics, 1,000–50,000/sec; IoT telemetry, 10,000–500,000/sec; financial markets, 100,000–1,000,000/sec; log shipping, 5,000–100,000/sec. Actual rates vary widely with use case and traffic volume.