Technical Blog

Capacity Estimation: Back of the Envelope

| Metric System | Value |
| --- | --- |
| 1 Million | $10^{6}$ |
| 1 Billion | $10^{9}$ |
| 1 Trillion | $10^{12}$ |

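| Data Storage Unit | Value | Bytes (decimal) | Bytes (binary approximation) |
| --- | --- | --- | --- |
| 1 KB | $10^{3}$ B | $10^{3}$ B | $2^{10}$ B |
| 1 MB | $10^{3}$ KB | $10^{6}$ B | $2^{20}$ B |
| 1 GB | $10^{3}$ MB | $10^{9}$ B | $2^{30}$ B |
| 1 TB | $10^{3}$ GB | $10^{12}$ B | $2^{40}$ B |
| 1 PB (PetaByte) | $10^{3}$ TB | $10^{15}$ B | $2^{50}$ B |
| 1 EB | $10^{3}$ PB | $10^{18}$ B | $2^{60}$ B |
| 1 ZB | $10^{3}$ EB | $10^{21}$ B | $2^{70}$ B |
| 1 YB | $10^{3}$ ZB | $10^{24}$ B | $2^{80}$ B |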
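To make the unit ladder concrete, here is a minimal Python sketch that formats a raw byte count using the decimal units from the table. The names `UNITS` and `humanize` are hypothetical helpers chosen for this illustration, not part of any library.

```python
# Decimal (SI) units: each step up is a factor of 10^3.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def humanize(num_bytes: float) -> str:
    """Format a byte count with the largest decimal unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value fits, or at the last unit available.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


print(humanize(10**13))      # 10 TB
print(humanize(3 * 10**10))  # 30 GB
```

For back-of-the-envelope work the decimal and binary columns are close enough to swap freely: $2^{10}$ = 1024 is within 2.4% of $10^{3}$.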

| Power of Two | Value |
| --- | --- |
| $2^{0}$ | 1 |
| $2^{1}$ | 2 |
| $2^{2}$ | 4 |
| $2^{3}$ | 8 |
| $2^{4}$ | 16 |
| $2^{5}$ | 32 |
| $2^{6}$ | 64 |
| $2^{7}$ | 128 |
| $2^{8}$ | 256 |
| $2^{9}$ | 512 |
| $2^{10}$ | 1024 |

As a baseline for a large-scale system, we generally assume about 1 million requests per day.

1 Million/day = $\frac{10^{6}}{24 \text{ hrs} * 60 \text{ min} * 60 \text{ sec}}$ ≈ 11.57 ≈ 12 requests/sec.
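The per-day to per-second conversion is used in every estimate that follows, so it is worth seeing once as code. This is a small sketch; `SECONDS_PER_DAY` and `per_second` are names made up for the example.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(per_day: float) -> float:
    """Convert any per-day quantity (requests, bytes, ...) to a per-second rate."""
    return per_day / SECONDS_PER_DAY


rate = per_second(1_000_000)
print(round(rate, 2))  # 11.57, which we round up to 12 for mental math
```

The handy rule to remember: every million "things per day" is roughly 12 "things per second".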

Storage Estimation / Bandwidth Estimation: Incoming (Ingress) and Outgoing (Egress)

World population ≈ 8 billion.

Population of India ≈ 1.4 billion.

DAU: Daily Active Users.

MAU: Monthly Active Users.

Capacity Estimation Example.

WhatsApp.

DAU ≈ 2B users, sending ≈ 100B messages every day.

Average message size ≈ 100 Bytes. In an interview, ask for the expected sizes and storage requirements rather than guessing. For example, you can assume one good-quality image is 100 MB.

1 day of messages = $100 * 10^{9} * 100$ Bytes = $10^{13}$ Bytes = 10 TB

  • 100B messages = $100 * 10^{9}$ messages

5 years of messages = 10 TB * 365 * 5 ≈ 18 PB ≈ 20 PB.
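The storage arithmetic above can be checked with a few lines of Python. This is only a sketch under the same assumptions as the text (100B messages per day, 100 bytes per message, no replication or metadata overhead); the variable names are illustrative.

```python
messages_per_day = 100 * 10**9   # 100 billion messages
avg_message_bytes = 100          # assumed average message size

daily_bytes = messages_per_day * avg_message_bytes
five_year_bytes = daily_bytes * 365 * 5

print(daily_bytes / 10**12)      # 10.0  (TB per day)
print(five_year_bytes / 10**15)  # 18.25 (PB over 5 years, rounded to 20 PB)
```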

1 day of messages ≈ 10 TB.

Bandwidth Estimation = 10 TB/day

10 TB/day ≈ 10 * $10^{6}$ * $10^{6}$ B/day
          ≈ 10 * $10^{6}$ * 12 B/sec
          ≈ 120 MB/sec.

10 TB = $10^{13}$ B.

$10^{6}$ B/day = 1 million B/day ≈ 12 B/sec.
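To see how much the "1 million per day ≈ 12 per second" shortcut costs in accuracy, the sketch below compares it with the exact division by 86,400 seconds for the 10 TB/day figure. Variable names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
daily_bytes = 10 * 10**12  # 10 TB/day

# Exact conversion: divide by the number of seconds in a day.
exact_mb_per_sec = daily_bytes / SECONDS_PER_DAY / 10**6

# Shortcut: count the millions of bytes per day, then multiply by 12 B/sec.
shortcut_mb_per_sec = (daily_bytes / 10**6) * 12 / 10**6

print(f"{exact_mb_per_sec:.1f} MB/sec")     # 115.7 MB/sec
print(f"{shortcut_mb_per_sec:.1f} MB/sec")  # 120.0 MB/sec
```

The shortcut overshoots by under 4%, which is well within the tolerance of a back-of-the-envelope estimate.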

Twitter.

DAU ≈ 200M.

Read:Write ≈ 150 tweets read per user per day : 100M tweets written per day.

Storage Estimation.

  • 140 characters per tweet.
  • 2 bytes to store each character.
  • 20 Bytes to store the metadata (ID, UserId, timestamp).

Total storage for a single tweet = (140 characters * 2 Bytes) + 20 Bytes metadata = 280 Bytes + 20 Bytes = 300 Bytes.

Write Estimates.

100M tweets per day: 100M * 300 Bytes = 3 * $10^{10}$ B/day = 30 GB/day.
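Here is the per-tweet size and the daily text volume as a quick Python check. The constants mirror the assumptions listed above; the names themselves are made up for this sketch.

```python
CHARS_PER_TWEET = 140
BYTES_PER_CHAR = 2
METADATA_BYTES = 20          # ID, UserId, timestamp
TWEETS_PER_DAY = 100 * 10**6

tweet_bytes = CHARS_PER_TWEET * BYTES_PER_CHAR + METADATA_BYTES
daily_text_bytes = TWEETS_PER_DAY * tweet_bytes

print(tweet_bytes)               # 300 bytes per tweet
print(daily_text_bytes / 10**9)  # 30.0 GB of text per day
```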

Every 5th tweet has an image ≈ 200 KB.

Every 10th tweet has a video ≈ 2 MB/video.

Total space = $\frac{100\text{M tweets}}{5}$ * 200 KB + $\frac{100\text{M tweets}}{10}$ * 2 MB

= ((20M * 200 KB) + (10M * 2 MB)) /day

= (20 * $10^6$ * 200 * $10^{3}$) + (10 * $10^{6}$ * 2 * $10^{6}$) B/day

= (4 * $10^{12}$) + (20 * $10^{12}$) B/day

= (4 TB + 20 TB) /day

= 24 TB/day. (Ingress, i.e. incoming bandwidth)

= 24 * $10^{6}$ MB/day (1 TB = $10^{6}$ MB)

= $\frac{24 * 10^{6}}{86400}$ MB/sec

≈ 278 MB/sec ≈ 300 MB/sec.
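The media write volume can be reproduced the same way. This sketch assumes exactly one in five tweets carries a 200 KB image and one in ten carries a 2 MB video, as stated above, and divides by the exact 86,400 seconds per day. All names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TWEETS_PER_DAY = 100 * 10**6
IMAGE_BYTES = 200 * 10**3  # 200 KB per image
VIDEO_BYTES = 2 * 10**6    # 2 MB per video

image_bytes_per_day = (TWEETS_PER_DAY // 5) * IMAGE_BYTES   # every 5th tweet
video_bytes_per_day = (TWEETS_PER_DAY // 10) * VIDEO_BYTES  # every 10th tweet
media_bytes_per_day = image_bytes_per_day + video_bytes_per_day

ingress_mb_per_sec = media_bytes_per_day / SECONDS_PER_DAY / 10**6

print(media_bytes_per_day / 10**12)  # 24.0 TB/day
print(round(ingress_mb_per_sec, 1))  # 277.8 MB/sec
```

Video dominates the total here: it is only half as frequent as images but ten times larger per item.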

Read Estimates.

DAU = 200M.

Each person sees 150 tweets per day.

Total size of text tweets read = 200M * 150 tweets * 300 Bytes

= 30 * $10^{3}$ * $10^{6}$ * 300 B/day

= 9 TB/day

= 9 * $10^{6}$ MB/day

≈ 9 * $10^{6}$ * 12 B/sec

≈ 108 MB/sec ≈ 100 MB/sec.
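A short sketch of the text read path, using the same inputs (200M daily active users, 150 tweets viewed each, 300 bytes per tweet). It ignores caching and any protocol overhead, and the names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAU = 200 * 10**6
TWEETS_READ_PER_USER = 150
TWEET_BYTES = 300

tweets_read_per_day = DAU * TWEETS_READ_PER_USER
text_read_bytes_per_day = tweets_read_per_day * TWEET_BYTES
text_egress_mb_per_sec = text_read_bytes_per_day / SECONDS_PER_DAY / 10**6

print(tweets_read_per_day / 10**9)       # 30.0 billion tweet reads per day
print(text_read_bytes_per_day / 10**12)  # 9.0 TB/day
print(round(text_egress_mb_per_sec, 1))  # 104.2 MB/sec
```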

Every 5th tweet contains an image. $200\text{M} * 150$ tweets are read daily.

Total size of the tweets containing images = $\frac{200\text{M} * 150}{5}$ * 200 KB/day

= (6000M * 200 KB)/day

= 12 * $10^{5}$ * $10^{6}$ KB/day

= 1.2 * $10^{12}$ KB/day

= 1.2 PB/day.

Every 10th tweet contains a video. $200\text{M} * 150$ tweets are read daily.

Total size of the tweets containing videos = $\frac{200\text{M} * 150}{10}$ * 2 MB/day

= (3000M * 2 MB)/day

= 6 * $10^{3}$ * $10^{6}$ * $10^{3}$ KB/day

= 6 * $10^{12}$ KB/day

= 6 PB/day.

All of the images, videos, and text read by users count as egress (outgoing) traffic.
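Because the exponents in the media read estimates are easy to slip on, it helps to run the numbers in plain bytes. The sketch below assumes the same inputs as the text (30 billion tweet reads per day, every 5th with a 200 KB image, every 10th with a 2 MB video) and no caching or CDN offload. Variable names are illustrative.

```python
tweets_read_per_day = 200 * 10**6 * 150  # 30 billion reads
TWEET_BYTES = 300
IMAGE_BYTES = 200 * 10**3                # 200 KB
VIDEO_BYTES = 2 * 10**6                  # 2 MB

text_bytes = tweets_read_per_day * TWEET_BYTES
image_bytes = (tweets_read_per_day // 5) * IMAGE_BYTES
video_bytes = (tweets_read_per_day // 10) * VIDEO_BYTES
total_egress_bytes = text_bytes + image_bytes + video_bytes

print(image_bytes / 10**15)         # 1.2   PB/day of images
print(video_bytes / 10**15)         # 6.0   PB/day of video
print(total_egress_bytes / 10**15)  # 7.209 PB/day in total
```

Media reads come out in petabytes per day, roughly three orders of magnitude above the 9 TB/day of text, so egress sizing is driven almost entirely by images and video.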