From: bigrodic

2011 mongo FR - scaling with mongodb 

Tags:  mongodb  scalability mongodb sharding replication db databa 
Views:  58
Published:  December 26, 2011
 
Notes:
 
Slide 1: Scaling with MongoDB Eliot Horowitz @eliothorowitz MongoUK March 21, 2011
Slide 2: Scaling • Storage needs only go up • Operations/sec only go up • Complexity only goes up
Slide 3: Horizontal Scaling • Vertical scaling is limited • Hard to scale vertically in the cloud • Can scale wider than higher
Slide 4: Read Scaling • One master at any time • Programmer determines if read hits master or a slave • Pro: easy to set up, can scale reads very well • Con: reads are inconsistent on a slave • Writes don’t scale
Slide 5: One Master, Many Slaves • Custom Master/Slave setup • Have as many slaves as you want • Can put them local to application servers • Good for 90+% read heavy applications (Wikipedia)
Slide 6: Replica Sets • High Availability Cluster • One master at any time, up to 6 slaves • A slave is automatically promoted to master on failure • Drivers support auto routing of reads to slaves if programmer allows • Good for applications that need high write availability but mostly reads (Commenting System)
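The read routing and failover described on the replica set slide can be sketched in plain Python. This is a toy model, not driver code: the class `ReplicaSetRouter`, the `slave_ok` flag, and the "promote the first slave" rule are all assumptions made for illustration, and a real replica set elects its new master rather than taking the first slave in a list.

```python
class ReplicaSetRouter:
    """Toy model of one master plus slaves (hypothetical, not a real driver)."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._reads = 0  # counter used to rotate reads across slaves

    def node_for_write(self):
        # Writes always go to the single master.
        return self.master

    def node_for_read(self, slave_ok=False):
        # Reads hit the master unless the programmer explicitly allows slaves.
        if not slave_ok or not self.slaves:
            return self.master
        node = self.slaves[self._reads % len(self.slaves)]
        self._reads += 1
        return node

    def fail_master(self):
        # Simplification: promote the first slave. Real elections are more involved.
        if not self.slaves:
            raise RuntimeError("no slave available to promote")
        self.master = self.slaves.pop(0)


rs = ReplicaSetRouter("node-a", ["node-b", "node-c"])
default_read = rs.node_for_read()               # stays on the master
slave_read = rs.node_for_read(slave_ok=True)    # may be served by a slave
rs.fail_master()
new_master = rs.node_for_write()
```

Reads stay consistent by default, and the application opts in to possibly stale slave reads one call at a time.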
Slide 7: Sharding • Many masters, even more slaves • Can scale in two dimensions • Add Shards for write and data size scaling • Add slaves for inconsistent read scaling and redundancy
Slide 8: Sharding Basics • Data is split up into chunks • Shard: Replica sets that hold a portion of the data • Config Servers: Store meta data about system • Mongos: Routers, direct and merge requests
Slide 9: Architecture [Diagram: client processes connect to mongos routers; the mongos routers use three config servers (mongod) and route to the shards, each shard made up of several mongod processes]
Slide 10: Common Setup • A common setup is 3 shards with 3 servers per shard: 3 masters, 6 slaves • Can add sharding later to an existing replica set with no down time • Can have sharded and non-sharded collections
Slide 11: Range Based
MIN  MAX  LOCATION
A    F    shard1
F    M    shard1
M    R    shard2
R    Z    shard3
• collection is broken into chunks by range • chunks default to 64mb or 100,000 objects
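A minimal sketch of how a router could map a shard key to a shard using the range table on this slide. The boundaries and shard names are taken from the slide; the helper `shard_for` and the choice to treat each chunk as the half-open range [MIN, MAX) are assumptions for illustration.

```python
from bisect import bisect_right

# Chunk table from the slide: A-F and F-M on shard1, M-R on shard2, R-Z on shard3.
CHUNK_MINS = ["A", "F", "M", "R"]   # lower bound of each chunk, sorted
CHUNK_MAX = "Z"                     # upper bound of the last chunk
CHUNK_SHARDS = ["shard1", "shard1", "shard2", "shard3"]


def shard_for(key: str) -> str:
    """Return the shard whose chunk range [min, max) contains key."""
    if not (CHUNK_MINS[0] <= key < CHUNK_MAX):
        raise KeyError(f"{key!r} is outside the chunk ranges")
    # bisect_right gives the position of the first lower bound greater than key,
    # so the owning chunk is the one just before that position.
    index = bisect_right(CHUNK_MINS, key) - 1
    return CHUNK_SHARDS[index]
```

Because the table is sorted by range, the lookup is a binary search, and a key equal to a boundary (for example "F") belongs to the chunk that starts there.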
Slide 12: Config Servers • 3 of them • changes are made with 2 phase commit • if any are down, meta data goes read only • system is online as long as 1/3 is up
Slide 13: mongos • Sharding Router • Acts just like a mongod to clients • Can have 1 or as many as you want • Can run on appserver so no extra network traffic • Cache meta data from config servers
Slide 14: Writes • Inserts : require shard key, routed • Removes: routed and/or scattered • Updates: routed or scattered
Slide 15: Queries • By shard key: routed • sorted by shard key: routed in order • by non shard key: scatter gather • sorted by non shard key: distributed merge sort
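The difference between a routed query and a distributed merge sort can be sketched with in-memory lists standing in for shards. Everything here is hypothetical: the sample documents, the two-shard split at the letter "m", and the function names. It only illustrates the idea that a shard-key lookup contacts one shard, while a sort on another field has each shard sort locally before the router merges the results.

```python
import heapq
from operator import itemgetter

# Hypothetical data, sharded by email: keys below "m" on shard1, the rest on shard2.
SHARDS = {
    "shard1": [{"email": "al@example.com", "age": 41},
               {"email": "bo@example.com", "age": 23}],
    "shard2": [{"email": "mia@example.com", "age": 35},
               {"email": "ned@example.com", "age": 19}],
}


def shard_for(email):
    return "shard1" if email < "m" else "shard2"


def find_by_email(email):
    """Routed query: only the shard that owns the key is contacted."""
    name = shard_for(email)
    docs = [d for d in SHARDS[name] if d["email"] == email]
    return docs, [name]


def find_sorted_by_age():
    """Distributed merge sort: shards sort locally, the router merges the streams."""
    per_shard = [sorted(docs, key=itemgetter("age")) for docs in SHARDS.values()]
    return list(heapq.merge(*per_shard, key=itemgetter("age")))


docs, contacted = find_by_email("mia@example.com")
ages = [d["age"] for d in find_sorted_by_age()]
```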
Slide 16: Splitting • Take a chunk and split it in 2 • Splits on the median value • Splits only change meta data, no data change
Slide 17: Splitting
T1
MIN  MAX  LOCATION
A    Z    shard1
T2
MIN  MAX  LOCATION
A    G    shard1
G    Z    shard1
T3
MIN  MAX  LOCATION
A    D    shard1
D    G    shard1
G    S    shard1
S    Z    shard1
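A small sketch of a split as a metadata-only operation: one chunk entry becomes two, both still on the same shard, with the split point at the median key. The chunk dictionary layout and the function `split_chunk` are assumptions for illustration, and the sample keys are made up so that the median lands on "G" as in step T2.

```python
def split_chunk(chunk, keys):
    """Split one chunk entry in two at the median of the keys it holds.

    chunk is a dict with 'min', 'max' and 'shard'; keys are the shard key
    values currently stored. Only metadata is produced, no data moves.
    """
    in_range = sorted(k for k in keys if chunk["min"] <= k < chunk["max"])
    if len(in_range) < 2:
        raise ValueError("not enough keys to split")
    median = in_range[len(in_range) // 2]
    if median == chunk["min"]:
        raise ValueError("median equals lower bound, cannot split here")
    left = {"min": chunk["min"], "max": median, "shard": chunk["shard"]}
    right = {"min": median, "max": chunk["max"], "shard": chunk["shard"]}
    return left, right


t1 = {"min": "A", "max": "Z", "shard": "shard1"}
left, right = split_chunk(t1, ["B", "D", "G", "K", "S"])
```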
Slide 18: Balancing • Moves chunks from one shard to another • Done online while system is running • Balancing runs in the background
Slide 19: Migrating
T3
MIN  MAX  LOCATION
A    D    shard1
D    G    shard1
G    S    shard1
S    Z    shard1
T4
MIN  MAX  LOCATION
A    D    shard1
D    G    shard1
G    S    shard1
S    Z    shard2
T5
MIN  MAX  LOCATION
A    D    shard1
D    G    shard1
G    S    shard2
S    Z    shard2
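The T3 to T5 sequence can be reproduced with a toy balancer that moves one chunk per round from the shard with the most chunks to the shard with the fewest. This is only a sketch under stated assumptions: the function `balance_once`, the "difference of at least two chunks" threshold, and the choice to move the highest range first are all invented for illustration, and a real migration also copies the chunk's data, which is not modelled here.

```python
def balance_once(chunks, shards):
    """Move one chunk from the fullest shard to the emptiest one.

    Returns True if a chunk was reassigned, False if already balanced.
    Only the chunk table is updated; no data transfer is simulated.
    """
    counts = {s: 0 for s in shards}
    for chunk in chunks:
        counts[chunk["shard"]] += 1
    donor = max(shards, key=lambda s: counts[s])
    receiver = min(shards, key=lambda s: counts[s])
    if counts[donor] - counts[receiver] < 2:   # hypothetical threshold
        return False
    for chunk in reversed(chunks):             # move the highest range first
        if chunk["shard"] == donor:
            chunk["shard"] = receiver
            return True
    return False


# T3 from the slide: four chunks, all on shard1.
chunks = [{"min": lo, "max": hi, "shard": "shard1"}
          for lo, hi in [("A", "D"), ("D", "G"), ("G", "S"), ("S", "Z")]]
shards = ["shard1", "shard2"]

history = []
while balance_once(chunks, shards):
    history.append([c["shard"] for c in chunks])
```

After the loop, `history` holds the chunk locations after each move, which correspond to T4 and T5 on the slide.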
Slide 20: Choosing a Shard Key • Shard key determines how data is partitioned • Hard to change • Most important performance decision
Slide 21: Use Case: User Profiles { email : “eliot@10gen.com” , addresses : [ { state : “NY” } ] } •Shard by email •Lookup by email hits 1 node •Index on { “addresses.state” : 1 }
Slide 22: Use Case: Activity Stream { user_id : XXX, event_id : YYY , data : ZZZ } •Shard by user_id •Looking up an activity stream hits 1 node •Writing events is distributed •Index on { “event_id” : 1 } for deletes
Slide 23: Use Case: Photos { photo_id : ???? , data : <binary> } What’s the right key? •auto increment •MD5( data ) •now() + MD5(data) •month() + MD5(data)
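The slide leaves the question open, but two of the candidates can be compared with a short simulation. The sketch below assumes a collection already split into four chunks and then checks which chunks 100 new inserts would land in, first with an auto-increment id and then with an MD5 hex digest as the key. The split points, the made-up photo contents, and the helper `chunk_index` are all hypothetical.

```python
import hashlib
from bisect import bisect_right


def chunk_index(key, boundaries):
    """Index of the chunk that owns key, given the sorted split points."""
    return bisect_right(boundaries, key)


# Hypothetical state: ids 0..999 already stored and split into 4 chunks.
ID_BOUNDARIES = [250, 500, 750]
new_ids = range(1000, 1100)          # the next 100 auto-increment ids
increment_chunks = {chunk_index(i, ID_BOUNDARIES) for i in new_ids}

# The same 100 uploads keyed by the MD5 hex digest of their (made-up) contents,
# with split points on the first hex character.
MD5_BOUNDARIES = ["4", "8", "c"]
digests = [hashlib.md5(f"photo-{i}".encode()).hexdigest() for i in new_ids]
md5_chunks = {chunk_index(d, MD5_BOUNDARIES) for d in digests}
```

With the auto-increment key every new insert falls into the last chunk, so a single shard takes all the writes. Hashed keys spread inserts across the key space, at the cost of losing any ordering by upload time, which is the trade-off the prefixed variants on the slide (now() or month() in front of the hash) try to address.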
Slide 24: Use Case: Logging { machine : “app.foo.com” , app : “apache” , when : “2010-12-02:11:33:14” , data : XXX } Possible Shard keys •{ machine : 1 } •{ when : 1 } •{ machine : 1 , app : 1 } •{ app : 1 }
Slide 25: Download MongoDB http://www.mongodb.org and let us know what you think @eliothorowitz @mongodb 10gen is hiring! http://www.10gen.com/jobs

   