Slide 1: Deploying Large File Transfer on an HTTP Content Distribution Network
KyoungSoo Park and Vivek Pai Princeton University
Slide 2: Large File Transfers
Generally in range of 10+ MB to a few GB
  Software distribution, patches, movies, etc.
  One-to-many downloads
Not friendly to HTTP CDNs
  High replication consumes too much space
  Whole-file caching evicts 1000’s of small files
Current approach: custom protocols
10/31/07 USENIX WORLDS '04 2
Slide 3: Why Not HTTP?
Software widely available
  Both for publishers and clients
Well-known port 80
  No firewall, NAT problems
CDNs already exist
  No need for content in many formats
  Easier resource provisioning
Slide 4: Large Files Over HTTP CDNs
Break file into chunks
  Use byte-range support in HTTP
  But proxies, CDNs hate disjoint regions
Treat chunks as files
  Hashes easily, caches easily
  Fetch & evict chunks, not large files
Use unmodified clients & servers
  All support on the CDN itself
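The chunking idea on this slide can be sketched in a few lines. This is a minimal illustration, not CoDeploy's code: `chunk_ranges`, `chunk_url`, and `range_header` are hypothetical helper names, and the `url/start-end` naming is only one assumed way to spell a per-chunk URL so that each chunk hashes and caches like an ordinary file.

```python
def chunk_ranges(file_size, chunk_size):
    """Return inclusive (start, end) byte offsets that cover file_size bytes."""
    if file_size < 0 or chunk_size <= 0:
        raise ValueError("file_size must be >= 0 and chunk_size must be > 0")
    return [(start, min(start + chunk_size, file_size) - 1)
            for start in range(0, file_size, chunk_size)]


def chunk_url(url, start, end):
    """Name a chunk as its own URL (assumed 'url/start-end' scheme)."""
    return f"{url}/{start}-{end}"


def range_header(start, end):
    """Value of a standard HTTP Range request header for one chunk."""
    return f"bytes={start}-{end}"
```

For example, `chunk_ranges(10, 4)` yields three chunks, the last one shorter than the others.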
Slide 5: Our Approach
CDN = Redirector + Reverse Proxy
CDN reverse caches the chunks!
[Figure: Client, Agent, and CDN nodes exchanging chunks labeled file 0-1, file 1-2, file 2-3, file 3-4, and file 4-5]
Slide 6: The Role of Agent
Separate process on a CDN node
  Viewed as a simple HTTP server
Split large request into many small requests
  GET url becomes GET url/range
  Merge replies into one large response
Issue parallel requests of chunks
  Massage replies from servers
  Retry of slow chunks
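The agent's split, parallel fetch, retry, and merge steps might look roughly like the sketch below. It is an assumption-laden illustration rather than the real agent: `download` and the caller-supplied `fetch_chunk(start, end)` are hypothetical names, the defaults of 10 parallel chunks of 60KB come from the download tests in this deck, and working in fixed batches of `parallel` chunks is a simplification chosen to keep the code short.

```python
import concurrent.futures as cf


def download(fetch_chunk, file_size, chunk_size=60 * 1024,
             parallel=10, timeout=1.0, max_tries=3):
    """Fetch file_size bytes as chunks and return them merged in order.

    fetch_chunk(start, end) is a caller-supplied (hypothetical) function
    that returns the bytes for the inclusive range start..end.
    """
    # Split the large request into many small ones.
    ranges = [(s, min(s + chunk_size, file_size) - 1)
              for s in range(0, file_size, chunk_size)]
    parts = {}
    # Extra workers so a retry is usually not queued behind the slow
    # request it replaces.
    with cf.ThreadPoolExecutor(max_workers=parallel * max_tries) as pool:
        for i in range(0, len(ranges), parallel):
            missing = ranges[i:i + parallel]
            for _attempt in range(max_tries):
                futures = {pool.submit(fetch_chunk, s, e): (s, e)
                           for s, e in missing}
                done, _slow = cf.wait(futures, timeout=timeout)
                for fut in done:
                    if fut.exception() is None:
                        parts[futures[fut]] = fut.result()
                # Chunks that were slow or failed are requested again.
                missing = [r for r in missing if r not in parts]
                if not missing:
                    break
            else:
                raise TimeoutError(f"chunks still missing: {missing}")
    # Merge the replies into one large response.
    return b"".join(parts[r] for r in ranges)
```

A slow request is not cancelled here; its reply is simply ignored once the retry has delivered the same chunk.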
Slide 7: HTTP Header Modifications
Egress (CDN proxy to origin server)
  Request at CDN proxy: GET url/ranges, Header: blah, Header: blah
  Request sent to origin server: GET url, Range: bytes ranges, Header: blah
Ingress (origin server to CDN proxy)
  Response from origin server: HTTP/1.0 206 Partial, Range: start-end/length, Header: blah
  Response after CDN proxy rewrite: HTTP/1.0 200 OK, Content-length: piece length, New-header: obj length
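The two rewrites on this slide can be illustrated with plain dictionaries standing in for HTTP headers. This is a sketch under stated assumptions: `egress_rewrite` and `ingress_rewrite` are hypothetical names, the chunk URL is assumed to end in `/start-end`, the origin is assumed to answer with the standard `Content-Range: bytes start-end/length` header, and `X-Object-Length` is an invented stand-in for the slide's unnamed "New-header". Header names are matched exactly as written, which a real proxy would not rely on.

```python
import re


def egress_rewrite(url, headers):
    """Turn 'GET url/start-end' into 'GET url' plus a Range header."""
    base, _, rng = url.rpartition("/")
    if not re.fullmatch(r"\d+-\d+", rng):
        return url, dict(headers)  # not a chunk request; pass through
    out = dict(headers)
    out["Range"] = f"bytes={rng}"
    return base, out


def ingress_rewrite(status, headers):
    """Turn a 206 partial reply into a 200 reply for a whole 'chunk file'."""
    if status != 206:
        return status, dict(headers)
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+)",
                         headers.get("Content-Range", ""))
    if match is None:
        return status, dict(headers)  # unknown total length; pass through
    start, end, total = map(int, match.groups())
    out = {k: v for k, v in headers.items() if k != "Content-Range"}
    out["Content-Length"] = str(end - start + 1)  # piece length
    out["X-Object-Length"] = str(total)           # hypothetical header name
    return 200, out
```

Presenting each piece as a complete 200 response is what lets an unmodified cache store it like any other small object.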
Slide 8: Deployment Status
CoDeploy running since March 2004
  Available on ~120 PlanetLab nodes
  Used in file synchronization service
Low incremental overhead
  Agent is about 500 semicolons
  CDN mods about 20 semicolons
  Techniques portable to other CDNs
Slide 9: Parallelism vs. Chunk size
More parallel requests
  Involves more CDN nodes
Bigger chunk size
  Reduces per-chunk overheads
Total buffer size
  (# parallel) * (chunk size) * (# clients)
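As a worked example of the total buffer size formula, the sketch below plugs in figures that appear elsewhere in this deck: 10 parallel chunks, 60KB chunks, and 120 clients. Treating 1KB as 1024 bytes is an assumption, and `total_buffer_bytes` is a hypothetical helper name.

```python
def total_buffer_bytes(parallel, chunk_bytes, clients):
    """Total buffer size = (# parallel) * (chunk size) * (# clients)."""
    return parallel * chunk_bytes * clients


KB = 1024  # assumption: binary kilobytes

per_client = total_buffer_bytes(10, 60 * KB, 1)    # 600 KB in flight per client
all_clients = total_buffer_bytes(10, 60 * KB, 120)  # 72,000 KB across 120 clients
```

Doubling either the chunk size or the number of parallel requests doubles the total, which is why the two knobs trade off against each other.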
Slide 10: Total Buffer Size vs. Bandwidth
[Figure: bandwidth (Kbps, axis 0 to 12000) vs. total buffer size (10KB to 5.12MB), one series each for 10KB, 20KB, 40KB, and 80KB chunks]
BW ∝ total buffer size
Slide 11: Downloading Tests
Server at lightly-loaded Princeton node
  Downloading a 50MB file
  10 parallel chunks of 60KB each
Point-to-point vs. 1-to-many
Test policies
  Direct (with larger socket buffers)
  Aggressive - point-to-point CoDeploy, no CDN
  CoDeploy First
  CoDeploy Cached
Slide 12: One Server One Client
unit: Ping in ms, bandwidth in Kbps

Node          Ping   Direct
Rutgers       5.6    18617
U Maryland    5.5    12046
U Notre Dame  35.1   8359
U Michigan    46.9   5688
U Nebraska    51.4   2786
U Utah        66.8   2904
UBC           74.6   3276
U Washington  79.2   4551
UC Berkeley   81.4   4501
UCLA          84.6   2677

Aggressive, CoDeploy First, and Cached bandwidths follow in the same node order. The source runs two ten-value columns together under "Aggressive" and gives none under "Cached", so the grouping is kept as it appears:
Aggressive      4501 5535 7728 6205 5119 3900 4137 4357 9308 4055
                14123 14123 25599 10239 4266 7585 7728 17808 20479 7314
CoDeploy First  3624 4095 4179 4708 4551 2512 2100 1765 6501 2178
Cached
Slide 13: One Server Many (120) Clients
[Figure: bar chart of bandwidth (Kbps, axis 0 to 7000) for Direct, Aggressive, CoDeploy First, and CoDeploy Cached at the 25th percentile, median, mean, and 75th percentile]
Slide 14: Chunk Download Times
Short time-scale effects dominate
  Low download time, high std deviation
  Implies fast == unpredictable
We see this in practice
  Some nodes' hit times vary 20x
  Makes managing timeouts harder
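To make the timeout problem concrete, here is one common heuristic, offered as an assumption and not as anything the slides describe: set the per-chunk timeout to the mean of recent download times plus a multiple of their standard deviation. `chunk_timeout`, `relative_spread`, and the multiplier `k` are hypothetical names, and any sample timings passed in would be illustrative only.

```python
import statistics


def chunk_timeout(samples, k=2.0):
    """Return mean + k * standard deviation of recent chunk download times."""
    if not samples:
        raise ValueError("need at least one sample")
    return statistics.mean(samples) + k * statistics.pstdev(samples)


def relative_spread(samples):
    """Standard deviation divided by the mean (coefficient of variation)."""
    return statistics.pstdev(samples) / statistics.mean(samples)
```

When the relative spread is large, a timeout derived this way sits far above the typical download time, so slow chunks are detected late; a tighter value instead retries chunks that would have finished. That is one way to read "fast == unpredictable".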
Slide 15: Lessons Learned
Large file support over HTTP possible
  Basic implementation is easy
  No client/server changes
Tradeoffs not where you expect
  Flexibility on buffer size, parallelism
  Hard part is managing performance
  Short time-scale effects dominate
Slide 16: Future Work
Better proximity info
  Tradeoff with load balance
Better timing management
  Prefetching at reverse proxies
  More aggressive retrying
HTTP streaming
  More chunks = less jitter?
Slide 17: More Info
http://codeen.cs.princeton.edu/codeploy/
KyoungSoo Park kyoungso@cs.princeton.edu
Vivek Pai vivek@cs.princeton.edu