Userspace RCU library: what linear multiprocessor scalability means for your application

Tags:  available domain name  rcu 
Published:  April 20, 2011
 
Notes:
 
Slide 1: Userspace RCU Library: What Linear Multiprocessor Scalability Means for Your Application
Linux Plumbers Conference 2009
Mathieu Desnoyers
École Polytechnique de Montréal

Slide 2: Mathieu Desnoyers
● Author/maintainer of:
  – LTTV (Linux Trace Toolkit Viewer)
    ● 2003-...
  – LTTng (Linux Trace Toolkit Next Generation)
    ● 2005-...
  – Immediate Values
    ● 2007...
  – Tracepoints
    ● 2008-...
  – Userspace RCU Library
    ● 2009-...

Slide 3: Contributions by
● Paul E. McKenney
  – IBM Linux Technology Center
● Alan Stern
  – Rowland Institute, Harvard University
● Jonathan Walpole
  – Computer Science Department, Portland State University
● Michel Dagenais
  – Computer and Software Engineering Dpt., École Polytechnique de Montréal

Slide 4: Summary
● RCU Overview
● Kernel vs Userspace RCU
● Userspace RCU Library
● Benchmarks
● RCU-Friendly Applications

Slide 5: Linux Kernel RCU Usage (figure)

Slide 6: RCU Overview
● Relativistic programming
  – Updates seen in different orders by CPUs
  – Tolerates conflicts
● Linear scalability
● Wait-free read-side
● Efficient updates
  – Only a single pointer exchange needs exclusive access

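The "single pointer exchange" point can be sketched without the library at all. The block below is a minimal illustration using C11 atomics, not liburcu code: `struct config`, `current_config`, `read_sum()` and `publish()` are hypothetical names, and an acquire load stands in conservatively for what `rcu_dereference()` does. The updater prepares the new version privately and makes it visible with one atomic exchange; readers only ever perform a single load. Deciding when the returned old version may be freed is exactly the grace-period problem the rest of the talk covers, so the sketch leaves that to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared data; not part of liburcu. */
struct config {
    int a;
    int b;
};

/* The one shared pointer that readers follow. */
static _Atomic(struct config *) current_config;

/* Reader: one acquire load, no locks and no stores to shared state. */
static int read_sum(void)
{
    struct config *c =
        atomic_load_explicit(&current_config, memory_order_acquire);
    return c ? c->a + c->b : 0;
}

/*
 * Updater: build the new version privately, then publish it with a
 * single atomic exchange. Returns the previous version, which must not
 * be freed until no reader can still hold a pointer to it.
 */
static struct config *publish(int a, int b)
{
    struct config *fresh = malloc(sizeof(*fresh));
    assert(fresh != NULL); /* sketch only: allocation failure is asserted */
    fresh->a = a;
    fresh->b = b;
    return atomic_exchange_explicit(&current_config, fresh,
                                    memory_order_acq_rel);
}
```

In a single-threaded test the pointer returned by `publish()` can be freed immediately; with concurrent readers it would have to wait for a grace period.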
Slide 7: Schematic of RCU Update and Read-Side C.S. (figure)

Slide 8: RCU Linked-List Deletion (figure)

Slide 9: Kernel vs Userspace RCU
● Quiescent state
  – Kernel threads
    ● Wait for kernel pre-existing RCU read-side C.S. to complete
  – User threads
    ● Wait for process pre-existing RCU read-side C.S. to complete

Slide 10: Userspace RCU Library
● QSBR
  – liburcu-qsbr.so
● Generic RCU
  – liburcu-mb.so
● Signal-based RCU
  – liburcu.so
● call_rcu()
  – liburcu-defer.so

Slide 11: QSBR
● Detection of quiescent state:
  – Each reader thread calls rcu_quiescent_state() periodically.
● Requires application modification
● Read-side with very low overhead

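The idea behind quiescent-state detection can be sketched with a grace-period counter and one "last seen" slot per reader. This is a toy model under stated assumptions, not the liburcu-qsbr implementation: every `toy_*` name, `MAX_READERS`, and the integer reader ids are hypothetical, and the real library blocks inside `synchronize_rcu()` rather than exposing a polling check. It shows why the read side is so cheap: a reader's only obligation is an occasional store announcing that it holds no references.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_READERS 4 /* hypothetical fixed reader table */

static atomic_ulong gp_counter = 1;           /* current grace-period number */
static atomic_ulong reader_seen[MAX_READERS]; /* 0 means "not registered"    */

/* A reader announces that it currently holds no RCU-protected references. */
static void toy_quiescent_state(int id)
{
    assert(id >= 0 && id < MAX_READERS);
    atomic_store(&reader_seen[id], atomic_load(&gp_counter));
}

static void toy_register(int id)
{
    toy_quiescent_state(id);
}

static void toy_unregister(int id)
{
    assert(id >= 0 && id < MAX_READERS);
    atomic_store(&reader_seen[id], 0);
}

/* The updater starts a new grace period and gets its number back. */
static unsigned long toy_start_grace_period(void)
{
    return atomic_fetch_add(&gp_counter, 1) + 1;
}

/*
 * The grace period is over once every registered reader has passed
 * through a quiescent state after it started.
 */
static bool toy_grace_period_done(unsigned long gp)
{
    for (int i = 0; i < MAX_READERS; i++) {
        unsigned long seen = atomic_load(&reader_seen[i]);
        if (seen != 0 && seen < gp)
            return false;
    }
    return true;
}
```

An updater would remove an element, call `toy_start_grace_period()`, and free the element only once `toy_grace_period_done()` returns true for that number.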
Slide 12: Generic RCU
● Detection of quiescent state:
  – rcu_read_lock()/rcu_read_unlock() mark the beginning/end of the critical sections
  – Counts nesting level
● Suitable for library use
● Higher read-side overhead than QSBR due to added memory barriers

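The nesting count and the barrier placement can be illustrated as follows. This is a deliberately simplified sketch, not the liburcu-mb code: `struct toy_reader` and the `toy_*` functions are hypothetical, the reader state is passed explicitly instead of living in thread-local storage, and the grace-period phase tracking a real implementation needs is left out. It only shows that the reader is quiescent exactly when its nesting count is zero, and that the outermost lock and unlock are where the extra memory barriers are paid.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-reader state; only its owner thread writes nesting. */
struct toy_reader {
    atomic_uint nesting;
};

static void toy_read_lock(struct toy_reader *r)
{
    unsigned n = atomic_load_explicit(&r->nesting, memory_order_relaxed);
    atomic_store_explicit(&r->nesting, n + 1, memory_order_relaxed);
    if (n == 0) {
        /* Outermost entry: order the counter store before later reads. */
        atomic_thread_fence(memory_order_seq_cst);
    }
}

static void toy_read_unlock(struct toy_reader *r)
{
    unsigned n = atomic_load_explicit(&r->nesting, memory_order_relaxed);
    assert(n > 0); /* unbalanced unlock is a caller bug */
    if (n == 1) {
        /* Outermost exit: order earlier reads before the counter store. */
        atomic_thread_fence(memory_order_seq_cst);
    }
    atomic_store_explicit(&r->nesting, n - 1, memory_order_relaxed);
}

/* What an updater would check for each registered reader. */
static bool toy_reader_is_quiescent(struct toy_reader *r)
{
    return atomic_load(&r->nesting) == 0;
}
```

Because no application-level call such as `rcu_quiescent_state()` is needed, this style can be hidden inside a library, at the cost of the two fences per outermost critical section.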
Slide 13: Signal-based RCU
● Same quiescent state detection as Generic RCU
● Suitable for library use, but reserves a signal
● Read-side close to QSBR performance
  – Remove memory barriers from rcu_read_lock()/rcu_read_unlock().
  – Replaced by memory barriers in signal handler, executed at each update-side memory barrier.

Slide 14: call_rcu()
● Eliminates the need to call synchronize_rcu() after each removal
● Queues RCU callbacks for deferred batched execution
● Wait-free unless per-thread queue is full
● "Worker thread" executes callbacks periodically
● Energy-efficient, uses sys_futex()

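The batching behaviour listed above can be modelled with a small bounded queue. The sketch below is an assumption-laden toy, not liburcu-defer: `QUEUE_CAP`, `struct defer_queue`, `toy_defer()`, `toy_flush()` and `count_cb()` are hypothetical, and `toy_wait_for_grace_period()` is a counter standing in for `synchronize_rcu()`. It shows that enqueuing is just an array store, that one grace-period wait covers a whole batch of callbacks, and that the only time the enqueuing thread has to wait is when its queue is full.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 4 /* hypothetical per-thread queue size */

struct deferred {
    void (*fn)(void *);
    void *arg;
};

struct defer_queue {
    struct deferred items[QUEUE_CAP];
    size_t len;
};

/* Stand-in for synchronize_rcu(): only counts how often we would wait. */
static unsigned grace_periods_waited;

static void toy_wait_for_grace_period(void)
{
    grace_periods_waited++;
}

/* Run every queued callback after a single grace-period wait. */
static size_t toy_flush(struct defer_queue *q)
{
    size_t n = q->len;
    if (n == 0)
        return 0;
    toy_wait_for_grace_period(); /* one wait covers the whole batch */
    for (size_t i = 0; i < n; i++)
        q->items[i].fn(q->items[i].arg);
    q->len = 0;
    return n;
}

/* Queue a callback; blocks (via flush) only when the queue is full. */
static void toy_defer(struct defer_queue *q, void (*fn)(void *), void *arg)
{
    if (q->len == QUEUE_CAP)
        toy_flush(q);
    assert(q->len < QUEUE_CAP);
    q->items[q->len].fn = fn;
    q->items[q->len].arg = arg;
    q->len++;
}

/* Example callback: counts how many deferred calls have run. */
static void count_cb(void *arg)
{
    (*(int *)arg)++;
}
```

In the library described on the slide, the periodic flush is done by a worker thread rather than by an explicit call from the application.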
Slide 15: Example: RCU Read-Side

    struct mystruct *rcudata = &somedata;

    /* register thread with rcu_register_thread()/rcu_unregister_thread() */

    void fct(void)
    {
        struct mystruct *ptr;

        rcu_read_lock();
        ptr = rcu_dereference(rcudata);
        /* use ptr */
        rcu_read_unlock();
    }

Slide 16: Example: exchange pointer

    struct mystruct *rcudata = &somedata;

    void replace_data(struct mystruct data)
    {
        struct mystruct *new, *old;

        new = malloc(sizeof(*new));
        memcpy(new, &data, sizeof(*new));
        old = rcu_xchg_pointer(&rcudata, new);
        call_rcu(free, old);
    }

Slide 17: Example: compare-and-exchange pointer

    struct mystruct *rcudata = &somedata;

    /* register thread with rcu_register_thread()/rcu_unregister_thread() */

    void modify_data(int increment_a, int increment_b)
    {
        struct mystruct *new, *old;

        new = malloc(sizeof(*new));
        rcu_read_lock();    /* Ensure pointer is not re-used */
        do {
            old = rcu_dereference(rcudata);
            memcpy(new, old, sizeof(*new));
            new->field_a += increment_a;
            new->field_b += increment_b;
        } while (rcu_cmpxchg_pointer(&rcudata, old, new) != old);
        rcu_read_unlock();
        call_rcu(free, old);
    }

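For readers without the library at hand, the copy, modify, and compare-and-exchange retry loop can be tried out with plain C11 atomics. This is a sketch of the same pattern and not a replacement for the slide's code: `shared`, `initial` and `toy_modify()` are hypothetical names, `atomic_compare_exchange_weak_explicit()` stands in for `rcu_cmpxchg_pointer()`, and there is no read-side critical section or deferred free, so the caller decides what to do with the returned old version.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

struct mystruct {
    int field_a;
    int field_b;
};

/* Hypothetical initial version; statically allocated, so never freed. */
static struct mystruct initial = { 0, 0 };
static _Atomic(struct mystruct *) shared = &initial;

/*
 * Copy the current version, modify the copy, and try to install it.
 * If another updater won the race, the compare-exchange reloads the
 * current pointer into `old` and the copy is rebuilt from it.
 * Returns the version that was replaced.
 */
static struct mystruct *toy_modify(int increment_a, int increment_b)
{
    struct mystruct *copy = malloc(sizeof(*copy));
    assert(copy != NULL); /* sketch only: allocation failure is asserted */

    struct mystruct *old =
        atomic_load_explicit(&shared, memory_order_acquire);
    do {
        memcpy(copy, old, sizeof(*copy));
        copy->field_a += increment_a;
        copy->field_b += increment_b;
    } while (!atomic_compare_exchange_weak_explicit(
                 &shared, &old, copy,
                 memory_order_acq_rel, memory_order_acquire));
    return old;
}
```

The slide's version keeps the loop inside `rcu_read_lock()`/`rcu_read_unlock()` so that the old pointer cannot be reclaimed and reused while it is being compared; the sketch sidesteps that issue by never freeing anything itself.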
Slide 18: Benchmarks
● Read-side Scalability
● Read-side C.S. length impact
● Update Overhead

Slide 19: Read-Side Scalability (figure; 64-core POWER5+)

Slide 20: Read-Side C.S. Length Impact (figure; 64-core POWER5+, logarithmic scale on x and y)

Slide 21: Update Overhead (figure; 64-core POWER5+, logarithmic scale on x and y)

Slide 22: RCU-Friendly Applications
● Multithreaded applications with read-often shared data
  – Cache
    ● Name servers
    ● Proxy
    ● Web servers with static pages
  – Configuration
    ● Low synchronization overhead
    ● Dynamically modified without restart

Slide 23: RCU-Friendly Applications
● Libraries supporting multithreaded applications
  – Tracing library, e.g. lib UST (LTTng port for userspace tracing)
    ● http://git.dorsal.polymtl.ca/?p=ust.git

Slide 24: RCU-Friendly Applications
● Libraries supporting multithreaded applications (cont.)
  – Typing/data structure support
    ● Typing system
      – Creation of a class is a rare event
      – Reading class structure happens at object creation/destruction (_very_ often)
      – Applies to gobject
        ● Used by: gtk/gdk/glib/gstreamer...
    ● Efficient hash tables
    ● Glib "quarks"

Slide 25: RCU-Friendly Applications
● Routing tables in userspace
● Userspace network stacks
● Userspace signal-handling
  – Signal-safe read-side
  – Could implement an inter-thread signal multiplexer
● Your own?

Slide 26: Info / Download / Contact
● Mathieu Desnoyers
  – Computer and Software Engineering Dpt., École Polytechnique de Montréal
● Web site:
  – http://www.lttng.org/urcu
● Git tree
  – git://lttng.org/userspace-rcu.git
● Email
  – mathieu.desnoyers@polymtl.ca

   