Slide 1: Efficient Prediction of Protein-Protein Interactions using Markov Networks
Boyko Kakaradov, Matt Schefer, Haidong Wang, Daphne Koller
Slide 2: Protein Interaction Network
Slide 3: Multiple sources of data
• Protein-Protein interaction
• Protein localization
• Transcription regulation
• Phosphorylation
• Expression
• mRNA degradation
Slide 4: Main Goal & Approach
• Extract meaningful information about Protein Interaction Networks from large-scale assays that are:
  • Heterogeneous
  • Noisy
  • Incomplete
• Model by Markov Network and learn it efficiently
Slide 5: The Models
Models: Full, Regulation, Triplet, Basic + NB
[Figure: model diagrams with node labels LAinuc, Linuc, Licyt, LAicyt, LAjnuc, Ljnuc, Ljcyt, LAjcyt, IAi,j, Ii,j, Ii,k, Ij,k]
Slide 6: Previous Work
• Jaimovich et al. (2005) propose four Markov models
  • PPI and Localization assays
  • Inference engine is Loopy Belief Propagation (LBP)
  • Limited in speed and scalability due to LBP
• We use an efficient graph-cut MAP inference engine
  • Learning uses Expected Counts from MAP, not the posterior from LBP
  • Regularity constraints on clique potentials
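The slides do not spell out the regularity constraint. A minimal sketch, assuming it is the standard condition under which graph cuts recover the exact MAP for pairwise binary energies: E(0,0) + E(1,1) <= E(0,1) + E(1,0). The function name and the example tables are hypothetical, and the tables hold energies (negative log-potentials), not the authors' learned parameters.

```python
from typing import Sequence


def is_regular(energy: Sequence[Sequence[float]]) -> bool:
    """Return True if a 2x2 pairwise energy table is regular (submodular).

    energy[a][b] is the energy of assigning x_i = a and x_j = b.
    Assumption: "regularity" means the standard graph-cut condition
    E(0,0) + E(1,1) <= E(0,1) + E(1,0).
    """
    return energy[0][0] + energy[1][1] <= energy[0][1] + energy[1][0]


# Hypothetical tables: agreement is cheap in the first, costly in the second.
attractive = [[0.0, 1.0], [1.0, 0.0]]
repulsive = [[1.0, 0.0], [0.0, 1.0]]

print(is_regular(attractive))  # True
print(is_regular(repulsive))   # False
```

Under this reading, a potential that rewards two neighbouring nodes for taking the same value passes the check, while one that rewards disagreement does not and would have to be constrained or approximated.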
Slide 7: 4-fold cross-validation on 2000 interaction nodes and 2000 non-interaction nodes
Naive
2104 of 2129 triplets are 1
• Many interactions derived from complexes easily form triplets
• Non-interactions picked from random pairs are unlikely to form triplets
• Naive method predicts triplet pairs as interactions
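The naive baseline is not defined in detail on the slide. One possible reading, sketched here as an assumption, is that a candidate pair is predicted to interact whenever it would close a triplet, that is, whenever both proteins already interact with some common third protein. The function name and the toy data are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Set, Tuple


def naive_triplet_predictions(
    known: Iterable[Tuple[str, str]], proteins: Iterable[str]
) -> Set[Tuple[str, str]]:
    """Predict unseen pairs that share at least one known interaction partner."""
    known_pairs = {frozenset(pair) for pair in known}
    neighbors = {p: set() for p in proteins}
    for a, b in known:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    predicted = set()
    for a, b in combinations(sorted(neighbors), 2):
        if frozenset((a, b)) in known_pairs:
            continue  # already a known interaction, nothing to predict
        if neighbors[a] & neighbors[b]:
            predicted.add((a, b))  # a common partner closes a triplet
    return predicted


# Toy chain A-B-C-D: only A-C and B-D share a common partner.
toy_known = [("A", "B"), ("B", "C"), ("C", "D")]
toy_predictions = naive_triplet_predictions(toy_known, ["A", "B", "C", "D"])
print(sorted(toy_predictions))  # [('A', 'C'), ('B', 'D')]
```

On a data set where almost every triplet is fully connected, a rule like this scores well without modelling anything, which is presumably why the slide flags the sampling as biased.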
Slide 8: Accuracy & Performance
[Figure: cross-validation results on interaction nodes]
Slide 9: Unbiased Data Sampling
• Complete network on DNA Repair genes
  • 47 proteins, 1081 interaction pairs
  • 12 of 16,215 triplets are 111
  • Too many triplets. Too few interactions
• Oversampling: 1000 top I=1, 10K random I=0
  • 543 proteins, 759 of 5,370 triplets are 111
• Triplet Completion: 2000 top I=1, 2000 random I=0
  • 4630 I=C form triplets with half of I=0
  • 867 proteins, 2142 of 8181 triplets are 111
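The counts for the complete DNA Repair network follow from simple combinatorics: 47 proteins give C(47, 2) candidate pairs and C(47, 3) candidate triplets. The sketch below checks those two numbers and compares the share of fully connected ("111") triplets across the three sampling schemes. The variable names are illustrative; the figures are taken from the slide.

```python
from math import comb

n_proteins = 47
pairs = comb(n_proteins, 2)     # candidate interaction pairs
triplets = comb(n_proteins, 3)  # candidate triplets
print(pairs, triplets)  # 1081 16215

# (fully connected triplets, total triplets) as reported on the slide
schemes = {
    "complete": (12, 16215),
    "oversampling": (759, 5370),
    "triplet completion": (2142, 8181),
}
rates = {name: hits / total for name, (hits, total) in schemes.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.4f}")
```

The share of 111 triplets is far below 1% on the complete network and rises to well over 10% under both resampling schemes.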
Slide 10: [Figure: axes labeled Specificity and Sensitivity]
Slide 11: Learning Models
• Generative: maximize P(I,L,IA,...)
• Discriminative
  • maximize P( I | L )
  • maximize P( I | L,IA ) <- undirected edge
• Standard Expectation Maximization (EM)
  • Full, Regulation2
Slide 12: ...
Slide 13: Full Regulation Model
• Undirected model with a hidden Rt,k-node and a fixed node potential (func. of p-value)
  [Figure: nodes Ii,k, Rt,i, Rt,k]
  w0 = 1 - w1
  w1 = (20ep)/(1-20ep)
• Directed parametric model with observed Rt,k-node, RAt,k evidence depends on p-value
  [Figure: nodes Ii,k, Rt,i, Rt,k, RAt,k]
  w00=f(c00)
  w01=f(c01)
  w10=f(c10)
  w11=f(c11)
Slide 14: ...
Slide 15: Future Work: model
• More data sources
  • Transcription regulation
  • Phosphorylation
  • Expression
  • Protein domains
• Context-specific interaction
  • In which cell cycle does interaction take place
• Identify protein complex (e.g. Hemoglobin) from
Slide 16: Future Work: algorithms
• Relax the regularity constraints to approximate MAP
• Discriminative training does not require regularity on conditioned nodes
• Truncation (Rother, 2005): extend to 3-variable terms
• QPBO method (Kolmogorov & Rother, 2006)
• Learn the parametric model of regulation p-value given the hidden regulation node