CS791: 11/16 Lecture Notes
2 papers today: (3 if we're lucky)
- SRM (S. Floyd, V. Jacobson, C. Liu, S. McCanne, L. Zhang)
- PPV (C. Papadopoulos, G. Parulkar and G. Varghese)
- Digital Fountain (next time)
Slide 1
Difficulties in changing from a unicast to a multicast paradigm for data transport
- ACK implosion : 1-to-1, there is data/ack balance
1-to-many, "acks >> data" and are hard to merge
- General scalability issues
10-20 is easy. 100s of 1000s of clients is hard.
- Heterogeneous clients: congestion control:
receivers w/ different connectivity want good performance
layering - a general/common approach
packet loss handling/reaction
Slide 2
Defining Reliability for the purpose of Multicast:
- "Reliable Transfers"
- Goal: all rcvrs eventually receive all data
No notion of "sequenced delivery" : only "total delivery"
(abstraction = complete content // rather than // ongoing stream)
- Timeliness not a major consideration (except PPV, INFOCOM '98)
- Multiple senders makes things more interesting
"Very Much Harder problem" (whiteboard - "wb")
- One sender (DF model - data dissemination)
Slide 3
SRM (Scalable Reliable Multicast): Key ideas
- "1 sz fits all" doesn't work for more complicated applications
- Application level framing (ALF): the app knows the right defn of reliability
- Notion: SRM as "core" functions that ALL reliable multicast
  protos will need; the app can then build additional functionality
  on top of it, e.g.:
  - cong control
  - sequencing, strict ordering
- assumes best-effort IP MC
- Fate-sharing in unicast implies that either the
sender or receiver can do loss recovery
- TCP :
- sender-driven (sender retransmits until ACK)
- fate sharing: if either end dies, the connection is gone, so
sender and rcvr-driven recovery are equivalent
- Notion of WEAK fate-sharing:
It's easy for all rcvrs to know about the server,
harder for server to know about all rcvrs
If a rcvr goes down, other rcvrs (even sender)
generally don't care, therefore:
Receiver-driven reliability (NAK) repeatedly
until you get the data you want
- while this was being debated,
lots of sender v. receiver papers claimed receiver preferable
Slide 4
Objective: "minimal" definition of reliable multicast
- Use IP multicast as baseline/network paradigm
- Rcvr-driven reliability
- Target Application: "shared whiteboard"
- "moderate" number of participants
- all participants are (potentially) senders
- Any sender can send a "drawop" - an idempotent operation
- idempotency: ordering is not important, duplication is not an issue
  (bank deposits are NOT idempotent! drawops are!
  see the sketch after this list)
  reliable mc with duplicate suppression - HARD!
  (not touched in these papers)
- Receiver set may be unmaintainable -
  no one knows the complete set
- User/event Naming, esp of pieces of state, is an
interesting and difficult problem (not discussed today)
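A minimal sketch (Python) of the idempotency point above; the
whiteboard structure and names are illustrative, not from the paper.
State is keyed by each drawop's unique name, so duplicates and
reordering are harmless by construction:

    # Toy whiteboard: state is a map keyed by each drawop's name,
    # so re-applying a drawop is a no-op and order doesn't matter.
    whiteboard = {}

    def apply_drawop(name, data):
        whiteboard[name] = data

    apply_drawop(("host-a", 7), "line (0,0)-(10,10)")
    apply_drawop(("host-a", 7), "line (0,0)-(10,10)")  # duplicate: harmless
    assert whiteboard == {("host-a", 7): "line (0,0)-(10,10)"}

    # Contrast: a bank deposit ("balance += 100") applied twice gives a
    # different result - deposits are NOT idempotent.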
Slide 5
Weak, "eventual" reliability
- no attempt at globally consistent whiteboard
- display whatever is currently available
- fixup problems on the fly (naming conventions help here)
Feedback control (interesting part for us) is NACK-based
- but not "all-the-way-to-sender" : that's unscalable
(NACK implosion if pkt is lost close to source)
- Receivers cooperate : other rcvrs can retransmit
what we may have lost.
- NACK to "neighbors", one or more (hopefully)
rcvs and re-xmits over MC group
going across MC group - potential waste,
but it's safe because of idempotency
- Receiver collaboration
(and one can act as a retransmission source)
- Idempotency
(duplicates are not a problem - defined away by the app)
- Scope + time reqs/resps appropriately
  (one of the nice features of the SRM paper; see sketch below)
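A small sketch of TTL-based scoping, using standard UDP multicast
socket options (the group address, port, and payload here are
placeholders, not from the paper):

    import socket

    # IP_MULTICAST_TTL limits how many hops the packet travels, so a
    # NACK/repair stays within a small neighborhood of the sender.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 3)
    sock.sendto(b"NACK <drawop name>", ("224.2.2.2", 9999))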
Slide 6
Algorithm for loss recovery (essential aspects of paper):
- Sender S, hosts A, B
- 1st: A figures out he's missing something (remember - no sequencing between sources)
i.e. a gap in a sequence from a particular sender
- 2nd: Randomization ; before recovery attempt,
wait for random timer (avoids implosion)
- moreover - nacks scoped locally,
so don't necessarily reach the source
best algo not yet determined
heuristics: set low TTLs and experiment.
- Host B can service a request
- Sets a random timer - prevents response implosion
- could unicast repairs - bad:
  no one else knows you serviced the request - can't cancel their timers
  - doesn't help with clusters of loss
- multicast repairs (again, w/ limited TTL)
- "local recovery, setting random timers"
- Randomization suppresses both duplicate requests and duplicate responses
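A rough sketch (Python) of the randomized request/repair timers.
The window constants follow the paper's C1/C2 (requests) and D1/D2
(repairs) notation; the pending-timer bookkeeping is our own
illustration:

    import random

    C1, C2 = 2.0, 2.0   # request window, in units of one-way delay
    D1, D2 = 1.0, 1.0   # repair window

    pending = {}   # data name -> time at which we will send

    def schedule_request(name, dist_to_source, now):
        # Wait a random time proportional to our distance from the
        # source before multicasting the NACK; nearer hosts tend to
        # fire first, and their request suppresses everyone else's.
        pending[name] = now + random.uniform(
            C1 * dist_to_source, (C1 + C2) * dist_to_source)

    def schedule_repair(name, dist_to_requester, now):
        # Same idea on the reply side before re-xmitting the data.
        pending[name] = now + random.uniform(
            D1 * dist_to_requester, (D1 + D2) * dist_to_requester)

    def on_overheard(name):
        # Suppression: someone else multicast the same request or
        # repair first, so cancel our own pending timer.
        pending.pop(name, None)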
Question:
"can we distinguish sender S from re-sender B?"
Answer:
yes - IP sender address is different. not an issue for SRM, may be for others
Slide 7
Simulation / measurement : loss analysis
- Simple topologies:
- req vs. repairs vs. repair latency (fig 3-6)
- repair latency is O(5rtt) - penalty of timers
- "small price to pay for traffic savings"
- Feeling of mbone/community: it's hard to beat the SRM approach
Question:
Gabe: "sltn to repair latency - what if we pair receivers?"
Answer:
Problems:
- how do you set up the pairs? How do you make them loss-disjoint?
This makes them distant - higher latencies
- what if your pair quits the session - keepalive, ping, etc
pair-changing, etc
- what if you both lose anyway
- How realistic is expectation of people cooperating?
SRM isn't dependent on any individual -
just enlarge scope until you find someone who can help you
- no knowledge about who is there
Question:
Khaled: "what keeps C and T from re-transmitting after B?"
Answer:
B's re-xmission also reaches C and T
(does it? scoped? what size is the scope?) and they suppress.
What about out-of-order? Hmm - this makes it trickier.
This paper: "really nice ideas, seems to work well in practice,
but all the issues haven't been solved"
Slide 8
Scalability issues:
- session messages / representatives (in local domains:
  primarily responsible for loss handling in local subtree)
- setting timers - can of worms -
  need to know distances between hosts (rtt estimates?)
  to set stuff correctly, we need pairwise distances -
  unclear how this can be known (see the sketch after this list)
- local recovery - still open - many papers address it
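One way the pairwise distances can be estimated is from SRM-style
session-message timestamps; a minimal sketch, assuming roughly
symmetric paths (the field names are invented):

    def estimate_one_way_delay(t_sent, hold_time, t_recv):
        # B sent a session message at t_sent (B's clock); A echoed it
        # back, reporting how long it held the message (hold_time);
        # B receives the echo at t_recv.  Round trip minus A's hold
        # time, halved, approximates the one-way distance B<->A.
        # Only per-host clock *differences* are needed, not
        # synchronized clocks.
        rtt = (t_recv - t_sent) - hold_time
        return rtt / 2.0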
Slide 9
Local Recovery (next paper will use this extensively)
- goal: recovery traffic should coincide with the loss neighborhood
  (recovery is focused in "bad" areas)
- imperfect tactics
- what about multiple uncorrelated loss regions?
- scaling problems (RLM)
Question:
Khaled: "are we using app-level framing?"
Answer:
yes - assumption about drawops and idempotency -
drawops fit into individual packets (generally)
problem: what if an op spans multiple packets?
have to send/request multiple packets.
in other words: application-level objects are "named"
so they can be requested.
logically = "presentation layer" (if you believe in such a thing)
point/issue : naming : sequence numbers may not be all that useful,
whereas ADU (application data unit) may convey
the needed (by application) information
assumption: non-conflicting namespaces
(globally-unique host IP : host-wide-unique seq)
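A toy sketch of that namespace and the per-source gap detection it
enables (the helper names are ours, not SRM's):

    # An ADU name = (source host IP, per-host sequence number) -
    # unique across the session with no coordination between senders.
    def missing_adus(received, highest_seq, source_ip):
        # Per-source gap detection: any sequence number up to the
        # highest seen from this source that we lack is requestable.
        return [(source_ip, s) for s in range(highest_seq + 1)
                if (source_ip, s) not in received]

    received = {("10.0.0.1", 0), ("10.0.0.1", 2)}
    assert missing_adus(received, 2, "10.0.0.1") == [("10.0.0.1", 1)]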
Slide 10
PPV: A Response to SRM.
Problems w/ SRM:
- Recovery latencies are too large (10-20 RTTs? are you kidding?)
- Requestor/replier mapping may be very random
- Excessive exposure to redundant replies
if they are poorly scoped (or bad overlap cases)
PPV aims to do a better job of addressing these 3 problems
Slide 11
Error Correction: Goals
- Avoid req implosion (SRM: good)
- Minimize dup replies (SRM: good)
- Minimize recovery latency (SRM: bad)
- Maximize recovery isolation (SRM: weak)
- Adapt to dynamic membership (SRM: good)
- Minimize overall recovery traffic
Slide 12
Error Correction: Algorithms (use picture from paper, e.g. fig 4)
- Small subset of interested nodes become "repliers"
- (repliership not mandatory -
"announce candidacy" to upstream router
who selects among its child candidates)
- Crucial idea: every subtree has a replier.
- How are they chosen? not discussed.
  Probably very important to choose carefully.
- Once every subtree has a replier,
we have a careful loss-handling scenario:
When a whole subtree has lost:
- client NACKS, goes to subtree's replier
replier doesn't have it either,
NACKs its parent node
(propagate upward, toward server) until we find a replier
who GOT IT.
- UNICAST a request UPSTREAM.
- implicit suppression: upstream links see only ONE
packet even if whole tree saw loss,
since only one request "ascends" the tree
- Hard part: getting replies out to all who are interested
- Crucial router in this scheme:
point at which upstream "turns around" to a downstream =
"turning point"
- on the way back: want to MC, but only to the target subtree
- Any node that has been a turning pt for a repair request,
  when it sees the (unicast) repair response, it transforms it
  into a multicast response which it sends ONLY down to subtrees
  that asked for repairs
- "sub-cast" : generalization of multicast
- defn: "Multicast transmission (limited)
to a particular sub-tree"
- alternative implementation: turning point notices
repair multicast and limits it to a sub-set of
MC-subscribed links
Why this approach is difficult: it requires routers to be smarter/do more.
Simulations in this paper =
"good start at analyzing this" = powerful mechanism for local recovery
Clarification: "turning point" =
router "above" a replier in distribution tree
Question:
"Can a 'turning point' also be a replier?"
Answer:
yes, but that would SEVERELY/MAJORLY change the router's role
(as opposed to just a BIG change in the router's role)
Only a replier's NACK gets forwarded upstream -
so router has to know who its immediate replier is
- assume replier isn't dead
- assume repliers are reliable and trustworthy
- it's a problem -
JB: "don't know what the solutio is, don't know if there is one"
- also important: handle failure/death of repliers (keepalive, etc)
THIS IS A GENERAL PROBLEM with hierarchical scheme.
intact tree of trusted representatives is the baseline assumption
- Lots of engineering involved -
lots of work (potential for lots of bugs?)
- Nice properties:
- completely deterministic
- make repair requests immediately, respond to them immediately
- issues: implementation, electing repliers (ill-defined)
- Another interesting project:
  intelligent placement of repliers (open/unsolved problem)
Slide XXX
Pros/Cons:
- Pro: Achieves objectives
- Pro: reliable, sequenced delivery
- Pro: scales "well" (transmissions are countable and understandable)
- Pro: recovery traffic is low and localized
- Con: engineering complexity of implementation
- Con: election procedure unclear, difficult problem
(esp if most clients don't want to be repliers)
- Con: Routers must implement and activate subcast functionality
Slide XXX
That's it - next week:
- Quiz thursday (45 mins)
- Guest lecture (45 mins) - Mystery Guest!