emulab.net:
An Emulation Testbed for Networks and Distributed Systems
  Jay Lepreau and many others
  University of Utah
  Intel IXA University Workshop
  June 21, 2001

The Main Players
  Undergrads: Chris Alfeld, Chad Barb, Rob Ricci
  Grads: Dave Andersen, Shashi Guruprasad, Abhijeet Joglekar, Indrajeet Kumar, Mac Newbold
  Staff: Mike Hibler, Leigh Stoller
  Alumni: Various
  (Red: here at Intel today)

What?
  A configurable Internet emulator in a room
  Today: 200 nodes, 500 wires, 2x BFS (switch)
  Virtualizable topology, links, software
  Bare hardware with lots of tools
  An instrument for experimental CS research
  Universally available to any remote experimenter
  Simple to use

What’s a Node?
  Physical hardware: PCs, StrongARMs
  Virtual node:
    Router (network emulation)
    Host, middlebox (distributed system)
  Future physical hardware: IXP1200 +

Why?
  “We evaluated our system on five nodes.”
    - job talk from university with 300-node cluster
  “We evaluated our Web proxy design with 10 clients on 100Mbit ethernet.”
  “Simulation results indicate ...”
  “Memory and CPU demands on the individual nodes were not measured, but we believe will be modest.”
  “The authors ignore interrupt handling overhead in their evaluation, which likely dominates all other costs.”
  “Resource control remains an open problem.”

Why 2
  “You have to know the right people to get access to the cluster.”
  “The cluster is hard to use.”
  “<Experimental network X> runs FreeBSD 2.2.x.”
  “October’s schedule for <experimental network Y> is…”
  “<Experimental network Z> is tunneled through the Internet”

Complementary to Other Experimental Environments
  Simulation
  Small static testbeds
  Live networks
  Maybe someday, a large-scale set of distributed small testbeds (“Access”)

Zoom In: One Node
Fundamental Leverage:
  Extremely Configurable
  Easy to Use

Key Design Aspects
  Allow experimenter complete control
  … but provide fast tools for common cases
    OS’s, disk loading, state mgmt tools, IP, traffic generation, batch, ...
  Virtualization
    of all experimenter-visible resources
    node names, network interface names, network addresses
    Allows swapin/swapout

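The virtualization point above can be illustrated with a minimal sketch. It assumes that an experiment only ever refers to virtual node names and that a binding table maps them to whatever physical nodes are free at swap-in time. The class, method, and node names (`Experiment`, `swap_in`, `pc1`, ...) are hypothetical and are not emulab's actual interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    """Virtual names an experimenter sees; all names here are hypothetical."""
    name: str
    virtual_nodes: list
    bindings: dict = field(default_factory=dict)  # virtual name -> physical node

    def swap_in(self, free_pool):
        """Bind each virtual node to any free physical node."""
        if len(free_pool) < len(self.virtual_nodes):
            raise RuntimeError("not enough free physical nodes")
        for vnode in self.virtual_nodes:
            self.bindings[vnode] = free_pool.pop()

    def swap_out(self, free_pool):
        """Release the physical nodes; the virtual names survive."""
        free_pool.extend(self.bindings.values())
        self.bindings.clear()


free_pool = ["pc1", "pc2", "pc3", "pc4"]
exp = Experiment("demo", ["router0", "client0"])
exp.swap_in(free_pool)
first = dict(exp.bindings)
exp.swap_out(free_pool)
free_pool.reverse()  # the pool changed while the experiment was swapped out
exp.swap_in(free_pool)
second = dict(exp.bindings)
```

In this sketch the experiment keeps the same virtual names across the two swap-ins even though the physical nodes behind them differ, which is the property that makes swapout and later swapin possible.
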
Design Aspects (cont’d)
  Flexible, extensible, powerful allocation algorithm
  Persistent state maintenance:
    none on nodes
    all in database
    leverage node boot time: only known state!
  Separate control network
  Familiar, powerful, extensible configuration language: ns

Some Unique Characteristics
  User-configurable control of “physical” characteristics: shaping of link latency/bandwidth/drops/errors (via invisibly interposed “shaping nodes”), router processing power, buffer space, …
  Node breakdown today:
    40 core, 160 edge

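To make the link-shaping idea above concrete, here is a toy model of what a shaping node does to traffic. It is only a sketch under simple assumptions: a single FIFO link, a fixed one-way latency, and independent random drops. The function name `shape` and its parameters are made up for illustration and do not describe emulab's shaping nodes.

```python
import random


def shape(packets, bandwidth_bps, latency_s, loss_rate=0.0, seed=0):
    """Return delivery times for (arrival_time_s, size_bytes) packets.

    Packets are serialized one at a time at bandwidth_bps, then delayed by
    latency_s. A dropped packet yields None and uses no link time.
    """
    rng = random.Random(seed)
    link_free_at = 0.0
    deliveries = []
    for arrival, size in packets:
        if rng.random() < loss_rate:
            deliveries.append(None)
            continue
        # A packet waits until the link has finished sending the previous one.
        start = max(arrival, link_free_at)
        link_free_at = start + size * 8 / bandwidth_bps
        deliveries.append(link_free_at + latency_s)
    return deliveries


# Two 1250-byte packets arriving together on a 1 Mbit/s link with 50 ms latency.
times = shape([(0.0, 1250), (0.0, 1250)], bandwidth_bps=1_000_000, latency_s=0.05)
```

Each packet needs 10 ms of link time at this bandwidth, so the second packet is delivered one serialization time after the first. Changing `bandwidth_bps`, `latency_s`, or `loss_rate` is the toy analogue of the user-configurable characteristics listed on the slide.
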
More Unique Characteristics
  Capture of low-level node behavior such as interrupt load and memory bandwidth
  User-replaceable node OS software
  User-configurable physical link topology
  Completely configurable and usable by external researchers, including node power cycling

Obligatory Pictures
Then
Now
A Few Research Issues and Challenges
  Network management of unknown entities
  Security
  Scheduling of experiments
  Calibration, validation, and scaling
  Artifact detection and control
  NP-hard virtual --> physical mapping problem
  Providing a reasonable user interface
  …

An “Experiment”
  emulab’s central operational entity
  Directly generated by an ns script,
  … then represented entirely by database state
  Steps: Web, compile ns script, map, allocate, provide access, assign IP addrs, host names, configure VLANs, load disks, reboot, configure OS’s, run, report

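The idea that an experiment is represented entirely by database state can be sketched as follows. The step names come from the slide; everything else is an assumption made for illustration: the table layout, the `next_step` column, and the `create` and `advance` helpers are hypothetical, and an in-memory SQLite database stands in for whatever database emulab actually uses.

```python
import sqlite3

# Step names taken from the slide, in order.
STEPS = [
    "Web", "compile ns script", "map", "allocate", "provide access",
    "assign IP addrs, host names", "configure VLANs", "load disks",
    "reboot", "configure OSs", "run", "report",
]


def create(db, name):
    """Record a new experiment that has not run any step yet."""
    db.execute("INSERT INTO experiments (name, next_step) VALUES (?, 0)", (name,))


def advance(db, name):
    """Run the next step for an experiment; return its name, or None when done."""
    row = db.execute(
        "SELECT next_step FROM experiments WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    if row[0] >= len(STEPS):
        return None
    db.execute(
        "UPDATE experiments SET next_step = ? WHERE name = ?", (row[0] + 1, name)
    )
    return STEPS[row[0]]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE experiments (name TEXT PRIMARY KEY, next_step INTEGER)")
create(db, "demo")
ran = []
while (step := advance(db, "demo")) is not None:
    ran.append(step)
```

Because the only record of progress is the row in the table, any process that can reach the database can tell where an experiment stands, and nothing has to be remembered on the nodes themselves.
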
Mapping Example
Automatic mapping of desired topologies and characteristics to physical resources
  Algorithm goals:
    minimize likelihood of experimental artifacts (bottlenecks)
    “optimal” packing of multiple simultaneous experiments
    Extensible for heterogeneous hardware, software, new features
  Randomized heuristic algorithm: simulated annealing
  May move to genetic algorithm

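A minimal sketch of simulated annealing applied to a mapping problem of this kind follows. It is not emulab's mapper: the cost function is a deliberate simplification that only counts virtual links whose endpoints land on different switches (a stand-in for the bottleneck goal), and the cooling schedule, parameter values, and all names are assumptions chosen for illustration.

```python
import math
import random


def cost(mapping, vlinks, switch_of):
    """Count virtual links whose endpoints sit on different switches."""
    return sum(switch_of[mapping[a]] != switch_of[mapping[b]] for a, b in vlinks)


def anneal(vnodes, vlinks, switch_of, steps=5000, t0=2.0, seed=1):
    """Place each virtual node on its own physical node; return (mapping, cost)."""
    rng = random.Random(seed)
    pnodes = list(switch_of)
    # slots[i] is the virtual node on pnodes[i], or None if that node is idle.
    slots = list(vnodes) + [None] * (len(pnodes) - len(vnodes))
    rng.shuffle(slots)
    mapping = {v: p for p, v in zip(pnodes, slots) if v is not None}
    current = cost(mapping, vlinks, switch_of)
    best, best_cost = dict(mapping), current
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # linear cooling
        i, j = rng.sample(range(len(pnodes)), 2)
        slots[i], slots[j] = slots[j], slots[i]
        trial = {v: p for p, v in zip(pnodes, slots) if v is not None}
        new = cost(trial, vlinks, switch_of)
        # Always accept improvements; accept worse moves with a probability
        # that shrinks as the temperature drops.
        if new <= current or rng.random() < math.exp((current - new) / temp):
            mapping, current = trial, new
            if current < best_cost:
                best, best_cost = dict(mapping), current
        else:
            slots[i], slots[j] = slots[j], slots[i]  # undo the rejected swap
    return best, best_cost


# Two virtual triangles joined by one link; two switches with four PCs each.
vnodes = ["a0", "a1", "a2", "b0", "b1", "b2"]
vlinks = [("a0", "a1"), ("a1", "a2"), ("a0", "a2"),
          ("b0", "b1"), ("b1", "b2"), ("b0", "b2"), ("a0", "b0")]
switch_of = {f"pc{i}": i // 4 for i in range(8)}
best, best_cost = anneal(vnodes, vlinks, switch_of)
```

A real mapper would also have to score packing across simultaneous experiments and node features, as the goals above state; those terms would be added to `cost` rather than changing the search loop.
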
Virtual Topology
Mapping into Physical Topology
Mapping Results
  < 1 second for first solution, 40 nodes
  “Good” solution within 5 seconds
  Apparently insensitive to number of node “features”

Disk Loading
  13 GB generic IDE 7200 rpm drives
  Was 20 minutes for 6 GB image
  Now 88 seconds

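The before and after figures above can be turned into effective rates with a few lines of arithmetic. The sketch assumes decimal gigabytes (the slide does not say which unit is meant), and "effective" here means image size covered per second, which is not necessarily the number of bytes physically written to disk.

```python
IMAGE_BYTES = 6 * 10**9  # assumption: decimal gigabytes


def effective_rate_mb_s(seconds):
    """Image bytes covered per second, in MB/s."""
    return IMAGE_BYTES / seconds / 1e6


before = effective_rate_mb_s(20 * 60)  # the original 20-minute load
after = effective_rate_mb_s(88)        # the 88-second load
speedup = (20 * 60) / 88
```

Under these assumptions the load goes from 5 MB/s to roughly 68 MB/s, a speedup of about 13.6x.
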
How?
  Do obvious compression
  … and a little less obvious: zero the fs
  Disk writes become the bottleneck
  Hack the disk driver
  Carefully overlap I/O and decompression
  ==> 6 minutes

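Why zeroing the filesystem's free space helps compression can be shown with a toy image. This sketch uses Python's standard `zlib` module and a made-up 64-block image. The block size, the split between live and free blocks, and the use of pseudo-random bytes to stand in for stale data are all assumptions, not measurements from emulab.

```python
import random
import zlib

BLOCK = 4096
rng = random.Random(0)


def noise(n):
    """Pseudo-random bytes standing in for data that compresses poorly."""
    return bytes(rng.getrandbits(8) for _ in range(n))


allocated = noise(16 * BLOCK)    # live file data
stale_free = noise(48 * BLOCK)   # leftovers from deleted files in free blocks
zeroed_free = bytes(48 * BLOCK)  # the same free blocks after zeroing

raw = len(zlib.compress(allocated + stale_free))
zeroed = len(zlib.compress(allocated + zeroed_free))
```

In this toy case the compressor has to carry the stale free-space bytes almost verbatim in the first image, while in the second image the zeroed region shrinks to almost nothing. The compressed image is then much smaller, which fits the slide's remark that disk writes become the bottleneck.
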
Last Step
  Domain-specific compression
  Type the filesystem blocks:
    allocated
    free
  Never write the free ones

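The last step can be sketched as an image format that stores only runs of allocated blocks and skips free blocks when loading. This is an illustration under stated assumptions: the allocation bitmap is simply given here, whereas a real tool would have to derive it from the filesystem's own metadata, and the function names and chunk layout are hypothetical.

```python
BLOCK = 4096


def allocated_runs(bitmap):
    """Yield (first_block, block_count) for each run of allocated blocks."""
    start = None
    for i, used in enumerate(bitmap):
        if used and start is None:
            start = i
        elif not used and start is not None:
            yield start, i - start
            start = None
    if start is not None:
        yield start, len(bitmap) - start


def save(disk, bitmap):
    """Keep only allocated runs as a list of (byte_offset, data) chunks."""
    return [(s * BLOCK, disk[s * BLOCK:(s + n) * BLOCK])
            for s, n in allocated_runs(bitmap)]


def load(chunks, target):
    """Write each chunk at its offset; free regions are never touched."""
    written = 0
    for offset, data in chunks:
        target[offset:offset + len(data)] = data
        written += len(data)
    return written


bitmap = [1, 1, 0, 0, 0, 1, 0, 0]  # hypothetical 8-block disk
disk = bytes((i // BLOCK) + 1 for i in range(8 * BLOCK))
chunks = save(disk, bitmap)
target = bytearray(b"\xff" * (8 * BLOCK))  # old contents of the destination
written = load(chunks, target)
```

Only three of the eight blocks are written in this example. The free regions of the target keep whatever they held before, which is harmless because the filesystem does not consider them part of any file.
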
Experiment Creation Time
Experiment Termination Time
Ongoing and Future Work
  Multicast disk images
  IXP1200 nodes, tools, code fragments
  Routers, high-capacity shapers
  Event system
  Scheduling system
  Topology generation tools and GUI
  Simulation/emulation transparency
  Linked testbeds
  Wireless nodes, mobile nodes
  Logging, visualization tools
  Microsoft OSs, high speed links, more nodes!

Final Remarks
  18 projects have used it (14 external)
    Plus several class projects
  Two OSDI’00 and three SOSP’01 papers
    20% SOSP general acceptance rate
    60% SOSP acceptance rate for emulab users!
  More emulabs under construction:
    Yes: Univ. of Kentucky, Stuttgart
    Maybe: WUSTL, Duke
  Sponsors (red: major ones, current or expected)
    NSF, DARPA, University of Utah
    Cisco, Intel, Compaq, Microsoft, Novell, Nortel

Available for universities, labs, and companies at: www.emulab.net