Quick start

The time-traveling infrastructure records and replays the execution of Xen virtual machines with instruction-level precision. The current implementation supports recording and replay of single-CPU, paravirtual Linux guests.

For more details, read Implementation Details.

Download

The time-traveling code is a set of extensions to the Xen VMM. It is available as a Mercurial repository:

You can get the repository with the following command:

Install

The time-traveling code extends the source tree of the Xen VMM platform. You can follow the regular Xen build procedure, which is described in the README file. In short, run the following commands in the Xen source tree:

make world
sudo make install

If you are building our code for the first time, we recommend that you read Detailed Build Instructions.

A couple of critical notes:

  • Configure the dom0 kernel with debugfs support (CONFIG_DEBUG_FS) and mount the debug file system at boot by adding this line to /etc/rc.local:
mount -t debugfs debugfs /sys/kernel/debug/
  • NAT-based networking requires iptables and the related modules in dom0. LVM relies on "Multiple devices driver support (RAID and LVM)" (CONFIG_MD option).
  • Shadow paging freezes domU on boot, so keep the domU memory small; 96 MB is fine.
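
The kernel and debugfs requirements above can be checked before the first run. This is a minimal sketch, not part of the time-traveling tools: the helper names kernel_has_option and debugfs_is_mounted are hypothetical, and it assumes the dom0 kernel configuration is available as a plain-text file (often /boot/config-$(uname -r), which may not exist on your system).

```shell
#!/bin/sh
# Hypothetical helpers for checking the dom0 prerequisites listed above.

# kernel_has_option CONFIG_FILE OPTION
# Succeeds if OPTION is built in (=y) or built as a module (=m).
kernel_has_option() {
    grep -Eq "^$2=(y|m)\$" "$1"
}

# debugfs_is_mounted [MOUNTS_FILE]
# Succeeds if a debugfs file system is mounted on /sys/kernel/debug.
# MOUNTS_FILE defaults to /proc/mounts (fields: device mountpoint fstype ...).
debugfs_is_mounted() {
    awk '$2 == "/sys/kernel/debug" && $3 == "debugfs" { found = 1 }
         END { exit !found }' "${1:-/proc/mounts}"
}

# Example use (the config path is an assumption about your distribution):
#   kernel_has_option "/boot/config-$(uname -r)" CONFIG_DEBUG_FS || echo "no debugfs support"
#   kernel_has_option "/boot/config-$(uname -r)" CONFIG_MD       || echo "no CONFIG_MD"
#   debugfs_is_mounted || echo "debugfs is not mounted"
```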

Run/Time-Travel

You may want to read Prepare VMs Instructions before you prepare your guest VMs.

Remember that if you want to time-travel a VM with a block device, the disk must be identical at the start of both the recording and the replay runs. The easiest way to ensure this is to run your time-traveling VMs on top of LVM snapshots (see Prepare VMs Instructions for details). Alternatively, you can restore the disk contents before each run with dd or a similar tool.
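
As an illustration of the dd approach, the sketch below restores a disk from a pristine copy and verifies the result. The function name restore_disk and both paths are hypothetical. It assumes the pristine copy and the target have the same size, which holds for image files (dd truncates the output file) but must be ensured by you for block devices.

```shell
#!/bin/sh
# restore_disk GOLDEN TARGET
# Overwrites TARGET with a byte-for-byte copy of GOLDEN, then verifies it.
# GOLDEN and TARGET are hypothetical paths (image files or block devices).
restore_disk() {
    dd if="$1" of="$2" bs=1M 2>/dev/null || return 1
    # cmp -s prints nothing and succeeds only if both are identical.
    cmp -s "$1" "$2"
}

# Example use before each recording and each replay run:
#   restore_disk /path/to/golden.img /path/to/guest-disk.img || echo "restore failed"
```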

Record

To record a time-traveling run of a guest VM, first start the logging daemon, ttd-deviced. Then create the guest VM with the special ttd_flag set:

sudo ttd-deviced -f /<your path to log files folder>/ttd.log
sudo xm create /<your path to config files folder>/client_A_solo_no_ttd.conf time_travel="ttd_flag=1" -c

Replay

Replaying the execution history of the guest works the same way: start the ttd-deviced daemon, then create the guest VM with two flags, ttd_flag and tt_replay_flag:

sudo ttd-deviced -f /<your path to log files folder>/ttd.log
sudo xm create /<your path to config files folder>/client_A_solo.conf time_travel="ttd_flag=1, tt_replay_flag=1" -c
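
Since the record and replay invocations differ only in the flags, they can be wrapped in one helper. This is a sketch only: tt_flags, tt_run, and the DRY_RUN variable are hypothetical names, not part of the time-traveling tools, and the commands they issue are exactly the two shown above.

```shell
#!/bin/sh
# tt_flags MODE
# Prints the value of the time_travel option for MODE ("record" or "replay").
tt_flags() {
    case "$1" in
        record) echo "ttd_flag=1" ;;
        replay) echo "ttd_flag=1, tt_replay_flag=1" ;;
        *) echo "usage: tt_flags record|replay" >&2; return 1 ;;
    esac
}

# tt_run MODE LOG_FILE CONFIG_FILE
# Starts the logging daemon, then creates the guest with the matching flags.
# Set DRY_RUN=1 to print the commands instead of running them.
tt_run() {
    flags=$(tt_flags "$1") || return 1
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "sudo ttd-deviced -f $2"
        echo "sudo xm create $3 time_travel=\"$flags\" -c"
    else
        sudo ttd-deviced -f "$2" &&
        sudo xm create "$3" "time_travel=$flags" -c
    fi
}

# Example use (paths are placeholders):
#   DRY_RUN=1 tt_run record /path/to/ttd.log /path/to/guest.conf
```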

Sending a Bug Report

To analyze a replay error, I need to look at the execution trace near the point at which execution diverges. The first such point is reported in the console output of your system and looks like this:

TT_LOG:(cpu(1): ttd_replay_current_events) Error:Replay missed event:TTD_EXCEPTION(16),info:do_page_fault(14):now (brctr:533717718, ip:0xc029ec30), event (brctr:533717582, ip:0xc029ec30),

Here the execution diverged near branch counter 533717582. To provide me with meaningful debug information, do the following steps:

  • Read the replay log near this point:
    ttd-read-log -i -r -n 533717582 -v -f /tmp/sda4/images/client_snapshots/ttd.log > replay.run
  • Look in the replay.run file for the last successfully replayed event. Both logs are read starting from this event so that the original and replay traces are aligned, which is handy for a vimdiff comparison of the logs.
  • Let's say the last successfully replayed event happened at 533713656. Read both the replay and the original run around this point:
    ttd-read-log -i -n 533713656 -v -f /tmp/sda4/images/client_snapshots/ttd.log > orig.run
    ttd-read-log -i -r -n 533713656 -v -f /tmp/sda4/images/client_snapshots/ttd.log > replay.run
You can compare these two files to check that execution was identical up to a certain point and then diverged:
vimdiff -R orig.run replay.run
  • Send me orig.run and replay.run along with your domU kernel (I need the symbol information, so send the vmlinux file).
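
The branch counter used in the first step can be pulled out of the console output automatically. The sketch below assumes the error line has exactly the format shown above and takes the value from its "event (brctr:...)" field. The function name divergence_brctr and the file console.log are hypothetical.

```shell
#!/bin/sh
# divergence_brctr
# Reads console output on stdin and prints the branch counter of the first
# "Replay missed event" line, taken from its "event (brctr:...)" field.
# Prints nothing if no such line is found.
divergence_brctr() {
    sed -n 's/.*Replay missed event.*event (brctr:\([0-9][0-9]*\),.*/\1/p' | head -n 1
}

# Example use (console.log is a saved copy of the console output):
#   brctr=$(divergence_brctr < console.log)
#   ttd-read-log -i -r -n "$brctr" -v -f /path/to/ttd.log > replay.run
```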

One more thing: if execution diverged within a user-level application (e.g. the EIP was 0x8.......), you also have to send me the binary of that application. Be aware that it is sometimes hard to guess which application was executing at that point.
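
A quick way to tell whether a reported ip is a user-level address is to compare it with the kernel base. The sketch below assumes a 32-bit guest with the common 3 GB/1 GB address split, where kernel addresses start at 0xc0000000. That value is an assumption about your guest, and is_user_ip and KERNEL_BASE are hypothetical names.

```shell
#!/bin/sh
# ASSUMPTION: 32-bit guest with the default 3 GB/1 GB split, so kernel
# addresses start at 0xc0000000. Adjust KERNEL_BASE for other layouts.
KERNEL_BASE=0xc0000000

# is_user_ip ADDRESS
# Succeeds if ADDRESS (hex, e.g. 0x08048000) lies below KERNEL_BASE.
is_user_ip() {
    [ $(( $1 < $KERNEL_BASE )) -eq 1 ]
}

# Example use with the ip value from the error line:
#   is_user_ip 0xc029ec30 || echo "divergence is in the kernel"
```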