Current projects
FAST: Fair Assignment for Storage Tenants
Build a block-level cloud storage system with more predictable performance.
See our HotCloud '12 paper for details.
TR6 - Performance Interference in Ceph and FAST
git-public.flux.utah.edu:/flux/git/users/xinglin/writting/TR6.git
rbd driver:
git-public.flux.utah.edu:/flux/git/users/xinglin/projects/fast/linux-3.2.16.git
Ceph:
git-public.flux.utah.edu:/flux/git/users/xinglin/projects/fast/ceph-0.56.3.git
Improving the performance of deduplication storage systems
Deduplication storage systems need massive, parallel computation for hashing,
index lookup, block existence checks, compression, and decompression.
We propose using GPUs to accelerate these computations, which reduces
their overhead on read and write operations. Deduplication storage
systems also store file blocks on disk in nonsequential order, but
disks perform well only for sequential access. As a result,
disks in deduplication storage systems suffer significant performance
degradation and increased load. For a set of Linux images stored in Venti,
read performance drops by about 82%, from 34.43 MB/s to only 6.19 MB/s.
We are investigating the causes of this drop and ways to reduce it.
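To make the nonsequential-layout problem concrete, here is a minimal sketch of a content-addressed block store. It is not Venti's implementation: the class `ToyDedupStore`, the 4 KB block size, and the in-memory list standing in for the disk are all assumptions made for illustration. It shows the hash, index-lookup, and existence-check steps on the write path, and how a deduplicated file ends up referring to blocks scattered across the log.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


class ToyDedupStore:
    """Hypothetical in-memory content-addressed block store."""

    def __init__(self):
        self.log = []     # blocks in the order they were first written (the "disk")
        self.index = {}   # fingerprint -> position in self.log
        self.files = {}   # file name -> list of fingerprints

    def write(self, name, data):
        recipe = []
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            fp = hashlib.sha1(block).digest()   # hash computation
            if fp not in self.index:            # index lookup / existence check
                self.index[fp] = len(self.log)
                self.log.append(block)          # only new blocks are stored
            recipe.append(fp)
        self.files[name] = recipe

    def read(self, name):
        return b"".join(self.log[self.index[fp]] for fp in self.files[name])

    def layout(self, name):
        """Positions in the log that a sequential read of the file touches."""
        return [self.index[fp] for fp in self.files[name]]


def blk(byte):
    return byte * BLOCK_SIZE


store = ToyDedupStore()
image_a = blk(b"A") + blk(b"B") + blk(b"C")
image_b = blk(b"C") + blk(b"A") + blk(b"D")   # shares two blocks with image_a
store.write("image_a", image_a)
store.write("image_b", image_b)

print(store.layout("image_a"))  # [0, 1, 2]
print(store.layout("image_b"))  # [2, 0, 3]
print(len(store.log))           # 4
```

Six blocks are written but only four are stored. Reading image_b in file order visits log positions 2, 0, 3, which on a real disk would mean seeks rather than one sequential scan.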
Past projects
June 2011 ~ Jan. 2012
High-performance Disk Imaging With Deduplicated Storage
In clouds and network testbeds, a disk image deployment system
is needed to quickly distribute and install virtual machine images or
operating system images on host machines. Previous work has shown that
deduplication can save a significant amount of disk space for these
images. However, read and write performance in deduplication storage
systems is poor relative to traditional filesystem storage. In this work,
we demonstrate that a deduplication storage system can serve as the
backend of a high-performance image deployment system with only a
negligible drop in performance, by carefully pipelining its stages to
produce a balanced system.
[short paper][poster]
Jan. 2011 ~ June 2011
Refining the Utility Metric for Utility-Based Cache Partitioning
Miss rate is widely used to determine cache partitioning for multi-core
systems. However, it is well recognized in the community that
MPKI (misses per kilo-instruction) can lead to sub-optimal cache
partitioning. This project quantifies how sub-optimal MPKI-based cache
partitioning is and proposes a simple scheme for CPI prediction.
[paper]
[source code]
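As a rough illustration of miss-based partitioning, the sketch below hands out cache ways one at a time to whichever application avoids the most misses with one extra way. It is a simplified greedy scheme, not the method from the paper; the function `partition_ways` and the miss curves are made up for the example.

```python
def partition_ways(miss_curves, total_ways):
    """Greedy way allocation by marginal miss reduction.

    miss_curves[app][w] is the (hypothetical) miss count of `app` when it
    owns w ways, for w = 0..total_ways.
    """
    alloc = {app: 0 for app in miss_curves}
    for _ in range(total_ways):
        # Utility of one more way = misses avoided by that way.
        def gain(app):
            curve = miss_curves[app]
            return curve[alloc[app]] - curve[alloc[app] + 1]

        best = max(alloc, key=gain)
        alloc[best] += 1
    return alloc


# Made-up miss curves for two applications sharing a 4-way cache.
curves = {
    "app0": [100, 60, 40, 35, 34],
    "app1": [80, 70, 62, 56, 52],
}
allocation = partition_ways(curves, 4)
print(allocation)  # {'app0': 2, 'app1': 2}
```

The utility here is purely a miss count. Because a miss does not cost every application the same number of cycles, an allocation that minimizes misses need not minimize CPI; feeding the same loop curves of predicted CPI instead of misses would be one way to test that gap.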
Dec. 2010
Linux physical memory deduplication
The main goal was to deduplicate identical pages in physical memory.
We implemented a kernel module that computes a hash of every
physical page on both x86 and x86_64 Linux, and a second kernel module
that exports the contents of a single specified physical page.
We stopped the project after finding that Linux already implements this
feature in mm/ksm.c (Kernel Samepage Merging).
[source code]
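The page-hashing idea can be sketched in user space as follows. This is not the kernel module: it scans a plain byte buffer standing in for physical memory, and the function `find_duplicate_pages` and the use of SHA-1 are assumptions for illustration. Pages with equal hashes are compared byte for byte before being reported, since a hash match alone does not prove the contents are identical.

```python
import hashlib
from collections import defaultdict

PAGE_SIZE = 4096  # common x86 / x86_64 page size


def page_at(memory, pfn):
    return memory[pfn * PAGE_SIZE:(pfn + 1) * PAGE_SIZE]


def find_duplicate_pages(memory):
    """Return groups of page numbers whose contents are identical."""
    by_hash = defaultdict(list)
    for pfn in range(len(memory) // PAGE_SIZE):
        by_hash[hashlib.sha1(page_at(memory, pfn)).digest()].append(pfn)

    groups = []
    for pfns in by_hash.values():
        if len(pfns) < 2:
            continue
        # Confirm the hash match with a full comparison.
        first = page_at(memory, pfns[0])
        same = [p for p in pfns if page_at(memory, p) == first]
        if len(same) > 1:
            groups.append(same)
    return groups


# Four fake pages: pages 0 and 2 are both zero-filled.
memory = (bytes(PAGE_SIZE) + b"\x01" * PAGE_SIZE
          + bytes(PAGE_SIZE) + b"\x02" * PAGE_SIZE)
duplicates = find_duplicate_pages(memory)
print(duplicates)  # [[0, 2]]
```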
Resources:
storage-related I/O traces:
Traces from UCSC, SNIA traces
open source deduplication storage systems:
Venti
ZFS, opendedup
Workloads:
DVDStore, Microsoft Exchange Server (LoadGen), TPC-H
IO profiles
VDI profiles