Where?

Find the code here: http://github.com/bwlewis/pvshm.
Find PVFS2 here: http://PVFS.org/.

What?

The pvshm file system for Linux (work in progress).

Pvshm provides a simple way to add basic memory mapping capability to almost any file system, without tinkering with the file system itself.

The pvshm module defines an overlay file system for the Linux kernel that sits between applications and an underlying file system. It implements address space operations on its files as read and write operations on corresponding backing files in the underlying file system. We wrote pvshm so that we could memory-map files in a PVFS2 file system, but it can be used with any underlying file system that supports read and write operations.
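To make the idea concrete, here is a minimal user-space sketch of what "address space operations as reads and writes" amounts to. It is an analogy only, not the kernel code: the helper names read_page and write_page and the fixed PAGE_SIZE are assumptions made for illustration.

```python
import os

PAGE_SIZE = 4096  # assumed page size for this sketch


def read_page(backing_fd, index):
    """Fill one page from the backing file with an ordinary positional read.

    Bytes past the end of the file are zero-filled.
    """
    data = os.pread(backing_fd, PAGE_SIZE, index * PAGE_SIZE)
    return bytearray(data.ljust(PAGE_SIZE, b"\0"))


def write_page(backing_fd, index, page):
    """Write one whole page back to the backing file with an ordinary write."""
    return os.pwrite(backing_fd, bytes(page), index * PAGE_SIZE)
```

Because only plain reads and writes at page-aligned offsets are needed, the underlying file system does not have to know anything about memory mapping.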

The name stands for parallel virtual shared memory, one of the nifty tricks possible with the combination of pvshm and PVFS2.

Why would one want this?

Memory-mapped files are often quite useful, but not every file system supports file mapping. Consider, for example, the PVFS2 file system. PVFS2 is an elegant, high-performance, parallel file system. It aggregates storage across multiple networked computers into a single parallel file system that multiple clients can use simultaneously.

PVFS2 is focused intensely on performance, especially in HPC settings and for applications using MPI. More general-purpose capabilities like memory-mapped files are left out by design. The pvshm file system provides basic memory-mapped file support on top of existing installations of PVFS2 without requiring modification of PVFS2 code or settings. The following example inspired the name pvshm.

Example: Parallel Virtual Shared Memory with PVFS2

Let's say we have a small cluster of GNU/Linux nodes and a problem that would benefit from access to a very large pool of relatively fast memory. We can use PVFS2 and pvshm to provide a virtual pool of RAM from across the cluster as follows:
  1. Configure PVFS2 across the cluster, using /dev/shm on each node as a backing store.
  2. Mount the PVFS2 file system on one or more nodes.
  3. Copy the problem data into the PVFS2 directory.
  4. Mount the pvshm file system on one or more nodes.
  5. Programs running on the nodes mounting pvshm may now memory-map the problem data (something not possible with PVFS2).
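Once the steps above are complete, a program maps the data like any other file. The sketch below uses Python's standard mmap module; the function name increment_byte is made up for illustration, and any path under the pvshm mount point (for example /mnt/pvshm/problem.dat) is hypothetical.

```python
import mmap


def increment_byte(path, offset):
    """Memory-map `path` read/write, bump one byte in place, and return it."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the default mapping is shared and writable.
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset] = (m[offset] + 1) % 256
            m.flush()  # ask the kernel to write the dirty page back
            return m[offset]
```

A call such as increment_byte("/mnt/pvshm/problem.dat", 0) would then update the first byte of the mapped problem data, assuming that file exists and is not empty.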


Of course, the real benefit of a system like PVFS2 is high-performance parallel access to that memory. The pvshm file system does not explicitly coordinate parallel access to memory from multiple processes: it does not enforce cache consistency, and memory-mapped reads and writes operate at page granularity. However, pvshm does provide tools that help client applications manage cache state on their own. See the README document in the source code directory for more information.
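To see why page granularity without cache consistency matters, consider the toy model below. It is a simulation only and does not use pvshm or its tools: the Client class, its sync and invalidate methods, and the 4096-byte PAGE_SIZE are all assumptions made for illustration. Two clients cache the same page, each changes a different byte, and each writes the whole page back.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model


class Client:
    """Toy client that caches whole pages and writes whole pages back."""

    def __init__(self, backing):
        self.backing = backing  # shared bytearray standing in for the backing file
        self.cache = {}         # page index -> private copy of that page

    def _page(self, index):
        # Fetch a private copy of the page on first touch.
        if index not in self.cache:
            start = index * PAGE_SIZE
            self.cache[index] = bytearray(self.backing[start:start + PAGE_SIZE])
        return self.cache[index]

    def store(self, offset, value):
        # Modify one byte in the cached copy only.
        self._page(offset // PAGE_SIZE)[offset % PAGE_SIZE] = value

    def sync(self):
        # Write every cached page back in full.
        for index, page in self.cache.items():
            start = index * PAGE_SIZE
            self.backing[start:start + PAGE_SIZE] = page

    def invalidate(self):
        # Drop cached pages so the next access re-reads the backing store.
        self.cache.clear()


def lost_update_demo():
    backing = bytearray(PAGE_SIZE)
    a, b = Client(backing), Client(backing)
    a.store(0, 1)
    b.store(1, 2)
    a.sync()
    b.sync()  # b's stale copy of byte 0 overwrites a's update
    return backing[0], backing[1]
```

In this model lost_update_demo() returns (0, 2): the first client's write is lost. If the second client instead waits for the first to sync and invalidates its own cache before writing, both updates survive, which is the kind of coordination a client application has to arrange for itself.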