Re: [Tails-dev] Tahoe-LAFS persistence

Author: Leif Ryge
Date:  
To: Greg Troxel
CC: tahoe-dev, The Tails public development discussion list
Subject: Re: [Tails-dev] Tahoe-LAFS persistence
On Sun, Jun 01, 2014 at 11:11:29AM -0400, Greg Troxel wrote:
> David Stainton <dstainton415@???> writes:
>
> > Since Tahoe-LAFS is not a posix compliant filesystem...
> > we cannot easily create a persistent volume that only
> > stores data on a Tahoe grid. There is an ugly FUSE hack
> > but it is extremely inefficient.
>
> This can be viewed as a bug in tahoe :-)
> But seriously, fixing the FUSE interface would be a great contribution.
> It's not clear to me how efficient the FUSE interface has to be before
> it isn't the limiting issue; tahoe is not a fast filesystem.


While Tahoe's native FUSE interface bitrotted long ago, there are two FUSE
interfaces which are currently usable:

- Tahoe has an SFTP server which can be mounted with FUSE's sshfs, like any
other SFTP server
- the python-fs module (a general filesystem abstraction library) has a Tahoe
client, using Tahoe's web api, and can expose it (like any python-fs object)
via FUSE.

The big problem with using these FUSE mounts for many applications is that
Tahoe mutables don't provide random-access writes. So if you put, say, your
Firefox profile on a Tahoe-backed FUSE mount... every write+sync to Firefox's
places.sqlite etc. will involve re-uploading the whole thing, and Firefox will
only be usable for brief moments at a time, if at all (I think - I haven't
tried it).
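
To put rough numbers on that, here is a toy Python model. It is not Tahoe
code: both classes and the 10 MiB / 4 KiB figures are made up for
illustration. It only counts how many bytes go over the wire when a store must
replace the whole object on every write, versus one that can patch a byte
range.

```python
# Toy model, not Tahoe code: class names and sizes are illustrative assumptions.

class WholeObjectStore:
    """Every write re-uploads the entire file (no random-access writes)."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.bytes_uploaded = 0

    def write_at(self, offset: int, chunk: bytes) -> None:
        self.data[offset:offset + len(chunk)] = chunk
        self.bytes_uploaded += len(self.data)  # the whole file is sent again


class RangeWriteStore:
    """Hypothetical store that can upload only the changed byte range."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.bytes_uploaded = 0

    def write_at(self, offset: int, chunk: bytes) -> None:
        self.data[offset:offset + len(chunk)] = chunk
        self.bytes_uploaded += len(chunk)  # only the changed bytes are sent


def simulate(store, writes: int, page: int = 4096) -> int:
    """Perform `writes` page-sized writes+syncs and return total bytes uploaded."""
    for i in range(writes):
        store.write_at(i * page, b"x" * page)
    return store.bytes_uploaded


FILE_SIZE = 10 * 1024 * 1024  # pretend places.sqlite is 10 MiB
whole = simulate(WholeObjectStore(bytes(FILE_SIZE)), writes=100)
ranged = simulate(RangeWriteStore(bytes(FILE_SIZE)), writes=100)
print(whole, ranged, whole // ranged)
```

In this model a hundred small writes cost a hundred full uploads, which is
the behaviour I'd expect to make a browser profile painful on such a mount.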

 *****************************************************************************
 *** The following section of this email is not about Tahoe+Tails.         ***
 *** If you're just interested in that, skip to "BACK TO THE NEAR FUTURE". ***
 *****************************************************************************


In my opinion, this is not something that should be "fixed" by improving
Tahoe's current mutable files, but rather by replacing them since they have
several other shortcomings. Most importantly (imo):

- They don't preserve history (each write overwrites the previous version, so
if you have a write capability you can also destroy what was there before)
- They aren't lockable (if you have uncoordinated writes to a file, you're
gonna have a bad time. So, you must be very careful sharing writecaps.)
- There is no asymmetric encryption (if you have a write capability, you can
also read).
- They aren't deduplicated at the file level, much less at the block level

My hand-wavey ideal solution to these problems ("chisel") involves hashsplit
(BUP-style) asymmetrically-encrypted immutable files, references to which are
added to a "directory" which is a *decentralized add-only set*. So, there can
be write-only capabilities which can neither read nor delete data after they've
written it, and because the directory is an add-only set instead of an
append-only file there can be multiple writers without coordination. I've got a
rough idea about how to do this, and a little bit of code... hopefully I'll
find time to work on it more soon. I'm building this separately from Tahoe, but
intending to (optionally) use Tahoe immutable files underneath, and I'd like to
eventually be able to expose a FUSE interface that *does* allow random access
writes to files. Tahoe will probably need some performance improvements for
this to work well on top of it, though.
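
To illustrate why an add-only set needs no coordination between writers, here
is a minimal Python sketch. It is not the chisel design and models neither
encryption nor capabilities; `AddOnlySet` and `ref_for` are hypothetical names.
It only shows that merging by set union gives the same result whatever order
the writers' additions arrive in.

```python
import hashlib


class AddOnlySet:
    """Grow-only set of references; there is deliberately no remove()."""

    def __init__(self, refs=()):
        self._refs = frozenset(refs)

    def add(self, ref: str) -> "AddOnlySet":
        return AddOnlySet(self._refs | {ref})

    def merge(self, other: "AddOnlySet") -> "AddOnlySet":
        # Union is commutative, associative and idempotent, so replicas
        # converge without locks or any agreed ordering of writes.
        return AddOnlySet(self._refs | other._refs)

    def refs(self):
        return sorted(self._refs)


def ref_for(blob: bytes) -> str:
    """Stand-in for a reference to an immutable file: here just its SHA-256."""
    return hashlib.sha256(blob).hexdigest()


# Two writers add entries independently, with one entry in common.
alice = AddOnlySet().add(ref_for(b"photo-1")).add(ref_for(b"photo-2"))
bob = AddOnlySet().add(ref_for(b"photo-2")).add(ref_for(b"notes"))

ab = alice.merge(bob)
ba = bob.merge(alice)
print(ab.refs() == ba.refs(), len(ab.refs()))
```

An append-only file, by contrast, would need the writers to agree on the order
of appends, which is exactly the coordination this avoids.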

Another undesirable thing about Tahoe's current mutables is that write caps
contain RSA private keys which are rather cumbersome to write down. If they
were ECC private keys they could be generated from memorable secrets (which is
potentially dangerous but quite convenient) or at least shorter secrets (because
RSA requires much larger keys than ECC for a given security level).
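
As a sketch of the "memorable secret" idea, the following Python stretches a
passphrase into 32 bytes, which is the size of an Ed25519 private seed. The
function name, salt, and iteration count are assumptions for illustration, and
nothing here is part of Tahoe. The result is only as strong as the passphrase,
which is the dangerous part.

```python
import hashlib


def seed_from_passphrase(passphrase: str, salt: bytes,
                         iterations: int = 200_000) -> bytes:
    """Derive a 32-byte seed from a passphrase using PBKDF2-HMAC-SHA256.

    Illustrative parameters only; choosing the salt and work factor is a
    design decision this sketch does not settle.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


seed = seed_from_passphrase("correct horse battery staple", b"example-salt")
print(len(seed), seed.hex())
```

The same passphrase and salt always give the same seed, so the key can be
regenerated instead of written down. An RSA private key has no equivalent
shortcut this simple, since it is built from large primes rather than being an
arbitrary short byte string.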

But none of this is very relevant to the issue of using Tahoe for Tails
persistence in the immediate future, so...

*******************************
*** BACK TO THE NEAR FUTURE ***
*******************************

> > So there should be three options per persistent file-set:
> > 1. do not persist
> > 2. persist to local media
> > 3. persist to local media AND a Tahoe-LAFS grid
>
> Are you proposing to store the capabilities to access the persistent
> data on the local media (removable flash, I'm assuming)? I've come
> into this thread somewhat late, but the security and usability
> properties are not entirely clear to me.


I think the current plan for Tails persistence on Tahoe is to persist to the
USB disk (as Tails already does) and run tahoe backup on a regular schedule
(and/or when the user manually triggers it, and/or triggered by some sort of
inotify-driven agent). So, yes, the user should store their root cap on Tails'
encrypted persistent partition, and also back it up elsewhere (on another USB
stick, or on paper). The Tails persistence setup tool should then have an
option to restore from an existing Tahoe root cap.
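
A scheduled job along those lines could be as small as the Python sketch
below. The source path and the `tails:` alias are assumptions for
illustration; `tahoe backup` itself is the existing CLI command, and the
scheduling (cron, a manual trigger, or an inotify agent) is left out.

```python
import subprocess

# Assumptions: where the unlocked persistent volume is mounted, and a Tahoe
# alias named "tails:" that already points at the user's root cap.
PERSISTENT_DIR = "/live/persistence/TailsData_unlocked"
BACKUP_TARGET = "tails:backups"


def backup_command(source: str = PERSISTENT_DIR, target: str = BACKUP_TARGET):
    """Build the argv for `tahoe backup FROM ALIAS:TO`."""
    return ["tahoe", "backup", source, target]


def run_backup(runner=subprocess.run) -> bool:
    """Run one backup; `runner` is injectable so the logic can be tested
    without a Tahoe node. Returns True if the command exited with status 0."""
    result = runner(backup_command(), check=False)
    return result.returncode == 0


print(" ".join(backup_command()))
```

Restoring would be the reverse: given a root cap, copy the latest snapshot
back onto a freshly created persistent volume.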

I guess the restore could also be done to a ramdisk on a Tails system without
USB persistence, and maybe something could even be done using unionfs to
combine a ramdisk with a FUSE-mounted Tahoe directory to avoid needing to
download the whole thing? I wonder how that would work.

> > For the use case where you only want to store the data in
> > the Tahoe grid... then simply use the Tahoe commandline
> > tools to upload the file(s).
>
> That seems like it could easily be:
>
> 4. persist to tmpdir and then upload to tahoe, deleting the tmp file.
>
> I think the state of the FUSE interface isn't all that relevant if
> you're going to add code for tahoe anyway. The "tahoe cp" interface is
> very dos/mtools, but quite workable, even if it would be better to be
> able to use a standard VFS interface.


For the data-on-tahoe-but-not-on-the-usb-stick use cases which don't require
random access writes, one of the existing FUSE interfaces should be sufficient
for now. This will enable people to access e.g. large media files that they
don't want to keep on the USB stick. They could even "move" them there later -
imagine, while offline, you copy photos from your camera to your Tails
persistent USB. Later, while online, tahoe backup runs. Later still, your USB
stick is getting full, so you [insert UI here] link the readcap for some
backed-up photos into another subdirectory of your root cap and then delete
them from the USB stick. Or, you delete the files before creating that
link, but you can still go find it in an old tahoe backup snapshot of the USB
Persistence directory later.

Going further, you could have something that knows you want to devote a certain
amount of space on your local storage to a "Recent Photos" directory, and then
automatically delete local copies of the oldest files as new ones are added
(while ensuring all are stored in Tahoe).
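
The eviction policy could look something like this Python sketch. The
`LocalFile` record and `files_to_evict` function are hypothetical, and how
"backed up" is determined (presumably by checking against the latest tahoe
backup snapshot) is left as an assumption.

```python
from dataclasses import dataclass


@dataclass
class LocalFile:
    name: str
    size: int        # bytes
    mtime: float     # modification time, seconds since the epoch
    backed_up: bool  # assumed to mean "confirmed stored in Tahoe"


def files_to_evict(files, quota_bytes: int):
    """Return names of the oldest backed-up files to delete locally so the
    directory fits within quota_bytes. Files not yet backed up are never
    chosen, even if that leaves the directory over quota."""
    total = sum(f.size for f in files)
    evict = []
    for f in sorted(files, key=lambda f: f.mtime):  # oldest first
        if total <= quota_bytes:
            break
        if not f.backed_up:
            continue
        evict.append(f.name)
        total -= f.size
    return evict


photos = [
    LocalFile("a.jpg", 40, 1.0, True),
    LocalFile("b.jpg", 40, 2.0, False),  # not yet in Tahoe: must be kept
    LocalFile("c.jpg", 40, 3.0, True),
    LocalFile("d.jpg", 40, 4.0, True),
]
print(files_to_evict(photos, quota_bytes=100))
```

Here the two oldest backed-up files are selected and the file that has not
reached Tahoe yet is skipped.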

I look forward to seeing Tahoe integrated with Tails, but I am a little bit
concerned about a potential pitfall which I think should be communicated to
users somehow: there is no way to delete the ciphertext of immutable files
(they're eventually garbage collected if their leases aren't renewed). If a
user believes their writecap has been compromised but the directory it points
to hasn't been read yet, they can overwrite that directory, but if the
adversary *has* read it or in some other way has recovered the readcaps for its
immutable subdirectories (which is what the backup snapshots are), the user
might want to delete those but cannot.
This is rather different from a typical access control based system where one
can simply change their password and/or ask the server to delete everything
quickly.

~leif (at c-base with david)