Re: [Tails-dev] Tails persistence use case

Author: anonym
Date:
To: The Tails public development discussion list
Subject: Re: [Tails-dev] Tails persistence use case

11/14/2011 10:17 AM, intrigeri:
> anonym wrote (14 Nov 2011 00:53:21 GMT) :
>
>> Requirements
>> ============
>
>> From the roadmap and various other places in our todo item [1] our
>> high-level requirements seem to be these:
>
>> * Persistent user data store.
>
> FTR, this was not meant to be implemented using live-snapshot at all.
> A "simple" wrapper around udisk may allow to unlock and mount (possibly
> read-only) a LUKS encrypted volume onto e.g. $HOME/data.

Oh. But having everything using the same system would be nice for
consistency, right?

>> The problems with snapshots
>> ===========================
>
>> But cpio.gz snapshots has some issues:
>
>> 1. It is very unfriendly to flash based storage if we only do minor
>> changes to our persistent data, since *all* persistent data are
>> written back to the physical storage at *every* shutdown. I'm afraid
>> minor changes is a more typical usage where it matters. Imagine
>> having 100 MB of emails, fetching maybe 50 KB worth of new mails from
>> your inbox, and then syncing *all* the 100 MB worth of old,
>> unmodified emails back as well. That causes pretty significant
>> write-wearing in comparison to how much data that was added to the
>> snapshot.
>
> Imagine your fetching of those 50kB new email is split into
> 4 different small fetches, across a 2-hour Tails session. Each of
> these small fetches is likely to update local email indexes and
> whatnot other files the MUA makes sure to keep up-to-date. Say those
> files, for a 100MB email store, are a few MB each. Then, snapshots
> ensure those files are written only once to the Flash device, while a
> more synchronous method would write them four times. Depending on the
> actual numbers (the 50kB / 100MB ratio, the number of fetches, the
> actual size of the email indexes etc), I guess the async' vs.
> sync' resulting figures wrt. write-wearing may be not *that* obvious.

I think this wouldn't be an issue if the underlying filesystem is
flash-friendly. But right, you make a good point. Let's just leave it at
this: it's not certain that snapshots are more flash-friendly than
overlays, as todo/persistence currently states.

>> 2. On boot all snapshots' files are synced into the tmpfs, so they're
>> stored in precious RAM. Hence snapshots cannot be very large
>> (specifically, the maximum is ~ ${RAM_SIZE}/2, and that leaves no
>> space for other file system modifications).
>
> What kind of big files / directories do we intend to make (optionally)
> persistent as part of "Persistent application-specific
> configurations"? (This is no rhetorical question, I don't want to save
> snapshots at all costs, but I think it will help the discussion to
> know better what we are talking about.)
>
> - /var/lib/tor : a few MB
> - /var/lib/i2p : ???
> - email store : generally a few dozens MBytes
> - random configuration / keys listed on the wiki: a few MB

It may be that I posed this issue as one only about limitations on
capacity for persistent data. But with "precious RAM" I also meant that
each bit of persistence is a lost bit of RAM. Even just "a few dozens
MBytes" of persistence could hit hard RAM-wise on a weaker system. We're
already treating weak systems badly enough, imho.
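
As a rough illustration of the RAM cost, the following Python sketch
assumes the tmpfs is capped at half of physical RAM (the
"~ ${RAM_SIZE}/2" figure quoted above) and that every byte restored
from a snapshot comes out of that budget. The function name and the
sample sizes are made up for the example.

```python
def tmpfs_headroom_mb(ram_mb, snapshot_mb):
    """Space left in the tmpfs after a snapshot is restored into it.

    Assumes the tmpfs limit is half of RAM; a negative result means the
    snapshot does not fit at all.
    """
    tmpfs_limit_mb = ram_mb / 2
    return tmpfs_limit_mb - snapshot_mb


# Hypothetical machines, each restoring 48 MB of persistent data.
for ram_mb in (256, 512, 2048):
    left = tmpfs_headroom_mb(ram_mb, snapshot_mb=48)
    print(f"{ram_mb} MB RAM: {left:.0f} MB left for other changes")
```

Under these assumptions, 48 MB of persistent data takes more than a
third of the tmpfs budget on a 256 MB machine, while it is barely
noticeable on a 2 GB one.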

> Anything else?

Some examples I can think of:

* Tails server edition: storage for web servers/wikis, file servers...
* FreeNet/GNUnet: the user data store.
* i2psnark: i2p's built-in bittorrent client.

But who knows what the future will bring us? That's why I'm opting for a
potentially more future-proof alternative.

>> The case for overlays
>> =====================
>
> [...]
>> The overlay's only limitation is that it has static size and cannot
>> automatically grow like a snapshot file can (snapshot partitions
>> can't for the same reason).
>
> Aren't there ways to have a loop/file-backed filesystem grow as
> needed? (I might be dreaming of using qcow2 as a container.)
> Probably a dead-end, but would be worth a quick search.

Possibly, I have no idea. I *think* aufs is an option as well.

I'll look into qcow2, and I'm open to more suggestions.

>> Proposed solution: locally specified inclusions
>> ===============================================
>
>> We make home-{rw,sn} obsolete, only live-{rw,sn} are considered by
>> live-boot (or more correctly, the scripts it adds to the initramfs).
>> When a persistent media (with label/filename "live-rw") is found by
>> live-boot, it looks for a file called .live-persistence.includes (but
>> I'll continue calling it just ".includes") in its root. If it's not
>> there, then it mounts the media (using aufs) on / just like it does for
>> live-rw currently. But if .includes is present, then it doesn't mount
>> anything on /, it instead bind-mounts the directories listed in
>> .includes to their specified destinations.
>
> Looks like it would perfectly suit our needs.
>
> A problem though, is that this way to deal with things requires
> a clever-enough underlying filesystem to get permissions right,
> whereas good old home-sn perfectly satisfies itself, I believe, with
> a FAT32 Flash stick. This is probably not a problem for Tails itself,
> but making home-sn obsolete may be a problem for other people due to
> that. The naming of .live-persistence.includes may be incompatible
> with poor filesystems, too.

There must be a misunderstanding here. I believe it must have to do with
some of:

* Persistency/snapshot partitions vs. persistency/snapshot files and
what my proposal changes for these.
* Where the .includes file is located.

Case in point: If you make a FAT32 partition and label it
"{live,home}-{rw,sn}", then you *will* get freaky permissions even if
you use the old persistence system. What you need to do with your FAT32
partition is to *NOT* use it as a persistency/snapshot partition, but as
a storage for persistency/snapshot files. So, you set an empty label and
store your {live,home}-{rw,sn}.{ext2,cpio.gz,...} files on it, and the
filesystems *in* these files will be able to handle permissions just fine.

So, the above is an issue in the *old* persistency system, and it
remains so in my new one, and probably any other reasonable system.
FAT32 cannot be used to store files intended for a more intelligent
filesystem. It's not even good for backing persistency/snapshot files
due to its file size limits. In fact, I see extremely little use for
FAT32 storing persistency/snapshot files -- to mount any of these files'
filesystems you need support for a more intelligent filesystem anyway,
so why not use that filesystem instead of FAT32 for storing the files
too? The only thing you can do with them with only FAT32 support is
deleting, backing up/restoring them.

What my proposal changes for "home-{sn,rw}" is just that it's not a
specially handled case any more -- that's the only way it's made
obsolete. You can still achieve that type of persistence by naming the
file/partition "live-{sn,rw}" and creating a .includes file that mounts
home in the right place.

Finally: The .includes file is intended to be stored *inside* the
persistency/snapshot partitions/files. I realize this wasn't made very
clear in my previous email, and the "in-depth example" was vague with
regard to this. If you go back to the example, $dev is either a
partition labeled "live-rw" or a file named
"live-rw.{ext2,cpio.gz,...}". You know, live-boot also mounts partitions
that don't have the correct label, and then it just looks for
persistency/snapshot files.
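
The lookup described in the proposal can be sketched roughly as
follows. This is only an illustration in Python, not live-boot's actual
initramfs code. The on-disk format of .includes is an assumption here:
one whitespace-separated "source destination" pair per line, with
sources relative to the root of the media. The function name
plan_mounts and the tuples it returns are likewise hypothetical.

```python
from pathlib import Path

INCLUDES_NAME = ".live-persistence.includes"


def plan_mounts(media_root):
    """Return the mounts to perform for an already-mounted live-rw media.

    Each result is (kind, source, destination). Nothing is mounted here;
    the function only decides what would be mounted.
    """
    media_root = Path(media_root)
    includes = media_root / INCLUDES_NAME

    if not includes.is_file():
        # No .includes: behave like the current live-rw and overlay /.
        return [("aufs", str(media_root), "/")]

    plan = []
    for line in includes.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks; comment lines are an assumed extension
        # Assumed format: "source destination" on each line.
        source, destination = line.split()
        plan.append(("bind", str(media_root / source), destination))
    return plan
```

Under these assumptions, a media whose .includes contains the single
line "home /home" would yield one bind mount of its home directory onto
/home, which is how the old home-rw behaviour could be reproduced.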

> The "inclusions" naming seems misleading to me. The old
> /etc/live-persistence.binds (see live-boot(7)) somehow occupies the
> "bind" namespace, which makes it more complicated to find a good name.

I'm aware of this namespace collision :/. In fact, first I chose the
name .live-boot.binds. When I found out about the existing .binds file I
changed "my" name to ".live-persistence.binds" and started toying with
the idea of making the /etc/live-persistence.binds the "global" version,
but if there's a "local" one present on some file/partition it overrides
the global one for *that* file/partition. The old use case would be
covered by adding the possibility to use "%tmpfs%" as source in my
suggested extended syntax (from the "Backwards-compatibility" section).

I couldn't figure out any plausible scenario where a global binds file
would be good, though, so I scrapped this idea. The power in my proposal
is that .includes is local, so there's no ambiguity on what will be
bind-mounted where.

The reason I later changed to ".includes" was to get a generic name in
case we come up with a better way to make per-directory persistence
(qcow2?). Oh, and it's also generic in the sense that it works for
snapshots.

> Wouldn't live-persistence.mount do the job?

I'm definitely open to a name change, but ".mount" isn't very generic
or intuitive either. I was also thinking about ".keep", but I don't
know. Maybe "live-persistence.list", following the style of
/etc/live-snapshot.list? I think I like that: "list which directories
should be persistent in the live system -- live-persistence.list".

>> Snapshots
>> ---------
>
>> .includes could also be used for all types of snapshots,
>
> Right. This could be configured in another file, e.g.
> .live-persistence.snapshots.

Why not a generic name that works for both? I think
live-persistence.list would work for both overlays and snapshots.

> So home-sn would not be made obsolete, eventually?

home-sn would be made obsolete in the sense that it's not a special case
any more. To get that type of functionality, you'd use live-sn and an
appropriate .includes/.mount/.binds/.whatever file.

Cheers!
