Since disk drives are cheap, backup should be cheap too. Of course it does not help to mirror your data by adding more disks to your own computer because a virus, fire, flood, power surge, robbery, etc. could still wipe out your local data center. Instead, you should give your files to peers (and in return store their files) so that if a catastrophe strikes your area, you can recover data from surviving peers. The Distributed Internet Backup System (DIBS) is designed to implement this vision.
Note that DIBS is a backup system, not a file sharing system like Napster, Gnutella, Kazaa, etc. In fact, DIBS encrypts all data transmissions so that the peers you exchange files with can not access your data.
After initial configuration, DIBS is designed to run in the background and automatically backup desired data. Specifically, any files, directories, or links placed in the DIBS auto backup directory (usually ~/.dibs/autoBackup) are periodically examined by DIBS and sent to peers for backup. If the data changes, DIBS automatically unstores old versions and backs up changes.
DIBS performs incremental backup. Specifically, if DIBS is asked to backup a file (either automatically or by the user), and DIBS determines the file is already backed up and the file is unchanged, DIBS does not re-backup the file. This allows you to efficiently backup large numbers of files without wasting bandwidth by repeatedly backing up unchanged data.
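The incremental check can be pictured with a small Python sketch. This is only an illustration of the idea, not the DIBS implementation: the digest comparison, the use of SHA-256, and the stored_digests dictionary are all assumptions made for the example.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_backup(path, stored_digests):
    """Return True if `path` is new or differs from its last recorded digest.

    `stored_digests` is a hypothetical dict mapping file paths to the digest
    recorded when the file was last backed up.
    """
    return stored_digests.get(path) != file_digest(path)
```

A file is sent again only when needs_backup returns True, which is what keeps repeated backups of unchanged data from using bandwidth.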
DIBS uses Gnu Privacy Guard (GPG) to encrypt and digitally sign all transactions. Thus you can be confident that even though you are sending your files to others for backup, your data will remain private. Furthermore, by using digital signatures, DIBS prevents others from impersonating you to store files with your peers.
DIBS uses Reed-Solomon codes (a type of erasure correcting code similar to those used in RAID systems) to gain the maximum robustness for a given amount of redundancy. See the FAQ for a description of the benefits of Reed-Solomon codes.
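To give a feel for what an erasure correcting code does, the Python sketch below uses a single XOR parity piece, the simplest such code. It is deliberately not Reed-Solomon and is not taken from DIBS: it can repair only one lost piece, whereas Reed-Solomon codes generalize the idea to several lost pieces. The function names are made up for the example.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_with_parity(pieces):
    """Given equal-length byte pieces, return them plus one XOR parity piece."""
    parity = bytes(len(pieces[0]))
    for piece in pieces:
        parity = xor_bytes(parity, piece)
    return list(pieces) + [parity]


def recover_missing(stored):
    """Rebuild the single piece marked None by XOR-ing all surviving pieces."""
    missing = [i for i, piece in enumerate(stored) if piece is None]
    if len(missing) != 1:
        raise ValueError("single parity can repair exactly one lost piece")
    survivors = [piece for piece in stored if piece is not None]
    rebuilt = bytes(len(survivors[0]))
    for piece in survivors:
        rebuilt = xor_bytes(rebuilt, piece)
    repaired = list(stored)
    repaired[missing[0]] = rebuilt
    return repaired
```

If each piece is held by a different peer, the loss of any one peer still leaves enough information to rebuild the data.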
DIBS includes a peer finder service to allow you to find peers to exchange backup space with.
Since peers can have varying levels of connectivity to the network, DIBS offers different communication methods to support a variety of users.
For DIBS to work, you need to have the following installed:
To install DIBS, download the binary distribution for your architecture or the entire source tarball from http://dibs.sourceforge.net or http://www.csua.berkeley.edu/~emin/source_code/dibs. If you download the source tarball, uncompress it, become root, and run python setup.py install. If you do not want to install as root, you can run python setup.py install --root=<alternate>/<root>/<path>.
The main file used to run DIBS is dibs.py which will usually be installed in an appropriate place for Python scripts. On a UNIX system, you should be able to run DIBS using dibs.py .... On a Windows system, you may need to do something like C:\<path>\<to>\python.exe dibs.py ....
Create a GPG key for DIBS using the command gpg --gen-key. You MUST use an empty passphrase for your key so that DIBS can sign/encrypt with it in batch mode. We recommend choosing the "email address" field of the key to be different from your true email address by appending .dibs (e.g. emin.dibs@alum.mit.edu). Make sure you give your gpg public key to any peers you want to trade files with. If your GPG key is not $USER@$HOST, you will need to put the commands dibsPublicKey = '<yourKeyName>' and dibsPrivateKey = '<yourKeyName>' in your dibsrc.py file. See the section in the manual on Options for more details.
DIBS currently refers to GPG keys via the email address as opposed to the GPG key fingerprint or other method. Thus when you set dibsPublicKey, dibsPrivateKey, etc., you should be doing something like
dibsPrivateKey = 'emin.dibs@alum.mit.edu'
Also, you do not need to put a real email address in the name of the key, you can essentially put in whatever name you like. Thus if you have multiple machines running dibs you can have one machine with the key name foo.dibs@example.com and another with the key bar.dibs@example.com and so on.
Issue the command dibs.py show_database
Normally, this command would show you the database of peers you can exchange files with (which should be empty). Since this is the first time you are using DIBS, this command (actually any command) will first create the ~/.dibs directory to hold DIBS related files. On a Windows system ~/ usually refers to C:\Documents and Settings\<user>.
If your email address is not $USER@$HOST and/or your GPG key is not $USER.dibs@$HOST you will need to tell DIBS about this in the file ~/.dibs/dibsrc.py. In that file put the following:
dibsPublicKey = '<your-gpg-key-name>'
dibsPrivateKey = '<your-gpg-key-name>'
dibsAdmin = '<your-email-address>'
Remember to refer to GPG keys by the email address as described in GPG Keys Named via Email Address.
Also, if your gpg program is in a weird place, you may also want to put
gpgProg = '<path>/<to>/gpg'
If your python executable is in a strange place (e.g., if you run Windows) or if you have more than one version of python installed (e.g., on a Debian or OS X system where you installed a different version in addition to the default python installation), you may also need to specify the location of the python interpreter via
pythonExe = '<path>/<to>/python'
Finally, you will probably need to set your outgoing mail server via something like
smtpServer = 'smtp.<your-isp>.com'
See the manual for other customizations that can go in the dibsrc.py file.
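Putting the settings above together, a complete ~/.dibs/dibsrc.py might look like the sketch below. Every value is an example only: the key name and email address are the ones used as examples in this manual, and the two paths and the SMTP host are hypothetical placeholders that you would replace with your own.

```python
# ~/.dibs/dibsrc.py (example values only; replace each one with your own)

# GPG key names, given as the email address on the key, not the fingerprint.
dibsPublicKey = 'emin.dibs@alum.mit.edu'
dibsPrivateKey = 'emin.dibs@alum.mit.edu'

# Your real email address.
dibsAdmin = 'emin@alum.mit.edu'

# Only needed when gpg or python is not found in the usual place.
gpgProg = '/usr/local/bin/gpg'        # hypothetical path
pythonExe = '/usr/local/bin/python'   # hypothetical path

# Outgoing mail server.
smtpServer = 'smtp.example.com'       # hypothetical host
```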
There are two basic ways to tell DIBS about people or computers with whom you want to exchange backup space. You can either find partners yourself, manually import their GPG keys, and manually enter their information using the add_peer command, or you can use the DIBS peer finder service with the post_contract and propose_contract commands. Ideally, the peer finder service should be simpler and easier, but it is currently somewhat experimental. Both methods are discussed below.
First, you must import the GPG keys from the people you want to exchange files with. This can be done using gpg --import. See the GPG documentation for more details. As described in GPG Keys Named via Email Address, DIBS currently uses the email address for a GPG key as the peer name, so when you create a GPG key it is useful to keep this in mind. Please read the description below before creating/importing GPG keys for use with DIBS.
Once you agree to trade files with someone, each of you must add the other to the DIBS database using the following command:
dibs.py add_peer --email TRUE_EMAIL_ADDRESS
--peer DIBS_KEY_FOR_PEER --local_quota L_QUOTA
--remote_quota R_QUOTA --comment COMMENT
--talk TALK_METHOD --listen LISTEN_METHOD
--host HOST --port PORT
The arguments have the following meaning:
For example, if you wanted to trade files with me where I store 1 MB for you and you store 5 MB for me you would issue the command
dibs.py add_peer --email emin@alum.mit.edu --peer emin.dibs@alum.mit.edu --local_quota 5000 --remote_quota 1000 --comment "trade with emin" --talk active --listen active --host martinian.com
There should be no line breaks in the command above. Notice that my email address is emin@alum.mit.edu, but my dibs key is emin.dibs@alum.mit.edu, and the machine I plan to run dibs on is martinian.com.
This will create an entry for the peer in the database (you can verify this using the command dibs.py show_database).
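Because the add_peer command is long and must be entered without line breaks, it can help to assemble it in a script. The Python sketch below only builds and prints the command line for the example above; it does not run DIBS. The helper name add_peer_command is made up for this illustration, and the argument names are the ones listed above.

```python
import shlex


def add_peer_command(email, peer, local_quota, remote_quota, comment,
                     talk, listen, host, port=None):
    """Build the argument list for `dibs.py add_peer` (nothing is executed)."""
    args = ["dibs.py", "add_peer",
            "--email", email,
            "--peer", peer,
            "--local_quota", str(local_quota),
            "--remote_quota", str(remote_quota),
            "--comment", comment,
            "--talk", talk,
            "--listen", listen,
            "--host", host]
    if port is not None:
        args += ["--port", str(port)]
    return args


# The example from the text: the peer stores 1 MB for you, you store 5 MB.
command = add_peer_command(
    email="emin@alum.mit.edu", peer="emin.dibs@alum.mit.edu",
    local_quota=5000, remote_quota=1000, comment="trade with emin",
    talk="active", listen="active", host="martinian.com")

# Print a single shell-safe line that can be pasted into a terminal.
print(" ".join(shlex.quote(arg) for arg in command))
```

Passing the list to subprocess.run instead of printing it would execute the command, assuming dibs.py is on your PATH.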
An alternative to manually finding a peer and adding the required information is to use the new post_contract and propose_contract commands introduced in version 1.0 of DIBS. Essentially, the post_contract command allows you to post an advertisement on the Internet (or alternatively a private intranet) describing the trade parameters you want. Someone else can then use the propose_contract command to answer your advertisement. If the proposed parameters match your advertisement then your DIBS client and your partner's client will automatically exchange the proper information such as GPG keys, talk and listen modes, etc.
These functions are still experimental, however, and so they are described in Peer Finder instead of the install section for now. Users are encouraged to try them out, but those who just want to get DIBS working and do not want to fiddle with things may prefer the better tested but more tedious add_peer approach.
To have DIBS automatically back up files or directories, put the desired files or directories (or better yet, links to them) in ~/.dibs/autoBackup. For example, if you have a UNIX system and want to have your .emacs file and your Mail directory automatically backed up you could do
cd ~/.dibs/autoBackup
ln -s ~/Mail
ln -s ~/.emacs
Once you start the DIBS daemon (described below), it will periodically check everything in ~/.dibs/autoBackup for changes and back them up.
For DIBS to work, you must start the daemon with the command
dibs.py start_daemon
I recommend starting the daemon inside a screen session if you have the screen program installed so that you can view the messages the daemon generates. If you are going to start DIBS in a terminal or console that you plan to close and you have a UNIX system, you should instead do
nohup dibs.py start_daemon > /dev/null &
so that DIBS does not die when trying to write to standard out.
The DIBS daemon will periodically send the files in your auto backup directory out to peers to be backed up. If errors occur, you may need to restart the daemon.
Finally, in order to recover from complete data loss you need to protect two absolutely critical pieces of information: your GPG key and the list of peers you trade with. You can obtain the former by exporting it from GPG and the latter from your ~/.dibs/dibs_database.peerDatabase file. So put these two pieces of information onto a floppy, CD, or good old fashioned paper and store them in a safe place (see the section on Recovering Everything for more details).
If you do not store these pieces of information you will still be able to recover from isolated damage which does not affect this critical data.
The Peer Finder service is designed to allow people to find trading partners. The basic idea is that you can use the post_contract command to advertise a “trading contract” describing your terms for trading backup space. Others can then use the propose_contract command to propose a specific contract within the parameters you specified. If your DIBS client finds the proposed contract to be valid (i.e., within the parameters you specified with post_contract), then public keys, email addresses, and other parameters will be exchanged between your DIBS client and the peer proposing the contract. In particular, manually issuing add_peer or edit_peer commands will no longer be required.
To summarize, the main benefits of the peer finder service are:
To post a contract, use the post_contract command as illustrated by the example below:
dibs.py post_contract --min_quota 10M --max_quota 30M --quota_mult 2.3 --lifetime 86400 --contract_name my_contract
This would post an advertisement for a trading agreement where the poster wants between 10 and 30 megabytes of storage space as specified by --min_quota and --max_quota. The --quota_mult of 2.3 indicates that the poster wants at least 2.3 times as much space from a peer as he provides. The --lifetime value indicates that this contract is valid for 86400 seconds (i.e., one day). After that time elapses, it will be automatically removed from the contract server. The optional --contract_name argument specifies the name of the contract (my_contract in this example). Usually, you should not explicitly name a contract and should instead let DIBS automatically generate a name. Providing a name can be useful in automated scripts, however.
The advertisement is posted to the server specified by the defaultContractServerURL in the dibsrc.py file (if no such variable is present then the current default is www.martinian.com:8000/~emin/cgi-bin/peer_finder). You can view posted contracts from yourself or others by pointing a web browser to www.martinian.com:8000/~emin/cgi-bin/peer_finder/show_contracts.cgi (or the value of the defaultContractServerURL followed by show_contracts.cgi if you are using a different contract server).
The --quota_mult option may seem strange at first. Why would anyone be willing to give more space than they receive? One possibility is donating space to charities. A worthy organization might post a contract either asking for free space or highly discounted space. Another possibility is that some people might have very high bandwidth connections, high uptime, or other features which make them more desirable trading partners. By providing --quota_mult the peer finder service allows for a variety of trading relationships.
You can cancel a contract that you posted with the unpost_contract command as illustrated below:
dibs.py unpost_contract --contract_name NAME
where NAME is replaced with the name of the contract you are trying to cancel. The contract name will be shown on the contract server show_contracts.cgi page. Also, you can find your currently posted contracts by using the command
dibs.py show_database --only posted_contracts
Obviously you cannot unpost contracts posted by others and nobody can unpost your contract. This is accomplished by storing a revocation password in your DIBS database which is told only to the server. Your client will automatically supply the required revocation password to the server when unposting.
Once you have browsed www.martinian.com:8000/~emin/cgi-bin/peer_finder/show_contracts.cgi or another contract server site and found a contract you would like to enter into, use the propose_contract command. The following example illustrates how you would respond to the contract posted in Posting a Contract.
dibs.py propose_contract --local_quota 10M --remote_quota 23M --talk active --listen active --contract_name my_contract
The --local_quota, --remote_quota, --talk, and --listen arguments are the parameters which the peer you are proposing the contract to would use if he were to use the add_peer command. This bears repeating: these arguments are from the point of view of the peer you are proposing the contract to not from your point of view. Thus the --local_quota argument specifies how much space the peer will give you, the --remote_quota argument specifies how much space the peer will get from you, the --talk argument specifies how the peer will contact your machine, and the --listen argument specifies how your DIBS client will contact the peer.
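The relationship between a posted contract and a proposal can be sketched in Python. The rule below is an assumption inferred from the description of --min_quota, --max_quota, and --quota_mult, not the check DIBS actually performs: it takes the quotas from the poster's point of view, requires the space the poster receives to lie between the minimum and maximum, and requires it to be at least quota_mult times the space the poster gives. Quotas are plain kilobyte counts, and the function name is made up for the example.

```python
from fractions import Fraction


def proposal_is_valid(local_quota_kb, remote_quota_kb,
                      min_quota_kb, max_quota_kb, quota_mult):
    """Check a proposal against posted terms, from the poster's point of view.

    local_quota_kb  -- space the poster would give to the proposer
    remote_quota_kb -- space the poster would receive from the proposer
    quota_mult      -- given as a string such as "2.3" so that the
                       comparison is exact rather than floating point
    """
    mult = Fraction(str(quota_mult))
    within_range = min_quota_kb <= remote_quota_kb <= max_quota_kb
    enough_return = remote_quota_kb >= mult * local_quota_kb
    return within_range and enough_return


# The posted example: 10M to 30M wanted, at a multiple of 2.3.
# The proposal offers the poster 23M in return for 10M.
print(proposal_is_valid(10000, 23000, 10000, 30000, "2.3"))
```

Under these assumptions the example proposal sits exactly on the boundary: 23M is 2.3 times 10M, so offering the poster any less would fail the check.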
Once you issue the propose_contract command, your DIBS client will try to contact the poster of the contract. Thus, the poster should have an active --listen method; otherwise there will be no way for your DIBS client to contact the poster's DIBS client! Assuming that your client successfully contacts the poster, the poster will view the proposed contract and make a yes or no decision. The poster will then contact the proposer to return the decision and potentially exchange key information.
The simplest case is when both poster and proposer accept direct incoming connections and both --talk and --listen are active. Things can still work, however, if the proposer is behind a firewall, but in this case, the proposer must issue the poll_passives command followed by the process_message command to get the response from the poster.
In any case, once the proposed contract is accepted and the proper messages are exchanged, both the posting and proposing DIBS client will exchange GPG key information, and set up their quotas accordingly without the need for explicitly using the add_peer command.
As an aside, the above discussion illustrates another reason for the --quota_mult argument in posting a contract. If your machine is available to accept incoming connections then others who are behind firewalls can still peer with you. On the other hand, if your machine is behind a firewall and cannot accept incoming connections then you can only peer with machines that can accept incoming connections. This would suggest that machines which accept incoming connections should be able to “charge more” for peering than those which cannot accept incoming connections. The --quota_mult facilitates this.
If you are running a private DIBS network (e.g., to backup files within your company or organization), you might want to run your own peer finder service. To do so, look at the README file in the src/peer_finder/cgi-bin directory of the DIBS distribution.
Once the Peer Finder service becomes more stable, more documentation on running your own Peer Finder service will be added to the regular documentation.
The preferred way to use DIBS is by running the DIBS daemon in the background. The daemon listens for and automatically processes incoming messages from peers, and automatically backs up any files, directories, or links in the directory ~/.dibs/autoBackup (or whatever you have autoBackupDir set to). To start the daemon, use the command dibs.py start_daemon. For peers to contact your DIBS client, YOU MUST HAVE THE DIBS DAEMON RUNNING.
If you have the screen program available, the best way to start the DIBS daemon is inside a screen session. That way you can view what the daemon is doing inside your screen session. If you plan to run the daemon without a connection to a console (e.g., if you want to start the daemon and then close your terminal window) you should redirect the output to /dev/null.
The simplest way to use DIBS is to put links to any data you want backed up in the directory ~/.dibs/autoBackup (or whatever you have autoBackupDir set to). Periodically, the DIBS daemon will execute the auto_check command which will make DIBS incrementally backup your data. Alternatively, if you want the contents of this directory backed up immediately you can issue the auto_check command manually.
If you want to backup a particular file or directory, or write scripts to use DIBS for more sophisticated backup use the store command. For example, to backup everything in your mail directory you would do
dibs.py store --name mail
This will cause DIBS to recursively traverse mail and incrementally store any files it encounters. In this context, incrementally means that a file is only sent to peers if no previous backup exists or if the previous backup differs from the current version of the file.
If you want to backup the same data under different names you can use the --as option to the store command. For example, on a UNIX system you could set up a CRON job to execute the following command every day:
dibs.py store --name mail --as mail.`date +%a`
This would cause dibs to have seven versions of the directory mail backed up: mail.Mon, mail.Tue, etc. Each day of the week, DIBS would compare the mail directory to the backup it made one week ago and re-backup any new files or files which have changed.
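The same rotation can be written in Python, which may be convenient on systems without a UNIX shell. The sketch below only builds the command; the helper name is made up for the example, and the weekday abbreviation comes from strftime("%a"), which matches date +%a in the default C locale.

```python
import datetime


def rotating_store_command(name, day=None):
    """Build `dibs.py store --name NAME --as NAME.<weekday>` as a list."""
    day = day or datetime.date.today()
    suffix = day.strftime("%a")  # abbreviated weekday name, e.g. Mon
    return ["dibs.py", "store", "--name", name, "--as", f"{name}.{suffix}"]


print(" ".join(rotating_store_command("mail")))
```

Since the suffix cycles through seven values, each day's run compares the directory against the backup stored under the same name one week earlier.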
You can recover a file with the command
dibs.py recover_file --file FILENAME
Note that FILENAME must correspond to the name which you stored the file as. The auto_check command stores data with the root obtained by concatenating the rootDir variable with the autoBackup variable. On UNIX, the rootDir variable defaults to /. So, on a UNIX system, if you want to recover the file foo stored in ~/.dibs/autoBackup (or whatever you have autoBackupDir set to), you would use the command
dibs.py recover_file --file /autoBackup/foo
This begins the recovery procedure. DIBS will email you when the file is fully recovered unless you have set the value of mailUserOnRecovery to 0.
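As a small illustration of the naming rule for automatically backed up files, the Python sketch below computes the name to pass to recover_file. It assumes the stored name is rootDir joined with the directory name autoBackup and then the path relative to the auto backup directory, as in the /autoBackup/foo example; the function name is hypothetical.

```python
import posixpath


def auto_backup_stored_name(relative_path, root_dir="/", auto_backup="autoBackup"):
    """Return the stored name for a file inside the auto backup directory.

    `relative_path` is the path relative to ~/.dibs/autoBackup.
    """
    return posixpath.join(root_dir, auto_backup, relative_path)


print(auto_backup_stored_name("foo"))
```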
If you suffer a major data loss you probably do not want to go through the tedious process of recovering every file individually. To recover everything all at once, you can use the command
dibs.py recover_all
This causes DIBS to send a RECOVER_ALL message to all peers asking them to send back everything.
A major feature of the recover_all command is that it only requires the list of peers you are trading data with and does not need the rest of the database. Thus if you have the dibs_database.peerDatabase (either stored in ~/.dibs or wherever the DIBS_DIR variable points to) and your GPG key, you can use the recover_all command to recover from complete data loss.
At first it may seem like the requirement of keeping your dibs_database.peerDatabase and your GPG key safe defeats the purpose of automated backup. After all if you had secure file storage in the first place, why would you need DIBS? The answer is that these two pieces of information take very little room and do not change often. Thus it is quite feasible to save your dibs_database.peerDatabase and GPG key to a floppy, CD, or even paper, and then put this information in a safe place (e.g., a friend's house, a safe deposit box, or in a fireproof safe in your basement).
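Collecting the two critical pieces can be scripted. The Python sketch below is one possible way to do it and is not part of DIBS: it copies the peer database to a destination directory and builds a gpg command that exports the secret key in ASCII-armored form. The helper names and the destination directory are hypothetical, and the gpg command is only constructed here, not run.

```python
import os
import shutil


def export_key_command(key_name):
    """Build a gpg command that writes the secret key to standard output."""
    return ["gpg", "--armor", "--export-secret-keys", key_name]


def copy_peer_database(dibs_dir, destination_dir):
    """Copy dibs_database.peerDatabase from `dibs_dir` into `destination_dir`.

    Returns the path of the copy.
    """
    source = os.path.join(dibs_dir, "dibs_database.peerDatabase")
    os.makedirs(destination_dir, exist_ok=True)
    return shutil.copy2(source, destination_dir)
```

The exported key is unprotected because DIBS keys use an empty passphrase, so the copy should be kept somewhere physically secure.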
If you want to clear out all the files other peers are storing for you, you can execute the command
dibs.py clear
This tells all peers to stop storing files for you. It is especially useful if DIBS gets into a weird state due to errors.
Note that issuing the clear command only tells peers to stop storing stuff for YOU, but doesn't make your DIBS client stop storing stuff for them. If you want to drop all the storage for a particular peer, you can use the forget command. See the documentation for clear, forget, edit_peer, and delete_peer for more information.
DIBS commands are issued on the command line using the following syntax:
dibs.py COMMAND ARGNAME1 ARGVAL1 ARGNAME2 ARGVAL2 ...
where COMMAND is the name of a DIBS command, ARGNAME1 is an argument name starting with --, and ARGVAL1 is the corresponding argument value. For example, to store the file ~/foo/bar.txt the following syntax would be used
dibs.py store --name ~/foo/bar.txt
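The shape of a DIBS command line, a command name followed by name and value pairs, can be made concrete with a short Python sketch. It is an illustration only and not how dibs.py parses its arguments; the function name is made up for the example.

```python
def parse_dibs_args(argv):
    """Split ['store', '--name', 'x'] into ('store', {'name': 'x'})."""
    command, rest = argv[0], argv[1:]
    if len(rest) % 2 != 0:
        raise ValueError("each argument name needs a value")
    options = {}
    for name, value in zip(rest[0::2], rest[1::2]):
        if not name.startswith("--"):
            raise ValueError(f"argument name must start with --: {name!r}")
        options[name[2:]] = value
    return command, options


print(parse_dibs_args(["store", "--name", "~/foo/bar.txt"]))
```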
The rest of this section describes the commands available in DIBS. Optional arguments are enclosed in brackets.
dibs.py add_peer --peer PEER --email EMAIL
--remote_quota REMOTE_QUOTA --local_quota LOCAL_QUOTA
--comment COMMENT --talk TALK --listen LISTEN --host HOST
[ --port PORT ]
--peer PEER
--email EMAIL
--remote_quota REMOTE_QUOTA
--local_quota LOCAL_QUOTA
--talk TALK
--listen LISTEN
--host HOST
--comment COMMENT
--port PORT
The add_peer command is used to create an entry in the database for trading files with a peer. For example, if you wanted to trade files with me where I store 1 MB for you and you store 5 MB for me you would issue the command:
dibs.py add_peer --email emin@allegro.mit.edu --peer emin.dibs@alum.mit.edu --local_quota 5000 --remote_quota 1000 --comment "trade with emin" --talk active --listen active --host martinian.com
A k, m, g, or t can be appended to a quota indicating that the number should be interpreted as kilo-bytes (the default if no letter is appended), mega-bytes, giga-bytes, or tera-bytes. For example, the --local_quota 5000 in the example above could be replaced by --local_quota 5M, NOT --local_quota 5K.
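The suffix rule can be sketched in Python as follows. This is an illustration rather than the DIBS parser: the function name is made up, and the factor of 1000 between units is an assumption drawn from the statement that 5000 and 5M are interchangeable.

```python
# Multipliers relative to kilobytes, the default unit (assumed decimal).
_MULTIPLIERS = {"k": 1, "m": 1000, "g": 1000 ** 2, "t": 1000 ** 3}


def quota_in_kilobytes(text):
    """Convert a quota such as '5000', '5M', or '2g' to kilobytes."""
    text = text.strip().lower()
    if text and text[-1] in _MULTIPLIERS:
        return int(text[:-1]) * _MULTIPLIERS[text[-1]]
    return int(text)


print(quota_in_kilobytes("5M") == quota_in_kilobytes("5000"))
```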
dibs.py delete_peer --peer PEER
--peer PEER
This command removes PEER from your database and sends an email to the email address associated with PEER to this effect. This command can only be executed if you are not storing anything for PEER and PEER is not storing anything for you (see forget, clear).
dibs.py edit_peer --peer PEER
[ --email EMAIL ]
[ --remote_quota REMOTE_QUOTA ]
[ --local_quota LOCAL_QUOTA ]
[ --comment COMMENT ]
[ --talk TALK ]
[ --host HOST ]
[ --port PORT ]
See add_peer for a description of the arguments. This command changes the value of one or more entries for PEER. It is an error to decrease REMOTE_QUOTA or LOCAL_QUOTA below the respective storage amount. The current storage amount can be seen via show_database.
dibs.py start_daemon
This command starts the DIBS daemon.
The daemon must be running for DIBS to automatically send and respond to messages from peers and to automatically backup the data you place in the directory named by the autoBackupDir option. For more information about the DIBS daemon, see The DIBS daemon.
dibs.py forget --peer PEER
--peer PEER
This command forgets all the files stored for PEER on the local machine and sends a message to PEER about this. The forget command is useful if you want to decrease the amount of storage allocated to PEER, terminate your relationship with PEER (see delete_peer), or if your DIBS database gets into a weird state.
Since this command informs PEER that files are being forgotten, the database of PEER will be properly synchronized (i.e., the database of PEER will indicate that you are not storing any files for PEER).
This command asks the daemon on the local machine (IP address 127.0.0.1) listening at the port specified by the daemonPort variable to stop immediately.
dibs.py auto_check
This command checks files in the auto backup directory autoBackupDir and backs up any new files or files that have changed.
dibs.py clear
This command sends unstore requests to peers to unstore all the files
stored by your client.
dibs.py store --name ITEM [ --as SNAME ]
--name ITEM
--as SNAME
This command causes DIBS to (incrementally) store the file or directory with peers. If a directory is given, its contents are recursively stored. If a file which is already stored is encountered and it is unchanged, nothing happens. If an already stored file is encountered and the backed up version is different than the current version, the old version is unstored and the new version is backed up in its place.
dibs.py unstore_file --file FILENAME
--file FILENAME
This command causes DIBS to ask peers to unstore the named file. Note that if an item was stored via
dibs.py store --name foo --as bar
it should be unstored via
dibs.py unstore_file --file bar
dibs.py recover_file --file FILENAME
--file FILENAME
Ask peers to send us pieces of FILENAME which they are storing. Note that if an item was stored via
dibs.py store --name foo --as bar
it should be recovered via
dibs.py recover_file --file bar
dibs.py recover_all
This command asks all peers to send back everything they are storing. This command can be used to recover from complete data loss. See Recovering Everything for more details.
dibs.py show_database [ --only WHICH ]
This command prints a representation of the DIBS database. If the optional --only option is used with WHICH being one of peers, files, stats, recovery, probe, storage, posted_contracts, or proposed_contracts, then only the indicated portion of the database is printed. For example, the command
dibs.py show_database --only peers
would display only the peers in the database and not the files or recovery status.
dibs.py cleanup
Clean up empty files and directories. Use this after calling forget, after a peer uses the clear command, or every once in a while to keep things clean and efficient.
DIBS keeps the data you are storing for peers in a tree of directories. Since the file name for a piece of data is determined by the MD5 hash of the data, a file name might look like 380b90f17c9c908d0e59cf0fb1c8e461. This file would be stored in the path DIBS_DIR/<peer>/3/8/0/b/380b90f17c9c908d0e59cf0fb1c8e461. Once you stop storing this piece of data the directory DIBS_DIR/<peer>/3/8/0/b/ still remains. The cleanup command goes through your DIBS directory and removes such directories which no longer contain any files.
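The directory layout and the effect of cleanup can be sketched in Python. This is an illustration under assumptions rather than the DIBS code: piece_path reproduces the DIBS_DIR/<peer>/3/8/0/b/<hash> layout described above, and remove_empty_dirs removes only empty directories (it does not touch empty files). Both function names are made up for the example.

```python
import hashlib
import os


def piece_path(dibs_dir, peer, data, depth=4):
    """Return the path for a piece of data, named by its MD5 hex digest.

    The first `depth` hex digits of the digest become nested directories.
    """
    name = hashlib.md5(data).hexdigest()
    return os.path.join(dibs_dir, peer, *name[:depth], name)


def remove_empty_dirs(top):
    """Remove every empty directory below `top`; return the removed paths."""
    removed = []
    # Walk bottom-up so a parent is examined after its children are removed.
    for dirpath, _dirnames, _filenames in os.walk(top, topdown=False):
        if dirpath != top and not os.listdir(dirpath):
            os.rmdir(dirpath)
            removed.append(dirpath)
    return removed
```

Walking bottom-up matters: once the last piece under 3/8/0/b is gone, removing b leaves 0 empty, and so on up to the peer directory.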
dibs.py poll_passives
Poll any peers who are passive (i.e., can not connect to us directly).
If you are using the DIBS daemon, you do not need to use this command
since the daemon will do it periodically.
dibs.py probe [ --file FILENAME ]
Ask all peers storing a file to verify that they are actually storing it. With no arguments, this command probes a random file. This command is called automatically (with no arguments) by the daemon started via start_daemon.
dibs.py process_message [ --file FILENAME ]
--file FILENAME
This command reads the DIBS protocol message in FILENAME and takes the appropriate actions. If no argument is provided then all messages in the incoming directory are processed.
If you are using the DIBS daemon, you do not need to use this command since the daemon will do it periodically.
dibs.py send_message [ --file FILENAME --peer PEER ]
--file FILENAME
--peer PEER
This command reads the DIBS protocol message in FILENAME and sends it to PEER. If no argument is provided then all messages in the outgoing directory are sent to the appropriate peers.
If you are using the DIBS daemon, you do not need to use this command since the daemon will do it periodically.
dibs.py send_hello --host HOST --port PORT
--host HOST
--port PORT
This command sends a hello message to the DIBS daemon listening on the specified machine to check if the peer is alive. The recipient should respond with a message listing the version of DIBS it is running.
This command merges all the statistics records in the statistics directory into the database. The show_database command can be used to see the current statistics.
dibs.py post_contract --min_quota MIN_QUOTA
--max_quota MAX_QUOTA
--quota_mult QUOTA_MULT
--lifetime LIFETIME
[ --talk TALK ]
[ --listen LISTEN ]
[ --contract_name CONTRACT_NAME ]
[ --url URL ]
[ --host HOST ]
[ --port PORT ]
--min_quota MIN_QUOTA
--max_quota MAX_QUOTA
--quota_mult QUOTA_MULT
--lifetime LIFETIME
--talk TALK
--listen LISTEN
--contract_name CONTRACT_NAME
--url URL
If --url none is specified then the contract is not posted to any contract server. Using the post_contract command and not posting the contract to any URL is pointless, but this can be useful for testing purposes.
--host HOST
--port PORT
The post_contract command posts an advertisement for a trading contract to a server (see Peer Finder). Someone else can then propose a specific contract matching the posted parameters using the propose_contract command. If the contract is accepted, then GPG keys are exchanged and the appropriate modifications are made to the database of the poster and proposer without the need to manually use the add_peer or edit_peer commands.
dibs.py unpost_contract --contract_name CONTRACT_NAME
[ --url URL ]
This command unposts the contract with name CONTRACT_NAME previously posted with post_contract. By default, the URL to unpost from is obtained from the defaultContractServerURL variable and should not need to be specified.
Also, if --url none is specified then the named contract is removed from the DIBS database but the contract server is not contacted. Generally this would be a bad idea, but it can be useful if the contract server has removed (or never received) the contract in question and you now want to remove the contract from your DIBS database.
dibs.py propose_contract --contract_name CONTRACT_NAME
--local_quota LOCAL_QUOTA
--remote_quota REMOTE_QUOTA
[ --talk TALK ]
[ --listen LISTEN ]
[ --host HOST ]
[ --peer PEER ]
[ --url URL ]
[ --peer_host PEER_HOST ]
[ --peer_port PEER_PORT ]
[ --peer_email PEER_EMAIL ]
Before describing the arguments, we point out that all the arguments that do not start with --peer_ are from the point of view of the peer not the poster.
--contract_name CONTRACT_NAME
--local_quota LOCAL_QUOTA
--remote_quota REMOTE_QUOTA
[ --talk TALK ]
[ --listen LISTEN ]
[ --host HOST ]
[ --peer PEER ]
[ --url URL ]
[ --peer_host PEER_HOST ]
[ --peer_port PEER_PORT ]
[ --peer_email PEER_EMAIL ]
The email address to use in contacting the poster. Usually, this is obtained from the posted contract information and should not need to be specified directly.
The propose_contract command proposes a specific contract within the parameters of the posted contract specified by CONTRACT_NAME. Specifically, if the poster accepts incoming connections, the proposer's DIBS client will attempt to contact the poster.
Once contacted, the poster will examine the proposed contract and respond with an automated email to the proposer describing its decision. If the proposed contract is accepted by the poster, then the poster will attempt to contact the proposer's DIBS client to exchange GPG keys, and enter the trading relationship in each client's database.
Thus if both poster and proposer accept incoming connections, the trading relationship should be automatically established and trading will commence as usual. If either the poster or proposer is behind a firewall and requires passive mode, things are more complicated.
If the proposer is behind a firewall and cannot accept incoming connections, then it will not be able to obtain the poster's response to a proposal until it issues a poll_passives command followed by a process_message command to the poster. The DIBS daemon should eventually do this automatically, but the impatient user may wish to issue these commands manually after a contract is proposed.
If the poster is behind a firewall and cannot accept incoming connections, then things are even more complicated. In this case, there is no way that the proposer can contact the poster to initiate a proposal. Thus, posting a contract for a DIBS client which cannot accept incoming connections because it is behind a firewall is generally not a good idea.
DIBS has a number of options which you can configure using the dibsrc.py file in your DIBS directory. Your DIBS directory is usually created in ~/.dibs the first time you invoke DIBS. Use the command dibs show_database if you just want to get your DIBS directory created without doing anything else.
To set a particular option, just put the appropriate command in your dibsrc.py file. For example, to specify a particular name for the log file instead of the default, place the following command in the file ~/.dibs/dibsrc.py:
logFile = '/tmp/myDibsLog'
All the options which you can set in this manner are described below including the variable DIBS_DIR, which behaves slightly differently.
The options are separated into two classes “User Options” and “Variables”. The only distinction is that “User Options” are things which you will probably want to customize while variables are things which you probably should not change unless you know what you are doing.
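As a rough illustration, a dibsrc.py that sets several of the options described in this chapter might look like the sketch below. The option names come from this manual; the values (log path, GPG path, SMTP host) are placeholders chosen for the example and should be adapted to your system.

```python
# Hypothetical ~/.dibs/dibsrc.py; every value here is a placeholder.
logFile = '/tmp/myDibsLog'        # write the log here instead of the default
logLevel = -10                    # log LOG_INFO and more severe messages
printLogLevel = 0                 # print LOG_WARNING and more severe messages
gpgProg = '/usr/bin/gpg'          # assumed location of the GPG program
smtpServer = 'smtp.example.com'   # assumed outgoing mail server
```

Options that are not mentioned in the file keep their default values.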
This variable is the name of the GPG key to use in signing messages sent to peers. By default it is set using environment variables to be ${USER}@${HOST}. A warning is printed if either the USER or HOST environment variable can not be determined.
This variable is the name of the GPG key to use in encrypting messages that you ask peers to store for you. By default it is set using environment variables to be ${USER}@${HOST}. A warning is printed if either the USER or HOST environment variable can not be determined.
The value of dibsPrivateKey can be the same as or different from dibsPublicKey. If they are the same, you have the convenience of only having to keep track of one GPG key. If they are different and you keep the private key secret, you have the added security that anyone trying to crack your encryption does not have the public key to work with.
Cracking the encryption when the public key is known MAY be easier than cracking the encryption when no public key is available. Nobody has yet figured out a way to break the commonly used public key encryption systems, though, so I personally just use the same public key for both encryption and signing.
This variable is the email address of the user. By default it is set using environment variables to be ${USER}@${HOST}. A warning is printed if either the USER or HOST environment variable can not be determined.
This variable controls whether DIBS automatically emails the user specified in the dibsAdmin variable when it finishes processing a recover_file command. The default value is 1; set it to 0 if you want to disable emailing when recovery is complete.
This variable sets the maximum size chunk to store with a peer. The default is 10 megabytes. Files larger than this value are broken up into smaller pieces before storage.
This variable specifies the file name to use in logging DIBS information. The default is
logFile = DIBS_DIR + '/logfile'
This variable specifies the type of information which will be included in the log file. The different types of messages and their levels are shown below:
LOG_DEBUG     -20
LOG_INFO      -10
LOG_WARNING     0
LOG_ERROR      10
LOG_CRITICAL   20
All messages with levels at or above the value of the logLevel variable will be written to the log file.
For example, if this variable is set to -10 as shown below
logLevel = -10
then LOG_DEBUG messages will not be logged but all others will be.
The default value for this variable is 0.
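The logLevel = -10 example (only LOG_DEBUG is suppressed) suggests that a message is logged when its level is at or above the threshold. The sketch below illustrates that rule. The function name is_logged is hypothetical and is not part of DIBS; only the level names and values are taken from this manual.

```python
# Message levels as listed in this manual.
LOG_DEBUG, LOG_INFO, LOG_WARNING, LOG_ERROR, LOG_CRITICAL = -20, -10, 0, 10, 20

def is_logged(message_level, log_level=0):
    """Return True if a message at message_level passes the threshold.

    Assumes "at or above" semantics, inferred from the logLevel = -10 example.
    """
    return message_level >= log_level

names = ['LOG_DEBUG', 'LOG_INFO', 'LOG_WARNING', 'LOG_ERROR', 'LOG_CRITICAL']
levels = [LOG_DEBUG, LOG_INFO, LOG_WARNING, LOG_ERROR, LOG_CRITICAL]

# Which categories reach the log when logLevel = -10?
logged = [n for n, lv in zip(names, levels) if is_logged(lv, log_level=-10)]
print(logged)
```

The same comparison would apply to printLogLevel for console output.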
This variable specifies the type of information which will be printed to the console. The different types of messages and their levels are the same as described for the logLevel variable.
All messages with levels at or above the value of the printLogLevel variable will be printed to the console.
For example, if this variable is set to -20 as shown below
printLogLevel = -20
then all messages will be printed.
The default value for this variable is 0.
This variable specifies how many kilobytes a log file can contain before being rotated.
For example, if this variable is set to the default value of 1000000 as shown below,
maxLogSize = 1000000
then logs will be rotated when they exceed one megabyte.
This variable specifies the path to the GPG program. The default is
gpgProg = 'gpg'
If you are using Windows, you will almost certainly have to set this variable to point to your GPG program.
This variable specifies the SMTP server to use for sending outgoing mail. The default is localhost, but that almost certainly will not work if you are running Windows and probably will not work on most home UNIX systems. The best way to determine the SMTP server is to ask your system administrator or your Internet Service Provider. For example, if you connect to the Internet using RCN, you might do
smtpServer = 'smtp.rcn.com'
The DIBS daemon maintains a log file separate from that of other DIBS commands. The name of this log file is controlled by this variable and defaults to
daemonLogFile = DIBS_DIR + '/daemonLog'
When running, the DIBS daemon can be stopped by creating a file with this name. If this file contains an integer, the daemon waits that many seconds before stopping; otherwise it stops immediately. In either case, the daemon removes this file after stopping. Note that the daemon only checks for the existence of this file occasionally, as controlled by daemonTimeout. The default value is
daemonStopFile = DIBS_DIR + '/stop_daemon'
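The stop-file mechanism can be exercised by hand with a few lines of Python. The helper below is a hypothetical sketch, not part of DIBS: it assumes the default stop file name and writes the delay as a plain decimal integer, since the manual does not specify the file format any further. The demonstration uses a temporary directory rather than a real DIBS directory.

```python
import os
import tempfile

def request_daemon_stop(dibs_dir, delay_seconds=None):
    """Create the stop file, optionally asking the daemon to wait first.

    Hypothetical helper: the file name follows the documented default
    daemonStopFile = DIBS_DIR + '/stop_daemon'.
    """
    stop_file = os.path.join(dibs_dir, 'stop_daemon')
    with open(stop_file, 'w') as f:
        if delay_seconds is not None:
            # An integer in the file means "wait this many seconds".
            f.write(str(int(delay_seconds)))
    return stop_file

# Demonstration against a throwaway directory, not a real ~/.dibs.
demo_dir = tempfile.mkdtemp()
path = request_daemon_stop(demo_dir, delay_seconds=30)
with open(path) as f:
    contents = f.read()
print(contents)
```

Because the daemon only looks for the file when it wakes up, the stop may take effect up to daemonTimeout seconds after the file is created.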
When the DIBS daemon is alive, it waits in an idle state for incoming connections. After a timeout period given by this variable (specified in seconds), it wakes up and performs various actions such as checking whether it should stop (see daemonStopFile), checking whether it should automatically backup files, or checking whether it should poll passive peers (see pollInterval). The default value is 600.
This variable specifies how often (in seconds) the DIBS daemon should contact passive peers to ask for messages. Since the daemon only wakes up to do polling periodically (see daemonTimeout), the actual poll interval may be the sum of this variable, the daemonTimeout variable, and the time required to perform intervening actions by the daemon. The default value is 3600.
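To make the interaction between the two timers concrete, the following sketch computes when the next poll would happen under a simplified model. It assumes the daemon wakes exactly every daemonTimeout seconds and polls at the first wake-up at which at least pollInterval seconds have passed since the last poll. It ignores the time spent on other actions. The function name next_poll_time is hypothetical and the model is an assumption, not a description of the actual daemon code.

```python
def next_poll_time(poll_interval=3600, daemon_timeout=600, last_poll=0):
    """Return the time of the first wake-up at which a poll is due.

    Simplified model (an assumption): wake-ups occur at
    last_poll + k * daemon_timeout for k = 1, 2, ...
    """
    t = last_poll
    while t - last_poll < poll_interval:
        t += daemon_timeout
    return t

print(next_poll_time())                     # defaults from this manual
print(next_poll_time(daemon_timeout=700))   # timeout that does not divide 3600
```

In this model the delay between polls is always less than pollInterval plus daemonTimeout, which matches the upper bound described above once the time for intervening actions is added.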
This variable controls how often the probe command is automatically called by the daemon. Specifically, a probe will be attempted roughly every daemonTimeout seconds, but no more often than specified by this variable.
This variable specifies how long a probe is allowed to take. If more time (in seconds) has passed for a probe than the value of this variable, the probe is marked as a timeout.
This variable specifies how many redundant pieces of a file will be created. If a file is chopped into k pieces (see kbPerFile), this many extra pieces will be added using a Reed-Solomon code. For example, if a file is chopped into 5 pieces and redundantPieces is 2, then 7 pieces will be sent such that the original file can be recovered from any 5 of those 7 pieces.
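The bookkeeping in this example can be written out as a short sketch. It does not implement Reed-Solomon coding; it only counts pieces and checks the recovery condition stated above. The function names are hypothetical and are not part of DIBS.

```python
def total_pieces(k, redundant_pieces):
    """Number of pieces sent: k data pieces plus the redundant ones."""
    return k + redundant_pieces

def can_recover(surviving_pieces, k):
    """The file is recoverable from any k of the pieces that were sent."""
    return surviving_pieces >= k

def storage_overhead(k, redundant_pieces):
    """Ratio of data stored with peers to the original data size."""
    return float(k + redundant_pieces) / k

n = total_pieces(5, 2)
print(n)                         # 7 pieces sent
print(can_recover(5, 5))         # True: two pieces may be lost
print(can_recover(4, 5))         # False: three lost pieces is too many
print(storage_overhead(5, 2))    # 1.4 times the original size
```

Raising redundantPieces tolerates more lost pieces at the cost of a higher storage overhead.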
This variable specifies the name of the host on which your DIBS client is running. By default, this value is obtained from the HOST environment variable and should not need to be modified. If that environment variable is empty or returns something useless like localhost or 127.0.0.1, then you should probably explicitly set this variable.
This variable specifies the port the DIBS daemon should listen on. The default value is 6363.
Sometimes DIBS needs to wait briefly for things like a lockfile being released or an unstore request being sent out before continuing. At these times it sleeps for a number of seconds specified by the sleepTime variable whose default value is 10.
This variable specifies how many seconds a message can wait in the outgoing queue before DIBS complains about it. The default is set to 10 days:
maxMsgAge = 86400 * 10
This variable specifies where the auto_check command looks for files and directories to backup automatically.
Any error messages which are mailed to the DIBS Administrator specified by dibsAdmin are also stored in the directory named by this variable. You should periodically read and delete messages here.
To prevent the DIBS Administrator from being mail bombed with error messages if something extremely unusual occurs, DIBS stops mailing error messages once the number of error messages in the directory named by errorDir exceeds this threshold.
To prevent the DIBS Administrator from forgetting about error messages stored in the directory named by errorDir, DIBS sends warnings about this directory filling up when the number of messages exceeds this threshold.
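The two thresholds described above can be summarized as a small decision function. This is a sketch under assumptions: the function and parameter names are hypothetical, the threshold values are placeholders rather than DIBS defaults, and the warning threshold is assumed to be lower than the maximum.

```python
def error_mail_action(num_stored, warn_threshold, max_threshold):
    """Decide what to do given the number of stored error messages.

    Hypothetical helper, assuming warn_threshold < max_threshold.
    """
    if num_stored > max_threshold:
        return 'stop_mailing'      # avoid mail bombing the administrator
    if num_stored > warn_threshold:
        return 'mail_and_warn'     # remind the administrator to clean up
    return 'mail'

# Placeholder thresholds, chosen only for illustration.
for count in (3, 15, 40):
    print(count, error_mail_action(count, warn_threshold=10, max_threshold=20))
```

Reading and deleting the stored messages brings the count back under both thresholds.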
When storing a directory, DIBS works by creating a queue of outgoing messages before trying to connect to peers to deliver the messages. Once more files than specified by this variable are queued up, DIBS will try to connect to peers to send the messages out.
If you make this threshold too small then DIBS will waste a lot of time with network overhead in making and closing connections to peers. If you make this threshold too large then you prevent pipelining of your DIBS program sending messages out and your peers processing the messages you are sending to them.
This variable determines the location of the user's DIBS files. By default it is ${HOME}/.dibs. You cannot set this in dibsrc.py because DIBS looks for the dibsrc.py file in the directory DIBS_DIR. Therefore, to use a directory other than ${HOME}/.dibs to hold your DIBS files, set the environment variable DIBS_DIR.
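The lookup order described here can be sketched as follows. The function resolve_dibs_dir is hypothetical and only mirrors the rule stated above (the DIBS_DIR environment variable if set, otherwise ${HOME}/.dibs); it is not the code DIBS itself uses. The environment is passed in as a dictionary so the example does not depend on your real settings.

```python
import os

def resolve_dibs_dir(environ):
    """Return DIBS_DIR from the environment, else ${HOME}/.dibs."""
    if environ.get('DIBS_DIR'):
        return environ['DIBS_DIR']
    return os.path.join(environ.get('HOME', ''), '.dibs')

# Without DIBS_DIR set, the default under HOME is used.
print(resolve_dibs_dir({'HOME': '/home/alice'}))
# With DIBS_DIR set, it takes precedence.
print(resolve_dibs_dir({'HOME': '/home/alice', 'DIBS_DIR': '/data/dibs'}))
```

Passing os.environ instead of a literal dictionary would apply the same rule to the current environment.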
This variable describes the path to the Python executable. By default, it is set to the sys.executable value obtained from Python. You should not need to change it unless you have multiple versions of Python installed and are doing something rather strange.
This variable specifies the default URL to use for the Peer Finder contract server. You should only change this if you want to use a special Peer Finder server.
DIBS provides various scripts for automated testing. These automated tests will probably be most useful to developers, but users may find them useful in diagnosing problems.
Furthermore, in addition to adding more tests, one quality assurance goal of DIBS is to add an automated test for each bug which is reported or fixed to verify the existence of the bug and its resolution. Ideally, advanced users should be able to submit a bug report by creating a new automated test which demonstrates the bug.
The testing suite is currently too complicated and poorly documented to reach this goal. But as a first step in achieving better testing and quality assurance, this chapter documents the current automated testing framework.
The tests directory in the DIBS source distribution contains subdirectories for each test group as well as testing utilities which are useful in all tests. The basic testing philosophy is that only general testing utilities and not specific tests go in the tests directory. Actual test scripts are contained in subdirectories of tests which represent test groups. To run a test group, the user sets the environment variables described in Environment Variables Required for Testing, changes directory to the top level DIBS directory, and imports all from the given test suite.
The following is a brief description of the contents of the tests directory:
Contains utility functions useful for all tests.
Contains code to ensure that various environment variables required for the tests are initialized before any tests begin.
This file imports all test subdirectories in the tests directory and causes the corresponding tests to be run. Specifically, the following commands execute all tests:
cd <path>/<to>/dibs
echo from tests import all | <path>/<to>/python
This subdirectory contains scripts to test the contract functionality in the DIBS Peer Finder service. To run only the tests in this subdirectory do:
cd <path>/<to>/dibs
echo from tests.contracts import all | <path>/<to>/python
cd <path>/<to>/dibs
echo from tests.daemon import all | <path>/<to>/python
The following environment variables must be set before any automated tests can be run:
The following environment variables are optional and do not need to be set:
The following are issues which need to be fixed in future releases. If you have other suggestions or would like to fix something in the list below, please contact emin@alum.mit.edu.
Command index: add_peer, auto_check, cleanup, clear, delete_peer, edit_peer, forget, merge_stats, poll_passives, post_contract, probe, process_message, propose_contract, recover_all, recover_file, send_hello, send_message, show_database, start_daemon, stop_daemon, store, unpost_contract, unstore_file
Option index: autoBackupDir, daemonLogFile, daemonPort, daemonStopFile, daemonTimeout, defaultContractServerURL, DIBS_DIR, dibsAdmin, dibsPrivateKey, dibsPublicKey, errMaxCount, errorDir, errWarnCount, gpgProg, hostname, kbPerFile, logFile, logLevel, mailUserOnRecovery, maxLogSize, maxMsgAge, pollInterval, printLogLevel, probePeriod, probeTimeout, pythonExe, redundantPieces, rootDir, sendMsgThreshold, sleepTime, smtpServer