It stores its backups on tape.
It keeps a database of what's on which tape.
It supports a variety of underlying backup programs (tar, dump, etc).
Current status: abandoned
Last modification: 1996
Download: here
FROGBAK(1) FROGBAK(1)
NAME
frogbak - schedule and execute backups for a small network
SYNOPSIS
frogbak [-dryrun] [-summary] [-future #h|#d|#w|#m] [-control control_file]
mkblank tapename
recycle tapename [tapename...]
sum_covered control_file [control_file...]
send_offsite
DESCRIPTION
These programs provide frogbak services for small networks. The algo-
rithms they use were designed for the following environment: one tape
drive, 20 GB of disk space, several kinds of computers, and a lazy pro-
grammer who was paranoid about dumps. The closer an environment is to
that, the better the system will work for you.
The basic design is that the system chooses when to do what kind of
dump. It does two kinds of dumps: incrementals and fulls. Unlike a
differential, an incremental dump saves the files modified from the
time of the last dump at the same dump level to the present rather than
the files modified from the time of the last dump at a lower level to
the present. These dumps are organized into three separate tracks.
The tracks are independent of one another such that an incremental in
one track does not affect the coverage of an incremental in another
track. Two of the tracks have only incrementals and the last has only
full dumps. This setup means that each file gets saved onto two tapes
by the incremental tracks. This notion of tracks is a convenient way
to think about the behavior of frogbak, but it is not how the behavior
is implemented: it is implemented by making the incrementals save the
files modified between now and the incremental before the last one.
To restore a filesystem, the last good full dump must be read first.
If a full dump is bad, just ignore it and skip to the previous one.
After the full dump, every other incremental from the time of the full
dump until the present must be read. If an incremental is bad, switch
to the other track.
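The two-track behavior can be sketched in a few lines of Python. This is only an illustration, not code from frogbak: the function names are made up, times are plain numbers, and the first two incrementals after a full dump are assumed to reach back to the full dump itself.

```python
def coverage_windows(full_time, incr_times):
    """Return the (start, end) interval covered by each incremental.

    Each incremental reaches back to the incremental before the previous
    one. Assumption: when fewer than two earlier incrementals exist, it
    reaches back to the full dump.
    """
    windows = []
    for i, end in enumerate(incr_times):
        start = incr_times[i - 2] if i >= 2 else full_time
        windows.append((start, end))
    return windows


def restore_chain(full_time, incr_times, bad=frozenset()):
    """Return indices of the incrementals to read after the full dump.

    Walks backwards from the newest good incremental, always picking the
    good dump that reaches furthest back, so a bad dump is bypassed by
    switching to the other track.
    """
    windows = coverage_windows(full_time, incr_times)
    good = [i for i in range(len(windows)) if i not in bad]
    if not good:
        return []
    need = windows[good[-1]][1]  # restore up to the newest good incremental
    chain = []
    while need > full_time:
        candidates = [i for i in good
                      if windows[i][1] >= need and windows[i][0] < need]
        if not candidates:
            raise ValueError("coverage gap before time %r" % (need,))
        pick = min(candidates, key=lambda i: windows[i][0])
        chain.append(pick)
        need = windows[pick][0]
    return sorted(chain)
```

With a full dump at time 0 and incrementals at times 1 through 5, the chain is every other incremental (indices 0, 2, 4). If the incremental at index 2 is marked bad, the chain moves to the other track (indices 1, 3, 4).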
It is not expected that dumps will be bad. Exabyte and DAT media seem
to be very reliable in comparison to the 9-track tapes used last
decade. However, frogbak performs the dumps on live filesystems and
thus sometimes the dump data will be bad even though the media is okay.
The choice of dumps is specified with a control file. For each
filesystem, the control file specifies:
The frequency of incremental dumps. This places a limit on how
often a dump is performed. Dumps will not occur more often than
the specified rate.
The importance of incremental dumps. The importance is combined
with the length of time since the last incremental dump,
relative to the specified frequency of dumps, to produce a
rating number. The formula is something like rating =
time since last dump / frequency * importance. Thus if the
frequency is two days, the importance is 50, and it has been ten
days since an incremental dump, then the rating would be 10/2*50
= 250.
The frequency and importance of full dumps.
The ratings for both incremental and full dumps are compared on a
filesystem-by-filesystem basis. For each filesystem, either the full
dump or the incremental will be discarded from consideration at this
point.
The ratings for all of the filesystems are then sorted and dumps are
performed in order, based on these priorities.
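The rating and selection steps above might look like the following Python sketch. The text says only that the formula is "something like" this, so treat it as an approximation. Two details are assumptions of mine: that the lower-rated of the full and incremental candidates is the one discarded, and that a dump is skipped entirely until its frequency interval has elapsed. The data layout is hypothetical.

```python
def rating(elapsed_days, freq_days, importance):
    """rating = time since last dump / frequency * importance."""
    return elapsed_days / freq_days * importance


def plan(filesystems):
    """Pick one dump per filesystem and sort by rating, highest first.

    Each entry is a dict with a "name" plus "incr" and "full" tuples of
    (elapsed_days, freq_days, importance).
    """
    chosen = []
    for fs in filesystems:
        candidates = []
        for kind in ("incr", "full"):
            elapsed, freq, importance = fs[kind]
            if elapsed >= freq:  # assumed: never dump more often than freq
                candidates.append((rating(elapsed, freq, importance), kind))
        if candidates:
            score, kind = max(candidates)  # assumed: keep the higher rating
            chosen.append((score, fs["name"], kind))
    return sorted(chosen, reverse=True)
```

For the worked example in the text, rating(10, 2, 50) gives 250.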
CONTROL FILES
There are four types of lines that can be in the control files. They
are: comments, variable assignments, filesystem control lines, and
average statements.
Any line beginning with a hash (#) symbol is a comment.
Any line beginning with a legal (C-style) identifier followed by an
equals (=) is a variable assignment. Variable references may be made
in variable assignments and in the dump valuation columns. Variables
are recognized because the only other symbols that may occur in those
locations are numbers and math operators like times (*) and plus (+).
Filesystem control lines are made up of seven whitespace-separated
fields:
filesystem Names the filesystem to be dumped.
host Names the system that the filesystem is on.
os Names the dump program to be used to dump the
filesystem. The current legal values for the os
field are: solaris, sunos, freebsd, netbsd, hpux,
hp-ux, mach, domain, ultrix, sony, linux, gtar,
dostar, targtar, and xenix. Dumps can be done with
the GNU tar program. On Linux it is the default
and is called tar; on other systems it is called
gtar. DOS filesystems can be dumped with gtar if
they are mounted on a Unix system. The current
dostar setup assumes that the tar program is really
GNU tar.
ifreq Specifies the frequency of incremental dumps. The
format is a decimal number followed by a unit, where
the unit is one of h, d, w, or m, corresponding to
hours, days, weeks, and months.
ivalue Specifies the relative value of doing an incremen-
tal dump on this filesystem after a duration equal
to the ifreq.
ffreq Specifies the frequency of doing a full dump. Same
format as ifreq.
fvalue Specifies the value of doing a full dump of this
filesystem after a duration equal to the ffreq.
I recommend setting all of the frequencies at one day. That way you
can tell if everything is getting dumped or not. I further recommend
setting a variable to be the relative importance of doing full dumps.
Then when the ivalue is set to x, and the fvalue is set to x/fd, the
number of full dumps per incremental can be varied by changing the
value of fd. This allows the dump system to be tuned easily.
The last sort of line is an average statement. The syntax of an average
statement is average system-name filesystem-1 filesystem-2 etc...
Normally, each filesystem for a given system is considered
independently. This means that they may not be near each other on the
tape and, further, they may not all make it onto the same tape if the
tape runs out. It is easier to restore if everything you need is on the
same tape and still easier if it is grouped together. The average
statement causes the averaged dumps to be placed sequentially on the
tape. Their ratings are averaged.
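To make the four line types concrete, here is a rough Python sketch of a control file reader. It is not frogbak's parser (frogbak is written in Perl). The column order is assumed to follow the field list above, a month is assumed to be 30 days, expressions are assumed to contain no spaces, and all function and key names are made up for illustration.

```python
import ast
import operator
import re

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}
_HOURS = {"h": 1, "d": 24, "w": 24 * 7, "m": 24 * 30}  # 30-day month assumed


def evaluate(expr, variables):
    """Evaluate numbers, variable names, and + - * / only."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression: %r" % expr)
    return walk(ast.parse(expr, mode="eval"))


def parse_freq(text):
    """Convert a frequency such as 2d or 1w into hours."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([hdwm])", text)
    if not match:
        raise ValueError("bad frequency: %r" % text)
    return float(match.group(1)) * _HOURS[match.group(2)]


def parse_control(text):
    """Return (variables, filesystems, averages) from control file text."""
    variables, filesystems, averages = {}, [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # comment or blank line
            continue
        assign = re.match(r"([A-Za-z_]\w*)\s*=\s*(.+)$", line)
        if assign:  # variable assignment
            variables[assign.group(1)] = evaluate(assign.group(2), variables)
            continue
        fields = line.split()
        if fields[0] == "average":  # average statement
            averages.append((fields[1], fields[2:]))
            continue
        if len(fields) != 7:
            raise ValueError("expected seven fields: %r" % line)
        fs, host, os_name, ifreq, ivalue, ffreq, fvalue = fields
        filesystems.append({
            "filesystem": fs, "host": host, "os": os_name,
            "ifreq_hours": parse_freq(ifreq),
            "ivalue": evaluate(ivalue, variables),
            "ffreq_hours": parse_freq(ffreq),
            "fvalue": evaluate(fvalue, variables),
        })
    return variables, filesystems, averages
```

A control file following the recommendation above would set x and fd as variables and then use x and x/fd in the ivalue and fvalue columns, so that changing fd alone retunes the balance between full and incremental dumps.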
RECORDS
Every time a dump is performed a record of the dump is stored in a file
that lists dumps done for that filesystem. The records for full dumps
and incremental dumps are stored separately. Full dump record files are
named by dropping the leading slash from the filesystem name,
transforming the remaining slashes (/) to dots (.), and appending .full.
Thus /usr/local becomes usr.local.full. As a special case, the root
filesystem becomes simply .full. Make sure you use the -a option to
ls(1) when listing the directories.
Incremental dumps are similarly named, but with .incr instead of .full.
All of the records for a given system are stored in
/master_path/records/hostname. The master_path is the path to the top of
the frogbak system's directory. At Berkeley Research and Trading, that
is /y/adm/dump. Thus to find out when the last incremental of /y was
performed, look in /y/adm/dump/records/troy/y.incr.
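The naming rule is mechanical enough to express as a small Python helper. The function names are mine, not part of frogbak, and the leading-slash handling is inferred from the examples above.

```python
import posixpath


def record_name(filesystem, kind):
    """Map a filesystem name to its record file name.

    kind is "full" or "incr". Slashes become dots and the leading slash
    is dropped, so the root filesystem yields just ".full" or ".incr".
    """
    stem = filesystem.strip("/").replace("/", ".")
    return "%s.%s" % (stem, kind)


def record_path(master_path, hostname, filesystem, kind):
    """Build master_path/records/hostname/<record file>."""
    return posixpath.join(master_path, "records", hostname,
                          record_name(filesystem, kind))
```

For example, record_path("/y/adm/dump", "troy", "/y", "incr") gives /y/adm/dump/records/troy/y.incr, matching the path quoted above.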
The filesystem dump record files have the following format:
FROM-DATE TO-DATE TAPE-NAME FILE-NUMBER # COMMENT
The FROM-DATE is the beginning of the time covered by that particular
dump. On full dumps, the FROM-DATE is simply 0. The TAPE-NAME is the
symbolic name of the tape that the dump is on. It is the name that was
given as an argument to mkblank, and is, hopefully, written on the side
of the tape. The FILE-NUMBER field specifies how many files must be
skipped over on that tape to get to that dump. Thus if FILE-NUMBER is
17 and you wanted to restore that dump, you would need to use mt -f
device fsf 17 to get to that dump.
Although logically the incrementals can be divided into two tracks, they
are not stored that way in the records database. In fact, the logical
division is just an artifact of the fact that incremental dumps cover
backwards to the incremental dump prior to the previous one.
To find out when something has been backed up, both the .full and .incr
records files must be examined. They give the times and coverages for
the filesystems. To find out if a particular file was backed up, the
dump tape must be read. No index of files saved is kept.
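A record line in the format described above can be split apart and turned into the matching mt command with a short Python sketch. The helper names are hypothetical, and the dates are kept as opaque strings because the man page does not say how they are encoded.

```python
from collections import namedtuple

DumpRecord = namedtuple(
    "DumpRecord", "from_date to_date tape_name file_number comment")


def parse_record(line):
    """Split 'FROM-DATE TO-DATE TAPE-NAME FILE-NUMBER # COMMENT'."""
    data, _, comment = line.partition("#")
    from_date, to_date, tape_name, file_number = data.split()
    return DumpRecord(from_date, to_date, tape_name,
                      int(file_number), comment.strip())


def seek_command(record, device):
    """Return the argument list for 'mt -f device fsf FILE-NUMBER'."""
    return ["mt", "-f", device, "fsf", str(record.file_number)]
```

Given a made-up record line such as "0 736398120 SEQ-037 17 # full of /usr", seek_command would return the arguments for mt -f device fsf 17, as in the example above.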
TAPES
The information about each dump performed is also stored grouped by
what tape it is on. In the directory /master_path/tapes, information
about each dump tape is stored. This information includes tape write
speed performance figures and other tidbits.
This information substantially duplicates the information in the
records directory.
RESTORES
Each different kind of system uses a different dump program and thus a
different restore program. The basic idea is that on the system that
was dumped, give a command that pipes the dump output from the tape
into the restore program.
It is usually easiest to forward the tape to the correct file before
logging onto the system to be restored. The number of files to forward
over is listed as the fourth field in the system dump records database.
On hp-ux systems, the command is mt -t /dev/rmt/0mn fsf
number_of_files_to_skip. On BSD-based systems, the command is usually
mt -f /dev/nrst0 fsf number_of_files_to_skip.
The blocksize used to write the tapes is specified in the beginning of
the frogbak program file. The value that I use is 112 blocks, or 56k.
This size is not arbitrary. On Suns, sizes above 127 blocks are not
reliable. Exabytes physically write data in 8k chunks. Larger block
sizes have less system overhead and are generally faster. 56k is the
largest multiple of 8k smaller than 128 blocks.
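The arithmetic behind the 112-block figure can be checked with a few lines of Python. The 512-byte block size is inferred from "112 blocks, or 56k"; the function name is made up for illustration.

```python
BLOCK_BYTES = 512  # inferred: 112 blocks * 512 bytes = 56k


def largest_safe_blocksize(limit_blocks=127, chunk_bytes=8 * 1024):
    """Largest multiple of chunk_bytes that fits in limit_blocks blocks.

    Returns (size in blocks, size in bytes).
    """
    limit_bytes = limit_blocks * BLOCK_BYTES
    size_bytes = (limit_bytes // chunk_bytes) * chunk_bytes
    return size_bytes // BLOCK_BYTES, size_bytes
```

With the defaults this returns 112 blocks and 57344 bytes (56k): seven 8k chunks fit under the 127-block limit, and an eighth would reach 128 blocks.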
Dumps can be written in several different formats depending on the type
of system being dumped. In general the dump(8) command is used, but on
Apollos the wbak(1) command is used, and on Xenix cpio(1) is used. The
command needed to restore depends on what was used. On some servers,
compression is possible, in which case the dump must be uncompressed to
restore.
At Berkeley Research and Trading the command needed to restore most
systems is: remsh server -n dd if=/dev/rmt/0mn ibs=112b | /etc/restore
-ivf -.
Each of the different programs used to do the dumps handles restores in
a different way. With wbak(1) and cpio(1), the set of files to be
restored must be specified on the command line. With restore(8), the
set of files to be restored can be chosen interactively (-i flag).
Obviously, you must load the right tape before trying to restore from
it. Hopefully, each tape will have a paper label that identifies it.
If it doesn't, or if the label is incorrect, you can identify a dump
tape by copying off the first file. The first file on each dump tape
specifies the tape name and it lists which dumps are going to be
attempted. If you lose your dump tape database, you may need to use
this method to restore it.
UTILITIES
There are several utilities that are part of the frogbak package. They
are sum_covered, which adds up how much disk space is backed up by a
control file; recycle, which marks the tape as erased; send_offsite,
which figures out which tapes are not needed to do a full restore; and
mkblank, which names a tape.
The sum_covered command is useful for partitioning the clients among
several servers because frogbak doesn't do it for you. As arguments,
you must provide the names of control files.
The mkblank command must be run to initialize blank tapes. Tapes must
be initialized before frogbak is run. The argument to mkblank is the
name for the tape. Each tape should have a unique name. I recommend
that the name be a short string followed by a three digit sequence num-
ber. In case it isn't obvious, the tape must be in the drive when you
run mkblank.
Although it is possible to just keep buying new tapes, it is not
necessary. The recycle program lets frogbak know that the dumps on the
recycled tape no longer exist and that it is okay to overwrite the
tape. The arguments to recycle are the list of tapes (by name) that
should be marked as recycled. Nothing is done to the actual tape when
it is marked recycled; only the database is updated.
It can be difficult to figure out which tapes are potentially required
to do a restore. The send_offsite program will figure out what tapes
are not required to do a full restore of everything (assuming, of
course, that all the tapes are good). Using send_offsite, it is easy
to pick which tapes can be sent away. It also shows you how many tapes
it has been since every system was covered by a full dump. Only the
few most recent un-needed tapes are shown.
DAILY TASKS
It is possible to run frogbak from cron(8). However, a labeled blank
or recycled tape must be put in the drive prior to running frogbak.
Tapes which are not either labeled blank or recycled will be rejected.
Blank tapes are made with the mkblank utility. Recycled tapes are
made with the recycle program.
It is important that the output from frogbak be examined each day. If
all the dumps run at somewhat standard priorities, then you can tell if
something has not been dumped recently because its priority will be
off. If priorities are not standardized, every failure must be
checked.
There is no warning system built into frogbak. You have to be very
careful to watch what it does to make sure that nothing gets neglected.
EXAMPLES
Initialize a new tape and dump to it:
# mkblank SEQ-037
# frogbak
Recycle an old tape and dump to it:
# recycle SEQ-016
# frogbak
Check to see how much disk space is being backed up:
# sum_covered control.*
Restore a single file from a dump(8) full dump:
% rlogin system_to_be_restored -l root
# rsh system_with_tape -n mt -t /dev/rmt/0mn fsf 8
# rsh system_with_tape -n dd if=/dev/rmt/0mn | restore -ivf -
Verify and Initialize tape.
Dumped from: Sun May 2 20:02:00 1993
Extract directories from tape
Initialize symbol table.
restore > ls
2 *./ 2 *../ 16384 dev/ 10240 etc/ 18433 tmp/
restore > cd tmp
restore > ls
18433 ./ 18610 backup.ddout5679 18641 dump.remote
2 *../ 18643 backup.list5679 18644 rou5688
18434 5176 18608 bkup.log
restore > add bkup.log
Make node ./tmp
restore > add dump.remote
restore > extract
Extract requested files
extract file ./tmp/bkup.log
extract file ./tmp/dump.remote
Add links
Set directory mode, owner, and times.
set owner/mode for '.'? [yn] n
restore > quit
OPTIONS
Frogbak supports a few options:
-dryrun Specifies that dumps should not be performed.
Instead, frogbak looks at its control file and at
the records files and figures out what dumps it
would do. All of its figuring is sent to standard
output for debugging purposes.
-summary Like the -dryrun option except that just the
proposed set of dumps is printed. Please note that
the summary you get is a summary of what would
happen if you ran frogbak right now. If frogbak
is invoked from cron(8), then it is likely that
the actions that are reported now will not match
the actions that will actually occur.
-future amount_of_time
Specifies that frogbak should pretend that the
time is really sometime in the future. This is
for use with the -summary option. The
amount_of_time string is in the same format as
the dump periods in the control file: a number
followed by the units: h, d, w, or m for hours,
days, weeks, or months.
-control control_file Specifies that control.control_file should be
used instead of control.hostname.
CONFIGURATION
The real options are the configuration variables. Settings like
compression must be specified by changing the frogbak program file
itself (frogbak is written in perl(1)).
$do_compress turns compression on and off. Compression is very handy
and I recommend using it when you can. Using it requires a device
driver that allows odd-sized blocks to be written to tape at the end
of the dump. Also, the compress(1) program that comes with most
operating systems is annoyingly slow. The latest versions of compress
are much faster and should be used.
The $eject option controls whether the tape is ejected after a
successful dump.
If you have installed a version of rsh(1) that allows you to specify a
timeout, turn on $timeout_rsh.
ENVIRONMENT
There are no environment variables used by the frogbak system.
PORTS
The frogbak system can be thought of as having a server and clients.
It is not really a client-server system, but since tape drives are
often on servers and clients are often what is being backed up, the
analogy holds some water.
The server currently works with SunOS 4.*, Mach 2.6, and HP-UX 8.*.
The client side currently supports:
sunos Sun-3 and Sun-4 running SunOS 4.*. The dump(8) program is
used.
mach Mach 2.6 running on i386 systems. The dump(8) program is
used.
hp-ux HP-UX 8.* on HP9000/400, HP9000/700, and HP9000/800 sys-
tems. The dump(8) program is used.
ultrix Ultrix 3.* and 4.* running on MIPS-based systems. The
dump(8) program is used.
sony Sony's BSD4.3 OS running on their NEWS systems. The
dump(8) program is used.
xenix SCO Xenix running on an i386. The cpio(1) program is used.
domain Apollo Domain OS version 9.6 and above. The wbak(1) pro-
gram is used.
PORTING
The frogbak system is kinda a pain to move around. Each of the files
must be customized for each site. Most, if not all, of the portability
switches are in the first few lines of each file. When modifying the
frogbak file itself, search for uses of the various strings like SunOS
and sunos.
Please send any portability changes back for incorporation.
OFFSITE
It is critically important that dumps be stored off-site.
Unfortunately, frogbak does not provide any help in choosing which tapes
should go off-site. In fact, it makes it difficult because each tape is
a grab-bag of what was highest priority at the time the tape was written.
BUGS
This system is not very well designed or implemented. It is very
cranky. However it does work reliably. The major bugs have to do with
the design.
The dump sequence, although pretty good, is not optimal. A better
sequence would be a replicated towers-of-hanoi. The dump sequence does
not start off smoothly: until every system has had both a full and an
incremental dump, frogbak does things in a somewhat odd order.
When using frogbak, nothing prevents systems from being overlooked.
Using the default rsh(1) program (remsh(1) on HP-UX), it is easy for a
system to hang the dumps. Rsh does not have a timeout on input and if
the remote system being dumped crashes, frogbak will hang. The solu-
tion for this is to replace rsh(1) with a special version that has
timeouts.
The frogbak system is only as good as the dump program that is used.
The BSD dump(8) program can write bogus dumps when used on a live
filesystem. This usually is not a problem because everything is dumped
so many times.
The /etc/dumpdates file is faked when using dump(8). Sometimes the
original /etc/dumpdates file is not restored and annoying email is sent
by frogbak.
FILES
/y/adm/dump The top of the frogbak commands and records tree at
Berkeley Research and Trading.
records/hostname The directory of information about dumps of
hostname.
tapes/ The directory of information about each dump tape.
recycled/ The directory of old information about tapes that
have been recycled.
logs/ The directory of dump output logs. This should be
cleaned occasionally because the logs can be fairly
large.
dump.remote A script that runs on the system to be dumped. Its
standard output must be a dump and nothing else.
dump.local A shell script that copies dump.remote to the system
that is going to be dumped and then runs it.
control.hostname The control file for hostname.
backup.log.NNNN Dump log files for invocations of frogbak that did
not complete cleanly.
/dev/rmt/0mn Tape device on HP-UX.
/dev/nrst0 Tape device on SunOS.
CREDITS
Thanks are due to Bruce Markey for figuring out how to tune frogbak.
Thanks are due to Larry Hubble for allowing a generous copyright notice
to be applied to frogbak.
AVAILABILITY
The copyright on this system is a bit murky. Some work was done on it
on behalf of TRW Financial Systems and they did not give me permission
to take the changes with me. I would be most surprised if they
objected.
Berkeley Research and Trading has disclaimed any rights to frogbak that
they might have.
AUTHOR
David Muir Sharnoff
SEE ALSO
dump(8), restore(8), dd(1), rsh(1), mt(1).
Edition May 17, 1995 FROGBAK(1)