Overwrite, secure deletion utility
by Salvatore Sanfilippo <antirez@invece.org>

WHAT OVERWRITE DOES
===================

  Data stored on magnetic disk media can be recovered using
  sophisticated analysis techniques. This means that, for example,
  even if one of your old files was overwritten by new data, it may
  still be recoverable. Overwrite is a UNIX utility that tries to
  make data recovery harder. It overwrites files using random and
  deterministic patterns, as suggested in Peter Gutmann's paper
  "Secure Deletion of Data from Magnetic and Solid-State Memory".
  ------------------------------------------------------------------
  WARNING: There is no proof that overwrite works. I can only check
  that overwrite writes the specific patterns over the files; I have
  no proof that this makes the data harder to recover.
  Unfortunately, with some types of hard disk and/or some types of
  file system, overwrite may NOT be able to overwrite the file
  correctly. Use it at your own risk.
  ------------------------------------------------------------------

DO YOU NEED OVERWRITE?
======================

  Short answer: if you use cryptography, you need overwrite.
  If you need your files to be hard to recover after deletion
  (for example, deleting the plaintext after encryption), use it. There
  are no guarantees: even if you use overwrite (or software like it),
  it may still be possible to recover your file, but, of course, with
  more work and/or resources.

HOW TO COMPILE OVERWRITE
========================
  Just try 'make'. If it doesn't work, please send me a report.
  Install it by copying the 'overwrite' binary into some directory
  in your PATH. Use 'make install' to copy the binary into /usr/local/bin.

OVERWRITE USAGE
===============
  Overwrite is very simple to use. For example, to securely delete
  the file 'plaintext', just type at the shell prompt:

  $ overwrite plaintext
  [plaintext 59545 bytes]: rrrrPPPPPPPPPPPPPPPPPPPPPPPPPPPrrrr
  $

  Overwrite writes an 'r' for each random pass and a 'P' for each
  deterministic pass. After all passes, overwrite calls unlink(2) on
  the file. BE CAREFUL! This means that your file will be unrecoverable!
  ---------------------------------------------------------------------
  Overwrite is even more dangerous than the 'rm' command, since _maybe_
  after a mistaken rm you will still be able to recover the file. After
  a mistaken overwrite, recovering the file is almost impossible.
  ---------------------------------------------------------------------

  Overwrite supports some command line switches:
    -l    Leaves the file on the filesystem after overwriting, that
          is, unlink(2) will not be called on the file. The file
          will still be overwritten and will no longer contain any
          useful information. This is useful if you want to run
          overwrite two or more times, or to run another secure
          deletion program against the same file.
    -v    Turns on 'verbose mode'.
    -f    The overwrite 'loop mode', which overwrites the specified files
          forever (or rather, until ctrl+c). As you can guess, -f implies -l.
    -d    Enables 'device mode', which is useful for overwriting
          a char or block device, and in general devices that
          don't support stat(2). The -d switch implies -l.
    -h    Shows a short help message.

THANKS
======

The following people have contributed in some way, many thanks!

Nikos Mavroyanopoulos <nmav@hellug.gr>    [about the PRNG first reseed]
David Colburn	      <spookey@ender.com> [reported a Makefile problem]
Keith                 <kaw@Eng.Sun.COM>   [return type of sync(2) fix ]

OVERWRITE INTERNALS (This is interesting only for programmers)
===================

* The Pseudorandom Number Generator
  In order to shuffle the order in which the patterns are applied and
  to generate the pseudorandom stream for random passes, overwrite has
  a built-in PRNG that is secure enough for this application. It's just
  SHA1 in counter mode. This means that the pseudorandom stream is the
  result of SHA1(state, counter++). The 'state' is a SHA1 digest that is
  reseeded so that state=SHA1(state, unguessable_data). Currently the
  unguessable data are nanosecond timestamps; specifically, the
  timestamps are taken at program start and at every sync, since the
  time that the sync takes is unguessable for an external attacker
  (but beware: on some systems sync(2) may just schedule the writes
  and return). This isn't the "most secure" way to generate random
  numbers, but I think it is secure enough for this application. If you
  want to collect more entropy before using the PRNG, you may get it
  from /proc (under Linux). Modifying overwrite to get entropy from
  other sources is very simple. If you have some ideas and a bit of
  time, please send me an email about this topic.
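
  The construction described above can be sketched as follows. This is
  only an illustration of the structure, not overwrite's actual code:
  the names (struct prng, prng_reseed, prng_fill, prng_reseed_time) are
  hypothetical, and toy_hash() is a NON-cryptographic FNV-1a placeholder
  standing where SHA1 would be, used here only to keep the example short
  and self-contained. The 64-byte cap on the seed data is also an
  assumption of this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define DIGEST_LEN 8   /* placeholder digest size; SHA1 would give 20 */
#define SEED_MAX   64  /* assumption: cap on seed material per reseed */

/* Placeholder for SHA1: 64-bit FNV-1a. NOT secure, illustration only. */
static void toy_hash(const void *data, size_t len,
                     unsigned char out[DIGEST_LEN])
{
    const unsigned char *p = data;
    uint64_t h = 14695981039346656037ULL;
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    for (i = 0; i < DIGEST_LEN; i++)
        out[i] = (unsigned char)(h >> (8 * i));
}

struct prng {
    unsigned char state[DIGEST_LEN];
    uint64_t counter;
};

/* state = H(state, unguessable_data) */
void prng_reseed(struct prng *g, const void *data, size_t len)
{
    unsigned char buf[DIGEST_LEN + SEED_MAX];

    if (len > SEED_MAX)
        len = SEED_MAX;
    memcpy(buf, g->state, DIGEST_LEN);
    memcpy(buf + DIGEST_LEN, data, len);
    toy_hash(buf, DIGEST_LEN + len, g->state);
}

/* Each output block is H(state, counter++). */
void prng_fill(struct prng *g, unsigned char *out, size_t len)
{
    unsigned char in[DIGEST_LEN + sizeof(uint64_t)];
    unsigned char block[DIGEST_LEN];

    while (len > 0) {
        size_t n = len < DIGEST_LEN ? len : DIGEST_LEN;

        memcpy(in, g->state, DIGEST_LEN);
        memcpy(in + DIGEST_LEN, &g->counter, sizeof g->counter);
        g->counter++;
        toy_hash(in, sizeof in, block);
        memcpy(out, block, n);
        out += n;
        len -= n;
    }
}

/* Reseed from a nanosecond-resolution timestamp. */
void prng_reseed_time(struct prng *g)
{
    struct timespec ts;
    long parts[2];

    clock_gettime(CLOCK_REALTIME, &ts);
    parts[0] = (long)ts.tv_sec;
    parts[1] = ts.tv_nsec;
    prng_reseed(g, parts, sizeof parts);
}
```

  Two generators with the same state and counter produce the same
  stream, which is why the quality of the reseed data matters so much.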

* The passes and how overwrite ... overwrites.
  The patterns are taken from Peter Gutmann's paper (see the top of
  this file for more information) and are applied by opening the file
  for writing, performing an lseek(fd, 0, SEEK_SET), and just writing
  'size' bytes (where 'size' is the file size). To speed things up, each
  fixed pattern is copied into a 1024*pattern_size buffer and this buffer
  is used as the pattern. This dramatically improves performance at the
  cost of a little memory. See the KNOWN-BUGS file about the problems
  that can arise with some file system types.
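
  A single deterministic pass, as described above, might look roughly
  like the sketch below. It is not taken from overwrite's source: the
  function name overwrite_pass() and the constant PATTERN_REPEAT are
  hypothetical, and error handling is reduced to a 0/-1 return value.
  The sync step that follows each pass is deliberately left out.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define PATTERN_REPEAT 1024  /* copies of the pattern held in the buffer */

/* Overwrite the first 'size' bytes of 'path' with 'pattern' repeated.
 * Returns 0 on success, -1 on error. */
int overwrite_pass(const char *path, off_t size,
                   const unsigned char *pattern, size_t plen)
{
    size_t buflen, off = 0, i;
    unsigned char *buf;
    int fd;

    if (plen == 0 || size < 0)
        return -1;
    buflen = PATTERN_REPEAT * plen;
    buf = malloc(buflen);
    if (buf == NULL)
        return -1;
    /* Fill the buffer with back-to-back copies of the pattern. */
    for (i = 0; i < PATTERN_REPEAT; i++)
        memcpy(buf + i * plen, pattern, plen);

    fd = open(path, O_WRONLY);
    if (fd == -1 || lseek(fd, 0, SEEK_SET) == (off_t)-1)
        goto fail;

    while (size > 0) {
        size_t chunk = buflen - off;
        ssize_t n;

        if ((off_t)chunk > size)
            chunk = (size_t)size;
        n = write(fd, buf + off, chunk);
        if (n == -1 && errno == EINTR)
            continue;
        if (n <= 0)
            goto fail;
        size -= n;
        /* Keep the pattern phase correct even after a short write. */
        off = (off + (size_t)n) % buflen;
    }
    free(buf);
    return close(fd);

fail:
    if (fd != -1)
        close(fd);
    free(buf);
    return -1;
}
```

  For instance, overwrite_pass("plaintext", 59545, pat, 3) with
  pat = {0x92, 0x49, 0x24} would write that three-byte sequence over
  the whole file. Writing from a large pre-filled buffer means one
  write(2) call covers 1024 repetitions instead of one.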

* sync-related stuff
  After each overwrite pass, the function physic_sync() is called, in
  order to be sure that the OS cache doesn't defeat our efforts.
  Unfortunately, the sync(2) call just flushes the buffers from the
  kernel to the disk. If your HD has a large cache, it is possible that
  not every pattern actually overwrites every part of the file. We need
  to force the HD to flush its cache. This is a hard problem, first
  because it is very architecture-dependent, and second because doing
  it often requires root privileges. The architecture-dependent code
  for Linux/IDE is already present in overwrite; however, more
  architecture-dependent code is needed to reach a better level of
  security on BSD, Solaris, and other Unix-like systems.
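
  As a rough illustration of the kernel-level part only, the sketch
  below calls sync(2) and measures how long the call took with
  nanosecond resolution, which is the kind of timestamp the PRNG
  section mentions as reseed material. The function name sync_timed()
  is hypothetical and this is not the body of physic_sync(); it does
  nothing about the drive's own cache.

```c
#define _XOPEN_SOURCE 700
#include <time.h>
#include <unistd.h>

/* Call sync(2) and store the elapsed time, in nanoseconds, in
 * *elapsed_ns. Returns 0 on success, -1 if the clock is unavailable. */
int sync_timed(long long *elapsed_ns)
{
    struct timespec t0, t1;

    if (clock_gettime(CLOCK_MONOTONIC, &t0) == -1)
        return -1;
    sync();  /* may only schedule the writes on some systems */
    if (clock_gettime(CLOCK_MONOTONIC, &t1) == -1)
        return -1;
    *elapsed_ns = (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
                + (long long)(t1.tv_nsec - t0.tv_nsec);
    return 0;
}
```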

* (overwrite > 0.3) News about the PRNG seed: why overwrite doesn't
  use /dev/urandom and why it uses /dev/random to reseed the first time.
  I report my reply to Nikos Mavroyanopoulos' mail in order to make this
  clear.

Nikos Mavroyanopoulos writes:
> There is only one suggestion. You may use /dev/urandom
> in systems that support it (eg. linux and some bsd implementations) instead of
> the prng. An other alternative is to use /dev/random output for the
> unguessable data of the prng.

It's a good point and was implemented in overwrite 0.3. Nikos also
suggested adding a timeout when reading from /dev/random in order to
avoid overly long delays. My notes about this issue:

1) This works only on Linux/FreeBSD/OpenBSD, so if open(2) fails the
   generator will be initialized with the timestamps only.
2) /dev/urandom is too similar to the built-in PRNG (but better seeded),
   so I don't use /dev/urandom to get bytes, but only to seed the
   built-in generator the first time if /dev/random isn't available or
   is too slow. This allows us to reach the same security without
   depleting the OS entropy pool.

So currently the first /dev/[u]random-based PRNG reseed works according
to this diagram:

        1) Try to open /dev/random; if that fails, jump to step 5
        2) Read from /dev/random using select with a timeout
        3) Timeout reached but read doesn't return? Jump to step 5
        4) Read fewer bytes than needed? Set the timeout to
           (timeout-1) and jump to step 2. Read all the bytes?
           Jump to step 8
        5) Try to open /dev/urandom; if that fails, jump to step 7
        6) Read the bytes from /dev/urandom and jump to step 8
        7) Return an ERROR (non-fatal, the system lacks /dev/[u]random)
        8) Return OK
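
The steps above can be sketched in C roughly as follows. This is an
illustration, not overwrite's actual code: the function names
seed_bytes() and read_fallback() are hypothetical, the device paths are
passed as parameters (normally "/dev/random" and "/dev/urandom") so the
logic can be exercised with ordinary files, and opening with O_NONBLOCK
is an assumption of this sketch. The diagram does not say what happens
when the timeout reaches zero; here that case falls through to step 5.

```c
#include <fcntl.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Steps 5-7: read 'need' bytes from the fallback device. */
static int read_fallback(const char *path, unsigned char *buf, size_t need)
{
    size_t got = 0;
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        return -1;                          /* step 7: non-fatal error */
    while (got < need) {
        ssize_t n = read(fd, buf + got, need - got);

        if (n <= 0) {
            close(fd);
            return -1;
        }
        got += (size_t)n;
    }
    close(fd);
    return 0;                               /* step 8 */
}

/* Returns 0 when 'buf' holds 'need' seed bytes, -1 otherwise. */
int seed_bytes(const char *random_path, const char *urandom_path,
               unsigned char *buf, size_t need, int timeout_sec)
{
    size_t got = 0;
    int fd = open(random_path, O_RDONLY | O_NONBLOCK);     /* step 1 */

    if (fd == -1)
        return read_fallback(urandom_path, buf, need);

    while (got < need && timeout_sec > 0) {
        fd_set rfds;
        struct timeval tv;
        ssize_t n;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = timeout_sec;
        tv.tv_usec = 0;
        if (select(fd + 1, &rfds, NULL, NULL, &tv) <= 0)   /* steps 2-3 */
            break;
        n = read(fd, buf + got, need - got);
        if (n <= 0)
            break;
        got += (size_t)n;
        timeout_sec--;                                      /* step 4 */
    }
    close(fd);
    if (got == need)
        return 0;                                           /* step 8 */
    return read_fallback(urandom_path, buf, need);          /* step 5 */
}
```

A caller would treat a -1 return as the non-fatal case of step 7 and
continue with a generator seeded by timestamps only.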

I hope this little utility will help you to enhance your privacy.
antirez
