-*- indented-text -*-
-BUGS ---------------------------------------------------------------
-Fix hardlink reporting 2002/03/25
-Fix progress indicator to not corrupt log
-lchmod question
-Do not rely on having a group called "nobody"
-Incorrect timestamps (Debian #100295)
-Win32
-
FEATURES ------------------------------------------------------------
-server-imposed bandwidth limits
-rsyncd over ssh
Use chroot only if supported
Allow supplementary groups in rsyncd.conf 2002/04/09
Handling IPv6 on old machines
-Other IPv6 stuff:
+Other IPv6 stuff
Add ACL support 2001/12/02
-Lazy directory creation
-Conditional -z for old protocols
proxy authentication 2002/01/23
SOCKS 2002/01/23
FAT support
-Allow forcing arbitrary permissions 2002/03/12
--diff david.e.sewell 2002/03/15
-Add daemon --no-detach and --no-fork options
+Add daemon --no-fork option
+Create more granular verbosity 2003/05/15
DOCUMENTATION --------------------------------------------------------
-Update README
Keep list of open issues and todos on the web site
-Update web site from CVS
Perhaps redo manual as SGML
LOGGING --------------------------------------------------------------
-Make dry run list all updates 2002/04/03
Memory accounting
Improve error messages
-Better statistics: Rasmus 2002/03/08
+Better statistics Rasmus 2002/03/08
Perhaps flush stdout like syslog
-Log deamon sessions that just list modules
Log child death on signal
-Keep stderr and stdout properly separated (Debian #23626)
-Log errors with function that reports process of origin
verbose output David Stein 2001/12/20
-Add reason for transfer to file logging
-debugging of daemon 2002/04/08
internationalization
DEVELOPMENT --------------------------------------------------------
Handling duplicate names
Use generic zlib 2002/02/25
-TDB: 2002/03/12
+TDB 2002/03/12
Splint 2002/03/12
-Memory debugger
-Create release script
-Add machines to build farm
PERFORMANCE ----------------------------------------------------------
-File list structure in memory
Traverse just one directory at a time
-Hard-link handling
Allow skipping MD4 file_sum 2002/04/08
Accelerate MD4
-String area code
TESTING --------------------------------------------------------------
Torture test
Test large files
Create mutator program for testing
Create configure option to enable dangerous tests
-If tests are skipped, say why.
-Test daemon feature to disallow particular options.
Create pipe program for testing
Create test makefile target for some tests
-Test "refuse options" works
RELATED PROJECTS -----------------------------------------------------
rsyncsh
-http://rsync.samba.org/rsync-and-debian/
+https://rsync.samba.org/rsync-and-debian/
rsyncable gzip patch
rsyncsplit as alternative to real integration with gzip?
reverse rsync over HTTP Range
-BUGS ---------------------------------------------------------------
-
-Fix hardlink reporting 2002/03/25
- (was: There seems to be a bug with hardlinks)
-
- mbp/2 build$ ls -l /tmp/a /tmp/b -i
- /tmp/a:
- total 32
- 2568307 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a1
- 2568307 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a2
- 2568307 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a3
- 2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a4
- 2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a5
- 2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b1
- 2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b2
- 2568310 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b3
-
- /tmp/b:
- total 32
- 2568309 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a1
- 2568309 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a2
- 2568309 -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a3
- 2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a4
- 2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a5
- 2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b1
- 2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b2
- 2568311 -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b3
- mbp/2 build$ rm -r /tmp/b && ./rsync -avH /tmp/a/ /tmp/b
- building file list ... done
- created directory /tmp/b
- ./
- a1
- a4
- a2 => a1
- a3 => a2
- wrote 350 bytes read 52 bytes 804.00 bytes/sec
- total size is 232 speedup is 0.58
- mbp/2 build$ rm -r /tmp/b
- mbp/2 build$ ls -l /tmp/b
- ls: /tmp/b: No such file or directory
- mbp/2 build$ rm -r /tmp/b && ./rsync -avH /tmp/a/ /tmp/b
- rm: cannot remove `/tmp/b': No such file or directory
- mbp/2 build$ rm -f -r /tmp/b && ./rsync -avH /tmp/a/ /tmp/b
- building file list ... done
- created directory /tmp/b
- ./
- a1
- a4
- a2 => a1
- a3 => a2
- wrote 350 bytes read 52 bytes 804.00 bytes/sec
- total size is 232 speedup is 0.58
- mbp/2 build$ ls -l /tmp/b
- total 32
- -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a1
- -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a2
- -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a3
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a4
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a5
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b1
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b2
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b3
- mbp/2 build$ ls -l /tmp/a
- total 32
- -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a1
- -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a2
- -rw-rw-r-- 3 mbp mbp 29 Mar 25 17:30 a3
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a4
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 a5
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b1
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b2
- -rw-rw-r-- 5 mbp mbp 29 Mar 25 17:30 b3
-
- -- --
-
-
-Fix progress indicator to not corrupt log
-
- Progress indicator can produce corrupt output when transferring directories:
-
- main/binary-arm/
- main/binary-arm/admin/
- main/binary-arm/base/
- main/binary-arm/comm/8.56kB/s 0:00:52
- main/binary-arm/devel/
- main/binary-arm/doc/
- main/binary-arm/editors/
- main/binary-arm/electronics/s 0:00:53
- main/binary-arm/games/
- main/binary-arm/graphics/
- main/binary-arm/hamradio/
- main/binary-arm/interpreters/
- main/binary-arm/libs/6.61kB/s 0:00:54
- main/binary-arm/mail/
- main/binary-arm/math/
- main/binary-arm/misc/
-
- -- --
-
-
-lchmod question
-
- I don't think we handle this properly on systems that don't have the
- call. Are there any such?
-
- -- --
-
-
-Do not rely on having a group called "nobody"
-
- http://www.linuxbase.org/spec/refspecs/LSB_1.1.0/gLSB/usernames.html
-
- On Debian it's "nogroup"
-
- -- --
-
-
-Incorrect timestamps (Debian #100295)
-
- A bit hard to believe, but apparently it happens.
-
- -- --
-
-
-Win32
-
- Don't detach, because this messes up --srvany.
-
- http://sources.redhat.com/ml/cygwin/2001-08/msg00234.html
-
-
-
- -- --
-
FEATURES ------------------------------------------------------------
-server-imposed bandwidth limits
-
- -- --
-
-
-rsyncd over ssh
-
- There are already some patches to do this.
-
- BitKeeper uses a server whose login shell is set to bkd. That's
- probably a reasonable approach.
-
- -- --
-
Use chroot only if supported
If running as non-root, then don't fail, just give a warning.
(There was a thread about this a while ago?)
- http://lists.samba.org/pipermail/rsync/2001-August/thread.html
- http://lists.samba.org/pipermail/rsync/2001-September/thread.html
+ https://lists.samba.org/pipermail/rsync/2001-August/thread.html
+ https://lists.samba.org/pipermail/rsync/2001-September/thread.html
-- --
platforms that have a half-working implementation, so redefining
these functions clashes with system headers, and leaving them out
breaks. This affects at least OSF/1, RedHat 5, and Cobalt, which
- are moderately improtant.
+ are moderately important.
Perhaps the simplest solution would be to have two different files
implementing the same interface, and choose either the new or the
-- --
-Other IPv6 stuff:
+Other IPv6 stuff
Implement suggestions from http://www.kame.net/newsletter/19980604/
and ftp://ftp.iij.ad.jp/pub/RFC/rfc2553.txt
multiple passive addresses. This might be a bit harder, because we
may need to select on all of them. Hm.
- Define a syntax for IPv6 literal addresses. Since they include
- colons, they tend to break most naming systems, including ours.
- Based on the HTTP IPv6 syntax, I think we should use
-
- rsync://[::1]/foo/bar [::1]::bar
-
- which should just take a small change to the parser code.
-
-- --
Transfer ACLs. Need to think of a standard representation.
Probably better not to even try to convert between NT and POSIX.
Possibly can share some code with Samba.
-
- -- --
-
-
-Lazy directory creation
-
- With the current common --include '*/' --exclude '*' pattern, people
- can end up with many empty directories. We might avoid this by
- lazily creating such directories.
-
- -- --
-
-
-Conditional -z for old protocols
-
- After we get the @RSYNCD greeting from the server, we know it's
- version but we have not yet sent the command line, so we could just
- remove the -z option if the server is too old.
-
- For ssh invocation it's not so simple, because we actually use the
- command line to start the remote process. However, we only actually
- do compression in token.c, and we could therefore once we discover
- the remote version emit an error if it's too old. I'm not sure if
- that's a good tradeoff or not.
+ NOTE: there is a patch that implements this in the "patches" subdir.
-- --
-- --
-Allow forcing arbitrary permissions 2002/03/12
-
- On 12 Mar 2002, Dave Dykstra <dwd@bell-labs.com> wrote:
- > If we would add an option to do that functionality, I
- > would vote for one that was more general which could mask
- > off any set of permission bits and possibly add any set of
- > bits. Perhaps a chmod-like syntax if it could be
- > implemented simply.
-
- I think that would be good too. For example, people uploading files
- to a web server might like to say
-
- rsync -avzP --chmod a+rX ./ sourcefrog.net:/home/www/sourcefrog/
-
- Ideally the patch would implement as many of the gnu chmod semantics
- as possible. I think the mode parser should be a separate function
- that passes back something like (mask,set) description to the rest
- of the program. For bonus points there would be a test case for the
- parser.
-
- Possibly also --chown
-
- (Debian #23628)
-
- -- --
-
-
--diff david.e.sewell 2002/03/15
Allow people to specify the diff command. (Might want to use wdiff,
-- --
-Add daemon --no-detach and --no-fork options
+Add daemon --no-fork option
Very useful for debugging. Also good when running under a
daemon-monitoring process that tries to restart the service when the
-- --
-DOCUMENTATION --------------------------------------------------------
-Update README
+Create more granular verbosity 2003/05/15
- -- --
+ Control output with the --report option.
+ The option takes a single argument (no whitespace): a
+ comma-delimited list of keywords.
-Keep list of open issues and todos on the web site
+ This would separate debugging from "logging" and allow
+ fine-grained selection of statistical reporting and of which
+ actions are logged.
+
+ https://lists.samba.org/archive/rsync/2003-May/006059.html
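The keyword argument described above could be parsed roughly as follows. This is only a sketch under stated assumptions: the TODO item names no keywords, so "stats", "debug" and "actions", the REPORT_* flags and parse_report() are all hypothetical and are not part of rsync.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical report categories; the TODO item does not name any. */
#define REPORT_STATS   (1u << 0)
#define REPORT_DEBUG   (1u << 1)
#define REPORT_ACTIONS (1u << 2)

static const struct {
    const char *word;
    unsigned flag;
} report_words[] = {
    { "stats",   REPORT_STATS },
    { "debug",   REPORT_DEBUG },
    { "actions", REPORT_ACTIONS },
};

/* Parse "word,word,..." into a bitmask.  Returns 0 on success, or -1 if
 * any keyword is empty or unknown (which also rejects whitespace).
 * *mask is only written on success. */
static int parse_report(const char *arg, unsigned *mask)
{
    const size_t nwords = sizeof report_words / sizeof report_words[0];
    unsigned result = 0;

    assert(arg != NULL && mask != NULL);
    for (;;) {
        size_t len = strcspn(arg, ",");   /* length of this keyword */
        size_t i;

        for (i = 0; i < nwords; i++) {
            if (strlen(report_words[i].word) == len
             && strncmp(report_words[i].word, arg, len) == 0)
                break;
        }
        if (i == nwords)
            return -1;                    /* empty or unknown keyword */
        result |= report_words[i].flag;
        if (arg[len] == '\0')
            break;                        /* that was the last keyword */
        arg += len + 1;                   /* step over the comma */
    }
    *mask = result;
    return 0;
}
```

Returning a bitmask keeps the parser separate from the code that acts on it, so each reporting site would only need to test one flag.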
-- --
+DOCUMENTATION --------------------------------------------------------
-Update web site from CVS
+
+Keep list of open issues and todos on the web site
-- --
LOGGING --------------------------------------------------------------
-Make dry run list all updates 2002/04/03
-
- --dry-run is too dry
-
- Mark Santcroos points out that -n fails to list files which have
- only metadata changes, though it probably should.
-
- There may be a Debian bug about this as well.
-
- -- --
-
Memory accounting
At exit, show how much memory was used for the file list, etc.
- Also we do a wierd exponential-growth allocation in flist.c. I'm
+ We also do a weird exponential-growth allocation in flist.c. I'm
not sure this makes sense with modern mallocs. At any rate it will
make us allocate a huge amount of memory for large file lists.
our load? (Debian #28416) Probably fixed now, but a test case would
be good.
-
-
-- --
-Better statistics: Rasmus 2002/03/08
+Better statistics Rasmus 2002/03/08
<Rasmus>
hey, how about an rsync option that just gives you the
Perhaps flush stdout after each filename, so that people trying to
monitor progress in a log file can do so more easily. See
- http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=48108
-
- -- --
-
-
-Log deamon sessions that just list modules
-
- At the connections that just get a list of modules are not logged,
- but they should be.
+ https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=48108
-- --
-- --
-Keep stderr and stdout properly separated (Debian #23626)
-
- -- --
-
-
-Log errors with function that reports process of origin
-
- Use a separate function for reporting errors; prefix it with
- "rsync:" or "rsync(remote)", or perhaps even "rsync(local
- generator): ".
-
- -- --
-
-
verbose output David Stein 2001/12/20
- Indicate whether files are new, updated, or deleted
-
At end of transfer, show how many files were or were not transferred
correctly.
-- --
-Add reason for transfer to file logging
-
- Explain *why* every file is transferred or not (e.g. "local mtime
- 123123 newer than 1283198")
-
- -- --
-
-
-debugging of daemon 2002/04/08
-
- Add an rsyncd.conf parameter to turn on debugging on the server.
-
- -- --
-
-
internationalization
Change to using gettext(). Probably need to ship this for platforms
Handling duplicate names
- We need to be careful of duplicate names getting into the file list.
- See clean_flist(). This could happen if multiple arguments include
- the same file. Bad.
-
- I think duplicates are only a problem if they're both flowing
- through the pipeline at the same time. For example we might have
- updated the first occurrence after reading the checksums for the
- second. So possibly we just need to make sure that we don't have
- both in the pipeline at the same time.
-
- Possibly if we did one directory at a time that would be sufficient.
-
- Alternatively we could pre-process the arguments to make sure no
- duplicates will ever be inserted. There could be some bad cases
- when we're collapsing symlinks.
-
- We could have a hash table.
-
- The root of the problem is that we do not want more than one file
- list entry referring to the same file. At first glance there are
- several ways this could happen: symlinks, hardlinks, and repeated
- names on the command line.
-
- If names are repeated on the command line, they may be present in
- different forms, perhaps by traversing directory paths in different
- ways, traversing paths including symlinks. Also we need to allow
- for expansion of globs by rsync.
-
- At the moment, clean_flist() requires having the entire file list in
- memory. Duplicate names are detected just by a string comparison.
-
- We don't need to worry about hard links causing duplicates because
- files are never updated in place. Similarly for symlinks.
-
- I think even if we're using a different symlink mode we don't need
- to worry.
-
- Unless we're really clever this will introduce a protocol
- incompatibility, so we need to be able to accept the old format as
- well.
+ Some folks would like rsync to be deterministic in how it handles
+ duplicate names that come from merging multiple source directories
+ into a single destination directory; e.g. the last name wins. We
+ could do this by switching our sort algorithm to one that will
+ guarantee that the names won't be reordered. Alternately, we could
+ assign an ever-increasing number to each item as we insert it into
+ the list and then make sure that we leave the largest number when
+ cleaning the file list (see clean_flist()). Another solution would
+ be to add a hash table, and thus never put any duplicate names into
+ the file list (and bump the protocol to handle this).
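The "ever-increasing number" idea above might look something like the sketch below. It assumes a simplified, hypothetical struct entry and clean_list(); rsync's real file list entries and clean_flist() are more involved and are not reproduced here. Because qsort() does not promise a stable order for equal keys, the comparison breaks ties on the insertion number instead.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a file list entry. */
struct entry {
    const char *name;
    unsigned long seq;   /* insertion order, ever-increasing */
};

/* Order by name, then by insertion number, so that the result is
 * deterministic even though qsort() itself is not a stable sort. */
static int cmp_entry(const void *a, const void *b)
{
    const struct entry *x = a, *y = b;
    int c = strcmp(x->name, y->name);

    if (c != 0)
        return c;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Sort the list and drop duplicate names, keeping the entry with the
 * largest insertion number for each name ("the last name wins").
 * Returns the new number of entries. */
static size_t clean_list(struct entry *list, size_t n)
{
    size_t i, out = 0;

    assert(list != NULL || n == 0);
    if (n == 0)
        return 0;
    qsort(list, n, sizeof list[0], cmp_entry);
    for (i = 0; i < n; i++) {
        /* Within a run of equal names the last one has the largest seq. */
        if (i + 1 < n && strcmp(list[i].name, list[i + 1].name) == 0)
            continue;
        list[out++] = list[i];
    }
    return out;
}
```

For the input b(0), a(1), b(2), a(3), c(4) this leaves a(3), b(2), c(4). The hash table alternative would avoid the sort-time tie-break but, as noted above, would need a protocol bump.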
-- --
-- --
-TDB: 2002/03/12
-
- Rather than storing the file list in memory, store it in a TDB.
-
- This *might* make memory usage lower while building the file list.
-
- Hashtable lookup will mean files are not transmitted in order,
- though... hm.
-
- This would neatly eliminate one of the major post-fork shared data
- structures.
-
- -- --
-
-
Splint 2002/03/12
Build rsync with SPLINT to try to find security holes. Add
-- --
-
-Memory debugger
-
- jra recommends Valgrind:
-
- http://devel-home.kde.org/~sewardj/
-
- -- --
-
-
-Create release script
-
- Script would:
-
- Update spec files
-
- Build tar file; upload
-
- Send announcement to mailing list and c.o.l.a.
-
- Make freshmeat announcement
-
- Update web site
-
- -- --
-
-
-Add machines to build farm
-
- Cygwin (on different versions of Win32?)
-
- HP-UX variants (via HP?)
-
- SCO
-
-
-
- -- --
-
PERFORMANCE ----------------------------------------------------------
-File list structure in memory
-
- Rather than one big array, perhaps have a tree in memory mirroring
- the directory tree.
-
- This might make sorting much faster! (I'm not sure it's a big CPU
- problem, mind you.)
-
- It might also reduce memory use in storing repeated directory names
- -- again I'm not sure this is a problem.
-
- -- --
-
-
-Traverse just one directory at a time
-
- Traverse just one directory at a time. Tridge says it's possible.
-
- At the moment rsync reads the whole file list into memory at the
- start, which makes us use a lot of memory and also not pipeline
- network access as much as we could.
-
- -- --
-
-
-Hard-link handling
-
- At the moment hardlink handling is very expensive, so it's off by
- default. It does not need to be so.
-
- Since most of the solutions are rather intertwined with the file
- list it is probably better to fix that first, although fixing
- hardlinks is possibly simpler.
-
- We can rule out hardlinked directories since they will probably
- screw us up in all kinds of ways. They simply should not be used.
-
- At the moment rsync only cares about hardlinks to regular files. I
- guess you could also use them for sockets, devices and other beasts,
- but I have not seen them.
-
- When trying to reproduce hard links, we only need to worry about
- files that have more than one name (nlinks>1 && !S_ISDIR).
-
- The basic point of this is to discover alternate names that refer to
- the same file. All operations, including creating the file and
- writing modifications to it need only to be done for the first name.
- For all later names, we just create the link and then leave it
- alone.
-
- If hard links are to be preserved:
-
- Before the generator/receiver fork, the list of files is received
- from the sender (recv_file_list), and a table for detecting hard
- links is built.
-
- The generator looks for hard links within the file list and does
- not send checksums for them, though it does send other metadata.
-
- The sender sends the device number and inode with file entries, so
- that files are uniquely identified.
-
- The receiver goes through and creates hard links (do_hard_links)
- after all data has been written, but before directory permissions
- are set.
-
- At the moment device and inum are sent as 4-byte integers, which
- will probably cause problems on large filesystems. On Linux the
- kernel uses 64-bit ino_t's internally, and people will soon have
- filesystems big enough to use them. We ought to follow NFS4 in
- using 64-bit device and inode identification, perhaps with a
- protocol version bump.
-
- Once we've seen all the names for a particular file, we no longer
- need to think about it and we can deallocate the memory.
-
- We can also have the case where there are links to a file that are
- not in the tree being transferred. There's nothing we can do about
- that. Because we rename the destination into place after writing,
- any hardlinks to the old file are always going to be orphaned. In
- fact that is almost necessary because otherwise we'd get really
- confused if we were generating checksums for one name of a file and
- modifying another.
-
- At the moment the code seems to make a whole second copy of the file
- list, which seems unnecessary.
-
- We should have a test case that exercises hard links. Since it
- might be hard to compare ./tls output where the inodes change we
- might need a little program to check whether several names refer to
- the same file.
-
- -- --
-
-
Allow skipping MD4 file_sum 2002/04/08
If we're doing a local transfer, or using -W, then perhaps don't
calculating MD4 checksums uses 90% of CPU and is unlikely to be
useful.
- Indeed for transfers over zlib or ssh we can also rely on the
- transport to have quite strong protection against corruption.
-
- Perhaps we should have an option to disable this,
- analogous to --whole-file, although it would default to
- disabled. The file checksum takes up a definite space in
- the protocol -- we can either set it to 0, or perhaps just
- leave it out.
+ We should not allow it to be disabled separately from -W, though,
+ as it is the only thing that lets us know when the rsync algorithm
+ got out of sync and messed the file up (i.e. if the basis file
+ changed between checksum generation and reception).
-- --
-- --
-
-String area code
-
- Test whether this is actually faster than just using malloc(). If
- it's not (anymore), throw it out.
-
- -- --
-
TESTING --------------------------------------------------------------
Torture test
-- --
-If tests are skipped, say why.
-
- -- --
-
-
-Test daemon feature to disallow particular options.
-
- -- --
-
-
Create pipe program for testing
Create pipe program that makes slow/jerky connections for
-- --
-
-Test "refuse options" works
-
- What about for --recursive?
-
- If you specify an unrecognized option here, you should get an error.
-
- We need a test case for this...
-
- Was this broken when we changed to popt?
-
- -- --
-
RELATED PROJECTS -----------------------------------------------------
rsyncsh
-- --
-http://rsync.samba.org/rsync-and-debian/
+https://rsync.samba.org/rsync-and-debian/
-- --
Goswin Brederlow suggested this on Debian; I think tridge and I
talked about it previous in relation to rproxy.
+ Addendum: It looks like someone is working on a version of this:
+
+ http://zsync.moria.org.uk/
+
-- --