Dear rsync pros, I have a question:
How can I filter a file that is being transferred? Rsync has a --filter option, but that can only filter the list of files – not the file contents.
I'd like to run sed on the file stream before it is stored.
Is there any way to do this?
I already tried to sneak a script into the --rsh option but that did not go well of course.
@chpietsch If you're talking about filtering at the src, have you tried feeding grep+find output into rsync to transfer?
@bahmanm I don't mean filter in the sense of selecting files but in the sense of Unix stream filters.
@chpietsch Not sure I understood the use case yet, but I guess it's not possible within rsync itself. (Hm, should sed change the file in transit or just identify files to filter/exclude?)
Maybe you could use grep to create a list of files to exclude beforehand, and call rsync with --exclude-from=FILE.
@david I want all files, so I don't care about --exclude etc. My use case is anonymizing IP addresses before writing them to disk at the receiving end.
@chpietsch Just an ugly idea for a single directory of files, transferring each file on its own:
for file in ./*; do [ -f "$file" ] || continue; sed 's/…/g' "$file" | ssh remotehost "cat > /tmp/$(basename "$file")"; done
@david Thanks but I need to use rsync. I guess I will have to make rsync write logs containing the file names and process these logs later. It would have been nice never to store any PII on that server though.
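A minimal sketch of that log-then-process idea, under stated assumptions: rsync's --out-format='%n' prints one transferred name per line, and the list is then fed to a post-processing step on the receiving side. The function names `anonymize` and `anonymize_listed`, the sed expression (which zeroes the last octet of anything shaped like an IPv4 address), and all paths are hypothetical and not from this thread.

```shell
#!/bin/sh
# Hypothetical filter (an assumption): zero the last octet of IPv4-looking strings.
anonymize() {
    sed -E 's/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.[0-9]{1,3}/\1.0/g'
}

# Rewrite, in place, every regular file named in list file $2.
# Names in the list are taken relative to directory $1.
anonymize_listed() {
    dir=$1
    list=$2
    while IFS= read -r name; do
        file=$dir/$name
        # Skip directory entries and names that no longer exist.
        [ -f "$file" ] || continue
        scratch=$(mktemp "$file.XXXXXX") || return 1
        # Filter into a scratch file, then overwrite the original's contents
        # so that its permissions and ownership stay as rsync set them.
        anonymize < "$file" > "$scratch" && cat "$scratch" > "$file"
        rm -f "$scratch"
    done < "$list"
}

# Usage on the receiving host (host name and paths are placeholders):
#   rsync -a --out-format='%n' server1:/var/log/app/ /srv/dst/ > /tmp/transferred.list
#   anonymize_listed /srv/dst /tmp/transferred.list
```

Two caveats: the unfiltered data still touches the disk briefly, and because the rewritten files differ from the source, a later rsync run will most likely transfer them again.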
Isn't the main idea of rsync to detect changes between src and dst and copy only those? So if you modify dst or src, you defeat this purpose.
My current idea would be a two-step approach, even though it requires twice the space:
First, rsync from src (server1) to dst1 (server2).
Then, on server2, run a script which writes the filtered files into dst2 if they don't already exist in dst2.
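The second step could look roughly like the sketch below. It is only an illustration: the names `anonymize` and `filter_new_files`, the sed expression (zeroing the last octet of IPv4-looking strings), and the directory paths are assumptions, not something given in this thread. It keeps dst1 untouched so rsync can still compare it against src.

```shell
#!/bin/sh
# Hypothetical filter (an assumption): zero the last octet of IPv4-looking strings.
anonymize() {
    sed -E 's/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})\.[0-9]{1,3}/\1.0/g'
}

# Write a filtered copy of every regular file under $1 (dst1) into $2 (dst2),
# skipping files that already exist in dst2.
filter_new_files() {
    src=$1
    dst=$2
    # List paths relative to dst1; the cd happens in a subshell only.
    ( cd "$src" && find . -type f ) | while IFS= read -r rel; do
        rel=${rel#./}
        out=$dst/$rel
        # Already filtered on an earlier run: leave it alone.
        [ -e "$out" ] && continue
        mkdir -p "$(dirname "$out")"
        anonymize < "$src/$rel" > "$out"
    done
}

# Usage on server2 (paths are placeholders):
#   filter_new_files /srv/dst1 /srv/dst2
```

As written, this only creates files that are missing from dst2; a file that changes in dst1 after it was filtered would need an extra timestamp or checksum comparison. File names containing newlines are not handled.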