Download latest version: unixxdos 1.0
Page created on 2010-11-25 by André Gillibert
Unixxdos is free software, released under the WTFPL license.
Unixxdos is a conservative alternative to dos2unix and unix2dos. It provides a symmetrical operation, so that:
Any LF not preceded by a CR becomes a CRLF.
A CRLF sequence becomes an LF unless it's preceded by another CR.
A CRCRLF sequence is kept unchanged.
A CR not followed by an LF is kept unchanged.
Base rules:
CR -> CR
CR LF -> LF
LF -> CR LF
CR CR LF -> CR CR LF
Significant samples:
LF LF -> CR LF CR LF
LF CR -> CR LF CR
LF CR LF -> CR LF LF
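The rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the actual unixxdos source: the function name `unixxdos` and the byte-oriented interface are assumptions made for the example.

```python
CR, LF = 0x0D, 0x0A

def unixxdos(data: bytes) -> bytes:
    """Illustrative sketch of the base rules (not the real implementation)."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b == CR:
            # Find the end of the whole run of consecutive CRs.
            j = i
            while j < n and data[j] == CR:
                j += 1
            if j < n and data[j] == LF:
                if j - i == 1:
                    out.append(LF)        # CR LF -> LF
                else:
                    out += data[i:j + 1]  # CR CR LF (or longer run) is kept
                i = j + 1
            else:
                out += data[i:j]          # CR not followed by LF is kept
                i = j
        elif b == LF:
            out += b"\r\n"                # LF not preceded by CR -> CR LF
            i += 1
        else:
            out.append(b)                 # any other byte is copied as-is
            i += 1
    return bytes(out)

# The base rules and the significant samples, written as checks.
assert unixxdos(b"\r") == b"\r"
assert unixxdos(b"\r\n") == b"\n"
assert unixxdos(b"\n") == b"\r\n"
assert unixxdos(b"\r\r\n") == b"\r\r\n"
assert unixxdos(b"\n\n") == b"\r\n\r\n"
assert unixxdos(b"\n\r") == b"\r\n\r"
assert unixxdos(b"\n\r\n") == b"\r\n\n"
```

An LF is only ever reached in the `elif` branch when it is not preceded by a CR, because every run of CRs consumes the LF that follows it.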
A binary blob passed twice through unixxdos is unchanged.
Lemmas:
Any sequence that's terminated by an LF is still terminated by an LF after transformation. This follows directly from the four base rules.
Any sequence that's terminated by a CR followed by a non-CR, non-LF character is also transformed into such a sequence.
It's not hard to see that we can cut any file into three types of chunks: "(?<!\r)\r*\n" sequences (type N chunks), "\r+([^\r\n]|$)" sequences (type R chunks) and "[^\r\n]" characters (type C chunks). Note: (?<!\r) is a negative perl5 look-behind assertion, which means that the specified sequence is not preceded by a \r (see perlre(1)).
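As a rough illustration of this decomposition, the three patterns can be combined into one Python regular expression. The helper name `split_chunks` is hypothetical, and `\Z` is used in place of `$` because Python's `$` also matches just before a trailing newline, which is not what is wanted here.

```python
import re

# One alternative per chunk type, tried in the order N, R, C.
CHUNK = re.compile(
    rb"(?P<N>(?<!\r)\r*\n)"         # type N: optional CRs, then LF
    rb"|(?P<R>\r+(?:[^\r\n]|\Z))"   # type R: CRs, then a non-CR non-LF byte or end of data
    rb"|(?P<C>[^\r\n])"             # type C: a single non-CR non-LF byte
)

def split_chunks(data: bytes):
    """Return a list of (type, chunk) pairs covering all of data."""
    return [(m.lastgroup, m.group()) for m in CHUNK.finditer(data)]

sample = b"a\r\r\nb\rc\n\r"
chunks = split_chunks(sample)
for kind, chunk in chunks:
    print(kind, chunk)

# The chunks are contiguous: gluing them back together restores the input.
assert b"".join(chunk for _, chunk in chunks) == sample
```

For the sample above, the chunk types come out in the order C, N, C, R, N, R.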
It's easy to see that the program transforms each chunk independently of the others, so that unixxdos(chunk1 . chunk2) is equal to (unixxdos(chunk1) . unixxdos(chunk2)), where dot is string concatenation. That's trivial if one of the chunks is a type C chunk, and thanks to the lemmas it also holds if chunk1 is type N and chunk2 is type R, or the reverse. Moreover, each chunk is transformed into a chunk of the same type.
Now, we just have to prove that unixxdos(unixxdos(chunk))=chunk. This is trivial for type C and type R chunks, but it's also easy to see it holds true for type N chunks.
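The argument can also be checked by brute force on short inputs. The sketch below is not the real program: it restates the rules as a single regular-expression substitution over maximal "\r*\n" runs (an assumed, equivalent formulation), and the helper name `find_counterexample` is hypothetical. Since every byte other than CR and LF is treated alike, a three-letter alphabet is enough.

```python
import re
from itertools import product

def unixxdos(data: bytes) -> bytes:
    """Sketch of the rules: only maximal CR*LF runs are ever rewritten."""
    def rewrite(match):
        run = match.group()
        if run == b"\n":
            return b"\r\n"   # LF -> CR LF
        if run == b"\r\n":
            return b"\n"     # CR LF -> LF
        return run           # CR CR LF and longer runs are kept
    return re.sub(rb"\r*\n", rewrite, data)

def find_counterexample(max_len: int):
    """Return an input s with unixxdos(unixxdos(s)) != s, or None."""
    alphabet = (b"\r", b"\n", b"x")
    for length in range(max_len + 1):
        for combo in product(alphabet, repeat=length):
            s = b"".join(combo)
            if unixxdos(unixxdos(s)) != s:
                return s
    return None

# Exhaustive over all inputs of up to 8 bytes drawn from CR, LF and "x".
assert find_counterexample(8) is None
```

Such a test only covers bounded lengths, so it complements the chunk argument rather than replacing it.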
unixxdos < dos.txt > unix.txt
unixxdos < unix.txt > dos.txt
unixxdos reads data from stdin and writes the transformed data to stdout. It accepts no command-line arguments.