It has been said for years that files on Linux do not become fragmented, so Linux doesn't need defragmentation. That is not true! Large files can certainly become fragmented on Linux, especially if they are written to often, as files downloaded over BitTorrent are. Here is the proof:
$ sudo filefrag big-500mb-file
big-500mb-file: 4316 extents found, perfection would be 3 extents
$ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches" # Clears fs caches and forces Linux to read from disk.
$ time cat big-500mb-file > /dev/null
real 0m24.842s
user 0m0.032s
sys 0m0.592s
That's how long it takes to read the whole file sequentially when it is heavily fragmented. Compare that with how long it takes when the file is almost unfragmented:
$ cp big-500mb-file 500mb-copy
$ sudo filefrag 500mb-copy
500mb-copy: 6 extents found, perfection would be 3 extents
$ sudo sh -c "echo 3 > /proc/sys/vm/drop_caches"
$ time cat 500mb-copy > /dev/null
real 0m6.501s
user 0m0.024s
sys 0m0.508s
Note that the copy is still slightly fragmented (6 extents against an ideal 3), possibly because other I/O was going on in the background. Three things can be learned from this exercise:
- Fragmentation does matter! It took almost four times as long (24.8 s versus 6.5 s) to read the fragmented file as it did to read the nearly unfragmented copy. The overhead could be even worse for smaller files, because seek time dominates transfer time. For example, a 2 MB file in 10 fragments could, in the worst case, take about 10 times as long to read as the same file in one fragment.
- BitTorrent leaves files in a heavily fragmented state, likely because thousands of writes are performed on the same file and it is hard to get them all in order. But I don't understand why the client couldn't preallocate the files in advance and then write into them.
- cp can defragment files: the copy went from 4316 extents down to 6.
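The seek-time argument in the first point can be put into rough numbers. The sketch below is a deliberately crude model, not a measurement: it charges one seek per fragment plus the time to transfer the data sequentially. The function names `read_time_s` and `slowdown` are made up for this illustration, the 10 ms seek time is an assumed figure for a spinning disk, and the 77 MB/s throughput is simply 500 MB divided by the 6.5 s measured above.

```python
def read_time_s(size_mb, fragments, seek_ms=10.0, throughput_mb_s=77.0):
    """Estimate the seconds needed to read a file from disk.

    Crude model: one seek per fragment plus sequential transfer time.
    seek_ms is an assumed value; throughput_mb_s is 500 MB / 6.5 s.
    """
    seek_s = fragments * seek_ms / 1000.0
    transfer_s = size_mb / throughput_mb_s
    return seek_s + transfer_s


def slowdown(size_mb, fragments, **model):
    """How many times slower the fragmented read is than a one-fragment read."""
    return read_time_s(size_mb, fragments, **model) / read_time_s(size_mb, 1, **model)


if __name__ == "__main__":
    # The 2 MB file in 10 fragments from the example above.
    print(f"2 MB in 10 fragments: {slowdown(2, 10):.1f}x slower")
    # A much smaller file, where seeks are nearly the whole cost.
    print(f"4 KB in 10 fragments: {slowdown(0.004, 10):.1f}x slower")
```

Under these assumptions the 2 MB file comes out about 3.5 times slower, and the factor only approaches 10 as the file gets small enough that seeking is almost the entire cost. That fits the "worst case" wording: 10 times is an upper bound for 10 fragments, not the typical result. Real disks also seek faster between nearby extents, so the model tends to overestimate for large files.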
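On the preallocation question in the second point, here is a minimal sketch of what a downloader could do: reserve the full file size up front and then write each piece at its final offset, in whatever order the pieces arrive. It assumes Linux and Python 3, and uses the standard `os.posix_fallocate` and `os.pwrite` calls. The helper names `preallocate` and `write_piece` are hypothetical, and nothing here is taken from an actual BitTorrent client. Whether the reserved space ends up contiguous is still up to the filesystem; preallocation only gives it the chance to lay the file out in one go.

```python
import os
import tempfile


def preallocate(path, size_bytes):
    """Create the file and reserve size_bytes of disk space for it up front."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        os.posix_fallocate(fd, 0, size_bytes)
    finally:
        os.close(fd)


def write_piece(path, piece_index, piece_size, data):
    """Write one piece at its final offset inside the preallocated file."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, data, piece_index * piece_size)
    finally:
        os.close(fd)


if __name__ == "__main__":
    piece_size = 256 * 1024  # illustrative piece size, not a protocol constant
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "download.bin")
        preallocate(target, 4 * piece_size)
        # Pieces arrive out of order, but each lands in already reserved space.
        for index in (2, 0, 3, 1):
            write_piece(target, index, piece_size, bytes([index]) * piece_size)
        print(os.path.getsize(target), "bytes written")
```

Running `filefrag` on a file created this way, and on one written piece by piece without preallocation, would show whether it helps on a given filesystem.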