Najszybsza metoda usunięcia 1000000 plików z katalogu
Napisał: Patryk Krawaczyński
03/06/2013 w Administracja, Debug Brak komentarzy. (artykuł nr 415, ilość słów: 1648)
J
aki sposób usunięcia 1.000.000 plików (o wielkości 0 bajtów każdy) z pojedynczego katalogu jest najszybszy? Ostatnio wątek ten został poruszony na serwisie Quora. Zamiast używać popularnych poleceń typu rm
+ find
+ xargs
– użytkownik Zhenyu Lee zaproponował użycie polecenia rsync z pustym katalogiem jako źródłowym.
Pierwszy benchmark wykonano na sprzęcie:
Model: HP DL360 G7 Procesory: 2 (z 16) x Xeon E5620 2.40GHz (16 rdzeni) Pamięć: 24GB 1333MHz Dysk: cciss/c0d0 (cciss0): 300GB (4%) RAID-10 Kontroler: cciss0: Hewlett-Packard Company Smart Array G6 controllers, FW 3.66 System: RHEL Server 5.4 (Tikanga), Linux 2.6.18-164.el5 x86_64, 64-bit
Wyniki testów:
# metoda 1 ~/test $ /usr/bin/time -v rsync -a --delete empty/ a/ Command being timed: "rsync -a --delete empty/ a/" User time (seconds): 1.31 System time (seconds): 10.60 Percent of CPU this job got: 95% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:12.42 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 0 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 24378 Voluntary context switches: 106 Involuntary context switches: 22 Swaps: 0 File system inputs: 0 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 # metoda 2 Command being timed: "find b/ -type f -delete" User time (seconds): 0.41 System time (seconds): 14.46 Percent of CPU this job got: 52% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:28.51 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 0 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 11749 Voluntary context switches: 14849 Involuntary context switches: 11 Swaps: 0 File system inputs: 0 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 # metoda 3 find c/ -type f | xargs -L 100 rm ~/test $ /usr/bin/time -v ./delete.sh Command being timed: "./delete.sh" User time (seconds): 2.06 System time (seconds): 20.60 Percent of CPU this job got: 54% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:41.69 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 0 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 1764225 Voluntary context switches: 37048 Involuntary context switches: 15074 Swaps: 0 File system inputs: 0 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 # metoda 4 find d/ -type f | xargs -L 100 -P 100 rm ~/test $ /usr/bin/time -v ./delete.sh Command being timed: "./delete.sh" User time (seconds): 2.86 System time (seconds): 27.82 Percent of CPU this job got: 89% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:34.32 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 0 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 1764278 Voluntary context switches: 929897 Involuntary context switches: 21720 Swaps: 0 File system inputs: 0 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0 # metoda 5 ~/test $ /usr/bin/time -v rm -rf f Command being timed: "rm -rf f" User time (seconds): 0.20 System time (seconds): 14.80 Percent of CPU this job got: 47% Elapsed (wall clock) time (h:mm:ss or m:ss): 0:31.29 Average shared text size (kbytes): 0 Average unshared data size (kbytes): 0 Average stack size (kbytes): 0 Average total size (kbytes): 0 Maximum resident set size (kbytes): 0 Average resident set size (kbytes): 0 Major (requiring I/O) page faults: 0 Minor (reclaiming a frame) page faults: 176 Voluntary context switches: 15134 Involuntary context switches: 11 Swaps: 0 File system inputs: 0 File system outputs: 0 Socket messages sent: 0 Socket messages received: 0 Signals delivered: 0 Page size (bytes): 4096 Exit status: 0
Drugi benchmark wykonano na sprzęcie:
Procesor: CPU: Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz Pamięć: 4GB Dysk: ST3250318AS: 250G/7200RPM
Wyniki testów:
# metoda 1 Czas: 6m50.638s # metoda 2 Czas: 87m38.826s # metoda 3 Czas: 83m36.851s # metoda 4 Czas: 78m4.658s # metoda 5 Czas: 80m33.434s
Co czyni użycie programu rsync do tego typu operacji najszybszym narzędziem.
Więcej informacji: A faster way to delete millions of files in a directory