Linux Tools to Treat Files as Sets and Perform Set Operations on Them?

Does anyone know of a Linux tool designed specifically to treat files as sets and perform set operations on them, such as difference, intersection, and so on?

Accepted Answer:

Assuming the elements are strings of characters other than NUL and newline (beware, though, that newline is valid in file names), you can represent a set as a text file with one element per line and use some of the standard Unix utilities.
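As a minimal sketch of that representation, a set file can be written with `printf` and normalized with `sort -u` so that each element appears only once. The file names `fruits.raw` and `fruits` and the scratch directory are illustrative assumptions, not part of the original answer.

```shell
# Hypothetical example data; the file names are illustrative only.
workdir=$(mktemp -d)

# One element per line: printf '%s\n' writes each argument on its own line.
printf '%s\n' orange apple lemon apple > "$workdir/fruits.raw"

# sort -u drops the repeated "apple", turning the multi-set into a set.
# The result is also sorted, which is the input order comm and join expect.
sort -u "$workdir/fruits.raw" > "$workdir/fruits"

cat "$workdir/fruits"
```

Keeping the set files sorted and deduplicated up front means many of the commands below can be used without an extra `sort` step.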

Set Membership

$ grep -Fxc 'element' set   # outputs 1 if element is in set
                            # outputs >1 if set is a multi-set
                            # outputs 0 if element is not in set

$ grep -Fxq 'element' set   # returns 0 (true)  if element is in set
                            # returns 1 (false) if element is not in set

$ awk '$0 == "element" { s=1; exit }; END { exit !s }' set
# returns 0 if element is in set, 1 otherwise.

$ awk -v e='element' '$0 == e { s=1; exit } END { exit !s }' set
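Because `grep -q` reports the result through its exit status, it can be wrapped for use in `if` statements. The sketch below assumes a hypothetical helper name, `set_contains`, and made-up sample data.

```shell
# set_contains ELEMENT FILE   (hypothetical helper name)
# Succeeds (exit status 0) if ELEMENT is a line of FILE.
set_contains() {
    # -F: fixed string, -x: match the whole line, -q: no output.
    # -e keeps an element that starts with a dash from being read as an option.
    grep -Fxq -e "$1" "$2"
}

demo=$(mktemp)
printf '%s\n' apple -lemon 'a.c' > "$demo"

if set_contains -lemon "$demo"; then echo "found"; else echo "missing"; fi
# With -F the dot in "a.c" is literal, so "abc" is not a member.
if set_contains abc "$demo"; then echo "found"; else echo "missing"; fi
```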

Set Intersection

$ comm -12 <(sort set1) <(sort set2)  # outputs intersect of set1 and set2

$ grep -xF -f set1 set2

$ sort set1 set2 | uniq -d

$ join -t '' <(sort set1) <(sort set2)  # GNU join: an empty separator makes the whole line the join field

$ awk '!done { a[$0]; next }; $0 in a' set1 done=1 set2
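Note that `sort set1 set2 | uniq -d` relies on neither file repeating a line, since a line that occurs twice in one file is also reported as a duplicate. The following sketch removes that assumption and avoids process substitution, so it should also run in a plain POSIX shell. The helper name `set_intersect` is hypothetical.

```shell
# set_intersect FILE1 FILE2   (hypothetical helper name)
set_intersect() {
    # Deduplicate each input first: a line can then occur at most twice in
    # the combined stream, and exactly twice only if it is in both files.
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -d
}

a=$(mktemp); b=$(mktemp)
printf '%s\n' x x y z > "$a"     # multi-set: x appears twice
printf '%s\n' y z w   > "$b"

set_intersect "$a" "$b"          # prints y and z, but not x
```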

Set Equality

$ cmp -s <(sort set1) <(sort set2) # returns 0 if set1 is equal to set2
                                   # returns 1 if set1 != set2

$ cmp -s <(sort -u set1) <(sort -u set2)
# collapses multi-sets into sets and does the same as previous

$ awk '{ if (!($0 in a)) c++; a[$0] }; END{ exit !(c==NR/2) }' set1 set2
# returns 0 if set1 == set2
# returns 1 if set1 != set2

$ awk '{ a[$0] }; END{ exit !(length(a)==NR/2) }' set1 set2
# same as previous, requires >= gnu awk 3.1.5

Set Cardinality

$ wc -l < set     # outputs number of elements in set

$ awk 'END { print NR }' set

$ sed -n '$=' set

Subset Test

$ comm -23 <(sort -u subset) <(sort -u set) | grep -q '^'
# returns true iff subset is not a subset of set (has elements not in set)

$ awk '!done { a[$0]; next }; { if (!($0 in a)) exit 1 }' set done=1 subset
# returns 0 if subset is a subset of set
# returns 1 if subset is not a subset of set
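Since the `comm` pipeline above succeeds when the file is not a subset, a wrapper with the more intuitive sense can be handy in scripts. This sketch reuses the awk approach under a hypothetical helper name, `is_subset`, with made-up sample data.

```shell
# is_subset SUBSET SET   (hypothetical helper name)
# Succeeds if every line of SUBSET is also a line of SET.
is_subset() {
    # Lines are stored in a[] until awk reaches the done=1 operand, which it
    # processes between the two files. After that, the first line missing
    # from a[] ends the program with exit status 1.
    awk '!done { a[$0]; next } !($0 in a) { exit 1 }' "$2" done=1 "$1"
}

big=$(mktemp); small=$(mktemp); other=$(mktemp)
printf '%s\n' a b c > "$big"
printf '%s\n' a c   > "$small"
printf '%s\n' a d   > "$other"

is_subset "$small" "$big" && echo "small is a subset of big"
is_subset "$other" "$big" || echo "other is not a subset of big"
```

Switching on the `done=1` operand, rather than on the common `NR == FNR` test, keeps the check correct when the first file is empty.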

Set Union

$ cat set1 set2     # outputs union of set1 and set2
                    # assumes they are disjoint

$ awk 1 set1 set2   # ditto

$ cat set1 set2 ... setn   # union over n sets

$ sort -u set1 set2  # same, but doesn't assume they are disjoint

$ sort set1 set2 | uniq

$ awk '!a[$0]++' set1 set2       # ditto without sorting

Set Complement

$ comm -23 <(sort set1) <(sort set2)
# outputs elements in set1 that are not in set2

$ grep -vxF -f set2 set1           # ditto

$ sort set2 set2 set1 | uniq -u    # ditto

$ awk '!done { a[$0]; next }; !($0 in a)' set2 done=1 set1

Set Symmetric Difference

$ comm -3 <(sort set1) <(sort set2) | tr -d '\t'  # assumes no tabs in the elements
# outputs elements that are in set1 or in set2 but not both

$ sort set1 set2 | uniq -u

$ cat <(grep -vxF -f set1 set2) <(grep -vxF -f set2 set1)

$ grep -vxF -f set1 set2; grep -vxF -f set2 set1

$ awk '!done { a[$0]; next }; $0 in a { delete a[$0]; next }; 1;
       END { for (b in a) print b }' set1 done=1 set2

Power Set

All possible subsets of a set, displayed space-separated, one per line:

$ p() { [ "$#" -eq 0 ] && echo || (shift; p "$@") |
        while read r; do printf '%s %s\n%s\n' "$1" "$r" "$r"; done; }
$ p $(cat set)

(This assumes the elements do not contain SPC, TAB (assuming the default value of $IFS), backslashes, or wildcard characters.)
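Where those restrictions are a problem, one alternative is to enumerate the subsets in awk. This is a different technique from the recursive shell function above: a counter from 0 to 2^n - 1 is treated as a bit mask that selects the elements. The helper name `power_set` is hypothetical. Elements may contain backslashes or wildcard characters here, although an element containing a space is still ambiguous in space-separated output.

```shell
# power_set FILE   (hypothetical helper name)
power_set() {
    awk '
        { e[NR] = $0 }                      # remember every element
        END {
            total = 2 ^ NR                  # a set of n elements has 2^n subsets
            for (i = 0; i < total; i++) {
                line = ""; sep = ""; m = i
                for (j = 1; j <= NR; j++) { # bit j-1 of i selects element j
                    if (m % 2) { line = line sep e[j]; sep = " " }
                    m = int(m / 2)
                }
                print line                  # the empty subset is an empty line
            }
        }' "$1"
}

s=$(mktemp)
printf '%s\n' a b c > "$s"
power_set "$s" | wc -l    # 8 subsets for 3 elements
```

The output grows exponentially with the number of elements, so this is only practical for small sets.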

Set Cartesian Product

$ while IFS= read -r a; do while IFS= read -r b; do echo "$a, $b"; done < set1; done < set2

$ awk '!done { a[$0]; next }; { for (i in a) print i, $0 }' set1 done=1 set2

Disjoint Set Test

$ comm -12 <(sort set1) <(sort set2)  # does not output anything if disjoint

$ awk '++seen[$0] == 2 { exit 1 }' set1 set2 # returns 0 if disjoint
                                             # returns 1 if not

Empty Set Test

$ wc -l < set            # outputs 0  if the set is empty
                         # outputs >0 if the set is not empty

$ grep -q '^' set        # returns true (0 exit status) unless set is empty

$ awk '{ exit 1 }' set   # returns true (0 exit status) if set is empty

Minimum

$ sort set | head -n 1   # outputs the minimum (lexically) element in the set

$ awk 'NR == 1 { min = $0 }; $0 < min { min = $0 }; END { print min }' set
# ditto, but does numeric comparison when elements are numerical

Maximum

$ sort set | tail -n 1    # outputs the maximum element in the set

$ sort -r set | head -n 1

$ awk 'NR == 1 || $0 > max { max = $0 }; END { print max }' set
# ditto, but does numeric comparison when elements are numerical
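`sort` compares lexically by default, so `10` sorts before `9`. For sets of numbers, a small sketch using the standard `-n` option of `sort` (the sample data is made up):

```shell
nums=$(mktemp)
printf '%s\n' 9 10 -3 > "$nums"

sort -n "$nums" | head -n 1    # numeric minimum: -3
sort -n "$nums" | tail -n 1    # numeric maximum: 10
```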

All of these are available at http://www.catonmat.net/blog/set-operations-in-unix-shell-simplified/

