#StackBounty: #bash #grep grep with PTR records and domain+TLD match

Bounty: 200

I’m trying to determine whether the domain+TLD is present in a list, after running the host command for an IP.

My script looks like this:

while read -r ip; do
  PTR=$(host "$ip" | rev | cut -d" " -f1 | rev | sed 's/.$//')
  if grep -q "$PTR" list.txt
  then
    echo "Match in list"
  else
    echo "No match in list"
  fi
done <ips.txt

The list.txt will contain:

dns.google
shodan.io

If I run my script for 8.8.8.8, which returns dns.google, the script works as expected. If I run it for 198.20.99.130, it fails (no match) because the result is census4.shodan.io.

Is there a way I can have grep match only if the domain+TLD (in this case shodan.io) is in the list?

While census4.shodan.io should match against list.txt, a domain like shodan.io.example.net shouldn’t.
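One way to get that behaviour (a sketch, not from the post itself) is to cut the PTR name down to its last two labels before the lookup, and make grep match the whole line literally with -x and -F:

```shell
# Sketch: reduce the PTR name to its registrable "domain.tld" (last two
# labels) and require an exact whole-line match in list.txt.
# Assumption: every entry in list.txt is a two-label domain; multi-label
# public suffixes such as .co.uk would need a Public Suffix List lookup.
while read -r ip; do
  ptr=$(host "$ip" | awk '{print $NF}' | sed 's/\.$//')                 # PTR name, trailing dot stripped
  base=$(printf '%s\n' "$ptr" | awk -F. 'NF>=2 {print $(NF-1)"."$NF}')  # keep only the last two labels
  if grep -qxF "$base" list.txt; then   # -x: whole-line match, -F: literal string
    echo "Match in list"
  else
    echo "No match in list"
  fi
done <ips.txt
```

With this, census4.shodan.io reduces to shodan.io and matches, while shodan.io.example.net reduces to example.net and does not.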


Get this bounty!!!

#StackBounty: #shell-script #shell #grep #zgrep Slow performance of zgrep in multiple files

Bounty: 50

I have a 9.8GB gzip file A.gz, and another file, B.txt (79MB), which has some text on each line.
I want to grep for B’s text in A.gz and write the matching lines to a new file.

Initially, I used this command

zgrep -f B.txt A.gz > C.xml

But this command hung, leaving C.xml empty for a very long time.

Then, after googling, I learned that it hangs because B.txt is huge and all of its patterns are kept in a buffer.

So I split the text file into chunks of 20000 lines each:

split -l 20000 -a 4 B.txt B

This created files like Baaaa, Baaab, …

and then I iterated over each file:

cd B
for f in B*; do
  zgrep -f "$f" ../A.gz >> C.xml
done

It is very slow and still running.

Any better approach for this?

Will gunzipping the gz file first improve performance?
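One commonly suggested speedup (a sketch, assuming GNU grep and that the lines in B.txt are literal strings rather than regexes) is to decompress once and run a single fixed-string search, instead of re-reading A.gz for every 20000-line chunk:

```shell
# Decompress A.gz once and stream it into a single grep invocation.
# -F treats every pattern in B.txt as a literal string, letting grep use
# a much faster multi-pattern matcher than the regex engine;
# LC_ALL=C avoids slow multibyte locale handling.
LC_ALL=C gzip -dc A.gz | grep -F -f B.txt > C.xml
```

zgrep itself is just a shell wrapper around a pipeline like this, so gunzipping to disk first mainly pays off if you search the same file more than once.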


Get this bounty!!!