Wget - Downloading Only Specific HTML Files Using Title Tag Information

SpesMelior · 30 September 2018 17:12

Having searched extensively, I have been unable to find a solution, so I would be grateful if anyone can help. What I am trying to achieve is to get Wget to download linked files, but only those that meet a particular criterion: a specific keyword in the title tag, i.e. within <title> and </title>.

Say, for instance, a website had 100 HTML pages on various models of cars, and 10 of these dealt with vintage cars. The title tags of the other 90 would read “Currently In Production” + the manufacturer + the car model; those of the 10 would read “Vintage” + the manufacturer + the car model. I got all 100 pages using wget. What I want is just the 10 pages dealing with the vintage cars.

I have tried everything I could think of, but nothing works. I had thought that using “--follow-tags=” might do the trick, but I couldn’t make that work. I apologise for being so long-winded, but I wanted to explain properly.

It may be that this is outwith the capabilities of wget, but I thought that, as search engines make use of title tag information, perhaps wget did likewise. If anyone can help I would be grateful.
Thank you
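
For illustration only, here is a minimal sketch of one possible workaround, not a wget feature. As far as I can tell, wget's accept/reject options match on file names and URLs, and --follow-tags only controls which HTML tags wget looks at when collecting links, so neither inspects the text of the title. The sketch assumes the 100 pages have already been downloaded into a local directory and saved with an .html extension (wget's --adjust-extension can help with that), then picks out the ones whose title contains a keyword. The names TitleParser, page_title and pages_with_keyword are made up for this example.

```python
from html.parser import HTMLParser
from pathlib import Path


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        # Only the first <title> is of interest.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_title(html_text):
    """Return the stripped title text of an HTML document ('' if none)."""
    parser = TitleParser()
    parser.feed(html_text)
    return parser.title.strip()


def pages_with_keyword(root, keyword):
    """Return .html files under root whose title contains keyword.

    The match is case-insensitive; the directory layout is an assumption.
    """
    matches = []
    for path in sorted(Path(root).rglob("*.html")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if keyword.lower() in page_title(text).lower():
            matches.append(path)
    return matches
```

With the pages saved under a hypothetical directory called mirror, pages_with_keyword("mirror", "Vintage") would return the paths of the vintage-car pages, which could then be copied elsewhere or the rest deleted. This still downloads all 100 pages first; it only filters afterwards.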