iPhone and Linux

Thursday, February 11, 2010

Forum grabber

I read forums in a few different ways. At my computer I use a browser, and on my iPhone I mostly use a forum app named Kaytri. A couple of the forums I visit get very few new threads, though, so I thought it would be nice to get a "daily digest" of the new subjects and see whether anything interesting has come up.

This can be done fairly easily with links or lynx and their -dump switch, but both split the subject line and produce so much extra output that it was simpler to start from scratch.

Starting with wget printing the raw HTML in quiet mode, I looked at the source of a couple of forums and found that most use "thread_title" in their HTML. So it's just a matter of grepping for "thread_title", stripping the tags with sed, and (optionally) removing the leading whitespace with a second sed expression so the titles line up on the left.
wget -q -O - [FORUM] | grep thread_title | sed -e 's#<[^>]*>##g' -e 's/^[[:space:]]*//'
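To see what the filter does without touching the network, here is a minimal sketch that runs the same grep and sed steps over a made-up HTML snippet. The sample markup, the thread names, and the helper name extract_titles are all assumptions for illustration; real forum HTML will differ. It uses [[:space:]] for the leading-whitespace step, which should behave the same in GNU and BSD sed.

```shell
# Hypothetical sample of forum HTML; real markup varies by forum software.
sample_html='<td class="alt1">
    <a href="showthread.php?t=101" id="thread_title_101">Jailbreak questions</a>
</td>
<td><a href="member.php?u=5">someuser</a></td>
  <a href="showthread.php?t=102" id="thread_title_102">Best terminal app?</a>'

# extract_titles (hypothetical helper): read HTML on stdin,
# print one thread title per line.
extract_titles() {
    # keep only lines mentioning thread_title, drop every <...> tag,
    # then strip leading whitespace
    grep 'thread_title' |
        sed -e 's#<[^>]*>##g' -e 's/^[[:space:]]*//'
}

titles=$(printf '%s\n' "$sample_html" | extract_titles)
printf '%s\n' "$titles"
```

Against a live forum, the same helper would sit at the end of the wget pipe in place of the grep and sed commands.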
One way to use this is as a daily forum update: save the output to a file, compare it with the file from the previous run, and send yourself only the new content.
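cd /path/to/tmpfile/directory || exit 1
# .a holds the titles from the previous run, .b the titles from this run
touch .a
[wget command] > .b
# keep only the lines of .b that are not already in .a
# -F and -x compare whole lines literally, so titles are not read as regexes
newcontent=$(grep -vxFf .a .b)
# if there is new content, do something with it
# email it to yourself, send a push notification, etc
[ -n "$newcontent" ] && [do something with $newcontent]
mv .b .a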
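The comparison step can be tried on its own with two hand-written title lists. This is only a sketch: the titles, the scratch directory from mktemp, and the helper name new_titles are assumptions, not part of the script above. It compares with grep -vxFf, so each title is matched as a whole literal line rather than as a regular expression, which matters for subjects containing characters such as [ or *.

```shell
# new_titles OLD NEW (hypothetical helper): print lines of NEW not found in OLD.
new_titles() {
    # -v invert match, -x whole-line match, -F fixed strings, -f patterns from file
    grep -vxFf "$1" "$2"
}

workdir=$(mktemp -d)    # scratch directory for this demonstration
cd "$workdir" || exit 1

# Stand-in for the titles saved by the previous run
printf '%s\n' 'Jailbreak questions' 'Best terminal app?' > .a
# Stand-in for the titles fetched by the current run
printf '%s\n' 'Jailbreak questions' 'Best terminal app?' 'SSH over 3G [solved]' > .b

# grep exits with status 1 when nothing is new, so ignore its exit status
newcontent=$(new_titles .a .b) || true

if [ -n "$newcontent" ]; then
    printf 'New threads:\n%s\n' "$newcontent"
fi

# The current list becomes the baseline for the next run
mv .b .a
```

In a real digest the printf inside the if would be swapped for whatever delivery you prefer, for example piping $newcontent into a mail command if one is configured on the machine.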
This is useful for watching both fast- and slow-paced forums for new subjects; you can then decide whether to visit and jump into the argument or not.
