Aka: How to do POST requests using Curl
My parents, being Catholic, signed up for a newsletter from the Vatican (as one does).
After a year or so of receiving it, they eventually realised they weren’t really reading it, and wanted to unsubscribe.
No problem. At the bottom of each message is the legally required footer:
Cancellare – that sounds about right (why the footer is in Italian when the rest of the newsletter is in English I have no idea).
Clicking Cancel(lare) takes you to this page:
With the default language set to Italian, despite coming from our English newsletter (nice one).
You can change the language at the top from Italiano to Inglese (USA) – although again, why it’s necessary to specify the USA version of English (not just “Inglese”) on a subscription management page, I have no idea.
It then looks like this:
Ok, so far so good.
Now. Just enter your email address (easy) and password, and we can begin.
Oh. What’s the password?!? Well hell, do YOU remember the passwords to every obscure service you signed up to a year ago that you haven’t thought about since?
Minor speed bump.
What about if we try it without our password? After all, unsubscribing should be easy. I mean, The Vatican, of all places, wouldn’t consciously want to SPAM people, would they?
So, enter your email address, click “Unsubscribe” and voila. The page refreshes, and you get a helpful message:
Except nothing actually happens. The unsubscribe email never arrives.
Now, there’s another fairly major niggle here – why the hell is it a double opt-in to unsubscribe? That’s VERY spammy. Lists should be hard to get onto (ie, double opt-in) – so, say, someone else can’t accidentally put your email address on a list. They should be easy to get off of. The Vatican has this completely ass-backwards. It’s spammy.
Ok, anyway, maybe we can get our password, and then change our email address to something that doesn’t work (firstname.lastname@example.org is a personal favourite).
Now that our email address is entered, we can click the password reminder button, “Remind”.
Except the password reminder button doesn’t work either.
You can click it – as many times as you like – and nothing happens. Ever.
So. What next? Well. We could try emailing them directly (no response), or via their contact form (no response). Several times, in fact (no response).
One day, The Vatican in its infinite kindness will get around to fixing this page. In the meantime they’ll keep happily blasting out emails every day, no doubt delighted in how many (angry and frustrated) believers they’re reaching.
However, as soon as the page IS fixed, we’d like to be off the damn list please. And I mean, within the day.
Enter Curl, stage left.
Curl is a nifty little unix app mostly used for getting web pages. It’s very helpful for things like debugging sites, because you can use it to view the headers on a site, like this:
curl -I -L google.com
(-I == just show the headers not the page content, -L == follow redirects)
Which shows lots of groovy useful technical stuff. If you’re into that kind of thing (I occasionally am).
However, a lesser-known use of curl is to make POST requests (like submitting a form), not just GET (getting a web page). Unsurprisingly, being unix, there are a billion other things you can do with Curl, but we’ll stick to POST for today.
To do this, we just have to have a nosey into the page content, find the forms we want to enter the data for, and which fieldnames we want to use.
So, if we look at our Cancellare page, the HTML looks like this:
<FORM action="../options/visnews_it" method="POST" >
  <INPUT name="email" type="TEXT" value="" size="20" >
  <INPUT name="password" type="PASSWORD" value="" size="20" >
  <INPUT name="login" type="SUBMIT" value="Log in" >
  <INPUT name="login-unsub" type="SUBMIT" value="Unsubscribe" >
  <INPUT name="login-remind" type="SUBMIT" value="Remind" >
</FORM>
I’ve stripped out all the other HTML junk (tables, text etc) and just left the form and inputs.
So, you can see – there’s a place to enter email, password, and then three submit buttons. One to login, one to unsubscribe, one to send a password reminder.
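If you'd rather not dig through the page source by eye, something like the following rough sketch can do the stripping for you. `list_form_tags` is a made-up helper name, and it assumes a grep that supports `-o` (GNU and BSD grep both do). It is a quick filter, not a real HTML parser, so tags split across several lines would slip past it.

```shell
#!/bin/sh
# Hypothetical helper: print only the <form> and <input> opening tags
# found in HTML read from stdin, one tag per line.
list_form_tags() {
    grep -i -o -E '<(form|input)[^>]*>'
}

# Against the live page (needs network access) it would be used like this:
#   curl -s http://mlists.vatican.va/mailman/options/visnews_it | list_form_tags

# Offline demonstration on a cut-down fragment of the form:
printf '%s\n' '<table><FORM action="../options/visnews_it" method="POST" ><INPUT name="email" type="TEXT" ></FORM></table>' | list_form_tags
```

The `name` attributes in the output are the field names to pass to curl.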
Why would we want Curl here?
Well, we can make Curl do this form request for us. We don’t have to start up a browser and enter all our details manually. Every day. Forever. Until they fix their site.
The command line goes like this:
curl --data "name=value&name=value&name=value" FORM_ACTION_URL
The form action URL (if you follow their up-and-down–into-the-same-directory bit) is http://mlists.vatican.va/mailman/options/visnews_it
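For the curious, here is a rough sketch of how that relative path resolves. It assumes the form page itself lives at the same address (the page URL isn't shown here, only that the path lands back in the same directory), and `resolve_one_up` is a made-up helper that handles exactly one leading `../` and nothing fancier.

```shell
#!/bin/sh
# Hypothetical helper: resolve a "../something" path against a page URL.
# $1 = URL of the page containing the form (an assumption, see above)
# $2 = relative action path starting with a single "../"
resolve_one_up() {
    page_dir=${1%/*}        # drop the page name, leaving its directory
    parent_dir=${page_dir%/*}   # the leading ".." climbs one directory
    printf '%s/%s\n' "$parent_dir" "${2#../}"
}

resolve_one_up 'http://mlists.vatican.va/mailman/options/visnews_it' '../options/visnews_it'
# prints: http://mlists.vatican.va/mailman/options/visnews_it
```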
We also want to enter our email address, and then finally, click the unsub button.
Thus, the name/value bit ends up:
email=[our email address]&login-unsub=Unsubscribe
If we wanted to get curl to send us a password reminder, we’d just change it to
email=[our email address]&login-remind=Remind
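The only gotcha is the @ character, which should be URL-encoded as %40 in form data. So, instead of “email=bob@smith.com”, you’d use “email=bob%40smith.com”. (Inside crontab the % itself needs a backslash, “\%40”, because cron treats a bare % specially. At a normal shell prompt, leave the backslash out.)
Thus the entire line becomes
curl --data "email=bob%40smith.com&login-unsub=Unsubscribe" http://mlists.vatican.va/mailman/options/visnews_it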
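If you don't fancy encoding things by hand, a minimal sketch like this builds the form body for either button. `encode_email` and `build_payload` are made-up names, and the encoding only covers the @ character (a fuller version would need to handle other reserved characters too).

```shell
#!/bin/sh
# Hypothetical helper: percent-encode the "@" in an email address.
encode_email() {
    printf '%s\n' "$1" | sed 's/@/%40/g'
}

# Hypothetical helper: build the form body.
# $1 = email address, $2 = submit button name, $3 = submit button value
build_payload() {
    printf 'email=%s&%s=%s\n' "$(encode_email "$1")" "$2" "$3"
}

build_payload 'bob@smith.com' login-unsub Unsubscribe
# prints: email=bob%40smith.com&login-unsub=Unsubscribe

build_payload 'bob@smith.com' login-remind Remind
# prints: email=bob%40smith.com&login-remind=Remind

# The result would then be handed to curl (needs network access):
#   curl --data "$(build_payload 'bob@smith.com' login-unsub Unsubscribe)" \
#        http://mlists.vatican.va/mailman/options/visnews_it
```

Reasonably recent versions of curl can also do the encoding themselves: `curl --data-urlencode "email=bob@smith.com" --data "login-unsub=Unsubscribe" URL` should send the same body, since multiple data options are joined with `&`.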
Finally, chuck the whole thing into crontab to run every 6 hours:
3 */6 * * * curl --data "email=[our email addr]&login-unsub=Cancellami" http://mlists.vatican.va/mailman/options/visnews_it > /dev/null 2>&1
and voila. (The “> /dev/null 2>&1” is just dark magic to say “ignore all output, even errors”)
Oh, login-unsub is set to Cancellami just because that’s what the original Italian form had. I figured they’d be more likely to fix the Italian unsubscribe before they fixed the English one (if they’re set up independently).
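One thing to watch when moving a command from the terminal into crontab: cron treats an unescaped % in the command as a newline, so any %40 needs a backslash in front of it there. Here's a small sketch of a helper that does the escaping and prints a ready-to-paste crontab line. `cron_escape` is a made-up name, and bob@smith.com is the same placeholder address as before.

```shell
#!/bin/sh
# Hypothetical helper: put a backslash before every "%" so the text
# survives being pasted into a crontab entry.
cron_escape() {
    printf '%s\n' "$1" | sed 's/%/\\%/g'
}

# The command as it would be typed at a shell prompt (placeholder address).
cmd='curl --data "email=bob%40smith.com&login-unsub=Cancellami" http://mlists.vatican.va/mailman/options/visnews_it > /dev/null 2>&1'

# Print the crontab entry: minute 3 of every sixth hour.
printf '3 */6 * * * %s\n' "$(cron_escape "$cmd")"
```

Another option is to put the curl command in a small script file and have cron call that, which sidesteps the escaping altogether.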
Once the Vatican gets their site working again, we’ll get an email (well, four a day), then we can click unsubscribe and I’ll delete the one line cron job. Easy. Until then, we’ll just delete the spam, happy in the knowledge that our little curl robot is taking care of things for us in the background.