Blowback From the War On Spam
So, the deluge of spam blowback continues. The problem seems widespread enough at this point that I feel like contacting the authors of major anti-spam software and suggesting that they just immediately drop all e-mail with a jslopez@xman.org return path forever. I have added an SPF record to the domain’s DNS in the hopes that this will help other MTAs realize that the e-mail is forged and not send a bounce message, but I haven’t seen much in the way of impact.
Some fun stats:
- Since I created the jslopez@xman.org account in Google Apps, it has received over 920,000 e-mails.
- The total size of the e-mail that has been routed to the Gmail account is 3.75GB. Fortunately, I have a 25GB quota, but at this pace I can expect to exceed the quota given to normal Gmail users by the end of the week!
- Meanwhile, my old mail server continues to receive some jslopez@xman.org mail, although the rate of delivery has tapered off significantly. At its peak I was processing on the order of 500 jslopez@xman.org e-mails per second, and now it is more like two or three per minute.
- My old mail server logs show 550,000+ e-mail delivery attempts to jslopez@xman.org. That is over and above all the e-mails sent to Google Apps.
- My logs were totally overwhelmed by the deluge of spam and so they only go back to the afternoon of the 25th… in other words this is all pretty much after I had created the Google Apps account.
- This means I’ve received roughly 1.5 million e-mails (probably around 5GB in total) since I first started publishing SPF records, which made it trivial to prove that the messages were forgeries. I published the SPF records immediately after adding the MX records for Google Apps, so the nearly 1 million messages that have been sent to the Google Apps account in particular have no excuse for being there.
- I conservatively estimate another 400,000 or so rejects that were lost in my logs. I expect by the end of the day today, jslopez@xman.org will have received on the order of 2 million bounces in total, representing approximately 8 GB of bounce messages.
- Most bounce messages are terser than the original messages, so I suspect this means the total for the original messages that got bounced is measured in tens of gigabytes.
- I’d like to think most spam delivery attempts don’t result in bounces, either because they get through (otherwise, why bother?) or are rejected/swallowed without a bounce (surely some MTAs are correctly configured). This one attack probably represents hundreds of gigabytes if not terabytes of e-mail bouncing all around the Internet.
- Had this bandwidth not been used for spamming the Internet, the spammer could have used it for a good cause: like stealing half a million songs, or torrenting a thousand movies, or watching Internet porn 24/7 for a year.
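For anyone who wants to check the arithmetic behind the list above, here is a small sketch. It uses only the figures quoted in the bullets; treating a GB as 10**9 bytes and applying the Gmail-side average size to every bounce are my assumptions, so the results are ballpark numbers at best.

```python
# Figures quoted in the list above. GB is taken as 10**9 bytes (an assumption).
GMAIL_MESSAGES = 920_000        # messages delivered to the Google Apps account
GMAIL_BYTES = 3.75e9            # total size of those messages
OLD_SERVER_ATTEMPTS = 550_000   # delivery attempts in the old server's logs
LOST_REJECTS = 400_000          # conservative estimate of rejects lost from the logs
EXPECTED_TOTAL = 2_000_000      # bounces expected by the end of the day

# Average bounce size, assuming the Gmail-side average applies everywhere.
avg_bounce_bytes = GMAIL_BYTES / GMAIL_MESSAGES

logged_total = GMAIL_MESSAGES + OLD_SERVER_ATTEMPTS
estimated_total = logged_total + LOST_REJECTS
expected_gb = EXPECTED_TOTAL * avg_bounce_bytes / 1e9

print(f"average bounce size: {avg_bounce_bytes / 1000:.1f} kB")
print(f"logged so far:       {logged_total:,} messages")
print(f"with lost rejects:   {estimated_total:,} messages")
print(f"expected volume:     {expected_gb:.1f} GB")
```

The logged total comes out just under the "roughly 1.5 million" quoted above, and two million bounces at the Gmail-side average size land close to the 8 GB estimate.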
It’d be fun to do some more stats, like estimating how many watts this one deluge of spam likely consumed, just so I can come up with some convoluted way of demonstrating that spammers are “with the terrorists”, but I’ll stop now, because it just makes me want to cry.
All this is making me think that small mail servers need a very efficient way to discard e-mails sent to an invalid recipient. I still haven’t made an embedded database of valid e-mail addresses for my domain, but that is the logical next step. I need to make sure the check is done very early in my e-mail pipeline: before greylisting, domain verification, Bayesian filtering, virus checks, etc. Packages like postfix should have a setting that will allow them to automatically build a cdb database of e-mail addresses and hosted domains whenever they are presented with an LDAP/SQL backend for their datastore.
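To make the "check the recipient first" idea concrete, here is a minimal sketch. Everything in it is hypothetical: the addresses, the `check_recipient` function, and the reply strings are made up for illustration, and a plain in-memory set stands in for the cdb file a real setup would generate from the LDAP/SQL backend. The only point is the ordering: the cheap lookup runs before anything expensive.

```python
from typing import Callable, Iterable, Optional, Tuple

Verdict = Tuple[int, str]

# Hypothetical valid-recipient set. A real setup would load this from a cdb
# (or similar) file generated from the LDAP/SQL backend.
VALID_RECIPIENTS = frozenset({"postmaster@example.org", "alice@example.org"})


def recipient_exists(rcpt: str) -> bool:
    """Cheap membership test, normalised for case and stray whitespace."""
    return rcpt.strip().lower() in VALID_RECIPIENTS


def check_recipient(
    rcpt: str,
    expensive_checks: Iterable[Callable[[str], Optional[Verdict]]],
) -> Verdict:
    """Reject unknown recipients before any expensive check gets to run."""
    if not recipient_exists(rcpt):
        return 550, "5.1.1 unknown recipient"
    # Only now pay for greylisting, domain verification, filtering, etc.
    # Each check returns a verdict to stop the pipeline, or None to continue.
    for check in expensive_checks:
        verdict = check(rcpt)
        if verdict is not None:
            return verdict
    return 250, "2.1.5 ok"
```

With a stand-in check that records its calls, a message for an unknown address is rejected without the expensive stage ever being touched, which is the whole point when most of the traffic is backscatter for an address that does not exist.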
I’m also increasingly thinking I should perhaps change my e-mail config: have my VPS server just serve to filter out invalid spam, and then forward the good stuff to my server at home. It’s insane, but if spamming economics don’t change, I suspect hosting mail for even a small domain may require fairly significant computing resources and bandwidth.
Linux AIO sucks less
So, for the last little while on the Zumastor project, I’ve been working on integrating AIO into the code base in order to absorb some of the latency penalty that we experience from disk seeks.
Critical for this was getting AIO to work with poll(2), because the ddsnapd daemon follows the tried-and-true “poll then do something” loop that allows for efficient, scalable, and relatively simple Unix server design. Unfortunately (or fortunately, if you are familiar with POSIX ;-), Linux’s native AIO doesn’t follow the POSIX AIO spec, and instead implements its own event queue for notification of completion of IO operations. This event queue isn’t exposed as a file, so you can’t poll it. So, I hacked together a library that spawns a separate thread which does nothing but read in events and copy them out to a pipe, so that the main thread can poll said pipe just like any other file descriptor. Ugly? Yes. Wasteful? Yes. Easier to work with than the apparent alternatives? Yes.
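The shape of that hack can be sketched in a few lines of Python. This is not the Zumastor code and it does not touch Linux AIO at all: a `queue.Queue` stands in for the kernel's completion event queue, and the function names and one-byte event ids are made up for illustration. It only shows the relay-thread-plus-pipe pattern that lets a poll loop see events from a source that has no file descriptor.

```python
import os
import queue
import select
import threading


def relay_completions(completions: "queue.Queue", write_fd: int) -> None:
    """Relay thread: copy each completion event into the pipe as one byte."""
    while True:
        event_id = completions.get()
        if event_id is None:            # sentinel: no more events
            os.close(write_fd)
            return
        os.write(write_fd, bytes([event_id]))


def poll_loop(read_fd: int, timeout_ms: int = 5000) -> list:
    """Main thread: the usual poll-then-do-something loop on a descriptor."""
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)
    seen = []
    while True:
        ready = poller.poll(timeout_ms)
        if not ready:                   # timed out with nothing to do
            return seen
        for fd, _events in ready:
            data = os.read(fd, 64)
            if not data:                # EOF: the relay closed its end
                return seen
            seen.extend(data)           # iterating bytes yields the ids


# Stand-in for the AIO event queue (hypothetical; not the real kernel API).
completions = queue.Queue()
read_fd, write_fd = os.pipe()
relay = threading.Thread(target=relay_completions, args=(completions, write_fd))
relay.start()

for event_id in (1, 2, 3):              # pretend three IO requests completed
    completions.put(event_id)
completions.put(None)

events = poll_loop(read_fd)
relay.join()
os.close(read_fd)
print(events)
```

The cost is the same as in the original hack: an extra thread and an extra copy per event, in exchange for keeping the main loop a plain poll loop.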
I got most of the way through the process. I discovered what appears to be some kind of race condition in AIO where the vast majority of the time I was losing completion events if I submitted multiple IO requests at once. I still haven’t tracked it down, but while looking for possible sources of the problem, I discovered a heretofore unknown (well, by those of us on the project at least) syscall: eventfd(2).
eventfd does for AIO what signalfd(2) does for signals. In other words: it does the obvious thing that we wanted in the first place but were too mentally challenged to find. The moral of the story: even if you think a Linux API (AIO in this case) sucks, expect it to suck less, and question why when it doesn’t.
Standards, Standards Bodies, and "complete, utter, unadulterated bullshit"
If you haven’t read it already, run over to Tim Bray’s blog about the ISO BRM around the OOXML standard. He is a quiet, staid, technical expert, representing quiet, staid, Canada. Despite this, he was so disgusted by what happened that he wrote the following:
What Was Bad · The process was complete, utter, unadulterated bullshit. I’m not an ISO expert, but whatever their “Fast Track” process was designed for, it sure wasn’t this. You just can’t revise six thousand pages of deeply complex specification-ware in the time that was provided for the process. That’s true whether you’re talking about the months between the vote and when the Responses were available, the weeks between the Responses’ arrival and the BRM, or the hours in the BRM room. As the time grew short there was some real heartbreak as we ran out of time to take up proposals; some of them, in my opinion, things that would really have helped the quality of the draft. This was horrible, egregious, process abuse and ISO should hang their heads in shame for allowing it to happen. Their reputation, in my eyes, is in tatters. My opinion of ECMA was already very negative; this hasn’t improved it, and if ISO doesn’t figure out a way to detach this toxic leech, this kind of abuse is going to happen again and again.
Blowback
So, with a bit more investigation, it is now clear what exactly was going on with my mail server. It appears that some spammer has decided to send out massive numbers of spams with a forged return path, and said forgery pointed to jslopez@xman.org. As per usual, there are still massive numbers of domains that will bounce such messages, and on top of that there are mailing list managers and vacation programs that automatically respond to the return path of anything they get, so my MTA has been consumed by the blowback/backscatter.
Awesome.
I did some more tweaking, and concluded that my best moves were the following tweaks:
- Reduce the # of slave processes for the MTA to 2.
- Set up an explicit access rule for jslopez@xman.org that causes an immediate rejection and a nice little “don’t be an idiot and bounce forged return paths” public service message.
- Get the accept queue depth as deep as possible for the slave processes.
- Reject any messages without a proper e-mail address in the FROM: envelope.
The killer solution was Google Apps for Domains though. I have registered for the service, updated my MX records, and once that information propagates through the Internets all my domain’s e-mail will get routed to Gmail, which has exactly one registered account: jslopez@xman.org. Gmail is configured to route any e-mails to an unknown address to my mail server. The net effect is that all this backscatter will get swallowed by the Gmail black hole, and everything else will remain outside the event horizon and hopefully get delivered to my mail server at something approaching the speed of light.
The other lesson learned from this is that openldap is slow, so one shouldn’t use it for accessing one’s MTA configuration. I intend to set up a cron job that will periodically dump the contents of LDAP into files and then have postfix just read those files directly. This should prove to be far more scalable and efficient, at the cost of updates being somewhat delayed.
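Here is a rough sketch of what the file-writing half of such a cron job might look like. The LDAP query itself is not shown; `addresses` is whatever list that query produces, and the function name, file prefix, and paths are hypothetical. I am assuming the "address OK" one-entry-per-line text format that postfix lookup tables are built from, with the file indexed afterwards by a separate step. The part worth copying is the write-to-temp-then-rename, so the MTA never sees a half-written file.

```python
import os
import tempfile
from typing import Iterable


def write_recipient_map(addresses: Iterable[str], path: str) -> int:
    """Atomically write a 'key value' lookup file; return the entry count."""
    # Normalise, drop blanks, and de-duplicate so the output is stable.
    entries = sorted({a.strip().lower() for a in addresses if a.strip()})
    directory = os.path.dirname(os.path.abspath(path))
    # Temp file in the same directory, so the final rename stays atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".recipients.")
    try:
        with os.fdopen(fd, "w") as tmp:
            for address in entries:
                tmp.write(f"{address} OK\n")
        os.chmod(tmp_path, 0o644)       # mkstemp creates the file as 0600
        os.replace(tmp_path, path)      # readers see the old or new file only
    except BaseException:
        os.unlink(tmp_path)
        raise
    return len(entries)
```

Run from cron, this trades freshness for speed: a new address only becomes deliverable after the next dump, which is the delay mentioned above.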
Mail DDOS
It appears as though I am experiencing an e-mail based DDOS. As near as I can tell, thousands if not millions of messages addressed to jslopez@xman.org are bouncing around the Internets as I write this. I have no idea why this e-mail address was selected (AFAIK, this address has never been a valid address). Furthermore, the DATA segment of the e-mails appears to be empty. Greylist rejects seem to cause many of the
The net effect of all this was to completely tie up my mail server and for the most part prevent any mail delivery. I’ve now tweaked the server a bit so I do eventually get mail, but it is still rather grim. So far, I’m killing connections to clients after two errors, I’ve trimmed my accept queue depth, and dramatically increased the number of simultaneous connections I will process. The overall effect has been pretty taxing on my mail server, and I still see significant delays in delivery times, so I’m all ears to any brilliant suggestions on how to address this problem.
If you are a mail admin and are wondering why your queues are backed up with tons of jslopez@xman.org e-mail, please, please kill it. I suspect, though, that most of my mail is coming from bots, so I’ll probably need to start adding immediate filtering at connect time that drops suspected bots.
Is this happening to anyone else?
Still Standing
For those of you who still care about Vista, I “upgraded” to SP1 and didn’t encounter any of the problems that a lot of others have had.
The 127 MB download took forever (like 45 minutes despite our pricey DSL connection), and then the installer popped up a dialogue saying that it needed at least 3GB of disk space on the main Windows partition and wouldn’t install until I made room. I’m trying to figure out how a rational person could think that a 127MB download might need 3GB of disk space to complete (yeah, I can see contrived situations where it’d be possible, but come on!). I dutifully cleared away like 15GB of downloads that were just sitting around on my drive, ran the install, dutifully rebooted the system as one would expect from Windows, and then confirmed that the entire install had consumed ~800MB of disk space (I’m guessing a 2:1 compression ratio, binary diffs, plus backing up the old files for rollback pretty much explains all of that).
You know, it is sad enough watching those progress bars behave like meandering drunks stumbling to the next bar during the later hours of a pub crawl, but really, is it so hard to have a reasonably accurate estimate of how much disk space is needed for an install?
Two Speeches
It’s a black and white difference that you can see with your eyes closed.
It's Full Of Stars
Apparently Arthur C. Clarke is rendezvousing with Rama or whatever is out there. Really, there is too much to say to say anything at all. He was truly a unique and interesting man, and his contributions to science cannot be overstated (really, when was the last time you thought about the contributions to science by a man known primarily for writing fiction?).
Two Links
Upon reflection, I don’t think this was a particularly amazing or eloquent speech. In a lot of ways, it was very calculated and a repetition of campaign speeches going back as much as a year (and in some ways even echoing Obama’s 2004 Convention speech). It still brought a tear to my eye. Why? Because this is the kind of response I’ve been hungering for in reaction to the usual “gotcha journalism”, holier-than-thou punditry, politics of division, reducing lifetimes to one unfortunate sound bite or image, focus on the horse race not the challenges, political ADD, and dammit-we-won’t-stop-this-insanity-until-someone’s-career-is-over, campaign against some talking head rather than your opponent, yellow journalism masquerading as political correctness, idiocy that has plagued us for so long, and has basically owned the 2008 presidential campaign process almost before it started.
Seriously, I’d only have been prouder if he’d just walked up to the podium with a sign behind him saying something along the lines of, “A Message For Those Concerned By Ferraro, Wright, etc.”, waited for silence, and then with both hands, emphatically performed the Trudeau salute, then silently stepped down from the podium. Unfortunately, that path leads to the PMO in Ottawa rather than 1600 Pennsylvania Avenue in D.C.
…plus, this was a bit more uplifting.
Never Say Never Again
In what I can only presume is the result of some obsession with an aging Sean Connery, the folks at Mindwire have developed a new kind of feedback device for gaming: electric shock. Yup, you too can pay money to have the pain of crashes, collisions, etc., simulated in the form of electroshock therapy. One wonders how long before it gets bundled with the logical title of choice: “Abu Ghraib: Be The Prisoner”.