At Some Point You Just Have to Laugh

Posted by Christopher Smith Sun, 30 Dec 2007 10:33:00 GMT

…and who better to get it started than Harold and Kumar.

The Glory of the Daily Show Archive

Posted by Christopher Smith Wed, 31 Oct 2007 00:41:00 GMT

You know the greatest thing about the Daily Show Archive? It means I can embed stuff like this to demonstrate just how far off the deep end cable news is:

and know that I’m helping to promote the show and I’m not going to get a DMCA cease and desist letter.

How To Catch Terrorists

Posted by Christopher Smith Thu, 04 Oct 2007 16:30:00 GMT

I like Bruce Schneier. I loved reading Applied Cryptography in the days of my youth, not to mention The Electronic Privacy Papers, not to mention being generally impressed with his work on Blowfish and Twofish and generally liking his other books as well as his Crypto-Gram Newsletter and his various essays. So, it is with some trepidation that I’ve decided to publicly criticize his essay entitled How To Not Catch Terrorists.

First off, I don’t, in fact, take issue with his central thesis: data mining through the general population as a means of generating a list of suspects to be investigated by FBI agents doesn’t seem like a good way to hunt down terrorists. My problem is that it is a straw man argument that likely misrepresents the manner in which authorities are attempting to use data mining.

I know a thing or two about the applications of data mining. While not an expert in the field, I’ve worked in search engine companies for the last five years, three of which I spent in research, where I had a chance to talk to researchers about successes and failures of the field. So I have some sense of how one can usefully (and not so usefully) use data mining techniques. As Bruce states, it is a terribly useful tool in all kinds of situations, but it is far from a magic bullet. You need to know how to use it. His essay seems to assume the government is astonishingly ignorant in this regard, despite investing countless billions on the technology.

I think the central flaw in Mr. Schneier’s thinking likely stems from its starting point, which I suspect is best summarized with this line:

There is something un-American about a government program that uses secret criteria to collect dossiers on innocent people and shares that information with various agencies, all without oversight.

Of course, the truth is, this kind of thing is (with apologies to Robert Wuhl) “as American as apple pie”, which is exactly why Mr. Schneier is fearful of it (I’d say “paranoid about it”, but that implies that his fear isn’t rationally justifiable).

So, let’s assume three things:

  1. The government isn’t completely incompetent at understanding how to employ these technologies (remember, this is all “brought to you by the people who brought you ECHELON”), and more importantly is informed by the multiple experts in the fields of machine learning and data mining that they employ.
  2. Folks working on these programs are highly motivated to catch terrorists.
  3. Folks working on these programs are at least somewhat fearful of the same kind of abuse of powers that Mr. Schneier is, particularly if they think it could be directed at them or their friends and loved ones.

You might take issue with any of those assertions, but they all seem highly plausible to me, although my “American as apple pie” links make me somewhat dubious about #3. Frankly, if the other two aren’t true, we have got way bigger problems to worry about than a little misuse and abuse of data mining and machine learning techniques.

Now, if we go with these assumptions, how might the government employ something like ADVISE? Let me suggest some ways:

Sifting and sorting the raw data

This one comes right from the horse’s mouth. Mr. Schneier quotes Michael Chertoff as saying, “It is an experiment to see how you can better analyze data that you already have, that you’ve already legally collected, to see if you can understand it, sort it and make use of it more readily than simply doing it manually.” Let’s say for a moment you’ve got all this data on 1000 people, and someone tells you, “there is a decent chance one of them is a terrorist with plans to attack innocent Americans in the coming year, could you go through this data yourself and give us a best guess as to who to investigate and in what order?” Okay, I’d probably get a team of people reading over every scrap of data right away. I’d start to collect more detailed data on the top candidates, but I’m not going to start asking for court orders, because all I’d have would be someone’s best guess as to whether this person is a terrorist. Now imagine you have the same data on 300 million people, or worse still, everyone on the planet. Are you really going to assemble a team to go over the data of all those people? Of course not! They’d likely retire and perhaps die before completing the task, and by then the output would be moot anyway. Now, what if you could get a computer to sift through and sort the data such that it could produce a list of the top 1000 candidates? I’d sure be interested in that computer’s list. Sure, I’d know that there aren’t anywhere near 1000 terrorists, and that the odds are only something like 1 in 100 that even one of the candidates is actually a terrorist, so I’m not going to start arresting these people or opening up files on them, but I might assemble a team exactly like in the first example in order to refine the list further. Suddenly, I’ve gone from literally having no actionable data to being able to act on it, albeit in a limited way.

Trimming down a suspect list

Okay, our crack investigative team has identified and arrested some middle man who we know sold supplies to a terrorist with plans to strike. The problem is, this guy sold supplies to a lot of people: some innocent civilians, some organized crime types, and this one terrorist. We also have no idea of where or when he sold supplies to this terrorist. We do have a tape recording of him talking to the terrorist about that crazy waiter at Random Regional Restaurant. This guy has been in business for a while and has been successful, so he has had a LOT of customers over the years. We have a list of all his customers, but it is understandably quite extensive, possibly ten thousand customers over the years. You only have enough resources and time to investigate ten people in more detail. How do you decide which ten to look at? Well, you know that your target either lived in or visited Random Region in the past. You just might want to have a computer program go through the customer list and exclude all the candidates who’d never been to Random Region. You’d know that it is entirely possible that you don’t have complete data on the movements of all the individuals (particularly since some are shady characters who don’t like their movements tracked), but filtering in this manner brings you down to a small enough candidate list that you can do some basic investigations, like calling each of them to see if their voice sounds like the one on the tape. From that, maybe you can identify candidates for further investigation.
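To make that concrete, here is a minimal sketch of the filtering step in Python. Everything in it is made up for illustration: the Customer record, the travel_data_complete flag and the shortlist function are my own hypothetical names, not anything from ADVISE or any real system. The one design point worth noticing is that people with known gaps in their travel records are set aside rather than silently excluded.

```python
from dataclasses import dataclass, field


@dataclass
class Customer:
    # Hypothetical record layout, invented for this example.
    name: str
    regions_visited: set = field(default_factory=set)
    travel_data_complete: bool = True  # False when we know our records have gaps


def shortlist(customers, region):
    """Split customers into confirmed visitors and those we cannot rule out."""
    confirmed = [c for c in customers if region in c.regions_visited]
    cannot_rule_out = [
        c for c in customers
        if region not in c.regions_visited and not c.travel_data_complete
    ]
    return confirmed, cannot_rule_out


customers = [
    Customer("A", {"Random Region", "Metropolis"}),
    Customer("B", {"Metropolis"}),
    Customer("C", set(), travel_data_complete=False),
]
confirmed, cannot_rule_out = shortlist(customers, "Random Region")
print([c.name for c in confirmed])
print([c.name for c in cannot_rule_out])
```

Customer B is dropped because the records say B was never there, while C lands in the second list: not a confirmed visitor, but not someone the data lets you exclude either.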

Prioritizing

You have information suggesting someone might try to strike Metropolis in the next couple of months. You don’t know who, but you’ve narrowed it down to people either in Metropolis or who are visiting Metropolis in the next couple of months and you have some additional information that narrows the candidates down to a list you should be able to investigate in a month, now that you have wiretap authorizations for each of their phones. Here’s the problem: the terrorist could strike a lot sooner than a month. So, the order in which you investigate the candidates is very important, only you have no idea where to start beyond gut instinct! Wouldn’t it be nice to shove that list into a computer and have it sort the list based on the probability that they are a terrorist and how soon they’ll be in Metropolis? Even if you knew the thing was only 70% accurate, you’d take it, because it would still give you better odds of stopping an incident than if you didn’t use it.
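A rough sketch of what that sorting might look like, in Python. The scoring rule here (model score divided by days until arrival plus one) is purely my own assumption, picked because it is the simplest thing that rewards both a high score and an early arrival; a real system would presumably use something better calibrated. The field names are hypothetical too.

```python
def priority(candidate):
    """Higher is more urgent: a likely match who arrives soon comes first.

    'score' is an assumed model output in [0, 1]; 'days_until_arrival' is 0
    for someone already in the city. Both names are invented for this sketch.
    """
    return candidate["score"] / (candidate["days_until_arrival"] + 1)


def prioritize(candidates):
    # sorted() is stable, so ties keep their original order.
    return sorted(candidates, key=priority, reverse=True)


candidates = [
    {"name": "A", "score": 0.2, "days_until_arrival": 0},
    {"name": "B", "score": 0.6, "days_until_arrival": 5},
    {"name": "C", "score": 0.5, "days_until_arrival": 1},
]
for c in prioritize(candidates):
    print(c["name"], round(priority(c), 3))
```

Note that B has the highest raw score but ends up last, because B won’t be in town for days, which is exactly the trade-off the scenario calls for.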

It is not too hard to come up with other scenarios, but these are the first three I could come up with that most closely resemble Mr. Schneier’s original straw man. All of these scenarios would benefit from the application of data mining and machine learning, even if you have crummy false positives and negatives and a target population of one that you are trying to identify. None of these are ridiculous “24”-style scenarios that never happen in reality.

So, what are the problems, beyond motivation, in Mr. Schneier’s reasoning?

Well, first, “base rate fallacy” is only a problem if the cost of a false positive or false negative is high enough to outweigh the benefits of identifying at least some of the true positives. The classic example of this is with medical tests. Yes, if a disease is rare in a general population, even if there is a significant cost for doing subsequent tests, it may make economic sense to perform the test anyway if the savings from a correctly identified patient are astronomical enough. Sometimes only one true positive is enough to justify the expense of mislabeling half of a population. I see lawyers throwing up ads for Mesothelioma all over the place, even though they know most of the time the ads won’t be seen by a single person with the disease and that they’ll undoubtedly encourage several unsavory types to show up at their office and waste costly staff time as they try to pretend to have the disease. Why? Because every single real candidate they do identify is worth a jackpot of money to them.
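The arithmetic behind that argument is easy to sketch in Python. All the numbers below are invented for illustration (a million people, ten true cases, a test that catches 90% of them and wrongly flags 1% of everyone else), and the function names are mine. The point is only that a dismal hit rate and a positive payoff can coexist.

```python
def screening_outcome(population, true_cases, sensitivity, false_positive_rate):
    """Return (true positives, false positives, fraction of flags that are real)."""
    true_positives = true_cases * sensitivity
    false_positives = (population - true_cases) * false_positive_rate
    precision = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, precision


def net_benefit(true_positives, false_positives, value_per_hit, cost_per_false_alarm):
    """Payoff from the real hits minus the cost of chasing the false alarms."""
    return true_positives * value_per_hit - false_positives * cost_per_false_alarm


# Made-up numbers: 10 real cases hiding in a population of 1,000,000.
tp, fp, precision = screening_outcome(1_000_000, 10, 0.9, 0.01)
print(f"flagged correctly: {tp:.1f}, flagged wrongly: {fp:.1f}")
print(f"share of flags that are real: {precision:.4%}")

# Same test, two different payoffs per real hit (also made up).
print(net_benefit(tp, fp, value_per_hit=5_000_000, cost_per_false_alarm=500))
print(net_benefit(tp, fp, value_per_hit=100_000, cost_per_false_alarm=500))
```

With these assumed figures fewer than one flag in a thousand is real, yet the first payoff comes out well ahead and the second comes out behind. Same test, same base rate; the only thing that changed is what a true positive is worth.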

Second, not having a “well defined terrorist profile” assumes that because one is using a computer a deductive reasoning model must be employed to reach a conclusion. Supervised learning methods like SVMs excel at performing classifications based on doing the statistical equivalent of inductive reasoning. Sure, they get it wrong some of the time, but in cases where there are a multitude of factors that help in the decision making process, these methods can often outperform a human working with the same data (and obviously taking far more time). Humans tend to excel at intuitively or logically identifying a few key indicators out of a possible set of millions, but computers excel at statistically identifying a complex model of interactions between millions of key indicators, which is handy if we don’t even know if any of the factors we’re considering have a real direct cause and effect relationship with the prediction we’re looking for.

Finally, there is an assumption that using these techniques necessarily leads to the government surveilling innocent people they’d otherwise have left alone. That is a matter of application rather than intrinsic to the technology. One could just as easily use the technology to identify innocent people that don’t need a file opened up on them. Heck, if we’d had this kind of thing in Hoover’s day, we might have been able to more easily identify the irregularities in how he was selecting candidates for wiretapping. We’re always hearing reports of authorities using racial profiling or just irregularities in one’s behavior leading to inappropriate scrutiny or persecution (remember the insanity over the whole Trench Coat Mafia thing?). Well, data mining and machine learning techniques could not only help identify inappropriate police behavior, but also provide a “second opinion” about whether someone was genuinely worthy of further investigation.

In general, tracking down the needle in the haystack is a very hard problem and one would want all tools available to do the job. Sure, I can see how the technology could be used to infringe upon people’s privacy, but that is no reason to throw out the baby with the bath water. Some basic oversight and rules ought to be sufficient to prevent the worst abuses.

Boston: Neo-Luddite Hot Spot?

Posted by Christopher Smith Fri, 21 Sep 2007 23:19:00 GMT

I find it bizarre that the home of MIT & Harvard (not to mention one of the larger centers for post-secondary education in the world) is somehow run by bureaucratic technophobes. Now, the bureaucratic part isn’t too shocking (we’re talking about Massachusetts here ;-), but someone has got to put an end to the rest of it before someone gets hurt.

I appreciate the fact that Boston Logan airport was where the disastrous events of September 11th, 2001 all started, but that doesn’t give you a license to be an idiot. BTW, in case you haven’t seen it, this is apparently what a bomb looks like. Yup, it’s a bread board with some wires and a battery. Apparently there are some eight or so LEDs positioned on the sweatshirt that flash (‘cause just like movie bombs, real live bombs have lots of menacing looking flashing lights to draw attention to themselves). Oh, and apparently she had some play dough in her hands (although that is far from clear at this point).

Weird people walk into airports all the time. It’s just a fact of life. Hang out in a big airport like Logan for a few hours, and you get a fair sampling of humanity. A portion of humanity looks really weird (at least from any one person’s cultural context). Somehow, you don’t hear stories of kids getting arrested in airports because they’ve got shoes with flashing LEDs in them. As far as I know, no one has had their Nintendo DS confiscated from them. But if it isn’t all packed up in a nice consumer oriented fashion… it must be a BOMB!

And yes, this isn’t the first “improvised electronic device” incident. There is a genuine fear of any electronics that didn’t come preassembled in a nice package at the store.

Some people have defended the authorities on this matter, citing the fact that officers cannot be expected to be familiar with bread boards and other electronic paraphernalia. I’d actually agree they can’t be expected to be familiar with this stuff. What I’d expect them to be familiar with is what bombs look like, and maybe to be a little familiar with the widely reported fashion trends at the city’s world famous technical school. Yes, this might have vaguely resembled a bomb in a summer blockbuster, but it doesn’t take a genius to think that it doesn’t exactly fit the mold of anything particularly dangerous. Seriously, at worst I could imagine the thing being considered a radio of some kind. I’d actually have been okay if an officer had approached her and casually asked her a few questions, just to get an idea of who she was, on the grounds that he thought she looked unusual, but drawing a gun? Seriously?

I’ve seen people make comments that “after 9/11, it’s just stupid to go to the airport wearing this”. Yeah, because we all know that on 9/11 we were attacked by people armed with…. box cutters. Oh, and before 9/11, the worst incidents we’d ever had were from a truck bomb made out of fertilizer and various other chemicals and a van bomb made out of urea pellets and various other chemicals (are we seeing a trend here?). No electronics necessary, just our beloved automobiles.

Sure, you always want to look out for the “new” attack, but honestly, anything that could be done with that bread board could be much more effectively disguised as a cell phone, a children’s toy, or whatever. I doubt anyone on a terrorist mission really wants to take the time to wire up a bunch of flashing LEDs to draw attention to themselves.

And of course in typical Boston fashion she was arrested for “possession of a hoax device”. I’m not kidding you.

The overall sense I get these days is that the Luddites are back. As some kind of blow-back from all the scientific advances of the last half century or so, a mistrust and outright fear of science and technology has taken hold on a level that I’d normally imagined to be the stuff of Hollywood thriller/horror/disaster flicks. This has got to stop, or pretty soon we won’t be able to wear the clothes they sell at ThinkGeek. There goes my idea for picking up my wife at the airport with a scrolling LED belt buckle with her name scrolling by.

P.S.: Note to TSA: the wife says, “thank you”.

Keeping Your Story Straight

Posted by Christopher Smith Wed, 23 May 2007 18:05:00 GMT

It is so hard to keep your story straight. Remember that “we’re fighting them over there so we don’t have to fight them over here” canard? Well, to remind everyone that the terrorist threat is real, our government is now releasing previously classified information showing that in fact the folks over there have been working on fighting us over here.

So, +1 for reminding us that we should be scared and thankful for whatever protection you can provide, -10 for proving that you’ve knowingly been misleading us for two years.

Security and the Culture of Fear

Posted by Christopher Smith Thu, 05 Oct 2006 17:05:00 GMT

So, CBS managed to get their hands on the “No Fly” list. It has 44,000 unique names (quick: if there are 44,000 terrorists out there, can you really believe that fighting “them” in Iraq means there aren’t any left in the US?), some of which are so ridiculously common that undoubtedly they cover thousands of innocent would-be travellers.

Of course, none of this should be a surprise to anyone. The efficacy of a name-based “No Fly” list has been questioned by security experts from the day it was first implemented. There are so many ways it is a bad idea it’s hard to know where to start: terrorists aren’t exactly stupid enough to operate under a known alias; if they are, you should be able to catch them fairly quickly; what are the odds that you know someone is intending a terrorist attack while on a plane without having a more precise fix on their identity than their name; names don’t map to a single person; and then there’s the fact that the list is so widely distributed that they have to keep a lot of names off the list to avoid tipping off targets that they are on their trail.

Given my own name, I’m intensely aware of the problems with name-based identification. I still remember being “Christophe cinq” in my high school French class because there were 6 Christophes in the class (representing about 1/4 of the class). At that same school there was another student with the same first and last name as me (and to add to the problem we had about the same size and build, similar physical traits, our dads worked for the same company, etc.). This was in a school with fewer than 500 students. I’ve had similar problems whenever dealing with bureaucracies managing more than a few hundred people.

This problem is magnified by the fact that certain cultures tend to have limited name space. For example, in Arab culture (bets that there are a lot of Arab names on that list?) most of the population’s first names are drawn from a very small set of names of religious and cultural figures, with an even smaller subset being the most common. As a consequence Arabs are sometimes uniquely identified by citing their family tree (“ibin X ibin Y…”). Combine that with last names being common because families are often large, and you quickly discover that with perhaps 1,000 names you could probably cover a majority of the Arab population (this probably explains how the names of 14 of the 19 dead 9/11 hijackers are on the list).

Now, one could argue that maybe the security checks are more sophisticated than the public is aware of. Maybe if they get a hit on a list that just means the TSA calls up the FBI, sends an ID number or a photo, and clears a person quickly and quietly. Even better, since these days you have to provide an ID number just to buy a ticket, these hits could be prescreened. The only problem is that CBS found 12 people named “Ralph Johnson” who are detained “almost every time they fly”. Here’s a thought: if he’s cleared to fly one time, perhaps this particular Ralph Johnson should be cleared to fly subsequent times, particularly if a new terrorist “Ralph Johnson” has yet to be identified.
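The “clear him once, remember it” idea is simple enough to sketch in a few lines of Python. To be clear, this is my own toy illustration and not a description of how the TSA’s systems work: the screen and clear functions, the ID strings, and the idea of keying on a (name, ID number) pair are all assumptions.

```python
def screen(name, id_number, watch_names, cleared):
    """Decide what happens at check-in for one passenger.

    watch_names: set of names on the (name-only) list.
    cleared: set of (name, id_number) pairs already vetted by a human.
    """
    if name not in watch_names:
        return "board"
    if (name, id_number) in cleared:
        return "board"  # same name as a listed person, but already vetted
    return "manual review"


def clear(name, id_number, cleared):
    """Record that this particular person has been vetted."""
    cleared.add((name, id_number))


watch_names = {"Ralph Johnson"}
cleared = set()

print(screen("Ralph Johnson", "ID-001", watch_names, cleared))
clear("Ralph Johnson", "ID-001", cleared)
print(screen("Ralph Johnson", "ID-001", watch_names, cleared))
print(screen("Ralph Johnson", "ID-002", watch_names, cleared))
```

The first Ralph Johnson gets pulled aside once and boards normally afterwards, while a different Ralph Johnson with a different ID number still gets the manual check the first time through.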

The truth that we all know is that a lot of the security measures that have been instituted since the 9/11 attacks, particularly those related to air travel, serve more to instill the public with confidence about their safety rather than to provide a real security benefit. Building up the public’s confidence has been necessary because politicians and the media have been self-servingly fanning the flames of the public’s fear rather than appealing to reason. The end result is unreasonable fear and unreasonable security protocols that if anything harm public safety by increasing paranoia without providing any practical security benefit.

The government isn’t the only source of this “all show and no go” approach to security. I’ve seen news reports raising the panic flag over the relatively easy access contractors have to small quantities of cesium, freaking out over “terrorism futures” markets, and playing to xenophobia by trumpeting how US ports will be run by a UAE-based conglomerate. Appealing to the public’s irrational fears is good business, regardless of whether you are a politician trying to convince people that only you can protect them, a news outlet trying to get eyeballs, or Roger Corman. At least in the latter case the audience knows going in that it’s just a fantasy.

It’s time for everyone in the theatre to stand up and tell the people shouting fire to shut up. We simply don’t have the resources to waste on feel-good measures that accomplish little if not nothing. Security is a business that requires the same cold and calculating process that is employed by those most successful in overcoming it.

Look Up

Posted by Christopher Smith Mon, 11 Sep 2006 16:00:00 GMT

See, in the last few years, we’ve stumbled… and when you stumble a lot, you… you start looking at your feet. Well, we have to make people… lift their eyes back to the horizon, and see the line of ancestors behind us, saying, “Make my life have meaning.” And to our inheritors before us, saying, “Create the world we will live in.” I mean, we’re not just… holding jobs and having dinner. We are in the process of building the future.
—John Sheridan, Babylon 5


It’s hard not to think about what happened five years ago today. Leave aside the horrific act of violence, the likes of which had never been seen before on US soil; the all-out political and media blitz makes it pretty much impossible to avoid confronting one of the more painful moments in the world’s recent history.

Surprisingly, remembering that day is difficult. Oh, I can remember a lot of the specific events of my day, but remembering how I felt is far more difficult. I honestly thought I felt much the same as I do now, and to a certain degree that is true. I still feel sorrow and anger when I think about that day, but it’s not the same kind of feeling. I didn’t realize this until I chanced upon a YouTube clip of Letterman’s interview with Dan Rather six days after the attacks.

These two men were candidly emoting the rawness of the moment, and it started coming back to me: both the feelings on the day of the attacks, and the emotional chain reaction that followed in the coming days, weeks and months.

September 11th, 2001 is often referred to as the “day that changed everything”, and several cynics have (correctly) pointed out that a lot of things didn’t change. The big change, I think, was at the psychic level. The psyche of the world, and the US in particular, was scarred that day, and from where I stand, I don’t think we’ve moved past it. But I think the time for that has come.

We’re going to hear a lot of speeches from a lot of people today. I am fearful that many of them will reflect where we are, rather than where we need to be. We stand as a nation and a world sharply divided, with the focus on the minutiae of day-to-day struggles for tactical advantage over one another. We need to be a nation and a world united and focused on a strategic vision to build the world our ancestors’ struggles demand and our inheritors deserve.

It would be nice if someone would rise up and seize that vision. I think the world is ready for it, that we have been ready for it for five years, but the horrors of that day have gotten the better of us. Rather than striding forward we’ve been stumbling, and I’m afraid the odds are against someone helping us find our stride again. I think we’re going to have to do it ourselves. Sometimes leaders inspire greatness, but I often feel humanity’s best moments come when greatness inspires leaders.

So I encourage everyone to seize the pathos of the day and harness its energies to the hope, courage and confidence needed to build the future we want to have. Let’s make our kids proud.

NOTE: Since posting this article, I’ve noticed that CBS has demanded and achieved the removal of this interview from YouTube. Unfortunately, when going through CBS’s website, I can’t find this rather historic video reel (actually I can’t find anything from that evening’s show). I’ve found the video on vodpod, and updated the link accordingly, but for the life of me I can’t believe this video isn’t prominently available on CBS’s site.