2011年12月

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL

Which method is best to select values present in one table but missing in another one?

This:
SELECT l.*
FROM t_left l
LEFT JOIN
t_right r
ON r.value = l.value
WHERE r.value IS NULL


,this:
SELECT l.*
FROM t_left l
WHERE l.value NOT IN
(
SELECT value
FROM t_right r
)


or this:
SELECT l.*
FROM t_left l
WHERE NOT EXISTS
(
SELECT NULL
FROM t_right r
WHERE r.value = l.value
)


Differences between the methods

These methods are quite different.

First of all, LEFT JOIN / IS NULL and NOT EXISTS are semantically equivalent, while NOT IN is not. These method differ in how they handle NULL values in t_right

LEFT JOIN is guaranteed to return every row from t_left, and then filtering is applied to the values returned from t_right. If for some row in t_left there is no corresponding row in t_right (which means no row with that exact value is present in t_right), the row from t_left will be returned once, and the NULL values will be substituted instead of t_right‘s actual values.

Since NULL values can never satisfy an equality JOIN condition, the NULL values returned by the query are guaranteed to be substituted by the LEFT JOIN, not fetched out of the actual t_right‘s row. This means that LEFT JOIN / IS NULL is guaranteed to return at most one row from t_left, and these row’s value is not equal to one of those in t_right.

The same holds for NOT EXISTS. Since it’s a predicate, not a JOIN condition, the rows from t_left can only be returned at most once too. EXISTS always returns TRUE or FALSE and it will return TRUE as soon as it finds only a single matching row in t_right, or FALSE, if it find none.

NOT EXISTS, therefore, will return TRUE only if no row satisfying the equality condition is found in t_right (same as for LEFT JOIN / IS NULL).

Note that NULL values do not safisfy the equality conditions, so both LEFT JOIN / IS NULL and NOT EXISTS will always return rows from t_left that have value set to NULL, even is there are rows with value IS NULL in t_right.

NOT IN, however, behaves differently.

IN predicate (unlike EXISTS) is trivalent, i. e. it can return TRUE, FALSE or NULL:

  • TRUE is returned when the non-NULL value in question is found in the list
  • FALSE is returned when the non-NULL value is not found in the list and the list does not contain NULL values
  • NULL is returned when the value is NULL, or the non-NULL value is not found in the list and the list contains at least one NULL
  • value

IN predicate does not give a definitive answer to whether or not the expression is contained in the list as long as there are NULL values on either side of the expression, returning NULL instead.

This of course makes no difference when using the positive form of NULL: predicates returning NULL are filtered out by the WHERE clause as well as those returning FALSE.

However, NOT IN is different, since negation of NULL is NULL as well.

That’s why NOT IN condition will never hold for any list with a NULL value in it.

  • If a row is found in the list, IN will return TRUE and NOT IN, therefore, will return FALSE
  • If a row is not found in the list, IN will return NULL, and NOT IN on its turn will also return NULL

Both conditions will of course be filtered out by the WHERE clause.

Summary

In SQL Server, NOT EXISTS and NOT IN predicates are the best way to search for missing values, as long as both columns in question are NOT NULL. They produce the safe efficient plans with some kind of an Anti Join.

LEFT JOIN / IS NULL is less efficient, since it makes no attempt to skip the already matched values in the right table, returning all results and filtering them out instead.



The original posting URL: http://explainextended.com/2009/09/15/not-in-vs-not-exists-vs-left-join-is-null-sql-server/

CEO Friday: Why we don’t hire .NET programmers

Written by David Barrett
March 25, 2011 at 7:00 am
Posted in CEO Friday, Job
Original Post URL: http://blog.expensify.com/2011/03/25/ceo-friday-why-we-dont-hire-net-programmers/






As you might know, we’re hiring the best programmers in the world. Sure, everyone says that. But my coders will beat up your coders, any day of the week. For example, Mich is barely 5 foot tall, but is a competitive fencer. Witold is a 6’3″ former professional hockey player. Nate practices knife fighting for fun. And they’re pretty decent programmers, too.

But finding such people is more than a full time job. Indeed, I spend about half my time on it, while Lisa spends about all hers. And on top of that we have an army of recruiters out scouring the globe, leaving no resume unturned. Taken all together, this creates a lot of resumes. So many that we don’t look at them, because resumes — let’s be honest — are totally worthless. Rather, we pay all our attention to a few key questions we ask every candidate, each aiming at uncovering a single key point: is this someone we really want to work with?

See, experience is cheap. All it takes is time. Skill is harder, but really only requires hard work — a lot of people can get that. But attitude. You either have it, or you don’t. The right sort of person is so passionate about coding, they can’t be stopped from doing it. They typically started before high school — sometimes before middle school — and never looked back. They write everything from assembly to jQuery, on PCs to mobile phones, doing hard core computer graphics to high level social networking. They’ve tried everything.

Everything, that is, but .NET.

Now let me clarify — .NET is a dandy language. It’s modern, it’s fancy, it’s got all the bells and whistles. And if you’re doing Windows Mobile 7 apps (which the stats suggest you aren’t), it’s your only choice. But choosing .NET is a choice, and whenever anybody does it, I can’t help but ask “why?”

Don’t get me wrong: .NET on your resume isn’t an instant showstopper. But it will definitely raise questions during the phone screen, for reasons that are best explained by simile:

Programming with .NET is like cooking in a McDonalds kitchen. It is full of amazing tools that automate absolutely everything. Just press the right button and follow the beeping lights, and you can churn out flawless 1.6 oz burgers faster than anybody else on the planet.

However, if you need to make a 1.7 oz burger, you simply can’t. There’s no button for it. The patties are pre-formed in the wrong size. They start out frozen so they can’t be smushed up and reformed, and the thawing machine is so tightly integrated with the cooking machine that there’s no way to intercept it between the two. A McDonalds kitchen makes exactly what’s on the McDonalds menu — and does so in an absolutely foolproof fashion. But it can’t go off the menu, and any attempt to bend the machine to your will just breaks it such that it needs to be sent back to the factory for repairs.

Instead, we look for a very different sort of person. The sort of person who grew up cooking squirrels over a campfire with sharpened sticks — squirrels they caught and skinned while scavenging in the deep forests for survival. We don’t want a short order chef, we want a Lord of the Flies, carried by wolves into civilization and raised in a French kitchen full of copper-bottomed pots and fresh-picked herbs. We need people who can not only cook burgers, but cook anything, from scratch.

See, Microsoft very intentionally (and very successfully) created .NET to be as different as possible from everything else out there, keeping the programmer far away from the details such that they’re wholly and utterly dependent on Microsoft’s truly amazing suite of programming tools to do all the thinking for them. Microsoft started down this path when they were the only game in town, explicitly to maintain their monopoly by making it as hard as possible to either port Windows apps to non-Windows platforms, or to even conceive of how to do it in the first place.

This decision — or this mandate for incompatibility, perhaps — has produced countless ramifications. Small things, like using backslashes in file paths rather than forward slashes like any dignified OS., or using a left-handed coordinate system with DirectX instead of right-handed as was used since the dawn of computer graphics. Big things, like obscuring the networking stack under so many countless layers of abstraction that it’s virtually impossible to even imagine what bytes are actually going over the wire. And a thousand other things in between: programming tools that generate a dozen complex files before you even write your first line of code, expensive servers that force a remote GUI terminal on you to do essentially anything despite a few keystrokes being perfectly adequate for everybody else, a programming culture almost allergic to open source licensing. The list goes on and on.

None of this makes you a “bad programmer”. All these differences are perfectly irrelevant if you just want to make 1.6 oz burgers as fast as possible, and commit the rest of your career to an endless series of McDonalds menus. But every day spent in that kitchen is a day NOT spent in a real kitchen, learning how to cook real food, and write real code.

Even worse, every day spent learning a Microsoft kitchen takes TWO days to unlearn, meaning once you get a reasonable way down the .NET career path, there is almost no going back. You become so steeped in tools and techniques that have absolutely no relevance outside of .NET that you are actually less valuable to a startup than had you just taken a long nap.

So what’s the moral of this whole story? Two things:

If you ever want to work in a startup, avoid .NET. It does you no favors.
If you are a startup looking to hire really excellent people, take notice of .NET on a resume, and ask why it’s there.

It might sound harsh, and it is. But sometimes we only have squirrels to eat, and we need them to taste good.

-david (Follow us at https://twitter.com/expensify)

Saturday edit: Wow, this post has hit quite a nerve. Had I known it was going to take the internet by storm I probably would have been a bit more careful with my word choice (especially referring to .NET as a language, doh!). But just to clarify a couple points, 24-hours into the storm:

Yes, .NET developers are great at what they do. It’s the most modern platform for application development on the planet. Microsoft has always produced the best tools for building internal business applications, and .NET is their masterpiece. There’s a reason why they own that space; they earned it. That space employs millions of people, and those people are unquestionably the masters at what they do.
But what they do is very, very rarely startups. I’m sorry if that sounds offensive, but that’s a simple fact. You can measure this yourself: go to any list of startups, maybe look at YCombinator’s recent graduate class, or Scoble’s list of picks for 2010, or Sequoia’s list of seed companies — or go find a list you like better. Do a curl (or your .NET equivalent) on each domain, and see how many are running a Windows server: I think you’ll find the fraction very small. Don’t get me wrong: there are a lot of people who aren’t interested in working at a startup, for a lot of very valid reasons. But if you are the odd person who is interested, it’s worth asking: why do so few use .NET?
Because .NET is designed to extend, not disrupt. The same could largely be said about J2EE, but in general anybody who “grows up” with these self-contained, highly-automated platforms can’t help but view computers in a different way than those who start from a lower, less automated, more difficult level. (As one of the coherent commentators says below, Joel Spolsky himself laments schools teaching Java with the same basic reasoning of my article above, albeit more diplomatically stated.) .NET is designed to tightly integrate with and seamlessly extend the Microsoft stack in extremely powerful but ultimately incremental ways. Again, there’s nothing wrong about that if that’s what you want to do. But if that’s not what you want to do, then .NET probably isn’t the right choice, as evidenced by how few people in the startup world choose it.

Again, this isn’t a rant against .NET — it’s fine. It’s not even a rant against .NET developers being incapable of learning — they can learn as well as anyone. It’s a rant against .NET teaching the wrong things for startups. And about how people who use it for too long, can’t help but learn those things. This is a rant about how startups, by definition, need to think of things from new angles, and those angles typically don’t involve .NET.

Anyway, I was going to leave all this to blow over before making any comments in a follow-up post next Friday. Indeed, for all the outrage and fury, this is actually common knowledge amongst startups, and I was going to support that with a bit more data to win over the skeptics (at least, those who were open to listening).

But I’ve seen some recent comments that this post might have upset and offended the SMB community that we serve, and that I cannot abide. For that, I’m truly sorry. As someone who grew up using and ultimately becoming frustrated with the limitations of Microsoft tools, and grew up submitting and hating small-business expense reports, I hope I can convince you that we’re fighting for the same goals. Even if our tools and techniques might differ wildly.

Sunday edit: Incredible that this post is still going strong, on a weekend no less. But what’s most interesting is how the comments have gradually shifted over time. Granted, me even pointing this out will probably have an effect on comments going forward, but the general trend has been:

First wave: Raw, incoherent hate, constructed from near-pure profanity. I took the liberty of deleting most of this, as this is a PG-13 site. Perhaps 1% of the first wave agreed with me.
Second wave: “Reasoned” hate, though most of that reasoning was done by extrapolating my misuse of one word (“language” instead of “platform”) into every conceivable sin. I’d say maybe 15% of the second wave agreed with me (if weighted by actual coherent text).
Third wave: A very heated discussion. It would appear that every possible insult had already been slung so there was nothing left to do but actually discuss the merits of the post. I’d say maybe 25% of the third wave agreed with me.

It’s harder to get a sense of agreement/disagreement from Twitter as most people just retweet without comment. But I’d say those who do make comments have fit the rough curve above.