SecureSql

Alright, after having seen a fair share of SQL injection bugs, and an even larger amount of useless but usability-damaging attempts as solving it, I'm going to write a guide to securing your SQL against injections, as well as a few related security measures.

Don't worry, it's not hard at all, and it doesn't require a lot of code.

Think Like a Hacker

A hacker, in it's modern sense, is a person who tries to gain access to resources that he is not supposed to have access to (in the digital world). Some do it because they are bored, some do it to prove something, steal something, spread some message or for revenge... or for any reason, really. You may think of them as evil, but in a way they're doing us a favour by giving us an incentive to keep our sites secure. Without them, nobody would bother, and we would all be very vournable.

In order to keep these people out, you must learn how to think like them. This will allow you to discover holes in your own security as though you were trying to break into it. So let's try it out.

String sql = "SELECT * FROM Users WHERE username='" + user + "' AND password = '" + pass + "'";

What is wrong with this example? A whole lot. For usual inputs, such as user="Bart", pass="sk8te", it acts like it's intended to and becomes:

SELECT * FROM Users WHERE username='Bart' AND password='sk8te';

But if we put on the black hat, and put a string like the following into user :

'; DROP DATABASE ImportantDatabase; --

The full SQL becomes:

SELECT * FROM Users WHERE username=''; DROP DATABASE ImportantDatabase; -- Bart' AND password='sk8te';

What happens here is that some user is selected, but ignored. Then a database is deleted. And then there's a comment (begins with -- and ends at the end of the line). Back up there, did we just delete an entire database? This is certainly not an option you would like to give to your site's visitors.

Quoting

So somebody discovered this, and thought of a fix. They wrote their code like:

String sql = "SELECT * FROM Users WHERE username='" + user.replace("'", "\'") + "' AND password = '" + pass.replace("'", "\'") + "'";

Since the backslash makes the tick a literal character rather than making it signal the end of the sting, now it feels nice and cozy. But since we're still wearing black hats, let's try this:

\'; DROP DATABASE ImportantDatabase; --

After the quoting above, it becomes:

\\'; DROP DATABASE ImportantDatabase; --

Which makes the SQL become:

SELECT * FROM Users WHERE username='\\'; DROP DATABASE ImportantDatabase; -- Bart' AND password='sk8te';

That is, select the user with the name \ and then delete ImportantDatabase. So our own quoting scheme didn't work. And if you think that taking care of the backslash case will help you any, you're wrong. There are tonnes of subtle ways around it, some related to unicode and specific implementations details of unicode libraries. It varies between languages, libraries, sql driver versions and brands. You're never going to make your own quoting scheme catch all of it.

Luckily most SQL libraries have a built in function for quoting. Most of the time, you can count on this to make your strings harmless. So let's use quoting:

String sql = "SELECT * FROM Users WHERE username='" + sql.quote(user) + "' AND password = '" + pass + "'";

We're safe. No, wait, we forgot to quote the pass parameter. Forgetting just one in a large code base is easy, and very hard to discover, especially in much longer SQL statements, which often occur in real code. And one is all it takes.

I've Fixed It! ... NOT

Before offering a solution to the quoting problem, let's look at some other broken security schemes. Some SQL libraries refuse to execute a strings with multiple statements in it, or refuse to execute updating statements where a query is expected. Some database administrators think they can fix it by providing the appropriate access rights.

This does indeed fix the multi-statement exploit we used above. But there are other ways to exploit SQL.

\' OR username='Administrator' --

I got the power! I'm sure by now you can imagine what kind of SQL that would result in. Basically this allows us to log in as Administrator without providing a password. And the two mentioned attempted security meassures do not do anything to fix this. What they do provide is limitations to what you can do as a user of these systems, slowing down development.

If you have a setup where administrators cannot log in from remote computers, you have fixed the exact above problem, but you could still steal information from other users using the same approach.

There are even examples of blacklisting or whitelisting SQL keywords and characters, which leak through even to end users. This is utterly useless and annoying. Imagine a guy named O'Toole who can't log in because you've banned ticks. We need something better.

Parameterized Queries

To the rescue comes parameterized queries. What they do is seperate the command from the input, making it impossible for the input to be executed. It's pretty much fool proof, and it's much more readable for large SQL statements.

String sql = "SELECT * FROM Users WHERE username=@user AND password=@pass;";
ParameterizedStatement statement = connection.createParameterizedStatement(sql);
statement.setParameter("user", user);
statement.setParameter("pass", pass);
statement.execute();

The parameters are automatically quoted. Notice that the sql is one single string literal, with no concatenation going on. It's a lot easier to read, and very hard to mess up. This is pseudo-code, the actual code depends on what language and libraries you've chosen.

Prepared statements are a subset of parameterized statements. The benefit is that the actual command is only sent and evaluated once, and it can then be reused multiple times for different parameters.

Stored Procedures

A stored procedure is a bit like a prepared statement, but more flexible. You create it once and store it in the database for later use. When you need it, you simply supply it's name and the appropriate parameters. That way you avoid embedding SQL in strings in some host language entirely, which is a nice property. One of the major benefits is that you get all syntax errors immediately, instead of later when your program decides to execute the query. Use these or parameterized statements as you please.

There's a problem unrelated to security here though, namely that you're going to lock yourself in real tight to your current database vendor, since stored procedure syntax is very database-specific. It also might work badly with your version control software.

That concludes my advice on avoiding SQL injections.

A Deroute

There is one glaring problem in my example code, unrelated to SQL injection, that is also very common. The passwords are stored in plaintext. Should a hacker ever gain access to a copy of the user table (and trust me, this happen much too often), he can read their passwords, log in as them (possibly gaining access to even more information and ressources), or expose their passwords to the public. Most users have one or two passwords that they use for everything. While this is of course their own mistake, any site that wants to be taken seriously must perform some sort of damage control in the event of an attack. At least make the problem local to your site. Besides, you have no business storing the passwords of users - it should be their secret alone, as much as possible.

One really easy solution is to hash your passwords. A hashing function takes something akin to a fingerprint of it's input. Like it's pretty hard to construct the original person from a fingerprint, it's pretty hard to construct the original data from a hash. You can sometimes find collisions, however, and to avoid that you need a salt. A salt can be as simple as some string you append to your input. I won't go into the details here, but make sure your salt is unique to each user. This ensures that even if two users chose the same password, the hash will differ. Always check the vulnerabilities of the hashing algorithm you chose.

An even better approach to passwords is to use OpenID , which means you don't have to deal with passwords at all. And the user will only need to remember one username/password, which can then be used to log onto any page that supports OpenID, securely. You might want to give your users both options. Show them that you care about keeping their credentials a secret.

Conclusion

Securing your SQL queries is easy. It's not so much SQL that's the problem, as the process of building queries dynamically. It is really a more general problem, that exists wherever it's possible to execute user supplied values. Many languages (including some variants of SQL!) have a function called eval that executes an arbitrary string. Avoid it like the plauge, or at least only use it as the last measure, after much consideration. If you're feeling a little lazy, don't use eval! Lazyness in combination with eval is a recipy for disaster.

Personally I have not always followed this advice myself, and I'm sure much of the code I've produced so far has security holes. It makes me uneasy, and rightly so. But from now on... If only SQL libraries didn't directly encourage security holes, it would be a lot easier not to make these mistakes. You might wonder why it hasn't changed since the first SQL libraries, since the database companies have known this exploit for decades (sure there are alternative methods, but the one still taught in most tutorials is the wrong one). But it would break backwards compatability. At the very least they could deprecate it.

If you disagree with what I've written above, you're very welcome to contact me (using the link in the upper right of this page). I want to be proven wrong, it's a great way to learn about these things.