Addslashes, Stripslashes, mysql_real_escape_string?

Hi all,
I have come across a very common problem, or should I say, I am in a dilemma about something very common. That’s about the use of addslashes(), stripslashes & mysql_real_escape_string . Now I have a couple of forms. The first one takes user input, validates it & if everything is all right, takes them to the second form, which is the Confirmation form. As soon as they confirm that the data on this form is right, the info is posted to another page, which inserts the data in Mysql Database. I am using PHP to do this. I am using a function & passing this data through that function before it gets added.

function sql_quote_no( $value )
{
     if( get_magic_quotes_gpc() )
    {

          $value = stripslashes( $value );
    }
    
    if( function_exists( "mysql_real_escape_string" ) )
    {
          $value = mysql_real_escape_string( $value );
    }
    
    else
    { 
          $value = addslashes( $value );
    }
    return $value;

} 

The above is the function that I use. Now say I have entered the following message initially in the form:
a’;DELETE FROM table_a WHERE m_id = 1""

I want to allow my visitors to use " & ’ in the form & even show it out the same way when I pull info from the DB, but want to maintain it secure in the process. If I echo out the message after passing it through the function above & just before inserting it into the DB, it shows me the following:

a’;DELETE FROM table_a WHERE m_id = 1""

But when I visit my database, I see the message in the DB as:

a’;DELETE FROM table_a WHERE m_id = 1""

It does not show me the slashes. Which means, just before adding into DB, it is adding slashes to it but when it added that info into the DB, it removed the slashes automatically. 2 questions:

  1. Now does this make it secure? (I don’t think so)
  2. Also is the DB supposed to contain the slash () from our addslashes function? I mean the current data in DB is great considering that I don’t have to worry about the search results when I implement a search function in the website, but is the data supposed to contain the slashes & I can show the data out by using stripslashes?

Please let me know point by point, the correct answer. My whole website security depends upon this info & I really want to understand the pros & cons of the method that I am using above. I tried php.net & other sites via google, but this question does not seem to be answered or the answers were not clear.

Would really appreciate an early reply. Thank you in advance.

Your current function seems fine to me. It even takes the (very confusing) magic_quotes_gpc setting into account. In case you’re unsure what magic_quotes_gpc is, it’s a setting in PHP that automatically backslashes all user input variables (ie. values in $_GET, $_POST and $_COOKIE, hence “gpc”). In fact, this feature was added in an attempt to automatically prevent novice PHP users to (unknowingly) incorporate SQL injection holes in their code. Unfortunately, for people who know what they’re doing, this is just a huge nuissance.

The behavior that the query sent to your DB contains escaped quotes but the actual result as it is stored in the DB does not is completely normal, and it’s exactly the reason why you should always make sure to escape quotes in queries.

The backslashes are added to make sure your value doesn’t break the syntactical rules of your query: a backslashed single quote is used to signify a single quote where an actual single quote can’t be used because it would break the query syntax. Because both you and the DB know this convention, this makes sure that the database can correctly read your value from the query: it will remove the backslashes from the quotes again when parsing the query.

The reason your SQL injection attempt failed (ie. it inserted the exact value as the user entered it rather than breaking the query) is exactly because you have escaped the input with mysql_real_escape_string, and that’s a good thing.

Consider how your query looks after having applied mysql_real_escape_string to the user input:

INSERT INTO yourTable(..) VALUES('a\';DELETE FROM table_a WHERE m_id = 1');

It’s essential here that the quote after the a gets escaped with a backslash, otherwise that quote would end the starting quote after “VALUES(” which would break the query. Even worse, it would enable your users to break your queries, and to insert their own querystring parts. But, you applied mysql_real_escape_string, thereby making sure that doesn’t happen. MySQL will unescape the quotes again when parsing the query, and correctly read the value

a';DELETE FROM table_a WHERE m_id = 1

from that query, exactly as it was entered by the user.

A more dangerous SQL injection scenario would be the following query:

UPDATE table SET value = '$user_input';

Suppose a malicious user enters the value:

a';DELETE FROM table_a WHERE 1 = 1 OR 'pie' = '1

If you hadn’t applied mysql_real_escape_string, the query would end up like this:

UPDATE table SET value = 'a';DELETE FROM table_a WHERE 1 = 1 OR 'pie' = '1'

Which is obviously very bad. mysql_query won’t execute the DELETE FROM query since it will only take single queries (exactly for this reason), but you can see how a sufficiently savvy evil user can still cause major damage in various other queries if you don’t apply mysql_real_escape_string.

So to solve this, simply wrap mysql_real_escape_string around it, which will end up like this:

UPDATE table SET value = 'a\';DELETE FROM table_a WHERE 1 = 1 OR \'pie\' = \'1'

And so, if you go looking in the DB afterwards, you’ll find this value:

a';DELETE FROM table_a WHERE 1 = 1 OR 'pie' = '1

Exactly as the user entered it, without having broken your query.

All in all, as long as you consistently apply mysql_real_escape_string to all user-input variables that you use in a query, you have nothing to worry about. It will ensure that the values get inserted into the database exactly as they were entered by the user, without getting mangled in the query by quotes and various other characters. You’d be amazed how many people there are out there using PHP that simply insert user input directly in their queries without any escaping.

Oh, and you should use mysql_real_escape_string over addslashes whenever possible (which your function already does). There are a bunch of reasons for that, but I want to go have breakfast so you can just take my word for it.

Splendid! Excellent! Thanks a ton for being so detailed & quoting it with examples. That really helped me. So I am happy that my query is safe now.

In fact let me tell you what I did, since I was thinking that there was something wrong with the query & that’s why it was inserting data in DB without the slashes. I modified the PHP function to this:

function sql_quote_no( $value )
{
          $value = addslashes( $value );

    return $value;

}  

I entered the following query in my form:

a';DELETE FROM table_a WHERE m_id = 1""

I pressed submit & it showed me the same thing in my confirm page. When I clicked on submit button, the data was send to the processing page & just before the data entered my sql query,I echoed the message out. It showed me the following:

a\\\';DELETE FROM table_a WHERE m_id = 1\\\"\\\"

Now after this has been inserted into the DB, I visited the DB & I was over joyed to see the following:

a\';DELETE FROM table_a WHERE m_id = 1\"\"

Notice the slashes in front of the quotes. I thought that maybe now my function worked right & it has escaped my data correctly & my DB is safe. So when I went to check out how this info shows up in my page, just like it is in DB. Now I did not want to show the slashes to my visitors. So I simply placed the row result in stripslashes() function & the resultant output it, lo & behold:

a';DELETE FROM table_a WHERE m_id = 1""

Which is exactly the same as the user entered it & I am 100% sure that my DB cannot be messed with even though I have the magic quotes ON or OFF & i dont have to worry about that.

So please confirm which way is the right one to follow. The one which you mentioned earlier (use mysql_real_escape_string) or force it to addslashes & echo it out with stripslashes? Please let me know.

BTE, I am concerned about how I can show such data in search results if it contains slashes in it.

Also if I retrieve data from the db, and store it elsewhere in the db do I get unquoted data and do I have to apply quoting again?? If yes, how?

Looking forward for your reply. Thanks vey much.

The correct way is definitely storing the data in the database as-is, without any slashes added. It makes no sense to add slashes before storing stuff in the database just to remove them again when retrieving it. Not only that, by adding slashes you are effectively making your db data unusable (or hardly usable) for any other purpose than your specific PHP application, which is just bad practice.

Another reason why it’s best to save the data as-is is because addslashes and stripslashes are not character set aware, ie. they assume the strings they are fed are always ASCII strings. Under certain conditions this can actually enable people to bypass the escaping performed by addslashes and stripslashes, thereby still posing a risk to your query safety.

From your example after having removed the magic_quotes checking and the data showing up with extra slashes, I would assume that you have magic_quotes enabled. My advice to you is to turn it off (or at least undo its effect like your original sql_quote_no function did, but it’s better to turn it off alltogether) and always use mysql_real_escape_string function like I illustrated in my previous post. You can’t rely on magic_quotes_gpc to do the escaping for you; quotes are not the only characters escaped by mysql_real_escape_string.

By doing this, you’re assured that you are always storing the data exactly as it was entered by your users without being vulnerable to SQL injections. That means you will also retrieve the data exactly as it was entered by the users, and you won’t have to worry about doing additional stripslashing. However, you must never simply output the user input as you received it from the database because this will expose you to XSS vulnerabilities (eg. if the user enters <script>// cookie stealing goes here</script>); make sure to always apply htmlentities before echoing user input on a page.

I can’t help it, but say WOW! Your posts have been real eye-opener for me & I understand their purpose better now. Otherwise it’s always confusing everywhere else. I must say, where were you all this time? :thumb2:

Thanks for all the guidances. In fact to all the input, I was also applying striptags() function, which would basically remove the <script> </script> tags & even make it more secure. But this striptags was giving me headaches when I was trying to input the data in the DB as it was removing my <br /> tags, that became a part of the message because the user entered new lines in the form. So I had to submit my data without using striptags to retain the
<br /> tags as is. So I think I will have to use the htmlentities to display all that info now.

In fact, I think I should display all the info from the DB using htmlentities. Am I right? Does it mess up the <br /> tags that were coming from the database?

Also which one should I use: htmlentities or htmlspecialchars() ???

Thank you so much.

I’m sorry, I should have differentiated when I said to always use htmlentities.

When dealing with user input, you have to make a difference between 2 kinds of user input: user input where HTML is allowed, and user input where HTML is not allowed. For example, the text we’ve been writing on this board is the kind where HTML is not allowed (which is the reason why it uses BBCode tags instead), and that’s the kind of input you’ll find 99% of the time. In fact, pretty much the only place you’ll usually find user input where HTML is allowed are WYSIWYG editors in CMS’es.

For user input where HTML is not allowed, it’s very simple: always apply htmlentities before echoing any of it. The reason for this is simple: suppose you have a site where you can log in through a cookie. If you had failed to apply htmlentities, a malicious user could enter some javascript code that reads the cookie and sends it to him, thereby effectively enabling him to log in as anyone who has viewed the page where his javascript code is displayed (and executed, because his <script>…</script> input was simply inserted into the HTML code). This is known as an XSS exploit; they’re pretty common these days and people (or bots) will try it on your site.

For user input where HTML is allowed, you have to think about whether you trust your users to enter non-malicious input or not. Since this kind of input is usually found in a CMS which only the site owners can access, you can usually output the HTML as-is, without applying any escaping (which is the expected behavior from a WYSIWYG editor).

The real problem comes when you need to allow potentially malicious users to enter HTML input. This is where you would have to manually strip out any possibly malicious stuff (such as <script> tags, etc) before displaying it. The problem with this approach is that this can get very complex to do reliably; most browsers allow for such a high degree of malformed input that it has proven to be virtually impossible for you to take all of it into account, and there have been countless examples of savvy people bypassing the input stripping by making it malformed, but in such a way that the browsers can still understand it.

That is why, to avoid this entire hassle but to still allow for some basic HTML capabilities for the grand public, alternatives like BBCode and UBB are frequently used. Major advantage here is that you control explicitly which tags can be used and which can’t, and that tags that are not allowed simply won’t be parsed. Since this approach replaces the usual HTML tags (which are parsed client-side) with equivalent BBCode tags (which are parsed server-side), you’ll still want to apply htmlentities though - otherwise users could still use regular HTML tags. Writing a BBCode or UBB parser can be more complicated than it looks at first sight, so I’d advise you to use some of the excellent existing libraries out there rather than write your own.

As for htmlentities vs htmlspecialchars, the difference is that htmlentities escapes more characters than htmlspecialchars. For the purposes of escaping user input htmlspecialchars would work fine too, but why not escape more?

For your <br/> problem, here’s what you do: the user enters his input in a standard <textarea>. This textarea contains his input with the lines separated by a newline character (
). You simply take the input from the textarea, and insert it into the DB as-is. Then, when displaying it again, you apply htmlentities to it, and then run it through nl2br - this will convert the newlines to <br/> tags. Otherwise the newlines would only show up in the HTML source, and not on the rendered page. Note that you first apply htmlentities and then nl2br; if you had done it the other way around, the <br/>'s would have escaped by htmlentities.

When you want to enable the user to edit his input again, you apply htmlentities and echo it back between <textarea> and </textarea>. Here, you don’t have to apply nl2br - the textarea control picks up on the newlines in the source code. Also, the entities escaped by htmlentities will be un-escaped again by your browser before sending the form data; this way you’ll keep their original input instead of saving the escaped entities in the DB.

Excellent, once again! That’s very kind of you to be so detailed & I truly appreciate that.

I will use htmlentities as you have specified. Would like to filter out more…keep myself more secure…

Regarding the textarea, here’s how it actually works: (it all originated here :slight_smile: )

  1. On send message page, visitor enters text in the textarea. Now he can enter text in this. (I even wanted him to be able to enter only enter selected tags like <br /> <p></p> <b></b><ul></ul><li></li>)

  2. Now on pressing submit the visitor will be taken to a confirmation page & his input information is shown to him but not in a textarea, rather in the page, just like you would display any html page. Which means if the visitor has used <b>hi</b> tags in the textarea on the first page, then in the confirm page, he will see " hi " (without quotes) in bold. Also if he enters any break lines, those will be parsed correctly & shown with those taken into account. Example, if he enters:

hi <br /><br /> I am here

then this would appear in the confirmation page as:

hi

I am here

Does that make sense?

  1. Now this info is collected in 2 hidden fields & I am applying htmlentities to collect this info into these hidden fields. The first hidden field is embedded in form1 & 2nd hidden field is embedded in form 2. There are 2 cases here:

a) If the visitor, thinks that all the message is right, he can press Confirm button & his information will be processed via form 1 & hidden field 1 data is posted to the final page which adds the information to the DB.

b) If the visitor sees that he did a spelling mistake or does not like some part of formatting or wants to edit his message, he then presses the “Edit Message” button & the form 2 is triggered & the hidden field 2 is posted to the send message page where he will see all the information intact with all the tags that he entered(if he did originally). Example, if he enters:

hi <br /><br /> <b>I am here</b>

then this would appear in the confirmation page as:

hi

** I am here

**Now if he presses the Edit message button, this info is posted back to the send message text area & he will see the textarea populated as hi <br /><br /> <b>I am here</b>. He can start the process all over again.So once this this info gets added into the DB, if the visitor visits the send message page, he will see the message that just got inserted into the DB & in the bottom, he will see the form to send the message again.

Now all in all, it’s been an issue where to apply the strip_tags. I am able to use strip_tags on the top line of the confirm page & am able to strip out all the malicious tags. But if I am using the same strip_tags on the final page, just before the data gets added to the DB, it is removing my <br /> tags & adding it to the DB. I am using the following to replace the
:

 $order   = array('
', '
', '\r');
$replace = '<br />';
// Processes 
's first so they aren't converted twice.
$str = str_replace($order, $replace, $str); 

When I pull the data from the DB, it is appearing without the break lines :q: :q: :q:. But if I am passing the data through strip_tags on confirm page & disabling that function on final page, it is inserting those <br /> tags correctly & the message is appearing with breaks in between the lines as intended. The bad thing about this approach is that iam not filtering any malicious tags & any <script> tags can appear if used. This can lead to XSS. If I allow only <br /><p> & <b> tags in the strip_tags, on the confirm page, the <br /> tags are not taken into account & they disappear. The text appears on the same line. For example, if user entered:

hi <br /><br /> <b>I am here</b>

then output is:
hi **I am here

**When in fact it should have been:

hi

** I am here

**Iam going crazy with this. Can you please help me further. Appreciate your cooperation. Thank you once again.

Hi,

Some very good info here—thanks!!!

I’m a little unclear on why you htmlentities() [or htmlspecialchars()] data from a DB (e.g. MySQL) before displaying in a textarea (for editing and reinsertion into the DB). I have been playing around some with a dummy page that I set up to better understand what goes on when text/characters go from:

MySQL Table => textarea (for editing) => $_POST ($_REQUEST) => mysql_real_escape_string() => MySQL Table and so on…

My understanding is that PHP and MySQL don’t really care about each other’s encoding and that PHP’s mysql_real_escape_string() takes care of that conflict when placing data into MySQL. I’m just not getting why the MySQL data needs to be ‘escaped’ (if that’s the right word) when placed into a textarea?? What’s going on behind the scenes when placing that data into a textarea?

Thanks

Hi,
What we are doing by the process of escaping is that we are cleaning any malicious data or should I say, rendering the bad data as incapable. Something like:

' gets converted to \'

which makes your query, retain it’s purity. Well I had to deal with a lot of stuff as the data in the text area is filtered off & shown on another page as confirmation. If someone does not like it, they can go back & change it. But yes, using mysql_real_escape_string protects you from most of the stuff, but you will need to take care of CSS attacks & other data validations as applicable.

Hope that gives you a clear idea.