Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assignment deleted by SQL parameterization codemod #441

Open
drdavella opened this issue Apr 4, 2024 · 1 comment
Open

Assignment deleted by SQL parameterization codemod #441

drdavella opened this issue Apr 4, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@drdavella
Copy link
Member

I was experimenting with adding support for the .raw method to the SQL parameterization codemod. This would be useful for supporting various ORMs (e.g. SQLAlchemy, Django) that enable pass-thru execution of raw SQL queries. The change I made looked like this:

diff --git a/src/core_codemods/sql_parameterization.py b/src/core_codemods/sql_parameterization.py
index 8c5042c..0636023 100644
--- a/src/core_codemods/sql_parameterization.py
+++ b/src/core_codemods/sql_parameterization.py
@@ -558,7 +558,7 @@ class FindQueryCalls(ContextAwareVisitor, LinearizeStringMixin):

     def leave_Call(self, original_node: cst.Call) -> None:
         maybe_call_name = get_function_name_node(original_node)
-        if maybe_call_name and maybe_call_name.value == "execute":
+        if maybe_call_name and maybe_call_name.value in ["execute", "raw"]:
             # TODO don't parameterize if there are parameters already
             # may be temporary until I figure out if named parameter will work on most drivers
             if len(original_node.args) > 0 and len(original_node.args) < 2:

I then ran codemodder against PyGoat. When I did, I got a change that looked like the following:

changed:
  - introduction/views.py
    diff:
      ---
      +++
      @@ -152,11 +152,11 @@

                   if login.objects.filter(user=name):

      -                sql_query = "SELECT * FROM introduction_login WHERE user='"+name+"'AND password='"+password+"'"
      +                sql_query = "SELECT * FROM introduction_login WHERE user=?"+"AND password=?"
                       print(sql_query)
                       try:
                           print("\nin try\n")
      -                    val=login.objects.raw(sql_query)
      +                    val=login.objects.raw(sql_query, (name, password, ))
                       except:
                           print("\nin except\n")
                           return render(
      @@ -257,7 +257,6 @@
           startInd = text.find('>')
           endInd = text.find('<', startInd)
           text = text[startInd + 1:endInd:]
      -    p=comments.objects.filter(id=1).update(comment=text)

           return render(request, 'Lab/XXE/xxe_lab.html')

The fix itself looks good; it parameterizes the raw query and removes the potential vulnerability.

However, I'm concerned about the deletion of this line:

           p=comments.objects.filter(id=1).update(comment=text)

@andrecsilva informs me that this is occurring due to the fact that p is unused subsequently in the code. However, the problem is that the RHS of the assignment may have a side effect: in this case it appears to be updating the database, but it could have a perfectly arbitrary side effect in general.

My understanding is that the unused variable removal is part of the cleanup phase of the SQL parameterization codemod. However, we need to try to restrict this logic somehow to apply only to those variables that were involved in the actual fix.

@clavedeluna
Copy link
Contributor

I started looking at this issue but decided we should have a discussion before deciding on the fix here as there isn't a clear fix. Right now the RemoveUnusedVariables transformer runs on the entire file, which leads to this issue of removing unused variables.
One option is to decide to do nothing. This codemod is MERGE_AFTER_CURSORY_REVIEW, so pixee users would be responsible for the cleanup.

Another option we have is to decide not to run that transformer and leave any unused variables be, to prevent removal of unused variables with side effects. Likewise, users would be responsible for removing unused variables.

A middle ground option, which I expect to be quite a challenge, would be to not blanket run RemoveUnusedVariables on the entire module, but to somehow be able to only remove variables related to sql query creation. I expect it's possible we will also end up accidentally removing side effect variables.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants