Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Post
  • Reply
Oysters Autobio
Mar 13, 2017
This isn't really a coding question but not sure about any good resources because of all the enterprise software click bait and I don't know if there's like a dataops / data engineering megathread here.

I'm being asked to help provide input for metadata governance/ Master Data Management stuff but the only stuff I can find online are ads for enterprise software. Any good actual resources for best practices? Checked Reddit and only advice o got out of it was to run which I'm slowly realizing is pretty good advice

Adbot
ADBOT LOVES YOU

Oysters Autobio
Mar 13, 2017
Question: Does the line comment syntax ("--") in most SQL dialects (in this case, impala) guarantee that every text after it will always be read in as a comment? i.e. if you were to enter in a variable in a SQL query behind a line comment, is there any way that there's some combination of characters that could escape that comment field into a normal query statement? I know SQL sanitization for query statements is important when it comes to user input because of SQL injection attacks, but what if user input is limited to only behind SQL comment syntax?

for example, say the SQL query was SELECT * FROM mytable -- {variable}, so that when user types in 'foo', the end query becomes SELECT * FROM mytable -- foo . Without any sanitization on the input, can a SQL line comment be escaped back into regular SQL commands? I know if this was a multi-line comment then users could enter in '*/' to end the comment and then enter in other commands afterwards, but are there escapes available for a single-line comment syntax?

Here's the problem: For stupid reasons, mgmt wants every query on our database to contain a comment with a Jira ticket identifier (i.e. "DBT-123", "DBT-3234", "DBT-3433") which has a bunch of info on the query (i.e. purpose of query, etc.). This is so that when they do random audits of the database logs, its easy for them to match up the query and why they were doing it based on the ticket. This is stupid and tedious but fine with our DBAs, they just put in a comment at the end of the line (e.g. SELECT * FROM mystupidtable -- DBT-123) when doing manual queries, and they have a bunch of open tickets for generic testing purposes they just reuse over and over.

Mgmt is adamant that users also must also have a Jira ticket and they add that ticket identifier in a comment query too so that every query they make will include the comments in the impala database logs for auditing purposes

The trouble we're running into is that we also manage user-facing dashboards which are live connected to impala database with kerberos delegation which means that when a user loads in a dashboard, the server uses the credentials of the user to send in the request rather than some sort of server-account credentials. This means that results have row-level security implemented anyways so users can do all they want with dashboards/vizzes in Tableau without risk of exposing data they aren't allowed to see.

The only way that I know of doing this in Tableau is through custom SQL queries and parameters. Generally this is a feature used for when users can be presented with a dropdown menu of options which are entered into a custom parameterized query (SELECT * FROM mytable WHERE region = <region_parameter>. We can enable Parameters in Tableau which allow anything rather than a fixed list, but for obvious reasons (i.e. SQL injection) this is turned off by default in admin settings.

If we were to turn that setting off, and create an integer parameter (i.e. user can only enter in integers between a min and max value)), whats the security risk like here for us to create a parameterized custom SQL query in Tableau like SELECT * FROM mytable -- <parameter> and instruct their Jira ticket number. Is there anyway here that a user could escape the comment field? If yes, what if we instead used a paramaterized query which only allowed to a user inputted integer value, like SELECT * FROM mytable -- DBT-<parameter> (Tableau lets you validate against different types of parameters like this) and prompt them to only enter in the number portion of their ticket (i.e. their ticket is "DBT-1234", they would enter in "1234" when prompted).

Bearing in mind in the above scenario, the database is protected by kerberos auth which is mapped to user LDAP profiles, so even if they could escape the comment, the results would be filtered out for any data they're not allowed to access anyways (i.e. row-level security).

How dumb of an idea is either of those two? (Ignoring the fact that there's row-level security anyways so why mgmt is adamant to mandate these stupid unvalidated comment text to begin with i dont know).

Oysters Autobio
Mar 13, 2017
Ah there it is, I didn't realize the dashes only allowed commenting until a newline, so yeah unsanitized someone could escape it.

Yeah another reason to prefer open source. I have no idea what Tableau uses here for their custom SQL option, so I don't know how that query is actually materialized before it goes into impala. I imagine they must be using some kind of JavaScript ORM to handle passing the values from the GUI to the SQL.

I'm assuming the same issue applies even if I use Tableau's built-in numeric parameter? IE because we don't know how these Tableau params are being entered into the SQL, it's probably a bad idea to assume that it wouldn't be possible to do SQL injection if I simply used numeric parameter only?

That being said might be overthinking this depending on how Kerberos manages user auth in impala, ie does it with being able to send DDL to certain ACLs only. Because if at least DDL is protected, then even if a malicious actor escaped the string, the most they could do is what? Read all the data they're entitled to read anyways?

This exercise is purely for "auditing" purposes and management has at least accepted the risk that this method (ie entering comments) has nothing enforcing it (we'll prob add in some passive monitoring of the logs to flag queries without it though).

maybe I should just use their enumerated parameters with a lookup list containing every number 1 - 9999 lmao. I would love to see the look on someone's face.

Oysters Autobio fucked around with this message at 13:13 on Mar 27, 2024

  • 1
  • 2
  • 3
  • 4
  • 5
  • Post
  • Reply