|
I don't know your exact circumstances or how important keeping the storage size small is but my instinct is to try to not be overly clever here. I'd do the second one (single table with nullable user_id), and if you don't mind ordering the resultset by the key name you can avoid the subquery like so:SQL code:
TheFluff fucked around with this message at 15:11 on Feb 24, 2021 |
# ? Feb 24, 2021 15:03 |
|
|
# ? Jun 8, 2024 05:29 |
Im trying to set up a server agent job to run a stored proc and I am rapidly running out of ideas as to why its not working. If I run the stored proc manually, it works fine. The stored proc calls sp_send_dbmail, and I get the email with the required attachment, which is the results of a query. If I run the stored proc through the server agent, the tasks complete successfully, but I dont get an email, and there are no entries made in msdb.dbo.sysmail_event_log Things I have tried : Given the agent service account db_reader and public on the database the proc runs on, as well as DatabaseMailUserRole in msdb, Given the agent service execute permissions on the stored proc. Set up an operator with an email and made sure they are on duty Set the service account as the owner of the job Tried using a different account to run the job. Restarting the server agent after most steps. Anyone else got any hints on what im screwing up?
|
|
# ? Mar 11, 2021 03:46 |
|
I'm on the chapter of my Learning PostgreSQL textbook about security. It looks like two-way encryption with a single pair of public and private keys has the same effect as row-level security, wherein a table has a username column and the user has to have a username matching the one in the row to see or do things with the row. I don't see the use for storing hashes aside from the hashes that correspond to RDBMS usernames, though. Can anyone name an example use for having a column for hashes in a database?
galenanorth fucked around with this message at 15:13 on Mar 19, 2021 |
# ? Mar 19, 2021 15:05 |
|
I dont have a answer, but what textbook are you using? Do you like it?
|
# ? Mar 19, 2021 15:15 |
|
Learning PostgreSQL 11: A beginner's guide to high-performance PostgreSQL database solutions, 3rd edition, by Salahudin Juba and Andrey Volkov from Packt (2019) It says it's for beginners, but it's not. It uses a ton of terms that I have to look up on the regular, and sometimes it explains them later in the book, which is bizarre. I'm using it with the PostgreSQL documentation, looking up the pages which match the chapter information as I go along, and Database Systems: Design, Implementation, & Management by Carlos M. Coronel and Steven Morris from Cengage Learning (2018). This is a much better book for beginners. After reading a few chapters of it, I realized I should have read the whole thing before starting on the PostgreSQL textbook. The two chapters on SQL include the basic SQL syntax for Oracle, MySQL, Microsoft SQL Server, and Microsoft Access, though the procedural code examples are for Oracle. Using the three together works well because the PostgreSQL textbook omits a lot of good information on things like procedural SQL from the documentation, but elsewhere the documentation includes way too much information I'll probably never use. Now that I've got a system set up, I feel like I'm not having any problems. galenanorth fucked around with this message at 15:27 on Mar 19, 2021 |
# ? Mar 19, 2021 15:24 |
|
Sweet thanks for the post. Yeah I know a decent amount of Postgres but nothing advanced so that sounds what I need
|
# ? Mar 19, 2021 15:36 |
NPR Journalizard posted:Im trying to set up a server agent job to run a stored proc and I am rapidly running out of ideas as to why its not working. Finally figured this out. Server agent service didn't have permissions on a completely different msdb.storedproc and it wasn't kicking up an error message so it was a jerk to track down
|
|
# ? Mar 20, 2021 07:18 |
|
galenanorth posted:I'm on the chapter of my Learning PostgreSQL textbook about security. It looks like two-way encryption with a single pair of public and private keys has the same effect as row-level security, wherein a table has a username column and the user has to have a username matching the one in the row to see or do things with the row. I don't see the use for storing hashes aside from the hashes that correspond to RDBMS usernames, though. Can anyone name an example use for having a column for hashes in a database? Are you asking about about client connections. https://www.postgresql.org/docs/13/ssl-tcp.html Or pgcrypto type encryption. https://www.postgresql.org/docs/13/encryption-options.html Or row level permissions. https://www.postgresql.org/docs/13/ddl-rowsecurity.html Or application password hashes. https://x-team.com/blog/storing-secure-passwords-with-postgresql/ There are plenty of applications where hashes matter, so as a field it certainly makes sense and provides for all the typical uses of a hash. ps Sounds like the person who wrote that book needs to read the PostgreSQL documentation first.
|
# ? Mar 20, 2021 15:41 |
|
pgcrypto type hashes I hadn't gotten to application programming and I'd assumed that application users would be a subset of database users, so that explains it
|
# ? Mar 20, 2021 15:50 |
|
I've got a weird question. I basically am designing a table whose rows contain durations. But if they are members of the same group (as defined by a foreign key) their durations are related to each other. E.g. if I have three rows that look kinda like: code:
Would a standard approach just be to add an "order" column which is just an int column that defines the asc order of the rows relative to each other? Are there other approaches to consider?
|
# ? Mar 27, 2021 08:10 |
|
Skyarb posted:I've got a weird question. I basically am designing a table whose rows contain durations. But if they are members of the same group (as defined by a foreign key) their durations are related to each other. Is the ordering immutable once the rows are inserted, or can it be changed later? Even if it's fixed, can a record be inserted in the middle of a group later? If it's fixed and you can't insert rows in the middle of the sequence later then you don't need an ordering column, you can just use insertion order (well, at least if you have a serial integer primary key). If the ordering is mutable though then you probably want to think a bit more carefully about this, because if the groups get large changing the order can start getting very expensive - for example, if you have an integer ordering column and just increase it by 1 for each row, if you want to insert a row immediately after the first one, you have to update every row in the group after that one. There are a number of solutions that you can use to optimize this, each with their own pitfalls - this blog post is a good place to start reading, I think.
|
# ? Mar 27, 2021 12:36 |
|
TheFluff posted:Is the ordering immutable once the rows are inserted, or can it be changed later? Even if it's fixed, can a record be inserted in the middle of a group later? If it's fixed and you can't insert rows in the middle of the sequence later then you don't need an ordering column, you can just use insertion order (well, at least if you have a serial integer primary key). If the ordering is mutable though then you probably want to think a bit more carefully about this, because if the groups get large changing the order can start getting very expensive - for example, if you have an integer ordering column and just increase it by 1 for each row, if you want to insert a row immediately after the first one, you have to update every row in the group after that one. There are a number of solutions that you can use to optimize this, each with their own pitfalls - this blog post is a good place to start reading, I think. Interesting! Thanks for the read, these groups should likely never exceed 10 items so I think a simple int is fine for now and will keep it elegant (anytime something is inserted, the whole group will be redefined anyway)
|
# ? Mar 27, 2021 17:52 |
|
Is what this page calls "read skew" just another name for what my textbook calls "inconsistent retrievals"? I'm also guessing they're are allowed to happen when phantom reads are allowed to happen, e.g. aggregation which is running while a new row is added. https://vladmihalcea.com/a-beginners-guide-to-read-and-write-skew-phenomena/ I came across three pages which all put MVCC within the category "optimistic concurrency control" https://stackoverflow.com/questions/5751271/optimistic-vs-multi-version-concurrency-control-differences https://www.tutorialdba.com/2017/10/what-is-mvcc-multi-version-concurrency.html https://vladmihalcea.com/optimistic-vs-pessimistic-locking/ but Andy Pavlo at Carnegie Melon University suggests that it's a larger framework of design decisions that can be used with concurrency control and brings up acronyms like MV2PL and MVOCC, so I'll go with the more authoritative source https://15445.courses.cs.cmu.edu/fall2019/ Anyway, I couldn't help but notice "In-House DJ: DJ Drop Tables" at the end of this course page. ---- How do advisory locks work in PostgreSQL? I read https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS and https://www.netguru.com/codestories/advisory-locks, but I still don't get it. Where do you get the number provided to the function? The example uses an id attribute, so for a row, is it a surrogate key? Is it that you pass an arbitrary obscure number and using it somewhere in the application is a signal? https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS says "key value", so I'm guessing that means surrogate key galenanorth fucked around with this message at 22:33 on Mar 28, 2021 |
# ? Mar 28, 2021 14:26 |
|
You can think of the number as being a pointer to an actual lock object in memory, if you like. If two things try to lock using the same pointer, one will succeed and the other will need to wait, while if two things try to lock different pointers they will both succeed and not interfere with each other. If two things want to use the same lock then they need to use the same number - how exactly they agree on what number to use is up to you.
|
# ? Mar 29, 2021 03:27 |
|
galenanorth posted:Is what this page calls "read skew" just another name for what my textbook calls "inconsistent retrievals"? I'm also guessing they're are allowed to happen when phantom reads are allowed to happen, e.g. aggregation which is running while a new row is added. galenanorth posted:I came across three pages which all put MVCC within the category "optimistic concurrency control" Also if you're looking for people to chat with about the course (or learning DBMS internals), there's an unofficial 15-445/645 Discord that people are using to go through the course lectures and projects together. galenanorth posted:How do advisory locks work in PostgreSQL? I read https://www.postgresql.org/docs/current/explicit-locking.html#ADVISORY-LOCKS and https://www.netguru.com/codestories/advisory-locks, but I still don't get it. Where do you get the number provided to the function? The example uses an id attribute, so for a row, is it a surrogate key? Is it that you pass an arbitrary obscure number and using it somewhere in the application is a signal? edit: Also if you love learning about DBMS internals, the current (and previous) online seminar series have been great. Today's got Badrish from Microsoft Research talking about their awesome KV store FASTER. Previous highlights for me are 2020's DuckDB, Jepsen, SQL Server (focused on Cascades), and 2021's HarperDB talk (this one is wild and remarkably candid). Star War Sex Parrot fucked around with this message at 16:35 on Mar 29, 2021 |
# ? Mar 29, 2021 16:20 |
|
Hey all, got a conceptual issue with Oracle's Advanced Queueing. I'm just getting into it for a smaller project, but it's got me thinking about an ETL process where I think it might be useful. Key points: a) The sender needs to dump processed rows into some sort of buffer. A queue, a staging table, whatever. It can't wait on the receiver to be ready for processing. b) The receiver needs granular control over when and which rows from that buffer it should process. c) Processing may be manually triggered, on a schedule or (tricky) when X time has passed with no new rows matching condition Y being added to the buffer. d) Everything in the buffer should be processed once and only once, adding new rows to the buffer must not affect any running processing. e) Processing of sets of messages should be independent. Just because one set of messages is waiting for time X to pass before they're processed should not affect the processing of other messages in the buffer. Fairly sure I could cover most of that and d) in particular made me think of AQ, but my worry is the tricky bit mentioned in c). I'd need to dequeue the last message in BROWSE mode to get its enqueue time, but from what I can see that's not a direct option. I may be able to hack the queue into more of a stack so it generally runs as LIFO, which I think would be okayish for this particular application, but really I'd rather have the queue run FIFO and have a different method to get the minimum age of an enqueued message matching a set of conditions. And as much as possible I'd like that solution to use the queue instead of going around it. Ideas?
|
# ? Apr 1, 2021 16:48 |
|
I've got a view I want to alter to add a column to. The view has several inner / left outer joins to it. I want my column to use some of those joins but not all and add one extra where clause just for this column but not the view as a whole. Is the best way to accomplish this just to create a subquery in the select statement and replicate the joins I want in my subquery?
|
# ? Apr 1, 2021 17:04 |
|
Spikes32 posted:I've got a view I want to alter to add a column to. The view has several inner / left outer joins to it. I want my column to use some of those joins but not all and add one extra where clause just for this column but not the view as a whole. Is the best way to accomplish this just to create a subquery in the select statement and replicate the joins I want in my subquery? it's extremely difficult to parse what you're asking for (for example - inner / left joins, do you mean the view itself is constructed using these joins, or other queries referencing this view are left/inner joining to it?), can you provide some code or pseudocode to give us an idea of what you're trying to accomplish?
|
# ? Apr 1, 2021 18:07 |
|
Yeah sorry I'm sure what I wrote is only clear to le because I'm staring at it. SELECT a, b, c FROM S inner join I on S.id = I.id Inner join CWS on I.cws_id = CWS.cws_id and CWS.completed = y Bunch more joins Where (I.start_date is not null) and (I.end_date is null) I want to add column d with a few filters attached from the CWS table where I.end_date is not null, but not affect what the rest of the view pulls. The CWS table has multiple rows for each s.id. If this isn't clear enough let me know and I'll try again.
|
# ? Apr 1, 2021 18:38 |
|
code:
|
# ? Apr 1, 2021 18:58 |
|
I need those joins to stay active for picking samples to show in the view, I just don't want them active specifically for the new column.
|
# ? Apr 1, 2021 19:13 |
|
Spikes32 posted:I need those joins to stay active for picking samples to show in the view, I just don't want them active specifically for the new column. It sounds like you may need to do a whole-assed subquery (are those even legal in views?) instead of additional joins. The fact that CWS joins from I, and the WHERE has I.End_Date IS NULL means you can't really independently filter rows where I.End_Date IS NOT NULL instead. I haven't put a ton of thought into this, but you're still running at an abstraction that makes it hard for me to determine your intent with the view or what the additional column represents. The whole End_Date IS NULL / End_Date IS NOT NULL seems like it would be incompatible to me in the query. For a given row in the view now, what should the column you want to add even have?
|
# ? Apr 1, 2021 19:25 |
|
I've got something going on in an aggregation query that is confusing me.code:
code:
|
# ? Apr 1, 2021 19:31 |
|
Nth Doctor posted:It sounds like you may need to do a whole-assed subquery (are those even legal in views?) instead of additional joins. The fact that CWS joins from I, and the WHERE has I.End_Date IS NULL means you can't really independently filter rows where I.End_Date IS NOT NULL instead. To try and explain what im doing, this view is used to create a report where all samples currently being incubated are listed. Samples can have more than one incubation phase, the view currently only grabs info from the current incubation phase that is ongoing (from cws/i table) . I need to add a column that grabs the start date of the initial incubation even from samples that are no longer in that initial incubation. Thus changing the join is null clause for this column but not the view as a whole.
|
# ? Apr 1, 2021 19:39 |
|
Portland Sucks posted:I've got something going on in an aggregation query that is confusing me. if you have rows with the same email address with two different regions, because you're grouping your count distinct is going to catch the email twice: once for region A and once for region B
|
# ? Apr 1, 2021 19:41 |
|
Spikes32 posted:To try and explain what im doing, this view is used to create a report where all samples currently being incubated are listed. Samples can have more than one incubation phase, the view currently only grabs info from the current incubation phase that is ongoing (from cws/i table) . if i'm reading this right, this sounds like subquery territory SELECT a , b , c , d = (select whatEver from CWS CWS2 where CWS.cws_id = CWS2.cws_id and I.end_date is not null and 1=1 and 2=2 and...) FROM S inner join I on S.id = I.id Inner join CWS on I.cws_id = CWS.cws_id and CWS.completed = y Bunch more joins Where (I.start_date is not null) and (I.end_date is null)
|
# ? Apr 1, 2021 19:55 |
|
Thank you, so my initial idea of just replicating the joins again in the subquery was correct I appreciate the help.
|
# ? Apr 1, 2021 20:49 |
|
kumba posted:if you have rows with the same email address with two different regions, because you're grouping your count distinct is going to catch the email twice: once for region A and once for region B Want to make sure I understand this logic...so if I have two rows a@b.com, 2021, Q1, USA a@b.com, 2021, Q1, EUR when i just count distinct email i end up with 1 if I group and count distinct i end up with two groups 2021, Q1, USA 1 2021, Q1, EUR 1 then the select distinct is evaluated and summed which ends up giving me 2. I think I get the problem, is there a more standard way to ask this question that'll prevent multiple counting or is this more of a data quality issue?
|
# ? Apr 1, 2021 20:55 |
|
Portland Sucks posted:Want to make sure I understand this logic...so if I have two rows Well if you have the same email in both USA and EUR, but you only want it to contribute to one of the two... which region do you want it to contribute to?
|
# ? Apr 1, 2021 21:02 |
|
Spikes32 posted:To try and explain what im doing, this view is used to create a report where all samples currently being incubated are listed. Samples can have more than one incubation phase, the view currently only grabs info from the current incubation phase that is ongoing (from cws/i table) . I don't know what database you're using but if you have lateral joins (called cross apply/outer apply in MSSQL) this is like the textbook use case for them. Something like SQL code:
e: I'm unsure what your data looks like but you may or may not want to convert the cross join to a left join. TheFluff fucked around with this message at 22:09 on Apr 1, 2021 |
# ? Apr 1, 2021 22:05 |
|
Thanks! I ended up writing a sub query that used a where clause to make the sub query Sample_id equal to the Sample_id in the main query which worked well and didn't add processing time that I could see. The dB is a Microsoft sql 2012,ill look into lateral joins to see if that would be better.
|
# ? Apr 1, 2021 23:06 |
|
Hey goons simple question I hope: Is there a way to cascade order by in Postgres? For example say I have the following table: name | staffid | attribute_id ----------------------------------------------------- Horse | null | 1 Bird | null | 10 Cat | 1 | 8 | Dog | 2 | 8 Cat | 1 | 9 Dog | 2 | 9 Can I have it return so that the order is by attribute_id asc and then it sorts by staffid when it sees multiple entries in the order? Such as: Horse | null | 1 Cat | 1 | 8 Cat | 1 | 9 Dog | 2 | 8 Dog | 2 | 9 Bird | null | 10 edit: I forgot to add, not everything has a staffid. ? Empress Brosephine fucked around with this message at 14:48 on Apr 8, 2021 |
# ? Apr 8, 2021 14:12 |
|
Empress Brosephine posted:Hey goons simple question I hope: Yep, you can simply add multiple columns to your ORDER BY clause. So ORDER BY name, attribute_id, staff_id
|
# ? Apr 8, 2021 14:49 |
|
Kuule hain nussivan posted:Yep, you can simply add multiple columns to your ORDER BY clause. So ORDER BY name, attribute_id, staff_id Thats what I thought but i'm finding its returning this: Horse | Null | 1 Cat | 1 | 8 Dog | 2 | 8 Cat | 1 | 9 Dog | 2 | 9 I posted this on stackoverflow also and added a little descriptor: quote:Hey Cetin I just updated the list to make more sense and be more correct?
|
# ? Apr 8, 2021 14:53 |
|
They solved it by doing this:quote:order by min(attribute_id) over (partition by name), which i'm not sure what a min & a over or a partition is but hey it works lol
|
# ? Apr 8, 2021 15:12 |
|
Which version of postgre are you using? I just tried it out on a fiddle with 9.6 and am getting the wanted result. Is your table structure just text NOT NULL, int NULL, int NOT NULL?
|
# ? Apr 8, 2021 15:13 |
|
How do you tell the postgres version? I'm actually using different data then this example, which now that I apply it to my data isn't really working as well as it should. Here's the actual data i'm using: ignore the fake data in here: code:
code:
code:
attrib_id / company_name/ attribute_name / value / staff_id code:
code:
attrib_id - int8 NOT NULL ------------------ company_name varchar NOTNULL ----------------- attribute_name varchar NOTNULL ----------------------- value varchar NULL ------------------------------ staff_id uuid NULL Empress Brosephine fucked around with this message at 15:32 on Apr 8, 2021 |
# ? Apr 8, 2021 15:22 |
|
which I think what I need is something like (pseudo code):code:
|
# ? Apr 8, 2021 15:39 |
|
order by case when staff_id is null then attribute_id else min(attribute_id) over (partition by staff_id) end http://sqlfiddle.com/#!17/08d351/6
|
# ? Apr 8, 2021 17:16 |
|
|
# ? Jun 8, 2024 05:29 |
|
Hammerite posted:order by case when staff_id is null then attribute_id else min(attribute_id) over (partition by staff_id) end THANK YOU SO MUCH! Here's what worked: code:
|
# ? Apr 8, 2021 19:59 |