|
|
# ? Oct 29, 2018 19:08 |
|
|
# ? Jun 8, 2024 04:32 |
|
Ruggan posted:
How many options will there be? Int, String, DateTime, ?? It's pretty simple if it's just a handful of data types. Also, if only one column will contain data inside of it, per question, then I'd skip CASE and do a COALESCE on the value columns, and determine the type in the DOM/somewhere else.
|
# ? Oct 29, 2018 19:55 |
|
That won't work in T-SQL, at least; trying it will result in a conversion error because the datatypes of the columns you're coalescing don't match
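To illustrate the failure mode (all table and column names here are hypothetical, just a sketch of the one-typed-column-per-datatype layout being discussed):

```sql
-- Hypothetical table: one typed value column per supported datatype.
CREATE TABLE Answers (
    AnswerId  int IDENTITY PRIMARY KEY,
    IntValue  int           NULL,
    DateValue datetime      NULL,
    TextValue nvarchar(max) NULL
);

-- Fails in T-SQL: COALESCE resolves to the highest-precedence datatype
-- (datetime here), so any TextValue that isn't a valid date raises
-- "Conversion failed when converting ..." at runtime.
SELECT COALESCE(IntValue, DateValue, TextValue) AS Value FROM Answers;

-- Works, but only because every column is cast to nvarchar first:
SELECT COALESCE(CAST(IntValue AS nvarchar(50)),
                CONVERT(nvarchar(50), DateValue, 126),
                TextValue) AS Value
FROM Answers;
```

As noted below the cast version loses native sorting: the stringified ints and dates sort lexicographically, not numerically or chronologically.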
|
# ? Oct 29, 2018 20:23 |
|
True. Not sure what DBMS he's working with, so the solution will depend on what's available to work with. For funsies:
|
# ? Oct 29, 2018 21:49 |
|
Yeah, I dunno... probably any datatype:
- int (e.g. score of 1-5)
- numeric (e.g. % of x, $)
- datetime (e.g. followup date)
- varchar/nvarchar (e.g. comment)
- bit (e.g. yes / no)

I'm using MS SQL. That would work but only because you're casting all the datatypes to nvarchar. Doing a post-coalesce sort with ints and dates would get very wonky. Ruggan fucked around with this message at 22:36 on Oct 29, 2018 |
# ? Oct 29, 2018 22:34 |
|
Okay, that's not what I thought you were doing. I really hate to say this, but isn't this kinda the whole point of ORM UML blah blah blah? You probably want some custom form configuration (JSON was mentioned), and then you type 'make' and it should automatically create the necessary tables, maybe even handle data migrations and such.

The last quizzy system I built permitted multiple choice (which provides T/F) and free form text, no drop downs, etc., for the reasons you have uncovered. It's a mess in SQL and a mess in an OO program and so forth. I'd wonder if you're concentrating on the right thing here, namely the collection of data to make project managey decisions, instead of building a kick rear end arbitrary form entry system with every conceivable methodology known to man supported.

Edit: Sorry, a bit harsh; had to get off the bus. But I'm still asking myself: what if the data was stored someplace else and the database updated by some other process later, en batch? Does it have to be live? Can integers be converted and cast later for reporting purposes?

One thing to keep in mind: you're trying to record both "form layout and content information" as well as "customer data that has been collected from said forms". Those are probably different structures. PhantomOfTheCopier fucked around with this message at 23:25 on Oct 29, 2018 |
# ? Oct 29, 2018 23:17 |
|
PhantomOfTheCopier posted:Okay that's not what I thought you were doing. I really hate to say this, but isn't this kinda the whole point of ORM UML blah blah blah? You probably want some custom form configuration (JSON was mentioned) and then you type 'make' and it should automatically create the necessary tables, maybe even handle data migrations and such.

Not too harsh. Yes, you’re right, they are different structures: Structure vs Instance/Responses.

What I have now works fine for the scale I’m at. I have the form built in the app in JSON, with a generic form React element that uses it to create the actual form. The website interacts with a web API whose endpoints are named after the forms for get and post. My .NET Web API authenticates the user and hits a stored proc on the server specific to the endpoint that loads / saves data. Each form stores its data in a table, one row per response. Appropriate FK constraints and the works. It’s easy to report on.

There are three issues with the current setup:
1. Need to rebuild the app and edit database schema for changes to form structure - no on-the-fly editing. Not a huge deal considering I’m gatekeeping edits to the forms.
2. Multi-response questions need to file to a single column via concatenation. Not a huge deal since I can use STRING_SPLIT, and I can branch off a separate table if I absolutely have to for performance.
3. Can’t reuse a question answered in one form in another. Well, I could with some triggers or SP logic, but that’s inherently hairy.

So moving to EAV is not high priority, but it would solve these problems at the high cost of complexity. Will probably continue to think about it and only do it in the future if I have to. ORMs (like Entity Framework) don’t solve these problems either (and I tend towards using Dapper anyway because I can control the queries), because they still require dev intervention and only move the problem, as far as I know.
Maybe I’m missing something - open to suggestions / callouts if I am.
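For concreteness, a minimal sketch of the EAV direction being weighed (table and column names are hypothetical, not from the actual system): question definitions live in one table keyed independently of any form, so reuse across forms falls out naturally, and responses land in typed columns where exactly one is populated per row.

```sql
-- Hypothetical EAV sketch: form structure vs. collected responses.
CREATE TABLE Question (
    QuestionId int IDENTITY PRIMARY KEY,
    FormId     int NOT NULL,
    Prompt     nvarchar(400) NOT NULL,
    DataType   varchar(20) NOT NULL  -- 'int', 'numeric', 'datetime', 'nvarchar', 'bit'
);

CREATE TABLE Response (
    ResponseId  int IDENTITY PRIMARY KEY,
    QuestionId  int NOT NULL REFERENCES Question(QuestionId),
    SubmittedBy int NOT NULL,
    -- One typed column per supported datatype; exactly one is non-NULL per row,
    -- which avoids the COALESCE/cast mess discussed earlier in the thread.
    IntValue     int NULL,
    NumericValue decimal(18, 4) NULL,
    DateValue    datetime NULL,
    TextValue    nvarchar(max) NULL,
    BitValue     bit NULL
);
```

The complexity cost shows up at query time: reporting means pivoting rows back into columns, which is exactly the trade-off the post describes.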
|
# ? Oct 30, 2018 01:03 |
|
I'm trying to migrate our local SQL Server database to Azure SQL Server, which this article gives a high-level overview of: https://docs.microsoft.com/en-us/azure/sql-database/sql-database-cloud-migrate

We are hoping to minimize both downtime and strain on the local server. I thought if we could put one of our frequent backups on an empty Azure SQL Server instance and then replicate any changes since the backup was made, the only strain on the local server would be pushing the changes, which wouldn't be worrisome. I tried setting this up with my dev server as a transactional replication publisher, but the only changes propagated to the Azure SQL Server were those that were created after the subscriber was also created.

The potential problems I see with this are:
- If I set up the Azure SQL Server as a subscriber before the backup is finished being loaded into Azure SQL Server (apparently necessary to capture all changes after a backup), wouldn't it replicate changes before the backup is completed, possibly throwing data integrity exceptions if it replicated a row that had a foreign key to a different table row that didn't exist yet? Is there a way to tell the replication to hold on to any changes until my restore is done, and then open the floodgates to the subscriber?

If that isn't a reasonable solution, we could set up transactional replication with snapshot isolation, the documentation of which suggests that with concurrent snapshot processing, we shouldn't see long table locking times. Can anyone confirm that? Our db is ~100 GB. This isn't the first choice because we are assuming it will put a lot more strain on the local server than the first option. If anyone has better suggestions, I'm all ears.

EDIT: It looks like if I set the subscriber's agent to Run on Demand, I can achieve holding onto all changes and only replicating them when I manually tell it to run, which I think will solve my problem with the first method. 
EDIT: Anyone who has done this before, would it be bad if there was some intersection of data between the backup and the change data captured with the replication? Opulent Ceremony fucked around with this message at 02:40 on Nov 3, 2018 |
# ? Nov 2, 2018 23:25 |
|
SQL newbie here. I want to create a new table from an existing table. The existing table has three text fields: comments, notes, and options. For my new table, I'd like to include new columns which show the length of each of these text fields. New column names "comments_length", "notes_length", and "options_length". Would I have to create CASE statements to make this happen? Or could I enter this in the WHERE clause? For example: code:
code:
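The elided code blocks aren't recoverable, but for what it's worth, no CASE or WHERE is needed; the length is just an expression in the SELECT list. A sketch, assuming the existing table is called `listings` (hypothetical name; T-SQL shown, since the replies use LEN - in Postgres it would be `LENGTH()` with `CREATE TABLE ... AS`):

```sql
-- Compute the three lengths as ordinary SELECT-list expressions and
-- materialize the result as a new table via SELECT ... INTO.
-- (Note: T-SQL's LEN() ignores trailing spaces.)
SELECT comments,
       notes,
       options,
       LEN(comments) AS comments_length,
       LEN(notes)    AS notes_length,
       LEN(options)  AS options_length
INTO   listings_with_lengths
FROM   listings;
```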
Tortilla Maker fucked around with this message at 23:51 on Nov 5, 2018 |
# ? Nov 5, 2018 23:45 |
|
Would it be easier to make a view of that table, and compute the 3 new columns in the view? Assuming future rows inserted won’t be getting the length columns populated. e: you can certainly do the calculation in the select statement too
|
# ? Nov 5, 2018 23:53 |
|
kloa posted:Would it be easier to make a view of that table, and compute the 3 new columns in the view? Thank you for your response. As to your question, I don’t know. My current data consists of 5.2 million rows and 74 columns. Since I don’t need all 74 columns, I wanted to make a table that pulls in about 20 columns plus the new length columns for a particular. Is a view better in this situation?
|
# ? Nov 6, 2018 00:00 |
|
Tortilla Maker posted:Thank you for your response. As to your question, I don’t know. What you're proposing is called denormalization and is generally frowned upon. Let the data live in just one spot in your DB. You can compute the lengths on the fly in your select statement. What is the data you're storing in a 72 column table? Did you design the base table or did someone else?
|
# ? Nov 6, 2018 00:09 |
|
Tortilla Maker posted:SQL newbie here. SQL code:
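The SQL block here was lost; a plausible sketch of the view approach kloa suggested, with the lengths computed on the fly (hypothetical table name `listings`; T-SQL syntax):

```sql
-- A view keeps the data in one place and derives the lengths at query time,
-- so future inserts never have stale length columns.
CREATE VIEW listings_with_lengths AS
SELECT l.*,
       LEN(l.comments) AS comments_length,
       LEN(l.notes)    AS notes_length,
       LEN(l.options)  AS options_length
FROM   listings AS l;
```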
|
# ? Nov 6, 2018 00:14 |
|
Nth Doctor posted:What you're proposing is called denormalization and is generally frowned upon. Let the data live in just one spot in your DB Dataset pertains to vehicle sale listings. Each row applies to a unique VIN number and features include price, mileage, exterior color, interior color, doors, engine size, fuel type, etc. In terms of why the table was set up the way it is, we received a 38gb flat file that we uploaded into a Postgres database. (By “we” I mean “me” since my lazy project mates didn’t bother helping)
|
# ? Nov 6, 2018 00:20 |
|
Even index card filing systems provided normalisation. Edit: Wikipedia says the late 1890s. Glad to see we've stepped backwards over the last 130yrs. Edit too: vvv nth dr you're assuming that the values in the Excel spreadsheet were actually the same in the first place. Experience shows that your black car is not the same as my Black car. PhantomOfTheCopier fucked around with this message at 00:56 on Nov 6, 2018 |
# ? Nov 6, 2018 00:26 |
|
Tortilla Maker posted:Dataset pertains to vehicle sale listings. Each row applies to a unique VIN number and features include price, mileage, exterior color, interior color, doors, engine size, fuel type, etc. Oh, honey. So: how is this data going to get used? Because if you want to normalize this out, it can be done and you'll end up with way more concise tables. If this is just for some one off project it may not be worth it, but my spider sense is tingling.
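If the wide table did get normalized, a minimal sketch of the direction (Postgres flavor, to match the database mentioned above; all names hypothetical - the real table has ~74 columns):

```sql
-- Repeated free-text values like colors get factored into lookup tables,
-- so 'black', 'Black', and 'BLACK' can be reconciled to one row.
CREATE TABLE color (
    color_id serial PRIMARY KEY,
    name     text UNIQUE NOT NULL
);

CREATE TABLE listing (
    vin            varchar(17) PRIMARY KEY,  -- VINs are 17 characters
    price          numeric(12, 2),
    mileage        integer,
    doors          smallint,
    exterior_color integer REFERENCES color(color_id),
    interior_color integer REFERENCES color(color_id)
    -- ...remaining attributes, each pulled into a lookup table where values repeat
);
```

As PhantomOfTheCopier's edit below points out, this presumes the source values were consistent in the first place, which flat-file exports rarely are.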
|
# ? Nov 6, 2018 00:34 |
|
Why do you care about the comment length, and what are you going to do with it once you have that data?
|
# ? Nov 6, 2018 03:02 |
|
Ruggan posted:Why do you care about the comment length and what are you going to do it once you have that data? AI/ML/Data Science/buzzwords/etc. Anyways, how's the custom form project going Ruggan??
|
# ? Nov 6, 2018 03:14 |
|
kloa posted:AI/ML/Data Science/buzzwords/etc. Yeah, I figured. Mostly trying to ascertain whether it’s something that should be materialized and maybe indexed (sorting, filtering, etc. IN SQL). If that were the case, I’d recommend a separate table, materialized view, or calculated columns. If not, I’d say just calc it in the select. Like if you’re just pulling the data into an R dataframe, then who cares. Although if it’s a static dataset and a one-time project, then do whatever is reproducible and gets you to your end state fastest.

Form thing is ok. For now I’m just building the forms out as separate tables that match the C# class structure (to ease the object-relational impedance mismatch). If I end up having to blow out a million forms I’ll probably switch to the EAV set of tables for structure and separate tables for responses. I have most of that whiteboarded if anyone really cares or is interested. For now I’m focusing on releasing new code for a different project I’m working on... also had surgery last Thursday (triple hernia blowout) so I’m still recovering on that front.
|
# ? Nov 6, 2018 03:26 |
|
Ruggan posted:For now I’m focusing on releasing new code for a different project I’m working on... also had surgery last Thursday (triple hernia blowout) so I’m still recovering on that front. I'll bet the Royal Navy never anticipated they'd have as many hernia surgeries in their IT department. PhantomOfTheCopier fucked around with this message at 00:09 on Nov 7, 2018 |
# ? Nov 7, 2018 00:06 |
|
Really stupid question. Using MySQL 5.7. This works. SQL code:
SQL code:
Query Error: Error: ER_PARSE_ERROR: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '< "2001"' at line 1 Is it not possible to use NOT and < like that? I understand that I could just put >=. I'm just curious and going to be showing SQL to a class next week. I'd like to understand why that's failing.
|
# ? Nov 25, 2018 11:40 |
|
Sad Panda posted:Really stupid question. Using MySQL 5.7. You are looking for != or <> for "is not equal to".

This has to do with type comparison between different data types within the database, where MySQL can compare either similar types (int != int, str != str) or implicitly cast types (which is why you can compare a date with a string in the first place, even though the date is probably not a string in your data). "IS" and "IS NOT" test for equality only against a very limited set of non-arbitrary values. Essentially, TRUE, FALSE and NULL are special values that, depending on the underlying programming, are stored as distinct data types, and to highlight this difference, they use different equality operators.

In some ways, the fact that you can use != to test both INTEGERS or STRINGS against their respective pairing hides a fundamental type difference between the underlying data types. PostgreSQL, for example, doesn't let you compare an INTEGER with a STRING, while MySQL lets you do so.

edit: code:
code:
Hollow Talk fucked around with this message at 12:43 on Nov 25, 2018 |
# ? Nov 25, 2018 12:40 |
|
You can do what you want; you just need to get the syntax right. The comparison operators (=, <, >=, etc.) are higher precedence than NOT, so you don't have to worry about brackets. SQL code:
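Reconstructing the elided queries from the error message above (`near '< "2001"'`), the failing form presumably put NOT between the column and the operator. Table and column names here are guesses for illustration:

```sql
-- Syntax error: NOT cannot sit between a column and a comparison operator.
-- MySQL's parser chokes right at the '<', matching the reported error.
SELECT title FROM films WHERE released NOT < '2001';

-- Fine: NOT negates a complete boolean expression. Because comparison
-- operators bind tighter than NOT, this parses as NOT (released < '2001').
SELECT title FROM films WHERE NOT released < '2001';

-- Equivalent, and what most people would write:
SELECT title FROM films WHERE released >= '2001';
```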
|
# ? Nov 25, 2018 18:21 |
|
Same info, said differently - I ran into this when I was first learning. NOT isn't an infix operator like < or =; you apply it to the beginning of a valid expression to flip it.
|
# ? Nov 26, 2018 13:16 |
|
Thank you very much. The detailed explanation was good, as was the bit about operator precedence.
|
# ? Nov 26, 2018 19:15 |
|
Let's say you have a web application that does standard CRUD operations on a table of items. The web app displays a list of said items and allows you to filter the list using any combination of a few properties (e.g. status). Let's assume there are hundreds of thousands of items, so filtering in the database is the only reasonably performant option. I know a few ways of implementing this in SQL, or via SQL interfaces.

Option 1: Kitchen sink query. This works well, but if you end up caching a bad query plan you can really end up paying for it. You can make up for that with the OPTION (RECOMPILE) hint at the cost of recompiling a query plan each time. code:
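A sketch of the kitchen-sink shape (the `Items` table and parameter names are hypothetical; assume this is the body of a stored procedure whose parameters are the nullable filters):

```sql
-- One query handles every filter permutation: each predicate collapses
-- to TRUE when its parameter is NULL, so unused filters cost nothing logically.
SELECT i.ItemId, i.Name, i.Status, i.Code, i.Type
FROM   dbo.Items AS i
WHERE  (@Status IS NULL OR i.Status = @Status)
  AND  (@Code   IS NULL OR i.Code   = @Code)
  AND  (@Type   IS NULL OR i.Type   = @Type)
  AND  (@Search IS NULL OR i.Name LIKE '%' + @Search + '%')
OPTION (RECOMPILE);  -- trade plan-cache reuse for a plan tailored to the supplied filters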
Option 2: Branching queries. This works better but is a bitch to maintain for obvious reasons. You also end up with a lot of discrete query plans - one for every permutation of filter sets. code:
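The branching shape, sketched with the same hypothetical table and parameters - one arm per filter permutation, which is exactly why it explodes as filters are added:

```sql
-- Each branch is a clean, plan-cache-friendly query, but with n filters
-- you need up to 2^n branches to cover every combination.
IF @Status IS NOT NULL AND @Code IS NULL
    SELECT i.ItemId, i.Name FROM dbo.Items AS i WHERE i.Status = @Status;
ELSE IF @Status IS NULL AND @Code IS NOT NULL
    SELECT i.ItemId, i.Name FROM dbo.Items AS i WHERE i.Code = @Code;
ELSE IF @Status IS NOT NULL AND @Code IS NOT NULL
    SELECT i.ItemId, i.Name FROM dbo.Items AS i
    WHERE i.Status = @Status AND i.Code = @Code;
ELSE
    SELECT i.ItemId, i.Name FROM dbo.Items AS i;
```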
Option 3: ORM (e.g. Entity Framework / LINQ). This approach allows you to layer on the appropriate filters but gives you much less control over how the query is written, which can make for some really nasty nested queries. It also lets developers introduce some tight database coupling (directly to tables), which I personally have a strong distaste for... code:
|
# ? Nov 27, 2018 03:17 |
|
Are the filters a list (set number of items), or user entered?
|
# ? Nov 27, 2018 04:13 |
|
kloa posted:Are the filters a list (set number of items), or user entered? Both. Enumerated values (predefined lists) and text search for specific columns. How does my answer change yours?
|
# ? Nov 27, 2018 09:15 |
|
If all of the filters were pre-defined (Status/Code/Type), they'd all eventually get cached in the DBMS, so I would've said to just let the devs stick with EF queries. Is the wildcard search performant now, or is it already slow and you're trying to figure out a quicker way to retrieve data? I don't know how complex the columns for wildcard searching are, but I'd like to think the DBMS can handle searching through hundreds of thousands of rows pretty easily, with or without indexes.
|
# ? Nov 28, 2018 05:05 |
|
I have all three implementations and they are all performant enough. What I’m looking for is simplicity of maintenance / minor development and preferably minimal database coupling. All three approaches have what I consider to be significant downsides, so I guess I’m just hoping I’m missing some obvious magic bullet.
|
# ? Nov 28, 2018 06:10 |
|
Ah okay. I think you've found the typical choices for building most CRUD apps. I'd say go for EF so you can at least dump it into source control, and not worry about keeping track of a SQL script somewhere. Also, you won't have business logic in multiple places to maintain. From personal experience, there are more programmers than people that know SQL, so it would end up easier to maintain within the code. If you do find a magic bullet, I hope you post it itt
|
# ? Nov 28, 2018 06:47 |
|
We go the route of kitchen sink queries, generally. Sometimes there are performance problems but by parameterizing the filters, you can tell SQL Server to optimize the plans for unknown parameter values which can help reduce the risk. Sometimes for extremely complex queries, we do router sprocs (once or twice where the child sprocs were identical but one called for tenants with large data sets and the other with small data sets), d-sql, or other half clever half insane techniques.
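The "optimize the plans for unknown parameter values" part presumably refers to SQL Server's OPTIMIZE FOR UNKNOWN hint; a sketch with the same hypothetical table and parameters used earlier in the discussion:

```sql
-- Instead of sniffing the first caller's parameter values, the optimizer
-- builds the plan from average density statistics, which reduces the risk
-- of caching a plan that is terrible for other parameter combinations.
SELECT i.ItemId, i.Name
FROM   dbo.Items AS i
WHERE  (@Status IS NULL OR i.Status = @Status)
  AND  (@Code   IS NULL OR i.Code   = @Code)
OPTION (OPTIMIZE FOR UNKNOWN);
```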
|
# ? Nov 28, 2018 06:59 |
|
Worth noting that you can write a kitchen sink query in EF.

Nth Doctor posted:We go the route of kitchen sink queries, generally. Sometimes there are performance problems but by parameterizing the filters, you can tell SQL Server to optimize the plans for unknown parameter values which can help reduce the risk. How do you do that?
|
# ? Nov 28, 2018 14:54 |
|
Munkeymon posted:Worth noting that you can write a kitchen sink query in EF Here's an old blog post about it I'm pretty sure there's a way to do this at the server level, since I believe we've done it, but I couldn't find an article about it in the 3 minutes I spent looking. I'll poke around a bit more. E: Here's an article showing how you can disable parameter sniffing at multiple levels. E2: It helps to include the link. Nth Doctor fucked around with this message at 21:27 on Nov 28, 2018 |
# ? Nov 28, 2018 21:09 |
|
In T-SQL, is there a difference between putting a condition in the JOIN's ON clause versus in the WHERE clause? I'm still learning SQL, and I've seen it mentioned, but I'm not sure I fully understand what occurs. SQL code:
SQL code:
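The two queries were lost in the archive; from context they presumably differed like this (hypothetical tables T1/T2, with the Event column on T2):

```sql
-- Condition in the ON clause: T2 rows are restricted before the join,
-- so every T1 row survives - those without a matching 'AAA' event get NULLs.
SELECT t1.Id, t2.Event
FROM   T1 AS t1
LEFT JOIN T2 AS t2
       ON t2.Id = t1.Id
      AND t2.Event = 'AAA';

-- Condition in the WHERE clause: applied after the join, so the NULL rows
-- the left join produced are filtered out - this behaves like an inner join.
SELECT t1.Id, t2.Event
FROM   T1 AS t1
LEFT JOIN T2 AS t2
       ON t2.Id = t1.Id
WHERE  t2.Event = 'AAA';
```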
|
# ? Dec 1, 2018 21:17 |
|
There can be a difference, yes. See: https://community.modeanalytics.com/sql/tutorial/sql-left-join/ https://community.modeanalytics.com/sql/tutorial/sql-joins-where-vs-on/
|
# ? Dec 1, 2018 21:40 |
|
Hughmoris posted:In TSQL, is there a difference between putting a clause in the JOIN statement versus in the WHERE statement? Well, that would entirely depend on which table Event comes from. If it's in table 1 then it doesn't matter, if it's in table 2 then the second one stops it being a left join by forcing a value to be required. Version 1 gets T1 with all columns from T2 where the keys match and Event = AAA, or nulls. The second one gets T1 with all columns from T2 where the keys match or nulls, and then demands that Event = AAA, which is impossible if it's not set.
|
# ? Dec 1, 2018 21:42 |
|
Moonwolf posted:Well, that would entirely depend on which table Event comes from. If it's in table 1 then it doesn't matter, if it's in table 2 then the second one stops it being a left join by forcing a value to be required. Version 1 gets T1 with all columns from T2 where the keys match and Event = AAA, or nulls. The second one gets T1 with all columns from T2 where the keys match or nulls, and then demands that Event = AAA, which is impossible if it's not set. Yeah this. You may find yourself doing things either way depending on your needs.
|
# ? Dec 2, 2018 00:01 |
|
Although in many cases it doesn't change the results of the query, the principle is: if something relates to the logic for pairing up the rows in the tables, it belongs in the ON clause; if it relates to the logic for filtering rows from one or more tables, it belongs in the WHERE clause. Following this principle will make your queries more readable, and it also means that when you do perform an outer join, you won't need to deviate from the practice you follow everywhere else in order to obtain the outer join behaviour.
|
# ? Dec 2, 2018 00:13 |
|
|
Contrary to that, since outer joins are so special and inner joins are routine, I want outers to look different and will continue to write inners as straightforward queries with all the tables in one place, all the field equalities together, all the restrictions together, and let my query planner do its job.
|
# ? Dec 2, 2018 00:20 |