Doh004 posted: Someone just got a promotion.

I'm the lead/sole backend dev at a startup in LA and have been for a while; we hit first bookable revenue this month, which is a nice milestone.
|
|
# ? Oct 4, 2017 07:18 |
|
A MIRACLE posted: I'm the lead/sole backend dev at a startup in LA and have been for a while, we hit first bookable revenue this month which is a nice milestone.

Even better! Congrats
|
# ? Oct 4, 2017 12:10 |
|
The CI pipeline for one of our applications seems to be recreating the database every time it runs, by running all migrations. I've pointed out that the Rails documentation (and the boilerplate comments in schema.rb) say not to create databases with migrations, but the lead programmer for this product is concerned that schema.rb may not work correctly because it may not be truly database agnostic (dev machines use SQLite, the production server uses Oracle).

schema.rb looks to me like it's pretty much completely abstracted away from the actual SQL, and I'd have assumed that whatever DSL runs to keep migrations database agnostic also keeps rails db:reset (or its equivalents) agnostic. One former employee even said that at his previous job, they didn't keep old migrations around in source control, because they just aren't very useful after a while.

So:

1) Is CI an exception to the general rule, where migrations should be used for creating the database, or should it also use schema.rb?
2) Do we need to worry that a schema.rb generated on a dev box using SQLite might not work properly under Oracle?
|
# ? Oct 4, 2017 19:26 |
|
CI should do whatever gets the DB up fastest while still somewhat resembling reality. Typically a `schema:load` is faster than running all the migrations, so I would normally do that. Running the migrations might get you a little test coverage for the migration files, but the only thing that will ever give you real confidence that the migrations work is running them against production data.

If you were using the same database everywhere, I'd say you don't have much to worry about with schema.rb. But since you're split between SQLite and Oracle, there are legit concerns. The `schema.rb` workflow will only really work if you're strictly using stuff that's supported by both DBs. Some examples of things that might not work:

* Configuration statements for your DB (like `CREATE EXTENSION` in PostgreSQL) might not be dumped into the schema file
* Foreign keys (not supported by SQLite; `add_foreign_key` will run but isn't dumped into the schema file)

The best practice is to use the same DB in test that you use in production, for a lot of reasons, but I understand this is probably easier with MySQL and PostgreSQL than Oracle.

Individual migration files shouldn't live in the code forever, because they're just cruft that you'll have to ignore in your searching and linting tools. The practice I've been following is to squash all migrations once a year or so into the initial migration file. There's a gem that does this: https://github.com/jalkoby/squasher. But if everyone typically loads the DB from schema, I guess it doesn't matter if you just outright delete the migrations you think never need to run again.
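In CI config terms, the difference is a one-line swap (standard Rails rake tasks assumed; adapt to your pipeline):

```shell
# Load the dumped schema in one shot -- typically the fast path for CI:
bin/rails db:schema:load

# ...instead of replaying the entire migration history:
bin/rails db:migrate
```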
|
# ? Oct 5, 2017 02:56 |
|
Sivart13 posted: Good stuff

Thanks! I admit that isn't exactly what I wanted to hear, but at least maybe I won't make an rear end of myself at the next meeting now.

Is it worthwhile to switch my dev db to MySQL? By "worthwhile", I mean: are there fewer likely compatibility issues between MySQL and Oracle than between SQLite and Oracle, or is it just exchanging one set of issues for another? I admit I like how easy it is to use SQLite, despite having had some reservations about it when I started using it, but it wouldn't be hard to use MySQL on my dev box instead.
|
# ? Oct 5, 2017 16:23 |
|
Using mysql isn't really any better for compatibility, I'd just stick with sqlite for local dev.
|
# ? Oct 5, 2017 22:53 |
|
If you're using MySQL (or Aurora) in production it's not a bad idea to have MySQL locally so you can encounter odd behaviour before it hits prod. The other scenario is that you're doing something over and above ActiveRecord that is specific to MySQL but I really, really recommend against that.
|
# ? Oct 6, 2017 06:16 |
|
Yes but he's using Oracle in production.
|
# ? Oct 6, 2017 08:40 |
|
A minor optimization issue came up the other day in a project, and I was surprised how much trouble I had speeding up some work with models. I did some crude profiling, found a couple of sections of a page load that were hammering the database (relatively speaking), and tried a few things to eager load association data to reduce the database hits. I was successful in reducing the number of queries in at least one place, but it increased the total page load time for the page in question.

I tried using includes() and eager_load() on an object's associated objects (so, on a CollectionProxy, I think), and at least one time I did manage to collapse a bunch of queries down to one, but it didn't have the desired effect. I also tried `left_outer_joins` at one point, with similarly disappointing results. I can't spend much more time on this right now (it's just something another developer was concerned about, and just a nice-to-have refactoring issue for now), but I'm curious if there are any obvious or common pitfalls to using these methods. Possible issues that I would explore if I had more time:

The method being run to populate and display this page is heavy with mathematical computations, and in a pinch we can probably speed things up by tweaking the schema to store some partial results, but I'd like to clean up the SQL before deciding if we need to denormalize the database. This project uses Rails 5.1.4 and Ruby 2.4.0.

Peristalsis fucked around with this message at 17:17 on Nov 15, 2017 |
# ? Nov 15, 2017 17:12 |
|
Peristalsis posted: A minor optimization issue came up the other day in a project, and I was surprised how much trouble I had speeding up some work with models. [...]

I'd guess the reasons why you didn't see a speedup are these, in order of likelihood:

1) sqlite3 is not a good database. Set up a local postgres. You can do it in Docker these days.
2) The reason page load is taking a long time is not database query speed but ruby speed. Things like GROUPs and SUMs are cheap and fast in the database and slow and expensive in ruby, so push those down if possible. Looking more at your algorithms will yield other improvements.
3) The way you are using the objects is not benefiting from the includes at all.

I would isolate querying and rendering and run metrics on either side of that to identify where the time is truly being spent.
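The "isolate querying and rendering" idea can be done with the stdlib Benchmark module; here's a sketch with made-up stand-ins for the real query and view phases (nothing below comes from the actual project):

```ruby
require "benchmark"

# Hypothetical stand-ins for the app's real query and render phases.
def fetch_records
  sleep 0.01                  # pretend this is the DB round trips
  [1, 2, 3]
end

def render_records(records)
  records.map { |r| r * 2 }   # pretend this is the view/computation work
end

# Time each phase separately to see where the page load actually goes.
query_time  = Benchmark.realtime { @records = fetch_records }
render_time = Benchmark.realtime { @rendered = render_records(@records) }

puts format("query: %.3fs, render: %.3fs", query_time, render_time)
```

If the render side dominates, no amount of query collapsing will move the needle.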
|
# ? Nov 16, 2017 00:26 |
|
kayakyakr posted:I'd guess the reasons why you didn't see a speedup are these, in order of likelihood: Agreed on all three points. I would try Postgres locally if possible, for many reasons. I wouldn't necessarily expect sqlite to reflect certain db optimizations. If you're on a Mac, Homebrew does a pretty decent job of installing/running it these days, easier than Postgres.app even.
|
# ? Nov 16, 2017 07:57 |
|
kayakyakr posted: I'd guess the reasons why you didn't see a speedup are these, in order of likelihood: [...]

The Milkman posted: Agreed on all three points. I would try Postgres locally if possible, for many reasons. I wouldn't necessarily expect sqlite to reflect certain db optimizations. If you're on a Mac, Homebrew does a pretty decent job of installing/running it these days, easier than Postgres.app even.

Thanks folks! I have MySQL installed already for another project; maybe I'll switch this project over to it for a quick test. I assume it's a serious enough db to show some results if there should be improvements. I'm working on Ubuntu, but I do have a MacBook for meetings and whatnot that I can use if I do end up installing Postgres.

Edit: Also, regarding item 2) above, I have no doubt that the computations involved in this particular page load are taking up time, but I don't think that would explain why collapsing 50 - 100 db hits down to 1 or 2 would slow the total load time. The profiling I used did actually show that the new query itself took longer than the ones it replaced.

Peristalsis fucked around with this message at 18:35 on Nov 16, 2017 |
# ? Nov 16, 2017 18:30 |
|
How are your indexes looking? Also, please try to replicate the environment used in production on your development machine as much as possible.
|
# ? Nov 16, 2017 19:08 |
|
Doh004 posted: How are your indexes looking?

I'm not sure about the indexes - I'll have to look into that. Just switching to MySQL didn't do much good, so I'm doing something wrong in my changes. The data model is roughly this:

* Experiments have many Plates
* Plates have many Samples
* Samples have many DataPoints

One of the places I'm trying to optimize is an Experiment method that returns a subset of its samples, based on sample_type. So, this code:
There are a number of different things that could be tried here, but I just wanted to see if I could pre-load some data to start off, and changed code:
code:
|
# ? Nov 16, 2017 19:57 |
|
SQLite is plenty fast. The issue is both populating a tree of objects from the queries and then filtering those objects in ruby. To get performance here you really need to leverage the database for querying what you want. That map.flatten call had to iterate all the stuff twice before you even filter. So that's at least three full iterations of the full results in ruby.
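To make the iteration-count point concrete, here's a toy version in plain Ruby (model names borrowed from the thread; the data and the suggested ActiveRecord query are invented for illustration):

```ruby
# Toy stand-ins for the real models.
Sample = Struct.new(:sample_type)
Plate  = Struct.new(:samples)

plates = [
  Plate.new([Sample.new("control"), Sample.new("treatment")]),
  Plate.new([Sample.new("control")]),
]

# Three passes over the data: one for map, one for flatten, one for select.
slow = plates.map { |p| p.samples }.flatten
             .select { |s| s.sample_type == "control" }

# flat_map fuses the first two passes; better still, in ActiveRecord the
# whole filter can move into SQL, along the lines of (hypothetical):
#   Sample.joins(:plate).where(plates: { experiment_id: id },
#                              sample_type: "control")
fast = plates.flat_map(&:samples).select { |s| s.sample_type == "control" }

puts fast.length   # prints 2 -- same result either way
```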
|
# ? Nov 17, 2017 18:53 |
|
necrotic posted: SQLite is plenty fast. The issue is both populating a tree of objects from the queries and then filtering those objects in ruby. To get performance here you really need to leverage the database for querying what you want.

Thanks, I'll look into restructuring the loop so it only goes over the samples once. Also, I did check the indexing: samples are indexed on plate_id, and plates are indexed on experiment_id - I'm not an expert, but that seems like the obvious list of things to index.

I suspect there's a larger, structural problem here - something being done brute-force that could be done better, or maybe repeated queries over the same data. It's not really my project, so I'm not putting in the time I would need to really tear apart the existing code, but I'm baffled that pre-loading data didn't at least help the slow thing run faster, even if it should be doing something else to begin with.
|
# ? Nov 20, 2017 17:10 |
|
code:
code:
|
# ? Nov 20, 2017 17:10 |
|
I am a hobbyist trying to improve on a small web app I have. It's a super simple app that queries an external API, which pulls a bunch of data into my app's database. The app then does some math and displays output to the user. It's a Rails 5 app running on Heroku.

I'm currently just letting the web dyno on Heroku do all this work. The call to the API takes ~20 seconds for most users (the math part is trivial, less than 500ms), so the web dyno doesn't time out. But I'm starting to get more users, and obviously the web dyno doing all this work for user A prevents user B from accessing the site until the work is done. Oh, and it's non-optimal that user A is just left hanging for 20 seconds.

I'm looking into a way to pass the external API call off to a worker dyno and show the user a page with a spinner that just says something dumb like "hang on a sec", which, when the process is done, redirects the user to the results page. I've watched the railscasts on Beanstalkd, Resque, and Delayed Job, and I'm a little confused about which is the best approach for me. Delayed Job seems the simplest, and my use case is certainly easy; I just don't understand how to actually asynchronously process poo poo in the background while having the user's browser poll or something to determine when the job is actually complete and take them to the right place. I don't want them to have to F5 to figure out when the job is done; I just want it to process async, and then redirect when done. Does anyone have any tips/suggestions? Thanks.
|
# ? Jan 19, 2018 23:17 |
|
You cannot just "process async" in Ruby, you need something like Delayed Job to handle things in another process. Rails 5 has ActiveJob which would be the quickest way forward, but you still have to run a background process to handle the queue.
|
# ? Jan 20, 2018 00:26 |
|
necrotic posted:You cannot just "process async" in Ruby, you need something like Delayed Job to handle things in another process. Rails 5 has ActiveJob which would be the quickest way forward, but you still have to run a background process to handle the queue. I get that, I just can't figure out how to make what I want happen in my controller. I'll post some code when I get home, hopefully that will be more clear.
|
# ? Jan 20, 2018 00:36 |
Sub Par, I think I know what you're asking. You're going to want to read up on ActiveJob class inheritors and use something like Sidekiq to manage async processes. Then in your controller you're going to want something like:

controllers/my_active_record_thing_controller.rb Ruby code:
models/my_active_record_thing.rb Ruby code:
Ruby code:
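To give those file names some flesh, here's a sketch of the kind of wiring being described -- every class, method, and column name below is hypothetical, and it assumes a Rails app with ActiveJob configured to use Sidekiq (or another adapter):

```ruby
# controllers/my_active_record_thing_controller.rb
class MyActiveRecordThingController < ApplicationController
  def create
    thing = MyActiveRecordThing.create!(thing_params)
    thing.refresh_later            # enqueue the slow work, return immediately
    render json: { id: thing.id, status: "queued" }, status: :accepted
  end

  def show                          # the endpoint the browser polls
    thing = MyActiveRecordThing.find(params[:id])
    render json: { id: thing.id, status: thing.status }
  end
end

# models/my_active_record_thing.rb
class MyActiveRecordThing < ApplicationRecord
  def refresh_later
    RefreshJob.perform_later(id)
  end
end

# jobs/refresh_job.rb -- ActiveJob; the Sidekiq process actually runs it
class RefreshJob < ApplicationJob
  queue_as :default

  def perform(thing_id)
    thing = MyActiveRecordThing.find(thing_id)
    # ... slow external API call happens here ...
    thing.update!(status: "done")
  end
end
```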
|
|
# ? Jan 20, 2018 01:10 |
|
You can track the state of the job somewhere in your DB - e.g. you can make a model like SomeJobStatus and set its state column to pending before it's run. Then you pass the ID of an instance of this model to the worker, and when the job is finished you set the state to finished, error if it breaks, etc. This way you can make a controller action that will return the status of your job - it would just have the query to find that job and serialize it into JSON. You can then poll this action via ajax. This is just one solution.

I don't know if Delayed Job (been a while since I've used it) automatically tracks these states and if you can get a job_id easily - if it does, you don't have to manually track this stuff, so it'll make your life simpler. I know with Sidekiq you can get the job_id and track job states easily, but it has Redis as a dependency and that costs money on Heroku.
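The lifecycle described above can be simulated in plain Ruby, with a thread standing in for the worker process (all names below are invented for illustration; in a real app the status lives in a DB row and the "poller" is an ajax endpoint):

```ruby
# In-memory stand-in for a SomeJobStatus row: pending -> running -> finished.
class JobStatus
  attr_reader :state, :result

  def initialize
    @state = "pending"
    @mutex = Mutex.new
  end

  def update!(state, result = nil)
    @mutex.synchronize { @state, @result = state, result }
  end
end

status = JobStatus.new

# The "worker" -- in a real app this is the Delayed Job / Sidekiq process.
worker = Thread.new do
  begin
    status.update!("running")
    answer = 21 * 2                  # stand-in for the slow API call
    status.update!("finished", answer)
  rescue => e
    status.update!("error", e.message)
  end
end

worker.join          # the browser-side poller would instead hit JSON repeatedly
puts status.state    # prints "finished"
```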
|
# ? Jan 20, 2018 01:12 |
|
This is exactly what I'm looking for, thanks. I think I can piece together what I need between this and Google. Thanks again.
|
# ? Jan 20, 2018 01:20 |
|
Also you should look into using Puma for your webserver, which allows several requests to happen at the same time, even if you're only running one dyno https://devcenter.heroku.com/articles/deploying-rails-applications-with-the-puma-web-server
|
# ? Jan 20, 2018 03:34 |
|
It's probably already on Puma since it's the default now, but do set up your config/puma.rb per the Heroku docs.
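For reference, the shape that the Heroku article recommends for config/puma.rb looks roughly like this (exact values are illustrative; check the current docs for your Rails version):

```ruby
# config/puma.rb -- clustered workers, each with its own thread pool
workers Integer(ENV["WEB_CONCURRENCY"] || 2)
threads_count = Integer(ENV["RAILS_MAX_THREADS"] || 5)
threads threads_count, threads_count

preload_app!

port        ENV["PORT"] || 3000
environment ENV["RACK_ENV"] || "development"

on_worker_boot do
  # Re-establish DB connections after fork; needed with preload_app!
  ActiveRecord::Base.establish_connection
end
```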
|
# ? Jan 20, 2018 04:42 |
|
In any case, if an action takes 20 seconds, you should pretty much move it to a background job. Especially with API calls: if there's something like a network error, you can raise an error inside the worker and it'll retry automatically, up to however many times you've configured.
|
# ? Jan 20, 2018 11:54 |
Pretty much any controller endpoint that calls an external API for the user should be async. As a rule I only do the minimum work necessary for a valid response in the endpoint, and offload everything else to worker jobs. I've allowed my app developers a `&sync=true` param for a couple of endpoints if they want to wait for the data on demand, like a credit check on the current user for example. This might be bad design, but I have to make concessions to the front end guys occasionally, because a positive work relationship with them is a lot more important to me than adhering absolutely to REST design principles. I'm also the sole API / browser developer at my startup, so I get to make these decisions and deal with the fallout (or lack thereof) down the line. So far it's been fine.
|
|
# ? Jan 22, 2018 00:33 |
|
Thanks everyone, I finally got this working. I ended up using Delayed Jobs. I wrote a custom job, then built a table to store information about job state, used the hooks provided by delayed jobs to update that table as job goes from queued -> running -> done, and then used JS polling to check that status table for job completion, at which point I redirect the user to the right place. It's all working magically. Now let's hope I don't bankrupt myself by accidentally spinning up too many dynos on Heroku.
|
# ? Jan 25, 2018 19:47 |
|
A recent interview made me realize that I have forgotten a lot about proper ActiveRecord use and data modeling. Specifically, the repeated questions about single-table inheritance (which I've never used) and an exercise where I completely flubbed on designing the database. I'm also realizing that I've never actually done anything harder or more complicated than has_one :through. Is there a book or similar resource I can read to catch back up on data modeling, fun database table fuckery, etc.?
|
# ? Jan 26, 2018 14:11 |
|
A MIRACLE posted:Sub Par, I think I know what you're asking. you're going to want to read up on ActiveJob class inheritors and use something like Sidekiq to manage async processes Be really careful if you decide to go this route. I've been bitten before and now I don't let callbacks escape whatever layer they're in (this is traversing from models to jobs.) If you're working with multiple layers, wrap it in a PORO as an Operation.
|
# ? Feb 7, 2018 02:38 |
|
Pollyanna posted:Is there a book or similar resource I can read up on to catch back up with data modeling, fun database table fuckery, etc.? Late reply, but the Rails guides and docs go into the advanced features in a good amount of detail. And I wouldn't stress over missing the questions about STI, as in the wild it is frequently indicative of either A Present Living Hell or A Coming Armageddon. If I walked into an interview and they kept hammering me about implementing STI, I would be concerned.
|
# ? Feb 28, 2018 06:17 |
|
Molten Llama posted:Late reply, but the Rails guides and docs go into the advanced features in a good amount of detail. Interesting. Good to know...Single Table Inheritance comes up a lot in interviews in my experience. That's potentially worrisome. Maybe I don't want to use it.
|
# ? Feb 28, 2018 16:10 |
|
Pollyanna posted:Interesting. Good to know...Single Table Inheritance comes up a lot in interviews in my experience. That's potentially worrisome. Maybe I don't want to use it. Edit: Well, one of many Rails bastard children
|
# ? Feb 28, 2018 16:49 |
|
I've come across a few situations where STI 'felt' like the right fit and it's proven to work out. But that was going in with a healthy level of cautiousness using it. If it came up more than like once in an interview yeah that's a warning sign.
|
# ? Feb 28, 2018 18:06 |
Well my startup went belly up. If anyone wants to hire me for remote or local to LA pm me. I got a decent CV and some lead / senior level experience
|
|
# ? Mar 2, 2018 17:11 |
|
I'm a little late to the STI discussion, but is there an accepted better way to model inheritance of Model classes in Rails? The only other way I've seen is to have a separate table for each child model, with references back to the parent model. So, something like this:

code:
code:
code:
Is STI just considered kind of an ugly hack that's too loose? For example, it seems unfortunate to me that with STI, if classes B and C both inherit from A, B objects can access attributes intended to be used only by C objects (though they'd have to use exact db attribute names, rather than association names that can be specific to subclasses). Our current product is starting to use a few STI models, and if this is a recipe for future problems, I'd like to know now.
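For concreteness, the two shapes being compared might look like this, sketched with hypothetical vehicle models (you'd pick one approach per hierarchy; they're shown side by side here):

```ruby
# --- STI: one table, a "type" column picks the class -------------------
class Vehicle < ApplicationRecord; end   # table: vehicles (every column any subclass needs)
class Car   < Vehicle; end               # rows where type = "Car"
class Truck < Vehicle; end               # rows where type = "Truck"

# --- Separate tables: shared columns on the parent, extras joined in ---
class BaseVehicle < ApplicationRecord    # table: base_vehicles (shared columns)
  has_one :car_detail
  has_one :truck_detail
end

class CarDetail < ApplicationRecord      # table: car_details (car-only columns)
  belongs_to :base_vehicle
end
```

The STI version is what makes the cross-subclass attribute leakage possible: every row physically has every column.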
|
# ? Mar 6, 2018 20:19 |
|
Peristalsis posted:I'm a little late to the STI discussion, but is there an accepted better way to model inheritance of Model classes in Rails? The only other way I've seen is to have a separate table for each child model, with references back to the parent model. So, something like this: Could the subclasses get their additional fields from a joined table via `has_one` or something like that? I've never seen an implementation of STI where the subclasses used different base tables.
|
# ? Mar 6, 2018 20:58 |
|
Peristalsis posted: I'm a little late to the STI discussion, but is there an accepted better way to model inheritance of Model classes in Rails? [...]

No, you just model each child table with all of the properties of the parent class. Or you use a concern as the parent and not direct inheritance. The other way of doing real STI would be to use a json field in postgres, where children store their properties in json, or to use a document database like mongodb where it's all dynamic json.
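The concern-as-parent option mentioned here might look like this (module, scope, and column names are all invented):

```ruby
# Shared behavior lives in a concern; each model keeps its own full table,
# so there's no "type" column and no shared-table coupling.
module Archivable
  extend ActiveSupport::Concern

  included do
    scope :active, -> { where(archived_at: nil) }
  end

  def archive!
    update!(archived_at: Time.current)
  end
end

class Car < ApplicationRecord
  include Archivable   # the cars table has its own archived_at column
end

class Truck < ApplicationRecord
  include Archivable   # so does trucks -- duplicated column, independent tables
end
```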
|
# ? Mar 6, 2018 21:46 |
|
kayakyakr posted:The other way of doing real STI would be to use a json field in postgres where children store their properties in json, or to use a document database like mongodb where it's all dynamic json. Yeah, the last time I dealt with STI, the subclasses used a json field for extended data. I'm not sure that was the best way to do it, but it worked well enough for a shallow tree with minor differences.
|
# ? Mar 6, 2018 22:38 |
|
|
I may have considered using STI along with Postgres’s table inheritance at one point but I’ve since stopped thinking about the monstrosity that spawned that idea.
|
# ? Mar 7, 2018 04:16 |