|
The Dreamer posted:I'm trying to randomly grab a field from a randomly selected record that is associated with another record. I see that I can do this by ordering and calling a Random function for whichever database I'm using. This is still a prototype app so I'll probably actually do this before I go to production. I found a different way to do this though and was wondering if it was actually a terrible idea. It works now but I'm not sure how it would scale over time with larger database tables. You could do a quick COUNT(*) then choose a random record using rand(count) as your offset and a LIMIT 1
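For anyone skimming later, the COUNT(*)-then-random-offset trick can be sketched in plain Ruby. The `records` array and the `Widget` model here are made up purely for illustration:

```ruby
# Simulate SELECT COUNT(*) ... then OFFSET rand(count) LIMIT 1.
# The `records` array stands in for rows in a table.
records = %w[alpha beta gamma delta]
count   = records.size     # COUNT(*)
offset  = rand(count)      # 0..count-1, so always a valid offset
picked  = records[offset]  # OFFSET offset LIMIT 1
# With ActiveRecord the equivalent would be roughly:
#   Widget.offset(rand(Widget.count)).first
```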
|
# ? May 29, 2015 07:37 |
|
MALE SHOEGAZE posted:What are you trying to do that you need something automating shell input? There's got to be a better way. If this is directed at me, then I'm trying to let people fire off a remote process from within our application. The app tracks research activity, and users can currently note that they ran a process on some data files (e.g. some R script to run a statistical analysis), and link the work record with the input and output files of that process. (You didn't hear this from me, but the whole app is basically a glorified spreadsheet.) We'd like to give them the option to record the pertinent data, and then invoke the remote process all at once. Of course, they won't be able to enter any info about the output files when they haven't run the script yet, but the remote process can make additional updates to the records in the application with curl commands, using an XML API. By default, most of the login, executable file, input parameter, and output redirection information will be inherited from parent objects in the system, but the user will be able to modify them for specific cases, enter their password (which will be filtered from the log files), and then click a button to have the application use their login info to start their process remotely (and asynchronously). Like I said, this is a prototype at this point - we may well restrict them to specific servers we own, disallow changes to the default values except by admins, or just scrap the approach entirely, but I'd like to make it work well enough to have a non-theoretical discussion about which direction we want to go. If plain old command-line ssh has an option to suppress the additional password prompts, I guess I could just construct a command string to use with backticks in the ruby code, if I can't find a similar option in net-ssh. We've had to do similar things one or two other times with ruby features that were slower than their CLI counterparts.
|
# ? May 29, 2015 20:35 |
|
The Journey Fraternity posted:You could do a quick COUNT(*) then choose a random record using rand(count) as your offset and a LIMIT 1 Thanks. I'll give that a try.
|
# ? May 29, 2015 21:00 |
|
Peristalsis, ssh has NumberOfPasswordPrompts as a config option (see man ssh_config). I don't know if Net::SSH gives access to it. On the command line it'd be something like ssh -o NumberOfPasswordPrompts=1 user@host
|
# ? May 29, 2015 23:05 |
|
Active model serializer is dragging rear end. I switched to it from jbuilder because it was even slower. Applying AMS's caching seems to have helped with a 50% gain, but we're still churning on small things that should be near instant, even on a slow, crappy server. Plus we've got a use case that returns 1500+ records on the reg that was taking 30 seconds and is now down to 15 with caching. So, anyone have any suggestions for faster JSON responses aside from just generating the JSON on the database & never instantiating rails objects?
|
# ? May 30, 2015 05:22 |
|
Peristalsis posted:If plain old command-line ssh has an option to suppress the additional password prompts, I guess I could just construct a command string to use with backticks in the ruby code, if I can't find a similar option in net-ssh. We've had to do similar things one or two other times with ruby features that were slower than their CLI counterparts. There is not a way to remove the prompt for password-based logins. The proper way to solve this situation is using key pairs, or possibly through an LDAP system.
|
# ? May 30, 2015 22:17 |
|
necrotic posted:There is not a way to remove the prompt for password-based logins. The proper way to solve this situation is using key pairs, or possibly through an LDAP system. To be clear, I'm looking to remove the additional prompts if the user enters the wrong password the first time. I'm not sure if your comment accounts for that, but if so, it seems to be in conflict with Soup in a Bag's comment, above. The net-ssh gem allows me to pass the login and password along at the same time, removing the first explicit command line password prompt altogether. Using a key system rather than passwords is definitely a possibility, just one I didn't want to delve into, since I'm not familiar with it. We do use LDAP logins for our main application, though that was coded by someone else, and I'm not really familiar with it, either. Regardless, I appreciate all the responses, and I'll take a look at my options when I'm at work again on Monday.
|
# ? May 31, 2015 01:08 |
|
Fell back to formatting the JSON on the database and passing it through to the front end. Using mongo for this project, so had to figure out how to get aggregates to work. Cut my call to 4-8 seconds for the 1600 result query, sub second for a more sane 40 result query. That should drop further with a beefier database server as the database call is taking 2-3 seconds now. Still not entirely happy with how long it is taking rails to go from hash to json, might figure out later how to pass the raw string directly to the response.
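For concreteness, a sketch of the kind of aggregation pipeline involved. Every field name here is invented; the point is doing the shaping (and any renaming) on the database side so Rails never instantiates model objects:

```ruby
# Hypothetical pipeline: match, then project the response shape on the DB.
# All field names are made up for illustration.
pipeline = [
  { "$match"   => { "active" => true } },
  { "$project" => {
      "_id"       => 0,
      "name"      => 1,
      "groupings" => "$grouping_fields",  # rename at the DB, not in Ruby
  } },
]
# With the mongo driver this would run as roughly:
#   collection.aggregate(pipeline).to_a  # ready-to-serialize hashes
```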
|
# ? May 31, 2015 14:41 |
|
Peristalsis posted:To be clear, I'm looking to remove the additional prompts if the user enters the wrong password the first time. I'm not sure if your comment accounts for that, but if so, it seems to be in conflict with Soup in a Bag's comment, above. The net-ssh gem allows me to pass the login and password along at the same time, removing the first explicit command line password prompt altogether. In case anyone is interested (or this comes up in a search later), net-ssh does accept :number_of_password_prompts as an option. This can be set to zero to prevent the system from prompting for a password after a failed login. ALSO, version 2.10, which is in beta, accepts a :non_interactive option, which can be set to true, and which should do the same thing.
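In case it helps the next searcher, here's roughly what that looks like as an options hash. The host, user, and command are made up; :number_of_password_prompts is the option named above:

```ruby
# Options as they'd be passed to Net::SSH.start. Host, user, and the
# remote command are invented for illustration.
options = {
  password: "wrong-pass-123",
  number_of_password_prompts: 0,  # fail fast instead of reprompting
}
# require 'net/ssh'
# Net::SSH.start("research-box.example.com", "alice", options) do |ssh|
#   ssh.exec!("Rscript analysis.R")
# end
```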
|
# ? Jun 1, 2015 15:21 |
|
kayakyakr posted:So, anyone have any suggestions for faster JSON responses aside from just generating the JSON on the database & never instantiating rails objects? Have you tried different JSON engines? The YAML-based one that's used as a fallback is notoriously slow, while there are a variety of ones that are much faster, including the C-based json/ext (from https://github.com/flori/json/tree/master ) and yajl-ruby, and undoubtedly some Java-based ones for JRuby.
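As a quick illustration, the stdlib json gem (which uses the C extension json/ext when available) already beats the pure-Ruby fallback, and swapping engines is a one-line change. The payload here is made up:

```ruby
require 'json'  # pulls in the C extension (json/ext) when it's available

payload = { "id" => 42, "name" => "widget" }
body = JSON.generate(payload)
# => '{"id":42,"name":"widget"}'
# Swapping in yajl-ruby would look roughly like:
#   require 'yajl'
#   body = Yajl::Encoder.encode(payload)
```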
|
# ? Jun 1, 2015 16:05 |
|
Cocoa Crispies posted:Have you tried different JSON engines? The YAML-based one that's used as a fallback is notoriously slow, while there are a variety of ones that are much faster, including the C-based json/ext (from https://github.com/flori/json/tree/master ) and yajl-ruby, and undoubtedly some Java-based ones for JRuby. Decent boost, thanks for the recommendation. Using yajl for now, though I'll try out oj as soon as rbx 2.5+ doesn't ignore interrupts.
|
# ? Jun 1, 2015 20:27 |
|
Surely this is your answer: https://github.com/dcodeIO/ProtoBuf.js/ probably not your answer Obvious question, but you're serializing only the fields you need, right?
|
# ? Jun 2, 2015 06:27 |
|
MALE SHOEGAZE posted:Surely this is your answer: Serializing the fields we need for that call, or serializing all the fields we'll ever need from any call to that endpoint? Cause we're definitely doing one of those. We're denormalizing associations into the table, so theoretically, reading direct from the database is going to be the fastest possible. The renaming of return values is a problem (so grouping_fields just becomes groupings) that will have to wait for a bigger DB server than the babby's first server I've got it on now. Next time I do performance tuning, I'll create serializers just for the denormalization process so we don't have a bunch of useless poo poo coming across the wire.
|
# ? Jun 2, 2015 15:34 |
|
This may be more of a general programming/HTML question, but we're using Ruby on Rails, so I'll try here first. Our app has a list of files on a form page, to let users associate files with the object being created with the form. The HTML uses an unordered list to show the files, and only shows files owned by selected users. The problem is that at least one user has many thousands of files, meaning our list has many thousands of <li> elements, and that seems to be choking browsers some of the time. And it's always slow to work with when loading and/or looking at the problematic users' files. Part of the problem is that we're pre-loading the entire collection of files for all users, and just hiding the ones not currently being browsed. We're probably not even doing that very efficiently, so I'm working on ways to optimize that code. But the simple fact is that some users have an ungodly number of files available to associate with the form object. So, even if I only load the files belonging to a particular person when the current user requests to see them, for some file owners, there are going to be lots of files to show. This seems like one of those problems that must have come up and been solved before - is there a standard or best-practice way to handle this situation? Is there a better structure than <ul> for huge lists? The model code that loads up these files' records causes another intermittent problem. It's a fairly complex nested ARel query with model.includes for eager loading of related objects. It sometimes makes my dev box hang, requiring me to use kill -KILL to clear it out and start over (I'm using WEBrick on dev). When this happens, the server's shell prompt hangs in the middle of displaying an EXPLAIN of the SQL command from MySQL. I suspected the issue was that the .includes resolves to a SQL query with too many entries in the IN clause (i.e. thousands), but the internet tells me that MySQL doesn't have a limit on the number of entries you can use in that clause. Also, sometimes it works, sometimes it doesn't, so it really can't be a syntax error like that. Oddly, this EXPLAIN command only shows when there is a problem - in instances where the server doesn't hang, there is no EXPLAIN shown in the server shell. I assume this means that there's an issue with executing the SQL query, which then prompts MySQL/Rails to run EXPLAIN. Is there a character limit on the query you can pass to EXPLAIN? Or maybe the EXPLAIN is always run, but only echoed to output if there's a problem?
|
# ? Jun 3, 2015 16:12 |
|
Do you have to display all the files at once? You could have some kind of a widget on your page that would show an x amount of files and then the rest would be paginated. You only load the files you need via AJAX, that way you would avoid the massive query. You could also implement some kind of a search for the files, kind of like the Select2 jQuery plugin works for select fields. Maybe I misunderstood your use case though.
|
# ? Jun 3, 2015 16:23 |
|
Gmaz posted:Do you have to display all the files at once? You could have some kind of a widget on your page that would show an x amount of files and then the rest would be paginated. You only load the files you need via AJAX, that way you would avoid the massive query. You could also implement some kind of a search for the files, kind of like Select2 jquery plugin works for select fields. This is kind of what I'm thinking. I guess I was hoping there was a "magically_show_more_data_than_realistically_possible" gem. Or maybe a "never use .includes with more than 100 rows or one dependency, use this other method instead" rule of thumb. The problem (aside from having to rewire a sizable chunk of the app) would be with sorting. We let users sort the files by owner, by file name, etc. So to show only, say, 100 files at a time, the server would still have to query all of them and put them in the appropriate order to know which 100 to show next. I think I will look at other ways to construct the query, so it's more understandable to me, and just in case it ends up making a difference in performance.
|
# ? Jun 3, 2015 16:46 |
|
If you order the data on the DB level and use offset and limit, you still won't have to load all the data. E.g. something like: Model.order('some_column DESC').offset(page_number * display_results_number).limit(display_results_number) Gmaz fucked around with this message at 17:50 on Jun 3, 2015 |
# ? Jun 3, 2015 17:46 |
|
Peristalsis posted:This is kind of what I'm thinking. I guess I was hoping there was a "magically_show_more_data_than_realistically_possible" gem. Or maybe a "never use .includes with more than 100 rows or one dependency, use this other method instead" rule of thumb. Sorting should be done on the DB and shouldn't be an issue. I recommend using Kaminari for paging. It's a solid toolset and the maintainer isn't as much of a dick as the will_paginate guy.
|
# ? Jun 3, 2015 17:58 |
|
MALE SHOEGAZE posted:Surely this is your answer: Ruby has protobuf libraries too: https://github.com/protobuf-ruby/beefcake
|
# ? Jun 4, 2015 19:33 |
|
Cocoa Crispies posted:Ruby has protobuffs libraries too: https://github.com/protobuf-ruby/beefcake Yes but they don't deserialize things in a browser!
|
# ? Jun 5, 2015 02:33 |
|
I started using Delayed Job in an app. What's the best way to get the workers running? There's a script for it, but right now we have to run it from the console. We're not using Capistrano, and so far all the suggestions are for Capistrano. So, what's the best practice for getting it started? The command is RAILS_ENV=production script/delayed_job start -n5
|
# ? Jun 15, 2015 21:21 |
|
KoRMaK posted:I started using Delayed Job in an app. What's the best way to get the workers running? There's a script for it, but right now we have to run it from the console. We're not using Capistrano, and so far all the suggestions are for Capistrano. So, what's the best practice for getting it started? The command is RAILS_ENV=production script/delayed_job start -n5 Use Sucker Punch for async and whenever for scheduled... drop delayed job, it's a PITA.
|
# ? Jun 15, 2015 22:37 |
|
kayakyakr posted:Use Sucker Punch for async and whenever for scheduled... drop delayed job, it's a PITA. Backgrounding tasks within your web workers with absolutely no guarantees of job durability or completion is a terrible idea. If a worker dies for some reason any jobs that haven't run will be lost. Delayed Job and other tools architected like it (read: with background worker processes and a persistent storage mechanism) are far more reliable and make a guarantee that a job will be completed. If you outgrow using the database for jobs, switch to Sidekiq or Resque. Sucker Punch is a hack for async on the free tier of Heroku when you only have one process to run in and shouldn't be used in any real production environment. Thalagyrt fucked around with this message at 23:02 on Jun 15, 2015 |
# ? Jun 15, 2015 22:58 |
|
Thalagyrt posted:Backgrounding tasks within your web workers with absolutely no guarantees of job durability or completion is a terrible idea. If a worker dies for some reason any jobs that haven't run will be lost. The others are also PITA to deploy, as KoRMaK is finding. If you plug in to the new ActiveJob stuff, then you're framework-flexible and can switch to whichever when you're ready to improve your deploy.
|
# ? Jun 15, 2015 23:28 |
|
kayakyakr posted:The others are also PITA to deploy, as KoRMaK is finding. If you plug in to the new ActiveJob stuff, then you're framework-flexible and can switch to whichever when you're ready to improve your deploy. How is Delayed Job anything at all resembling a PITA to deploy? Set up supervisor or runit or whatever the heck you want to use to run a couple instances of bundle exec rake jobs:work and you're done. Add a couple command line options to write out pidfiles if you want to be able to easily restart them with, say, Capistrano. Edit: If you're deploying on Heroku or a similar PaaS, or even just using Foreman on a VPS, it's even easier. Just add worker: bundle exec rake jobs:work to your procfile and you're done. In KoRMaK's case, I'd advise setting up supervisor. It's really rather easy - just have to get it to switch users to the right user and then run rake jobs:work and you're done. It can be done in about 15 minutes if you've never touched supervisor before. Thalagyrt fucked around with this message at 00:04 on Jun 16, 2015 |
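For the record, a minimal supervisor program block of the sort described might look like this. The paths, user, and worker count are all assumptions, not anyone's actual config:

```
[program:delayed_job]
command=bundle exec rake jobs:work
directory=/home/deploy/app/current
user=deploy
numprocs=2
process_name=%(program_name)s_%(process_num)s
autostart=true
autorestart=true
environment=RAILS_ENV="production"
```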
# ? Jun 15, 2015 23:47 |
|
kayakyakr posted:Use Sucker Punch for async and whenever for scheduled... drop delayed job, it's a PITA. Thalagyrt posted:I'd advise setting up supervisor. It's really rather easy - just have to get it to switch users to the right user and then run rake jobs:work and you're done. It can be done in about 15 minutes if you've never touched supervisor before. Cool, I'm checking supervisor out. ...it is kind of a pita tho :-/ (im a bad computer person) KoRMaK fucked around with this message at 15:48 on Jun 16, 2015 |
# ? Jun 16, 2015 13:31 |
|
Ugh, supervisor wants non-daemonized tasks, but the delayed_job script is daemonized and the rake task doesn't let me specify multiple workers. Should I just run rake jobs:work multiple times if I want multiple workers?
|
# ? Jun 16, 2015 16:25 |
|
KoRMaK posted:Ugh, supervisor wants non-daemonized tasks, but the delayed_job script is daemonized and the rake task doesn't let me specify multiple workers. Yeah, just run them as foreground workers. I have 4 services set up with runit for mine, and have delayed_job configured to write out pid files so I can just kill `cat delayed_job.*.pid` to restart all the workers. Here's my runfile for runit - you should be able to do the same with supervisor easily. My ~/shared/environment file just loads up all the env variables - RAILS_ENV and other config. code:
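(The quoted runfile didn't survive; a rough reconstruction of what a runit run script for that setup might look like, with the paths, user name, and identifier all invented:)

```
#!/bin/sh
# Hypothetical runit runfile -- paths, user, and identifier are made up.
exec 2>&1
cd /home/deploy/app/current
. /home/deploy/shared/environment        # RAILS_ENV and other config
exec chpst -u deploy \
  bundle exec bin/delayed_job run --identifier=1 --pid-dir=tmp/pids
```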
|
# ? Jun 16, 2015 19:01 |
|
Thalagyrt posted:Yeah, just run them as foreground workers. I have 4 services set up with runit for mine, and have delayed_job configured to write out pid files so I can just kill `cat delayed_job.*.pid` to restart all the workers. Here's my runfile for runit - you should be able to do the same with supervisor easily. My ~/shared/environment file just loads up all the env variables - RAILS_ENV and other config. What does the --identifier option do? I'm not finding any info on it. And where are the pid files at? They were showing up in my pids dir when I did "delayed_job start" but when I do "delayed_job run" they don't show up. e: ps aux helps finding all the processes. I still don't see the purpose of --identifier though KoRMaK fucked around with this message at 20:14 on Jun 16, 2015 |
# ? Jun 16, 2015 20:07 |
|
KoRMaK posted:I need to get better at the linux. It's explained right in the manual for delayed_job: https://github.com/collectiveidea/delayed_job/wiki/Delayed-job-command-details It adds the numeric identifier to the process name, so you'll see delayed_job.1, delayed_job.2, etc.
|
# ? Jun 16, 2015 20:13 |
|
Thalagyrt posted:It's explained right in the manual for delayed_job: https://github.com/collectiveidea/delayed_job/wiki/Delayed-job-command-details Thank you so much! e: Where should I see delayed_job.x? I'm looking in the system monitor and ps aux and not seeing anything but the console script I executed. KoRMaK fucked around with this message at 20:18 on Jun 16, 2015 |
# ? Jun 16, 2015 20:14 |
|
KoRMaK posted:Dammit, I need to browse the wiki pages instead of google searching and get more sleep. It will show up in the pidfile written out by delayed_job, which ends up in RAILS_ROOT/tmp/pids. Setting the identifier will cause the pidfile to be written out as delayed_job.identifier.pid instead of delayed_job.pid, which lets you keep track of each one individually. code:
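(The quoted snippet is gone, but the naming scheme can be shown with a throwaway demo - the pids and directory here are fake, purely to illustrate why per-identifier pid files make bulk restarts easy:)

```shell
# Fake pid files, just to show the naming scheme and the glob trick.
mkdir -p tmp/pids
echo 1111 > tmp/pids/delayed_job.1.pid
echo 2222 > tmp/pids/delayed_job.2.pid
cat tmp/pids/delayed_job.*.pid   # one glob matches every worker's pid
# A restart of all workers then boils down to:
#   kill `cat tmp/pids/delayed_job.*.pid`
```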
|
# ? Jun 16, 2015 20:37 |
|
Thalagyrt posted:It will show up in the pidfile written out by delayed_job, which ends up in RAILS_ROOT/tmp/pids. Setting the identifier will cause the pidfile to be written out as delayed_job.identifier.pid instead of delayed_job.pid, which lets you keep track of each one individually. Oh, I think I get what was wrong. Pids usually only get made when a process is daemonized, and using delayed_job run keeps it in the foreground, thus no pid files. Looks like they have an article on how to set up delayed job with monit https://github.com/collectiveidea/delayed_job/wiki/monitor-process KoRMaK fucked around with this message at 21:24 on Jun 16, 2015 |
# ? Jun 16, 2015 20:51 |
|
KoRMaK posted:I thought thats where it should be but I can't find it. I looked in my home directory/tmp/pids and the rails/tmp/pids directory and then did a search for everything with pid or delayed_job and I'm not finding any files. The pid files should be created whether or not it's daemonized. It's certainly created in my setup, and you can see the script I'm using to start it - it's not running daemonized. It won't be created if you start it with rake, though.
|
# ? Jun 16, 2015 21:27 |
|
Here's a dumb question: I have rvm installed as my user, but when I go into sudo mode and do rvm, it says it's not installed. As sudo, I source my user's bashrc file thinking that will fix it, but it doesn't. Do I need to install RVM as sudo/root?
|
# ? Jun 18, 2015 13:42 |
|
KoRMaK posted:Here's a dumb question: I have rvm installed as my user, but when I go into sudo mode and do rvm, it says it's not installed. As sudo, I source my user's bashrc file thinking that will fix it, but it doesn't. Why sudo at all?
|
# ? Jun 18, 2015 14:46 |
|
KoRMaK posted:Here's a dumb question: I have rvm installed as my user, but when I go into sudo mode and do rvm, it says it's not installed. As sudo, I source my user's bashrc file thinking that will fix it, but it doesn't. You don't want a web site running as root. Why are you using sudo?
|
# ? Jun 18, 2015 15:54 |
|
I'm trying to get monit to launch the delayed_job script with the right env stuff. I thought I had to install rvm as sudo because monit likes to run stuff as root. That was a couple hours ago, and I've come a long way since then. Here's what I found works. On my local VM: start program = "/bin/su - kormak -c 'cd /media/my_app; script/delayed_job start -i 1'" In production it changes to: start program = "/bin/su -c 'cd /media/my_app; script/delayed_job start -i 1'" Seems pretty obvious now, but I didn't understand what I was doing and dropped the -c, so it kept crashing. I then started using /bin/bash, but it didn't work at first either until I used the -l (--login) option. So either /bin/su above works, or this: start program = "/bin/bash -l -c 'cd /media/my_app; script/delayed_job start -i 1'" So the adventure is over! I did it! And I learned more about linux and the different shells and how to monitor processes and a little bit of sysadmining. KoRMaK fucked around with this message at 16:21 on Jun 18, 2015 |
# ? Jun 18, 2015 16:09 |
|
KoRMaK posted:I'm trying to get monit to launch the delayed_job script with the right env stuff. I thought I had to install rvm as sudo because monit likes to run stuff as sudo. Why are you trying to run delayed_job as root? Run it as the user that owns your web site, just like you do on your local VM.
|
# ? Jun 18, 2015 16:13 |
|
Thalagyrt posted:Why are you trying to run delayed_job as root? Run it as the user that owns your web site, just like you do on your local VM.
|
# ? Jun 18, 2015 16:33 |