Carthag Tuek
Oct 15, 2005

Times shall come,
times shall roll away,
generation shall follow generation



madkapitolist posted:

Ah yes I've actually used BS in a script before, great stuff (although my use of it was very very elementary/simple). I may have phrased my earlier question awkwardly. I am interested in scraping the contents of the actual webpage, not the url. For example I want to crawl an itunes page

http://itunes.apple.com/us/app/fluff-friends-rescue/id467407534

and scrape the string "Games" from "Category: Games" on the left. From a quick view of the source it looks like this information is contained within its own list class tag

for example:

<li class="genre"><span class="label">Category: </span><a href="http://itunes.apple.com/us/genre/ios-games/id6014?mt=8">Games</a></li>


I wouldn't mind just having it return this entire chunk of code for every URL I throw in.

Any suggestions or live examples of this being done?


Edit: Looks like this jquery in chrome console returns "games"
$('li.genre > a').html()

scrapy might do what you want: http://doc.scrapy.org/en/0.14/intro/tutorial.html

FWIW, the XPath to the string you want is //li[@class='genre']/a/text()
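
A minimal sketch of that XPath applied to the markup quoted above, using only Python's standard library rather than scrapy. Assumptions: `xml.etree.ElementTree` supports only a subset of XPath and has no `text()` step, so the sketch selects the `<a>` element and reads its `.text`; the `<ul>` wrapper is added here only so the fragment parses as one document. A real page is rarely well-formed XML, so this is an illustration of the path, not a scraper.

```python
import xml.etree.ElementTree as ET

# The markup quoted above, wrapped in a <ul> so it parses as one document.
SNIPPET = (
    '<ul>'
    '<li class="genre"><span class="label">Category: </span>'
    '<a href="http://itunes.apple.com/us/genre/ios-games/id6014?mt=8">Games</a></li>'
    '</ul>'
)

def genres(markup):
    """Return the link text of every <li class="genre"> entry."""
    root = ET.fromstring(markup)
    # ElementTree stand-in for //li[@class='genre']/a/text()
    return [a.text for a in root.findall(".//li[@class='genre']/a")]

print(genres(SNIPPET))  # ['Games']
```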


madkapitolist
Feb 5, 2006
This looks awesome thanks!

Contero
Mar 28, 2004

I'm having trouble wrapping my head around delimited continuations, or continuations period I guess.

What I'm trying to model is this: Imagine I have a checkers board with 3 robots positioned around the board. Each of these robots has a routine:

code:
routine:
issue-command(move forward 3 spaces);
issue-command(turn left);
issue-command(move forward 2 spaces);
What I want to do is run each robot's routine up until a single issue-command, then have control returned back to a function that manages each robot. This managing function should run the routine of each robot until it runs into an issue-command statement, then move on to the next robot until all the robots have finished their routines. Each robot should move forward 3 spaces, then all 3 turn left, etc.

In a way this is like modeling each robot as a "thread" where it sends a command back to the controlling thread and waits for a signal to continue. Only I'd like to use (delimited) continuations instead. I imagine I'll be storing a continuation for each robot, but aside from that I'm lost on how to model this.

I just can't wrap my head around where I'd be putting resets, shifts and who is passing continuations to whom.

Edit: Disregard this, what I was looking for was how to implement coroutines using continuations.

Contero fucked around with this message at 18:53 on Sep 13, 2012

plasticbugs
Dec 13, 2006

Special Batman and Robin
I know a little C and a little Ruby and some Rails. I've built some scripts on my local machine that run in the background and interact with the Twitter API. One script posts automatic replies, one tweets the time on the hour, things like that. Nothing terribly spammy, just gimmicky.

Anyway, my question:
Because these scripts run on my local machine, when my machine powers down, the scripts cease working. How do people have it set up so that something like a Ruby script runs in the background on a remote, always-on server, intercepting Tweets, monitoring updates to Amazon's product catalog to tweet new releases, etc? I'm totally clueless as to what steps to take.

Doctor w-rw-rw-
Jun 24, 2008

plasticbugs posted:

I know a little C and a little Ruby and some Rails. I've built some scripts on my local machine that run in the background and interact with the Twitter API. One script posts automatic replies, one tweets the time on the hour, things like that. Nothing terribly spammy, just gimmicky.

Anyway, my question:
Because these scripts run on my local machine, when my machine powers down, the scripts cease working. How do people have it set up so that something like a Ruby script runs in the background on a remote, always-on server, intercepting Tweets, monitoring updates to Amazon's product catalog to tweet new releases, etc? I'm totally clueless as to what steps to take.
Other people will no doubt be better able to provide deeper answers but here's a quick overview:

If you want to go cheap and learn things, get a virtual private server: http://prgmr.com/xen/
Or use EC2, if that's your thing.

You need to be able to run the scripts in the background. So you probably want to make sure you're using logging, not just printing to stdout/stderr, and properly daemonize the process.
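
A rough sketch of the logging half of that advice, written in Python to match the other code in this thread (the scripts in question are Ruby, where the standard `Logger` class plays the same role). The logger name `bot` and the size limits are made-up values for illustration.

```python
import logging
import logging.handlers

def make_logger(log_path):
    """Build a logger that appends to a rotating file instead of stdout."""
    logger = logging.getLogger("bot")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    # Rotate at roughly 1 MB and keep 3 old files (illustrative limits).
    handler = logging.handlers.RotatingFileHandler(
        log_path, maxBytes=1_000_000, backupCount=3
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Calling `make_logger("bot.log").info("tweeted the time")` appends one timestamped line to `bot.log`, which is still readable after the terminal that started the script has gone away.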

madkapitolist
Feb 5, 2006

Carthag posted:

scrapy might do what you want: http://doc.scrapy.org/en/0.14/intro/tutorial.html

FWIW, the XPath to the string you want is //li[@class='genre']/a/text()

So this is my final spider to try to get it to return "games" but it says there is a syntax error. Any idea?

error:
genres = hxs.select('//li[@class='genre']/a/text()')
^
SyntaxError: invalid syntax


Code:

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector

from tutorial.items import DmozItem

class DmozSpider(BaseSpider):
name = "dmoz"
allowed_domains = ["itunes.apple.com"]
start_urls = [
"http://itunes.apple.com/us/app/fluff-friends-rescue/id467407534?mt=8"
]

def parse(self, response):
hxs = HtmlXPathSelector(response)
genres = hxs.select('//li[@class='genre']/a/text()')
items = []
for genre in genres:
item = DmozItem()
item['title'] = genre.select('a/text()').extract()
items.append(item)
return items

Cat Plus Plus
Apr 8, 2011

:frogc00l:

Use [code] tags to post code, especially in a layout-sensitive language like Python.

Anyway, the problem is that you use single quotes within the string, so it ends prematurely. Use hxs.select("//li[@class='genre']/a/text()") instead.
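
For illustration, two ways to write the same string in plain Python; both are just string literals and nothing here is scrapy-specific.

```python
# Double quotes outside, single quotes inside: the form suggested above.
xpath_double = "//li[@class='genre']/a/text()"

# Single quotes outside also work if the inner quotes are escaped.
xpath_escaped = '//li[@class=\'genre\']/a/text()'

print(xpath_double == xpath_escaped)  # True
```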

madkapitolist
Feb 5, 2006
Thank you, will do. No more error but it dumps this JSON file now instead of "games"?

[{"title": []}][{"title": []}]

madkapitolist fucked around with this message at 19:26 on Sep 13, 2012

plasticbugs
Dec 13, 2006

Special Batman and Robin

Doctor w-rw-rw- posted:

Other people will no doubt be better able to provide deeper answers but here's a quick overview:

If you want to go cheap and learn things, get a virtual private server: http://prgmr.com/xen/
Or use EC2, if that's your thing.

You need to be able to run the scripts in the background. So you probably want to make sure you're using logging, not just printing to stdout/stderr, and properly daemonize the process.

Thanks for this jumping off point. I'll have to start getting my hands dirty with figuring out proper logging and daemon-creating - Unixy things I haven't ever touched yet (no pun intended).

Carthag Tuek
Oct 15, 2005

Times shall come,
times shall roll away,
generation shall follow generation



madkapitolist posted:

So this is my final spider to try to get it to return "games" but it says there is a syntax error. Any idea?

error:
genres = hxs.select('//li[@class='genre']/a/text()')
^
SyntaxError: invalid syntax


Python code:
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector

from tutorial.items import DmozItem

class DmozSpider(BaseSpider):
   name = "dmoz"
   allowed_domains = ["itunes.apple.com"]
   start_urls = [
       "http://itunes.apple.com/us/app/fluff-friends-rescue/id467407534?mt=8"
   ]

   def parse(self, response):
       hxs = HtmlXPathSelector(response)
       genres = hxs.select('//li[@class='genre']/a/text()')
       items = []
       for genre in genres:
           item = DmozItem()
           item['title'] = genre.select('a/text()').extract()
           items.append(item)
       return items

madkapitolist posted:

No more error but it dumps this JSON file now instead of "games"?

[{"title": []}][{"title": []}]

genre.select('a/text()').extract() won't return anything useful. You already have the list of genres from the first hxs.select() call, and what you're trying to do is take the <a> link from the string "Games" (or whatever genres are in the genres list), which of course doesn't exist, and put that into the title field of the DmozItem from the tutorial, resulting in items with empty titles.

Try genre.extract() instead.
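
Here is a sketch of the same mistake using the standard library's `xml.etree.ElementTree` as a stand-in for scrapy (an analogy only: in the spider the first query returns text nodes, here it returns `<a>` elements, but in both cases a second relative `a` query has nothing to match). The `href="#"` is a placeholder.

```python
import xml.etree.ElementTree as ET

PAGE = ET.fromstring(
    '<ul><li class="genre"><span class="label">Category: </span>'
    '<a href="#">Games</a></li></ul>'
)

def titles_wrong(page):
    # Mirrors the spider: select the links, then look for another <a> inside each.
    return [[inner.text for inner in link.findall("a")]
            for link in page.findall(".//li[@class='genre']/a")]

def titles_right(page):
    # Use what the first query already returned.
    return [link.text for link in page.findall(".//li[@class='genre']/a")]

print(titles_wrong(PAGE))  # [[]]
print(titles_right(PAGE))  # ['Games']
```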

Contero
Mar 28, 2004

Contero posted:

delimited continuations

If anyone cares (unlikely), I actually figured out how to do this, although my code is a bit ugly since I haven't touched racket in two years:

code:
#lang racket
(require racket/control)

(define (makeRobot robotID)
  (lambda (escape1)
    (let ((escape2 
           (shift k (begin 
                      (printf "Moving robot ~a forward~n" robotID)
                      (escape1 k)))))
      (let ((escape3
             (shift k (begin
                        (printf "Turning robot ~a left~n" robotID)
                        (escape2 k)))))
        (begin
          (printf "Moving robot ~a forward 2~n" robotID)
          (escape3 (void)))))))

(define (robotsCanMove robots)
  (foldl (lambda (x y) (or x y)) false 
         (map (lambda (x) (not (equal? (void) x))) robots)))

(define (moveOnce robot)
  (reset (shift k (robot k))))

(define (controller robots)
  (if (robotsCanMove robots)
      (controller (map moveOnce robots))
      'done))

(controller (list (makeRobot 1) (makeRobot 2) (makeRobot 3)))
Which outputs:

code:
Moving robot 1 forward
Moving robot 2 forward
Moving robot 3 forward
Turning robot 1 left
Turning robot 2 left
Turning robot 3 left
Moving robot 1 forward 2
Moving robot 2 forward 2
Moving robot 3 forward 2
'done
> 
The robot routine is pretty nasty looking though. There's definitely a way to seamlessly chain the escape continuations, but I have no idea how to do it. At this point I'm satisfied with just getting it working.
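
For comparison, a sketch of the same round-robin behaviour using Python generators as coroutines. This is a different technique from the shift/reset code above, not a translation of it: each `yield` plays the part of issue-command, and the controller resumes each robot in turn until every routine has finished.

```python
def robot(robot_id):
    """One robot's routine; each yield hands a command back to the controller."""
    yield f"Moving robot {robot_id} forward"
    yield f"Turning robot {robot_id} left"
    yield f"Moving robot {robot_id} forward 2"

def controller(robots):
    """Advance every robot by one command per round until all are finished."""
    log = []
    active = list(robots)
    while active:
        still_running = []
        for r in active:
            try:
                log.append(next(r))      # run the robot up to its next command
                still_running.append(r)
            except StopIteration:
                pass                     # this robot's routine is done
        active = still_running
    return log

for line in controller([robot(1), robot(2), robot(3)]):
    print(line)
```

The printed lines come out in the same order as the Racket output above, minus the final 'done.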

Prof_Beatnuts
Jul 29, 2004
I used to be bad but now I'm good
I'm in the planning stages of developing a web-based application that will allow users to budget and plan projects for their company, with multiple features (multiple plans for the same project, comparing project plans with actual info from projects as they proceed, etc). I'm trying to decide what programming language to use. I'm looking at Java/JavaScript, Python, PHP, HTML.

I'm most familiar with python and java.

I'm wondering if anyone can tell me what they would use or recommend something for me.

Thanks

FooGoo
Oct 21, 2008
Ummmm does Excel belong here? I'm not even sure if I need VBA to do what I want, maybe it could be accomplished with a formula, but I have no idea how.

In column A I have a list of weekly dates. In column C I have a list of daily dates, with an associated number in column D for each day. I want to fill column B with the number from D that matches the date in A.

Does that make sense? Basically grab the number from D from the date that's listed in A and put it in B. I have a feeling it's pretty simple and can probably be done with a formula, but I have no idea how. Google made it more confusing.

Thanks.

Doctor w-rw-rw-
Jun 24, 2008

Prof_Beatnuts posted:

I'm in the planning stages of developing a web-based application that will allow users to budget and plan projects for their company, with multiple features (multiple plans for the same project, comparing project plans with actual info from projects as they proceed, etc). I'm trying to decide what programming language to use. I'm looking at Java/JavaScript, Python, PHP, HTML.

I'm most familiar with python and java.

I'm wondering if anyone can tell me what they would use or recommend something for me.

Thanks

Take a look at Pyramid. It's the other Python choice (not Django).

qntm
Jun 17, 2009
Two people have now asked me idle questions about HTML sanitisation and other fairly elementary problems relating to web sites that they just took responsibility for.

Is there a general resource, "Idiotic Mistakes That The Last Person Probably Made And How To Fix Them", which an incoming web administrator can consult? A definitive list of advice?

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



FooGoo posted:

Ummmm does Excel belong here? I'm not even sure if I need VBA to do what I want, maybe it could be accomplished with a formula, but I have no idea how.

In column A I have a list of weekly dates. In column C I have a list of daily dates, with an associated number in column D for each day. I want to fill column B with the number from D that matches the date in A.

Does that make sense? Basically grab the number from D from the date that's listed in A and put it in B. I have a feeling it's pretty simple and can probably be done with a formula, but I have no idea how. Google made it more confusing.

Thanks.

There's an Excel megathread: http://forums.somethingawful.com/showthread.php?threadid=3132163

Thing is, it's not really clear what you want as far as what number corresponds to what day, so you should probably give some examples if you post in there or more in here.

Entropic
Feb 21, 2007

patriarchy sucks
bash / general Linux shell question:

I just ran into a weird problem: I have a ridiculous number of small .jpg files in one flat directory, the result of a webcam being poorly set up (not by me!) and going unnoticed for far too long. At least half a million files at a guess, enough that finding out exactly how many of them there are is non-trivial.

How do I delete them?

rm -f *.jpg is not going to work because it will just expand the wildcard expression and try to pass an ungodly long string of filenames as an argument.

I'm woefully unfamiliar with xargs, but my understanding is that if I try piping find into rm with it, I'm just going to spawn an absurd number of processes.

Would a bash script that looks something like
code:
#!/bin/bash
for i in $(echo *.jpg); do rm -f $i; done
do the trick?

Or is it going to try expanding *.jpg before starting to do any deletion?

What's an elegant way of iterating through half a million files to delete them one at a time without first generating a huge list of all the files?

JawnV6
Jul 4, 2004

So hot ...
Delete the directory.

Entropic
Feb 21, 2007

patriarchy sucks

JawnV6 posted:

Delete the directory.

what if there's sub-directories I care about?

Cat Plus Plus
Apr 8, 2011

:frogc00l:
code:
find . -maxdepth 1 -name '*.jpg' -delete
Doing $(echo *.jpg) still expands the glob first (and then echoes it).
You can run the above with -print instead of -delete first to see which files will actually be deleted.
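
The find command is the right tool here; purely to illustrate the "one at a time, without building a huge list" idea from the question, a sketch of the same thing in Python. `os.scandir` yields directory entries lazily. Assumptions: `delete_jpgs` and its `dry_run` flag are names made up for this sketch, and unlike find's -name test it only touches regular files.

```python
import os

def delete_jpgs(directory, dry_run=True):
    """Delete *.jpg files directly inside `directory`; return how many matched."""
    matched = 0
    with os.scandir(directory) as entries:   # lazy iterator, not a full listing
        for entry in entries:
            # Skip subdirectories and symlinks; only the top level is scanned.
            if entry.is_file(follow_symlinks=False) and entry.name.endswith(".jpg"):
                matched += 1
                if not dry_run:
                    os.unlink(entry.path)
    return matched
```

Running it once with the default `dry_run=True` plays the same role as swapping -delete for -print.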

pseudorandom name
May 6, 2007

ulimit -s unlimited && rm *.jpg

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



Entropic posted:

what if there's sub-directories I care about?

Move them out first. Unless there are tens of thousands of them somehow.

ToxicFrog
Apr 26, 2008


Munkeymon posted:

Move them out first. Unless there are tens of thousands of them somehow.

PiotrLegnica's find command is probably the best way to handle this, though.

Prof_Beatnuts
Jul 29, 2004
I used to be bad but now I'm good

Doctor w-rw-rw- posted:

Take a look at Pyramid. It's the other Python choice (not Django).

Thank you so much for the suggestion. Pyramid looks like a great framework to use for my project.

Civil Twilight
Apr 2, 2011

Entropic posted:

I'm woefully unfamiliar with xargs, but my understanding is that if I try piping find into rm with it, I'm just going to spawn an absurd number of processes.

If your find has -delete, -delete would be the best option, but for the record, xargs will feed as many files as it can into each execution of the specified command. Unless your filenames are stupidly long, it won't be doing 'rm' once for each file.

Example:

code:
% find . |wc -l
114444
114k files in this directory; let's echo their filenames using xargs.
code:
% find . -print0|xargs -0 echo |wc -l
79
echo got called 79 times.
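
A toy model of that batching, in case it helps to see why the count is 79 rather than 114k. This is not how xargs is implemented; the 4096-byte budget and the generated filenames are invented for the sketch, and the real limit depends on the system.

```python
def batch_args(names, max_bytes):
    """Group names into batches whose combined length stays within max_bytes."""
    batch, size = [], 0
    for name in names:
        cost = len(name.encode()) + 1  # +1 for the separator between arguments
        if batch and size + cost > max_bytes:
            yield batch                # one command invocation's worth
            batch, size = [], 0
        batch.append(name)
        size += cost
    if batch:
        yield batch

names = [f"img{i:06d}.jpg" for i in range(1000)]
batches = list(batch_args(names, max_bytes=4096))
print(len(names), "files ->", len(batches), "invocations")
# 1000 files -> 4 invocations
```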

Civil Twilight fucked around with this message at 19:22 on Sep 14, 2012

an skeleton
Apr 23, 2012

scowls @ u
So this is a question on one of my homeworks:

quote:

8. How can the ASCII table be used as an addressing mechanism?

Maybe my brain is farting but I have no idea what it is asking. Anyone want to help refresh my memory...?

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
That doesn't make sense to me either.

Carthag Tuek
Oct 15, 2005

Times shall come,
times shall roll away,
generation shall follow generation



an skeleton posted:

So this is a question on one of my homeworks:


Maybe my brain is farting but I have no idea what it is asking. Anyone want to help refresh my memory...?

What class is it?

I mean I guess you could make a tree structure and have the nodes be keyed on ascii values so a string "abc" would traverse down from root through a->b->c but that's dumb.
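
A bare-bones sketch of what that would look like, with nested dicts keyed on each character's ASCII value. The function names and the "end" marker are invented for the example.

```python
def insert(root, word):
    """Walk down from root, creating one child per character."""
    node = root
    for ch in word:
        node = node.setdefault(ord(ch), {})  # child keyed on the ASCII value
    node["end"] = True                       # marks a complete word

def contains(root, word):
    """Follow the ASCII values of word down the tree."""
    node = root
    for ch in word:
        node = node.get(ord(ch))
        if node is None:
            return False
    return node.get("end", False)

root = {}
insert(root, "abc")
print(contains(root, "abc"), contains(root, "ab"))  # True False
```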

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
It could be talking about some form of trie (like a patricia trie)

LP0 ON FIRE
Jan 25, 2006

beep boop
Anyone know of a good site explaining how to play a YouTube video in your own 100% custom player? I remember seeing something where you could use a hosted YouTube video and feed it into a Flash player you made yourself. Everything I look up relates to changing certain color parameters and which buttons are visible, which I don't want. Also, is there a way to get rid of the YouTube logo in the bottom right? I'm not trying to have it look that way on YouTube, but embedded on my own site. HTML5 would be okay too.

raminasi
Jan 25, 2005

a last drink with no ice

an skeleton posted:

So this is a question on one of my homeworks:


Maybe my brain is farting but I have no idea what it is asking. Anyone want to help refresh my memory...?

I'm betting that it's just asking you to establish that you understand that the alphabet is contiguous in ASCII and that this can be a useful fact.
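
If that reading is right, the "addressing" is just using a character's code as an array index. A small sketch, with `letter_counts` as a made-up example function:

```python
def letter_counts(text):
    """Count letters using each letter's ASCII offset from 'a' as its index."""
    counts = [0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            # 'a'..'z' are contiguous in ASCII, so this lands in 0..25.
            counts[ord(ch) - ord("a")] += 1
    return counts

counts = letter_counts("Abba")
print(counts[0], counts[1])  # 2 2
```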

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

LP0 ON FIRE posted:

Anyone know of a good site explaining how to play a YouTube video in your own 100% custom player? I remember seeing something where you could use a hosted YouTube video and feed it into a Flash player you made yourself. Everything I look up relates to changing certain color parameters and which buttons are visible, which I don't want. Also, is there a way to get rid of the YouTube logo in the bottom right? I'm not trying to have it look that way on YouTube, but embedded on my own site. HTML5 would be okay too.

Don't use YouTube, then. That's against their terms of service.

What I will say is that you should probably just use YouTube instead of the lovely Flash player you will eventually replace it with. Why on earth do you want to stop it from looking like it's on YouTube?

You can remove the logo with the modestbranding option. See https://developers.google.com/youtube/player_parameters
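
A small sketch of building an embed URL with that option set. `embed_url` is a helper invented here and `VIDEO_ID` is a placeholder; check the linked page for whether the parameter still behaves this way.

```python
from urllib.parse import urlencode

def embed_url(video_id, **params):
    """Build a YouTube embed URL with optional player parameters."""
    query = urlencode(params)
    base = f"https://www.youtube.com/embed/{video_id}"
    return f"{base}?{query}" if query else base

print(embed_url("VIDEO_ID", modestbranding=1))
# https://www.youtube.com/embed/VIDEO_ID?modestbranding=1
```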

LP0 ON FIRE
Jan 25, 2006

beep boop

Suspicious Dish posted:

Don't use YouTube, then. That's against their terms of service.

What I will say is that you should probably just use YouTube instead of the lovely Flash player you will eventually replace it with. Why on earth do you want to stop it from looking like it's on YouTube?

You can remove the logo with the modestbranding option. See https://developers.google.com/youtube/player_parameters

I guess it makes sense it's against their terms of service. The client wants to be able to have the videos hosted on YouTube and click a button to still go to the YouTube site for the video, but just have an entirely different skin that doesn't look like the YouTube player. Thanks for the logo removal.

Uziel
Jun 28, 2004

Ask me about losing 200lbs, and becoming the Viking God of W&W.
On a related note, is there a good solution for embedding videos on an intranet site where YouTube (or any external site) wouldn't be an option? We use Chromeframe so we only have to worry about single browser compatibility.

Civil Twilight
Apr 2, 2011

Uziel posted:

On a related note, is there a good solution for embedding videos on an intranet site where YouTube (or any external site) wouldn't be an option? We use Chromeframe so we only have to worry about single browser compatibility.

JW Player seems to be the standard choice for a free embeddable flv player.

Uziel
Jun 28, 2004

Ask me about losing 200lbs, and becoming the Viking God of W&W.

Civil Twilight posted:

JW Player seems to be the standard choice for a free embeddable flv player.
Thanks, I'll check that out.

FooGoo
Oct 21, 2008

Munkeymon posted:

There's an Excel megathread: http://forums.somethingawful.com/showthread.php?threadid=3132163

Thing is, it's not really clear what you want as far as what number corresponds to what day, so you should probably give some examples if you post in there or more in here.

Thanks, didn't see that thread. I'll post over there with a screenshot.

LuciferMorningstar
Aug 12, 2012

VIDEO GAME MODIFICATION IS TOTALLY THE SAME THING AS A FEMALE'S BODY AND CLONING SAID MODIFICATION IS EXACTLY THE SAME AS RAPE, GUYS!!!!!!!
I've started learning about recursion, and it generally makes sense to me, but I've been asked to generate a "hypothetical but complete list of base cases that will allow the recurrence to always terminate" for the given relation:

code:
c(n,k) = c(n,k-1) + c(n-1,k)
Also given that n > k.

Since there are no other conditions, can I just arbitrarily define base cases to cause the recurrence to stop wherever I want it to? My impulse is to say the base case is k = 0 and n is whatever, seeing as how it's arbitrarily greater than k.

Am I being stupid? I feel like I am.

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


Try computing c(2, 2) and see where you need to put in base cases.
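
One way to try that by machine, as a sketch. The base cases here (stop when either argument reaches zero, return 1) are an arbitrary choice made for this example, not something given in the problem; the point is that c(2, 2) reaches calls where n hits 0 as well as calls where k hits 0.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def c(n, k):
    # Hypothetical base cases: stop when either argument reaches zero.
    if k == 0 or n == 0:
        return 1
    # The recurrence from the question.
    return c(n, k - 1) + c(n - 1, k)

print(c(2, 2))  # 6
```

Dropping the `n == 0` half of the base case makes c(2, 2) recurse forever through c(1, 2), c(0, 2), c(-1, 2) and so on, which is what the hand computation is meant to expose.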


Quebec Bagnet
Apr 28, 2009

mess with the honk
you get the bonk
Lipstick Apathy

Uziel posted:

Thanks, I'll check that out.

MediaElement.js is pretty useful too in case you do need to expand beyond a single browser.
