|
madkapitolist posted:Ah yes I've actually used BS in a script before, great stuff (although my use of it was very, very elementary/simple). I may have phrased my earlier question awkwardly. I am interested in scraping the contents of the actual webpage, not the URL. For example I want to crawl an iTunes page
scrapy might do what you want: http://doc.scrapy.org/en/0.14/intro/tutorial.html FWIW, the XPath to the string you want is //li[@class='genre']/a/text()
|
# ? Sep 12, 2012 18:49 |
|
This looks awesome thanks!
|
# ? Sep 12, 2012 19:00 |
|
I'm having trouble wrapping my head around delimited continuations, or continuations period I guess. What I'm trying to model is this: Imagine I have a checkers board with 3 robots positioned around the board. Each of these robots has a routine: code:
In a way this is like modeling each robot as a "thread" where it sends a command back to the controlling thread and waits for a signal to continue. Only I'd like to use (delimited) continuations instead. I imagine I'll be storing a continuation for each robot, but aside from that I'm lost on how to model this. I just can't wrap my head around where I'd be putting resets and shifts, and who is passing continuations to whom. Edit: Disregard this, what I was looking for was how to implement coroutines using continuations.
Contero fucked around with this message at 18:53 on Sep 13, 2012 |
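The robot routine itself did not survive in the post, so the following is only a rough sketch of the coroutine idea from the edit, written in Python rather than Racket. Python has no first-class continuations; generators stand in for them here. The `robot` and `run` names and the move lists are made up for illustration.

```python
def robot(name, moves):
    """A robot routine: hand one command to the controller, then pause."""
    for move in moves:
        # Execution stops here until the controller resumes this robot.
        yield (name, move)


def run(robots):
    """Round-robin controller: resume each robot in turn until all finish."""
    log = []
    active = list(robots)
    while active:
        for r in list(active):  # iterate over a copy so removal is safe
            try:
                log.append(next(r))
            except StopIteration:
                active.remove(r)
    return log


if __name__ == "__main__":
    commands = run([
        robot("r1", ["forward", "left"]),
        robot("r2", ["back"]),
        robot("r3", ["right", "forward"]),
    ])
    for name, move in commands:
        print(name, move)
```

Each generator keeps its own position between calls, which is the "stored continuation per robot" the post describes; the controller only decides who gets resumed next.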
# ? Sep 12, 2012 21:14 |
|
I know a little C and a little Ruby and some Rails. I've built some scripts on my local machine that run in the background and interact with the Twitter API. One script posts automatic replies, one tweets the time on the hour, things like that. Nothing terribly spammy, just gimmicky. Anyway, my question: Because these scripts run on my local machine, when my machine powers down, the scripts cease working. How do people have it set up so that something like a Ruby script runs in the background on a remote, always-on server, intercepting Tweets, monitoring updates to Amazon's product catalog to tweet new releases, etc? I'm totally clueless as to what steps to take.
|
# ? Sep 13, 2012 17:55 |
|
plasticbugs posted:I know a little C and a little Ruby and some Rails. I've built some scripts on my local machine that run in the background and interact with the Twitter API. One script posts automatic replies, one tweets the time on the hour, things like that. Nothing terribly spammy, just gimmicky.
If you want to go cheap and learn things, get a virtual private server: http://prgmr.com/xen/ Or use EC2, if that's your thing. You need to be able to run the scripts in the background. So you probably want to make sure you're using logging, not just printing to stdout/stderr, and properly daemonize the process.
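As a small illustration of the "use logging, not stdout" advice, here is a sketch in Python (the scripts in question are Ruby; Python is used only to show the idea). The logger name, file path, and size limits are assumptions, and daemonizing the process is a separate step not shown here.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_logger(path, name="twitterbot"):
    """Return a logger that appends to `path` and rotates the file when it grows."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        handler = RotatingFileHandler(path, maxBytes=1_000_000, backupCount=3)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = make_logger("bot.log")  # hypothetical log file name
    log.info("posted hourly tweet")
```

Writing to a file means the messages are still there after the terminal that started the script is gone, which is the situation a background process on a remote server is always in.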
|
# ? Sep 13, 2012 18:07 |
|
Carthag posted:scrapy might do what you want: http://doc.scrapy.org/en/0.14/intro/tutorial.html
So this is my final spider to try to get it to return "games" but it says there is a syntax error. Any idea?
error:
    genres = hxs.select('//li[@class='genre']/a/text()')
                                      ^
SyntaxError: invalid syntax
Code:
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector

from tutorial.items import DmozItem

class DmozSpider(BaseSpider):
    name = "dmoz"
    allowed_domains = ["itunes.apple.com"]
    start_urls = [
        "http://itunes.apple.com/us/app/fluff-friends-rescue/id467407534?mt=8"
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        genres = hxs.select('//li[@class='genre']/a/text()')
        items = []
        for genre in genres:
            item = DmozItem()
            item['title'] = genre.select('a/text()').extract()
            items.append(item)
        return items
|
# ? Sep 13, 2012 19:09 |
|
Use [code] tags to post code, especially in a whitespace-sensitive language like Python. Anyway, the problem is that you use single quotes within a single-quoted string, so it ends prematurely. Use hxs.select("//li[@class='genre']/a/text()") instead.
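To make the quoting point concrete, here are three ways to write that XPath as a valid Python string literal. The variable names are only for illustration.

```python
# Double quotes outside, single quotes inside (the fix suggested above).
xpath_a = "//li[@class='genre']/a/text()"

# Single quotes outside, with the inner single quotes escaped.
xpath_b = '//li[@class=\'genre\']/a/text()'

# Single quotes outside, double quotes inside; XPath accepts either quote style.
xpath_c = '//li[@class="genre"]/a/text()'

# The first two are the exact same string.
print(xpath_a == xpath_b)
```

The first form is usually the easiest to read, since nothing needs escaping.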
|
# ? Sep 13, 2012 19:12 |
|
Thank you, will do. No more error but it dumps this JSON file now instead of "games"?
[{"title": []}][{"title": []}]
madkapitolist fucked around with this message at 19:26 on Sep 13, 2012 |
# ? Sep 13, 2012 19:24 |
|
Doctor w-rw-rw- posted:Other people will no doubt be better able to provide deeper answers but here's a quick overview:
Thanks for this jumping off point. I'll have to start getting my hands dirty with figuring out proper logging and daemon-creating - Unixy things I haven't ever touched yet (no pun intended).
|
# ? Sep 13, 2012 19:31 |
|
madkapitolist posted:So this is my final spider to try to get it to return "games" but it says there is a syntax error. Any idea?
madkapitolist posted:No more error but it dumps this JSON file now instead of "games"?
genre.select('a/text()').extract() won't return anything useful. You already have the list of genres from the first hxs.select() call, and what you're trying to do is take the <a> link from the string "Games" (or whatever genres are in the genres list), which of course doesn't exist, and put that into the title field of the DmozItem from the tutorial, resulting in items with empty titles. Try genre.extract() instead.
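The same mistake can be reproduced without scrapy. The sketch below uses the standard library's ElementTree and a made-up, heavily simplified stand-in for the page markup, so it only illustrates the logic; it is not the scrapy API.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the relevant part of the page.
PAGE = """
<html><body><ul>
  <li class="genre"><a href="#">Games</a></li>
  <li class="copyright">Example Corp</li>
</ul></body></html>
"""

root = ET.fromstring(PAGE)

# Roughly the first select(): the <a> elements inside li.genre.
links = root.findall(".//li[@class='genre']/a")

# The mistake: searching for another <a> underneath what was already selected.
nested = [link.findall("a") for link in links]

# The fix: read the value from what was already selected.
genres = [link.text for link in links]

print(nested)
print(genres)
```

The nested search comes back empty for every match, which is the same reason the spider produced items with empty titles.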
|
# ? Sep 13, 2012 21:24 |
|
Contero posted:delimited continuations
If anyone cares (unlikely), I actually figured out how to do this, although my code is a bit ugly since I haven't touched Racket in two years: code:
code:
|
# ? Sep 13, 2012 23:35 |
|
I'm in the planning stages of developing a web-based application that will allow users to budget and plan projects for their company, with multiple features (multiple plans for the same project, comparing project plans with actual info from projects as they proceed, etc). I'm trying to decide what programming language to use. I'm looking at Java/JavaScript, Python, PHP, HTML. I'm most familiar with Python and Java. I'm wondering if anyone can tell me what they would use or recommend something for me. Thanks
|
# ? Sep 14, 2012 05:02 |
|
Ummmm does Excel belong here? I'm not even sure if I need VBA to do what I want, maybe it could be accomplished with a formula, but I have no idea how. In column A I have a list of weekly dates. In column C I have a list of daily dates, with an associated number in column D for each day. I want to fill column B with the number from D that matches the date in A. Does that make sense? Basically grab the number from D for the date that's listed in A and put it in B. I have a feeling it's pretty simple and can probably be done with a formula, but I have no idea how. Google made it more confusing. Thanks.
|
# ? Sep 14, 2012 05:17 |
|
Prof_Beatnuts posted:I'm in the planning stages of developing a web-based application that will allow the users to budget and plan projects for their company with multiple features (multiple plans for the same project, compare project plans with actual info from projects as they proceed, etc). I'm trying to decide what programming language to use. I'm looking at java/javascript, python, php, HTML.
Take a look at Pyramid. It's the other Python choice (not Django).
|
# ? Sep 14, 2012 05:38 |
|
Two people have now asked me idle questions about HTML sanitisation and other fairly elementary problems relating to web sites that they just took responsibility for. Is there a general resource, "Idiotic Mistakes That The Last Person Probably Made And How To Fix Them", which an incoming web administrator can consult? A definitive list of advice?
|
# ? Sep 14, 2012 12:21 |
|
FooGoo posted:Ummmm does Excel belong here? I'm not even sure if I need VBA to do what I want, maybe it could be accomplished with a formula, but I have no idea how.
There's an Excel megathread: http://forums.somethingawful.com/showthread.php?threadid=3132163 Thing is, it's not really clear what you want as far as what number corresponds to what day, so you should probably give some examples if you post in there or more in here.
|
# ? Sep 14, 2012 14:33 |
|
bash / general Linux shell question: I just ran into a weird problem: I have a ridiculous number of small .jpg files in one flat directory, the result of a webcam being poorly set up (not by me!) and going unnoticed for far too long. At least half a million files at a guess, enough that finding out exactly how many of them there are is non-trivial. How do I delete them? rm -f *.jpg is not going to work because it will just expand the wildcard expression and try to pass an ungodly long string of filenames as an argument. I'm woefully unfamiliar with xargs but my understanding is that if I try piping find into rm with it I'm just going to spawn an absurd number of processes. Would a bash script that looks something like this work? code:
Or is it going to try expanding *.jpg before starting to do any deletion? What's an elegant way of iterating through half a million files to delete them one at a time without first generating a huge list of all the files?
|
# ? Sep 14, 2012 14:47 |
|
Delete the directory.
|
# ? Sep 14, 2012 15:05 |
|
JawnV6 posted:Delete the directory.
what if there's sub-directories I care about?
|
# ? Sep 14, 2012 15:06 |
|
code:
You can run above with -print instead of -delete first to see which files will actually be deleted.
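For comparison, the same "print first, then delete" workflow can be sketched in Python. This is not the find command from the post above, just an assumed equivalent; the function name and the `dry_run` flag are made up. `os.scandir` walks the directory as an iterator, so no giant list of filenames is built up front.

```python
import os


def delete_matching(directory, suffix=".jpg", dry_run=True):
    """Delete regular files directly inside `directory` whose names end in `suffix`.

    Subdirectories and their contents are left alone. With dry_run=True the
    matching paths are only printed. Returns the number of matches.
    """
    count = 0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False) and entry.name.endswith(suffix):
                if dry_run:
                    print(entry.path)
                else:
                    os.unlink(entry.path)
                count += 1
    return count


if __name__ == "__main__":
    # Hypothetical path; run once with dry_run=True before switching it off.
    print(delete_matching("/path/to/webcam", dry_run=True))
```

If files are being added or removed while the scan runs, some may be missed, so running it a second time is a cheap safety net.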
|
# ? Sep 14, 2012 15:14 |
|
ulimit -s unlimited && rm *.jpg
|
# ? Sep 14, 2012 15:32 |
|
Entropic posted:what if there's sub-directories I care about?
Move them out first. Unless there are tens of thousands of them somehow.
|
# ? Sep 14, 2012 15:38 |
|
Munkeymon posted:Move them out first. Unless there are tens of thousands of them somehow.
PiotrLegnica's find command is probably the best way to handle this, though.
|
# ? Sep 14, 2012 15:59 |
|
Doctor w-rw-rw- posted:Take a look at Pyramid. It's the other Python choice (not Django).
Thank you so much for the suggestion. Pyramid looks like a great framework to use for my project.
|
# ? Sep 14, 2012 18:58 |
|
Entropic posted:I'm woefully unfamiliar with xargs but my understanding is that if I try piping find into rm with it I'm just going to spawn an absurd number of processes.
If your find has -delete, -delete would be the best option, but for the record, xargs will feed as many files as it can into each execution of the specified command. Unless your filenames are stupidly long, it won't be doing 'rm' once for each file. Example: code:
code:
Civil Twilight fucked around with this message at 19:22 on Sep 14, 2012 |
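To get a feel for why xargs ends up running only a handful of commands, here is a toy model of its batching in Python. It is not how xargs is implemented; the 128 KiB limit and the filename pattern are assumptions chosen for illustration, and the real limit depends on the system.

```python
def batch_args(names, max_chars=128 * 1024):
    """Group names into batches whose combined length stays under max_chars.

    This mimics the idea behind xargs: pack as many arguments as fit
    into each command invocation.
    """
    batch, size = [], 0
    for name in names:
        cost = len(name) + 1  # +1 for the separator between arguments
        if batch and size + cost > max_chars:
            yield batch
            batch, size = [], 0
        batch.append(name)
        size += cost
    if batch:
        yield batch


if __name__ == "__main__":
    # Hypothetical filenames, half a million of them.
    names = ("img%06d.jpg" % i for i in range(500000))
    batches = list(batch_args(names))
    print(len(batches), "invocations instead of 500000")
```

With short filenames like these, half a million files fit into a few dozen invocations, which matches the point that rm is not run once per file.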
# ? Sep 14, 2012 19:18 |
|
So this is a question on one of my homeworks:
quote:8. How can the ASCII table be used as an addressing mechanism?
Maybe my brain is farting but I have no idea what it is asking. Anyone want to help refresh my memory...?
|
# ? Sep 15, 2012 00:44 |
|
That doesn't make sense to me either.
|
# ? Sep 15, 2012 00:46 |
|
an skeleton posted:So this is a question on one of my homeworks:
What class is it? I mean I guess you could make a tree structure and have the nodes be keyed on ASCII values, so a string "abc" would traverse down from the root through a->b->c, but that's dumb.
|
# ? Sep 15, 2012 00:48 |
|
It could be talking about some form of trie (like a patricia trie)
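For reference, a plain (uncompressed) trie keyed on characters looks roughly like this in Python. It is a minimal sketch, not a Patricia trie, and the function names and the "$end" marker are made up for the example.

```python
END = "$end"  # multi-character key, so it cannot collide with a single character


def trie_insert(root, word):
    """Walk down from the root one character at a time, creating nodes as needed."""
    node = root
    for ch in word:
        node = node.setdefault(ch, {})
    node[END] = True


def trie_contains(root, word):
    """Follow the characters of `word`; it is present only if the walk ends on a marker."""
    node = root
    for ch in word:
        if ch not in node:
            return False
        node = node[ch]
    return END in node


if __name__ == "__main__":
    root = {}
    for w in ("abc", "abd"):
        trie_insert(root, w)
    print(trie_contains(root, "abc"), trie_contains(root, "ab"))
```

Each character of the key selects the next node, which is the sense in which the characters act as an "address" into the structure.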
|
# ? Sep 15, 2012 00:57 |
|
Anyone know of a good site explaining how to play a YouTube video in your own 100% custom player? I remembered seeing something where you could use a hosted YouTube video and feed it into a Flash player you made yourself. Everything I look up relates how to change certain color parameters and what buttons are visible, which I don't want. Also, is there a way to get rid of the YouTube logo in the bottom right? I'm not trying to have it look that way on YouTube, but embedded on my own site. HTML5 would be okay too.
|
# ? Sep 15, 2012 18:15 |
|
an skeleton posted:So this is a question on one of my homeworks:
I'm betting that it's just asking you to establish that you understand that the alphabet is contiguous in ASCII and that this can be a useful fact.
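One common way that contiguity gets used: subtracting the code of 'a' from a letter's code gives an index from 0 to 25, so a letter can address a slot in a plain array. A small Python sketch (the function name is made up):

```python
def letter_counts(text):
    """Count ASCII letters using each letter's code as an array index."""
    counts = [0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            # 'a' maps to slot 0, 'b' to slot 1, ..., 'z' to slot 25.
            counts[ord(ch) - ord("a")] += 1
    return counts


if __name__ == "__main__":
    counts = letter_counts("Abc, ab!")
    print(counts[:3])
```

No lookup table or hashing is needed; the character code itself is the address.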
|
# ? Sep 15, 2012 21:14 |
|
LP0 ON FIRE posted:Anyone know of a good site explaining how to play a YouTube video in your own 100% custom player? I remembered seeing something where you could use a hosted YouTube video and feed it into a Flash player you made yourself. Everything I look up relates how to change certain color parameters and what buttons are visible, which I don't want. Also, is there a way to get rid of the YouTube logo in the bottom right? I'm not trying to have it look that way on YouTube, but embedded on my own site. HTML5 would be okay too.
Don't use YouTube, then. That's against their terms of service. What I will say is that you should probably just use YouTube instead of the lovely Flash player you will eventually replace it with. Why on earth do you want to stop it from looking like it's on YouTube? You can remove the logo with the modestbranding option. See https://developers.google.com/youtube/player_parameters
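Player parameters such as modestbranding are passed in the query string of the embed URL. The helper below is only a sketch for building such a URL; the function name and VIDEO_ID are placeholders, and the linked parameters page is the place to check whether a given option is still supported.

```python
from urllib.parse import urlencode


def embed_url(video_id, **params):
    """Build a YouTube embed URL with optional player parameters."""
    base = "https://www.youtube.com/embed/" + video_id
    return base + "?" + urlencode(params) if params else base


if __name__ == "__main__":
    # VIDEO_ID is a placeholder, not a real video.
    print(embed_url("VIDEO_ID", modestbranding=1))
```

The resulting URL goes in the src attribute of the embedding iframe.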
|
# ? Sep 15, 2012 22:19 |
|
Suspicious Dish posted:Don't use YouTube, then. That's against their terms of service.
I guess it makes sense it's against their terms of service. The client wants to be able to have the videos hosted on YouTube and click a button to still go to the YouTube site for the video, but just have an entirely different skin that doesn't look like the YouTube player. Thanks for the logo removal.
|
# ? Sep 16, 2012 00:00 |
|
On a related note, is there a good solution for embedding videos on an intranet site where YouTube (or any external site) wouldn't be an option? We use Chromeframe so we only have to worry about single browser compatibility.
|
# ? Sep 16, 2012 14:33 |
|
Uziel posted:On a related note, is there a good solution for embedding videos on an intranet site where YouTube (or any external site) wouldn't be an option? We use Chromeframe so we only have to worry about single browser compatibility.
JW Player seems to be the standard choice for a free embeddable flv player.
|
# ? Sep 16, 2012 15:29 |
|
Civil Twilight posted:JW Player seems to be the standard choice for a free embeddable flv player.
Thanks, I'll check that out.
|
# ? Sep 16, 2012 16:11 |
|
Munkeymon posted:There's an Excel megathread: http://forums.somethingawful.com/showthread.php?threadid=3132163
Thanks, didn't see that thread. I'll post over there with a screenshot.
|
# ? Sep 16, 2012 22:40 |
I've started learning about recursion, and it generally makes sense to me, but I've been asked to generate a "hypothetical but complete list of base cases that will allow the recurrence to always terminate" for the given relation: code:
Since there are no other conditions, can I just arbitrarily define base cases to cause the recurrence to stop wherever I want it to? My impulse is to say the base case is k = 0 and n is whatever, seeing as how it's arbitrarily greater than k. Am I being stupid? I feel like I am.
|
|
# ? Sep 16, 2012 23:23 |
|
Try computing c(2, 2) and see where you need to put in base cases.
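The recurrence itself did not survive in the post above, so the sketch below ASSUMES, purely for illustration, that it is Pascal's rule, c(n, k) = c(n-1, k-1) + c(n-1, k). If the actual relation differs, the base cases will differ too, but the method is the same: expand a small case by hand and note every call the recurrence cannot reduce further.

```python
def c(n, k):
    """Assumed recurrence (Pascal's rule) with the base cases it needs.

    Expanding c(2, 2) by hand gives c(1, 1) + c(1, 2), so a base case
    at k == 0 alone is not enough: the k == n and k > n calls must stop too.
    """
    if k == 0 or k == n:
        return 1
    if k > n:
        return 0  # reached only when a caller passes k > n
    return c(n - 1, k - 1) + c(n - 1, k)


if __name__ == "__main__":
    print(c(2, 2), c(4, 2))
```

Under this assumption, every call with 0 < k < n shrinks n by one and keeps k between 0 and n, so the recursion always lands on one of the listed base cases.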
|
# ? Sep 17, 2012 00:19 |
|
Uziel posted:Thanks, I'll check that out.
MediaElement.js is pretty useful too in case you do need to expand beyond a single browser.
|
# ? Sep 17, 2012 00:21 |