The Perl Short Questions Megathread: executable line noise


Mithaldu: Sep 25, 2007; Let's cuddle.

wolffenstein posted:

I created and used a XML

This is your first issue. XML does not map 1:1 to perl data structures and if you serialize poo poo in Perl you should use something that does. Either Data::Dumper or better JSON.
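As a rough illustration of the JSON route, here is a minimal round trip. It assumes JSON::PP (bundled with Perl since 5.14; the CPAN JSON module offers the same calls), and the record itself is made up for the example:

```perl
use strict;
use warnings;
use JSON::PP;   # core since Perl 5.14

# Hypothetical record, purely for illustration.
my $record = {
    name  => 'widget',
    tags  => [ 'a', 'b' ],
    stock => { warehouse => 3, shop => 1 },
};

my $json = JSON::PP->new->canonical->pretty;   # canonical: sorted keys, stable output
my $text = $json->encode($record);             # Perl structure -> JSON text
my $copy = $json->decode($text);               # JSON text -> Perl structure

print $text;
print "round trip ok\n"
    if $copy->{stock}{warehouse} == 3 && $copy->{tags}[1] eq 'b';
```

Hashes and arrays come back as hashes and arrays, which is the 1:1 mapping XML doesn't give you.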

However, if you find that you have requirements like you described, then you need a real drat database. Something like DBD::SQLite will give you all the power of SQL, but still keep the data in a single file you can carry around.

In summary: Some people who have a problem think they'll use XML. Now they have two problems.

# ? May 28, 2011 20:19

wolffenstein: Aug 2, 2002; Pork Pro

Yeah okay I'll go back in time and tell intern me to not use XML.

# ? May 28, 2011 20:33

Roseo: Jun 1, 2000; Forum Veteran

uG posted:

How would I check that if i'm making my requests with LWP? Or will I need to use an actual socket?

I'm trying to see if I can access the underlying IO::Socket::INET that LWP uses, but I don't think I can...

Something similar to this should work:

code:

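use LWP::UserAgent;
my $ua = LWP::UserAgent->new(keep_alive => 10); ## cache up to 10 connections
my $response = $ua->get('http://www.google.com');
# ping() is false once the peer has dropped a cached connection; peerhost() is the readable address
$_->ping() or warn $_->peerhost() . " dropped connection" for $ua->conn_cache->get_connections;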

# ? May 29, 2011 23:47

uG: Apr 23, 2003; by Ralp

Ok I got another one. I picked up a quick job on vworker for a forked/threaded brute force script, easy enough. According to the buyer, he should be able to run more than 10 forks, it's slow, and his PHP script is faster. I can't see how to make this significantly more efficient though.

http://www.ugexe.com/brute.txt

I mean i'm sure there are better ways to iterate over some things, but I imagine my problem is with the forks and the LWP connections. Also i'm not very fond of using IPC::ShareLite, and i'm not sure how to remove it from memory when i'm done.

# ? Jun 2, 2011 03:19

Roseo: Jun 1, 2000; Forum Veteran

uG posted:

Ok I got another one. I picked up a quick job on vworker for a forked/threaded brute force script, easy enough. According to the buyer, he should be able to run more than 10 forks, it's slow, and his PHP script is faster. I can't see how to make this significantly more efficient though.

http://www.ugexe.com/brute.txt

I mean i'm sure there are better ways to iterate over some things, but I imagine my problem is with the forks and the LWP connections. Also i'm not very fond of using IPC::ShareLite, and i'm not sure how to remove it from memory when i'm done.

Actually, I'd guess you're hanging on the LWP blocking operation. You may want to try an asynchronous HTTP get (such as AnyEvent::HTTP) that'll continue working as it waits for a response. You may also want to run it through Devel::NYTProf and figure out for certain where it's slow, not simply guess.

# ? Jun 2, 2011 05:26

uG: Apr 23, 2003; by Ralp

Roseo posted:

Actually, I'd guess you're hanging on the LWP blocking operation. You may want to try an asynchronous HTTP get (such as AnyEvent::HTTP) that'll continue working as it waits for a response. You may also want to run it through Devel::NYTProf and figure out for certain where it's slow, not simply guess.

Thanks. I went with Mojo::UserAgent so it was easier to just plug in and it did the trick.

# ? Jun 2, 2011 16:42

Accipiter: Jan 24, 2004; SINATRA.

This is probably ridiculously easy, but for some reason (whether it's because my brain is dry from all the coding I've been doing or that it's early and I haven't had coffee yet), I cannot figure out how to do it.

I want to generate an array containing a list of the past 12 hours, in 24 hour format, from the current hour. So it should be a list similar to:

19:00
20:00
21:00
22:00
23:00
00:00
01:00
02:00

...and so on, with the array ending with the current hour.

This can't be THAT tough, can it?

# ? Jun 5, 2011 13:24

MacGowans Teeth: Aug 13, 2003

Accipiter posted:

This is probably ridiculously easy, but for some reason (whether it's because my brain is dry from all the coding I've been doing or that it's early and I haven't had coffee yet), I cannot figure out how to do it.

I want to generate an array containing a list of the past 12 hours, in 24 hour format, from the current hour. So it should be a list similar to:

19:00
20:00
21:00
22:00
23:00
00:00
01:00
02:00

...and so on, with the array ending with the current hour.

This can't be THAT tough, can it?

Wouldn't you just set an array to localtime(time) and count back? Like if you set @currentH to localtime(time), you could push $currentH[2] onto another array and then code to count backwards and push each new number onto the array, so that you have, say, $i-- and then if $i==-1, you set $i=23. Wouldn't that do it? I'm pretty new to perl, so that's my excuse if I'm laughably wrong for some reason.
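Roughly what that would look like, as a sketch only: hours_ending_at is a made-up helper name, and like the description it just counts clock hours, so it knows nothing about DST.

```perl
use strict;
use warnings;

# Build the list of $count hours ending at $current_hour (0..23),
# wrapping around at midnight. Helper name is hypothetical.
sub hours_ending_at {
    my ($current_hour, $count) = @_;
    my @hours;
    my $h = $current_hour;
    for (1 .. $count) {
        unshift @hours, sprintf('%02d:00', $h);   # prepend so the list ends with the current hour
        $h--;
        $h = 23 if $h == -1;                      # wrap around midnight
    }
    return @hours;
}

my @currentH = localtime(time);    # element 2 is the hour
print "$_\n" for hours_ending_at($currentH[2], 12);
```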

# ? Jun 5, 2011 15:25

Anaconda Rifle: Mar 23, 2007; Yam Slacker

Accipiter posted:

This is probably ridiculously easy, but for some reason (whether it's because my brain is dry from all the coding I've been doing or that it's early and I haven't had coffee yet), I cannot figure out how to do it.

I want to generate an array containing a list of the past 12 hours, in 24 hour format, from the current hour. So it should be a list similar to:

19:00
20:00
21:00
22:00
23:00
00:00
01:00
02:00

...and so on, with the array ending with the current hour.

This can't be THAT tough, can it?

perl -e 'printf( "%02d:00\n", $_ % 24 ) for ( ( (localtime)[2] - 11) .. (localtime)[2] )'

or, if you want an array:

perl -e 'my @a = map {sprintf( "%02d:00", $_ % 24 )} ( ( (localtime)[2] - 11) .. (localtime)[2] ); print "$_\n" for @a'

I'll let you clean it up however you want.

Anaconda Rifle fucked around with this message at 15:44 on Jun 5, 2011

# ? Jun 5, 2011 15:42

Erasmus Darwin: Mar 6, 2001

One caveat: The suggestions so far ignore the transition to/from Daylight Savings. Depending on what you're doing this for, you may or may not need it to handle those cases correctly.

I believe when DST starts, it looks like:
23:00, 00:00, 01:00, 03:00, 04:00, 05:00

And when it ends:
23:00, 00:00, 01:00, 02:00, 02:00, 03:00

If you need to handle this case (say because you're listing shifts at a hospital that runs 24/7/365), then I'd suggest something like this:

code:

my $end = time();
my $start = $end - 11*3600;

while ($start <= $end) {
  printf("%02d:00\n", (localtime($start))[2]);
  $start += 3600;
}

And for sanity-saving purposes, I'd recommend throwing in a comment that you're doing it that way to be DST-safe. Otherwise, you run the risk of it getting refactored into a non-DST-safe one-liner by someone down the road.

# ? Jun 5, 2011 19:26

Anaconda Rifle: Mar 23, 2007; Yam Slacker

Good call, Erasmus Darwin. I didn't even consider that.

# ? Jun 5, 2011 22:00

Roseo: Jun 1, 2000; Forum Veteran

code:

use DateTime;

my $now = DateTime->now( time_zone => 'local' );
my @array = map { $now->clone->subtract( hours => $_ )->strftime('%H:00') } reverse (0 .. 11);

DateTime's pretty cool. It has daylight saving time transitions, figuring out the local time zone, and all sorts of other PITA stuff baked right in.

Edit: Fixed to handle the hh:mm formatting.

Roseo fucked around with this message at 22:07 on Jun 5, 2011

# ? Jun 5, 2011 22:03

MacGowans Teeth: Aug 13, 2003

...so I WAS laughably wrong. But the last three solutions are a perfect illustration of why I love perl. Good perl is clever, awesome, and aesthetically pleasing all at once.

# ? Jun 5, 2011 22:13

Dinty Moore: Apr 26, 2007

Is there an alternative to I18N::Langinfo for Perl on Windows? It seems that it's not built into Windows Perl builds due to the lack of an nl_langinfo() implementation in Windows. I need to be able to find out what encoding I should be expecting from terminal reads. (It uses Term::ReadLine to read lines from the terminal, but unfortunately Term::ReadLine doesn't handle translating into Perl's internal UTF8(-esque?) string representation.) And yes, I've tried Google, but it seems to be failing me. (Or I'm just dumb.)

# ? Jun 6, 2011 19:39

Mario Incandenza: Aug 24, 2000; Tell me, small fry, have you ever heard of the golden Triumph Forks?

Not really related but tchrist's recent UTF8 post is pretty fantastic:

http://stackoverflow.com/questions/6162484/why-does-modern-perl-avoid-utf-8-by-default/6163129#6163129

# ? Jun 6, 2011 20:30

Dinty Moore: Apr 26, 2007

Mario Incandenza posted:

Not really related but tchrist's recent UTF8 post is pretty fantastic:

http://stackoverflow.com/questions/6162484/why-does-modern-perl-avoid-utf-8-by-default/6163129#6163129

Yeah, I misspoke, it's not really accurate to refer to the internal string format as UTF8. But whatever the case...

# ? Jun 6, 2011 21:21

uG: Apr 23, 2003; by Ralp

I've been using pQuery for web scraping, and have seen Web::Scraper get mentioned as well. What do you guys use for scraping js/ajax heavy sites? Google is pretty much pointing me towards Selenium.

# ? Jun 8, 2011 07:26

Filburt Shellbach: Nov 6, 2007; Apni tackat say tujay aaj mitta juu gaa!

You're the first person I've seen who uses pQuery. Do you like it?

Selenium is worth trying, definitely. Alien::Selenium may or may not make your life easier.

PhantomJS is an up-and-comer too. It's a headless webkit. It's not as entrenched as Selenium but the idea is pretty neat.

# ? Jun 8, 2011 09:28

uG: Apr 23, 2003; by Ralp

Filburt Shellbach posted:

You're the first person I've seen who uses pQuery. Do you like it?

Selenium is worth trying, definitely. Alien::Selenium may or may not make your life easier.

PhantomJS is an up-and-comer too. It's a headless webkit. It's not as entrenched as Selenium but the idea is pretty neat.

pQuery had better documentation than Web::Scraper which is why I went that route originally (deadlines).

I've never run Selenium before, but I am worried about the number of browser instances I may need loaded up (if it will be a memory problem). PhantomJS looks promising, just wish it had a nice Perl interface like Selenium.

uG fucked around with this message at 05:08 on Jun 9, 2011

# ? Jun 8, 2011 15:43

Schweinhund: Oct 23, 2004

I'm trying to run PAR Packer and I get this error:

Perl lib version (5.10.0) doesn't match executable version (v5.10.1)

How do I fix this?

I tried to upgrade ActivePerl but I keep getting an error "The selected directory contains an incompatible version of ActivePerl. Please choose a different installation location".

Google returns zilch for that message. How do I upgrade Perl without losing all my installed modules and stuff?

# ? Jun 10, 2011 17:23

uG: Apr 23, 2003; by Ralp

Schweinhund posted:

I'm trying to run PAR Packer and I get this error:

Perl lib version (5.10.0) doesn't match executable version (v5.10.1)

How do I fix this?

I tried to upgrade ActivePerl but I keep getting an error "The selected directory contains an incompatible version of ActivePerl. Please choose a different installation location".

Google returns zilch for that message. How do I upgrade Perl without losing all my installed modules and stuff?

PAR never plays nice for me even when i'm not using that Active State poo poo. I can't answer your question, but is there a reason you are using Active State over Strawberry?

# ? Jun 10, 2011 17:35

Schweinhund: Oct 23, 2004

Never heard of it before. I think I just googled "Perl for windows" and installed that.

# ? Jun 10, 2011 18:20

Clobbersaurus: Feb 26, 2004

Is there some way to do prompted input (stdin, not command line) into perl with filesystem-searching tab completion?

There's Term::Complete, but that only works on a list of arguments you supply. I'm having difficulty believing there isn't a working solution for this, but I guess interfacing with the terminal can be prickly.

# ? Jun 23, 2011 19:48

Filburt Shellbach: Nov 6, 2007; Apni tackat say tujay aaj mitta juu gaa!

The word you want for that is ReadLine, which emulates a shell-like environment. Term::ReadLine is the standard there, but it's not great. Term::ReadLine::Gnu can sometimes make it better.

# ? Jun 24, 2011 00:45

Mario Incandenza: Aug 24, 2000; Tell me, small fry, have you ever heard of the golden Triumph Forks?

Devel::REPL has a bunch of tab completion plugins, I can't imagine it would be difficult to hack something up for that.

# ? Jun 26, 2011 17:56

Filburt Shellbach: Nov 6, 2007; Apni tackat say tujay aaj mitta juu gaa!

Mario Incandenza posted:

Devel::REPL has a bunch of tab completion plugins, I can't imagine it would be difficult to hack something up for that.

I hope it's not. I did write all of Devel::REPL's tab completion functionality. :cool:

# ? Jun 26, 2011 21:48

MacGowans Teeth: Aug 13, 2003

Has anyone else had problems with pp including every single installed module instead of just the ones the script you're trying to pack depends on? I'm using strawberry perl, the latest version, and the latest PAR::Packer on Windows 7 64-bit. I did some experimenting a while back, and it looked to me like scandeps wasn't working or something, because I was literally getting a list of everything I had installed. If there's a fix, or if it's because I'm doing something stupid, I'd love to know. Right now, when I make a PAR or executable, I'm using -n and manually including each module and dll, which is kind of a pain, but the difference is pretty clear. For instance, I had a short script that I tried to make into a PAR, and the PAR size was around 3Mb, but when I redid it using -n and just the modules I knew it needed, it went down to something like 100k. So, have you seen this? Am I doing something dumb? I've searched everywhere to see if anyone's having the same problem, and it doesn't look that way, so I'm thinking I just don't know what I'm doing.

# ? Jun 26, 2011 21:56

uG: Apr 23, 2003; by Ralp

Sizzler Manager posted:

Has anyone else had problems with pp including every single installed module instead of just the ones the script you're trying to pack depends on? I'm using strawberry perl, the latest version, and the latest PAR::Packer on Windows 7 64-bit. I did some experimenting a while back, and it looked to me like scandeps wasn't working or something, because I was literally getting a list of everything I had installed. If there's a fix, or if it's because I'm doing something stupid, I'd love to know. Right now, when I make a PAR or executable, I'm using -n and manually including each module and dll, which is kind of a pain, but the difference is pretty clear. For instance, I had a short script that I tried to make into a PAR, and the PAR size was around 3Mb, but when I redid it using -n and just the modules I knew it needed, it went down to something like 100k. So, have you seen this? Am I doing something dumb? I've searched everywhere to see if anyone's having the same problem, and it doesn't look that way, so I'm thinking I just don't know what I'm doing.

Not only do I have that problem, but it often leaves out a bunch of modules that are required unless I -n them. The only packager I ever had luck with was perl2exe, which apparently sucks.

# ? Jun 30, 2011 04:08

MacGowans Teeth: Aug 13, 2003

uG posted:

Not only do I have that problem, but it often leaves out a bunch of modules that are required unless I -n them. The only packager I ever had luck with was perl2exe, which apparently sucks.

Ha, good to know it's not just me.

I still use PAR, but it takes a lot of trial and error, because I'm including all the modules and dll files by hand. What's really inexplicable about the way pp works is that sometimes I'll package something as a PAR file and it works just fine with parl.exe, but then if I create it as an executable with the exact same arguments, it either blows up or doesn't behave correctly. But some things work as an executable and not as a PAR, so it just adds another layer of annoyance to the whole process.

On a separate topic - and this is driving me nuts - when you create a module, how do you automatically move sample scripts to your installation directory when you do a make install, or better yet, create perl scripts wrapped in windows batch files? I can see that a lot of modules I've installed do this, and I'd really like to do the same thing with a module I'm working on, but I'm damned if I can figure out how.

I'm actually using Module::Build, if that makes a difference.

Edit: Nevermind, I'm just dumb. :eng99:

All you have to do is create a bin folder in the distribution and stick perl scripts in there. Build install automatically runs pl2bat on them and sticks them in perl/site/bin.

MacGowans Teeth fucked around with this message at 20:27 on Jul 1, 2011

# ? Jun 30, 2011 13:04

Back Stabber: Feb 5, 2009

Noob question.

I have an input set to a paragraph and i need to remove its sentences from first to last and save them as their own variables. I've just been introduced to regular expressions and thought that something along the lines of this should work:

code:

$paragraph = <STDIN>;
chomp $paragraph;
($sentence) = ($paragraph =~ /^.*\./);

But no such luck. The result is the number one. Does anybody have any wisdom on this type of thing?

# ? Jul 5, 2011 19:31

MacGowans Teeth: Aug 13, 2003

Back Stabber posted:

Noob question.

I have an input set to a paragraph and i need to remove its sentences from first to last and save them as their own variables. I've just been introduced to regular expressions and thought that something along the lines of this should work:
code:
$paragraph = <STDIN>;
chomp $paragraph;
($sentence) = ($paragraph =~ /^.*\./);
But no such luck. The result is the number one. Does anybody have any wisdom on this type of thing?

If you're basically just splitting on periods, why not just use split() with that same regular expression? ($sentence) = ($paragraph =~ /^.*\./) doesn't work because the right side is evaluating to 1 (successful match). Also, the * is greedy, so it will match as far to the right as it can. You can make it lazy by putting a ? after it. If you use

code:

$paragraph =~ /^.*?\./;
$sentence = $&;

...that might do what you want, assuming I understand what you want.
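To see the greedy/lazy difference and the "number one" result side by side, here's a small sketch (the sample paragraph is made up):

```perl
use strict;
use warnings;

my $paragraph = 'First one. Second one. Third one.';

my ($greedy) = $paragraph =~ /^(.*\.)/;    # .* runs to the LAST period
my ($lazy)   = $paragraph =~ /^(.*?\.)/;   # .*? stops at the FIRST period
my ($none)   = $paragraph =~ /^.*?\./;     # no parentheses: list context just yields 1

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
print "none:   $none\n";
```

Putting parentheses around the part you want is what makes the list-context match hand back text instead of a plain success value.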

# ? Jul 5, 2011 20:17

uG: Apr 23, 2003; by Ralp

code:

my @sentences;
while($paragraph =~ m/(.*?)\./g) {  # /g resumes after the previous match instead of looping forever
    my $sentence = $1;
    push @sentences, $sentence;
}

The reason you're getting the number 1 is because you aren't capturing anything, you are just matching. Plus it will only match the first sentence even if you did capture. The ^ isn't needed, since your starting match is literally anything, including nothing.

Or you can split on periods like it was mentioned above:

code:

my @sentences = split(/\./, $paragraph);  # first argument is a pattern; a bare '.' matches any character

uG fucked around with this message at 23:14 on Jul 5, 2011

# ? Jul 5, 2011 23:09

Roseo: Jun 1, 2000; Forum Veteran

Or you can use CPAN and a module like Lingua::Sentence.

code:

C:\>perl -MLingua::Sentence -E "my $s = Lingua::Sentence->new('en'); 
                                      my $text = q/The key to use is No. 34. 
                                      It will allow you to enter the secured 
                                      area. If anyone else asks you for the 
                                      key, tell them 'No.' Cordially yours, 
                                      Mr. John Doe/; 
                                      say for $s->split($text);
                                     "

The key to use is No. 34.
It will allow you to enter the secured area.
If anyone else asks you for the key, tell them 'No.'
Cordially yours, Mr. John Doe

(Edited for tablebreakingness)

# ? Jul 5, 2011 23:37

Khelmar: Oct 12, 2003; Things fix me.

I'm looking to step through an array 50 at a time, to batch a set of requests to a server in BioPerl. However, the system I'm on can't seem to use natatime, despite my extensive efforts. Is there an easy way to get rid of natatime in this code?

code:

my $batch = natatime(50, @keys);

while (my @batch_seq = $batch->())
{
my $gbseq = $gb->get_Stream_by_id(\@batch_seq);
}

# ? Jul 6, 2011 17:01

Mario Incandenza: Aug 24, 2000; Tell me, small fry, have you ever heard of the golden Triumph Forks?

code:

my @work = @array;  # splice consumes the list, so work on a copy
while ( my @chunk = splice @work, 0, 50 ) {  # always take from the front
  do_stuff(\@chunk);
}

Mario Incandenza fucked around with this message at 19:25 on Jul 6, 2011

# ? Jul 6, 2011 19:22

Anaconda Rifle: Mar 23, 2007; Yam Slacker

Mario Incandenza posted:

code:

for ( my $i = 0; $i >= @array; $i += 50 ) {
  my @chunk = splice @array, $i, 50;
  do_stuff(\@chunk);
}

You probably want a <, not a >=.

# ? Jul 6, 2011 19:24

Roseo: Jun 1, 2000; Forum Veteran

natatime is pure perl, so if worst comes to worst you can use the List::MoreUtils function by putting it in your namespace:

code:

sub natatime ($@) {
    my $n    = shift;
    my @list = @_;
    return sub {
        return splice @list, 0, $n;
    }
}
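For completeness, a quick sketch of how that iterator gets driven. The key list is a made-up stand-in for the real sequence IDs, and the BioPerl call is only mentioned in a comment so the snippet runs without it:

```perl
use strict;
use warnings;

# Same pure-Perl iterator as above, repeated so the snippet runs standalone.
sub natatime {
    my $n    = shift;
    my @list = @_;
    return sub { return splice @list, 0, $n };
}

my @keys  = map { "ID$_" } 1 .. 120;   # hypothetical stand-ins for the real keys
my $batch = natatime(50, @keys);
my @sizes;

while ( my @batch_seq = $batch->() ) {
    push @sizes, scalar @batch_seq;
    # the real code would call $gb->get_Stream_by_id(\@batch_seq) here
}

print "batch sizes: @sizes\n";
```

The closure keeps its own copy of the list, so @keys itself is left untouched.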

# ? Jul 7, 2011 02:19

MacGowans Teeth: Aug 13, 2003

I'm using this code, and it works, but it seems overly complicated to me. Basically, I'm starting with @list, which contains hash references, each of which has a Parent (which is just a scalar value matching an ID somewhere else), a Type, and a Rank. For each set of hash refs with the same parent, if there are multiple refs with the same value in Type, their Ranks must be unique. Is there a better/simpler way to do this?

code:

  my %parents;
  map { $parents{ $_->{Parent} }++ } @list; # get unique parent IDs
  for ( keys %parents )
  {
    next unless $parents{$_} > 1; # unique by definition; don't need to do anything else.
    my $parent = $_;
    my @children = grep { $_->{Parent} eq $parent } @list; # get hashrefs with current parent ID.
    my %uniqueTypes;
    map { $uniqueTypes{ $_->{Type} }++ } @children; # get unique types and count of each.
    for ( keys %uniqueTypes )
    {
      next unless $uniqueTypes{$_} > 1; # Same as with parent IDs, above.
      my $type = $_;
      my @itemsOfType = grep { $_->{Type} eq $type } @children;

      # send a message unless "map { $_->{Rank} } @itemsOfType" 
      # returns a list of unique items.
    }
  }

I'm thinking maybe sticking each attribute into an array ref, so reading through the list of hashes once and building a list of array refs, but I don't think that would really get me anything, and it would probably be more complicated.

Any thoughts? Or, more generally, is there anything horrible or non-idiomatic about this code? I've only been using Perl for a few months, so I'm sure there's a lot I could do better.

# ? Jul 7, 2011 16:14

Erasmus Darwin: Mar 6, 2001

Are you just trying to create a list of unique entries in @list, where uniqueness is defined as not having the exact same combination of parent, type, and rank as another entry? If so, what about just serializing those three values and using them as a hash key?

code:

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

my @list = (
 { Parent => "foo", Type => "Type 1", Rank => 1 },
 { Parent => "foo", Type => "Type 2", Rank => 1 },
 { Parent => "foo", Type => "Type 1", Rank => 2 },
 { Parent => "bar", Type => "Type 1", Rank => 1 },
 { Parent => "foo", Type => "Type 2", Rank => 2 },
 { Parent => "foo", Type => "Type 1", Rank => 1 },
 { Parent => "foo", Type => "Type 1", Rank => 1 }
);

my %grouped_entries;
for (@list) {
  push @{$grouped_entries{quotemeta($_->{Parent}) . "," . quotemeta($_->{Type}) . "," . quotemeta($_->{Rank})}}, $_;
}

for (values %grouped_entries) {
  if (@$_ == 1) {
    print "Unique entry: ", Dumper($_->[0]), "\n";
  } else {
    print scalar(@$_), " entries with the same Parent/Type/Rank: ", Dumper($_), "\n";
  }
}
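If only the clashes matter (same Parent, same Type, repeated Rank), a counting hash may be enough. This is just a sketch on made-up sample data, and it assumes none of the fields contains a "\0" byte, since that is used as the key separator:

```perl
use strict;
use warnings;

# Hypothetical sample data in the same shape as @list above.
my @list = (
    { Parent => 'foo', Type => 'Type 1', Rank => 1 },
    { Parent => 'foo', Type => 'Type 1', Rank => 2 },
    { Parent => 'bar', Type => 'Type 1', Rank => 1 },
    { Parent => 'foo', Type => 'Type 1', Rank => 1 },
);

# Count each Parent/Type/Rank combination (assumes no field contains "\0").
my %seen;
$seen{ join "\0", @{$_}{qw(Parent Type Rank)} }++ for @list;

my @clashes = sort grep { $seen{$_} > 1 } keys %seen;
for my $key (@clashes) {
    my ($parent, $type, $rank) = split /\0/, $key;
    print "Parent $parent, type $type: rank $rank used $seen{$key} times\n";
}
```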

# ? Jul 7, 2011 19:50

MacGowans Teeth: Aug 13, 2003

Erasmus Darwin posted:

code:

#!/usr/bin/perl

use strict;
use warnings;

use Data::Dumper;

my @list = (
 { Parent => "foo", Type => "Type 1", Rank => 1 },
 { Parent => "foo", Type => "Type 2", Rank => 1 },
 { Parent => "foo", Type => "Type 1", Rank => 2 },
 { Parent => "bar", Type => "Type 1", Rank => 1 },
 { Parent => "foo", Type => "Type 2", Rank => 2 },
 { Parent => "foo", Type => "Type 1", Rank => 1 },
 { Parent => "foo", Type => "Type 1", Rank => 1 }
);

my %grouped_entries;
for (@list) {
  push @{$grouped_entries{quotemeta($_->{Parent}) . "," . quotemeta($_->{Type}) . "," . quotemeta($_->{Rank})}}, $_;
}

for (values %grouped_entries) {
  if (@$_ == 1) {
    print "Unique entry: ", Dumper($_->[0]), "\n";
  } else {
    print scalar(@$_), " entries with the same Parent/Type/Rank: ", Dumper($_), "\n";
  }
}

drat, I never thought about it that way. Yes, that simplifies it a lot. I knew I was over-complicating things. Thanks!

# ? Jul 7, 2011 20:07
