 
  • Locked thread
toadee
Aug 16, 2003

North American Turtle Boy Love Association

I'm having difficulty trying to get the module HTTP::GHTTP to have a timeout value. I'm using GHTTP because I'm trying to pull down a ton of very small objects simultaneously, and it's from all accounts the most lightweight of perl HTTP modules.

From the module's CPAN page:

quote:

Doing timeouts is an exercise for the reader (hint: lookup select() in perlfunc).

Ok, so I have the following:

code:
my $r = HTTP::GHTTP->new($url);

$r->set_async;
$r->set_chunksize(1024);
$r->prepare;

my $stat;
while ($stat = $r->process) {
   select(undef, undef, undef, 15);
}
Which seems to just sit there and not actually time out after 15 seconds when things go wrong. I'm at a loss as to what else to try, mostly because I'm stupid, I'm sure.

EDIT: and by stupid I mean I got that while loop with select statement from someone else's suggestion online and stuck it in there without having a clue why it would work, so maybe that's why it doesn't.

toadee fucked around with this message at 12:13 on Nov 8, 2008


toadee
Aug 16, 2003

North American Turtle Boy Love Association

TiMBuS posted:

Well, select(undef, undef, undef, 15); should time out after 15 seconds, so if it's hanging indefinitely then it's probably something to do with the $r->process method call. Perhaps the request is timing out? Maybe the async call is being messed up by select causing a halt? Dunno. Make it print for every iteration to see what's going on.
Use warn instead of print so that you can output to stderr and you won't have buffering issues.

hmm, yes, I see, it just sits there until $stat gets returned, which is what it looked like it would do when I saw it, but you know, people on the internet giving perl advice never lie...

So maybe I just need a better understanding of how select() works? Reading the perldoc for it I don't see how this becomes useful in working a timeout value into GHTTP.
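
EDIT: for anyone finding this later, the way select() becomes useful here is to put the request's socket into the read set, so the four-argument select() waits for data with a timeout instead of just sleeping. A sketch only, assuming HTTP::GHTTP's get_socket() returns the underlying file descriptor after prepare (check your version's docs):

```perl
use strict;
use warnings;
use HTTP::GHTTP;

my $url = 'http://example.com/';   # hypothetical URL
my $r   = HTTP::GHTTP->new($url);
$r->set_async;
$r->set_chunksize(1024);
$r->prepare;

# Build an fd set containing the request's socket so select() can tell
# us when there is data to process, or report that we've timed out.
my $fd  = $r->get_socket;          # assumed to return a file descriptor
my $rin = '';
vec($rin, $fd, 1) = 1;

my $timeout = 15;
while (my $stat = $r->process) {
    # Wait until the socket is readable, but no longer than $timeout seconds.
    my $nfound = select(my $rout = $rin, undef, undef, $timeout);
    die "request timed out after ${timeout}s\n" unless $nfound;
}
print $r->get_body;
```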

toadee
Aug 16, 2003

North American Turtle Boy Love Association

TiMBuS posted:

I just installed ghttp to have a look. It seems like it is looping as expected, it's just that when I add sleep/select into the loop, it runs stupidly slow. I'm guessing the request is being affected by the main process being halted.

Oh and yeah, select is used to select a specific handle to use for standard IO operations. Select is blocking, so it will wait until the filehandle is available or until the specified timeout value is hit. The last parameter to select is a timeout value.

Hmm, well then, any suggestions for something very lightweight that will make an HTTP request that I can assign a timeout value to? LWP is way way too big but nothing else I've found has an option for timeout, which seems bizarre since I can think of a lot of reasons why you wouldn't want a script that is in need of a screamingly fast and light HTTP module to sit around waiting for a dead server.

toadee
Aug 16, 2003

North American Turtle Boy Love Association

oh! duh! Alright, yes, that does work... thanks!

toadee
Aug 16, 2003

North American Turtle Boy Love Association

I did it a lot shorter than that but I was going on your initial post of "7 characters"

toadee
Aug 16, 2003

North American Turtle Boy Love Association

heeen posted:

Behold my palindrome test:

errr...

my shell posted:

> ./pdrome.pl
lalalal
palindrome!
> ./pdrome.pl
notone
palindrome!
> ./pdrome.pl
huh?
palindrome!

toadee
Aug 16, 2003

North American Turtle Boy Love Association

Much better. Neat little algorithm that is!

toadee
Aug 16, 2003

North American Turtle Boy Love Association

You can also do something like this:

in foo.pm:
code:
sub new {
    my $class = shift;
    my $self = {};
    bless( $self, $class );
    return $self;
}

... write other subs and such ...
Then in your other programs use:

code:
use foo;

my $bar = foo->new();
$bar->test();

toadee fucked around with this message at 13:36 on Sep 29, 2010

toadee
Aug 16, 2003

North American Turtle Boy Love Association

syphon posted:

The edit fixed it. Now that I have working code... I'll have to go back and try to comprehend what's going on! Thanks for your help!

What's happening is what you said initially. @Processors is indeed an array of anonymous hashes, each containing the info for one processor. When you say 'for my $processor (@Processors)' you are looping through that array, assigning a reference to the anonymous hash in each element to $processor. From there you loop through all keys in %{$processor} (the dereferenced hash), and print the value for each key in that hash.
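
A stripped-down illustration of that structure and loop, with made-up sample data:

```perl
use strict;
use warnings;

# @Processors is an array of anonymous hashes (hypothetical sample data)
my @Processors = (
    { Name => 'CPU0', Speed => 2400 },
    { Name => 'CPU1', Speed => 2400 },
);

for my $processor (@Processors) {
    # $processor is a reference to one anonymous hash
    for my $key ( sort keys %{$processor} ) {
        print "$key => $processor->{$key}\n";
    }
}
```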

toadee
Aug 16, 2003

North American Turtle Boy Love Association

syphon posted:

I've got a quick question. Does anyone know of a module to measure the duration of dates? For my application, I have a string that represents a server's "LastBootTime" (e.g. "11/14/2010 8:01:56 AM") and want a module to convert that into an Uptime. I poked around the DateTime stuff on CPAN but can't find something applicable.

Date::Calc is awesome.
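
For the uptime case specifically, something along these lines should do it — a sketch; the date format is taken from your example string, and Delta_DHMS/Today_and_Now are standard Date::Calc exports:

```perl
use strict;
use warnings;
use Date::Calc qw(Delta_DHMS Today_and_Now);

# Parse something like "11/14/2010 8:01:56 AM" (format assumed from the post)
my $boot = '11/14/2010 8:01:56 AM';
my ($mon, $day, $year, $hour, $min, $sec, $ampm) =
    $boot =~ m{^(\d+)/(\d+)/(\d+)\s+(\d+):(\d+):(\d+)\s+(AM|PM)$}
        or die "unparseable date: $boot\n";
$hour += 12 if $ampm eq 'PM' && $hour != 12;
$hour = 0   if $ampm eq 'AM' && $hour == 12;

# Difference between boot time and now, as days/hours/minutes/seconds
my ($Dd, $Dh, $Dm, $Ds) = Delta_DHMS(
    $year, $mon, $day, $hour, $min, $sec,
    Today_and_Now(),
);
print "Uptime: ${Dd}d ${Dh}h ${Dm}m ${Ds}s\n";
```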

toadee
Aug 16, 2003

North American Turtle Boy Love Association

So I have a problem that I hope isn't too dumb. Basically, I have a directory that gets about 100 files in it a day, each file somewhere between 17 and 50 megs in size (the files contain call records). I'd like to scan through these files very quickly, as quick as possible, for an arbitrary string.

I've found this code: http://cseweb.ucsd.edu/~sorourke/wf.pl written for the Widefinder project. It is indeed very, very fast. I go through a file in a little under 0.1 seconds in fact, so this would be good. Unfortunately, after jamming it into a subroutine called via a loop through this list of files, performance seems to bog down after 10 files or so. I'm sure this has to do with file IO issues that I don't even begin to understand, but I have this nagging feeling I'm going about this in a very wrong way, and I'm wondering if anyone here has tackled similar issues/has any suggestions?

toadee
Aug 16, 2003

North American Turtle Boy Love Association

Unfortunately, the files are generated constantly, at 15 minute intervals. Their point of origination (a VoIP switch) would only have the last 15 minutes of call records before the next batch is slurped, and in general, the searches that need to be performed would cover many such intervals while looking for calls.

toadee
Aug 16, 2003

North American Turtle Boy Love Association

het posted:

How much of a difference does that particular code make compared to like a normal straightforward perl script? Is it faster than just grep? What do you need to do with the lines once you find them, just save them to a file?

Much faster than grep, grep takes about 3 seconds to search one of the files (depending on how large it is ie. how many call records are in it). fgrep is a good deal faster, about half a second on average, but the code above is about 5 times faster than fgrep at this. I'm gathering these to deliver to a CGI request, basically a tool for us to use in our NOC to query these call records quickly.

JawnV6 posted:

I'm not really following what "constantly, at 15 minute intervals" means? At any given second could there be new files, or is it just at 15:04, 30:04 past the hour, etc.? Once a new file is present, will more data be appended to it or is it static and must be searched within 15 minutes before the source deletes it?

Essentially, every 15 minutes a file is grabbed from several VoIP switches that contains a ;-separated list of call data records, these get stored in a directory for us to reference when there are reports of issues with calls (they contain info like trunk group selection, outgoing carrier selection, etc). Right now we've just been using command line tools to grep through them but I got the idea to take some spare time and hack together a perl CGI/jquery formatted web interface for the searches, as it's actually a hell of a lot easier to parse through a big list of these things in a nicely formatted table.

Gazpacho posted:

The reference code doesn't unmap the file in the parent process after searching it. You probably should, as a matter of general hygiene and to make sure you aren't creating any leaks that kill your performance which would be my first guess.

I'd also recommend looking at the process list to see whether you are creating zombie processes that the parent needs to clean up.

So I just tried unmapping after finishing up searching through $str and curiously it slows the whole thing way down. I'm guessing this is why the original didn't end up doing so as well. I do end up with defunct processes as observed mega-scientifically via staring at top while it runs, but I'm not sure how to avoid that?

toadee
Aug 16, 2003

North American Turtle Boy Love Association

het posted:

You're creating J processes for each run of that script, and when a process exits, its parent process gets the SIGCHLD signal. Until the parent calls wait() (or waitpid() or whatever), the process will be a zombie/defunct process.

Have you considered using a database for this? If we're talking about CDRs, it's already a tabular format, and I'm assuming you're searching on fields like TN or whatever, which a database could index to optimize searches.

I have considered a DB, however I'm not sure how best to go about it. I suppose I could make an arbitrary cutoff point for how long I want to store CDR records, then just drop tables older than that cutoff date, but continually filling it with each day's CDRs would get unwieldy pretty quickly. It's something I will look into further to be sure, but in the meantime if anyone has any theories/suggestions on how best to search through files as quickly as possible I'd like to hear them. I tried using Coro with Coro::Handle, at least using the same method I've used before for concurrent HTTP requests, but it didn't produce very concurrent-looking results or performance.
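
For what it's worth, the zombie cleanup het described earlier comes down to reaping children with waitpid(); a minimal sketch of the usual patterns:

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

# Option 1: reap finished children as they exit, so none linger as zombies.
$SIG{CHLD} = sub {
    1 while waitpid(-1, WNOHANG) > 0;
};

# ... fork off the searcher processes here, as before ...

# Option 2: if each child's exit status matters, skip the handler and
# reap explicitly once all the work has been handed out:
# waitpid($_, 0) for @child_pids;   # @child_pids collected from fork()
```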

toadee
Aug 16, 2003

North American Turtle Boy Love Association

het posted:

The best way to repeatedly search through data as quickly as possible is to index it somehow so that searches aren't starting with a blank slate every time. If you don't refine your problem definition beyond "search for arbitrary data in arbitrary datasets", performance optimizations become difficult.

Well the problem really is 'search for an arbitrary user-provided string among a list of 96 files in semicolon-delimited format'. I was hoping for some way to, say, run several concurrent processes and get return data from each. I do understand that the absolute best and quickest way to do this is if they were all in a database beforehand, but as of this writing that's not possible, and while I'm trying to make it so, I was hoping there would be a way to do this more quickly than what amounts to a serialized line-by-line regex search for patterns.

toadee
Aug 16, 2003

North American Turtle Boy Love Association

Anyone here have experience using the Perl Net::SNMP::Agent modules to write NetSNMP agent handlers?

I am having issues returning a table via an snmp walk. Essentially I want to reply to an SNMP query with some data I have in an MySQL db (about 250 rows with 15 values in each row).

So, the structure, from what I can gather from other tables of SNMP data I've compared to, should be like rootOID.ROW.VALUE, returned as all rows for value 0 first, then value 1 second, etc.

What happens is when I get the query in, the script returns rootOID.0.0 through rootOID.250.0, then sets the next OID to rootOID.0.1, yet the SNMPwalk never seems to continue.

Relevant code:

code:
sub snmp_handler {
    my ( $handler, $registration_info, $request_info, $requests ) = @_;
    my $request;
    my $row     = 0;
    my $val     = 0;
    my $rootOID = '1.3.6.1.4.1.8072.9999.9999.1111';
    my $nextoid;
 
    for ( $request = $requests ; $request ; $request = $request->next() ) {
        my $oid = $request->getOID();
        writeLog("Working OID $oid\n");
        my ( undef, undef, $rownew, $valnew ) =
          split( /\./, $oid );    #undefs are base OID and 1111 identifier
        $row = $rownew if $rownew =~ /\d/;
        $val = $valnew if $valnew =~ /\d/;
        if ( $request_info->getMode() == MODE_GET ) {
            writeLog("\tWorking GET request\n");
            writeLog("\tGET OID is $oid\n\tRow is $row\n\tVal is $val\n");
 
            my $value = $stats[ $row ]->[ $val ];
            writeLog("\tSetting value $value based on row $row and value $val\n");
            if ( $value =~ /\D/ ) {
                $request->setValue( ASN_OCTET_STR, $value );
 
            }
            else {
                $request->setValue( ASN_INTEGER, $value );
 
            }
        }
        elsif ( $request_info->getMode() == MODE_GETNEXT ) {
            $request->setRepeat(5); # just tried this to see if it had any effect on digging through more of the values, it didn't
            if ( ( $row <= $maxrow ) && ( $val <= $maxval ) ) {
                writeLog("Working GETNEXT request\n");
 
                my $value = $stats[$row]->[$val];
                writeLog("\tGot value $value for row $row, val $val\n");
                $row++;
                if ( $row > $maxrow ) {
                    $val++;
                    $row = 0;
                }
                $nextoid = $rootOID . '.' . $row . '.' . $val;
                writeLog("\tNext oid is $nextoid\n");
                $request->setOID( new NetSNMP::OID("$nextoid") );
                if ( $value =~ /./ ) {
                    if ( $value =~ /\D/ ) {
                        writeLog("\tValue $value is a string!\n");
                        $request->setValue( ASN_OCTET_STR, $value );
                    }
                    elsif ( $value =~ /\d+\.\d+\.\d+\.\d+$/ ) {
                        writeLog("\tValue $value is an IP!\n");
                        $request->setValue( ASN_IPADDRESS, $value );
                    }
                    elsif ( $value =~ /^[0-9]+$/ ) {
                        writeLog("\tValue $value is an Integer!\n");
                        $request->setValue( ASN_INTEGER, $value );
                    }
                    else {
                        writeLog("\tValue $value is something else!\n");
                        $request->setValue( ASN_OCTET_STR, $value );
                    }
                }
            }
        }
        elsif ( $request_info->getMode() == MODE_GETBULK ) {
            writeLog(
"OH BROTHER WE GOTS A BULK REQUEST BATTON DOWN THE HACHES AN SECURE YER BRITCHES\n"
            );
        }
        else {
            my $unknown = $request_info->getMode();
            writeLog("What in tarnation is a dang $unknown?\n");
        }
    }
}
The issue here may be more SNMP or NetSNMP specifically related than perl related, but I don't think there's an SNMP programming megathread.

Another option I have (and may need to go with), is that I can compile C modules into NetSNMP as custom handlers, and there is a whole bunch of stuff in the NetSNMP .h's pertaining to registering and delivering tables. Why those aren't in the Perl API I don't know but it looks like if I can manage to get out of my own way hacking together some C code it would be theoretically easier.

toadee
Aug 16, 2003

North American Turtle Boy Love Association

The snmpwalk takes place externally; snmpwalk is a command-line utility that issues requests to a server providing SNMP responses.

The snmpwalk sends a GETNEXT request to an snmp server, the server replies with the next valid OID in a sequence based on that one, along with the value of the response for the OID the next request was sent for. An example:

I issue 'snmpwalk somesnmpserver 1.2.3'
SNMP server says 'ok, 1.2.3's first valid oid is 1.2.3.0.0, I will send the requestor the value of 1.2.3.0.0, and then let them know the next OID is 1.2.3.1.0', at which point the snmpwalker will send another GETNEXT request for 1.2.3.1.0, get a value for it along with the next valid OID in the sequence, and so on.

What is passed into the handler is from NetSNMP::Agent, which is $handler, $registration_info, $request_info, and $requests. What we're mainly dealing with is $request_info (an object containing things like what type of request this is, ie. a simple GET, a GETNEXT, GETBULK, etc), and $requests, an arrayref of request objects. The request objects contain the OID the requestor is asking for a value for, and then we also then modify these request objects with what we wish to return, the NetSNMP daemon running on the machine handles delivering that back to the requestor. We access that by just saying $request->setValue( TYPE, $value ), and we can set the nextOID in the case of a getnext request with $request->setOID( NetSNMP::OID object );

Anyhow, it's an odd and awkward thing to ask for help on here because, aside from being sloppy, lazy and not as professional as others' code might be, it's doing what I am asking and intending it to do, and I think the problem lies in how I'm attempting to implement an SNMP table. I guess it was mainly on the hope that someone else here has done such an implementation via NetSNMP::Agent, because I have googled every combo of 'Perl Net::SNMP agent table' and related phrases I can think of and found NOTHING.

Also, the S in SNMP is the biggest loving joke of a lie ever.

toadee
Aug 16, 2003

North American Turtle Boy Love Association

That's the reverse of what I'm doing. Net::SNMP is used for making an SNMP request to an SNMP handler on a remote server; NetSNMP::Agent is for extending your server's SNMP implementation with a handler you write in Perl. The code I'm writing there is for answering SNMP requests sent to a daemon, not requesting SNMP info from a remote device.

toadee
Aug 16, 2003

North American Turtle Boy Love Association

HTML::Template with CGI really can do quite a bit and makes things so much neater and easier, thanks to keeping the HTML and perl code separate.

For the current issue though, I would try printing what is returned from $cgi->param with no argument to see what the textarea's stuff is getting passed in as.

Perhaps like:

Perl code:
use Data::Dumper;

....

my @paramslist = $cgi->param;
print Dumper \@paramslist;

toadee
Aug 16, 2003

North American Turtle Boy Love Association

When in doubt, print it out :) A huge portion of my debugging issues are solved when I just print out every variable involved and then go look for why something is getting a value it isn't supposed to.

toadee
Aug 16, 2003

North American Turtle Boy Love Association

JayTee posted:


Any clues as to what's going wrong here would be appreciated.

Perl code:
my $cmd = "hmmpfam $profilHMM $fichier_modifie"; #cmd = commande
	print $cmd."\n"; #This doesn't print anything in the console, I don't know if it should
Is printing this part: hmmpfam /.../profilsHMM-Tases/transp110.hmm /.../modified_input_trad

However, it's not actually running that command string, instead it's running:
code:
open(HMMPFAM, "hmmpfam -E 1E-3 $profilHMM $fichier_modifie |")
Which is resulting in that error. What it's doing is trying to run that command on the computer this script is running on. So, if you drop to a command prompt and type in hmmpfam -E 1E-3 /.../profilsHMM-Tases/transp110.hmm /.../modified_input_trad -- that doesn't run, for whatever reason. When you fix that, this part of the script should execute.
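
One way to make the failure visible instead of silent is to check both the open() itself and the command's exit status. A sketch, reusing $profilHMM and $fichier_modifie from the post:

```perl
# $profilHMM and $fichier_modifie are the same variables as in the post.
# Check the open() and the exit status so the failure mode is visible.
open(my $hmm, '-|', "hmmpfam -E 1E-3 $profilHMM $fichier_modifie")
    or die "couldn't even start hmmpfam: $!\n";   # fork/exec failure
while (my $line = <$hmm>) {
    # ... process hmmpfam output here ...
}
close($hmm)
    or warn $! ? "error closing pipe: $!\n"
               : "hmmpfam exited with status " . ($? >> 8) . "\n";
```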

toadee
Aug 16, 2003

North American Turtle Boy Love Association

I'm surely not as savvy with perl as many others here, but isn't this just a case of the fact that since you're declaring it with my, your $word variable is local to that foreach loop?

I mean saying

Perl code:
(my $word = $_) foreach @words 
Is the same as saying
Perl code:
foreach my $word ( @words ) {}
... and I wouldn't expect to be using $word outside of that loop in the latter.
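
If the point is to still have the last word after the loop, the declaration just has to move out a scope:

```perl
use strict;
use warnings;

my @words = qw(alpha beta gamma);   # sample data

my $word;                 # declared outside, so it survives the loop
$word = $_ for @words;    # after the loop, $word holds the last element
print "$word\n";          # prints "gamma"
```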

toadee fucked around with this message at 14:14 on Oct 6, 2014

toadee
Aug 16, 2003

North American Turtle Boy Love Association

uG posted:

Look into turning autocommit off if transactional integrity doesn't have to be considered. Also prepare() the statement outside the loop

To expand on this, outside the loop:

code:
my $sth = $dbh->prepare('INSERT INTO patients (fname, lname) VALUES (?,?)');
Then inside the loop:

code:
$sth->execute( @bind_vars );
That should run quite a bit faster.
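
And uG's autocommit suggestion, sketched out (plain DBI, assuming it's acceptable to lose the whole batch if something dies mid-loop):

```perl
$dbh->{AutoCommit} = 0;   # batch all the inserts into one transaction

my $sth = $dbh->prepare('INSERT INTO patients (fname, lname) VALUES (?,?)');
for my $patient (@patients) {      # @patients is hypothetical loop data
    $sth->execute( @{$patient} );  # each element an arrayref: [fname, lname]
}
$dbh->commit;             # one commit at the end instead of one per row
```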

toadee
Aug 16, 2003

North American Turtle Boy Love Association

You can bum lines of perl all day and turn it into barely intelligible nonsense. That code is readable and does what it says on the tin.

toadee
Aug 16, 2003

North American Turtle Boy Love Association

Mithaldu posted:

In between any of these cases: [][], []{}, {}{}, {}[], and any repeats of [] or {} after those. This exists because nobody wants to read the arrow every time the code tries to go 6 levels deep into a complex structure, aka:

$hash->{shoop}->{da}->[5]->{woop}->[9]->{etc}

I'm certainly not at the level of many in terms of perl mavenhood or whatever, but I honestly don't mind looking at the arrows in any way. It's also slightly less confusing to never have to wonder whether I need an arrow here or not.
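
For anyone following along, the two spellings are equivalent; only the first arrow (the one after the variable itself) is mandatory:

```perl
# These fetch the same value; arrows between adjacent subscripts are optional.
my $v1 = $hash->{shoop}->{da}->[5]->{woop};
my $v2 = $hash->{shoop}{da}[5]{woop};
```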

toadee
Aug 16, 2003

North American Turtle Boy Love Association

I mean does the arrow confuse you or something?

toadee
Aug 16, 2003

North American Turtle Boy Love Association

Mithaldu posted:

It slows down the reader, as well as the writer, without providing anything positive.

It provides consistency, and if there were a decent IDE it would just be written for you. Also, given the way humans read, I don't believe it slows anything down either.

Really I just have a problem with people getting all butthurt over trivial bullshit. If you don't want to use them, great. It's not like it's actually a negative factor in any way other than your own mind and preference. This kind of poo poo is what actually makes people hate asking for programming advice on the internet.


toadee
Aug 16, 2003

North American Turtle Boy Love Association

Well helpful hint, if you want to offer a constructive tip, just be like 'hey, cool thing to note: you can actually skip the arrows after the first one, if you like the readability/ease of typing better'. Instead of a mini diatribe over idiosyncratic perl and proper convention.
