Register a SA Forums Account here!
JOINING THE SA FORUMS WILL REMOVE THIS BIG AD, THE ANNOYING UNDERLINED ADS, AND STUPID INTERSTITIAL ADS!!!

You can: log in, read the tech support FAQ, or request your lost password. This dumb message (and those ads) will appear on every screen until you register! Get rid of this crap by registering your own SA Forums Account and joining roughly 150,000 Goons, for the one-time price of $9.95! We charge money because it costs us money per month for bills, and since we don't believe in showing ads to our users, we try to make the money back through forum registrations.
 
  • Locked thread
Mithaldu
Sep 25, 2007

Let's cuddle. :3:

TasteMyHouse posted:

you could write something that parses the script, grabbing all the strings on the right hand side of assignments to the variables you care about. It actually probably wouldn't be that hard.

If you suggest something like that, make sure to mention the CPAN module PPI.

Adbot
ADBOT LOVES YOU

Bazanga
Oct 10, 2006
chinchilla farmer
I have a list of around 5000 virtual and physical server IPs that I have to run through to determine if they automatically redirect http requests (port 80) to https requests (port 443).

I've tried looking for scripts that do this, but haven't been able to find any. Most of them are how to set it up or disable it, not determine if it is happening. I'm not a coder/network guy by trade, so a lot of this stuff is new to me. Any suggestions on tools to use or how to go about writing up a perl script to do this? I know how to read in the list and kick off whatever command needs to be done, I just don't know what tools/modules could be used to figure this out.

uG
Apr 23, 2003

by Ralp
edit: nm

uG fucked around with this message at 18:03 on Aug 8, 2011

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



Bazanga posted:

I have a list of around 5000 virtual and physical server IPs that I have to run through to determine if they automatically redirect http requests (port 80) to https requests (port 443).

I've tried looking for scripts that do this, but haven't been able to find any. Most of them are how to set it up or disable it, not determine if it is happening. I'm not a coder/network guy by trade, so a lot of this stuff is new to me. Any suggestions on tools to use or how to go about writing up a perl script to do this? I know how to read in the list and kick off whatever command needs to be done, I just don't know what tools/modules could be used to figure this out.

Something like:

code:
#!/usr/bin/env perl

while($host = <DATA>) {
	chomp($host);
	$r = qx`curl -s -I http://$host/`;
	if ($r =~ m/location: (https:.*)/) {
		print "$host OK: redirected\n";
	} else {
		print "$host ERROR: did not redirect\n";
	}
}

__DATA__
example.com
www.google.com
127.0.0.1

Roseo
Jun 1, 2000
Forum Veteran

Carthag posted:

Something like:

Why use curl?

code:
my $ua = LWP::UserAgent->new;
for my $url (@urls) {
  my $response = $ua->get($url);
  if ($response->is_redirect) {
    ...
  }
}
You can also use ->redirects to get the list of redirects followed.

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



Roseo posted:

Why use curl?

I was too lazy to look up the LWP docs :)

Bazanga
Oct 10, 2006
chinchilla farmer
code:
#!/usr/bin/env perl

while($host = <DATA>) {
	chomp($host);
	$r = qx`curl -s -I [url]http://[/url]$host/`;
	if ($r =~ m/location: (https:.*)/) {
		print "$host OK: redirected\n";
	} else {
		print "$host ERROR: did not redirect\n";
	}
}

This has worked really well so far. Thanks a ton.

code:
my $ua = LWP::UserAgent->new;
for my $url (@urls) {
  my $response = $ua->get($url);
  if ($response->is_redirect) {
    ...
  }
}
I don't know if I am doing something wrong, but the is_redirect attribute in LWP isn't getting set to true whenever a website redirects. For example, if I use the website http://www.capitalone.com/ as a url, LWP returns a success code and doesn't mention anything about a redirect, while I know for certain that the url redirects.

homercles
Feb 14, 2010

Bazanga posted:

I don't know if I am doing something wrong, but the is_redirect attribute in LWP isn't getting set to true whenever a website redirects. For example, if I use the website http://www.capitalone.com/ as a url, LWP returns a success code and doesn't mention anything about a redirect, while I know for certain that the url redirects.

What is actually happening is that LWP::UserAgent is following the redirects in its code path, successfully loads the redirected URL https://www.capitalone.com/ and states "Yep, that https URL loaded correctly". is_redirect is there to inform you that you are not stuck in an infinite redirect loop.

To have it do what you want (no follow redirects, and inform you of whether a URL requests a redirect), instantiate LWP::UserAgent like so:
code:
my $ua = LWP::UserAgent->new(max_redirect => 0);
More info of course is in the LWP::UserAgent POD.

uG
Apr 23, 2003

by Ralp
Doesn't $response->request()->uri() return the final uri, which you could compare to $host? Or even just look at $response->redirects?

Roseo
Jun 1, 2000
Forum Veteran

homercles posted:

What is actually happening is that LWP::UserAgent is following the redirects in its code path, successfully loads the redirected URL https://www.capitalone.com/ and states "Yep, that https URL loaded correctly". is_redirect is there to inform you that you are not stuck in an infinite redirect loop.

To have it do what you want (no follow redirects, and inform you of whether a URL requests a redirect), instantiate LWP::UserAgent like so:
code:
my $ua = LWP::UserAgent->new(max_redirect => 0);
More info of course is in the LWP::UserAgent POD.

Ugh, duh. Sorry about that, Bazanga.

Bazanga
Oct 10, 2006
chinchilla farmer
Thanks! That was incredibly useful. Once I got over the is_redirect hurdle it all started clicking.

You guys are livesavers. :glomp:

welcome to hell
Jun 9, 2006
This could work just as well in the coding horrors thread, but I'll post it here instead since I know it's something Mithaldu has dealt with.

Passing arguments to other processes on Windows is unreasonably hard, and even worse in Perl.

First is that argument lists are always passed as a single string in Windows, as opposed to arrays on other systems. This is less of a problem than it appears, because 95% of programs use the same rules for parsing that string into an array. Roughly speaking, the rules are that arguments can be quoted with double quotes, and backslashes can escape any character.

The second issue is that cmd.exe uses different quoting rules than the normal parsing routine. It uses caret as the escape character instead of backslash.

The result of this is that you can't create a string that will be treated the same for both of these cases. And since Perl tries to be 'smart' about this, you have to quote strings differently depending on if they have shell meta-characters or not. And there isn't any good way to check that without reimplementing the code to detect them that exists inside perl. So here is a routine that will quote arguments correctly to use with system:
code:
sub quote_list {
    my (@args) = @_;

    my $args = join ' ', map { quote_literal($_) } @args;

    if (_has_shell_metachars($args)) {
        # cmd.exe treats quotes differently from normal argument parsing.
        # just escape everything using ^.
        $args =~ s/([()%!^"<>&|])/^$1/g;
    }
    return $args;
}

sub quote_literal {
    my ($text) = @_;

    # basic argument quoting.  uses backslashes and quotes to escape
    # everything.
    if ($text ne '' && $text !~ /[ \t\n\v"]/) {
        # no quoting needed
    }
    else {
        my @text = split '', $text;
        $text = q{"};
        for (my $i = 0; ; $i++) {
            my $bs_count = 0;
            while ( $i < @text && $text[$i] eq "\\" ) {
                $i++;
                $bs_count++;
            }
            if ($i > $#text) {
                $text .= "\\" x ($bs_count * 2);
                last;
            }
            elsif ($text[$i] eq q{"}) {
                $text .= "\\" x ($bs_count * 2 + 1);
            }
            else {
                $text .= "\\" x $bs_count;
            }
            $text .= $text[$i];
        }
        $text .= q{"};
    }

    return $text;
}

# direct port of code from win32.c
sub _has_shell_metachars {
    my $string = shift;
    my $inquote = 0;
    my $quote = '';

    my @string = split '', $string;
    for my $char (@string) {
        if ($char eq q{%}) {
            return 1;
        }
        elsif ($char eq q{'} || $char eq q{"}) {
            if ($inquote) {
                if ($char eq $quote) {
                    $inquote = 0;
                    $quote = '';
                }
            }
            else {
                $quote = $char;
                $inquote++;
            }
        }
        elsif ($char eq q{<} || $char eq q{>} || $char eq q{|}) {
            if ( ! $inquote) {
                return 1;
            }
        }
    }
    return;
}
Most of this is taken from the article Everyone quotes command line arguments the wrong way.

I tried adapting ExtUtils::MakeMaker to use a routine like this, but there were a number of issues. First is that the tests and possibly other parts of ExtUtils::MakeMaker assume it can create a string that will work both in a makefile (which always uses cmd rules) and through system() directly (which doesn't). There's also the nmake/dmake/make quoting issues. So I gave up on fixing that. There's probably some things that Perl should do differently regarding this, but then you run into backward compatibility issues...

qntm
Jun 17, 2009
I don't know anything about Makefiles but I ran into that exact same issue and I found it to be incredibly difficult to get to the bottom of, both for the Windows command line and in Bash. At least in Bash the solution is fairly simple: put a backslash before anything that looks funny, and don't use any quotes at all. For Windows the procedure is maddening as you just discovered. Based on the same article you pointed out, I came up with this which looks much the same.

One thing that that article doesn't make entirely clear is that escaping the name of the program to run is a different matter from escaping its arguments.

uG
Apr 23, 2003

by Ralp
Is there any way to work with excel files with more than 65000 rows with Perl? Carrying over the rows into a new sheet won't work. Right now i'm saving my files as csv and then manually making each csv file a sheet in a workbook, but my file is getting so big that this laptop can't handle it. So I don't really need all the formatting options or formula stuff, just more like a tabbed csv file.

edit: Excel::Writer::XLSX should work

edit2: Getting out of memory errors :/

uG fucked around with this message at 07:21 on Aug 24, 2011

Mithaldu
Sep 25, 2007

Let's cuddle. :3:

Haarg posted:

Passing arguments to other processes on Windows is unreasonably hard, and even worse in Perl.
Let me add another fuckery here:

Arguments on Windows are not parsed by the shell. They are actually parsed by whatever function is compiled into the C library that receives the arguments. Meaning that HOW they are parsed can change depending on the compiler version used for the program that is being called, which broke one of the recent Windows Perl builds.

welcome to hell
Jun 9, 2006
Yeah, that was the first part of what I was saying, and kind of punted on the 95% doing it the same thing. I'm not surprised to hear it's worse than I was thinking.

Do you have a reference about the changes between different compilers/runtimes? I'd be interested to see what was changed.

Mithaldu
Sep 25, 2007

Let's cuddle. :3:

Haarg posted:

Do you have a reference about the changes between different compilers/runtimes? I'd be interested to see what was changed.
Sorry, no references. That's just what i saw on the p5p mailing list in the emails about "perl 5.15.2 tomorrow".

welcome to hell
Jun 9, 2006
Reading up on that, it looks like a change in how repeated double quotes are parsed, which was never a documented behavior. The code I posted uses backslashes to escape strings instead, so it should work in all cases (with a MS CRT).

The existing code in ExtUtils::MakeMaker tries to create strings that will work in both cmd.exe and when running the process directly. It gets pretty close, but can never be perfect. The only real solution is to have a different behavior when using cmd.

Mithaldu
Sep 25, 2007

Let's cuddle. :3:

Haarg posted:

Reading up on that, it looks like a change in how repeated double quotes are parsed, which was never a documented behavior. The code I posted uses backslashes to escape strings instead, so it should work in all cases (with a MS CRT).

The existing code in ExtUtils::MakeMaker tries to create strings that will work in both cmd.exe and when running the process directly. It gets pretty close, but can never be perfect. The only real solution is to have a different behavior when using cmd.

If you understand it better than i did at that time, please do fork it and provide patches. I'm way too burnt out to do it myself. :(

I will however happily merge any pull request with some decent tests. :)

welcome to hell
Jun 9, 2006
I'd love to fix it but it involves figuring out ExtUtils::MakeMaker internals. If I can get anything working I'll bug you on IRC.

Rohaq
Aug 11, 2006

uG posted:

Is there any way to work with excel files with more than 65000 rows with Perl? Carrying over the rows into a new sheet won't work. Right now i'm saving my files as csv and then manually making each csv file a sheet in a workbook, but my file is getting so big that this laptop can't handle it. So I don't really need all the formatting options or formula stuff, just more like a tabbed csv file.

edit: Excel::Writer::XLSX should work

edit2: Getting out of memory errors :/
I've used Excel::Writer before, and the main problem with it is that it stores everything in memory before flushing it to the file when you specify; if you have a lot of data you're going to write, it's likely you're going to run into memory issues.

uG
Apr 23, 2003

by Ralp
So is there no decent way to work with large excel files with Perl? Preferably not Win32::OLE? :(

Erasmus Darwin
Mar 6, 2001

uG posted:

So is there no decent way to work with large excel files with Perl? Preferably not Win32::OLE? :(

Have you tried Spreadsheet::WriteExcelXML? I haven't used it personally, but as long as it doesn't have the memory problems of Excel::Writer, it should do what you need.

Carthag Tuek
Oct 15, 2005

Tider skal komme,
tider skal henrulle,
slægt skal følge slægters gang



You could "cheat" and serialize to csv, then use a third party tool to convert to xlsx. There seem to be a number of those on google.

uG
Apr 23, 2003

by Ralp

Erasmus Darwin posted:

Have you tried Spreadsheet::WriteExcelXML? I haven't used it personally, but as long as it doesn't have the memory problems of Excel::Writer, it should do what you need.
I just tried it. Still running out of memory.


Carthag posted:

You could "cheat" and serialize to csv, then use a third party tool to convert to xlsx. There seem to be a number of those on google.
Now I need to add formatting (making certain cell backgrounds a different colors), so now I pretty much need some sort of scripting ability, and I need to be able to run it from a *nix command line (so no macros).

I'm wondering if I can format it as html tables, if excel will import it and keep the coloring format...

Mithaldu
Sep 25, 2007

Let's cuddle. :3:

Haarg posted:

I'd love to fix it but it involves figuring out ExtUtils::MakeMaker internals. If I can get anything working I'll bug you on IRC.

If you need any info, let me know. I have no idea who you are on irc, but judging from what you said i think you know who and where i am. :)

a foolish pianist
May 6, 2007

(bi)cyclic mutation

I'm trying to remove all the brace characters, '[' and ']', from a bunch of utf8 unicode text files. I just tried a simple script:

code:
perl -pi -e 's/\[//g' *.txt
and it mangles some of the other characters. Is there a quick way to tell perl that the text is unicode? Some other workaround?

Filburt Shellbach
Nov 6, 2007

Apni tackat say tujay aaj mitta juu gaa!
My go-to is utf8::all. Install that from CPAN and add -Mutf8::all to that perl invocation.

welcome to hell
Jun 9, 2006
While you should probably encode/decode the text when processing it, one of the features of UTF-8 is that ASCII characters are never part of other sequences. It shouldn't be possible to mangle other characters if you are only modifying [ and ].

Are you sure the text is actually in UTF-8? And what kind of mangling are you seeing?

uG
Apr 23, 2003

by Ralp
This might be more appropriate in an apache thread, but i'll ask here first. I'm trying to get Catalyst working with apache2 and fcgid. Everything works, but it looks like its loading everything up every visit, instead of keeping connected to the same fcgi socket. This is what my vhost looks like:

code:

<VirtualHost *:80>
    ServerName [url]www.ugexe.com[/url]
    ServerAlias ugexe.com


    # This should point at your myapp/root
    DocumentRoot /home/web/ugexe.com/root
    Alias /static /home/web/ugexe.com/root/static

    <Location /static>
        SetHandler default-handler
    </Location>

    ScriptAlias / /home/web/ugexe.com/script/games_fastcgi.pl/

    <Location />
        Options +ExecCGI
        Order allow,deny
        Allow from all
        AddHandler fcgid-script .pl
    </Location>
</VirtualHost>
When I used Catalyst, fcgi, and nginx I would have to manually start the fcgi processes, but they stayed in memory. Any idea on what i'm doing wrong?

I'm thinking I should be doing something with plack now that its packaged with Catalyst, but I don't know what.

uG fucked around with this message at 04:34 on Sep 1, 2011

Mithaldu
Sep 25, 2007

Let's cuddle. :3:

uG posted:

I'm thinking I should be doing something with plack now that its packaged with Catalyst, but I don't know what.
Start up your catalyst app with Starman or something like that as a standalone server and reverse proxy your apache to the appropiate port.

syphon
Jan 1, 2001
Can someone advise me on the best way to do multi-line regex's? I'm trying to scrape a list of Perforce jobs, which output in this format...
code:
job000001 on 2011/08/31 by user1 *closed*

        Blah blah blah generic job description for job 1

job000002 on 2011/05/24 by user2 *closed*

        Blah blah blah generic job description for job 2

job000003 on 2011/08/27 by user2 *open*

        Blah blah blah generic job description for job 3

job000004 by user3 *closed*

        Blah blah blah generic job description for job 4
As you can see, each job has its name, date, user, and status on one line, then two newlines and a tab before its job description.

My usual method for parsing information out of data like this would be to do this...
code:
my @raw_data = `p4 jobs -l`;
for (@raw_data) {
  my ($job, $date, $user, $status) = ($1, $2, $3, $4) if (/(\S+) on (\S+) by (\S+) \*(\S+)\*/);
}
Obviously this doesn't work because the for loop analyzes stuff line by line, and the pertinent data exists across multiple lines. I thought about making @raw_data a scalar again, and then adding the 'g' flag to the regex, but it only seems to capture the first job.

Any recommendations on the best way to tackle this?


EDIT: This code works, but only grabs the first iteration of data.
code:
my $jobs_dump = `p4 jobs -l`;
for ($jobs_dump) {
  my ($job, $date, $user, $status, $description) = ($1, $2, $3, $4, $5) if (/(\S+) on (\S+) by (\S+) \*(\S+)\*\n\n\s+(.*)\n/g);
}
(All of the variables are getting set correctly, but like I said, only for the first iteration of data).

syphon fucked around with this message at 19:45 on Sep 12, 2011

MacGowans Teeth
Aug 13, 2003

syphon posted:

Can someone advise me on the best way to do multi-line regex's?
The .. operator is good for multiple lines, but I'm not sure how it works with captures, because I've only used it for stripping out unwanted data. But in case it helps, this is something I had to do recently:

This is part of some code for cleaning up a COBOL copybook. We didn't want comments or 88-levels, which are basically lines that start with whitespace, then "88" and could continue for multiple lines until the line-end character, which is a period. So if you test on this:

my $eightyeightlevel = /^\s+88/ .. /\./;

it returns true when it hits the first line and matches the left side, then it continues to return true until it's matched the right side. But like I said, I really don't know how capturing parentheses would work with this. It might be worth experimenting with, though.

Apsyrtes
May 17, 2004

I tested this with a cut-and-paste of the data you presented,

From here:

code:
my $jobs_dump = `p4 jobs -l`;
This pieces all your lines back together:

code:
$jobs_dump =~ s/\n\n\s+/ /g;
And then this splits it out into an array, which you can iterate over:

code:
my @raw_data = split("\n\n",$jobs_dump);

Apsyrtes fucked around with this message at 20:15 on Sep 12, 2011

syphon
Jan 1, 2001

Apsyrtes posted:

I tested this with a cut-and-paste of the data you presented,
That works beautifully. Thanks!

uG
Apr 23, 2003

by Ralp
code:
# Use this for calculating running stats
my $all_current_plays = $game->plays->search({ GameIdentifier => $play->GameIdentifier, id => { '<=', $play->id } })->search(undef, { cache => 1 } );
my $all_current_home_plays = $all_current_plays->search({ b_home_possession => 1 })->search(undef, { cache => 1 }); 
my $all_current_away_plays = $all_current_plays->search({ b_home_possession => 0 })->search(undef, { cache => 1 }); 
So I have football play by play data, and I to calculate running stats (like QB efficiency), I use the above dbix::class resultsets to iterate over all the previous plays.
code:
		if(my $b = $all_current_home_plays->search({ b_pass => 1 })) {
			my $count = $b->count;
			if($count >0) {
				#CALCULATIONS NEEDING A PASS PLAY TO HAVE OCCURRED
				my $completes = $count - $b->search({ b_incomplete => 1 })->count;
				my $sum = $b->get_column('yards')->sum;
				my $ints = $b->search({ b_intercepted => 1 })->count;
				if(my $c = $b->search({ down => 1 })) {
					my $acount = $c->count;
					if($acount > 0) {
						$stats{HTO_1st_YPP} = $b->search({ down => 1 })->get_column('yards')->sum / $acount;

						my $totals;
						while(my $d = $c->next) {
							$totals += $d->_get_pass_score;
						}
	
						$stats{HTO_1st_Pass_Score_Avg} = $totals / $acount;
					}
				}
Thats the beginning of a block of code demonstrating what i'm trying to do. This lets me do things like calculate the current number of interceptions, completes, etc, but it seemed to be fairly taxing. Then I added the while loop, which gets a custom score for each play and stores them in a variable so I can average it. That absolutely crushed the performance.

What should I change to increase performance without having a bunch of running counts in globals? I don't really understand what I should be caching, and pre fetching, etc, but I have used dbix::class for awhile :(

SA Support Robot
Apr 20, 2007

syphon posted:

Can someone advise me on the best way to do multi-line regex's?

Here's a pretty efficient way:

code:
open(my $h, q/-|/, "p4 jobs -l");
while(not eof $h) {
    local $/ = "\n\n";
    my ($job, $date, $user, $status) = readline($h) =~ /^job0*(\d+) (?:on ([\d\/]+) )?by (\w+) \*(\w+)\*/;
    (my $desc = readline($h)) =~ s/^\t+(.*)\n\n/$1/;
}

SA Support Robot
Apr 20, 2007

uG posted:

I have used dbix::class for awhile :(

There's your problem! I don't understand this modern trend to hide an expressive language under a clunky abstraction and call it best practice.

syphon
Jan 1, 2001
Anyone have lots of experience with IO::Socket? I'm having a weird issue where the timeout parameter isn't being honored on some systems, and I can't figure out why. If you google it, you'll find people saying that some timeout stuff is commented out in INET.PM! This seems like a likely culprit, but doesn't explain why it does work on some boxes but doesn't on others.
code:
#!perl

use strict;
use IO::Socket;

my $sock = new IO::Socket::INET(PeerAddr => 'localhost',PeerPort => 8008,Proto => 'tcp', Timeout => 10);
unless ($sock) {
    print "Port not open!\n";
}
On my workstation this works. The script runs for 10 seconds and then displays "Port not open". On a box in our test lab, this doesn't work. The script says "Port not open" almost immediately.

My workstation is Win7, while the lab box is Win2k3, but I'm invoking a shared install of Perl 5.8.8 (IO::Socket 1.29), so the versions are all identical between the two.

I see lots of talk about using alarms to get around this, but I've never used alarms before so I'm not sure how effective that'd be. Any advice?

EDIT: I should clarify that port 8008 is just a dummy port for testing that I know isn't open for either system.

syphon fucked around with this message at 00:14 on Oct 7, 2011

Adbot
ADBOT LOVES YOU

Ninja Rope
Oct 22, 2005

Wee.
The time it takes new to return is going to depend on a lot of things, including how the remote end responds (or doesn't respond) to closed or blocked ports and the number of retries configured. TCP is guaranteed to return eventually from connect(), but the amount of time depends on a lot of things.

  • Locked thread