  • Locked thread
Filburt Shellbach
Nov 6, 2007

Apni tackat say tujay aaj mitta juu gaa!
Yeah, I get that, but:

Mithaldu posted:

that means don't loving disconnect, get a bouncer/shell if you don't have a useful machine

is hostile and this kind of treatment is probably a large part of why Perl doesn't attract newbies.


PleasantDilemma
Dec 5, 2006

The Last Hope for Peace

Otto Skorzeny posted:

It's really frustrating to both asker and answerer if a lengthy and thoughtful reply to your question gets lost because you were disconnected when someone who had the ability and time to answer got around to it. I don't recall if the bot that hangs around in the perl channels has a !tell command to get around this problem.

I don't want to encourage a derail but isn't IRC for immediate answers while both people are online at the same time? Asking a question and getting an answer sometime later is for forums & message lists.

Mithaldu
Sep 25, 2007

Let's cuddle. :3:

Filburt Shellbach posted:

Yeah, I get that, but:

is hostile
To be honest, you're right that i was being a dick in that. I stand by the sentiment still, but the tone was entirely uncalled for.

Filburt Shellbach posted:

this kind of treatment is probably a large part of why Perl doesn't attract newbies.
That said, i don't think the part about newbies is entirely right. From what i've seen in IRC, and from my own interactions in #win32, #perl-help and #perl, newbies get treated with friendliness and respect as a rule. There are exceptions from time to time, but those are cases where the persons in question have managed to piss everybody off by spending a lot of time ignoring the advice they asked for.



PleasantDilemma posted:

I don't want to encourage a derail but isn't IRC for immediate answers while both people are online at the same time? Asking a question and getting an answer sometime later is for forums & message lists.
Not quite. IRC exists for both. Just like email, it makes asynchronous conversations extremely easy and cheap, but unlike email it ALSO easily facilitates realtime conversation. With IRC you can start a conversation, move on to realtime as people begin to answer while you're there, move back to async when you or they have to leave, and pick it up again later in realtime; all within the same medium and while maximizing the number of random passersby who can see your entire conversation and offer advice.

Forums and message lists are good for some stuff, but discussing topics that require a lot of back-and-forth does not fall into that scope. Similarly difficult is also the attempt to set up realtime conversation via those, because again, back-and-forth is needed for the synchronization process.

------

Cheers. :)

vvvv

Mithaldu fucked around with this message at 20:25 on Jul 7, 2014

jony neuemonic
Nov 13, 2009

I wasn't offended, for what it's worth.

revmoo
May 25, 2006

#basta
I have a relatively obscure lib I need from CPAN (Thrift/XS) and all the mirrors for it are dead. I know nothing about CPAN other than how to install packages. Any tips on finding this lib? I can pull it from another box if need be, but I'm not sure that's the best way to go about it.

Mithaldu
Sep 25, 2007

Let's cuddle. :3:

revmoo posted:

I have a relatively obscure lib I need from CPAN (Thrift/XS) and all the mirrors for it are dead. I know nothing about CPAN other than how to install packages. Any tips on finding this lib? I can pull it from another box if need be, but I'm not sure that's the best way to go about it.

The download links seem to work fine?

https://metacpan.org/pod/Thrift::XS
http://search.cpan.org/~agrundma/Thrift-XS-1.04/

revmoo
May 25, 2006

#basta

Interesting. Well, never mind then! Not sure why the mirrors were failing.

uG
Apr 23, 2003

by Ralp
cpan was down for like 10 minutes yesterday and metacpan was returning the same error. Kind of a silly thing to have happen

revmoo
May 25, 2006

#basta

uG posted:

cpan was down for like 10 minutes yesterday and metacpan was returning the same error. Kind of a silly thing to have happen

I was installing other packages at the time so I'm not sure if that's it or not. I'll find out after the weekend I guess because :effort:

revmoo
May 25, 2006

#basta
So the mirrors for Thrift/XS are still broke but I was able to successfully compile it from that site, so that's cool.

revmoo fucked around with this message at 14:07 on Jul 14, 2014

Mario Incandenza
Aug 24, 2000

Tell me, small fry, have you ever heard of the golden Triumph Forks?
BTW, use cpanminus if you're not already, it makes things much simpler (and easier). It supports lots of super-cool features like installing from an arbitrary local path/URL/git repo.

uG
Apr 23, 2003

by Ralp
code:
use Modern::Perl;
use List::Util qw/min max/;
 
my $range_groups = [
	'1,2,3,5..10,11..,20..15,21..23..25,32..30..29..27', # just 1 string for testing
];
 
my @new_groups = map { 
	join('..', _min_max( split(/\.\./, $_) )); 
} split(/,/, ($_)) foreach @{$range_groups};
 
say "Final array values: " . join(',', @new_groups);
 

#end
1;
 
 
sub _min_max {
	my ($min, $max) = ((min @_),(max @_));
	return $min != $max?($min,$max):$min;
}
 
 
__END__
Why is @new_groups empty? Inside the map expr the join statement creates the correct strings but none are getting assigned to the array. I'm guessing it has something to do with the ordering the loops are iterated over but i'm not sure why.

The purpose of this is to create perl ranges, even if the input is invalid (32..30..29..27 to 27..32 for instance)

qntm
Jun 17, 2009
Firstly, here's the fix:

Perl code:
my @new_groups = map { 
	join('..', _min_max( split(/\.\./, $_) )); 
} map { split(/,/, ($_)) } @{$range_groups};
As to why this code is behaving like this, I have no clue. I do know that the foreach statement causes everything to the left of it to be executed once per element in the input array, as if there were parentheses wrapped around it, but that still doesn't explain the behaviour you're seeing.
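For anyone who wants to run the fix end to end, here is a minimal standalone sketch of the same map-over-map idea. The helper names normalize_range and normalize_groups are made up for this example, and it swaps Modern::Perl for plain strict/warnings plus the core List::Util so it needs nothing from CPAN.

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Collapse one "a..b..c" piece into "min..max", or a single number
# when only one distinct value is present.
sub normalize_range {
    my ($range) = @_;
    my @nums = grep { length $_ } split /\.\./, $range;
    my ( $min, $max ) = ( min(@nums), max(@nums) );
    return $min == $max ? $min : "$min..$max";
}

# Split every group string on commas, then normalize each piece.
# Both steps are map calls, so nothing hides behind a statement modifier.
sub normalize_groups {
    my ($groups) = @_;
    return map { normalize_range($_) } map { split /,/, $_ } @$groups;
}

my @new_groups = normalize_groups(
    ['1,2,3,5..10,11..,20..15,21..23..25,32..30..29..27'] );

print "Final array values: ", join( ',', @new_groups ), "\n";
# prints: Final array values: 1,2,3,5..10,11,15..20,21..25,27..32
```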

I managed to boil the puzzling behaviour down to:

Perl code:
use strict;
use warnings;
 
my @words = ("alpha", "bravo", "charlie");

(my $word = $_) foreach @words;

print $word;
# Expected: "charlie"
# Actual: undef

qntm fucked around with this message at 12:14 on Oct 6, 2014

qntm
Jun 17, 2009
Okay, so the answer is that Perl is complete garbage:

quote:

NOTE: The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ... ) is undefined. The value of the my variable may be undef, any previously assigned value, or possibly anything else. Don't rely on it. Future versions of perl might do something different from the version of perl you try it out on. Here be dragons.

Yes, here, in a simple foreach loop, there be dragons.
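A minimal sketch of the usual way around this, assuming the goal is just to keep the last value after the loop: declare the variable in its own statement, so the my is no longer governed by the statement modifier.

```perl
use strict;
use warnings;

my @words = ( "alpha", "bravo", "charlie" );

# Declaration stands alone; only the assignment carries the modifier.
my $word;
$word = $_ foreach @words;

print "$word\n";
# prints: charlie
```

The same split (declare first, assign conditionally afterwards) should also apply to the my $x if ... form mentioned in the quoted note.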

toadee
Aug 16, 2003

North American Turtle Boy Love Association

I'm surely not as savvy on perl as many others here, but isn't this just a case of the fact that since you're declaring it with my, your $word variable is local to that foreach loop?

I mean saying

Perl code:
(my $word = $_) foreach @words 
Is the same as saying
Perl code:
foreach my $word ( @words ) {}
... and I wouldn't expect to be using $word outside of that loop in the latter.

toadee fucked around with this message at 14:14 on Oct 6, 2014

qntm
Jun 17, 2009
If that were the case, the script would not even compile, because the $word at the final line doesn't exist, and I have strict and warnings turned on.

Filburt Shellbach
Nov 6, 2007

Apni tackat say tujay aaj mitta juu gaa!
Yeah, the problem is that Perl, for some reason, intentionally does not flag that as an error.

Mithaldu
Sep 25, 2007

Let's cuddle. :3:

qntm posted:

If that were the case, the script would not even compile, because the $word at the final line doesn't exist, and I have strict and warnings turned on.
I assume that in this case the same mechanism becomes active that allows this:
code:
use strict;
use warnings;
my $c = 1 if 0.5 < rand;
print $c;
I've poked #p5p to see if there can be something done about this.

qntm
Jun 17, 2009
So, I ran into this module. Here is the full extent of the example code for this module:

Perl code:
builder {
    enable 'Plack::Middleware::Acme::PHPE9568F34::D428::11d2::A769::00AA001ACF42';
    $app;
}
This module is a joke, and I understand the joke, but I don't think the joke extends to deliberately obtuse code. What the hell does this code do? What does the builder {} structure do, what is enable, where does $app come from, and why does the line $app; on its own do anything? Classically for Perl, it looks like there is a whole bunch of missing, assumed context, without which the thing is indecipherable.

Mithaldu
Sep 25, 2007

Let's cuddle. :3:
I'm a little amused that you'd have trouble reading that, given that you wrote a guide on Perl. :)

The whole thing expects you to know how Plack middlewares and the Plack::Builder DSL work.

Here's the same code in a slightly more verbose manner:

Perl code:
# this goes into a file you can call something like app.psgi
# which you start with, e.g. `plackup app.psgi`
# which does pretty much: my $app = require 'app.psgi'; SomeServer->run_with( $app );

# it expects this customary plack middleware preamble blurb:
use Plack::Builder qw( enable builder );

# original app
my $app = sub {
    my $env = shift;
    return [ 200, ['Content-Type' => 'text/html'], [ 'Hello World' ] ]
};

# app with middlewares wrapped around it
my $final_app = builder(
    sub {
        # stores middleware name in a local-scoped global variable in Plack::Builder
        enable ( 'Plack::Middleware::Acme::PHPE9568F34::D428::11d2::A769::00AA001ACF42' );
        
        # returns the app to which the contents of the local-scoped global variable are applied
        return $app;
    }
);

$final_app; # this falls out at the end of the require
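To make the "local-scoped global variable" part less magical, here is a toy imitation of the DSL that runs without Plack installed. This is not Plack::Builder's actual code: toy_builder, toy_enable and @PENDING are hypothetical names, and a "middleware" here is simply a sub that takes an app and returns a wrapped app.

```perl
use strict;
use warnings;

our @PENDING;    # middlewares collected while a builder block runs

# Remember a middleware for the builder call currently in progress.
sub toy_enable { push @PENDING, $_[0]; return }

# The (&) prototype is what allows the brace syntax: toy_builder { ... }
sub toy_builder(&) {
    my ($block) = @_;
    local @PENDING;          # fresh, empty list for this call only
    my $app = $block->();    # block calls toy_enable, then returns the app
    # Apply in reverse so the first enabled middleware ends up outermost.
    $app = $_->($app) for reverse @PENDING;
    return $app;
}

# Example middleware: appends a header to whatever the inner app returns.
my $add_header = sub {
    my ($inner) = @_;
    return sub {
        my ($env) = @_;
        my $res = $inner->($env);
        push @{ $res->[1] }, 'X-Toy' => 'wrapped';
        return $res;
    };
};

my $app = sub { [ 200, [ 'Content-Type' => 'text/html' ], ['Hello World'] ] };

my $final_app = toy_builder {
    toy_enable $add_header;
    $app;    # last expression of the block is its return value
};

my $res = $final_app->( {} );
print join( ' ', $res->[0], @{ $res->[1] } ), "\n";
# prints: 200 Content-Type text/html X-Toy wrapped
```

The bare $app; line in the original example plays the same role as here: it is the block's return value, which the builder then wraps.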

TheEffect
Aug 12, 2013
Hello all. I'm pretty sure this is a fairly simple question but I know literally nothing about Perl and this is yet another project I'm taking over from an employee that no longer works here.

I have a fairly simple (from what I can tell) Perl script that essentially copies data from a file and parses it into another file. This script runs once a day. It has the line

code:
use Time::Format qw(%time);

my $date_today = $time{'yymmdd', time-86400};
to specify the date.

I need to be able to change that behavior so that I can choose a range of dates so the script can process multiple files for multiple dates rather than having it run once daily. What's the easiest way to go about this? If I have to post the full script just let me know.

Thanks for any help in advance!

TheEffect fucked around with this message at 18:09 on Jan 9, 2015

uG
Apr 23, 2003

by Ralp
https://stackoverflow.com/questions/6622818/what-is-the-optimal-way-to-loop-between-two-dates-in-perl

code:
use strict;
use warnings;
use DateTime;

my $start = DateTime->new(
    day   => 1,
    month => 1,
    year  => 2000,
);

my $stop = DateTime->new(
    day   => 10,
    month => 1,
    year  => 2000,
);


while ( $start->add(days => 1) < $stop ) {
    printf "Date: %s\n", $start->ymd('-');
}
code:
Date: 2000-01-02
Date: 2000-01-03
Date: 2000-01-04
Date: 2000-01-05
Date: 2000-01-06
Date: 2000-01-07
Date: 2000-01-08
Date: 2000-01-09
You would have to be able to install Perl modules to do it this way (using the shell: cpan DateTime, or via a package manager, e.g. on Debian something like apt-get install libdatetime-perl).

Mithaldu
Sep 25, 2007

Let's cuddle. :3:
If you don't want to deal with dependencies, you can probably also do this, for example, to get the past 7 days:

code:
my @week = map { $time{ 'yymmdd', time - ( 86400 * $_ ) } } 1 .. 7;
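If an explicit start and end date is wanted rather than "the last N days", a core-modules-only sketch could look like the following. The date_range helper and its 'YYYY-MM-DD' argument format are assumptions made for this example. It does all arithmetic in UTC at noon so that 86400-second steps should not trip over daylight saving changes.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Returns every date from $from to $to (inclusive) formatted as yymmdd.
sub date_range {
    my ( $from, $to ) = @_;

    # Turn each 'YYYY-MM-DD' string into an epoch value at noon UTC.
    my ( $start, $end ) = map {
        my ( $y, $m, $d ) = split /-/, $_;
        timegm( 0, 0, 12, $d, $m - 1, $y );
    } $from, $to;

    my @dates;
    for ( my $t = $start ; $t <= $end ; $t += 86400 ) {
        push @dates, strftime( '%y%m%d', gmtime $t );
    }
    return @dates;
}

print join( ',', date_range( '2015-01-05', '2015-01-09' ) ), "\n";
# prints: 150105,150106,150107,150108,150109
```

Each returned string can then stand in for $date_today inside a loop around the existing per-file logic.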

TheEffect
Aug 12, 2013
Thanks guys. Trying Ug's solution first, but I'm getting stuck when installing the dependencies. Any idea what this is about?



Mithaldu
Sep 25, 2007

Let's cuddle. :3:

TheEffect posted:

Thanks guys. Trying Ug's solution first, but I'm getting stuck when installing the dependencies. Any idea what this is about?





It sounds like it can't figure out the dependency chain. Either you need to upgrade your CPAN.pm, or send drolsky a mail to see if he messed up.

prefect
Sep 11, 2001

No one, Woodhouse.
No one.




Dead Man’s Band

TheEffect posted:

Thanks guys. Trying Ug's solution first, but I'm getting stuck when installing the dependencies. Any idea what this is about?





I see Windows. Do you know if you're using Activeperl or Strawberry perl or something else? If Activeperl is an option, it's generally much easier to install modules that need compiled code.

TheEffect
Aug 12, 2013
Thanks everyone. I wasn't able to figure out the dependency issue as I'm a bit pressed for time on this (love when I get assigned to a project I have no idea about that's due the same day) but I was able to come up with a solution to my script based on Mithaldu's suggestion.

Really appreciate all the help and responses. Thanks again.

raej
Sep 25, 2003

"Being drunk is the worst feeling of all. Except for all those other feelings."
I have a CSV file in UTF-8 that has a lot of garbage data in it (weird ASCII characters, HTML remnants, etc.).

An example would be source file:
code:
(Hello),
&#40;Foo&#41;,
…bar,
Appleton Brewing\/Adler Brau,
St. James\u2019s Gate
and the rules would be
code:
&#40; -> (
&#41; -> )
… -> ...
\/ -> /
\u2019 -> '
And output would look like:
code:
(Hello),
(Foo),
...bar,
Appleton Brewing/Adler Brau,
St. James's Gate
It's not just for html codes, but a wide variety of crap data that needs cleaning up. Right now I'm searching for "&#" finding an example, running a find/replace on it, and repeat. Ideally, I'd like to add a rule to the ruleset that way when I receive a new file, I can apply all of the same rules.

Is there an easy way to do this in Perl?

Rohaq
Aug 11, 2006
Make a hash with your replacements:

Perl code:
my %replacements = (
    '&#40;' => '(',
    '&#41;' => ')',
    'foo'   => 'bar'
);
Then loop over it and replace text in your target text with a regex:

Perl code:
my $string = '&#40;testing&#41;';

print $string,"\n";    # Prints out initial $string value

for my $k ( keys %replacements ) {
    $string =~ s/$k/$replacements{$k}/g;
}

print $string,"\n";    # Prints out new $string value, with replacements.
Working example here

Keep in mind the order of your replacements, since previous replacements may affect further ones; i.e. if you want to replace HTML characters (&#40;) and ampersands (&), replace the HTML characters first, then replace ampersands, otherwise one will conflict with the other.

It's a quick and easy way to do it, and there may be better ways, but that'll do the job.
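One possible way to sidestep the ordering concern is to do all replacements in a single pass, so that text produced by one replacement is never scanned again. This is a sketch, not a drop-in for every ruleset: replace_once is a hypothetical name, keys are escaped with quotemeta so they match literally, and longer keys are tried first.

```perl
use strict;
use warnings;

my %replacements = (
    '&amp;' => '&',
    '&#40;' => '(',
    '&#41;' => ')',
);

# Build one alternation out of all keys, longest first, each escaped.
my $pattern = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } keys %replacements;

sub replace_once {
    my ($text) = @_;
    # Each match is looked up in the hash; the result is not re-scanned.
    $text =~ s/($pattern)/$replacements{$1}/g;
    return $text;
}

print replace_once('&#40;R&amp;D&#41; &amp;#40;'), "\n";
# prints: (R&D) &#40;
```

Note the last token: '&amp;#40;' becomes the literal text '&#40;' and stays that way, whereas looping over the keys one by one could turn it into '(' depending on hash order.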

Rohaq fucked around with this message at 21:41 on Feb 13, 2015

raej
Sep 25, 2003

"Being drunk is the worst feeling of all. Except for all those other feelings."

Rohaq posted:

Make a hash with your replacements:

Perl code:
my %replacements = (
    '&#40;' => '(',
    '&#41;' => ')',
    'foo'   => 'bar'
);
Then loop over it and replace text in your target text with a regex:

Perl code:
my $string = '&#40;testing&#41;';

print $string,"\n";    # Prints out initial $string value

for my $k ( keys %replacements ) {
    $string =~ s/$k/$replacements{$k}/g;
}

print $string,"\n";    # Prints out new $string value, with replacements.
Keep in mind the order of your replacements, since previous replacements may affect further ones; i.e. if you want to replace HTML characters (&#40;) and ampersands (&), replace the HTML characters first, then replace ampersands, otherwise one will conflict with the other.

It's a quick and easy way to do it, and there may be better ways, but that'll do the job.

This is EXACTLY what I am looking for! Thank you so much!

raej fucked around with this message at 22:07 on Feb 13, 2015

Rohaq
Aug 11, 2006

raej posted:

This is EXACTLY what I am looking for! Thank you so much!
No problem!

By the by, if you want to decode HTML entities all in one go, install HTML::Entities from CPAN and use it as follows:
Perl code:
use HTML::Entities;

my $string = '&#40;testing&#41;';

print $string,"\n";    # Prints "&#40;testing&#41;"

$string = decode_entities($string);

print $string,"\n";    # Prints "(testing)"
You'll massively shorten your replacements list for other weird characters.

Rohaq fucked around with this message at 21:46 on Feb 13, 2015

raej
Sep 25, 2003

"Being drunk is the worst feeling of all. Except for all those other feelings."
One more thing, how would I modify this to read/modify a file?

Rohaq
Aug 11, 2006
Depends: How big's the file?

If it's a fairly small file, check out this tutorial: You can load it into memory, make changes line by line, and output it again, overwriting the file.

If it's a really big file, you probably don't want to load it all into memory, because you've only got so much memory available. You need to look into using the Tie::File module, which can tie an array to a file, then you can loop over the array loading each line one by one, and edit lines one at a time. This is going to be slower than loading a file into memory, but it keeps your file from taking up a ton of memory at any one time.

Again, take a backup of the original file first, last thing you want to do is mess up your original data.

Details on Tie::File can be found here: http://perldoc.perl.org/Tie/File.html
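A rough sketch of the Tie::File approach, assuming a small ruleset like the one above. It writes a throwaway temp file first so it can be run as-is; with real data you would tie your own (backed-up) file instead. The \Q...\E around the key is an addition here so that keys are matched literally.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Tie::File;

# Throwaway demo file so the sketch is self-contained.
my ( $fh, $path ) = tempfile( UNLINK => 1 );
print {$fh} "&#40;Hello&#41;,\n&#40;Foo&#41;,\n";
close $fh or die "Can't close $path: $!";

my %replacements = ( '&#40;' => '(', '&#41;' => ')' );

# Each array element is one line of the file, without its newline.
tie my @lines, 'Tie::File', $path or die "Can't tie $path: $!";

for my $line (@lines) {
    for my $k ( keys %replacements ) {
        # Modifying the element writes the change back to the file.
        $line =~ s/\Q$k\E/$replacements{$k}/g;
    }
}

untie @lines;

# Read the file back to confirm the edits landed on disk.
open my $check, '<', $path or die "Can't open $path: $!";
my $result = do { local $/; <$check> };
close $check;

print $result;
```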

Rohaq fucked around with this message at 02:36 on Feb 14, 2015

Anaconda Rifle
Mar 23, 2007

Yam Slacker
Couldn't you just use sed or awk for this? This is what they do really well.

raej
Sep 25, 2003

"Being drunk is the worst feeling of all. Except for all those other feelings."
It's about a 5mb text file, so....small?

Rohaq
Aug 11, 2006

raej posted:

It's about a 5mb text file, so....small?
That should be fine to work with using the first method.

It only becomes problematic once you have gigantic log files. Back when I was a newbie and didn't know about Tie::File, I ended up with a 1.7GB perl process running, on a machine with only 2GB of RAM. :)

raej
Sep 25, 2003

"Being drunk is the worst feeling of all. Except for all those other feelings."

Rohaq posted:

No problem!

By the by, if you want to decode HTML entities all in one go, install HTML::Entities from CPAN and use it as follows:
Perl code:
use HTML::Entities;

my $string = '&#40;testing&#41;';

print $string,"\n";    # Prints "&#40;testing&#41;"

$string = decode_entities($string);

print $string,"\n";    # Prints "(testing)"
You'll massively shorten your replacements list for other weird characters.

Is there something like this for unicode? It seems like my data has a stupid amount of unicode characters like '\u00E8' which would result in 'è'

However, trying encode_entities on that would remove only the '\u' portion and leave the '00E8'

EDIT:

Using the Replacements method above will convert everything after the '\' but keeps the '\' intact

Perl code:
my %replacements = (
	'\u00E8' => 'è'
);
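A likely explanation for the leftover backslash, offered as a guess rather than a diagnosis: the hash key is interpolated into the regex as a pattern, where \u is not a recognised escape, so Perl appears to match just u00E8 and never consumes the backslash. Wrapping the key in \Q...\E makes it match literally. A minimal sketch, with a made-up sample string:

```perl
use strict;
use warnings;

binmode STDOUT, ':encoding(UTF-8)';

# Single quotes keep the backslash; "\x{E8}" is the character è.
my %replacements = ( '\u00E8' => "\x{E8}" );

my $line = 'Cr\u00E8me';

for my $k ( keys %replacements ) {
    # \Q...\E escapes regex metacharacters in $k, backslash included.
    $line =~ s/\Q$k\E/$replacements{$k}/g;
}

print "$line\n";
# prints: Crème
```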

raej fucked around with this message at 18:23 on Feb 16, 2015

Mithaldu
Sep 25, 2007

Let's cuddle. :3:

raej posted:

Is there something like this for unicode? It seems like my data has a stupid amount of unicode characters like '\u00E8' which would result in 'è'

However, trying encode_entities on that would remove only the '\u' portion and leave the '00E8'

This is the easiest really:

code:
use IO::All -binary, -utf8;
my $data = io("filename")->all;
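If the file contains the literal six-character text \u00E8 rather than UTF-8 encoded bytes, reading it as UTF-8 would presumably leave those escapes untouched. In that case a substitution with the /e modifier can decode them generically instead of listing each one in a hash. This is a sketch under that assumption: decode_u_escapes is a hypothetical helper, and it does not handle surrogate pairs for characters above U+FFFF.

```perl
use strict;
use warnings;

# Replace every literal \uXXXX (four hex digits) with the character it names.
sub decode_u_escapes {
    my ($text) = @_;
    $text =~ s/\\u([0-9A-Fa-f]{4})/chr(hex($1))/ge;
    return $text;
}

my $decoded = decode_u_escapes('St. James\u2019s Gate, Cr\u00E8me');

binmode STDOUT, ':encoding(UTF-8)';
print "$decoded\n";
# prints: St. James’s Gate, Crème
```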

raej
Sep 25, 2003

"Being drunk is the worst feeling of all. Except for all those other feelings."

Mithaldu posted:

This is the easiest really:

code:
use IO::All -binary, -utf8;
my $data = io("filename")->all;

on the second line, am I putting that on the open part?

Here's my messy code:
Perl code:
use utf8;
use open ':encoding(utf8)';
use HTML::Entities;
use IO::ALL -binary, -utf8;
binmode(STDOUT, ":utf8");

$in_file = "input.csv";
$out_file = "output.csv";

open (IN, "<$in_file") or die "Can't open $in_file: $!\n";
open (OUT, ">$out_file") or die "Can't open $out_file: $!\n";

my %replacements = (
	'\\u00E8' => 'è'
);

while ( $line = <IN> ) {
@fields = split /\s*,\s*/, $line;

#Find and replace
for my $k ( keys %replacements ) {
	$line =~ s/$k/$replacements{$k}/g;
}

#output
print OUT $line;
}

close IN;
close OUT;

#read in output file and print to screen to confirm
open (TEST, "<$out_file") or die "Can't open $out_file: $!\n";
while ( <TEST> ) {
print;
}
close TEST;


Mithaldu
Sep 25, 2007

Let's cuddle. :3:

raej posted:

on the second line, am I putting that on the open part?

It completely replaces open/close etc. by wrapping the filehandle in an object that does all the sanity-checks and automatically closes the FH when the object is destroyed at the end of a function. Here's code for either streaming or slurping that should do what you need:

Perl code:
use strict;
use warnings;
use utf8;
use HTML::Entities 'decode_entities';
use IO::All -binary, -utf8;

binmode STDOUT, ":utf8";

my $in_file      = "input.csv";
my $out_file     = "output.csv";
my %replacements = ( '…' => '...' );

# stream_big_files();
slurp_small_files();

#read in output file and print to screen to confirm
print io( $out_file )->all;

sub clean_line {
    my ( $line ) = @_;

    decode_entities( $line );    # in place

    #Find and replace
    for my $k ( keys %replacements ) {
        $line =~ s/\Q$k\E/$replacements{$k}/g;    # \Q..\E: match the key literally
    }

    return $line;
}

sub slurp_small_files {
    my @lines = io( $in_file )->all;
    $_ = clean_line( $_ ) for @lines;
    io( $out_file )->print( @lines );
    return;
}

sub stream_big_files {
    my $in  = io( $in_file );
    my $out = io( $out_file );
    while ( my $line = $in->getline ) {
        last if !defined $line;
        $line = clean_line( $line );
        $out->print( $line );
    }
}

  • Locked thread