The Perl Short Questions Megathread: executable line noise

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > The Perl Short Questions Megathread: executable line noise

Filburt Shellbach: Nov 6, 2007; Apni tackat say tujay aaj mitta juu gaa!

Yeah, I get that, but:

Mithaldu posted:

that means don't loving disconnect, get a bouncer/shell if you don't have a useful machine

is hostile and this kind of treatment is probably a large part of why Perl doesn't attract newbies.

# ? Jul 7, 2014 18:41

PleasantDilemma: Dec 5, 2006; The Last Hope for Peace

Otto Skorzeny posted:

It's really frustrating to both asker and answerer if a lengthy and thoughtful reply to your question gets lost because you were disconnected when someone who had the ability and time to answer got around to it. I don't recall if the bot that hangs around in the perl channels has a !tell command to get around this problem.

I don't want to encourage a derail but isn't IRC for immediate answers while both people are online at the same time? Asking a question and getting an answer sometime later is for forums & message lists.

# ? Jul 7, 2014 19:17

Mithaldu: Sep 25, 2007; Let's cuddle.

Filburt Shellbach posted:

Yeah, I get that, but:

is hostile

To be honest, you're right that i was being a dick in that. I stand by the sentiment still, but the tone was entirely uncalled for.

Filburt Shellbach posted:

this kind of treatment is probably a large part of why Perl doesn't attract newbies.

That said, i don't think the part about newbies is entirely right. From what i've seen in IRC, and from my own interactions in #win32, #perl-help and #perl, newbies get treated with friendliness and respect as a rule. There are exceptions from time to time, but those are cases where the persons in question have managed to piss everybody off by spending a lot of time ignoring the advice they asked for.

PleasantDilemma posted:

I don't want to encourage a derail but isn't IRC for immediate answers while both people are online at the same time? Asking a question and getting an answer sometime later is for forums & message lists.

Not quite. IRC exists for both. Just like email it is extremely easy and cheap to have asynchronous conversations, but unlike email it ALSO has the capability to easily facilitate realtime conversation. With IRC you can start a conversation, move on to realtime as people begin to answer while you're there, move back to asynch when you or they have to leave, and pick it up later again in realtime again; all within the same medium and while maximizing the amount of random passersby who can see your entire conversation and offer advice.

Forums and message lists are good for some stuff, but discussing topics that require a lot of back-and-forth does not fall into that scope. Similarly difficult is also the attempt to set up realtime conversation via those, because again, back-and-forth is needed for the synchronization process.

------

Cheers.

vvvv

Mithaldu fucked around with this message at 20:25 on Jul 7, 2014

# ? Jul 7, 2014 20:19

jony neuemonic: Nov 13, 2009

I wasn't offended, for what it's worth.

# ? Jul 7, 2014 20:23

revmoo: May 25, 2006; #basta

I have a relatively obscure lib I need from CPAN (Thrift/XS) and all the mirrors for it are dead. I know nothing about CPAN other than how to install packages. Any tips on finding this lib? I can pull it from another box if need be, but I'm not sure that's the best way to go about it.

# ? Jul 11, 2014 22:32

Mithaldu: Sep 25, 2007; Let's cuddle.

revmoo posted:

I have a relatively obscure lib I need from CPAN (Thrift/XS) and all the mirrors for it are dead. I know nothing about CPAN other than how to install packages. Any tips on finding this lib? I can pull it from another box if need be, but I'm not sure that's the best way to go about it.

The download links seem to work fine?

https://metacpan.org/pod/Thrift::XS
http://search.cpan.org/~agrundma/Thrift-XS-1.04/

# ? Jul 12, 2014 01:39

revmoo: May 25, 2006; #basta

Mithaldu posted:

The download links seem to work fine?

https://metacpan.org/pod/Thrift::XS
http://search.cpan.org/~agrundma/Thrift-XS-1.04/

Interesting. Well nevermind then! Not sure why the mirrors were failing.

# ? Jul 12, 2014 01:52

uG: Apr 23, 2003; by Ralp

cpan was down for like 10 minutes yesterday and metacpan was returning the same error. Kind of a silly thing to have happen

# ? Jul 12, 2014 16:18

revmoo: May 25, 2006; #basta

uG posted:

cpan was down for like 10 minutes yesterday and metacpan was returning the same error. Kind of a silly thing to have happen

I was installing other packages at the time so I'm not sure if that's it or not. I'll find out after the weekend I guess because :effort:

# ? Jul 12, 2014 16:42

revmoo: May 25, 2006; #basta

So the mirrors for Thrift/XS are still broke but I was able to successfully compile it from that site, so that's cool.

revmoo fucked around with this message at 14:07 on Jul 14, 2014

# ? Jul 14, 2014 13:31

Mario Incandenza: Aug 24, 2000; Tell me, small fry, have you ever heard of the golden Triumph Forks?

BTW, use cpanminus if you're not already, it makes things much simpler (and easier). It supports lots of super-cool features like installing from an arbitrary local path/URL/git repo.

# ? Jul 14, 2014 23:28

uG: Apr 23, 2003; by Ralp

code:

use Modern::Perl;
use List::Util qw/min max/;
 
my $range_groups = [
	'1,2,3,5..10,11..,20..15,21..23..25,32..30..29..27', # just 1 string for testing
];
 
my @new_groups = map { 
	join('..', _min_max( split(/\.\./, $_) )); 
} split(/,/, ($_)) foreach @{$range_groups};
 
say "Final array values: " . join(',', @new_groups);
 

#end
1;
 
 
sub _min_max {
	my ($min, $max) = ((min @_),(max @_));
	return $min != $max?($min,$max):$min;
}
 
 
__END__

Why is @new_groups empty? Inside the map expr the join statement creates the correct strings but none are getting assigned to the array. I'm guessing it has something to do with the ordering the loops are iterated over but i'm not sure why.

The purpose of this is to create perl ranges, even if the input is invalid (32..30..29..27 to 27..32 for instance)

# ? Oct 6, 2014 06:57

qntm: Jun 17, 2009

Firstly, here's the fix:

Perl code:

my @new_groups = map { 
	join('..', _min_max( split(/\.\./, $_) )); 
} map { split(/,/, ($_)) } @{$range_groups};

As to why this code is behaving like this, I have no clue. I do know that the foreach statement causes everything to the left of it to be executed once per element in the input array, as if there were parentheses wrapped around it, but that still doesn't explain the behaviour you're seeing.

I managed to boil the puzzling behaviour down to:

Perl code:

use strict;
use warnings;
 
my @words = ("alpha", "bravo", "charlie");

(my $word = $_) foreach @words;

print $word;
# Expected: "charlie"
# Actual: undef

qntm fucked around with this message at 12:14 on Oct 6, 2014

# ? Oct 6, 2014 12:03

qntm: Jun 17, 2009

Okay, so the answer is that Perl is complete garbage:

quote:

NOTE: The behaviour of a my, state, or our modified with a statement modifier conditional or loop construct (for example, my $x if ... ) is undefined. The value of the my variable may be undef, any previously assigned value, or possibly anything else. Don't rely on it. Future versions of perl might do something different from the version of perl you try it out on. Here be dragons.

Yes, here, in a simple foreach loop, there be dragons.
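
For what it's worth, here is a minimal sketch of a workaround, assuming all you want is the last element left in $word once the loop is done: declare the variable on its own line, with no statement modifier attached, and only assign to it under the foreach. Only @words and $word come from the example above; the rest is illustration.

```perl
use strict;
use warnings;

my @words = ("alpha", "bravo", "charlie");

# Declare first, with no statement modifier attached to the `my`.
my $word;

# A plain assignment under the modifier is well defined.
$word = $_ foreach @words;

print "$word\n";    # prints "charlie"
```

The same split (declare, then assign) should also sidestep the `my $x if ...` case that the perlsyn note warns about.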

# ? Oct 6, 2014 13:52

toadee: Aug 16, 2003; North American Turtle Boy Love Association

I'm surely not as savvy on perl as many others here, but isn't this just a case of the fact that since you're declaring it with my, your $word variable is local to that foreach loop?

I mean saying

Perl code:

(my $word = $_) foreach @words

Is the same as saying

Perl code:

foreach my $word ( @words ) {}

... and I wouldn't expect to be using $word outside of that loop in the latter.

toadee fucked around with this message at 14:14 on Oct 6, 2014

# ? Oct 6, 2014 14:02

qntm: Jun 17, 2009

If that were the case, the script would not even compile, because the $word at the final line doesn't exist, and I have strict and warnings turned on.

# ? Oct 6, 2014 15:51

Filburt Shellbach: Nov 6, 2007; Apni tackat say tujay aaj mitta juu gaa!

Yeah, the problem is that Perl, for some reason, intentionally does not flag that as an error.

# ? Oct 6, 2014 18:59

Mithaldu: Sep 25, 2007; Let's cuddle.

qntm posted:

If that were the case, the script would not even compile, because the $word at the final line doesn't exist, and I have strict and warnings turned on.

I assume that in this case the same mechanism becomes active that allows this:

code:

use strict;
use warnings;
my $c = 1 if 0.5 < rand;
print $c;

I've poked #p5p to see if there can be something done about this.

# ? Oct 7, 2014 08:37

qntm: Jun 17, 2009

So, I ran into this module. Here is the full extent of the example code for this module:

Perl code:

builder {
    enable 'Plack::Middleware::Acme::PHPE9568F34::D428::11d2::A769::00AA001ACF42';
    $app;
}

This module is a joke, and I understand the joke, but I don't think the joke extends to deliberately obtuse code. What the hell does this code do? What does the builder {} structure do, what is enable, where does $app come from, and why does the line $app; on its own do anything? Classically for Perl, it looks like there is a whole bunch of missing, assumed context, without which the thing is indecipherable.

# ? Dec 30, 2014 00:44

Mithaldu: Sep 25, 2007; Let's cuddle.

I'm a little amused that you'd have trouble reading that, given that you wrote a guide on Perl.

The whole thing expects you to know how Plack middlewares and the Plack::Builder DSL work.

Here's the same code in a slightly more verbose manner:

Perl code:

# this goes into a file you can call something like app.psgi
# which you start with, e.g. `plackup app.psgi`
# which does pretty much: my $app = require 'app.psgi'; SomeServer->run_with( $app );

# it expects this customary plack middleware preamble blurb:
use Plack::Builder qw( enable builder );

# original app
my $app = sub {
    my $env = shift;
    return [ 200, ['Content-Type' => 'text/html'], [ 'Hello World' ] ];
};

# app with middlewares wrapped around it
my $final_app = builder(
    sub {
        # stores middleware name in a local-scoped global variable in Plack::Builder
        enable ( 'Plack::Middleware::Acme::PHPE9568F34::D428::11d2::A769::00AA001ACF42' );
        
        # returns the app to which the contents of the local-scoped global variable are applied
        return $app;
    }
);

$final_app; # this falls out at the end of the require
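
If the wrapping itself is the confusing part, here is a rough sketch of the idea in plain Perl with no Plack installed. It is not how Plack::Builder is implemented; add_header_middleware and the X-Example header are made-up names for illustration. The only thing taken from PSGI is the shape of an app: a code ref that takes an environment hash ref and returns [ status, [ headers ], [ body ] ].

```perl
use strict;
use warnings;

# A PSGI app is just a code ref: it takes an environment hash ref and
# returns [ status, [ headers ], [ body chunks ] ].
my $app = sub {
    my ($env) = @_;
    return [ 200, [ 'Content-Type' => 'text/html' ], ['Hello World'] ];
};

# Hypothetical middleware: takes an app and returns a new app that calls
# the inner one, then appends one header to its response.
sub add_header_middleware {
    my ( $inner, $name, $value ) = @_;
    return sub {
        my ($env) = @_;
        my $res = $inner->($env);
        push @{ $res->[1] }, $name => $value;
        return $res;
    };
}

# Conceptually what the builder block boils down to: wrap the app, hand
# back the wrapped version.
my $final_app = add_header_middleware( $app, 'X-Example' => 'wrapped' );

my $res = $final_app->( { REQUEST_METHOD => 'GET', PATH_INFO => '/' } );
print join( ', ', @{ $res->[1] } ), "\n";
```

The server only ever sees $final_app, which is why the last expression of the .psgi file matters: it is the value the require returns.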

# ? Dec 30, 2014 01:08

TheEffect: Aug 12, 2013

Hello all. I'm pretty sure this is a fairly simple question but I know literally nothing about Perl and this is yet another project I'm taking over from an employee that no longer works here.

I have a fairly simple (from what I can tell) Perl script that essentially copies data from a file and parses it into another file. This script runs once a day. It has the line

code:

use Time::Format qw(%time);

my $date_today = $time{'yymmdd', time-86400};

to specify the date.

I need to be able to change that behavior so that I can choose a range of dates so the script can process multiple files for multiple dates rather than having it run once daily. What's the easiest way to go about this? If I have to post the full script just let me know.

Thanks for any help in advance!

TheEffect fucked around with this message at 18:09 on Jan 9, 2015

# ? Jan 9, 2015 18:03

uG: Apr 23, 2003; by Ralp

https://stackoverflow.com/questions/6622818/what-is-the-optimal-way-to-loop-between-two-dates-in-perl

code:

use strict;
use warnings;
use DateTime;

my $start = DateTime->new(
    day   => 1,
    month => 1,
    year  => 2000,
);

my $stop = DateTime->new(
    day   => 10,
    month => 1,
    year  => 2000,
);


while ( $start->add(days => 1) < $stop ) {
    printf "Date: %s\n", $start->ymd('-');
}

code:

Date: 2000-01-02
Date: 2000-01-03
Date: 2000-01-04
Date: 2000-01-05
Date: 2000-01-06
Date: 2000-01-07
Date: 2000-01-08
Date: 2000-01-09

You would have to be able to install Perl modules to do it this way (from the shell: cpan DateTime, or via a package manager, e.g. on Debian something like apt-get install libdatetime-perl).

# ? Jan 9, 2015 18:34

Mithaldu: Sep 25, 2007; Let's cuddle.

TheEffect posted:

If you don't want to deal with dependencies, you can probably also do this, for example, to get the past 7 days:

code:

my @week = map { $time{ 'yymmdd', time - ( 86400 * $_ ) } } 1 .. 7;
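
If you need an explicit start and end date rather than "the last N days", something along these lines might do, using only modules that ship with Perl (POSIX and Time::Local). This is a sketch, not a drop-in for the original script: days_between is a made-up helper, the 'YYYY-MM-DD' input format is an assumption, and it works in UTC so that daylight saving changes cannot make a day longer or shorter than 86400 seconds.

```perl
use strict;
use warnings;
use POSIX qw(strftime);
use Time::Local qw(timegm);

# Hypothetical helper: every day from $from to $to inclusive, as yymmdd.
# Both arguments are 'YYYY-MM-DD' strings (an assumed input format).
sub days_between {
    my ( $from, $to ) = @_;

    # Convert both dates to epoch seconds at midnight UTC.
    my ( $start, $end ) = map {
        my ( $y, $m, $d ) = split /-/;
        timegm( 0, 0, 0, $d, $m - 1, $y );
    } $from, $to;

    my @days;
    for ( my $t = $start ; $t <= $end ; $t += 86400 ) {
        push @days, strftime( '%y%m%d', gmtime $t );
    }
    return @days;
}

my @range = days_between( '2015-01-05', '2015-01-09' );
print "$_\n" for @range;
```

Each element of @range could then take the place of $date_today inside a loop around the existing processing code.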

# ? Jan 9, 2015 18:43

TheEffect: Aug 12, 2013

Thanks guys. Trying Ug's solution first, but I'm getting stuck when installing the dependencies. Any idea what this is about?

# ? Jan 9, 2015 19:08

Mithaldu: Sep 25, 2007; Let's cuddle.

TheEffect posted:

Thanks guys. Trying Ug's solution first, but I'm getting stuck when installing the dependencies. Any idea what this is about?

It sounds like it can't figure out the dependency chain. Either you need to upgrade your CPAN.pm, or send drolsky a mail to see if he messed up.

# ? Jan 9, 2015 19:13

prefect: Sep 11, 2001; No one, Woodhouse. No one.; Dead Man’s Band

TheEffect posted:

Thanks guys. Trying Ug's solution first, but I'm getting stuck when installing the dependencies. Any idea what this is about?

I see Windows. Do you know if you're using Activeperl or Strawberry perl or something else? If Activeperl is an option, it's generally much easier to install modules that need compiled code.

# ? Jan 9, 2015 19:22

TheEffect: Aug 12, 2013

Thanks everyone. I wasn't able to figure out the dependency issue as I'm a bit pressed for time on this (love when I get assigned to a project I have no idea about that's due the same day) but I was able to come up with a solution to my script based on Mithaldu's suggestion.

Really appreciate all the help and responses. Thanks again.

# ? Jan 9, 2015 20:25

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

I have a CSV file in UTF-8 that has a lot of garbage data in it (weird ASCII characters, HTML remnants, etc.)

An example would be source file:

code:

(Hello),
&#40;Foo&#41;,
�bar,
Appleton Brewing\/Adler Brau,
St. James\u2019s Gate

and the rules would be

code:

&#40; -> (
&#41; -> )
� -> ...
\/ -> /
\u2019 -> '

And output would look like:

code:

(Hello),
(Foo),
...bar,
Appleton Brewing/Adler Brau,
St. James's Gate

It's not just for html codes, but a wide variety of crap data that needs cleaning up. Right now I'm searching for "&#" finding an example, running a find/replace on it, and repeat. Ideally, I'd like to add a rule to the ruleset that way when I receive a new file, I can apply all of the same rules.

Is there an easy way to do this in Perl?

# ? Feb 13, 2015 21:12

Rohaq: Aug 11, 2006

Make a hash with your replacements:

Perl code:

my %replacements = (
    '&#40;' => '(',
    '&#41;' => ')',
    'foo'   => 'bar'
);

Then loop over it and replace text in your target text with a regex:

Perl code:

my $string = '&#40;testing&#41;';

print $string,"\n";    # Prints out initial $string value

for my $k ( keys %replacements ) {
    $string =~ s/\Q$k\E/$replacements{$k}/g;    # \Q...\E: match the key as literal text
}

print $string,"\n";    # Prints out new $string value, with replacements.

Working example here

Keep in mind the order of your replacements, since previous replacements may affect further ones; e.g. if you want to replace HTML entities (such as &#40;) and ampersands (&), replace the entities first, then the ampersands, otherwise one will conflict with the other. Note that keys on a hash returns the keys in no guaranteed order, so if the order matters, keep the pairs in an array instead.

It's a quick and easy way to do it, and there may be better ways, but that'll do the job.
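
A rough sketch of what an order-preserving version could look like, in case the order does matter for your rules. The @rules array and the clean function are made-up names, and the rules themselves are just the examples from the question plus &amp; as a stand-in for "ampersands".

```perl
use strict;
use warnings;

# Ordered list of [ literal search text => replacement ] pairs.
# An array keeps its order; `keys %hash` comes back in no particular order.
my @rules = (
    [ '&#40;' => '(' ],
    [ '&#41;' => ')' ],
    [ '\/'    => '/' ],
    [ '&amp;' => '&' ],    # ampersands last, so nothing gets decoded twice
);

sub clean {
    my ($text) = @_;
    for my $rule (@rules) {
        my ( $from, $to ) = @$rule;

        # \Q...\E matches $from literally, so '(' or '\' are not regex syntax.
        $text =~ s/\Q$from\E/$to/g;
    }
    return $text;
}

print clean('&#40;Foo&#41;,Appleton Brewing\/Adler Brau'), "\n";
# prints "(Foo),Appleton Brewing/Adler Brau"
```

Adding a rule is then just another line in @rules, and the same list can be run over every new file.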

Rohaq fucked around with this message at 21:41 on Feb 13, 2015

# ? Feb 13, 2015 21:25

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

Rohaq posted:

Make a hash with your replacements:
Perl code:
my %replacements = (
    '&#40;' => '(',
    '&#41;' => ')',
    'foo'   => 'bar'
);
Then loop over it and replace text in your target text with a regex:
Perl code:
my $string = '&#40;testing&#41;';

print $string,"\n";    # Prints out initial $string value

for my $k ( keys %replacements ) {
    $string =~ s/\Q$k\E/$replacements{$k}/g;    # \Q...\E: match the key as literal text
}

print $string,"\n";    # Prints out new $string value, with replacements.
Keep in mind the order of your replacements, since previous replacements may affect further ones; e.g. if you want to replace HTML entities (such as &#40;) and ampersands (&), replace the entities first, then the ampersands, otherwise one will conflict with the other. Note that keys on a hash returns the keys in no guaranteed order, so if the order matters, keep the pairs in an array instead.

It's a quick and easy way to do it, and there may be better ways, but that'll do the job.

This is EXACTLY what I am looking for! Thank you so much!

raej fucked around with this message at 22:07 on Feb 13, 2015

# ? Feb 13, 2015 21:39

Rohaq: Aug 11, 2006

raej posted:

This is EXACTLY what I am looking for! Thank you so much!

No problem!

By the by, if you want to decode HTML entities all in one go, install HTML::Entities from CPAN and use it as follows:

Perl code:

use HTML::Entities;

my $string = '&#40;testing&#41;';

print $string,"\n";    # Prints "&#40;testing&#41;"

$string = decode_entities($string);

print $string,"\n";    # Prints "(testing)"

You'll massively shorten your replacements list for other weird characters.

Rohaq fucked around with this message at 21:46 on Feb 13, 2015

# ? Feb 13, 2015 21:42

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

One more thing, how would I modify this to read/modify a file?

# ? Feb 13, 2015 22:07

Rohaq: Aug 11, 2006

Depends: How big's the file?

If it's a fairly small file, check out this tutorial: You can load it into memory, make changes line by line, and output it again, overwriting the file.

If it's a really big file, you probably don't want to load it all into memory, because you've only got so much memory available. You need to look into using the Tie::File module, which can tie an array to a file; then you can loop over the array, loading and editing one line at a time. This is going to be slower than loading the whole file into memory, but it keeps your file from filling a ton of memory at any one time.

Again, take a backup of the original file first, last thing you want to do is mess up your original data.

Details on Tie::File can be found here: http://perldoc.perl.org/Tie/File.html
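
A small sketch of the Tie::File approach, assuming line-oriented text. To keep it self-contained it writes a throwaway temp file first (via File::Temp) instead of touching real data; the sample lines and the two substitutions are only placeholders for your own rules.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Throwaway file standing in for the real data, so nothing real is touched.
my ( $fh, $filename ) = tempfile( UNLINK => 1 );
print {$fh} "&#40;Hello&#41;,\n&#40;Foo&#41;,\n";
close $fh or die "Can't close $filename: $!";

# Each array element is one line of the file (without the newline).
tie my @lines, 'Tie::File', $filename
    or die "Can't tie $filename: $!";

for (@lines) {
    s/&#40;/(/g;
    s/&#41;/)/g;    # changing an element rewrites that line in the file
}

my @result = @lines;    # copy the lines out before untying
untie @lines;

print "$_\n" for @result;
```

On a real file you would drop the tempfile part and tie the actual filename, after taking that backup.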

Rohaq fucked around with this message at 02:36 on Feb 14, 2015

# ? Feb 13, 2015 23:01

Anaconda Rifle: Mar 23, 2007; Yam Slacker

Couldn't you just use sed or awk for this? This is what they do really well.

# ? Feb 13, 2015 23:52

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

It's about a 5mb text file, so....small?

# ? Feb 14, 2015 18:29

Rohaq: Aug 11, 2006

raej posted:

It's about a 5mb text file, so....small?

That should be fine to work with using the first method.

It only becomes problematic once you have gigantic log files. Back when I was a newbie and didn't know about Tie::File, I ended up with a 1.7GB perl process running, on a machine with only 2GB of RAM.

# ? Feb 14, 2015 20:59

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

Rohaq posted:

No problem!

By the by, if you want to decode HTML entities all in one go, install HTML::Entities from CPAN and use it as follows:
Perl code:
use HTML::Entities;

my $string = '&#40;testing&#41;';

print $string,"\n";    # Prints "&#40;testing&#41;"

$string = decode_entities($string);

print $string,"\n";    # Prints "(testing)"
You'll massively shorten your replacements list for other weird characters.

Is there something like this for unicode? It seems like my data has a stupid amount of unicode characters like '\u00E8' which would result in 'è'

However, trying encode_entities on that would remove only the '\u' portion and leave the '00E8'

EDIT:

Using the Replacements method above will convert everything after the '\' but keeps the '\' intact

Perl code:

my %replacements = (
	'\u00E8' => 'è'
);

raej fucked around with this message at 18:23 on Feb 16, 2015

# ? Feb 16, 2015 18:16

Mithaldu: Sep 25, 2007; Let's cuddle.

raej posted:

Is there something like this for unicode? It seems like my data has a stupid amount of unicode characters like '\u00E8' which would result in 'è'

However, trying encode_entities on that would remove only the '\u' portion and leave the '00E8'

This is the easiest really:

code:

use IO::All -binary, -utf8;
my $data = io("filename")->all;
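
As for the \uXXXX sequences themselves, one possible sketch is to decode them with a single substitution rather than listing each one as a replacement. The unescape function is a made-up name, and this assumes the escapes are the JSON-style four-hex-digit kind; characters outside the 16-bit range (encoded as two \u escapes in a row) are not handled.

```perl
use strict;
use warnings;

# Turn JSON-style escapes into real characters:
#   \uXXXX -> the character with that hex code point
#   \/     -> /
# Surrogate pairs (characters beyond U+FFFF) are NOT handled here.
sub unescape {
    my ($text) = @_;
    $text =~ s/\\u([0-9A-Fa-f]{4})/chr(hex($1))/ge;
    $text =~ s{\\/}{/}g;
    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print unescape('St. James\u2019s Gate'),        "\n";
print unescape('Appleton Brewing\/Adler Brau'), "\n";
```

The double backslash in the pattern is what makes the regex look for a literal backslash; without it only the part after the backslash gets matched. If the data turns out to be a dump of JSON strings, running it through a real JSON decoder would likely be more robust than this.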

# ? Feb 16, 2015 18:23

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

Mithaldu posted:

This is the easiest really:
code:
use IO::All -binary, -utf8;
my $data = io("filename")->all;

on the second line, am I putting that on the open part?

Here's my messy code:

Perl code:

use utf8;
use open ':encoding(utf8)';
use HTML::Entities;
use IO::All -binary, -utf8;
binmode(STDOUT, ":utf8");

$in_file = "input.csv";
$out_file = "output.csv";

open (IN, "<$in_file") or die "Can't open $in_file: $!\n";
open (OUT, ">$out_file") or die "Can't open $out_file: $!\n";

my %replacements = (
	'\\u00E8' => 'è'
);

while ( $line = <IN> ) {
@fields = split /\s*,\s*/, $line;

#Find and replace
for my $k ( keys %replacements ) {
	$line =~ s/$k/$replacements{$k}/g;
}

#output
print OUT $line;
}

close IN;
close OUT;

#read in output file and print to screen to confirm
open (TEST, "<$out_file") or die "Can't open $out_file: $!\n";
while ( <TEST> ) {
print;
}
close TEST;

# ? Feb 16, 2015 18:26

Mithaldu: Sep 25, 2007; Let's cuddle.

raej posted:

on the second line, am I putting that on the open part?

It completely replaces open/close etc by wrapping the filehandle in an object that does all the sanity-checks and automatically closes the FH when the object is destroyed at the end of a function. Here's code of either streaming or slurping that should do what you need:

Perl code:

use strict;
use warnings;
use utf8;
use HTML::Entities 'decode_entities';
use IO::All -binary, -utf8;

binmode STDOUT, ":utf8";

my $in_file      = "input.csv";
my $out_file     = "output.csv";
my %replacements = ( '�' => '...' );

# stream_big_files();
slurp_small_files();

#read in output file and print to screen to confirm
print io( $out_file )->all;

sub clean_line {
    my ( $line ) = @_;

    decode_entities( $line );    # in place

    #Find and replace
    for my $k ( keys %replacements ) {
        $line =~ s/\Q$k\E/$replacements{$k}/g;    # \Q...\E: treat the key as literal text
    }

    return $line;
}

sub slurp_small_files {
    my @lines = io( $in_file )->all;
    $_ = clean_line( $_ ) for @lines;
    io( $out_file )->print( @lines );
    return;
}

sub stream_big_files {
    my $in  = io( $in_file );
    my $out = io( $out_file );
    while ( my $line = $in->getline ) {
        last if !defined $line;
        $line = clean_line( $line );
        $out->print( $line );
    }
}

# ? Feb 17, 2015 06:29
