The Perl Short Questions Megathread: executable line noise

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > The Perl Short Questions Megathread: executable line noise

«‹›72 »

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

Mithaldu posted:

It completely replaces open/close etc by wrapping the filehandle in an object that does all the sanity-checks and automatically closes the FH when the object is destroyed at the end of a function. Here's code of either streaming or slurping that should do what you need:

Perl code:

use strict;
use warnings;
use utf8;
use HTML::Entities 'decode_entities';
use IO::ALL -binary, -utf8;

binmode STDOUT, ":utf8";

my $in_file      = "input.csv";
my $out_file     = "output.csv";
my %replacements = ( '�' => '...' );

# stream_big_files();
slurp_small_files();

#read in output file and print to screen to confirm
print io( $out_file )->all;

sub clean_line {
    my ( $line ) = @_;

    decode_entities( $line );    # in place

    #Find and replace
    for my $k ( keys %replacements ) {
        $line =~ s/$k/$replacements{$k}/g;
    }

    return $line;
}

sub slurp_small_files {
    my @lines = io( $in_file )->all;
    $_ = clean_lines( $_ ) for @lines;
    io( $out_file )->print( @lines );
    return;
}

sub stream_big_files {
    my $in  = io( $in_file );
    my $out = io( $out_file );
    while ( my $line = $in->getline ) {
        last if !defined $line;
        $line = clean_line( $line );
        $out->print( $line );
    }
}

After installing the IO::All package, I'm not getting an "undefined subroutine" error on any calls to io() Is there something else I'm missing?

# ? Feb 17, 2015 16:28

Adbot: ADBOT LOVES YOU

# ? May 17, 2024 20:16

Anaconda Rifle: Mar 23, 2007; Yam Slacker

Am I the only one here who thinks raej should be using sed or awk?

# ? Feb 17, 2015 16:41

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

Even trying to get a proof of concept unicode to utf8 conversion is proving difficult.

Perl code:

use HTML::Entities;
binmode STDIN, ":encoding(UTF-8)";

$input = "Mont Sal\u00E8ve Rye Porter \xE8";
$input =~ s/\u00/\x/g;
 print $input, "\n";
 print HTML::Entities::decode($input), "\n";
 print HTML::Entities::encode($input), "\n";

Which produces

code:

Mont Sal ve Rye Porter &#934;
Mont Sal ve Rye Porter &#934;
Mont Sal&#0;ve Rye Porter &egrave;

the \u00E8 is what I'm seeing in my data file (Which is encoded at utf8 to begin with). I'm trying to get the \u00E8 to �.

If I change $input =~ s/\u00/\x/g; to $input =~ s/\u00E8/�/g;, the output from HTML::Entities::encode spits out the HTML &#egrave; but it seems silly to go through a find/replace->encode->decode

http://www.charbase.com/00e8-unicode-latin-small-letter-e-with-grave

# ? Feb 17, 2015 17:26

Mithaldu: Sep 25, 2007; Let's cuddle.

raej posted:

After installing the IO::All package, I'm not getting an "undefined subroutine" error on any calls to io() Is there something else I'm missing?

I didn't noticed you'd typoed All as ALL. Also i got a typo of my own in there with clean_line/clean_lines.

Anaconda Rifle posted:

Am I the only one here who thinks raej should be using sed or awk?

You can do utf8/html entities with those?

# ? Feb 17, 2015 17:28

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

Mithaldu posted:

I didn't noticed you'd typoed All as ALL. Also i got a typo of my own in there with clean_line/clean_lines.
You can do utf8/html entities with those?

Aha, ok, got it running now. But I'm still having the unicode notation conversion issue.

Even if I change the replacements to be my %replacements = ( '\\u00E8' => '�' );

the output looks like
Mont Sal\�ve Rye Porter

If I could get rid of the / I think I would be in business

# ? Feb 17, 2015 17:49

Anaconda Rifle: Mar 23, 2007; Yam Slacker

Mithaldu posted:

You can do utf8/html entities with those?

You should be able to. You might have to fiddle with locale.

# ? Feb 17, 2015 18:10

Mithaldu: Sep 25, 2007; Let's cuddle.

raej posted:

Aha, ok, got it running now. But I'm still having the unicode notation conversion issue.

Even if I change the replacements to be my %replacements = ( '\\u00E8' => '�' );

the output looks like
Mont Sal\�ve Rye Porter

If I could get rid of the / I think I would be in business

Upload a zip or make a github repo with the script and a tiny example where it doesn't work right.

Anaconda Rifle posted:

You should be able to. You might have to fiddle with locale.

I'm absolutely not a linux person, but i think html entities don't have much to do with locales. Am i missing someting?

# ? Feb 18, 2015 00:14

Anaconda Rifle: Mar 23, 2007; Yam Slacker

Mithaldu posted:

I'm absolutely not a linux person, but i think html entities don't have much to do with locales. Am i missing someting?

I was just talking about UTF-8. Html entities are just text, right? "<" and stuff like that?

# ? Feb 18, 2015 01:30

uG: Apr 23, 2003; by Ralp

You double slashed your Unicode character character hash key, so I imagine that extra slash is the first character of that key and why it's showing up before your '�'. You have the key in single quotes so there is no need to escape it

# ? Feb 18, 2015 01:54

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

Mithaldu posted:

Upload a zip or make a github repo with the script and a tiny example where it doesn't work right.

Here you go!
https://github.com/prognar/encodetoutf8

# ? Feb 18, 2015 17:27

Mithaldu: Sep 25, 2007; Let's cuddle.

raej posted:

Here you go!
https://github.com/prognar/encodetoutf8

Sent you a pull request. You had two issues:

- don't put semicolons in lists
- you needed \Q\E around your $k, which instructs the regex to search for the string exactly as given instead of trying to interpolate it as a regex itself

Also, holy balls that is a hosed-up data set. :aaa:

# ? Feb 18, 2015 19:25

raej: Sep 25, 2003; "Being drunk is the worst feeling of all. Except for all those other feelings."

Mithaldu posted:

Sent you a pull request. You had two issues:

- don't put semicolons in lists
- you needed \Q\E around your $k, which instructs the regex to search for the string exactly as given instead of trying to interpolate it as a regex itself

Also, holy balls that is a hosed-up data set.

Woo, that's it!

I added in every Unicode translation and it parsed through them like a champ.

You weren't kidding about the source data, I had to put in a follow up replacements function to clean up all the Unicode that didn't have a heading \

Thank you so much!

# ? Feb 18, 2015 22:07

Hughmoris: Apr 21, 2007; Let's go to the abyss!

I've written my first simple script in Python, and I want to try and write it in Perl to compare the two.

I'm getting stuck on understanding how calling CPAN modules and functions work. I'm on Windows 8, I installed Strawberry Perl, and I used its CPAN client to install CAM::PDF. Now, looking at the CAM::PDF page, there is a getpdftext.pl under the Documentation section that I want to use.

Is that getpdftext.pl part of the CAM::PDF module that I installed? How do I go about calling it in a very simple script? I think I understand how to use that script from a command line but I'm not sure how to incorporate it into another script.

My first goal is to write a very basic script that converts "testFile.pdf" to a text file, and I want to do it from within a script, not the command line.

Hughmoris fucked around with this message at 02:57 on Feb 22, 2015

# ? Feb 22, 2015 02:48

Mithaldu: Sep 25, 2007; Let's cuddle.

Hughmoris posted:

Is that getpdftext.pl part of the CAM::PDF module that I installed? How do I go about calling it in a very simple script?

It's probably not installed. Just click source on its page, copy-paste the code into an appropiately named file, then in your script use backticks like so to call it:

code:

my $output =  `perl $scriptname $arguments`;

Additionally, you sound very confused, which is ok, but also like you didn't read *any* teaching materials on Perl, which is less ok. Please head to http://perl-tutorial.org and read some of the stuff linked there. (i recommend ovid's beginning perl, and modern perl)

# ? Feb 22, 2015 09:15

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Mithaldu posted:

It's probably not installed. Just click source on its page, copy-paste the code into an appropiately named file, then in your script use backticks like so to call it:
code:
my $output =  `perl $scriptname $arguments`;
Additionally, you sound very confused, which is ok, but also like you didn't read *any* teaching materials on Perl, which is less ok. Please head to http://perl-tutorial.org and read some of the stuff linked there. (i recommend ovid's beginning perl, and modern perl)

Thanks for this.

You're right, I probably am not doing as much reading as I need to in order to understand the basics of Perl, but I have done some. I've been reading ovid's book. Either way, thanks again for posting a solution.

# ? Feb 25, 2015 03:09

uG: Apr 23, 2003; by Ralp

code:

use Moops;
use Moo;
require MooX::Options;

our $VERSION = 0.01;

class Options {
    role VM {
        use MooX::Options;
        option 'install_vm' => (
            short       => 'vm',
            is          => 'rw',
            format      => 's@',
            default     => sub {['moar']},
            autosplit   => 1,
            required    => 1,
            doc         => 'VM backend(s) to install',
        );
    }


    role Compiler {
        use MooX::Options;
        option 'install_compiler' => (
            short       => 'c',
            is          => 'rw',
            format      => 's@',
            default     => sub {['rakudo']},
            autosplit   => 1,
            required    => 0,
            doc         => 'Perl6 compiler(s) to install',
        );
    }

    role ModuleInstaller {
        use MooX::Options;
        option 'install_panda' => (
            short       => 'panda',
            is          => 'ro',
            required    => 0,
            doc         => 'Install panda, a module installer for Rakudo Perl6',
        );
    }
} # class Options


# MAIN
class App::p6brew with Options::VM with Options::Compiler with Options::ModuleInstaller {
    use MooX::Options;
    option verbose => (
        is  => 'ro',
        isa => Bool,
    );
} # class App::p6brew



1;



__END__

I can call the above with `my $app = App::p6brew->new_with_options;` but I can't figure out how to apply all roles from class Options to App::p6brew (notice they are applied individually). I'm also trying to figure out how to structure it so I don't have to use MooX::Options inside each role. Any ideas?

# ? Mar 8, 2015 20:11

Mithaldu: Sep 25, 2007; Let's cuddle.

Don't have any ideas due to lack of experience with whatever that is you're doing, however i can recommend going onto irc.perl.org and asking in #web-simple.

# ? Mar 8, 2015 22:40

Salt Fish: Sep 11, 2003; Cybernetic Crumb

Is there any easier/more concise/built in way to do this?

code:

sub split_entry{
        #take in a string, split it, then return the requested field (starting with field 0)
        #usage: split_entry(string, field separator, field number)
        #split_entry("Hello $ Goodbye",$,0) returns "Hello"
        @result = split($_[0],$_[1]);
        return @result[$_[2]];
        }

Basically in bash I would do cut -d$ -f2 or something, but afaik you can't have split just return a single field instead of the whole array.

# ? Mar 18, 2015 22:56

Mithaldu: Sep 25, 2007; Let's cuddle.

Salt Fish posted:

Is there any easier/more concise/built in way to do this?
code:
sub split_entry{
        #take in a string, split it, then return the requested field (starting with field 0)
        #usage: split_entry(string, field separator, field number)
        #split_entry("Hello $ Goodbye",$,0) returns "Hello"
        @result = split($_[0],$_[1]);
        return @result[$_[2]];
        }
Basically in bash I would do cut -d$ -f2 or something, but afaik you can't have split just return a single field instead of the whole array.

A small note: Your example doesn't match what your code does. In any case, this is a little shorter:

code:

#take in a string, split it, then return the requested field (starting with field 0)
#usage: split_entry(string, field separator, field number)
#split_entry("Hello $ Goodbye",$,0) returns "Hello"
sub split_entry {
    my ( $string, $sep, $index ) = @_;
    return ( split $sep, $string )[$index];
}

print split_entry( 'Hello $ Goodbye', '\$', 0 );

# ? Mar 18, 2015 23:18

Salt Fish: Sep 11, 2003; Cybernetic Crumb

Mithaldu posted:

A small note: Your example doesn't match what your code does. In any case, this is a little shorter:

code:

#take in a string, split it, then return the requested field (starting with field 0)
#usage: split_entry(string, field separator, field number)
#split_entry("Hello $ Goodbye",$,0) returns "Hello"
sub split_entry {
    my ( $string, $sep, $index ) = @_;
    return ( split $sep, $string )[$index];
}

print split_entry( 'Hello $ Goodbye', '\$', 0 );

Thats a great answer thank you. I'm curious now why I can't just use:

code:

my $string = "data 1 # comment 1";
print (split '#', $string)[0];

But I'm working my way through Learning Perl so I reckon I'll get that part soon. Thanks!

# ? Mar 18, 2015 23:33

Mithaldu: Sep 25, 2007; Let's cuddle.

Salt Fish posted:

Thats a great answer thank you. I'm curious now why I can't just use:
code:
my $string = "data 1 # comment 1";
print (split '#', $string)[0];
But I'm working my way through Learning Perl so I reckon I'll get that part soon. Thanks!

Because print steals the parens. Try this: print( (split '#', $string)[0] );

# ? Mar 18, 2015 23:39

Salt Fish: Sep 11, 2003; Cybernetic Crumb

Mithaldu posted:

Because print steals the parens. Try this: print( (split '#', $string)[0] );

That's awesome, thank you again. Here is the complete draw a big rear end table code which is simplier for sure:

code:

use CGI qw/:standard/;
        open(FILE, "file.txt") or die("died");
        print table({-border=>1},
                caption('A test table'),
                Tr({-align=>'CENTER'},
                [
                        th(["Data", "Comment"]),
                        map { td( (split '#', $_)[0] ) . td ((split '#', $_ )[1] ) } <FILE>,
                        ]
                        )
                );
        close(FILE);

Let me tell you, this looks a lot better than when I wrote it in php to compare. Perl is good imo.

# ? Mar 18, 2015 23:48

Mithaldu: Sep 25, 2007; Let's cuddle.

Salt Fish posted:

That's awesome, thank you again. Here is the complete draw a big rear end table code which is simplier for sure:

[...]

Let me tell you, this looks a lot better than when I wrote it in php to compare. Perl is good imo.

It is quite good, but let me also tell you that Perl looks a lot nicer when you run code through Perl::Tidy.

code:

use CGI qw/:standard/;
open( FILE, "file.txt" ) or die( "died" );
print table(
    { -border => 1 },
    caption( 'A test table' ),
    Tr(
        { -align => 'CENTER' },
        [
            th( [ "Data", "Comment" ] ),    #
            map { td( ( split '#', $_ )[0] ) . td( ( split '#', $_ )[1] ) } <FILE>,
        ]
    )
);
close( FILE );

Also, i only just now noticed you mentioned Learning Perl, and boy do i have bad news for you. You picked a book that is so old that i'd confidently describe it as poo poo. The things i see you use, bareword filehandles, CGI, no strict/warnings are not very good things to learn. Please instead try:

http://onyxneon.com/books/modern_perl/

and:

http://www.wrox.com/WileyCDA/WroxTitle/Beginning-Perl.productCd-1118013840.html

or

http://web.archive.org/web/20120709053246/http://ofps.oreilly.com/titles/9781118013847/index.html

# ? Mar 18, 2015 23:54

EVGA Longoria: Dec 25, 2005; Let's go exploring!

Mithaldu posted:

It is quite good, but let me also tell you that Perl looks a lot nicer when you run code through Perl::Tidy.
code:
use CGI qw/:standard/;
open( FILE, "file.txt" ) or die( "died" );
print table(
    { -border => 1 },
    caption( 'A test table' ),
    Tr(
        { -align => 'CENTER' },
        [
            th( [ "Data", "Comment" ] ),    #
            map { td( ( split '#', $_ )[0] ) . td( ( split '#', $_ )[1] ) } <FILE>,
        ]
    )
);
close( FILE );
Also, i only just now noticed you mentioned Learning Perl, and boy do i have bad news for you. You picked a book that is so old that i'd confidently describe it as poo poo. The things i see you use, bareword filehandles, CGI, no strict/warnings are not very good things to learn. Please instead try:

http://onyxneon.com/books/modern_perl/

and:

http://www.wrox.com/WileyCDA/WroxTitle/Beginning-Perl.productCd-1118013840.html

or

http://web.archive.org/web/20120709053246/http://ofps.oreilly.com/titles/9781118013847/index.html

There's still tons of CGI out there so there's no harm in learning it. Nothing you pick up in CGI will be harmful to good practice, but it's useful to understand it for when you get tasked with maintaining CGI.

That said, 100% use Modern Perl. It's the best source for learning perl anytime this decade.

# ? Mar 19, 2015 17:31

Toshimo: Aug 23, 2012; He's outta line...

But he's right!

Ok, so I wrote some code and I'm like 300% sleep deprived and haven't eaten anything all day and it's 11pm and BRAAIINNNSSS. Can someone just look this over and make sure it's doing what I think it's doing? The output looks pretty good (considering it's a 260,000 long wall of text), but I don't even know at this point.

I've got a bunch of files that I merged together into a bigass file that is a list of IP address ranges in the format:
###.###.###.### - ###.###.###.###

However, the program I'm feeding it to takes great offense at overlapping ranges, so I needed to remove all duplication.

I lovingly crafted this:

code:

#!/usr/bin/perl

use Exporter;
use Net::IPAddress;
use strict;
use warnings;

my $path_to_file = 'data.txt';
open(my $handle, '<', $path_to_file);
chomp(my @lines = <$handle>);
close $handle;

my @ipranges;

foreach my $line (@lines)
{
        $line =~ /([0-9\.]+) - ([0-9\.]+)/;
        my @temp_ranges = (ip2num("$1"), ip2num("$2"));
        push @ipranges, \@temp_ranges;
}

my $duplicates = 0;

do{
        my @temp_ranges;
        my $test_lower = $ipranges[0][0];
        my $test_upper = $ipranges[0][1];
        for my $i ( 1 .. $#ipranges)
        {
                if($ipranges[$i][0]<$test_upper)
                {
                        $duplicates = 1;
                        if($ipranges[$i][1]>$test_upper)
                        {
                                $test_upper = $ipranges[$i][1];
                        }
                }
                else
                {
                        my @range_arr = ($test_lower, $test_upper);
                        push @temp_ranges, \@range_arr;
                        $test_lower = $ipranges[$i][0];
                        $test_upper = $ipranges[$i][1];
                }
        }
        @ipranges = @temp_ranges;
}while (!$duplicates);

for my $range (0 .. $#ipranges)
{
        print num2ip($ipranges[$range][0]) . " - " . num2ip($ipranges[$range][1]) . " , 000 ,  \n";
}

The program I'm feeding it to hasn't complained and I went from 562,000 lines of input to 212,000 lines of output so, hopefully, maybe...

# ? Mar 22, 2015 04:00

het: Nov 14, 2002; A dark black past
is my most valued
possession

Toshimo posted:

Ok, so I wrote some code and I'm like 300% sleep deprived and haven't eaten anything all day and it's 11pm and BRAAIINNNSSS. Can someone just look this over and make sure it's doing what I think it's doing? The output looks pretty good (considering it's a 260,000 long wall of text), but I don't even know at this point.

I've got a bunch of files that I merged together into a bigass file that is a list of IP address ranges in the format:
###.###.###.### - ###.###.###.###

However, the program I'm feeding it to takes great offense at overlapping ranges, so I needed to remove all duplication.

I lovingly crafted this:

The program I'm feeding it to hasn't complained and I went from 562,000 lines of input to 212,000 lines of output so, hopefully, maybe...

I'm pretty sure this isn't doing what you intend it to do, I tested with this as data.txt:

quote:

0.0.0.0 - 1.0.0.0
0.5.0.0 - 0.6.0.0
0.5.0.0 - 1.1.1.1
1.1.0.0 - 2.0.0.0
126.0.0.0 - 127.0.0.0

and got this output:

quote:

0.0.0.0 - 2.0.0.0 , 000 ,

(edit: the input I used in this example was sorted already and still produced incorrect output but I found when using unsorted input it gave even more incorrect output, thus the comment below)

I would suggest sorting the ipranges first, it simplifies the algorithm:

code:

#!/usr/bin/perl

use Exporter;
use Net::IPAddress;
use strict;
use warnings;

my $path_to_file = 'data.txt';
open(my $handle, '<', $path_to_file);
chomp(my @lines = <$handle>);
close $handle;

my @ipranges;

foreach my $line (@lines)
{
        $line =~ /([0-9\.]+) - ([0-9\.]+)/;
        my @temp_ranges = (ip2num("$1"), ip2num("$2"));
        push @ipranges, \@temp_ranges;
}

@ipranges = sort { $a->[0] <=> $b->[0] } @ipranges;

my ($lower, $upper) = (0, 0);
my @result = ();
for my $i (0 .. $#ipranges) {
    if ($ipranges[$i][0] < $upper) {
        if ($ipranges[$i][1] > $upper) {
            $upper = $ipranges[$i][1];
        }
    }
    else {
        push(@result, [ $lower, $upper ]) unless ($lower == 0 and $upper == 0);
        ($lower, $upper) = @{$ipranges[$i]};

    }
}
push(@result, [ $lower, $upper ]);

@ipranges = @result;

for my $range (0 .. $#ipranges)
{
print num2ip($ipranges[$range][0]) . " - " . num2ip($ipranges[$range][1]) . " , 000 ,  \n";

het fucked around with this message at 07:38 on Mar 22, 2015

# ? Mar 22, 2015 07:26

Roseo: Jun 1, 2000; Forum Veteran

Net::CIDR or Net::CIDR::Lite may be more appropriate for this use case than the IP address class.

# ? Mar 22, 2015 13:17

Toshimo: Aug 23, 2012; He's outta line...

But he's right!

het posted:

I'm pretty sure this isn't doing what you intend it to do, I tested with this as data.txt:

and got this output:

(edit: the input I used in this example was sorted already and still produced incorrect output but I found when using unsorted input it gave even more incorrect output, thus the comment below)

I would suggest sorting the ipranges first, it simplifies the algorithm:

Thanks. I'll give this a shot. The data I'm working with is pre-sorted, but it can't hurt to sort anyway.

Roseo posted:

Net::CIDR or Net::CIDR::Lite may be more appropriate for this use case than the IP address class.

Yeah. I tried this with Net::CIDR and it's been going for 3 hours and hasn't showed any inclination on finishing yet. :ohdear:

code:

#!/usr/bin/perl

use Exporter;
use Net::CIDR;
use strict;
use warnings;

my $path_to_file = 'data.txt';
open(my $handle, '<', $path_to_file);
chomp(my @lines = <$handle>);
close $handle;

my @cidr_list;

foreach my $line (@lines)
{
        $line =~ s/ //;
        @cidr_list=Net::CIDR::cidradd($line, @cidr_list);
}

my @a=Net::CIDR::cidr2range(@cidr_list);

print join("\n", @a);

# ? Mar 22, 2015 23:16

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Been reading up Modern Perl and trying to convert some of my simple Python scripts to Perl.

For my first: I have a folder full of .vgr files. For each file in the directory, I want to iterate line by line to see if it contains the string "LOAD_ORDER=OIS=208972". If it does, I want to stop, print the name of that file then move on to the next file.

Perl code:

use warnings;
use strict;
use diagnostics;

my @vgr_files = glob("*.vgr");
foreach my $file (@vgr_files)
{
	open my $fh, '<', $file
		or die "can't open it";
	
	foreach (<$fh>)
		{
		 	if ($_ =~ /LOAD_ORDER=OIS=208972/)
		 	{
		 		print "$file\n";
		 		last;
		 	}

		}
}

Can I make this code any cleaner/quicker? Also, I ran into a problem where I couldn't glob the .vgr files in the directory using a full file path. I'm thinking it has something to do with spaces in the file path (glob("c:\folder\new folder\*.vgr")). I couldn't get it working so I just moved my script to that folder and used glob("*.vgr")

# ? Mar 26, 2015 03:21

uG: Apr 23, 2003; by Ralp

code:

use Modern::Perl;
use Path::Class;
use experimental qw/smartmatch/;
use FindBin;

my @exts = (qr/vgr$/);
dir("$FindBin::Bin/data_directory")->recurse(callback => sub {
    my $file = (-f $_[0] && $_[0] ~~ @exts) ? $_[0] : return;
    open my $fh, '<', $file; 
    while(<$fh>) { say $file and last if m/LOAD_ORDER=OIS=208972/ }
    close $fh;
});

This is how i'd do it

uG fucked around with this message at 04:02 on Mar 26, 2015

# ? Mar 26, 2015 03:56

Hughmoris: Apr 21, 2007; Let's go to the abyss!

uG posted:

code:

use Modern::Perl;
use Path::Class;
use experimental qw/smartmatch/;
use FindBin;

my @exts = (qr/vgr$/);
dir("$FindBin::Bin/data_directory")->recurse(callback => sub {
    my $file = (-f $_[0] && $_[0] ~~ @exts) ? $_[0] : return;
    open my $fh, '<', $file; 
    while(<$fh>) { say $file and last if m/LOAD_ORDER=OIS=208972/ }
    close $fh;
});

This is how i'd do it

That hurts my brain. Can you break this down?

Perl code:

dir("$FindBin::Bin/data_directory")->recurse(callback => sub {
    my $file = (-f $_[0] && $_[0] ~~ @exts) ? $_[0] : return;

Hughmoris fucked around with this message at 04:23 on Mar 26, 2015

# ? Mar 26, 2015 04:17

uG: Apr 23, 2003; by Ralp

code:

# An array containing precompiled regexes that we will use to match against the file name.
# In this case its a single regex, but you could put as many as you want into the array
my @exts = (qr/vgr$/);

# `dir` is a Path::Class object, and in this case it the directory given as an argument ($FindBin::Bin just points to the /path to the perl script) and recursively 
# runs the supplied callback each time it finds a file or folder
dir("$FindBin::Bin/data_directory")->recurse(callback => sub {
    
    # The first parameter supplied to the callback is the name of the folder or file. Since we are not interested in folders, we will test if its a file with `-f $file`
    # We also apply the filter on file extension here with the smart match operator, '~~', which basically says "does $_ match anything in @exts?" and tries to match appropriately 
    # @exts contains a regex, its doing something like `for (@exts) { return $file if $_ =~ m/vgr$/; }`
    my $file = (-f $_[0] && $_[0] ~~ @exts) ? $_[0] : return;

# ? Mar 26, 2015 04:30

Hughmoris: Apr 21, 2007; Let's go to the abyss!

uG posted:

code:

# An array containing precompiled regexes that we will use to match against the file name.
# In this case its a single regex, but you could put as many as you want into the array
my @exts = (qr/vgr$/);

# `dir` is a Path::Class object, and in this case it the directory given as an argument ($FindBin::Bin just points to the /path to the perl script) and recursively 
# runs the supplied callback each time it finds a file or folder
dir("$FindBin::Bin/data_directory")->recurse(callback => sub {
    
    # The first parameter supplied to the callback is the name of the folder or file. Since we are not interested in folders, we will test if its a file with `-f $file`
    # We also apply the filter on file extension here with the smart match operator, '~~', which basically says "does $_ match anything in @exts?" and tries to match appropriately 
    # @exts contains a regex, its doing something like `for (@exts) { return $file if $_ =~ m/vgr$/; }`
    my $file = (-f $_[0] && $_[0] ~~ @exts) ? $_[0] : return;

Thanks for taking the time to write that up.

# ? Mar 26, 2015 04:35

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Trying to learn simple databases and Perl, together. I installed SQLite and am using the DBD::SQLite module. Any recommend readings to ease my way into using Perl with databases?

For my first database, I was going to try a project someone here in CoC recommended. Create a table full of randomized patient names, social security #s, home address and insurance carrier.

# ? Mar 28, 2015 03:38

uG: Apr 23, 2003; by Ralp

I'd read the DBIx::Class documents and use that, but lots of people don't like ORMs

code:

perl -MDateTime -MModern::Perl -e 'use DBIx::Class::Schema::Loader qw/make_schema_at/; make_schema_at("New::Schema::Name", { debug => 0 }, [q/dbi:sqlite:ncaa_cfb/,"user","pass", {} ]); my $table = New::Schema::Name->resultset("Schedule"); my $query = $table->search({ date => {"<", DateTime->new( month => 9, day => 11, year => 2001 )->ymd } }); say "all columns in table:" . join(",", @{$row->result_source->columns}); while(my $row = $query->next) { say "date: " . $row->date;  }'

Learning how to setup the schemas can have a bit of a learning curve, but you can have them generated dynamically like the above command in the mean time.

My general boilerplate looks something like

code:

use Modern::Perl;
use Config::General;
use FindBin;
use DB::Schema;

# Database Connections
my $conf       = Config::General->new("$FindBin::Bin/../DB/db.cnf");
my %config    = $conf->getall;
my $DB        = DB::Schema->connect( "dbi:mysql:" . $config{mysql}{db} . ':host=' . $config{mysql}{host}, $config{mysql}{user}, $config{mysql}{pass}, { mysql_enable_utf8 => 1} );
my $teams_rs  = $DB->resultset('DimTeamNcaaId');
my $games_rs  = $DB->resultset('Schedule');

Then later on you can start handling your relationships/joins in a perl-y fashion, so you don't have to have separate schemas variables if they have a relationship setup in their schema

code:

$teams_rs->search({ name => 'Michigan State' })->games;

Its a steep learning curve, but DBIx::Class is one of Perl's best tools. In fact its the only thing that keeps me from using Perl6 full time :getin:

uG fucked around with this message at 04:29 on Mar 28, 2015

# ? Mar 28, 2015 04:27

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Thanks, this will give me some light reading to do over the weekend.

# ? Mar 28, 2015 05:14

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Played with perl and DBI a little more. I successfully created my first database and figured out how to loop over a file containing names, and enter those names into my database. However, it is slow as dirt.

I have a 1,220-line file, called combined_names.txt where each line has a first name and last name:

code:

alexander hamilton
harry dresden
sarah smith

My database has 3 columns: Id, fname, lname

Here's my ugly code that creates the database, then iterates over my combined_names list and inserts the names into the db. Takes about 30 seconds to insert 1,220 rows... Any tips?

Perl code:

#!perl
use warnings;
use strict;
use diagnostics;
use DBI;

my $dbfile = "sample.db";

my $dsn = "dbi:SQLite:dbname=$dbfile";
my $user = "";
my $password = "";
my $dbh = DBI->connect($dsn, $user, $password, {
	PrintError => 0,
	RaiseError => 1,
	AutoCommit => 1,
	FetchHashKeyName => 'NAME_1c',
	});

my $sql = <<'END_SQL';
CREATE TABLE patients (
	id		INTEGER PRIMARY KEY,
	fname 	VARCHAR(100),
	lname	VARCHAR(100)
	)
END_SQL

$dbh->do($sql);

my @split_names = ();

open my $fh, '<', 'C:/Strawberry/Projects/Names_project/combined_names.txt'
	or die "Can't open it";

foreach (<$fh>)
{
	@split_names = split / /, $_, 2;
	$dbh->do('INSERT INTO patients (fname, lname) VALUES (?,?)', undef, $split_names[0], $split_names[1]);
}

$dbh->disconnect;

# ? Mar 28, 2015 06:32

uG: Apr 23, 2003; by Ralp

Look into turning autocommit off if transactional integrity doesn't have to be considered. Also prepare() the statement outside the loop

uG fucked around with this message at 07:39 on Mar 28, 2015

# ? Mar 28, 2015 06:49

toadee: Aug 16, 2003; North American Turtle Boy Love Association

uG posted:

Look into turning autocommit off if transactional integrity doesn't have to be considered. Also prepare() the statement outside the loop

To expand on this, outside the loop:

code:

my $sth = $dbh->prepare('INSERT INTO patients (fname, lname) VALUES (?,?)');

Then inside the loop:

code:

$sth->execute( @bind_vars );

That should run quite a bit faster.

# ? Mar 28, 2015 13:43

Adbot: ADBOT LOVES YOU

# ? May 17, 2024 20:16

Hughmoris: Apr 21, 2007; Let's go to the abyss!

uG posted:

Look into turning autocommit off if transactional integrity doesn't have to be considered. Also prepare() the statement outside the loop

toadee posted:

To expand on this, outside the loop:
code:
my $sth = $dbh->prepare('INSERT INTO patients (fname, lname) VALUES (?,?)');
Then inside the loop:
code:
$sth->execute( @bind_vars );
That should run quite a bit faster.

Wow. Turning off autocommit and then preparing my statement outside the loop dropped my create & write time from ~ 32 seconds to being less than 1 second. Thanks.

# ? Mar 28, 2015 16:01

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > The Perl Short Questions Megathread: executable line noise

«‹›72 »