homercles
Feb 14, 2010

shift / @_ question here. I ran into some unexpected behavior. Here is test code:
code:
sub print_args { print "[@_]\n"; }

sub call_bad  { shift->{cb}->(@_) }
sub call_good { my $self = shift; $self->{cb}->(@_) }

my $hash_glob = { cb => \&print_args };
my $hash_anon = { cb => sub { print_args(@_) } };
print "1. I expect [123], I got "; call_bad($hash_glob, 123);
print "2. I expect [123], I got "; call_bad($hash_anon, 123);

bless $hash_glob; bless $hash_anon;
print "3. I expect [123], I got "; $hash_glob->call_bad(123);
print "4. I expect [123], I got "; $hash_anon->call_bad(123);
print "5. I expect [123], I got "; $hash_glob->call_good(123);
print "6. I expect [123], I got "; $hash_anon->call_good(123);
Output is:
code:
1. I expect [123], I got [HASH(0x2397f40) 123]
2. I expect [123], I got [HASH(0x23b3c10) 123]
3. I expect [123], I got [main=HASH(0x2397f40) 123]
4. I expect [123], I got [main=HASH(0x23b3c10) 123]
5. I expect [123], I got [123]
6. I expect [123], I got [123]
My only guess is that in &call_bad, @_ is being flattened into the argument list before shift() runs, even though the shift has to happen first to reach the coderef that those arguments get passed to.
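For reference, copying the arguments up front also gives me what I expect; a quick sketch (the sub name is made up):
code:
sub call_also_good {
    # copy @_ up front, so the order of shift vs. @_ can no longer matter
    my ($self, @args) = @_;
    $self->{cb}->(@args);
}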

homercles
Feb 14, 2010

Bazanga posted:

I don't know if I am doing something wrong, but the is_redirect attribute in LWP isn't getting set to true whenever a website redirects. For example, if I use the website http://www.capitalone.com/ as a url, LWP returns a success code and doesn't mention anything about a redirect, while I know for certain that the url redirects.

What is actually happening is that LWP::UserAgent follows the redirect itself, successfully loads the redirected URL https://www.capitalone.com/, and reports "Yep, that https URL loaded correctly". The response you get back is that final successful one, so is_redirect is false; the redirect limit is what keeps you from getting stuck in an infinite redirect loop.

To have it do what you want (not follow redirects, and tell you whether a URL requests a redirect), instantiate LWP::UserAgent like so:
code:
my $ua = LWP::UserAgent->new(max_redirect => 0);
More info of course is in the LWP::UserAgent POD.
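With that set, the check you're after looks roughly like this (a sketch only, using the URL from your post):
code:
use LWP::UserAgent;

my $ua  = LWP::UserAgent->new(max_redirect => 0);
my $res = $ua->get('http://www.capitalone.com/');
if ($res->is_redirect) {
    print "Redirects to: ", $res->header('Location'), "\n";
}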

homercles
Feb 14, 2010

If only there was a perl module that processed text files formatted as a csv. And if only there was an xs version of it so it ran quickly.

I wonder what that module would be called? All right all right it's a log file with comma-joined fields not a CSV but still...
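For the record, pointing it at a comma-joined log is only a few lines; a rough sketch with a made-up filename:
code:
use Text::CSV_XS;

my $csv = Text::CSV_XS->new({ binary => 1 });
open my $fh, '<', 'app.log' or die "Cannot open app.log: $!";
while (my $row = $csv->getline($fh)) {
    print join(' | ', @$row), "\n";   # fields, already split and unescaped
}
close $fh;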

homercles fucked around with this message at 21:16 on Oct 21, 2012

homercles
Feb 14, 2010

What happens if you declare your methods static? You're polluting the global namespace for no reason.

homercles
Feb 14, 2010

Ninja Rope posted:

Not that I don't believe you, but can you show me where it's documented? I'd like to know more.
There are some good quotes in the 5.14.* documentation. They were removed in 5.16; I don't know why, and I couldn't care less.

Urls of interest are:
http://perldoc.perl.org/5.14.2/perlref.html

perlref 5.14.2 posted:

Hard references are smart--they keep track of reference counts for you, automatically freeing the thing referred to when its reference count goes to zero. (Reference counts for values in self-referential or cyclic data structures may not go to zero without a little help; see Two-Phased Garbage Collection in perlobj for a detailed explanation.) If that thing happens to be an object, the object is destructed. See perlobj for more about objects. (In a sense, everything in Perl is an object, but we usually reserve the word for references to objects that have been officially "blessed" into a class package.)

http://perldoc.perl.org/5.14.2/perlobj.html#Two-Phased-Garbage-Collection

That last link doesn't state the GC rules in plain English, but it does explicitly enumerate them.
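The "little help" those docs mention for cyclic structures usually ends up being Scalar::Util::weaken; a tiny sketch:
code:
use Scalar::Util qw(weaken);

my $node = {};
$node->{self} = $node;     # cycle: the refcount can never fall to zero on its own
weaken($node->{self});     # a weak link doesn't count, so the hash is freed normally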

homercles
Feb 14, 2010

uG posted:

What I can say is its directly related to the linked list (item* head) in the if statement I mentioned above. Removal of the Perl headers, XS prototypes on the bottom, and cxs_edistance function removed result in code that when compiled, returns the expected value (so the C guys I know think i'm crazy).
code:
 39   item *head,*curr,*iterator;
 40   head = (item*)malloc(sizeof(item));
 41   curr = head;
...
 58     if(hash(head,src[i]) == NULL){
The contents of head haven't been initialised, causing head->next->next to segfault via hash().

homercles
Feb 14, 2010

tonski posted:

There is no call to head->next->next:

First call to hash (line 23):
code:
  iterator = head;  //iterator = { next =  , value = , count = }
  while(iterator->next){  // { untrue }
    ..
  } // skipped to here because while is untrue
  return NULL;
I'll use the line numbers from pastebin.

40: head is malloc'd. The contents of head contain garbage, as it has not been memset, so: head = { next = <garbage>, value = <garbage>, count = <garbage> }
58: hash(head, src[i]) is called. head's contents still have not been initialised
23: item* iterator = head;
24: while(iterator->next){ // the truthfulness of this statement is undefined. it might be true, might be false. it depends on the memory allocated by malloc. we will assume it's true for this example as that will cause a segfault
25: if(iterator->value == index){ // undefined. assume false for this example
28: iterator = iterator->next // that is, iterator = head->next. head->next contains garbage. iterator is now filled with nonaddressable garbage.
24: while(iterator->next){ // this may segfault: we're now testing head->next->next, which can blow up because head->next was never initialised with a value; attempting to deref it is undefined behavior

homercles fucked around with this message at 21:32 on Nov 3, 2012

homercles
Feb 14, 2010

uG posted:

Alas, that is not the problem either. FWIW, you can compile this and it will spit out '1': http://pastebin.com/M86yLumM

Same code with the Perl headers slapped on, export/call main() (no arguments, the values are hard coded) from the XS.pm wrapper (instead of xs_edistance), and we segfault in the same spot we've been discussing (which, again, works perfectly fine outside the Perlish enviroment).

The malloc stuff is a problem. It's not the problem but it's a problem. I ran your code on my machine and it prints out 3 not 1, because you've got array corruption too. On my machine, writing to scores[ax+1][ay+1] was changing the value of ay.

Here's my version that works and prints 1: http://pastebin.com/uN0dp9DT

I changed how push and hash work, removed item *curr, *iterator from scores, and changed the scores array to int scores[ax+2][ay+2], because you're reading and writing to it past its bounds.

Declaring an array as int x[1] and then reading/writing x[1] goes past the end (the only valid index is 0), so the array has to be one element larger. The same goes for declaring int scores[ax+1][ay+1] and then reading/writing scores[ax+1][ay+1].

homercles
Feb 14, 2010

I wrote a pure-Perl SNMP agent a few years ago for regression testing at my job, and the boss gave the OK to sling some code your way. It does use some Perl decoding code found in mrtg that seems to have fallen out of its official place on the internet, however. I'll send you a PM and maybe upload this stuff in a more formalised way to CPAN.

homercles
Feb 14, 2010

I want to export a global method (used for logging, of course) to all packages in my company's namespaces, and if a new package in my company's namespaces is loaded dynamically I want to detect that and export my method to it too.

I can't figure out a good way of doing this; I have two ideas, but neither is all that ideal. One is to create a package dedicated to exporting a global logging method, and to explicitly use that package so it exports my logging method into the namespace that invoked &import.

The other idea is to override a global and hijack its functionality (eg hijack CORE::GLOBAL::log), checking caller and deferring to regular logarithm behavior if called by an unexpected package.

I had some other ideas that are terrible. One is autodetecting every package in my company's namespaces (grepping for '^package' declarations in .pm files in my lib paths) and pre-exporting my logging method to all of them.

An even worse one was overriding strict::import (except that is lexical, so getting it to work when you have multiple packages in one file is a no-go).

I had an even worser idea, but apparently providing your own UNIVERSAL::AUTOLOAD impl is bad.

Does anything cleaner exist?
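For concreteness, that first idea is basically just a stock Exporter package, something like this rough sketch (package and sub names are placeholders, and the logger body is made up):
code:
package Our::Log;
use strict;
use warnings;
use Exporter 'import';
our @EXPORT_OK = ('log_info');

sub log_info {
    print STDERR scalar(localtime), ' ', join(' ', @_), "\n";
}

1;
The catch is that every package still has to say use Our::Log 'log_info'; explicitly, which is exactly the manual step I'm trying to avoid.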

homercles
Feb 14, 2010

Mithaldu posted:

Frankly, that is the sanest way i can think of doing this, as you can very tightly control when it should happen. On the other hand, your idea itself is entirely loving insane and i don't understand what the issue is with simply telling people to do use Our::Log 'log_info'; whenever they wish to log.
No gently caress you dad! I don't think creating a new perl global function is an insane thing, although most approaches to try and emulate global creation are irresponsible.

I was trying to remember this last one (much better than mucking with &UNIVERSAL::AUTOLOAD), and that's putting a coderef into @INC to get notified when a library is about to be loaded, but before it actually is eval'ed. You can then open the file, scan through it for package declarations and pre-stash your prototyped "globals" into the to-be-loaded namespaces. :jeb:
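A rough, untested sketch of that idea (the namespace filter, the file scanning, and the stash poking are all illustrative):
code:
# called by use/require for every module, before it is compiled
unshift @INC, sub {
    my ($self, $file) = @_;                    # e.g. "My/Company/Foo.pm"
    return unless $file =~ m{^My/Company/};    # only care about our namespaces
    for my $dir (@INC) {
        next if ref $dir;                      # skip other hooks
        my $path = "$dir/$file";
        next unless -f $path;
        open my $fh, '<', $path or last;
        while (my $line = <$fh>) {
            next unless $line =~ /^\s*package\s+([\w:]+)/;
            no strict 'refs';
            *{"$1\::log_info"} = \&Our::Log::log_info;   # pre-stash the "global"
        }
        last;
    }
    return;    # returning nothing lets require keep searching @INC as normal
};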

homercles
Feb 14, 2010

I don't want to go golf-crazy or anything silly, but I'd do this. I might even use File::Slurp to remove the manual filehandle open if the file is known to be quite small; it strips boilerplate via core-ish modules.

code:
use List::Util qw(sum);
my $filename = "sums_in_loop_data.txt";
open my $test_data, '<', $filename
	or die "Cannot open '$filename' for writing: $!";

print sum(split), " " while <$test_data>
With File::Slurp, and using join:
code:
use List::Util qw(sum);
use File::Slurp qw(slurp);
use feature qw(say);
my $filename = "sums_in_loop_data.txt";
say join " ", map { sum split } slurp $filename;

homercles
Feb 14, 2010

Toe Rag posted:

Not to be a dick, but neither of these is easy for a beginner to understand, and more importantly, they give the wrong answer. From the site...

Yours include 3 as part of the answer, which is not part of the pairs to be summed.
Hughmoris asked for a more streamlined version of his program, and I did so bug for bug :) The intention was not to write beginner code, since that tends to resemble K&R C.

I've just gotten bitten by POSIX behavior or whatever: we're stashing a bunch of data into __DATA__, and the DATA handle is literally an open file descriptor on the script itself. Since fork() shares the same underlying file descriptor (and offset) in the kernel, all our reads from DATA in sibling Apache processes are slapfighting each other. Stuffing some lookup data into the __DATA__ section was a very lazy approach, and it turns out it was disastrous.
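The obvious workaround is to read __DATA__ once at startup, before Apache forks, so the children never fight over a shared offset; a minimal sketch with placeholder data:
code:
my @lookup = <DATA>;   # slurp everything once, before any fork happens
chomp @lookup;
close DATA;            # children no longer share a live, seeking descriptor

__DATA__
key1 value1
key2 value2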

homercles
Feb 14, 2010

Is there a stripped down vi clone written in Perl?

homercles
Feb 14, 2010

Azhais posted:

I'm too old to learn new syntax. Perl5 for life!

:corsair:

I'll learn Perl 6 when there are Perl 6 jobs out there. So, probably never.

homercles
Feb 14, 2010

You can certainly make threads work. (Well, I'd use Parallel::ForkManager, mostly because true threads aren't needed.)

Is there a way to differentiate the key lines from the value lines? Do keys match a certain pattern? If you seek to an arbitrary spot in the file, discard the (probably partial) line you land in as junk, then read the next full line, can you tell via a regex whether that line is a key or a value? If so, this is the easiest case: each child can run independently of the parent.

Otherwise, have each thread assume that one of the following is true: the first line in its slice is a key and the next line is its value, OR the first line is a value and the subsequent lines form key/value pairs. Process the slice both ways and, when done, send both result sets to the parent along with the number of lines processed. Once the parent has the results from all its children, it can work out whether the first line of each slice was actually a key or a value (based on the line counts each child reports) and keep the matching result set. It might be space-prohibitive and a bit pesky to write, but it's conceptually simple.
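Since I mentioned Parallel::ForkManager, the skeleton of that fork-per-slice scheme would look roughly like this (slice handling is stubbed out and all names are made up):
code:
use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(4);
my %results;

# collect each child's results (and how many lines it read) as it exits
$pm->run_on_finish(sub {
    my ($pid, $exit, $slice_id, $signal, $core, $data) = @_;
    $results{$slice_id} = $data if $data;
});

for my $slice_id (0 .. 3) {
    $pm->start($slice_id) and next;   # parent: note the slice id and keep looping
    my %kv;
    my $lines_read = 0;
    # ... seek to this slice, read its lines, build %kv, count lines ...
    $pm->finish(0, { kv => \%kv, lines => $lines_read });
}
$pm->wait_all_children;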

Alternatively, maybe you can use GNU Parallel for this.

A test program: perl -e 'BEGIN { $x = "x"x60 } print "key $_ $x\nval $_ $x\n" for 1 .. 214' | parallel --no-notice --block 500 --pipe -L 100 egrep --line-buffered -A1 -n "'^key .*7 '"

How you'd use it: cat BIGFILE.txt | parallel --no-notice --pipe -L 50000000 fgrep --line-buffered -A1 -x -f PATTERNSFILE.txt

You would have to parse the output to make sure matched keys are on 'odd' lines; GNU Parallel will limit the maximum number of lines sent to fgrep, forcing each new parallel block to start on a clean, guaranteed key line. GNU Parallel is itself written in Perl, though, so it might not be much of a performance improvement.

homercles
Feb 14, 2010

perldelta is the be-all and end-all of what changed: http://perldoc.perl.org/index-history.html

As for the specifics, I tend not to delve too much into them personally; I still have to work on environments stuck in 5.10.0 (not that I'm complaining, it still feels quite modern, from before smartmatch was butchered). Perhaps a kinder soul (Mithaldu?) could go into the big things that changed.
