The Perl Short Questions Megathread: executable line noise

Erasmus Darwin: Mar 6, 2001

Triple Tech posted:

Two things. One, it sounds like what you're asking for is to output to a file as soon as possible. That is achieved by the following:
code:
$|++;

Autoflush only works on the currently selected file handle, which by default is STDOUT. However, the following options should all work:

code:

use IO::Handle;

OUT->autoflush(1);

-or-

code:

$oldfh = select(OUT);
$| = 1;
select($oldfh);

-or-

code:

select((select(OUT), $|=1)[0]);
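
If you're writing through a lexical filehandle rather than a bareword like OUT, the IO::Handle method call works the same way. A minimal sketch, assuming a File::Temp scratch file stands in for the real output file ($out and $path are made-up names):

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Scratch file standing in for the real output file (illustration only).
my ($out, $path) = tempfile(UNLINK => 1);
$out->autoflush(1);

print {$out} "first line\n";

# With autoflush on, the data has reached the file before close() is called.
my $size_before_close = -s $path;
print "bytes on disk before close: $size_before_close\n";
close($out);
```

Take out the autoflush line and the size check will most likely report nothing written yet, since the line would still be sitting in Perl's buffer.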

# ¿ Dec 7, 2007 19:31

Erasmus Darwin: Mar 6, 2001

Kidane posted:

Yes! Awesome, I've got a beginner's perl book which devotes one paragraph to getopt::long and doesn't mention the configure option. So basically anything extra will cause ARGV to have an option set, yes yes yes, thank you v. much.

Try running "perldoc Getopt::Long" in your shell, and you should get way more than you ever wanted to know about Getopt::Long. It'll work for any installed module, and you can also look up built-in functions with "perldoc -f functionname". "perldoc perldoc" gives you documentation on perldoc itself.

# ¿ Feb 13, 2008 17:02

Erasmus Darwin: Mar 6, 2001

syphon^2 posted:

Hey that's really cool. How would one handle multiple-line "interactive" scripts then? The carriage return let's you re-draw the current line, but what if you need to re-draw the PREVIOUS line(s)?

I think that's getting to the point where you need to start worrying about the terminal-specific means of moving the cursor around. (Or you can cheat and just hardcode the appropriate ANSI escape sequences, but that's bad.)

The low-level interface to the underlying terminal control sequences is handled by Term::Cap, which seems to be bundled with Perl:

http://www.perl.com/doc/manual/html/lib/Term/Cap.html

But you don't want to use that unless you're feeling adventurous and masochistic. It's a lot easier to use a nice wrapper like Term::Screen (which you'll have to grab off CPAN):

http://search.cpan.org/~jstowe/Term-Screen-1.03/Screen.pm

Term::Screen, Term::ReadKey, and Term::ANSIColor were enough for me to do an Asteroids clone using ASCII characters. I can't remember why I skipped using Term::Screen's key handling code and instead went with Term::ReadKey.

# ¿ Mar 12, 2008 16:19

Erasmus Darwin: Mar 6, 2001

Triple Tech posted:

Unless you think some non-while method is even faster. Like reading the file raw with sysreads and stuff?

I would think that would give some improvements because of the line length. I'm a little fuzzy on the behind-the-scenes details, but I suspect Perl's traditional, user-friendly file reading mechanism might be inefficient when dealing with long lines. I wouldn't be surprised if there's a lot of string reallocation going on behind the scenes.

Sartak posted:

I'd really be surprised if this wasn't the fastest method:
code:
my @cols = unpack 'a10a2a6...', $_;

Wouldn't Perl still have to parse the 'a10a2a6...' string for every line of the file? I suspect that might be slower than the brute-force substr method which gets turned into bytecode once.
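
Rather than guessing, the core Benchmark module can settle it on your own machine. A rough sketch, with a made-up three-column record standing in for the real layout (the 'a10a2a6' template and the sub names are assumptions for illustration):

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Made-up fixed-width record: 10-, 2- and 6-character columns.
my $line = 'ABCDEFGHIJ' . 'KL' . 'MNOPQR';

sub via_unpack { return unpack 'a10a2a6', $_[0]; }
sub via_substr {
    return (substr($_[0], 0, 10), substr($_[0], 10, 2), substr($_[0], 12, 6));
}

# Sanity check: both approaches should pull out the same fields.
my @from_unpack = via_unpack($line);
my @from_substr = via_substr($line);
print "same fields\n" if "@from_unpack" eq "@from_substr";

# Run each for roughly one CPU second and print a comparison table.
cmpthese(-1, {
    unpack => sub { my @cols = via_unpack($line) },
    substr => sub { my @cols = via_substr($line) },
});
```

The numbers will depend on the Perl build and on how long the real lines are, so it's worth rerunning with a representative record.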

# ¿ Apr 8, 2008 15:00

Erasmus Darwin: Mar 6, 2001

permanoob posted:

I can run the script, but it's not outputting the file.

If nothing's late, it won't output a file.

Also, the code uses '=' as a comparison operator instead of '=='. However, by coincidence because of how the code is structured, that bug doesn't break anything (unless months and days in the data are 0-based instead of 1-based).

Edit: Also, a lot of that crazy date comparison logic can be simplified.

code:

if ($year1 * 600 + $mn1 * 40 + $day1 < $year2 * 600 + $mn2 * 40 + $day2) {
  # It's late.  Write out the return.
} else {
  # It's not late.  Go to the next line.
}
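
The reason those multipliers work: a day never exceeds 31, which is less than 40, and month * 40 + day never exceeds 511, which is less than 600, so the combined number sorts in the same order as the dates. A small sketch of the idea (date_key is a made-up helper name):

```perl
use strict;
use warnings;

# Collapse a date into one number that sorts the same way the date does.
# Safe because day <= 31 < 40 and month * 40 + day <= 511 < 600.
sub date_key {
    my ($year, $month, $day) = @_;
    return $year * 600 + $month * 40 + $day;
}

my $due      = date_key(2008, 3, 31);
my $returned = date_key(2008, 4, 1);
print "late\n" if $due < $returned;
```

The keys are only good for comparing. The difference between two of them isn't a number of days.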

Erasmus Darwin fucked around with this message at 20:06 on Apr 8, 2008

# ¿ Apr 8, 2008 19:58

Erasmus Darwin: Mar 6, 2001

nrichprime posted:

There's a lot of
code:
if ($var1 = $var2)
instead of
code:
if ($var1 == $var2)
statements peppered through the code, these may be the source of some of your problems.

Except that it's structured as:

code:

if ($var1 > $var2) {
} elsif ($var1 < $var2) {
} elsif ($var1 = $var2) {
}

Which means it works by coincidence, as long as $var1 and $var2 aren't 0 (which is true if the CSV is using normal, human-readable, 1-based dates). So even though it's wrong, I don't think it's the source of the problem unless the CSV uses 0-based dates (e.g. January = 0 rather than 1).
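
To see the coincidence in action, here's a small sketch that mirrors the chain above, accidental assignment included (compare is a made-up name):

```perl
use strict;
use warnings;

# Same shape as the if/elsif chain above, with = where == was intended.
sub compare {
    my ($var1, $var2) = @_;
    if    ($var1 > $var2) { return 'greater'; }
    elsif ($var1 < $var2) { return 'less'; }
    elsif ($var1 = $var2) { return 'equal'; }   # assignment, not comparison
    return 'fell through';
}

print compare(3, 3), "\n";   # the assigned value 3 is true
print compare(0, 0), "\n";   # the assigned value 0 is false
```

By the time the third test runs, the two values are already known to be equal, so the assignment changes nothing and the branch is taken whenever that value happens to be true.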

# ¿ Apr 9, 2008 13:38

Erasmus Darwin: Mar 6, 2001

6174 posted:

How would I rewrite this C++ function in a Perlish manner?

code:

sub line_match($$) {
  my ($base, $insert) = @_;

  return (substr($base,  74, 8) eq substr($insert,  74, 8)) &&
         (substr($base,  89, 8) eq substr($insert,  89, 8)) &&
         (substr($base, 114, 8) eq substr($insert, 114, 8));
}

It's also worth noting that the C++ version can be rewritten the same way.

# ¿ Apr 10, 2008 18:33

Erasmus Darwin: Mar 6, 2001

Triple Tech posted:

code:

my $csv_dir = "/home/jnooraga/csvdir/";
my $log_dir = "/home/jnooraga/csvdir/logs/";

my @csvfiles = glob("$csv_dir*");
my @logfiles = glob("$log_dir*");

I've got one minor quibble with this. I personally prefer moving the last / to when $csv_dir and $log_dir are used rather than sticking it in the config string:

code:

my $csv_dir = "/home/jnooraga/csvdir";
my $log_dir = "/home/jnooraga/csvdir/logs";

my @csvfiles = glob("$csv_dir/*");
my @logfiles = glob("$log_dir/*");

1) Invocations have a more natural looking "$dir/$file" style to them which mimics what they really are.
2) If someone changes the directories (especially if you were to move their definition to a config file or something), they won't shoot themselves in the foot regardless of whether or not they include a final slash (since an extra slash gets ignored but a missing slash causes everything to break).

Yes, I've put way, way too much thought into this.
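
If you'd rather not think about slashes at all, the core File::Spec module will join the pieces for you. A minimal sketch (the directory and file name are made up):

```perl
use strict;
use warnings;
use File::Spec;

# The same directory, configured with and without a trailing slash.
for my $dir ('/home/example/csvdir', '/home/example/csvdir/') {
    my $path = File::Spec->catfile($dir, 'report.csv');
    print "$path\n";
}
```

Both spellings of the directory should come out as the same path, which is the same foot-shooting protection as the "$dir/$file" style.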

# ¿ May 14, 2008 19:22

Erasmus Darwin: Mar 6, 2001

Div posted:

I hope I explained that right, because frankly the entire thing just seems retarded to me. Can anyone who's had to deal with this in the past point me in the direction of a better way to cache those file server hits?

While it's not the greatest, I'm not sure what your problem is with the scheme beyond it being "retarded". Are you having problems with logo updates not propagating quickly enough since cache persistence is arbitrarily tied to the life of the mod_perl process? Is there so much logo data being cached that it's causing problems with the size of the mod_perl and/or the system memory?

Edit: A more proper way would be to use Memoize to memoize the logo look-up function. However, at its core, Memoize would be doing the same thing that your code already does with just a lot of extra packaging and unnecessary flexibility.
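
For reference, the Memoize version would look roughly like this. It's only a sketch: fetch_logo is a made-up stand-in for whatever function actually hits the file server.

```perl
use strict;
use warnings;
use Memoize;

my $lookups = 0;

# Hypothetical stand-in for the expensive file-server hit.
sub fetch_logo {
    my ($name) = @_;
    $lookups++;
    return "logo data for $name";
}

# Replace fetch_logo with a caching wrapper keyed on its arguments.
memoize('fetch_logo');

for (1 .. 3) {
    my $logo = fetch_logo('acme');
}
print "file server hit $lookups time(s)\n";
```

As with the hand-rolled cache, the results live for as long as the process does, so it has the same staleness and memory trade-offs.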

Erasmus Darwin fucked around with this message at 16:16 on Jun 6, 2008

# ¿ Jun 6, 2008 15:00

Erasmus Darwin: Mar 6, 2001

Rather than relying on the registry query code to handle communicating with each remote server, what about installing a local script on each one that bundles up all the relevant registry keys and spits it out as a single blob of data to your main script?

# ¿ Jul 1, 2008 13:20

Erasmus Darwin: Mar 6, 2001

syphon^2 posted:

The script is responsible for querying servers out of a pool of ~300. New ones are added and old ones are deprecated on a nearly daily basis.

The master script could be responsible for installing or updating itself on the servers. Alternately, the master script could tell the remote server to run the client script, which would be located on a share on the server running the master script.

Another possibility is to just run everything in parallel. A bit of experimenting shows that fork() comes up with an error if you try and create more than 64 child processes for a single process, but I was able to create 59 more children from one of those child processes. 59 seems like an odd and arbitrary limit, but I haven't dug further to see what's going on or whether there's a limit on the total number of subprocesses available for a given family. Regardless, even processing 50 servers at a time should still be relatively quick: 300 servers / 50 servers/batch * 52 seconds latency for dumping all the subkeys = just over 5 minutes.
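
A bare-bones sketch of the batching idea, assuming plain fork and waitpid. The server names and the batch size of 5 are made up, and the child does nothing where the real registry query would go:

```perl
use strict;
use warnings;

my @servers    = map { "server$_" } 1 .. 12;   # made-up names
my $batch_size = 5;                            # arbitrary for the example
my $finished   = 0;

while (my @batch = splice(@servers, 0, $batch_size)) {
    my @pids;
    for my $server (@batch) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: the real query for $server would go here.
            exit 0;
        }
        push @pids, $pid;
    }
    # Parent: wait for the whole batch before starting the next one.
    for my $pid (@pids) {
        waitpid($pid, 0);
        $finished++ if $? == 0;
    }
}
print "$finished children finished\n";
```

Waiting for the whole batch is the simplest approach. A fancier version would start a new child as soon as any one finishes.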

# ¿ Jul 1, 2008 15:29

Erasmus Darwin: Mar 6, 2001

What about creating a new hash where each key is the DNS server value from the previous hash and each value is an array of keys from the previous hash that had that DNS server?

Something like this:

code:

for $k (sort keys %servervalues) {
  if (! $newhash{$servervalues{$k}}) {
    $newhash{$servervalues{$k}} = [ ];
  }
  push @{$newhash{$servervalues{$k}}}, $k;
}

The sort is obviously optional but is a quick and easy way to make sure the individual arrays are created in a sorted order.

And then you can do something like:

code:

for $k (keys %newhash) {
  if (@{$newhash{$k}} > 1) {
    print join(', ', @{$newhash{$k}}), " share the same value.\n";
  }
}
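
Here's the same idea as a self-contained script with some made-up data, in case you want something to run. It leans on the fact that pushing through an undefined hash value creates the array for you, so the explicit initialization is optional:

```perl
use strict;
use warnings;

# Made-up sample data: server name => DNS server value.
my %servervalues = (
    alpha   => '10.0.0.1',
    bravo   => '10.0.0.2',
    charlie => '10.0.0.1',
);

my %newhash;
for my $k (sort keys %servervalues) {
    # push autovivifies the array reference on first use.
    push @{ $newhash{ $servervalues{$k} } }, $k;
}

for my $dns (sort keys %newhash) {
    my @hosts = @{ $newhash{$dns} };
    print join(', ', @hosts), " share the same value.\n" if @hosts > 1;
}
```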

# ¿ Jul 17, 2008 20:09

Erasmus Darwin: Mar 6, 2001

Triple Tech posted:

So, if you want by spaces, you're going to have to cut up each line of numbers first, and then pump your counter algo into it, which on the surface looks sound.

In which case, the 'split' function is probably what you want. You feed it a regexp and a string, and it splits the string into an array of strings using the regexp as the separator. For example:

split /:/, 'foo:bar:baz' would return [ 'foo', 'bar', 'baz' ]. (Note that the separator, ':' in this case, is completely removed from the output.)

Also, since the output of split is an array, you can quickly iterate through it using a foreach loop.

Here's one way to write the main read/processing loop (in spoiler tags because it's worth trying to figure it out on your own first):

code:

[spoiler]while (<>) {
    chomp; # Implicitly acts on $_.
    foreach (split /\s/, $_) {
        $ThreeHash{$_}++;
    }
}[/spoiler]

Also, I abused $_ a bit in there. Don't let that throw you. A more explicit version would be:

code:

[spoiler]while (my $line = <>) {
    chomp $line;
    foreach my $code (split /\s/, $line) {
        $ThreeHash{$code}++;
    }
}[/spoiler]

That effectively does the same thing as the last block.

Also, there are other ways to do it besides using split. For example, you can use a loop that iterates on the regexp operator with the 'g' flag in order to have it return every 3 digit block in a given string. That works better if your file's not in a consistent format or if there's a lot of stuff you're skipping over. For example, that'd be the way to go if you wanted to count every area code embedded in a bunch of English text talking about the joys of area codes.
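
A quick sketch of that regexp approach, using a made-up sentence as the input:

```perl
use strict;
use warnings;

my %count;
my $text = 'Call 212 or 415; 212 is busier than 415 and 617.';

# In scalar context, each pass of a /g match picks up where the last one ended.
while ($text =~ /\b(\d{3})\b/g) {
    $count{$1}++;
}

print "$_: $count{$_}\n" for sort keys %count;
```

The \b anchors keep it from grabbing three digits out of the middle of a longer number.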

# ¿ Jul 21, 2008 14:16

Erasmus Darwin: Mar 6, 2001

Triple Tech posted:

Also not all platforms (Windows) respect the shebang.

Windows obviously doesn't use the shebang to invoke perl, but perl on Windows (or at least the copy of ActivePerl that I've got installed here) is smart enough to manually parse the shebang line once it's invoked. So "#!/usr/bin/perl -w" still works on Windows.

# ¿ Jul 23, 2008 18:36

Erasmus Darwin: Mar 6, 2001

Triple Tech posted:

Why are we allowed to assume that the map of a list is done in order? Like you know how the perldoc says we can't assume hash keys have a predictable order?

Having a fixed order is an integral concept to the list type, so it makes sense to preserve that order when working with the type. Hashes don't have a fixed order for their elements, so attempting to impose one on the output of the keys function would often result in extra work.

So I guess it sort of boils down to it's that way because that's the way it is. Still, you need to keep in mind that these are two of the fundamental data building blocks of the language. Their differences complement each other and allow you to create more complex data structures.
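
A tiny illustration of the difference, with made-up data:

```perl
use strict;
use warnings;

# map walks the list front to back, so the output order matches the input.
my @names = qw(carol alice bob);
my @upper = map { uc $_ } @names;
print "@upper\n";

# keys comes back in no particular order, so sort it if the order matters.
my %age = (carol => 41, alice => 29, bob => 35);
my @by_name = sort keys %age;
print "@by_name\n";
```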

# ¿ Sep 5, 2008 18:30

Erasmus Darwin: Mar 6, 2001

Sartak posted:

Perhaps you want an END block.

I ran a quick test, and it looks like that doesn't work for SIGTERM:

code:

$ perl -e 'END { print "Dying!\n"; } kill("TERM", $$); sleep(120);'
Terminated
$

So he'd still need to worry about catching the signal somehow. For a general purpose module, I suspect that any solution will be slightly messy and have some suboptimal side-effects.
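
One way to catch it, as a sketch: have the TERM handler call exit, which shuts down normally and so still runs END blocks. To keep the example self-contained, it runs the signal-catching code in a child perl and captures what the child prints ($child_code and $output are made-up names):

```perl
use strict;
use warnings;

# Child script: turn SIGTERM into an ordinary exit() so END blocks still run.
my $child_code = <<'CHILD';
$SIG{TERM} = sub { exit 1 };
END { print "Dying!\n" }
kill 'TERM', $$;
sleep 5;
CHILD

# Run the child with the same perl binary and capture its output.
open(my $child, '-|', $^X, '-e', $child_code) or die "can't run perl: $!";
my $output = do { local $/; <$child> };
close($child);
print "child printed: $output";
```

The messy part for a general purpose module is that installing the handler stomps on whatever TERM handling the calling script already had.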

# ¿ Sep 23, 2008 17:23

Erasmus Darwin: Mar 6, 2001

TiMBuS posted:

Feels kinda hackish, sorta like using goto in C to handle errors.
And it looks a little out of place, being tagged onto the end of the loop block. Not very intuitive.

I have to disagree somewhat. I'm a little too set in my C-ish ways to use continue blocks on a regular basis, but it's hard to see it as hackish. It puts the increment code for the loop into its own distinct block thereby increasing semantic content and allowing for reduced duplication of code. It's about as hackish as C's for-loop (which a cynic could call a glorified while loop).
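
For anyone who hasn't run into one, a minimal example of a continue block:

```perl
use strict;
use warnings;

my $i = 0;
my @seen;
while ($i < 6) {
    next if $i % 2;      # skip odd numbers
    push @seen, $i;
} continue {
    $i++;                # runs after every pass, including ones ended by next
}
print "@seen\n";
```

Without the continue block, the next would skip the increment and the loop would spin forever on the first odd number.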

# ¿ Sep 26, 2008 02:57

Erasmus Darwin: Mar 6, 2001

syphon^2 posted:

Huh, threading/coroutines must be beyond me. I looked over this, and their example seems nothing more than an elaborate goto(), it doesn't run things asynchronously.

I think you're right. It does look like it handles some nifty things, but it's still at the mercy of your code providing a "cede" to switch contexts. So when you're blocking on a sleep or the registry lookup code, there's no way your code can do a cede to let your other code run. In short, I don't think this fits your needs.

I think you're probably better off using Win32::Process like you were thinking of before.

Mithaldu posted:

The main point here is: Sockets don't lock because they keep the CPU busy, but because they're idling until something happens. What you want is to do something during the time they spend idling. Coro::Socket makes that possible.

That obviously requires the sockets in use to be Coro::Sockets. However, since syphon^2 is relying on Win32::TieRegistry to abstract away the networking details, he doesn't have access to the sockets being created. Then again, I suppose he could go digging into Win32::TieRegistry's code and create his own Coro::Socket-based version. Seems like more trouble than just doing multiple processes, though.

# ¿ Oct 22, 2008 17:00

Erasmus Darwin: Mar 6, 2001

syphon^2 posted:

Doing it this way means I have to rip out a bunch of code an put it in a separate .pl file though. (or can Win32::Process::Create execute Perl code within the same Perl executable?)

You can stick it in the same script if you want. Just invoke the child processes with /child as one of the arguments and check @ARGV when you start up.

quote:

EDIT: I just thought of something else... Win32::Process::Create is like fork, in that I can't share variables, right?

Yup. A quick and easy workaround is to just have the child processes write their results to textfiles that're checked by the parent process.

# ¿ Oct 22, 2008 19:48

Erasmus Darwin: Mar 6, 2001

NeoHentaiMaster posted:

Is there a way for a perl script to see whats piping input to it?

If it's Linux, there's always searching /proc. /proc/$PID/fd/0 will be linked to a pipe -- use readlink to get the value. Find the other fd linked to that pipe in /proc, and you know what's feeding you input. There might be an easier and more portable way to do that, but I wouldn't know where to begin looking.

You can also change it so the first thing sent by the calling script is a password of sorts. In order to pull that off, you'd need to make the calling script setuid, as well. For security, I'd recommend having the calling script disable core dumps, read the password out of a file readable only by the user it's setuid as, and then drop privileges. For disabling core dumps, you can use the BSD::Resource module or a wrapper for your script.

# ¿ Nov 24, 2008 14:58

Erasmus Darwin: Mar 6, 2001

Quick and dirty:

code:

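$pid = fork();
if (!defined $pid) {
  # fork returns undef on failure, not a negative number.
  die("Fork failed: $!");
} elsif ($pid == 0) {
  # Child process:
  open(HANDLE, '| /bin/script2') or die("Can't run /bin/script2: $!");
  print HANDLE stuff;
  close(HANDLE);
  exit;
}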

# Parent process continues on...

# ¿ Dec 1, 2008 01:11

Erasmus Darwin: Mar 6, 2001

NeoHentaiMaster posted:

I figured since a forked process is a 'child' process the parent process would still wait for it to close before it finished.

That only happens if you call wait, which as the name implies, makes the parent process wait until a child terminates. Wait's also useful for cleaning up zombie processes -- those are child processes that have already completed but which haven't been cleaned up via wait, yet. That's not really an issue here since the child process doesn't finish until after script1 is done, but it's worth knowing about for other situations.

quote:

Still not sure exactly how its working but it is working! Guess I got more reading to do but since I have a working example it should make it easier to understand.

The thing to understand is what happens when you call fork. It creates an exact duplicate of your current process -- same code, copy of all the data, etc. The only difference is the return value from fork, namely 0 for the child, and the pid of the child for the parent. Also, since the child process is a copy, any changes to variables won't be seen by the parent and vice-versa.

So you've essentially got two copies of script1 running once you call the fork. One's going to just pipe a few lines to script2 and then exit, and it doesn't matter that it's going to spend a while twiddling its thumbs while waiting on script2. The other goes on to finish the regular script1 stuff. It's also worth noting that if you leave off the "exit;" for the child process, it'll continue on executing the parent's code making a funny mess of things since all your post-fork code will get run twice by two different processes.
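
A small sketch that shows the copy semantics, along with the exit and the wait ($counter is just a made-up variable to watch):

```perl
use strict;
use warnings;

my $counter = 1;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    $counter = 99;   # only changes the child's copy
    exit 0;          # without this, the child would run the parent's code too
}

waitpid($pid, 0);    # reap the child so it doesn't linger as a zombie
print "parent still sees counter = $counter\n";
```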

Also, here's something fun to consider:

code:

perl -e 'for ($i=0; $i<3; ++$i) { fork(); } print ".";'

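This will output 8 periods instead of 4. The reason is that the for loop continues to execute in the child processes, so you have children spawning children. Each pass through the loop doubles the number of processes, so three passes leave 2**3 = 8 of them, and each one prints a period.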

# ¿ Dec 1, 2008 15:48

Erasmus Darwin: Mar 6, 2001

Fenderbender posted:

s/\$.{8}).+(\$/\\\1~1\2/g would work too I would think. Untested and off the top of my head.

I think your regexp is still broken. Regardless, you need to check the file system to reliably determine a short name due to the possibility of naming collisions. If I make a root-level folder on C: called "Program Stuff", it'll wind up being Progra~2 due to Progra~1 being taken by "Program Files".

# ¿ Jan 16, 2009 05:09

Erasmus Darwin: Mar 6, 2001

Sartak posted:

That isn't as clear as I'd like it to be. Anyone want to take a stab at rewriting it better?

How's this look?

code:

my %is_special = map { $_ => 1 } qw/user project transaction/;

sub create_record {
    my $self = shift;
    my $type = shift;
    my $original_type = $type;

    if (! $is_special{$type}) {
        # If it's plural, let's try operating on the non-pluralized type.
        $type =~ s/s$//;
    }

    if ($is_special{$type}) {
        return $self->create_special_record($type, @_);
    } else {   
        return $self->create_mundane_record($original_type, @_);
    }
}

In non-plural cases, this checks $is_special twice for the same value. That's not the greatest thing, but it does make the code really nice and clear. Unless this is being called a ridiculous number of times, the clarity far outweighs any other concerns.

Changing the last bit to an else is just cosmetic obviously, but I think it looks a bit better having the two possible returns parallel each other. That being said, I've done plenty of cases with "if (foo) { return blah; } return blahblah;", so either/or.

Edit: Crap. Nevermind. Just realized the problem with it.
Edit 2: Ok, fixed it by adding in $original_type.

Erasmus Darwin fucked around with this message at 03:48 on Feb 3, 2009

# ¿ Feb 3, 2009 03:45

Erasmus Darwin: Mar 6, 2001

Xae posted:

Since the file comes out sorted, is there a way to do it with out a hash?

Yes. Just keep track of what the number is for the line you've printed so you can print a new prefix or just a comma as appropriate. Something like this (untested):

code:

$hdr_num = -1;
while (<>) {
  chomp;
  if (/^(\d+)\|(.+)/) {
    if ($hdr_num == $1) {
      print ",$2";
    } else {
      if ($hdr_num != -1) {
        print "\n";
      }
      print "$1|$2";
      $hdr_num = $1;
    }
  }
}
if ($hdr_num != -1) {
  print "\n";
}

# ¿ Feb 17, 2009 18:56

Erasmus Darwin: Mar 6, 2001

Shoes posted:

code:

    if( $pid = 0 ){

There's a bug here. You want == rather than =. With =, you'll be assigning 0 to $pid and then your conditional will always evaluate to false. Also, make sure you've got an exit statement at the end of the code being executed by your child process, otherwise both the parent and the child process will continue forking new processes.

# ¿ Feb 18, 2009 02:38

Erasmus Darwin: Mar 6, 2001

Shoes posted:

I'm guaranteed that this is going to be a list of a few hundred items. Is there a way to call 10 or 20 at a time?

http://forums.somethingawful.com/showthread.php?threadid=2961749#post349626854

# ¿ Feb 18, 2009 05:01

Erasmus Darwin: Mar 6, 2001

Beardless Woman posted:

Use Data::Dumper to dump out the contents of $xml->{localLogDirectory} to see why it's coming up as a Hash.

I don't think localLogDirectory is coming up as a hash. It looks like that's getting expanded to "/srv/stats/log". It's the next variable in the string, $server, that seems to be the hash reference.

wolffenstein, are you sure that the corrected code that you posted still produces the wrong output? It looks like the incorrect version that you first posted would be exactly the sort of mistake that would produce the bad output in question.

If your corrected version's still causing problems, Data::Dumper's the way to go as Beardless Woman suggested. I'd recommend doing the dump on $serverRef to get the full picture.

Erasmus Darwin fucked around with this message at 21:18 on Mar 17, 2009

# ¿ Mar 17, 2009 21:05

Erasmus Darwin: Mar 6, 2001

Captain Frigate posted:

one of the things i need to unpack is an unsigned long long int, and i'm just not sure how to go about that because it doesn't look like unpack does that.

It looks like some versions of Perl support 'Q' for packing/unpacking an unsigned 64-bit integer. You can test it by doing:
perl -e 'pack("Q",0);'

If you get an error about Q being an invalid type for pack, it won't work. Otherwise, you're good to go.

If it doesn't work, you're on the right track with reading the value in two chunks. However, if by padding it with zeros, you mean multiplying it by a power of 10, that won't exactly work since the boundary between the two parts is a power of 2. So you want to do a bitshift (which will pad your value with zero bits).

HOWEVER, it's more complicated than that. If your version of Perl wasn't built with 64-bit integers (which if it was, you should be able to just use 'Q' as an unpack argument), you're going to run into overflow issues. So you'll probably want to use Math::BigInt instead.
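
If 'Q' isn't available, the two-chunk approach could look something like this. It's a sketch that assumes little-endian input, and $bin here is a made-up test value rather than real data:

```perl
use strict;
use warnings;
use Math::BigInt;

# Eight little-endian bytes for 0x0102030405060708 (made-up test value).
my $bin = pack('C8', 0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01);

# Two unsigned 32-bit little-endian words: the low half comes first.
my ($low, $high) = unpack('VV', $bin);

# Shift the high half up by 32 bits, then add the low half.
my $num = Math::BigInt->new($high)->blsft(32)->badd($low);
print $num->as_hex(), "\n";
```

For big-endian data you'd use 'NN' instead and take the high half first.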

# ¿ Jul 6, 2009 15:08

Erasmus Darwin: Mar 6, 2001

Erasmus Darwin posted:

So you'll probably want to use Math::BigInt instead.

Ok, I think I've got a decent Math::BigInt-based solution. Since you didn't specify, I'm just going to stick with reading a number in little-endian order. Assuming that $bin contains the binary representation of the 64-bit int, this should get it turned into a nice Math::BigInt object:

code:

$num = Math::BigInt->new('0x' . reverse(unpack('h*', $bin)));

Edit: If you want to read in a number in network/big-endian order, this should work:

code:

$num = Math::BigInt->new('0x' . unpack('H*', $bin));

However, byte order issues give me a headache, so I could have gotten something jumbled up. Caveat emptor, and make sure you test this code when you toss it into your script.

Erasmus Darwin fucked around with this message at 15:33 on Jul 6, 2009

# ¿ Jul 6, 2009 15:29

Erasmus Darwin: Mar 6, 2001

Captain Frigate posted:

perl crashes. Is that normal?

I can only get two results out of the various perl installations I've got lying around: The correct answer (on the one 64-bit install), and the invalid pack option error (on all the rest). However, the only one with any of the 64-bit -V options enabled is the full 64-bit one (64-bit CentOS 5 box).

# ¿ Jul 7, 2009 23:04

Erasmus Darwin: Mar 6, 2001

Captain Frigate posted:

Hm, so you don't get the right answer on non-64-bit installs?

On the 32-bit installs that I tried it on, trying to invoke pack or unpack with the Q option produced a fatal error that ended the script. So I didn't get the right answer, but I didn't get a wrong answer, either -- the script just dies.

If you're going to run your script on a variety of machines, you can always use an eval block to test whether or not it works. Something like the following quick-and-ugly, untested code:

code:

eval {
  # Note: Use of string comparisons is deliberate.
  $a = 2**63 + 1;
  die '2**63 + 1 is wrong or inaccurate' if $a ne '9223372036854775809';
  $b = unpack('Q', pack('Q', $a));
  die 'mismatch from pack/unpack' if $a ne $b;
};
$ops_64bit_ok = ! $@;

It'd probably be more ideal to test it with 2**64 - 1, but I was worried about the 2**64 part of the calculation producing an overflow / float promotion even on a 64-bit platform, and I'm too lazy to see if that's the case.

And ideally, you could just check the config options satest3 mentioned that perl was compiled with, but since you're getting incorrect results from your own setup, an actual test might be more worthwhile.
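
Checking the compile-time options from inside a script can be done with the core Config module. A minimal sketch that prints the relevant settings next to a live test of pack 'Q' ($has_q is a made-up name):

```perl
use strict;
use warnings;
use Config;

# Size in bytes of a native Perl integer on this build.
my $ivsize = $Config{ivsize};

# pack dies on builds without quad support, so wrap the attempt in eval.
my $has_q = eval { my $packed = pack('Q', 1); 1 } ? 1 : 0;

my $use64 = defined $Config{use64bitint} ? $Config{use64bitint} : 'undef';
printf "ivsize=%d, use64bitint=%s, pack Q %s\n",
    $ivsize, $use64, $has_q ? 'works' : 'is unsupported';
```

If the config values and the live test disagree, I'd trust the live test.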

Also, depending on how you're using the 64-bit value in the rest of your code, you might want to look into more efficient options. There's Math::BigInt::Lite which uses built-in scalars for the numbers until you reach the overflow point where it transparently flips to Math::BigInt. There are also more efficient libraries for doing Math::BigInt's calculations such as Math::BigInt::GMP and Math::BigInt::Pari.

quote:

Well either way, its looks like your solution works like a charm! Thanks a lot!

Glad I could help. It was a fun problem, and I was actually a little surprised when it turned out to be a one-line solution (excluding the 'use' line for the library, of course). I was expecting to have to feed two 32-bit integers into a Math::BigInt object with a bitshift in-between. I was just using the unpack('h*', $bin) part to double-check my assumptions about how the data actually looked when I noticed how easy it would be to manipulate that into a string that could initialize a Math::BigInt.

# ¿ Jul 8, 2009 18:50

Erasmus Darwin: Mar 6, 2001

This is a bit of a wild guess, but you aren't doing this on a Windows machine, are you? If so, you'll need to do "binmode(VID);" after you open it. From reading the manpages, I'm not sure that would be an issue, but it seems like it potentially could be.

# ¿ Jul 14, 2009 19:03

Erasmus Darwin: Mar 6, 2001

The short answer is that I'm not entirely sure. The manpage more or less says that if it's a binary file and there's any doubt, you should just go ahead and binmode it to be safe.

The longer answer is that some of the file I/O functions use a notion of "characters" where character is defined based on the file mode, the file contents, and so on. To complicate matters, other file I/O functions (such as seek and tell) always use bytes instead of characters for efficiency reasons. The manpage for read implies that it uses characters but that the default method is to treat them as bytes -- apparently it's not quite that simple since binmode fixed it.

Presumably, you were running across something where the character count and the byte count differed. Maybe it was a \r\n combo in a string. Maybe it was doing some utf8 weirdness. I'm not really sure since I haven't delved into how it handles that sort of stuff.
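
For what it's worth, the play-it-safe pattern looks like this. It's a sketch using a File::Temp scratch file with a few made-up bytes, including the \r\n pair and a 0x1A that text mode might mangle on some platforms:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Made-up binary payload with bytes that text mode might translate.
my $bytes = "\x00\x0D\x0A\x1A\xFF";

my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh);                  # write the bytes exactly as given
print {$fh} $bytes;
close($fh);

open(my $vid, '<', $path) or die "can't open $path: $!";
binmode($vid);                 # read them back untranslated
my $got = read($vid, my $buffer, 1024);
close($vid);

print "read $got bytes\n";
```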

# ¿ Jul 14, 2009 19:53

Erasmus Darwin: Mar 6, 2001

Captain Frigate posted:

code:

     %daddy_hash=("little_hash",%little_hash,"thing_1",0,"thing_2",0);
}

What you want here is to use \%little_hash instead of just %little_hash. With the slash there, $daddy_hash{little_hash} will contain a hash reference to %little_hash. Without the \, the contents of %little_hash get flattened into the list being used to define %daddy_hash, which throws things off since you're effectively doing this:

code:

%daddy_hash = ("little_hash", "foo",
               0, "bar",
               0, "thing_1",
               0, "thing_2",
               0);

And that's just all sorts of screwed up. I'm not sure exactly how Perl would handle this since you've got a bunch of repeat assignments to the 0 key, and you've got a mismatched key at the end.
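
A quick experiment shows what happens. This is a cut-down sketch with a single-key %little_hash so that the flattened order is predictable:

```perl
use strict;
use warnings;

my %little_hash = (foo => 0);

# Flattened: the list becomes ('little_hash', 'foo', 0, 'thing_1', 0),
# five elements, so perl should warn about an odd number of elements.
my %flattened = ('little_hash', %little_hash, 'thing_1', 0);

# Nested: the backslash stores a single scalar, a reference to the hash.
my %nested = ('little_hash', \%little_hash, 'thing_1', 0);

for my $key (sort keys %flattened) {
    my $value = defined $flattened{$key} ? $flattened{$key} : 'undef';
    print "flattened: $key => $value\n";
}
print "nested: little_hash is a ", ref($nested{little_hash}), " reference\n";
```

Later assignments to the same key overwrite earlier ones, and the key left dangling at the end gets an undefined value.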

I think Triple Tech tried to point this out with his "oh snap" comment, but I'm not sure he fully understood the degree of your misunderstanding. Or maybe I've misunderstood.

Also, if you use the \ notation, you need to be aware that a reference to %little_hash points to the same memory that %little_hash uses -- it's not a copy. So if you do '$little_hash{blahblah} = 7' after creating %daddy_hash, it'll still modify the contents of the hash in $daddy_hash{little_hash}. Of course once your function exits, there won't be a reference to %little_hash anymore, so you no longer have to worry about it being modified via things not directly accessing the contents of %daddy_hash (unless you pass the reference somewhere else).

Finally, Data::Dumper is your friend for testing various data structures. Just throw "use Data::Dumper;" at the beginning of your program and then stick "print Dumper(\%test);" at the end, and it'll print a nice, readable representation of what your data structure looks like.

Erasmus Darwin fucked around with this message at 14:35 on Aug 11, 2009

# ¿ Aug 11, 2009 00:00

Erasmus Darwin: Mar 6, 2001

Captain Frigate posted:

So once the function exits, the data from %little_hash will still be there, but the original reference "%little_hash" won't be?

Correct. As long as there's a reference to the hash data (in this case via $test{little_hash}), Perl will keep the data around. That's the magic of garbage collection.

quote:

Will it still be a hash?

For the most part, yes. Technically, $test{little_hash} is a hash reference rather than a hash. That's actually a special scalar value that gets dereferenced (via the -> or % operators) to get at the underlying hash structure. However, the dereferencing is sometimes implicit since Perl knows that a nested structure has to be a reference (since that's the only way to stick an array or hash within another array or hash).

And to cut through that confusing bit of text, here are some examples:

code:

my %test = make_hashes(); # From before.
my $hashref = $test{little_hash};
print $hashref, "\t", ref($hashref); # Prints: HASH(0x1231232)       HASH
print $hashref->{foo}; # Prints: 0
print $test{little_hash}->{foo}; # Prints: 0
print $test{little_hash}{foo}; # Prints: 0 -- Note that -> was implied.
my %newhash = %$hashref; # %newhash is now a copy of little_hash's contents.

Note that at the end of this code, $test{little_hash} and $hashref both refer to the same hash in memory while %newhash is just a copy. So if you were to change $hashref->{foo} to 5, it'd also change in $test{little_hash}, but it wouldn't change in %newhash.
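
Here's that last point as a runnable sketch. It builds the hash inline rather than calling make_hashes, so the data is made up:

```perl
use strict;
use warnings;

my %test    = (little_hash => { foo => 0, bar => 0 });
my $hashref = $test{little_hash};   # same hash in memory
my %newhash = %$hashref;            # independent copy of the contents

$hashref->{foo} = 5;

print "via \%test:   $test{little_hash}{foo}\n";   # sees the change
print "via \%newhash: $newhash{foo}\n";            # does not
```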

quote:

What I'm really looking for is a way to put a hash into another one, so yeah, nested hashes.

In addition to using a reference to a hash variable (i.e. when we did \%little_hash back in the make_hashes function), you can also create anonymous hash references with braces { }. For example:

my %test = ( little_hash => { foo => 0, bar => 0 }, thing_1 => 0, thing_2 => 0 );

(Note that parentheses are used for the outer enclosing punctuation and braces are used for the subhash.)

Also, I'd recommend taking a look at the perlref, perllol, and perldsc manpages. They're a little on the dense side, and they'll cover some stuff that's likely to make your eyes glaze over, but they do cover the whole mess in detail. It's not a bad system once you're used to it (especially if you've got experience in dealing with pointers in other languages), but it does take a bit for it to all sink in.

# ¿ Aug 11, 2009 15:06

Erasmus Darwin: Mar 6, 2001

Close. You want a dollar-sign in front of test, not a percent. So it's:
$false_thing=$test{little_hash}{foo};

# ¿ Aug 11, 2009 16:43

Erasmus Darwin: Mar 6, 2001

Captain Frigate posted:

Wait, one more small question: if I set up the hash that way, do I still have to put the key in quotes?

Perl generally lets you omit the quotes around a hash key when it's inside braces like that. That works with any hash -- you don't need a special setup to do that. However, depending on what sort of symbols you've got in the key, you might have to use quotes. For example, if you've got spaces in there, you'll need to quote it.

quote:

EDIT: Do I have to make any changes if I want to make one of the members an array?

Nope. It works the same way. The only difference would be using \@little_array instead of \%little_hash. Or, if you go the anonymous route, use brackets [] instead of braces {}.
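
A short sketch with both flavors of array member (the key names are made up):

```perl
use strict;
use warnings;

my @little_array = (1, 2, 3);

my %test = (
    by_reference => \@little_array,   # reference to a named array
    anonymous    => [ 'a', 'b' ],     # anonymous array reference
    thing_1      => 0,
);

print $test{by_reference}[1], "\n";   # second element of @little_array

# Dereference with @{ } to treat the member as a whole array.
push @{ $test{anonymous} }, 'c';
print scalar @{ $test{anonymous} }, "\n";
```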

# ¿ Aug 11, 2009 20:08

Erasmus Darwin: Mar 6, 2001

Captain Frigate posted:

code:

sub foo {
     my(input_1, input2) = @_;
     %hash_1 = (
          field_1 => input_1,
          field_2 => input_2
     );
     return \%hash_1;
}

This is fine, except for some minor syntax errors (missing dollar signs in front of input_1 and input_2, and a missing underscore in input_2 in the 'my' statement). But your understanding of what's going on seems fine.

quote:

code:

sub bar {
     $temp = foo(1,2);
     %hash_2 = (
          field_1 => 3,
          field_2 => 4,
          hash_1 => %{$temp}
     );
     return \%hash_2;
}

First, "%{$temp}" can be written as "%$temp" since the "{}" in this case are just acting like parentheses (i.e. controlling order of operation) with regard to how Perl processes the data type.

Second, "%$temp" (and the equivalent "%{$temp}") is incorrect anyway because you need a hash reference as the value to store in the hash. So you just want "$temp". (Or, if you were crazy, you could do "\%$temp".)
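
Putting those fixes together, a corrected version of both functions might look like this. I've also added 'my' to the hashes so each call gets a fresh one, which goes beyond the fixes described above:

```perl
use strict;
use warnings;

sub foo {
    my ($input_1, $input_2) = @_;
    my %hash_1 = (
        field_1 => $input_1,
        field_2 => $input_2,
    );
    return \%hash_1;
}

sub bar {
    my $temp = foo(1, 2);
    my %hash_2 = (
        field_1 => 3,
        field_2 => 4,
        hash_1  => $temp,    # store the reference itself
    );
    return \%hash_2;
}

my $result = bar();
print $result->{hash_1}{field_2}, "\n";
```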

# ¿ Aug 17, 2009 20:45

Erasmus Darwin: Mar 6, 2001

You can also use Math::BigInt and then just explicitly use a Math::BigInt object for the items that will hold large values.

Also, here's a quick and dirty attempt at emulating "pack('Q', $num)" using Math::BigInt. My morning coffee hasn't kicked in, and I made it work by comparing the binary representations and just fiddling with it until it matched, so I may have screwed something up. So standard disclaimers apply. Oh, and it's little endian only. Anyway, here's the code:

code:

#!/usr/bin/perl -w

use Math::BigInt;

$x = 0x12345678;

$goal = pack('Q', $x);

$big_num = new Math::BigInt($x);

$hex = $big_num->as_hex();

$hex =~ s/^0x//;

if (length($hex) < 16) {
    $hex = '0' x (16 - length($hex)) . $hex;
}

$attempt = reverse(pack('H*', $hex));

if ($attempt eq $goal) {
    print "Goal matched.\n";
} else {
    print "Failure.\n";
}

# ¿ Aug 21, 2009 16:00
