Erasmus Darwin
Mar 6, 2001

syphon^2 posted:

The script is responsible for querying servers out of a pool of ~300. New ones are added and old ones are deprecated on a nearly daily basis.

The master script could be responsible for installing or updating itself on the servers. Alternatively, the master script could tell the remote server to run the client script, which would live on a share hosted by the server running the master script.

Another possibility is to just run everything in parallel. A bit of experimenting shows that fork() returns an error if you try to create more than 64 child processes from a single process, but I was able to create 59 more children from one of those child processes. 59 seems like an odd and arbitrary limit, but I haven't dug further to see what's going on, or whether there's a limit on the total number of subprocesses for a given process tree. Regardless, even processing 50 servers at a time should still be relatively quick: 300 servers / 50 servers per batch * 52 seconds latency for dumping all the subkeys = just over 5 minutes.
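The batch approach might look something like this sketch (query_server is a hypothetical stand-in for whatever actually dumps the subkeys from a remote box):

code:
use strict;
use warnings;

my @servers    = map { "server$_" } 1 .. 300;
my $batch_size = 50;

# Hypothetical worker: replace with the real per-server query.
sub query_server {
    my ($server) = @_;
    # ... dump subkeys from $server here ...
}

while ( my @batch = splice( @servers, 0, $batch_size ) ) {
    for my $server (@batch) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {    # child process
            query_server($server);
            exit 0;
        }
    }
    # Reap the whole batch before starting the next one.
    1 while waitpid( -1, 0 ) > 0;
}
Each batch runs concurrently, and the blocking waitpid loop keeps the number of live children at or below the batch size.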

Undecided
May 13, 2005
I don't have a lot of Perl experience, but I've been trying to get Net::SSH::Perl installed from CPAN since yesterday, and no matter what I try it fails. I have even tried it on two different distributions (Fedora 9 and Ubuntu JeOS 8), and had a coworker try in their environment.

Are Perl modules really this horrible?

The Dependency link on the CPAN website quotes a 17% chance of it installing correctly, based on their automated tests. [url]http://cpandeps.cantrell.org.uk/?module=Net::ssh::Perl;perl=latest[/url]

It ends up with about 20 other modules that it has to install, and several of them fail during make. They claim installation is an easy process, but even on a completely clean OS install it never works. The main culprits seem to be Crypt::DH, Math::Pari, and Math::GMP.

I even tried installing the package libnet-ssh-perl from apt; it goes through the process and is marked as installed... but the module can't be found on disk.

Getting this module installed is taking longer than writing the script that I need to use it in... Anyone have any suggestions in general on getting libraries to install successfully?

Undecided fucked around with this message at 16:14 on Jul 3, 2008

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

Undecided posted:

I don't have a lot of Perl experience, but I've been trying to get Net::SSH::Perl installed from CPAN since yesterday, and no matter what I try it fails. I have even tried it on two different distributions (Fedora 9 and Ubuntu JeOS 8), and had a coworker try in their environment.

Are Perl modules really this horrible?

Yes.

Undecided posted:

The Dependency link on the CPAN website quotes a 17% chance of it installing correctly, based on their automated tests. [url]http://cpandeps.cantrell.org.uk/?module=Net::ssh::Perl;perl=latest[/url]

It ends up with about 20 other modules that it has to install, and several of them end up failing to make. They claim installation is an easy process, but on a completely clean OS install it still never works. The main culprits seem to be Crypt::DH, Math::Pari, and Math::GMP.

Math::Pari and Math::GMP at least depend on some C libraries being present, and will fail if they can't link to those.

Undecided posted:

I even tried installing the package libnet-ssh-perl from apt; it goes through the process and is marked as installed... but the module can't be found on disk.

Using apt to install packages where possible is the best plan, as it avoids the huge brokenness that is CPAN. However, that package is for Net::SSH, not Net::SSH::Perl. Furthermore, if you have been messing around with CPAN, it's possible you've accidentally used it to upgrade perl, and now have a version that does not search in the correct places for libraries.

Undecided posted:

Getting this module installed is taking longer than writing the script that I need to use it in... Anyone have any suggestions in general on getting libraries to install successfully?

I personally use the Debian tool dh-make-perl to turn CPAN packages into Debian packages, rather than having two different package managers fighting over my perl install. It can be a little tricky if you aren't familiar with Debian source packages, and it sometimes requires manual dependency tracking. Still, I guess "apt-get install dh-make-perl; dh-make-perl --install --cpan Net::SSH::Perl" might be worth a shot.

Edit: If you're on x86 and something reasonably close to Debian sid, I can get some unofficial Debian packages of the libraries you'll need up, I think.

Kidane
Dec 15, 2004

DANGER TO MANIFOLD
I had the same problem with Net::SSH::Perl and ended up using Expect.pm to interface with SSH instead.

Undecided
May 13, 2005
Thanks for the input.

I tried a bit more and still couldn't get Net::SSH::Perl to work, so I ended up finding Net::SSH::Expect. It actually installed and is working out fine for what I need.

EvilJay
Jul 25, 2005
Come to think of it, unless I have installed Net::SSH from apt-get or yum it has been a pain in the rear end...

uG
Apr 23, 2003

by Ralp
I'm using CGI::Application and HTML::Template and can't figure out how to set the runmodes params with a hashref. Is this possible?

Mario Incandenza
Aug 24, 2000

Tell me, small fry, have you ever heard of the golden Triumph Forks?
This doesn't work?
code:
$self->run_modes({
  start => \&start,
  lol => 'hey',
})

Notorious b.s.d.
Jan 25, 2003

by Reene

Undecided posted:

I even tried installing the package libnet-ssh-perl from apt; it goes through the process and is marked as installed... but the module can't be found on disk.

dpkg -L libnet-ssh-perl

<deleted user>
I had no idea Vim had built-in Perl support (along with Ruby and Python). No idea how I hadn't heard of it before.

Run a perl regex on each of lines 10-20. Since $_ is set to the value of the line being examined, both commands below are equivalent:

code:
:10,20perldo $_ =~ s/foo/bar/gi
:10,20perldo s/foo/bar/gi
I'd much rather write perl regex for line manipulation than use Vim's regex.

Vim exposes some of its innards to the perl interpreter. This will print the perl version as a message on the Vim console:

code:
:perl VIM::Msg(sprintf("perl version is %vd", $^V));
Delete a random line from the current buffer:

code:
:perl $curbuf->Delete(int(rand($curbuf->Count()))+1);
I might write a View::Vim component for querying our app framework from inside Vim. That would be nice for development.

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?
Is there anything inherently good/bad/better/wrong with passing things by reference and modifying them? I almost never do it, I always use return values.

code:
my @stack = something_that_returns_a_list();

# versus
my @stack2;
something_that_modifies_by_ref(\@stack2);
I like to write my code semi functionally using map and whatnot, but sometimes I just need to accumulate and I think saying something like fill_er_up(\@stack, \%lookup, $items) would actually be the most concise way.

Filburt Shellbach
Nov 6, 2007

Apni tackat say tujay aaj mitta juu gaa!

Triple Tech posted:

Is there anything inherently good/bad/better/wrong with passing things by reference and modifying them?

It doesn't really matter. I err on the side of returning values, in a functional style; modifying through a reference can cause unexpected changes elsewhere in the program.

Sometimes you do need references, though: when you need to modify or return multiple values, or when the return value is already used for something else.
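For the multiple-return case, a quick sketch (partition is a made-up example name) — returning two arrays by reference, since a bare return would flatten them into one list:

code:
use strict;
use warnings;

# Split @items into two lists by a predicate; return refs
# because "return (@yes, @no)" would flatten into one list.
sub partition {
    my ( $pred, @items ) = @_;
    my ( @yes, @no );
    for my $item (@items) {
        if ( $pred->($item) ) { push @yes, $item } else { push @no, $item }
    }
    return ( \@yes, \@no );
}

my ( $evens, $odds ) = partition( sub { $_[0] % 2 == 0 }, 1 .. 10 );
print "@$evens\n";    # 2 4 6 8 10
print "@$odds\n";     # 1 3 5 7 9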

German Joey
Dec 18, 2004

Triple Tech posted:

Is there anything inherently good/bad/better/wrong with passing things by reference and modifying them? I almost never do it, I always use return values.

code:
my @stack = something_that_returns_a_list();

# versus
my @stack2;
something_that_modifies_by_ref(\@stack2);
I like to write my code semi functionally using map and whatnot, but sometimes I just need to accumulate and I think saying something like fill_er_up(\@stack, \%lookup, $items) would actually be the most concise way.

it depends how big your stack is, obviously! returning a new variable (essentially a copy of the data) is cleaner code but less efficient. it's not too bad if your list is only a few thousand elements, but once you get into the millions and billions you won't really have that option. besides, indirect modification (through references or pointers) is a pretty standard way to do things in many programming languages.
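A sketch of the accumulate-through-a-reference style (fill_er_up is Triple Tech's hypothetical name): nothing is copied back to the caller, so large accumulators stay cheap.

code:
use strict;
use warnings;

# Push matching items onto the caller's array through a reference.
sub fill_er_up {
    my ( $stack_ref, $lookup_ref, $items ) = @_;
    for my $item (@$items) {
        push @$stack_ref, $item if exists $lookup_ref->{$item};
    }
}

my @stack;
my %lookup = ( apple => 1, pear => 1 );
fill_er_up( \@stack, \%lookup, [qw/apple kiwi pear/] );
print "@stack\n";    # apple pear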

MrHyde
Dec 17, 2002

Hello, Ladies
I'm trying to learn Perl, and in the process I created this little program that should list all the directories in a given path recursively. It doesn't work: for some reason, in the recursive portion of the code, Perl doesn't seem to recognize any of the contents as either files or directories. Does anyone know why this might be?

code:
use strict;
use warnings;

my $configFile = "config.txt";
my $logDirectory = "";
my $outputDirectory = "";
my @word_list = ();

#read in the config file
open(CONFIG, $configFile) || die("Unable to find config file");
my @config_data=<CONFIG>;
close(CONFIG);

my $setting = '';
foreach my $configLine (@config_data)
{
	#if you found a comment or an empty line get next line 
	if($configLine =~ /^#/ || $configLine =~ /^$/)
	{
		next;
	}
	#if you found a config section line make note then get the next line
	if($configLine =~ /^\[(.*)\]/)
	{
		$setting = $1;
		next;
	}
	#once you're in the config, get all the lines in that config
	for($setting)
	{
		/LogFilesDirectory/ && do
		{
			$logDirectory = $configLine;
			chomp($logDirectory);
		};
		/Output/ && do
		{
			$outputDirectory = $configLine;
			chomp($outputDirectory);
		};
		/WordList/ && do
		{
			push(@word_list, $configLine);
		};
	}
}
#end reading config file

opendir(LOGDIR, $logDirectory) || die("Couldn't open $logDirectory");
my @dirList = readdir(LOGDIR);
closedir(LOGDIR);
foreach my $fileName (@dirList)
{
	print "$fileName\n";
	if(-f $fileName)
	{
		if($fileName =~ /^.*\.log$/)
		{
			#print "$fileName\n";
		}
	}
	if(-d $fileName && $fileName !~ /\./ && $fileName !~ /\.\./)
	{
		&readSubDir("$logDirectory\\$fileName");
	}
}

sub readLogFile
{
	
}
sub readSubDir
{
	opendir(LOGDIR, $_[0]) || die("Couldn't open $_[0]");
	my @dirList = readdir(LOGDIR);
	closedir(LOGDIR);
	foreach my $fileName (@dirList)
	{
		print "\t$fileName\n";
		if(-f $fileName)
		{
			if($fileName =~ /^.*\.log$/)
			{
				#print "$fileName\n";
			}
		}
		if(-d $fileName && $fileName !~ /\./ && $fileName !~ /\.\./)
		{
			&readSubDir("$logDirectory\\$fileName");
		}
	}
}
edit to fix code, apparently [php] tags mess with my escape slashes

MrHyde fucked around with this message at 01:26 on Jul 10, 2008

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

MrHyde posted:

It doesn't work because for some reason in the recursive portion of the code perl doesn't seem to recognize any of the contents as either folders or directories. Does anyone know why this might be?

You need to prepend $filename with its parent directory; if you open a directory foo and readdir out the entry for the file foo/bar, then the result of the readdir is just bar, and the various file tests won't be able to find it.

Edit: Also, periods are magic in regular expressions, and your regex tests don't do what you think they do. You probably just want a single test, /(^\.)|(\.log$)/, and exclude based on that before bothering to check if the file is a normal or a directory. Misunderstood your regex tests somewhat.

Edit 2: Oh, and use / for the path separator, not \. It'll work on all current operating systems, and it won't cause problems when you use it right in front of a $ in your quoted strings.

Edit 3: And I'm not sure why you have essentially the entire readSubDir function copied literally in your main code path. I'd probably restructure what you've done like so:

code:
use strict;
use warnings;

sub readLogFile {
  my ($logFile) = @_;
  # Do something with $logFile.
}

sub readLogDir {
  my ($dir) = @_;

  my @subdirs = ();

  opendir( DIR, $dir ) or die "opendir $dir: $!\n";

  while ( defined ( my $entry = readdir( DIR ) ) ) {

    next if ( $entry =~ /^\./ );

    my $path = "$dir/$entry";

    if ( -f $path and $entry =~ /\.log$/ ) {
      readLogFile( $path );
    } elsif ( -d $path ) {
      push @subdirs, $path;
    };

  };

  closedir( DIR );

  foreach my $subdir ( @subdirs ) { readLogDir( $subdir ) };

}

readLogDir( "log_dir" );

ShoulderDaemon fucked around with this message at 01:33 on Jul 10, 2008

MrHyde
Dec 17, 2002

Hello, Ladies

ShoulderDaemon posted:

You need to prepend $filename with its parent directory; if you open a directory foo and readdir out the entry for the file foo/bar, then the result of the readdir is just bar, and the various file tests won't be able to find it.

Edit: Also, periods are magic in regular expressions, and your regex tests don't do what you think they do. You probably just want a single test, /(^\.)|(\.log$)/, and exclude based on that before bothering to check if the file is a normal or a directory.

Ah, let me edit my post because it looks like those [php] tags I used screwed with my regexes.

Regarding the prepending, why does it work before the recursion? In that instance I haven't prepended the files but they still register as file or directory. Output from this script looks like this:

code:
C:\Documents and Settings\User\Desktop>prog.pl
.
..
.TrillianLogToGraph.pl.swp
config.txt
movies.txt
spock.gif
temp (this is a folder and is identified properly in the first part)
        .
        ..
        New Folder (also a folder, but NOT identified properly...the program never sees the files inside it)
        something.log
trillian.log
TrillianLogToGraph.pl

MrHyde fucked around with this message at 01:29 on Jul 10, 2008

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

MrHyde posted:

Regarding the prepending, why does it work before the recursion? In that instance I haven't prepended the files but they still register as file or directory.

Is the initial directory being searched the same as the current working directory? In that case, the bare entry would be a valid relative path to those files.

MrHyde
Dec 17, 2002

Hello, Ladies

ShoulderDaemon posted:

Is the initial directory being searched the same as the current working directory? In that case, the bare entry would be a valid relative path to those files.

I thought of that and moved the program to another directory and ran it. It still worked the same. The config file for the program is as follows:

code:
#Directory of the log files
[LogFilesDirectory]
C:\Documents and Settings\User\Desktop

#Output directory
[Output]
C:\Documents and Settings\User\Desktop

#Words you wish to search for
[WordList]
the
phrase

quote:

Edit 3: And I'm not sure why you have essentially the entire readSubDir function copied literally in your main code path. I'd probably restructure what you've done like so:

It started as a "learn how to list contents of directories" and became "now do it recursively". It's a learning work in progress, not really nice and clean.

MrHyde fucked around with this message at 01:46 on Jul 10, 2008

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

MrHyde posted:

I thought of that and moved the program to another directory and ran it. It still worked the same. The config file for the program is as follows:

Doesn't matter what directory the program's in; it matters what directory is the current working directory when you run the program.

MrHyde
Dec 17, 2002

Hello, Ladies

ShoulderDaemon posted:

Doesn't matter what directory the program's in; it matters what directory is the current working directory when you run the program.

I ran it from command line when it was on my desktop like so:
code:
C:\Documents and Settings\User\Desktop>prog.pl
then moved it and ran it like so:
code:
C:\>prog.pl
ok, cleaned it up

code:
use strict;
use warnings;

my $configFile = "config.txt";
my $logDirectory = "";
my $outputDirectory = "";
my @word_list = ();


&readConfigFile($configFile);
&readSubDir($logDirectory);

sub readConfigFile
{
	open(CONFIG, $_[0]) || die("Unable to find config file");
	my @config_data=<CONFIG>;
	close(CONFIG);

	my $setting = '';
	foreach my $configLine (@config_data)
	{
		#if you found a comment or an empty line get next line 
		if($configLine =~ /^#/ || $configLine =~ /^$/)
		{
			next;
		}
		#if you found a config section line make note then get the next line
		if($configLine =~ /^\[(.*)\]/)
		{
			$setting = $1;
			next;
		}
		#once you're in the config, get all the lines in that config
		for($setting)
		{
			/LogFilesDirectory/ && do
			{
				$logDirectory = $configLine;
				chomp($logDirectory);
			};
			/Output/ && do
			{
				$outputDirectory = $configLine;
				chomp($outputDirectory);
			};
			/WordList/ && do
			{
				push(@word_list, $configLine);
			};
		}
	}
}

sub readSubDir
{
	opendir(LOGDIR, $_[0]) || die("Couldn't open $_[0]");
	my @dirList = readdir(LOGDIR);
	closedir(LOGDIR);
	foreach my $fileName (@dirList)
	{
		if(-f "$_[0]/$fileName")
		{
			if($fileName =~ /^.*\.log$/)
			{
				print "*";
			}
			print "\t$fileName\n";
		}
		if(-d "$_[0]/$fileName" && $fileName !~ /\./ && $fileName !~ /\.\./)
		#consider using $fileName ne "." && $fileName ne ".."
		{
			print "\tD-$_[0]/$fileName\n";
			&readSubDir("$_[0]/$fileName");
		}
	}
}

sub readLogFile
{
	
}
Thanks for the help ShoulderDaemon. The file/directory names definitely had to be full paths which messed it up. I'm not sure why it worked at all before I made the changes. I'm sure it has something to do with the working directory stuff I guess I just didn't get.

MrHyde fucked around with this message at 02:14 on Jul 10, 2008

Ninja Rope
Oct 22, 2005

Wee.
A couple of comments:

code:
my @word_list = ();
This is unnecessary. While assigning "" to scalars does have an effect (they default to undef), arrays already start out empty. It doesn't appear that your program relies on your scalars being initialized to empty strings, so you can probably remove those too.

code:
&readConfigFile($configFile);
&readSubDir($logDirectory);
Using ampersands on function calls is outdated. Just call them as readConfigFile($configFile);.

code:
sub readConfigFile
{
	open(CONFIG, $_[0]) || die("Unable to find config file");
It's bad practice to use the 2-parameter form of open() this way. If $_[0] contains something like ">/etc/passwd", your program will overwrite /etc/passwd. Obviously that's not much of a concern here, but it's good to get into the habit of doing it the safe way. Try open(CONFIG, '<', $_[0]);.

Additionally, all-caps bareword filehandles are outdated (though still usable). Unless you've got a really good reason to use one, it's probably better to use a simple scalar variable: something like my $config_fh; open($config_fh, '<', $_[0]).

In general, you should assign local names to all parameters in a sub, sort of like this:

code:
sub thing {
 my ( $config_file_name, $generic_parameter_1, @other_junk ) = @_;
 # Now use $config_file_name and such here, instead of $_[0].
}
This is mainly just for clarity, but it also makes life easier when you end up changing the sub's parameters. If you end up changing anything in @_ inside a sub, it will change the variables passed into the function (as if all elements of @_ were secretly references to the original data), so be careful when using @_ directly.
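A quick demonstration of that @_ aliasing (increment_in_place is a made-up name):

code:
use strict;
use warnings;

sub increment_in_place {
    # Elements of @_ alias the caller's variables, so writing
    # to $_[0] changes the variable that was passed in.
    $_[0]++;
}

my $count = 5;
increment_in_place($count);
print "$count\n";    # 6
Copying @_ into lexicals at the top of the sub, as above, avoids this surprise entirely.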

code:
	my $setting = '';
Again, no need to initialize to an empty string here.

code:
		#once you're in the config, get all the lines in that config
		for($setting)
		{
			/LogFilesDirectory/ && do
			{
				$logDirectory = $configLine;
				chomp($logDirectory);
			};
			/Output/ && do
			{
				$outputDirectory = $configLine;
				chomp($outputDirectory);
			};
			/WordList/ && do
			{
				push(@word_list, $configLine);
			};
		}
Stylistically I don't care for this at all. I think it's obtuse and confusing to see for used this way, and it would be clearer to simply match against $setting directly. This is valid Perl and I'm sure it works fine; I just dislike the idiom because it's not intuitive.

While it is valid here, I really dislike relying on the implicit $_ when it can be avoided. There are side effects to using $_ in loops that may crop up if you're not aware of them, but mostly it's much clearer, I believe, to use the foreach my $var (@stuff) form, which puts the loop data in $var, like you did above. I would also suggest changing /WordList/ && do to a regular if statement.

Finally, I greatly dislike the use of do {} in this context (or as error handling for failed function calls, like open(...) or do { die; }). I prefer a simple if statement. Again, technically valid, but I think it can be non-intuitive. A lot of the developers I work with disagree.
code:
			if($fileName =~ /^.*\.log$/)
The construct ^.* is unnecessary here; this regular expression would be better written as /\.log$/. The "match anything at the beginning" part is implied. As a rule of thumb, any time you write .* you should think twice, because it is rarely both necessary and correct.

I hope this doesn't seem too strict, but I've found that the above corrections help make programs easier to maintain over time.

MrHyde
Dec 17, 2002

Hello, Ladies

Ninja Rope posted:

Additionally, all-caps filehandles are outdated (but still useful). Unless you've got a really good reason to use one, it's probably better to use a simple scalar variable. Something like my $config_fh; open($config_fh, $_[0]).

What's the difference between the all-caps filehandles and just using a scalar variable?

Ninja Rope posted:

Finally, I greatly dislike the use of do {} in this context (or as error handling for failed function calls, like open(...) or do { die; }. I perfer a simple if statement. Again, technically valid, but I think it can be non-intuitive. A lot of the developers I work with disagree.

So are you saying instead of the open(...) or do { die; } (it's what I use in PHP so I just intuitively used it here) I should just put the open statement inside an if? Would I just put the {die;} portion of the code inside an else then?


Ninja Rope posted:

I hope this doesn't seem to strict, but I've found that the above corrections help make programs that are easier to maintain over time.

Nah, I'm trying to learn so this is great feedback. I'm picking most of the stuff I'm doing up off websites and whatnot (haven't gotten around to picking up a book) so I'm sure some of it may be ill advised or outdated. The comments are much appreciated.

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?
Goodness, I didn't know this was crit Perl time... Anyway, scalars are far, far superior to bareword filehandles (symbol thingies?) because the handle closes automatically when the scalar goes out of scope. And bareword filehandles are global in scope (generally bad), while scalars declared with my() have the most limited scope.

Small/limited scope is good.

code:
{
  open my($fh), 'foo.txt' or die;
  process($fh);
}

# $fh is already closed and dead! :)

Midelne
Jun 19, 2002

I shouldn't trust the phones. They're full of gas.

MrHyde posted:

So are you saying instead of the open(...) or do { die; } (it's what I use in PHP so I just intuitively used it here) I should just put the open statement inside an if? Would I just put the {die;} portion of the code inside an else then?

I believe he's referring to open(CONFIG, $_[0]), suggesting that you instead explicitly open the file only for input using open(CONFIG, '<', $_[0]).

The difference is that the first one, as he pointed out, does bad things to what it opens if there was preexisting data.

edit: My first post in the Perl thread! Now I am a man.

Ninja Rope
Oct 22, 2005

Wee.

MrHyde posted:

What's the difference between the all-caps filehandles and just using a scalar variable?

People above covered it well, and it's also discussed in Conway's "Perl Best Practices", which I thought was released for free but I can't seem to find a copy of it. Basically it comes down to scoping and code clarity issues, and I think it's a lot more intuitive to use a scalar like everything else. There's also IO::Handle.

quote:

So are you saying instead of the open(...) or do { die; } (it's what I use in PHP so I just intuitively used it here) I should just put the open statement inside an if? Would I just put the {die;} portion of the code inside an else then?

Yeah. Just use an if statement if there are multiple lines needed to handle your open() (or other function) failures. I'd write it like this:
code:
if ( !open($fh, '<', $var) ) {
 # Error handling code goes here.
}
If you just want to die (or run any other simple statement), this is okay too:
code:
open($fh, '<', $var) || die "Can't open file: $!";
Which is similar to what you have already. I meant the advice for your regular-expression matching code (which uses do {}), but I was also trying to point out that I've seen do {} used elsewhere to handle failures in open. Sorry if I was unclear.

quote:

Nah, I'm trying to learn so this is great feedback. I'm picking most of the stuff I'm doing up off websites and whatnot (haven't gotten around to picking up a book) so I'm sure some of it may be ill advised or outdated. The comments are much appreciated.

Any time. Conway's "Perl Best Practices" is a good book to follow. I think some of what he says is unnecessary, but I would say I agree with 95% of it and it's one of the best books for learning the right way to do things in Perl. In general, I try and use code that looks similar to how it would be written in C while still using standard Perl-isms where I can.

Edit:

Triple Tech posted:

code:
{
  open my($fh), 'foo.txt' or die;
  process($fh);
}

# $fh is already closed and dead! :)

This is fine, but I don't think it's really necessary. I'd rather see the explicit close(), simply for consistency and in case things get changed around later and $fh is forgotten about. However, this code is more interesting because you used open() without parentheses along with an or.

This works fine, but changing that or to a || completely changes the meaning of the program, even though or and || apparently do the same thing. In reality, || binds more tightly than or: while or applies to the whole open statement, || applies only to the nearest value, which in this case is 'foo.txt'. Since 'foo.txt' (a non-zero-length string) is always true, code using open without parentheses and || would never die on error, because the || tests 'foo.txt' instead of the return value of open. In this case, or will work with or without parentheses, but || will not.

(I'm not pointing this out because I think you don't know it, Triple Tech, I was pointing it out for anyone else who might not. Also, this is why I always use parentheses around open, and prefer its placement inside an if block.)
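The precedence difference is easy to see with a deliberately missing file (the path here is made up):

code:
use strict;
use warnings;

# Low-precedence 'or' applies to the whole open call, so failure
# is caught as intended:
open my $fh, '<', '/no/such/file' or warn "caught the failure: $!\n";

# High-precedence '||' without parentheses binds to the filename
# instead, so a failed open is silently ignored:
#   open my $fh2, '<', '/no/such/file' || warn "never reached\n";
# parses as:
#   open my $fh2, '<', ('/no/such/file' || warn "never reached\n");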

Ninja Rope fucked around with this message at 06:59 on Jul 11, 2008

<deleted user>

MrHyde posted:

What's the difference between the all-caps filehandles and just using a scalar variable?

Perl uses something called a "typeglob" (a "GV" internally) to keep track of things defined in a package for a given name. The "things" can be scalars, arrays, hashes, filehandles, formats, subs, etc. Think of a typeglob as "all things in this package with this name". The typeglob is what allows you to have both $foo and @foo and %foo and a filehandle named foo.

Perl uses context to determine what type of thing you refer to.

code:
package main;
use strict;
use warnings;

# perl expects that STDOUT is a filehandle because that is
# what print expects (print provides the filehandle context).
print STDOUT "hello\n";

# in this context, perl thinks the same bareword is a sub!
{
   no strict 'subs';
   no warnings;
   STDOUT;  # same as &STDOUT()
}
STDOUT could be any name. The point is, perl associates the name to a thing of some type. STDOUT is only a filehandle for print because that is what it expects.

Now let's get really weird. In the same script, add:

code:
my $STDOUT = "WTF";
my @STDOUT = qw/1 2 3/;
# prints "WTF3" to stdout
print STDOUT $STDOUT . @STDOUT . "\n";
Again, print wants a filehandle first, so that is how it treats the bareword STDOUT. After that, the scalar $STDOUT is seen, and then @STDOUT (in scalar context). All of these exist at the same time because they are different types of things referred to by the same name in the symbol table.

Want more?

You can use * to refer directly to a typeglob.

code:
# we'll use this soon...
sub foo { print "foo as sub got ${_[0]}" }

# scalar reference to typeglob
my $foo = \*STDOUT;

# print scalar value of $foo to STDOUT
print $foo "$foo\n";

# now see if you can figure these out...
*main::fuz = *main::foo;
fuz($foo);
fuz(*foo);

This is actually a useful idiom (especially when you bring local into the mix), but I'll stop there. Also be aware there is a *FOO{TYPE} syntax that allows you to refer to a thing of a specific type.

Anyway, the reason a bareword filehandle is frowned upon is that you are creating a thing of a certain type by implied context in package scope. Using a scalar reference is more explicit and has lexical scope.

<deleted user>
open implies the '<' mode if none is specified. It won't ever clobber existing files unless specifically instructed to do so.

<deleted user>

Triple Tech posted:

Goodness, I didn't know this was crit Perl time... Anyway, scalars are far, far superior to filehandles (symbol thingies?) because they will automatically close when the scalar goes out of scope. And, filehandles are global in scope (generally bad) while scalars, defined with my(), are the most limited in scope.

Note that open assigns a filehandle reference to the scalar if one is passed. The scalar itself is not a filehandle.

quote:

Small/limited scope is good.

code:
{
  open my($fh), 'foo.txt' or die;
  process($fh);
}

# $fh is already closed and dead! :)

That's a bit useless because $fh only exists within the open() call and would go out of scope by the time process() was called.

(edit: totally wrong, ignore me)

Filburt Shellbach
Nov 6, 2007

Apni tackat say tujay aaj mitta juu gaa!

genericadmin posted:


Triple Tech posted:

code:
{
  open my($fh), 'foo.txt' or die;
  process($fh);
}

That's a bit useless because $fh only exists within the open() call and would go out of scope by the time process() was called.

That's not true. Function arguments don't create a new lexical scope, so $fh is in scope from the open call to the closing brace. Basically, anywhere you can have a value, you can my a variable on the spot and you're probably good to go.

<deleted user>

Sartak posted:

That's not true.

Sorry about that... I never use my that way so it just looked wrong.

code:
*genericadmin::post = sub { think() && logout() };

Filburt Shellbach
Nov 6, 2007

Apni tackat say tujay aaj mitta juu gaa!
I've got one more for you!

genericadmin posted:

open implies the '<' mode if none is specified. It won't ever clobber existing files unless specifically instructed to do so.

If the user has enough control over the filename, they can do some damage, because two-arg open looks at the first few characters of the filename to figure out what mode to open the file in. Three-arg open is not susceptible to any of these:

code:
my $filename = ">credit-cards.sqlite";
open(HANDLE, $filename);
That opens your credit card database for writing, which first clears its contents!

It gets even better: because open can also attach a handle to another program via a pipe, the user can run arbitrary programs as you:

code:
my $filename = "|echo MWA HA HA"; # or better yet, "|rm *"..
open(HANDLE, $filename);
I play it safe and always use three-arg open, which in the above examples will try to read from the files named >credit-cards.sqlite and |echo MWA HA HA. These are still annoying, but they can't really hurt you.
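A sketch of that safety difference (the filename is the hypothetical one from the example above): with three-arg open the mode is explicit, so a leading '>' is just part of a literal filename instead of a truncate instruction.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $filename = ">credit-cards.sqlite";

# Two-arg open would parse the leading '>' as "open for writing" and
# truncate credit-cards.sqlite. With three args the mode is explicit,
# so '>' is treated as part of the (unlikely) literal filename.
if (open(my $fh, '<', $filename)) {
    print while <$fh>;
    close $fh;
} else {
    # No file literally named ">credit-cards.sqlite" exists, so we get
    # a harmless error instead of a clobbered database.
    warn "can't open '$filename': $!\n";
}
```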

<deleted user>

Sartak posted:

I've got one more for you!

Right, I thought someone above was saying that using the two-arg form with no mode was bad because it could clobber an existing file.

Also... does anyone in this thread do much XS?

Subotai
Jan 24, 2004

genericadmin posted:

Also... does anyone in this thread do much XS?

I've done some in XS. It was like a year and a half ago though.

Ninja Rope
Oct 22, 2005

Wee.

genericadmin posted:

Right, I thought someone above was saying that using the two-arg form with no mode was bad because it could clobber an existing file.

Also... does anyone in this thread do much XS?

That was me. I was saying that using the two-arg form of open is unsafe when the second argument is under the user's control. Obviously if the second argument is properly cleaned, or not under the user's control, then it is not an issue, but it never hurts to be more explicit and use the three-argument form.

I've done some XS but I wouldn't say I'm an expert. The last XS I did was creating a Perl wrapper around the HP Open View API. It worked out pretty well, considering.

<deleted user>

quote:

I've done some XS

I hooked $SIG{__WARN__} to call pstack and was able to answer my own question. I wasn't handling an undef SV* properly.
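For anyone curious, a pure-Perl analogue of that trick (pstack dumps the C-level stack from outside the process; Carp::cluck, from the core Carp module, gives you the Perl-level backtrace from inside it):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Carp ();

# Route every warning through a handler that appends a Perl-level
# stack trace, so you can see where the warning originated.
local $SIG{__WARN__} = sub {
    my ($msg) = @_;
    Carp::cluck("caught warning: $msg");
};

sub inner { warn "something odd\n" }
sub outer { inner() }
outer();    # the cluck output shows inner() called from outer()
```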

<deleted user>
Is anyone familiar with how fields.pm performs since the 5.009 changes? I don't have a perl > 5.8.8 anywhere to test.

These are benchmark results for 5.8.5 for creating vanilla objects and accessing an instance attribute.

no_fields: simple bless({}, $class) constructor
restricted_hash: same, but uses Hash::Util::lock_keys() after bless
fields_untyped: constructor has "my $self = shift"
fields_typed: constructor has "my MyClass $self = shift"

code:
bench object creation... ($sv = $foo->{new})
                     Rate fields_untyped fields_typed restricted_hash  no_fields
restricted_hash 14165/s              --           -84%         -85%         -86%
fields_untyped  90334/s            538%             --          -7%          -8%
fields_typed    96863/s            584%             7%           --          -2%
no_fields       98646/s            596%             9%           2%           --
Looks like locking hash keys is expensive... I wonder if it's better in >= 5.009?

code:
bench attribute access ($a = $foo->{bar})...

                     Rate fields_typed fields_untyped restricted_hash  no_fields
fields_typed    1268666/s           --            -6%            -27%       -35%
fields_untyped  1349928/s           6%             --            -23%       -30%
restricted_hash 1743672/s          37%            29%              --       -10%
no_fields       1938101/s          53%            44%             11%         --
Kind of funny that untyped lexicals are faster than typed, because I thought there were opcode-level optimizations when fields is used with a typed lexical. It's repeatably slower by 5-10%, though. At least lookups on restricted hash keys are not much slower than on a plain blessed hashref.
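For reference, tables in that format come straight out of the core Benchmark module's cmpthese(). A sketch of how the restricted-hash comparison might be set up (the class name and iteration count are made up, not from the original benchmark):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Hash::Util qw(lock_keys);

# Compare plain blessed hashrefs against restricted hashes; cmpthese()
# prints a rate table like the ones quoted above.
cmpthese(100_000, {
    no_fields => sub {
        my $obj = bless { bar => 1 }, 'MyClass';
    },
    restricted_hash => sub {
        my $obj = bless { bar => 1 }, 'MyClass';
        lock_keys(%$obj);
    },
});
```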

checkeredshawn
Jul 16, 2007

I'm looking for a better way to compare values within a hash than the method I'm currently using. It goes through the hash and compares values by key, skipping comparisons of a key against itself. I've searched CPAN and Google for modules but can't find anything that does this more cleanly.

code:
foreach my $key (keys %servervalues) { 
  foreach my $other (keys %servervalues) {
    if ($key ne $other) { 
      if ($servervalues{$key} ne $servervalues{$other}) {
        print "$key differs from $other\n";
      } else { 
        print "$key returns the same result(s) as $other\n";
      }   
    }   
  }
}
Edit: put in code tags
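One common alternative (a sketch, with made-up server names and results): invert the hash so each distinct result string maps to the list of servers that returned it. That groups agreeing servers in one pass and avoids comparing every pair twice.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %servervalues = (
    ns1 => "100.100.100.100 101.101.101.101",
    ns2 => "100.100.100.100 101.101.101.101",
    ns3 => "100.100.100.100 102.102.102.102",
);

# Group servers by identical result strings.
my %by_result;
push @{ $by_result{ $servervalues{$_} } }, $_ for keys %servervalues;

for my $result (sort keys %by_result) {
    my @servers = sort @{ $by_result{$result} };
    if (@servers > 1) {
        print "@servers return the same result(s): $result\n";
    } else {
        print "@servers differs from the rest: $result\n";
    }
}
```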

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?
What, at a higher level, are you trying to accomplish?

checkeredshawn
Jul 16, 2007

Triple Tech posted:

What, at a higher level, are you trying to accomplish?

The keys in the hash are DNS servers, and the values are their results when querying a given host against them.

So a typical key-value pair would look like

"someDNSserver"=>"100.100.100.100 101.101.101.101"

because the results are being stored by pushing onto an array and returning
join " ", @results


Edit: Assuming quoted post was directed to me

checkeredshawn fucked around with this message at 17:52 on Jul 17, 2008

Triple Tech
Jul 28, 2006

So what, are you quitting to join Homo Explosion?
Yes, I'm talking to checkeredshawn. :) So, now we know what the data looks like. What exactly are you trying to determine based off of this data? Are you picking out special elements? Are you deleting duplicates? Are you... doing something?
