  • Locked thread
No Safe Word
Feb 26, 2005

We never really did a great job of packaging the IRC core as a library, but you can use the decade-old-but-still-pretty-good IRC bot project that I was part of (though didn't write a lot of the core stuff): Supybot

It's fairly complete, and at least still sort of marginally maintained even though we don't really set any sort of release goals anymore (nor really have in half a decade). Still gets a few hundred downloads a month anyway :v:


ArcticZombie
Sep 15, 2010
Very new to Python, trying to learn it so I can automate some stuff I'd rather not do. First on my list is carrying out multiple regex replacements on a given text file. I can do this with single line regex no problem by looping through the file line by line and looking for a match. This obviously doesn't work when I'm trying to match multiline regex, any ideas how I can go through a file and do this? I'd like to avoid reading the whole file into memory at once if possible.

Emacs Headroom
Aug 2, 2003

ArcticZombie posted:

Very new to Python, trying to learn it so I can automate some stuff I'd rather not do. First on my list is carrying out multiple regex replacements on a given text file. I can do this with single line regex no problem by looping through the file line by line and looking for a match. This obviously doesn't work when I'm trying to match multiline regex, any ideas how I can go through a file and do this? I'd like to avoid reading the whole file into memory at once if possible.

Do you know how many lines it will match over maximally? You can read in that many lines and run normal regex that way.

If it's a huge number of lines and you don't want to have it all in memory, you can also just make a DFA by hand to check for matches.

SwimNurd
Oct 28, 2007

mememememe

No Safe Word posted:

We never really did a great job of packaging the IRC core as a library, but you can use the decade-old-but-still-pretty-good IRC bot project that I was part of (though didn't write a lot of the core stuff): Supybot

It's fairly complete, and at least still sort of marginally maintained even though we don't really set any sort of release goals anymore (nor really have in half a decade). Still gets a few hundred downloads a month anyway :v:

If you are writing a Python IRC bot, this is by far the best framework. For the most part it is still actively developed and should work with Python 3, though you may need to pull from the git repo. I have been writing plugins for this framework for years; it is nice.

ArcticZombie
Sep 15, 2010
The number of lines to be matched is undetermined; it will vary between matches in the file and also between files. I'm not sure what you mean by DFA, sorry.

Related question, what's the best way to actually write these replacements to the file? In its current state the script just reads the whole file, does a replacement, writes the string to a file, reads that file back in to look for the next match for as many times as there are matches, then repeats the process for the next pattern. This seems like a bad way to go about it.

raminasi
Jan 25, 2005

a last drink with no ice

ArcticZombie posted:

The number of lines to be matched is undetermined; it will vary between matches in the file and also between files. I'm not sure what you mean by DFA, sorry.

Related question, what's the best way to actually write these replacements to the file? In its current state the script just reads the whole file, does a replacement, writes the string to a file, reads that file back in to look for the next match for as many times as there are matches, then repeats the process for the next pattern. This seems like a bad way to go about it.

Regular expressions are a family of miniature languages that are used to succinctly define theoretical "machines" known as (D)eterministic (F)inite (A)utomata. That's what regular expressions "do;" when you're "solving a problem with a regular expression" you're actually solving it with some DFA defined by the regular expression you use. Emacs Headroom is suggesting that you build this DFA by hand instead of abbreviating it with a regular expression because the regular expression machinery available to you is apparently not expressive enough. However, since you don't actually know what a DFA is, you might as well read the advice as "just don't use regular expressions."

e: I'm neither endorsing nor arguing with this advice, as text-munging isn't my specialty
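
To make "build the DFA by hand" a bit more concrete, here is a minimal sketch of a hand-written state machine that scans one character at a time and so never needs the whole file in memory. The pattern (dropping `/* ... */` spans that may cross line boundaries) and the name `strip_block_comments` are purely illustrative assumptions, not ArcticZombie's actual replacements.

```python
def strip_block_comments(chars):
    """Yield the characters of `chars`, dropping every /* ... */ span.

    `chars` can be any iterable of single characters, so it works on a
    stream. The four states below are the whole "machine".
    """
    state = "TEXT"
    for ch in chars:
        if state == "TEXT":
            if ch == "/":
                state = "SLASH"          # might be the start of a comment
            else:
                yield ch
        elif state == "SLASH":
            if ch == "*":
                state = "COMMENT"        # saw "/*": start dropping input
            elif ch == "/":
                yield "/"                # emit the earlier slash, keep this one pending
            else:
                yield "/"                # false alarm: emit both characters
                yield ch
                state = "TEXT"
        elif state == "COMMENT":
            if ch == "*":
                state = "STAR"           # might be the start of "*/"
        elif state == "STAR":
            if ch == "/":
                state = "TEXT"           # saw "*/": comment is over
            elif ch != "*":
                state = "COMMENT"
    if state == "SLASH":
        yield "/"                        # input ended on a lone slash
    # An unterminated comment is silently dropped in this sketch.
```

Feeding it a file lazily could look like `"".join(strip_block_comments(iter(lambda: f.read(1), "")))`, though in practice you would write the output in pieces rather than join it.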

BigRedDot
Mar 6, 2008

ArcticZombie posted:

The number of lines to be matched is undetermined
This is what makes your problem harder. If you could put a bound, any bound, on the number of lines, then you could read in chunks that overlap by that amount and process them one at a time. If there is truly no bound, then in particular, one case is that the entire file matches. As a consequence, in order to cover this case the standard regex library will require access to the entire file in memory. Another option might be to read in a buffer and keep extending it until you find a match. But such a procedure is effectively non-greedy: it will always find the shortest match, and maybe that's not what you want. And again there is a corner case where it will extend to the entire file if there are no matches (or the whole file is the match).

If your pattern is simple to match "by hand", ie by scanning the text character by character yourself, then that may be your only option, and you may have to resort to cython or other tools to get the performance you require.
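
A different technique from the ones suggested above, offered only as a sketch: Python's `re` module accepts any bytes-like object, including a memory-mapped file, so the regex engine can see the whole file while the operating system pages it in on demand instead of Python holding one big string. The function name `replace_streaming` and the example pattern are assumptions for illustration. The pattern and replacement must be bytes, and an unbounded match can still span the whole mapping.

```python
import mmap
import re


def replace_streaming(src_path, dst_path, pattern, replacement):
    """Copy src_path to dst_path, replacing each match of a bytes regex.

    The source is memory-mapped rather than read into a Python string;
    only the pieces between matches are copied out, one at a time.
    Note: mmap raises ValueError for an empty file.
    """
    regex = re.compile(pattern, re.DOTALL)  # DOTALL lets "." cross newlines
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        with mmap.mmap(src.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            pos = 0
            for match in regex.finditer(mm):
                dst.write(mm[pos:match.start()])      # untouched text before the match
                dst.write(match.expand(replacement))  # supports \1-style group references
                pos = match.end()
            dst.write(mm[pos:])                       # tail after the last match
```

Usage would be along the lines of `replace_streaming("in.txt", "out.txt", rb"BEGIN.*?END", b"<block>")`, where both file names and the pattern are placeholders.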

No Safe Word
Feb 26, 2005

SwimNurd posted:

If you are writing a Python IRC bot, this is by far the best framework. For the most part it is still actively developed and should work with Python 3, though you may need to pull from the git repo. I have been writing plugins for this framework for years; it is nice.

You'll definitely want to pull from the git repo or a nightly. The latest release is 3+ years old and we've definitely done some stuff since then, mostly accepting patches from others. But the latest download doesn't have things like relative imports and all that fun stuff.

ArcticZombie
Sep 15, 2010
Alright, well, thanks for your help guys. I'll just settle for reading the whole file; they aren't TOO large. In the meantime I did come up with a less :downs: method for handling the output.
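
For the read-the-whole-file route, a minimal sketch of doing it in one pass: read once, run every pattern over the string in memory, write once. The patterns in `REPLACEMENTS` and the function names are hypothetical stand-ins, not the actual replacements from the script being discussed.

```python
import re

# Hypothetical (pattern, replacement) pairs, applied in order.
REPLACEMENTS = [
    # DOTALL lets "." match newlines, so this can span several lines.
    (re.compile(r"BEGIN.*?END", re.DOTALL), "<block>"),
    # MULTILINE makes "$" match at the end of every line, not just the last.
    (re.compile(r"[ \t]+$", re.MULTILINE), ""),
]


def apply_replacements(text, replacements=REPLACEMENTS):
    """Run every replacement over the text, entirely in memory."""
    for regex, repl in replacements:
        text = regex.sub(repl, text)
    return text


def rewrite_file(path, replacements=REPLACEMENTS):
    """Read the file once, apply all replacements, write it back once."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(apply_replacements(text, replacements))
```

`regex.sub` already replaces every match of a pattern in a single call, so there is no need to re-read the file between matches or between patterns.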

bigmandan
Sep 11, 2001

lol internet
College Slice
I have a project I'm working on and I would like to make sure I'm not setting myself up for too many headaches down the road.

I have multiple clients (web, log services, etc..) that need to communicate with several networking devices (DSLAMS if you're curious). The thing with these devices is they are pretty finicky and ideally I only want one socket connection open to them at a time. The current solution (PHP ugh) is working fine for now, but it needs some babysitting due to too many simultaneous connections to the DSLAMS. So I'm trying to write a "command dispatch server" that will allow multiple clients to queue up commands and get the response.

I was thinking of using a threaded socketserver to handle incoming client connections and a pool of processes from multiprocessing to handle each of the network devices.

I have put together some concept code (in python 3.3) to illustrate my idea:
code:
import json
import time
import threading
import socketserver
from multiprocessing import Pipe, Queue, Process

class RequestHandler(socketserver.BaseRequestHandler):
    """
    request handler for the server...
    """
    def handle(self):        
        self.conn_local, self.conn_dslam = Pipe()
        
        # get the json message and route to the correct DSLAM
        msg = json.loads(str(self.request.recv(1024), 'ascii'))
        answer = self.route(msg)        
        
        # respond to the client
        self.request.sendall(bytes(json.dumps(answer),'ascii'))
        
        self.conn_local.close()
        self.conn_dslam.close()
    
    def route(self,msg):
        
        dslam_id = (msg['dslam']['clli'], msg['dslam']['type'], msg['dslam']['shelf'])
        
        if dslam_id in self.server.dslams:
            self.server.dslams[dslam_id]['queue'].put(self.conn_dslam)
            self.conn_local.send(msg)
            return self.conn_local.recv() # should be a dict
        else:
            return {"error": "not implemented or process not running"}
        
                
class Server(socketserver.ThreadingTCPServer):
    """
    basic socket server with a dict of dslams...
    """    
    def __init__(self, server_address, RequestHandlerClass, dslams):
        socketserver.TCPServer.__init__(self, server_address, RequestHandlerClass)        
        self.dslams = dslams
        

class Dslam(Process):
    """
    Class to handle communication to a DSLAM... 
    Separate process to ensure only one connection to the DSLAM is made
    and to queue requests from multiple clients.  Additionally, the process 
    will handle events from the DSLAM
    """
        
    def __init__(self, queue):
        """
        @param queue: Queue of multiprocessing.Connection. The Connection is another end of a Pipe 
                      created in a threaded SocketServer RequestHandler        
        """
        Process.__init__(self)
        self.q = queue
        # socket to the dslam here...
                                
    def run(self):
        while True:
            conn = self.q.get()
            data = conn.recv() # dict
            
            #
            # code to handle request....
            #
            
            resp = {"response": "Got Message {}: {}".format(self.name, data['command'])}
            conn.send(resp)
                        
    
if __name__ == "__main__":
    
    # setup the DSLAM processes, ideally these will be loaded 
    # from a config file or database
    # production will have about 20-30...    
    adslam = ('CLLICODE','TYPE','1') #(clli, type, shelf_id)
    dslams = {}
    dslams[adslam] = {}
    dslams[adslam]['queue'] = Queue()
    dslams[adslam]['process'] = Dslam(dslams[adslam]['queue'])
    dslams[adslam]['process'].daemon = True
    dslams[adslam]['process'].name = 'ADSL DSLAM 1'
    dslams[adslam]['process'].start()
    
    # Start a threaded socket server...     
    server = Server(("localhost", 9000), RequestHandler, dslams)
    server_thread = threading.Thread(target=server.serve_forever)
    server_thread.daemon = True
    server_thread.start()
    print("server running... :", server_thread.name)
        
    while True:
        time.sleep(5)
        print("waiting for requests...")
Thoughts?

Scaevolus
Apr 16, 2007

bigmandan posted:

I was thinking of using a threaded socketserver to handle incoming client connections and a pool of processes from multiprocessing to handle each of the network devices.
You don't need multiprocessing to handle multiple network connections. Threads will work fine-- socket operations like recv() switch to the next thread when they block.

bigmandan
Sep 11, 2001

lol internet
College Slice
I should have mentioned a few more things. When the process is not serving commands from clients it will be polling the devices to act on arbitrary data. Due to the nature of most of the interfaces I'll be connecting to, I'll need to use non-blocking IO, so I thought using processes would be a better fit. Still use threading?

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

Threading will still work just fine. You would want to use multiprocessing if you wanted to achieve true parallelism for large-scale, CPU-bound number crunching.

bigmandan
Sep 11, 2001

lol internet
College Slice
If I use threading, should I consider using something other than Pipe to make sure the caller gets the correct response?
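
One common shape for the threaded version, sketched here under assumptions: each device gets a single worker thread that owns its connection, and every request carries its own one-slot reply queue, so a caller can only ever receive the answer to its own command. The class name `DeviceWorker`, the `submit` method and the fake response are hypothetical; the real device I/O would replace the placeholder line.

```python
import queue
import threading


class DeviceWorker(threading.Thread):
    """Owns the single connection to one device and serialises commands."""

    def __init__(self, name):
        super().__init__(name=name, daemon=True)
        self.requests = queue.Queue()

    def submit(self, command, timeout=None):
        """Called from any handler thread; blocks until this command's reply."""
        reply = queue.Queue(maxsize=1)      # private to this request
        self.requests.put((command, reply))
        return reply.get(timeout=timeout)   # raises queue.Empty on timeout

    def run(self):
        while True:
            command, reply = self.requests.get()
            # Placeholder for the real exchange with the device.
            reply.put({"response": "{} handled {}".format(self.name, command)})
```

A request handler would then call something like `workers[dslam_id].submit(msg)` instead of pushing one end of a `Pipe` onto a `multiprocessing.Queue`.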

Jose Cuervo
Aug 25, 2004
I have 3 lists where the first two lists (LIST1 and LIST2) are lists of numbers, and the third list (LIST3) is a list of booleans. All lists have the same number of elements. I want to compute the average difference between the corresponding elements in LIST1 and LIST2 for those elements whose corresponding entry in LIST3 is True.

Is there a Pythonic way to accomplish this (i.e. not using for loops to iterate over each list)?

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

Hey folks, I've recently replaced one of our build scripts at work (once an awful Windows batch script) with a shiny new Python one. It basically handles patching a bunch of embedded binaries with some extra functionality.

Here is an example where i'm calling out to another process:
code:
def callAssembler(patchfile, objfile):
    command = ['sh5tools/sh64-elf-as', '-dsp', '-o', objfile, patchfile]
    print " ".join(command)
    return subprocess.call(command)
This function gets called from within a loop, on various different files. It is supposed to print out what it's doing before it performs the call, but when i run this script on our build server (Jenkins), I see the output from 8 or so subprocess calls, and then 8 in a row of the print statements which should have come before each individual call. Call is supposed to be blocking right? I don't understand how this can happen.

No Safe Word
Feb 26, 2005

Jose Cuervo posted:

I have 3 lists where the first two lists (LIST1 and LIST2) are lists of numbers, and the third list (LIST3) is a list of booleans. All lists have the same number of elements. I want to compute the average difference between the corresponding elements in LIST1 and LIST2 for those elements whose corresponding entry in LIST3 is True.

Is there a Pythonic way to accomplish this (i.e. not using for loops to iterate over each list)?

itertools owns

code:
>>> list1 = [1, 2, 3, 4, 5, 6]
>>> list2 = [10, 10, 10, 10, 10, 10]
>>> list3 = [True, False, False, True, False, True]
>>> import itertools
>>> [(a-b) if c else None for (a, b, c) in itertools.zip_longest(list1, list2, list3)]
[-9, None, None, -6, None, -4]
Or you can skip the "else None" part if you want and you'll only get the elements where it's able to compute the difference.


e: and obviously, to complete the solution you can just take the average of what you get back using sum and len of course

No Safe Word fucked around with this message at 19:36 on Jan 30, 2013

The Gripper
Sep 14, 2004
i am winner

peepsalot posted:

Hey folks, I've recently replaced one of our build scripts at work (once an awful Windows batch script) with a shiny new Python one. It basically handles patching a bunch of embedded binaries with some extra functionality.

Here is an example where i'm calling out to another process:
code:
def callAssembler(patchfile, objfile):
    command = ['sh5tools/sh64-elf-as', '-dsp', '-o', objfile, patchfile]
    print " ".join(command)
    return subprocess.call(command)
This function gets called from within a loop, on various different files. It is supposed to print out what it's doing before it performs the call, but when i run this script on our build server (Jenkins), I see the output from 8 or so subprocess calls, and then 8 in a row of the print statements which should have come before each individual call. Call is supposed to be blocking right? I don't understand how this can happen.
It's likely that the assembler writes straight to the build's stdout while Python buffers its own output (Python typically block-buffers stdout when it isn't attached to a terminal, as under Jenkins), so the child's output shows up immediately while Python holds on to the print output until later, putting things out of order.

I'm not sure what the ideal solution is, you could try sys.stdout.flush() after your print statements and see if that helps or do something like:
Python code:
def callAssembler(patchfile, objfile):
    command = ['sh5tools/sh64-elf-as', '-dsp', '-o', objfile, patchfile]
    print " ".join(command)
    p = subprocess.Popen(command, stdout=subprocess.PIPE)
    output = p.communicate()[0]
    print output
    return p.returncode
That will suppress output to stdout so you can wait+retrieve process output and print that at your leisure in the right order.
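
The simpler `sys.stdout.flush()` variant mentioned above could be sketched like this. The function name `call_logged` is an assumption; the body sticks to calls that behave the same way on Python 2.7 and 3.

```python
import subprocess
import sys


def call_logged(command):
    """Print the command line, flush it, then run the command and wait."""
    print(" ".join(command))
    # Push Python's buffered output out before the child writes anything
    # directly to the same stdout.
    sys.stdout.flush()
    return subprocess.call(command)
```

For example, `call_logged([sys.executable, "-c", "print('hi')"])` prints the command line first and then `hi`, even when stdout is redirected to a file or a CI log.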

Jose Cuervo
Aug 25, 2004

No Safe Word posted:

itertools owns

code:
>>> list1 = [1, 2, 3, 4, 5, 6]
>>> list2 = [10, 10, 10, 10, 10, 10]
>>> list3 = [True, False, False, True, False, True]
>>> import itertools
>>> [(a-b) if c else None for (a, b, c) in itertools.zip_longest(list1, list2, list3)]
[-9, None, None, -6, None, -4]
Or you can skip the "else None" part if you want and you'll only get the elements where it's able to compute the difference.


e: and obviously, to complete the solution you can just take the average of what you get back using sum and len of course

Thank you. I was looking into numpy (anyone else say this as num-pee and not num-pie?) since it seemed to allow you to perform vector operations(?) (e.g. numpy.array(LIST1) - numpy.array(LIST2)), but I like your solution a lot.

EDIT: Can you tell me what phrase to use when searching for documentation on the "x if y else z" notation?

Jose Cuervo fucked around with this message at 20:59 on Jan 30, 2013

tef
May 30, 2004

-> some l-system crap ->

No Safe Word posted:

itertools owns

I like the bits of code I get to write, when I get to use itertools or functools or collections :3:

Yay
Aug 4, 2007

Jose Cuervo posted:

EDIT: Can you tell me what phrase to use when searching for documentation on the "x if y else z" notation?
"Ternary operator" should do it. The Python docs call it a "conditional expression".

Also yes I pronounce it num-pea. To confound things further, SciPy I pronounce sigh-pie.

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

Why would you call it num pee unless you also say pee thon?

bigmandan
Sep 11, 2001

lol internet
College Slice

Yay posted:

"Ternary operator" should.

Also yes I pronounce it num-pea. To confound things further, SciPy I pronounce sigh-pie.

If you check the SciPy website, that's how they say to pronounce it!

Janitor Prime
Jan 22, 2004

PC LOAD LETTER

What da fuck does that mean

Fun Shoe

bigmandan posted:

If you check the SciPy website, that's how they say to pronounce it!

how else would you say it

taqueso
Mar 8, 2004


:911:
:wookie: :thermidor: :wookie:
:dehumanize:

:pirate::hf::tinfoil:

sigh-pee-why

Janitor Prime
Jan 22, 2004

PC LOAD LETTER

What da fuck does that mean

Fun Shoe
sipee

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

skippy

Surprise T Rex
Apr 9, 2008

Dinosaur Gum
Can we use this thread for more specific python-related stuff? (Maya scripting, etc)?

Related to that, but still fairly general - I have a vague issue with trying to work out how to figure out if two generated squares overlap in X or Y. Any ideas? Just vague suggestions in pseudocode will do, I'm sure I can figure the rest out.

accipter
Sep 12, 2003

Surprise T Rex posted:

Can we use this thread for more specific python-related stuff? (Maya scripting, etc)?

Related to that, but still fairly general - I have a vague issue with trying to work out how to figure out if two generated squares overlap in X or Y. Any ideas? Just vague suggestions in pseudocode will do, I'm sure I can figure the rest out.

I usually use the package Shapely for geometric tests.

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

The Gripper posted:

I'm not sure what the ideal solution is, you could try sys.stdout.flush() after your print statements and see if that helps
Simple enough, works for me, thanks.

Jose Cuervo
Aug 25, 2004

No Safe Word posted:

itertools owns

code:
>>> list1 = [1, 2, 3, 4, 5, 6]
>>> list2 = [10, 10, 10, 10, 10, 10]
>>> list3 = [True, False, False, True, False, True]
>>> import itertools
>>> [(a-b) if c else None for (a, b, c) in itertools.zip_longest(list1, list2, list3)]
[-9, None, None, -6, None, -4]
Or you can skip the "else None" part if you want and you'll only get the elements where it's able to compute the difference.


e: and obviously, to complete the solution you can just take the average of what you get back using sum and len of course

I am not sure if I am missing something obvious, but taking out the "else None" part results in a syntax error. I should also say that I am using izip_longest since I am using Python 2.7. Any thoughts?

Jose Cuervo fucked around with this message at 02:02 on Jan 31, 2013

peepsalot
Apr 24, 2007

        PEEP THIS...
           BITCH!

x if condition else y

is python's version of the ternary operator ?: from C-likes

it doesn't make sense to do it without three operands
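
A small sketch of the distinction being discussed, with made-up values: a conditional expression always produces a value and so always needs its `else`, while a trailing `if` in a list comprehension is a filter that drops items and takes no `else`.

```python
values = [1, 2, 3, 4]

# Conditional expression: one output per input, so "else" is required.
labels = ["even" if v % 2 == 0 else "odd" for v in values]

# Comprehension filter: the trailing "if" skips items entirely.
evens = [v for v in values if v % 2 == 0]
```

Here `labels` has four entries and `evens` has two, which is the difference between keeping `None` placeholders and leaving those positions out.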

Jose Cuervo
Aug 25, 2004

peepsalot posted:

x if condition else y

is python's version of the ternary operator ?: from C-likes

it doesn't make sense to do it without three operands

I understand that, but No Safe Word said that

No Safe Word posted:

Or you can skip the "else None" part if you want and you'll only get the elements where it's able to compute the difference.

and I was trying to have an output vector that did not contain 'None' in the positions where it is unable to compute the difference.

EDIT: I guess I could use it like so:
Python code:
mylist = []
for (a, b, c) in itertools.zip_longest(list1, list2, list3):
    if c:
        mylist.append(b-a)

Jose Cuervo fucked around with this message at 02:19 on Jan 31, 2013

babyeatingpsychopath
Oct 28, 2000
Forum Veteran


Jose Cuervo posted:

I am not sure if I am missing something obvious, but taking out the "else None" part results in a syntax error. I should also say that I am using izip_longest since I am using Python 2.7. Any thoughts?

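These things are called "list comprehensions". They work essentially the same way in Python 2.7 and 3; the syntax error comes from the conditional expression, which always needs its else.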

The way I wrote one after looking at your problem was:
Python code:
[list1[i] - list2[i] for i in range(len(list1)) if list3[i]]
Read up in the docs about list comprehensions. The way I did it doesn't require any additional modules. It also only returns the items for which list3 is true. That's how I read the problem. It's up to you to check that you don't throw any exceptions for out-of-bounds conditions, however.

edit: apparently I read correctly.

BigRedDot
Mar 6, 2008

Why are you guys making this hard.
code:
In [7]: [(a-b) for (a, b, c) in zip(list1, list2, list3) if c]
Out[7]: [-9, -6, -4]
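
To finish the original question (the average of those differences), a minimal sketch using `sum` and `len` as suggested earlier in the thread. The function name `mean_masked_difference` and the choice to return `None` when nothing is selected are assumptions.

```python
def mean_masked_difference(list1, list2, mask):
    """Average of (a - b) over the positions where mask is true."""
    diffs = [a - b for a, b, keep in zip(list1, list2, mask) if keep]
    if not diffs:
        return None  # nothing selected, so there is no average to report
    # float() keeps this a true division on Python 2.7 as well.
    return sum(diffs) / float(len(diffs))
```

With the example lists from above, the selected differences are -9, -6 and -4, so the result is -19/3.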

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

peepsalot posted:

Hey folks, I've recently replaced one of our build scripts at work (once an awful Windows batch script) with a shiny new Python one. It basically handles patching a bunch of embedded binaries with some extra functionality.

Here is an example where i'm calling out to another process:
code:
def callAssembler(patchfile, objfile):
    command = ['sh5tools/sh64-elf-as', '-dsp', '-o', objfile, patchfile]
    print " ".join(command)
    return subprocess.call(command)
This function gets called from within a loop, on various different files. It is supposed to print out what it's doing before it performs the call, but when i run this script on our build server (Jenkins), I see the output from 8 or so subprocess calls, and then 8 in a row of the print statements which should have come before each individual call. Call is supposed to be blocking right? I don't understand how this can happen.

I had the same problem in Jenkins and used sys.stdout.flush() to take care of it.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

MeramJert posted:

Why would you call it num pee unless you also say pee thon?

I hear it as "numpee" in my internal monologue because it resembles the word "numpty" (a dimwitted or bumbling person; a muppet).

BigRedDot posted:

Why are you guys making this hard.
code:
In [7]: [(a-b) for (a, b, c) in zip(list1, list2, list3) if c]
Out[7]: [-9, -6, -4]

This is what I thought. It was said in the question that the lists are all the same length, so zip() is fine.

Surprise T Rex posted:

Can we use this thread for more specific python-related stuff? (Maya scripting, etc)?

Related to that, but still fairly general - I have a vague issue with trying to work out how to figure out if two generated squares overlap in X or Y. Any ideas? Just vague suggestions in pseudocode will do, I'm sure I can figure the rest out.

As long as it's just squares (even just rectangles) with sides orthogonal to the x- and y-axes, you can check by just seeing whether they overlap in both the x- and y-dimensions.

I don't know how you represent your squares in the program, so let's say each square is an object with the attributes left, right, top, bottom (numbers representing where these borders are located). So a square whose top-left corner is at (-2, 6) and whose bottom-right corner is at (1, 3) would have left = -2, right = 1, top = 6, bottom = 3. You have two squares, s1 and s2, and you want to know whether they overlap. I'm assuming they are deemed not to overlap if they just touch at the edges.

Python code:
def rangesoverlap (range1left, range1right, range2left, range2right):
    return range1left <= range2left < range1right or range2left <= range1left < range2right

def rectanglesoverlap (r1, r2):
    return rangesoverlap(r1.left, r1.right, r2.left, r2.right) and rangesoverlap(r1.bottom, r1.top, r2.bottom, r2.top)

overlapping = rectanglesoverlap(s1, s2)
If you need to work with more general shapes or if the rectangles are skewed, you're probably going to want to look at something more complicated & capable. Also this code assumes that your squares are sensible (non-zero width and height, left < right, top > bottom).
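
A self-contained usage sketch of the same idea, assuming squares are stored as a hypothetical `Rect` namedtuple with the `left`, `right`, `top`, `bottom` attributes described above. The range test is written in a shorter form that gives the same answers as long as the rectangles are sensible (left < right, bottom < top).

```python
from collections import namedtuple

# Hypothetical representation; any object with these four attributes works.
Rect = namedtuple("Rect", "left right top bottom")


def ranges_overlap(a_lo, a_hi, b_lo, b_hi):
    """True if the open intervals (a_lo, a_hi) and (b_lo, b_hi) intersect."""
    return a_lo < b_hi and b_lo < a_hi


def rects_overlap(r1, r2):
    """Axis-aligned rectangles overlap only if they overlap in both x and y."""
    return (ranges_overlap(r1.left, r1.right, r2.left, r2.right)
            and ranges_overlap(r1.bottom, r1.top, r2.bottom, r2.top))


s1 = Rect(left=-2, right=1, top=6, bottom=3)   # the example square from above
s2 = Rect(left=0, right=4, top=5, bottom=2)    # cuts into s1's lower-right corner
s3 = Rect(left=1, right=3, top=6, bottom=3)    # only touches s1 along an edge
```

Here `rects_overlap(s1, s2)` is true, while `rects_overlap(s1, s3)` is false because touching edges are deemed not to overlap.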

Hammerite fucked around with this message at 05:44 on Jan 31, 2013

Jose Cuervo
Aug 25, 2004

BigRedDot posted:

Why are you guys making this hard.
code:
In [7]: [(a-b) for (a, b, c) in zip(list1, list2, list3) if c]
Out[7]: [-9, -6, -4]

This worked perfectly. Thank you.

However, I have now also learned about itertools and list comprehensions which is valuable.

tef
May 30, 2004

-> some l-system crap ->

peepsalot posted:

I see the output from 8 or so subprocess calls, and then 8 in a row of the print statements which should have come before each individual call. Call is supposed to be blocking right? I don't understand how this can happen.

See also, running python with -u.

code:
-u     Force  stdin, stdout and stderr to be totally unbuffered.  On systems where it matters, also put stdin,
              stdout and stderr in binary mode.


raminasi
Jan 25, 2005

a last drink with no ice
Is IronPython still being developed?
