Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

dougdrums posted:

BSON is kinda silly. Auditing mongoDB databases is fun (because you can charge more). My DB class in college was 90% mongoDB and also 100% a waste of time.

wait you learned about mongodb in college? that technology has a learning curve of about an hour. what did you do for the rest of the semester?


dougdrums
Feb 25, 2005
CLIENT REQUESTED ELECTRONIC FUNDING RECEIPT (FUNDS NOW)
Oh yeah my thoughts exactly. It was more of a mac troubleshooting class, not unlike the others. Relational algebra got a week.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

I just spent days writing a bunch of github wiki pages.

I think I might have praised this thing before, but the Markdown Navigator plugin for PyCharm (well any Jetbrains product I think) is pretty frickin great for writing markdown documents. It has a bit of a learning curve to get past the basics, but it's neato.

edit: I just remembered that I've posted about this before. oops

Thermopyle fucked around with this message at 17:42 on Nov 27, 2018

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



Wallet posted:

The best way to store a date? Clearly a string, except when you feel like using an ISODate. Want to record whether something is supposed to be on or off? Use a string and set the value to "on" or "off", obviously.

To be fair, I've seen both of these crimes committed in RDBMSs, too.

CarForumPoster
Jun 26, 2013

⚡POWER⚡
I have an easy one that's frustrating me. I want to replace a bunch of strings inside df['col1'][i] with values from df['col2'][i]

I have a column in a dataframe df_maps called name with strings in it:
Bob
Dan
Sean

Another called param1
A
B
C

I have a column called 'html' with some html in it and stuff to replace ($name and $param1)
<h2>$name</h2><span style="color: #008000;"><strong>$param1</strong>

This executes:
code:
for i, row in df_maps.iterrows():
    df_maps['combined'] = df_maps['html']
    df_maps['combined'] = df_maps['combined'].str.replace('$name', df_maps['name'][i])
    print(df_maps['name'][i])
but the output of df_maps.combined[1] in jupyter is:
<h2>$name</h2><span style="color: #008000;"><strong>$param1</strong>
instead of
<h2>Dan</h2><span style="color: #008000;"><strong>$param1</strong>

SurgicalOntologist
Jun 17, 2004

You are re-creating the column 'combined' at every iteration of the loop, essentially resetting it. Try this:

Python code:
df_maps['combined'] = ''
for i, row in df_maps.iterrows():
    # row.name is the row's index label, so look the column up with row['name']
    df_maps.at[i, 'combined'] = row['html'].replace('$name', row['name'])
or
Python code:
def insert_name(row):
    return row['html'].replace('$name', row['name'])


df_maps['combined'] = df_maps.apply(insert_name, axis=1)
(axis=1 is the kwarg that makes apply go row by row instead of column by column)
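Since the placeholders already look like $name and $param1, another option is string.Template from the standard library, which fills in every placeholder in one pass. This is only a sketch on plain dicts standing in for the dataframe rows; fill_template and the sample rows are made up for illustration.

```python
from string import Template

# Stand-in for the rows of df_maps (made-up sample data).
rows = [
    {'name': 'Bob', 'param1': 'A'},
    {'name': 'Dan', 'param1': 'B'},
    {'name': 'Sean', 'param1': 'C'},
]
html = '<h2>$name</h2><span style="color: #008000;"><strong>$param1</strong>'


def fill_template(template_text, values):
    # safe_substitute leaves unknown placeholders alone instead of raising KeyError
    return Template(template_text).safe_substitute(values)


combined = [fill_template(html, row) for row in rows]
print(combined[1])
```

On the dataframe itself, something like df_maps.apply(lambda row: fill_template(row['html'], row.to_dict()), axis=1) should give the same column, though that is untested here.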

SurgicalOntologist fucked around with this message at 21:19 on Nov 27, 2018

CarForumPoster
Jun 26, 2013

⚡POWER⚡

SurgicalOntologist posted:

You are re-creating the column 'combined' at every iteration of the loop, essentially resetting it. Try this:

Python code:
df_maps['combined'] = ''
for i, row in df_maps.iterrows():
    # row.name is the row's index label, so look the column up with row['name']
    df_maps.at[i, 'combined'] = row['html'].replace('$name', row['name'])
or
Python code:
def insert_name(row):
    return row['html'].replace('$name', row['name'])


df_maps['combined'] = df_maps.apply(insert_name, axis=1)
(axis=1 is the kwarg that makes apply go row by row instead of column by column)

I feel like an idiot and that worked.

unpacked robinhood
Feb 18, 2013

by Fluffdaddy
I'm trying to have mypy check my type hints, and..I think I've no idea what I'm doing.

This runs fine:
Python code:
pos: List[Dict] = []
...
if 'users' in a_dict.keys():
   # creating a list of dicts
    users = [{'type': 'user','lng': x['location']['x'], 'lat':x['location']['y']} for x in a_dict['users']]
    for lt in users:
        pos.append(lt)
code:
File.py:83: error: Incompatible types in assignment (expression has type "Dict[str, Any]", variable has type "List[Dict[str, Any]]")
File.py:84: error: Argument 1 to "append" of "list" has incompatible type "List[Dict[str, Any]]"; expected "Dict[Any, Any]"
I'm adding dictionaries as individual values in a list, what am I missing?

e: another troublesome one:
Python code:
a_list = list(filter(lambda x:  urlparse(x['name']).path == '/a/normal/path', another_list))
code:
error: Invalid index type "str" for "str"; expected type "Union[int, slice]"

unpacked robinhood fucked around with this message at 15:12 on Nov 28, 2018

cinci zoo sniper
Mar 15, 2013




First example works just fine for me with strict mypy, so I'm confused what are you doing there. What Python and mypy versions are you using?

Python code:
from typing import List, Dict

pos: List[Dict] = []
a_dict = {'users': [{'location': {'x': 1, 'y': 2}}, {'location': {'x': 3, 'y': 4}}]}

if 'users' in a_dict.keys():
    # creating a list of dicts
    users = [{'type': 'user', 'lng': x['location']['x'], 'lat': x['location']['y']} for x in a_dict['users']]
    for lt in users:
        pos.append(lt)

print(pos)
Also you should replace that for block with pos += users, or pos += [....] altogether.

Second example is not sufficient - what are you typing and how?

cinci zoo sniper fucked around with this message at 15:50 on Nov 28, 2018

Dr Subterfuge
Aug 31, 2005

TIME TO ROC N' ROLL
Another viable alternative for pos += users is pos.extend(users). They're mostly the same thing.
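A quick sketch of where they agree and where plain + differs. The names and sample dicts are made up; the point is that += and extend both change the existing list, while pos = pos + users builds a new one:

```python
pos = [{'type': 'place'}]
alias = pos  # second name for the same list object
users = [{'type': 'user'}]

pos += users  # in place: alias sees the change
same_after_iadd = alias is pos

pos.extend(users)  # also in place
same_after_extend = alias is pos

pos = pos + users  # builds a new list and rebinds pos; alias is left behind
same_after_add = alias is pos

print(len(alias), len(pos))
```

That only matters if something else holds a reference to the list, which is rarely the case in a short script.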

unpacked robinhood
Feb 18, 2013

by Fluffdaddy

cinci zoo sniper posted:

First example works just fine for me with strict mypy, so I'm confused what are you doing there. What Python and mypy versions are you using?
That's python 3.7 and mypy 0.641

cinci zoo sniper posted:

Also you should replace that for block with pos += users, or pos += [....] altogether.

Second example is not sufficient - what are you typing and how?
Replacing the for block with pos.extend(users) seems to do the trick

I should have included the function signature in my second question:

Python code:
def find_resources(self, another_list: List[str])-> Optional[str]:
...
	a_list = list(filter(lambda x:  urlparse(x['name']).path == '/a/normal/path', another_list))
...
e:

yep

unpacked robinhood fucked around with this message at 17:35 on Nov 28, 2018

cinci zoo sniper
Mar 15, 2013




unpacked robinhood posted:

That's python 3.7 and mypy 0.641

Replacing the for block with pos.extend(users) seems to do the trick

This makes literally 0 sense, to be fully honest with you, since that doesn't change anything from type perspective.

unpacked robinhood posted:

I should have included the function signature in my second question:

Python code:
def find_resources(self, another_list: List[str])-> Optional[str]:
...
	a_list = list(filter(lambda x:  urlparse(x['name']).path == '/a/normal/path', another_list))
...

How does a_list relate to jslog, and how does the function's return value relate to a_list? Looking at what you've shown, I also don't see where Union[int, slice] would enter the scene at all.

cinci zoo sniper fucked around with this message at 17:45 on Nov 28, 2018

unpacked robinhood
Feb 18, 2013

by Fluffdaddy
Thanks. I've been trying to replicate my issue on a short standalone example but so far it won't not work.
My question probably can't really be answered without seeing the complete function, which i'm not comfortable posting atm.

Aquarium of Lies
Feb 5, 2005

sad cutie
:justtrans:

she/her
Taco Defender

unpacked robinhood posted:

I'm trying to have mypy check my type hints, and..I think I've no idea what I'm doing.

This runs fine:
Python code:
pos: List[Dict] = []
...
if 'users' in a_dict.keys():
   # creating a list of dicts
    users = [{'type': 'user','lng': x['location']['x'], 'lat':x['location']['y']} for x in a_dict['users']]
    for lt in users:
        pos.append(lt)
code:
File.py:83: error: Incompatible types in assignment (expression has type "Dict[str, Any]", variable has type "List[Dict[str, Any]]")
File.py:84: error: Argument 1 to "append" of "list" has incompatible type "List[Dict[str, Any]]"; expected "Dict[Any, Any]"
I'm adding dictionaries as individual values in a list, what am I missing?

The first error makes it look like you used lt earlier in your function with a different type. If that's the case it also explains why it's inferring lt as a List in the second error.
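A minimal sketch of how that can happen, assuming lt was used earlier in the same scope for a whole list (the sample data is made up). The script runs fine, but mypy should complain on the two marked lines because it keeps the first type it was given for lt:

```python
from typing import Any, Dict, List

pos: List[Dict[str, Any]] = []

# First use: lt is declared as a list of dicts.
lt: List[Dict[str, Any]] = [{'type': 'place', 'lng': 0.0, 'lat': 0.0}]
pos.extend(lt)

users: List[Dict[str, Any]] = [{'type': 'user', 'lng': 1.0, 'lat': 2.0}]

# Second use: the loop rebinds lt to a single dict on each iteration.
for lt in users:    # mypy: incompatible types in assignment
    pos.append(lt)  # mypy: wrong argument type, it still thinks lt is a list

# reveal_type(lt)  # uncomment when running mypy to see what it inferred

print(pos)
```

Renaming the loop variable (for user in users) should make both complaints go away.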

quote:

e: another troublesome one:
Python code:
a_list = list(filter(lambda x:  urlparse(x['name']).path == '/a/normal/path', another_list))
code:
error: Invalid index type "str" for "str"; expected type "Union[int, slice]"

This could be a similar issue where mypy thinks the type for another_list is List[str].

Don't forget about using reveal_type to show what mypy thinks variables are, it can be really useful!
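For the second error, here is a sketch of what the annotation could look like if the items really are dicts with a 'name' key. That is an assumption, since the full function isn't shown, and the return type and sample URLs are made up too:

```python
from typing import Dict, List
from urllib.parse import urlparse


def find_resources(another_list: List[Dict[str, str]]) -> List[Dict[str, str]]:
    # x is a dict here, so x['name'] type-checks; with List[str] it would be
    # indexing a str with a str, which is what the error message describes.
    return [x for x in another_list if urlparse(x['name']).path == '/a/normal/path']


sample = [
    {'name': 'http://example.com/a/normal/path'},
    {'name': 'http://example.com/something/else'},
]
print(find_resources(sample))
```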

QuarkJets
Sep 8, 2008

This discussion is reminding me that I hate lambda and how it always makes code look super ugly. Does anyone else hate lambda?

Wallet
Jun 19, 2006

QuarkJets posted:

This discussion is reminding me that I hate lambda and how it always makes code look super ugly. Does anyone else hate lambda?

Yes. It's yucky.

Dominoes
Sep 20, 2007

QuarkJets posted:

This discussion is reminding me that I hate lambda and how it always makes code look super ugly. Does anyone else hate lambda?
Python's lambda syntax is deliberately clunky.

For example, this above:
Python code:
a_list = list(filter(lambda x:  urlparse(x['name']).path == '/a/normal/path', another_list))
Would look like this in JS:
JavaScript code:
let a_list = another_list.filter(x => urlparse(x.name).path == '/a/normal/path')
Or this in Rust:
Rust code:
let a_list: Vec<_> = another_list.iter().filter(|x| urlparse(x["name"]).path == "/a/normal/path").collect();
Or this in idiomatic Python:
Python code:
a_list = [x for x in another_list if urlparse(x['name']).path == '/a/normal/path']

Dominoes fucked around with this message at 04:12 on Nov 29, 2018

The XKCD Larper
Mar 1, 2009

by Lowtax
I've been using python-mode in Vim. I am working on building a library in a subfolder. Each time I change files within the library, I have to restart vim for it to reload the library. Is there any way to force reloading of the library when I execute in Vim?

LochNessMonster
Feb 3, 2005

I need about three fitty


To improve my python skills (and to be reminded how I still know even a tiniest fraction about it) I figured to give https://adventofcode.com/2018 a shot. While I managed to solve the issue at hand, I figured I'd post it here to get some feedback on it, as well as wondering how to do a particular thing different.

The assignment was to loop through a list of numbers, each with a sign in front that indicates whether to add it to or subtract it from the running total. The goal was to calculate a "frequency", which is why my variables are called "freq" and the like. The script should stop when it runs into the first duplicate frequency. You have to go through the input multiple times, since the first (few?) pass(es) don't produce any duplicates.

I'm aware that using global is bad practise. I can't figure out why the program won't work without it though. When referencing freq_list or duplicate_freq from inside the run_freq_calc() function, it doesn't run into issues, but if I do that with curr_freq it throws an UnboundLocalError: local variable 'curr_freq' referenced before assignment error. I've got no idea why that's happening. Besides this question, any other feedback on my code would be more than welcome.

Python code:
file = "input.txt"
curr_freq = 0
freq_list = []
duplicate_freq = False

def run_freq_calc():
    global curr_freq
    
    with open(file) as f:
        read_data = f.readlines()

    for line in read_data:
        line = line.strip('\n')
        if line[0] == "+":
            curr_freq += int(line[1:])
        else:
            curr_freq += int(line)

        if curr_freq not in freq_list:
            freq_list.append(curr_freq)
        elif curr_freq in freq_list: 
            print(str(curr_freq) + " already exists!!")
            duplicate_freq = True
            exit()

while duplicate_freq == False:
    run_freq_calc()

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

Python's variables are basically key/value pairs in a local dictionary, so when you assign butt = 69 it's adding an entry to that. If you try to reference a variable before you assign it, it's like looking up a key that doesn't exist (+= is basically 3 operations - read, increment, write)

The global keyword basically makes that global value accessible so you can look it up in the global dictionary, or however it works under the hood. You can access it, anyway

(hope the dictionary stuff is still true...)

LochNessMonster
Feb 3, 2005

I need about three fitty


baka kaba posted:

Python's variables are basically key/value pairs in a local dictionary, so when you assign butt = 69 it's adding an entry to that. If you try to reference a variable before you assign it, it's like looking up a key that doesn't exist (+= is basically 3 operations - read, increment, write)

The global keyword basically makes that global value accessible so you can look it up in the global dictionary, or however it works under the hood. You can access it, anyway

(hope the dictionary stuff is still true...)

Thanks for explaining. I can’t believe I’ve never ran into this issue before though.

On Stack Overflow it’s mentioned multiple times that this is not a pythonic way to do things. Any ideas on how I could avoid this?

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



LochNessMonster posted:

Thanks for explaining. I can’t believe I’ve never ran into this issue before though.

On Stack Overflow it’s mentioned multiple times that this is not a pythonic way to do things. Any ideas on how I could avoid this?

Pass it in as, say, starting_freq. I'd say pass in the list, and the duplicate flag, too, tbh.

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

Pass yr stuff in to the function! So you're referring to a parameter that's in scope, and then you can return a result to the calling function instead of updating a global flag

And sorry, I wasn't completely right about the local thing. Basically if you write to a variable somewhere in a function, Python assumes you're creating a local variable with that name and assigning to it. So when you try to read that same variable earlier in the function, before you've assigned to it, Python looks in the local dictionary for something that hasn't been added yet

Adding the global keyword basically tells it to go looking in higher scopes instead. But if all you do in the function is read a global value, it'll go looking automatically, so you don't need to specify global, which is probably why you haven't run into it before. Like I said, += is a read-increment-write operation, so it creates a local variable (because of the write) but the first thing you do with it is attempt a read before it's been set to anything, so you get the error
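A small sketch of all three cases, using a made-up counter variable: reading a global works without any keyword, += on it blows up, and passing the value in avoids the issue entirely.

```python
counter = 0


def read_only():
    # Only reads counter, so Python finds it in the global scope.
    return counter + 1


def broken_increment():
    # The assignment makes counter local to this function, so the read half
    # of += fails: the local name has no value yet.
    counter += 1
    return counter


def increment(value):
    # Pass the value in and return the result: no global needed.
    return value + 1


try:
    broken_increment()
except UnboundLocalError as exc:
    error = exc
    print(type(exc).__name__)

counter = increment(counter)
print(counter, read_only())
```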

SurgicalOntologist
Jun 17, 2004

Instead of relying on the global namespace, pass your data between your functions. In your case, you are not actually using the input/output of the function in any way. Best practice would be to not refer to any variables outside the function, even the ones where it works without global. Instead, pass them in. And then, if you are basically passing the program's entire state in and out of the function, then you could probably improve how you break down your program into functions.

Because I'm a procrastinator I signed up for AOC; here's my answer:

Python code:
def run_freq_calc(filename):
    frequencies_seen = set()
    current_frequency = 0
    for line in iter_lines(filename):
        frequencies_seen.add(current_frequency)
        current_frequency += value_from_line(line)
        if current_frequency in frequencies_seen:
            break
    return current_frequency


def value_from_line(line):
    return int(line.strip('+\n'))


def iter_lines(filename):
    while True:
        with open(filename) as file:
            yield from file


if __name__ == '__main__':
    print(run_freq_calc('input.txt'))

To avoid having to pass frequencies_seen, I separated the I/O so the part that needs frequencies_seen doesn't need to be called repeatedly.

SurgicalOntologist fucked around with this message at 19:00 on Dec 3, 2018

cinci zoo sniper
Mar 15, 2013




My 5 minute take on the exercise:

Python code:
# calc freq until dupl.
from itertools import cycle

frequency_changes = "LochNessMonster.txt"
known_frequencies = {0}
baseline_frequency = 0


def adjust_frequency(baseline: int, adjustment: str) -> int:
    # int() parses the leading sign itself, so eval() isn't needed
    return baseline + int(adjustment)


with open(frequency_changes) as file:
    changes = [line.strip() for line in file if line.strip()]

# cycle() repeats the changes until some frequency comes up a second time
for change in cycle(changes):
    baseline_frequency = adjust_frequency(baseline_frequency, change)
    if baseline_frequency in known_frequencies:
        break
    known_frequencies.add(baseline_frequency)

print(baseline_frequency)
Efb: Basically a lazier and sloppier version of what SurgicalOntologist did.

cinci zoo sniper fucked around with this message at 19:06 on Dec 3, 2018

LochNessMonster
Feb 3, 2005

I need about three fitty


Thanks for all the great examples.

Before posting I did try passing curr_freq to the function but my program then stopped at the 1st value of the 2nd pass of the input file. Made me think that the value for curr_freq was reset to 0 and then the first value of the 2nd pass would always be the first duplicate.

I'm not really sure why it does that. I expected the variable to be updated each time the function is executed, but apparently each iteration of the function takes the (global) curr_freq = 0 as a start again. So just passing it to the function is not enough. While typing this I figured out how to solve it. It's just as dirty as the rest of the code but it does the job.

Python code:
while duplicate_freq == False:
    curr_freq = run_freq_calc(curr_freq)
Now it's time to improve/learn from all the stuff you guys posted and apply it to the next challenges.

SurgicalOntologist
Jun 17, 2004

That's good, next step would be to make duplicate_freq not a global variable either (i.e. return it from the function). Also instead of while duplicate_freq == False it would be idiomatic to do while not duplicate_freq.
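Roughly what that could look like, as a sketch rather than a drop-in replacement: the function takes everything it needs as arguments and returns both the new frequency and the flag. The parameter names and the sample changes are made up, and reading the file is left out.

```python
def run_freq_calc(changes, curr_freq, seen):
    """Run one pass over the changes; return (frequency, found_duplicate)."""
    for change in changes:
        curr_freq += change
        if curr_freq in seen:
            return curr_freq, True
        seen.add(curr_freq)
    return curr_freq, False


changes = [3, 3, 4, -2, -4]  # made-up sample input
curr_freq = 0
seen = {0}  # the starting frequency counts as already seen
duplicate_freq = False

while not duplicate_freq:
    curr_freq, duplicate_freq = run_freq_calc(changes, curr_freq, seen)

print(curr_freq)
```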

Sad Panda
Sep 22, 2004

I'm a Sad Panda.
Mine was certainly not fast to run in PyCharm, because of adding each item to a list which ended up with 142,990 items, but it worked.

Python code:
filename = "1a.txt"

with open(filename) as f:
    value = 0
    list_of_values = []
    while True:
        for line in f.readlines():
            value += int(line)
            if value not in list_of_values:
                list_of_values.append(value)
            else:
                print(f"Repeated {value}")
                exit()
        f.seek(0)
I also managed 2a, but 2b stumped me.
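The slowness comes from `value not in list_of_values`, which scans the whole list every time. A sketch of the same idea with a set, where membership checks don't depend on how many values have been stored; first_repeat is a made-up name and the inputs are small samples rather than the real puzzle file:

```python
from itertools import accumulate, cycle


def first_repeat(changes):
    seen = {0}  # the starting value counts as already seen
    # cycle() repeats the input forever, accumulate() yields the running total
    for value in accumulate(cycle(changes)):
        if value in seen:
            return value
        seen.add(value)
    return None  # only reached if changes is empty


print(first_repeat([3, 3, 4, -2, -4]))
```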

Sad Panda fucked around with this message at 22:17 on Dec 3, 2018

QuarkJets
Sep 8, 2008

LochNessMonster posted:

Thanks for all the great examples.

Before posting I did try passing curr_freq to the function but my program then stopped at the 1st value of the 2nd pass of the input file. Made me think that the value for curr_freq was reset to 0 and then the first value of the 2nd pass would always be the first duplicate.

I'm not really sure why it does that. I expected the variable to be updated each time the function is executed, but apparently each iteration of the function takes the (global) curr_freq = 0 as a start again. So just passing it to the function is not enough. While typing this I figured out how to solve it. It's just as dirty as the rest of the code but it does the job.

Python code:
while duplicate_freq == False:
    curr_freq = run_freq_calc(curr_freq)
Now it's time to improve/learn from all the stuff you guys posted and apply it to the next challenges.

You probably forgot to return the new value of curr_freq when you tried that, or you forgot to use the local variable name instead of the global one

Fluue
Jan 2, 2008
This is more of a software-design question and I'm having trouble remembering the "pythonic" or "preferred" way of doing it so it's more testable:

I'm using a Python JIRA library to interact with the JIRA Service Desk (help desk sort of thing).

There's a requirement to support different "support ticket" backends, so I figured I would create TicketManager class that wraps around the JIRA Library to give a standard interface for any of the functionality I plan on using.

I also have to work with individual support tickets, so I created a Ticket library that takes in a JIRA "issue" object and I can call methods on that.

So I have:
code:
class TicketManager:

    def __init__(self):
        self.client = Jira()

    def get_ticket(self, ticket_id):
        # returns 'issue' instance from Jira class
        return self.client.get_issue(ticket_id)

    def some_other_action(self):
        # whatever
        pass

class Ticket:
    def __init__(self, issue):
        self.issue = issue

    def get_process_owners(self):
        # would have some more complex implementation here, not just a getter
        return self.issue.managers

    def submit_for_changes(self):
        pass


### my_implementation.py
def handle_issue_update(ticket_id):
    manager = TicketManager()
    ticket = Ticket(issue=manager.get_ticket(ticket_id))
    owners = ticket.get_process_owners()
    ...
    ticket.submit_for_changes()

Am I on the right track with this? I have trouble converting from classic OOP from Java/C# to Python's way of doing classes.

cinci zoo sniper
Mar 15, 2013




Fluue posted:

This is more of a software-design question and I'm having trouble remembering the "pythonic" or "preferred" way of doing it so it's more testable:

I'm using a Python JIRA library to interact with the JIRA Service Desk (help desk sort of thing).

There's a requirement to support different "support ticket" backends, so I figured I would create TicketManager class that wraps around the JIRA Library to give a standard interface for any of the functionality I plan on using.

I also have to work with individual support tickets, so I created a Ticket library that takes in a JIRA "issue" object and I can call methods on that.

So I have:
code:
class TicketManager:

    def __init__(self):
        self.client = Jira()

    def get_ticket(self, ticket_id):
        # returns 'issue' instance from Jira class
        return self.client.get_issue(ticket_id)

    def some_other_action(self):
        # whatever
        pass

class Ticket:
    def __init__(self, issue):
        self.issue = issue

    def get_process_owners(self):
        # would have some more complex implementation here, not just a getter
        return self.issue.managers

    def submit_for_changes(self):
        pass


### my_implementation.py
def handle_issue_update(ticket_id):
    manager = TicketManager()
    ticket = Ticket(issue=manager.get_ticket(ticket_id))
    owners = ticket.get_process_owners()
    ...
    ticket.submit_for_changes()

Am I on the right track with this? I have trouble converting from classic OOP from Java/C# to Python's way of doing classes.

Looks fine, in my opinion.
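If testability and swappable backends are the goal, one tweak worth considering is handing the client to TicketManager instead of creating it inside __init__. A sketch of that, where TicketBackend and FakeBackend are made-up names and only get_issue comes from the code above (typing.Protocol needs Python 3.8+):

```python
from typing import Any, Dict, Protocol


class TicketBackend(Protocol):
    """Anything with a get_issue method can act as a backend."""

    def get_issue(self, ticket_id: str) -> Any: ...


class TicketManager:
    def __init__(self, client: TicketBackend):
        # The client is passed in, so tests can supply a fake one.
        self.client = client

    def get_ticket(self, ticket_id: str) -> Any:
        return self.client.get_issue(ticket_id)


class FakeBackend:
    """In-memory stand-in for a real ticket system, for tests only."""

    def __init__(self, issues: Dict[str, Any]):
        self.issues = issues

    def get_issue(self, ticket_id: str) -> Any:
        return self.issues[ticket_id]


manager = TicketManager(FakeBackend({'T-1': {'managers': ['alice']}}))
print(manager.get_ticket('T-1'))
```

The real JIRA client would be passed in the same way at the call site, as long as it offers the methods the manager calls.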

CarForumPoster
Jun 26, 2013

⚡POWER⚡
What I want to do: Split a dataframe column of name strings into first and last names using nameparser. Have the first and last names in their own columns.

Problem: I get an object with the original name which contains all of the name parts in a "name" object rather than a series that is appended as new columns.

Any suggestions on how to properly solve this problem?

code:
import pandas as pd
from nameparser import HumanName

def split_names(namestring):
    name = HumanName(namestring)
    firstname = name.first
    lastname = name.last
    return pd.Series([firstname,lastname])
code:
In:
names["firstname"] = testdf["Defendant"].apply(split_names)
names["firstname"][1]

Out:
<HumanName : [
	title: '' 
	first: 'LINDA' 
	middle: '' 
	last: 'HARRIS' 
	suffix: ''
	nickname: ''
]>
I thought this might be what unstack does but I get an error when trying to use it.

EDIT: This worked.

code:
components = ('title', 'first', 'middle', 'last', 'suffix', 'nickname')

def name_decomp(n):
    h_n = HumanName(n)
    return (getattr(h_n, comp) for comp in components)

rslts = list(zip(*testdf.Defendant.map(name_decomp)))

for i, comp in enumerate(components):
    testdf[comp] = rslts[i]

CarForumPoster fucked around with this message at 13:24 on Dec 4, 2018

Hadlock
Nov 9, 2004

I have a bunch of functions in a custom module in Django that perform a bunch of helm stuff for kubernetes via subprocess. I decided to split those functions out to helm.py and now my custom module has "from . import helm" and a bunch of statements that look like helm.init(), helm.repo_update() etc.

Then I went crazy and pulled out most everything from my original custom module and put it into 4 modules of its own. Line count dropped from 180 down to about 55.

From my perspective I greatly increased readability and made it easier to maintain, especially as additional features get added.

Is there any problem with this? Nothing computationally data heavy, just process stuff for CI/CD.

Sub Par
Jul 18, 2001


Dinosaur Gum
I suck at encoding and I am having a stupid problem. I am using requests to get XML from a web service. It returns unicode, which I verified by doing (2.7.13):
code:
response = requests.post(endpoint, headers = header, data = payload)
print type(response.text)
Which gives <type 'unicode'>. Great. I want to parse it with ElementTree. I have tried both of these constructions:
code:
tree = ET.ElementTree(ET.fromstring(response.text))
tree = ET.ElementTree(ET.fromstring(response.text.encode('utf-8')))
Both give me:
code:
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 4907: ordinal not in range(128)
I'm sure there's something really simple and dumb I'm just not doing, but like I said, I suck at this poo poo and just want to parse the stupid XML so I can use it. Help?

Dr Subterfuge
Aug 31, 2005

TIME TO ROC N' ROLL
Encoding is terrible and you shouldn't be using 2.7 unless someone is forcing you to.

That being said, take a look at the normalize function in unicodedata
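For what it's worth, in Python 3 one way to sidestep the str/bytes juggling is to hand ElementTree the raw bytes and let the parser do the decoding. With requests that would be response.content instead of response.text. A sketch with a hard-coded payload standing in for the web service response (made-up XML, and not checked against 2.7):

```python
import xml.etree.ElementTree as ET

# Stand-in for response.content: UTF-8 bytes containing a right single quote (U+2019).
xml_bytes = '<?xml version="1.0" encoding="utf-8"?><note>it\u2019s fine</note>'.encode('utf-8')

# fromstring accepts bytes and decodes them according to the XML declaration.
root = ET.fromstring(xml_bytes)
tree = ET.ElementTree(root)

print(root.tag, root.text)
```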

Sub Par
Jul 18, 2001


Dinosaur Gum
I'm only using 2.7 because I have a legacy codebase that I'm too lazy to port over to 3. It's doing something pretty trivial and I'm sure I'll move it over sometime in the next few months but it's super low on the priority list. I'll check that link out. Thanks.

Gothmog1065
May 14, 2009
If I initialize a variable within a class method, does it have to use the self prefix?

code:
class Class(object):
    def __init__(self, varA, varB):
        self.varA = varA
        self.varB = varB
        
    self.res_str = ""
    num = 0

    def work_on_vars():
        num = self.varA + self.varB

        if num < 2:
            self.res_str = "This particular number is less than 2!"
        else: 
            print "Do something else"
Therefore all of the instances can work on res_str, but in question, num should be a local variable that can't be called from outside of the specific function/class, correct? I can't do something like this:

code:
foo = Class(1,3)
bar = Class(3,5)
print foo.num
print bar.num

QuarkJets
Sep 8, 2008

Gothmog1065 posted:

If I initialize a variable within a class method, does it have to use the self prefix?

code:
class Class(object):
    def __init__(self, varA, varB):
        self.varA = varA
        self.varB = varB
        
    self.res_str = ""
    num = 0

    def work_on_vars():
        num = self.varA + self.varB

        if num < 2:
            self.res_str = "This particular number is less than 2!"
        else: 
            print "Do something else"
Therefore all of the instances can work on res_str, but in question, num should be a local variable that can't be called from outside of the specific function/class, correct? I can't do something like this:

code:
foo = Class(1,3)
bar = Class(3,5)
print foo.num
print bar.num

Not using self means that you've defined a class attribute. This is valid, all instances of the class are accessing a shared version of the variable and that may be the behavior that you want. foo.num and bar.num will always be the same value in your code

Using self creates an instance attribute, meaning all instances of the classes are accessing their own version of the variable. Declaring "self.num = 0" in your constructor creates an instance attribute, and then foo.num and bar.num can be different values
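A short sketch of the three kinds of name side by side, written for Python 3 with made-up values. num on the class is shared, self.res_str belongs to each instance, and the num inside the method is a plain local that disappears when the method returns:

```python
class Example:
    num = 0  # class attribute: one value shared through the class

    def __init__(self, var_a, var_b):
        self.var_a = var_a
        self.var_b = var_b
        self.res_str = ""  # instance attribute: one per object

    def work_on_vars(self):
        num = self.var_a + self.var_b  # local variable, not Example.num
        if num < 5:
            self.res_str = "less than 5"
        return num


foo = Example(1, 3)
bar = Example(3, 5)
foo.work_on_vars()
bar.work_on_vars()
before = (foo.num, bar.num)  # the method never touched the class attribute

Example.num = 7  # changing it on the class shows up on both instances
shared = (foo.num, bar.num)

foo.num = 1  # assigning through an instance creates an instance attribute on foo only
after = (foo.num, bar.num)

print(before, shared, after)
```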

SurgicalOntologist
Jun 17, 2004

I'm doing a bunch of computations and saving them to a database. The following function is supposed to be committing to the database every 100 results (based on the variable commit_interval), and stopping after 10,000 (the value of total). But I'm not getting 10,000 results on each backtest_set, instead I'm getting 9,900. Can anyone spot the error? Am I making a basic off-by-one -ish mistake or do I have some weird race condition? I can't figure it out. Here's the code.
Python code:
    results = []
    for i, result in enumerate(tqdm(
            backtest_set.run(),
            unit='entry', desc='Backtesting', total=total, unit_scale=True,
    ), 1):
        results.append(dict(backtest_id=backtest_set.id, fantasy_points_hundredths=result))
        if not i % commit_interval or i == total:
            db.execute(models.BacktestResult.__table__.insert(), results)
            db.commit()
            results = []
        if i == total:
            break
Any ideas?

The actual work here is done inside the generator function backtest_set.run(). And if you haven't seen it before, tqdm is just a progress bar library (a pretty great one); the way I'm using it, it just wraps an iterable. The only other things there are my models and sqlalchemy session db.


cinci zoo sniper
Mar 15, 2013




SurgicalOntologist posted:

I'm doing a bunch of computations and saving them to a database. The following function is supposed to be committing to the database every 100 results (based on the variable commit_interval), and stopping after 10,000 (the value of total). But I'm not getting 10,000 results on each backtest_set, instead I'm getting 9,900. Can anyone spot the error? Am I making a basic off-by-one -ish mistake or do I have some weird race condition? I can't figure it out. Here's the code.
Python code:
    results = []
    for i, result in enumerate(tqdm(
            backtest_set.run(),
            unit='entry', desc='Backtesting', total=total, unit_scale=True,
    ), 1):
        results.append(dict(backtest_id=backtest_set.id, fantasy_points_hundredths=result))
        if not i % commit_interval or i == total:
            db.execute(models.BacktestResult.__table__.insert(), results)
            db.commit()
            results = []
        if i == total:
            break
Any ideas?

The actual work here is done inside the generator function backtest_set.run(). And if you haven't seen it before, tqdm is just a progress bar library (a pretty great one); the way I'm using it, it just wraps an iterable. The only other things there are my models and sqlalchemy session db.

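If you were enumerating from 0 this would be an off-by-one error, since the maximum i would be 9999 and could never equal 10000. With the start of 1 you pass to enumerate, i does reach total, so it looks more like the generator is yielding fewer than 10000 results and the last partial batch never gets committed.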
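Either way, flushing whatever is left after the loop makes the batching independent of how many results actually arrive. A sketch with a made-up commit callable standing in for the db.execute plus db.commit pair:

```python
def commit_in_batches(results, commit, commit_interval=100, total=None):
    """Send results to commit() in batches, including the last partial one."""
    batch = []
    for i, result in enumerate(results, 1):
        batch.append(result)
        if len(batch) == commit_interval:
            commit(batch)
            batch = []
        if total is not None and i == total:
            break
    if batch:
        # Leftovers from a run that stopped between two commit points.
        commit(batch)


committed = []
commit_in_batches(range(9950), committed.extend)
print(len(committed))
```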
