Python information and short questions megathread.

hooah: Feb 6, 2006; WTF?

Thermopyle posted:

Functions are objects. Generally, objects can have arbitrary attributes. See PEP-232.

You can also create a function-y type of thing by creating a class with a __call__ method.

Ok, so the student probably got poo poo from some website somewhere that he didn't understand (intro to programming class).
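
For anyone else puzzled by the quoted post, here is a minimal sketch of both points: an attribute stored on a plain function, and a class whose instances are callable. The names greet and Counter are made up for illustration.

```python
def greet(name):
    """An ordinary function; it is also an object."""
    greet.calls += 1  # attribute stored on the function object itself
    return "Hello, " + name


greet.calls = 0  # arbitrary attribute, the feature PEP 232 describes


class Counter(object):
    """A class whose instances can be called like functions."""

    def __init__(self):
        self.count = 0

    def __call__(self):
        self.count += 1
        return self.count


greet("world")
greet("again")
tick = Counter()
tick()
tick()
print(greet.calls, tick.count)
```

The function attribute is effectively shared global state, so the class version is usually easier to reason about once more than one counter is needed.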

# ? Jul 2, 2015 18:59

mekkanare: Sep 12, 2008

This is a shot in the dark, but does anybody happen to have experience with OpenCV? I'm running Python 2.7.10 with OpenCV 3.0.0. I'm trying to understand how the mouse_and_match example in the source code works.

On the left is my code, on the right is the example code.

What I'm trying to get from it is how it fakes the selection rectangle while the mouse is held down. That is, the marching ants you get when using a cropping tool in something like MSPaint. As is shown in my image, my code on the left isn't working exactly right.
Any help would be appreciated.

Also, this is my second exposure to Python; the first was doing the Django tutorial. I'm more used to C/C++, but I've never really done any GUI stuff in those either.

mekkanare fucked around with this message at 22:48 on Jul 2, 2015

# ? Jul 2, 2015 22:45

suffix: Jul 27, 2013; Wheeee!

mekkanare posted:

This is a shot in the dark, but does anybody happen to have experience with OpenCV? I'm running Python 2.7.10 with OpenCV 3.0.0. I'm trying to understand how the mouse_and_match example in the source code works.

On the left is my code, on the right is the example code.

What I'm trying to get from it is how it fakes the selection rectangle while the mouse is held down. That is, the marching ants you get when using a cropping tool in something like MSPaint. As is shown in my image, my code on the left isn't working exactly right.
Any help would be appreciated.

Also, this is my second exposure to Python; the first was doing the Django tutorial. I'm more used to C/C++, but I've never really done any GUI stuff in those either.

The main difference I see is that the example code makes a new copy of the image for each frame ("img = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)") and draws on and displays that, while you draw on the original image, keeping all the rectangles you draw.
You can do something like "img2 = img.copy()" to copy the image without the color conversion.

# ? Jul 2, 2015 23:17

mekkanare: Sep 12, 2008

suffix posted:

The main difference I see is that the example code makes a new copy of the image for each frame ("img = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)") and draws on and displays that, while you draw on the original image, keeping all the rectangles you draw.
You can do something like "img2 = img.copy()" to copy the image without the color conversion.

Oh thanks for the quick reply, that was exactly it too!
It really seems like they would have gone a more memory efficient way rather than copying objects.

# ? Jul 2, 2015 23:41

QuarkJets: Sep 8, 2008

mekkanare posted:

Oh thanks for the quick reply, that was exactly it too!
It really seems like they would have gone a more memory efficient way rather than copying objects.

OpenCV is drawing directly onto the image, so every time you call rectangle on the same array it just adds another rectangle on top of the old ones. If you don't want the old rectangles, you'll need to make a copy of the image first and then draw onto the copy. You can do this in a memory-efficient way by copying into an array that has already been allocated.
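
A rough sketch of the "copy into an already allocated array" idea, using only NumPy so it runs without OpenCV. The draw_box helper is a hypothetical stand-in for cv2.rectangle, and the tiny array stands in for the loaded image.

```python
import numpy as np

# Stand-in for the loaded image: a small black 3-channel frame.
original = np.zeros((6, 8, 3), dtype=np.uint8)

# Allocate the scratch buffer once, outside the redraw loop.
scratch = np.empty_like(original)


def draw_box(img, top, left, bottom, right, value=255):
    """Hypothetical stand-in for cv2.rectangle: paints a filled box in place."""
    img[top:bottom, left:right] = value


for right in (3, 5, 7):  # pretend the mouse is being dragged to the right
    np.copyto(scratch, original)  # reset the buffer without a new allocation
    draw_box(scratch, 1, 1, 4, right)
    # this is where the frame would be displayed, e.g. with cv2.imshow
```

Only the latest box survives in scratch, and original is never touched. Writing scratch[:] = original should have the same effect as np.copyto here.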

# ? Jul 3, 2015 07:31

Plasmafountain: Jun 17, 2008

I've got issues with array slicing and iteration - inadvertently overwriting values that should (?) remain static, but I can't see exactly where I've gone wrong.

Pastebin: http://pastebin.com/7CREYYwX

I'm having an issue with the functions on and after line 162, where I'm trying to fill in the arrays initial_theta (theta_copy for static values) and initial_nu (nu_copy). The values already there are boundary conditions that need to be satisfied; the NaN values are the ones that need to be replaced by calculation.

These two arrays are sliced columnwise (166-169) with two columns of interest - the a column and the b column. The b column is the column that has NaN values being replaced by calculated values by the function. The a column is the column that contains the values that the b column is calculated from.

These two columns get passed to different functions to calculate the values.

If you run the code and compare the output of the function nuandthetacheck with the initial array copies (nu_copy & theta_copy) a couple of things become apparent:

1) Values of theta simply get passed from column to column overwriting the initial boundary conditions.

2) The nu array is filled out entirely while the theta array has missing values in the final column.

I guess really my question is: What have I missed here? Is it array indexing of some kind? As far as I can tell my indexing avoids overwriting the initial given values in each array, but obviously this isn't working quite like I intended.

# ? Jul 5, 2015 08:11

QuarkJets: Sep 8, 2008

It looks like you're dereferencing theta with index i and then setting values using index j. Did you mean to do that?

# ? Jul 5, 2015 09:32

Dominoes: Sep 20, 2007

Hey dudes, learning Flask. The tutorial I'm using describes 'Fake objects', which are dictionaries created in a view that are called using object notation in templates.

The syntax is clean, but nonstandard. Why not use dictionary syntax in the template, or use a namedtuple in the view, to keep things consistent?

Note: It appears that while dict syntax is invalid in the templates, using namedtuples works fine in the views file. It seems more consistent, and prevents having to write the attribute names for each assignment.
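
A rough sketch of the namedtuple approach outside of Flask. The User and Post names, their fields, and the sample data are invented for illustration and are not the tutorial's code.

```python
from collections import namedtuple

# Field names are declared once instead of being repeated as dict keys.
User = namedtuple("User", ["nickname"])
Post = namedtuple("Post", ["author", "body"])

posts = [
    Post(author=User(nickname="alice"), body="First post"),
    Post(author=User(nickname="bob"), body="Second post"),
]

# Attribute access in plain Python mirrors the object notation used in templates.
lines = ["{} says: {}".format(p.author.nickname, p.body) for p in posts]
print("\n".join(lines))
```

One side benefit is that a misspelled field name fails when the tuple is built, rather than silently rendering nothing later.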

Dominoes fucked around with this message at 10:24 on Jul 5, 2015

# ? Jul 5, 2015 10:01

Plasmafountain: Jun 17, 2008

QuarkJets posted:

It looks like you're dereferencing theta with index i and then setting values using index j. Did you mean to do that?

I take it you're referring to this bit:

code:

def find_theta_and_nu2(theta_values,nu_values):
   
    for i in range(1,mesh_size-1):
        for j in range(1,mesh_size-1):
            theta_a = theta_values[:,i-1]
            theta_b = theta_values[:,i]
            nu_a = nu_values[:,j-1]
            nu_b = nu_values[:,j]

And then later:

code:

  
theta_values[:,j] = intermediate_step_theta
nu_values[:,j] = intermediate_step_nu

I thought in this case the second nested for j still carried the i indexing through - although I think you're right about the second part, theta_values[:,j] should probably be theta_values[:,i].

Hmm. Changed the above to:

code:

def find_theta_and_nu2(theta_values,nu_values):
   
    for i in range(1,mesh_size-1):
        theta_a = theta_values[:,i-1]
        theta_b = theta_values[:,i]
        for j in range(1,mesh_size-1):
            theta_a = theta_values[:,i-1]
            theta_b = theta_values[:,i]
            nu_a = nu_values[:,j-1]
            nu_b = nu_values[:,j]

code:

  
theta_values[:,i] = intermediate_step_theta
nu_values[:,j] = intermediate_step_nu

And got the same result.

# ? Jul 5, 2015 10:22

baka kaba: Jul 19, 2003; PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

Tried stepping through it with a debugger? That way you can watch it fill out entries and work out why it's tripping up on some

# ? Jul 5, 2015 12:41

Plasmafountain: Jun 17, 2008

baka kaba posted:

Tried stepping through it with a debugger? That way you can watch it fill out entries and work out why it's tripping up on some

I've had a go at doing so with Spyder's inbuilt debug prompt and pdb, but I get the feeling that I'm doing that wrong as well - stepping into the function and then trying to proceed from there leaps to the end of the script. :/

# ? Jul 5, 2015 15:36

hooah: Feb 6, 2006; WTF?

Zero Gravitas posted:

I've had a go at doing so with Spyder's inbuilt debug prompt and pdb, but I get the feeling that I'm doing that wrong as well - stepping into the function and then trying to proceed from there leaps to the end of the script. :/

Please for the love of God use something better than Spyder. Probably anything will do, but I'd recommend PyCharm. I just finished being a TA for an intro to programming course using Spyder and its debugger is poo poo.

# ? Jul 5, 2015 17:44

pmchem: Jan 22, 2010

I haven't used its built-in debugger much, but pretty much everything else in Spyder has been quite smooth and pleasant to use. I recommend it for numerical work.

# ? Jul 5, 2015 20:05

Plasmafountain: Jun 17, 2008

Had a look at PyCharm and I think I'm going to echo pmchem's opinion - Spyder seems to be better for my kind of thing, which is essentially crunching a lot of numbers.

To that end, I discovered my problem by doing a lot of calculations by hand, although I'm not sure how to fix it and I'd really appreciate any pointers in that direction.

http://pastebin.com/rQ4EWyG0

As a brief recap, the two output arrays Theta and Nu have columns that are calculated from their own and the other's previous columns as the range 1 to mesh_size is stepped through from left to right.

As far as I can tell, Theta[1,2], which is 0.11475, is the single wrong value. It is wrong because it is the only calculation where the values from the previous Theta column are not used in calculating the current Theta column [:,2], although they should be, since the values for Theta[:,1] were calculated on the previous iteration.

I have a feeling that it's something to do with the funky indexing. Almost every other value in the array calculates fine (apart from those that depend on the funky one above), yet the apparent indexing goes beyond the array index being calculated: the range is 1,mesh_size in the primary function and 1,mesh_size-1 in the subsequent functions that actually do most of the work. This should put the array index at five, but that can't be, because the highest index in a length 5 array would be 4.

It's puzzling and I can't quite figure it out.

# ? Jul 6, 2015 00:24

baka kaba: Jul 19, 2003; PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

Well I don't entirely know what's going on in there :yayclod:

or much about these arrays, but aren't you skipping calculations for the first and last elements in each column? You're iterating over range(1, size-1), which starts on index 1 instead of 0, and stops before size-1 - so in a 5-element list you're handling indices 1, 2 and 3. You're accounting for that when you do a calculation, but you're not actually updating those edge elements
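
To make the index arithmetic above concrete, here is a tiny sketch with a 5-element column. The values and the placeholder assignment are made up; only the range behaviour matters.

```python
size = 5
column = [10.0, None, None, None, 20.0]  # boundary values at both ends

# range(1, size - 1) starts at 1 and stops before size - 1.
interior = list(range(1, size - 1))
print(interior)

for i in interior:
    column[i] = 0.0  # placeholder for the real calculation

print(column)
```

Indices 0 and 4 are never visited, so whatever was stored there beforehand is left alone.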

I have no idea what I'm talking about so I might be completely misunderstanding what's happening, so maybe it's all cool. Learn to use the debugger! It's important - you might be hitting Step Over when you mean to do Step Into, or whatever the equivalents are, so learning how that works will help you a ton. It's so much faster than spamming logging in hopeful places

# ? Jul 6, 2015 00:47

Plasmafountain: Jun 17, 2008

The equations I'm using mean that I can specify values at the boundaries of an array and fill the rest of it in later. The indices 0 and 4 are already specified; it's meant to simply calculate the stuff in the middle. This is why the initial row of nu gets its own little function to calculate that first before the rest of it is filled in - the calculation can't propagate from left to right without the edge values.

It's also why I'm puzzled about the error/s - the calculation is proceeding for only *some* elements as if the boundary values aren't there.

# ? Jul 6, 2015 00:53

baka kaba: Jul 19, 2003; PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

That's after the first iteration of find_theta_and_nu2 - does that look right? Based on those values I get 0.11475 for row1 col2 as well

# ? Jul 6, 2015 02:01

Plasmafountain: Jun 17, 2008

I found the error, it was in my nu_b function for calculating the top line of nu. It was wrong and threw everything else off simply because there was a positive instead of a negative. :argh:

# ? Jul 6, 2015 13:25

BigRedDot: Mar 6, 2008

Shameless plug time: Bokeh version 0.9.1 released today, get it with conda or pip. Lots of good stuff, especially around interactions for static docs: http://bokeh.pydata.org/en/latest/docs/user_guide/interaction.html.

I'm also giving a talk about Bokeh at SciPy in a few days, I'll post the link whenever the video goes up.

# ? Jul 6, 2015 23:24

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

fletcher posted:

Hmm maybe the original encoding was ok, I added a .encode('utf-8') and it seems to be working now:
code:
r = self.session.post(url,
                      headers={'Content-Type': 'text/csv;charset=UTF-8', 'Accept': 'text/csv'},
                      data=payload.getvalue().encode('utf-8'))
How come I had to add that though?

Bumping this one again...still not clear on why I had to do this

# ? Jul 6, 2015 23:52

accipter: Sep 12, 2003

fletcher posted:

Bumping this one again...still not clear on why I had to do this

What happens if you use 'Content-Type': 'text/csv'? Do you still have to do the ...encode('utf-8') to get it to work?

# ? Jul 7, 2015 00:04

Gothmog1065: May 14, 2009

Okay, kind of a general coding question but since I'm in Python I'll ask here.

I'm doing the mars landing game over at codingames.com, and I have a "best way" question. What I'm trying to do is find the "flat" plane on a 2d grid. I know I can pound something together with a for loop that compares the Y values and if they're the same over a certain length, I keep going until I have the end, get the start and end points to get the length. Is there a better way to do this? The X and Y values are given in different lists, and the "surface" is created by drawing a line between each point sequentially. Here is the exact input as given by the game:

Python code:

for i in range(N):
    # LAND_X: X coordinate of a surface point. (0 to 6999)
    # LAND_Y: Y coordinate of a surface point. By linking all the points together in a sequential fashion, you form the surface of Mars.
    LAND_X, LAND_Y = [int(j) for j in input().split()]

I would probably do a separate list something similar to this:

code:

start Y = 0
for i in range(N):
    Compare Y points, if they are the same, do nothing
    if they are different, get the 'length' from the beginning, if it's < 2 keep going
   else break the loop

I don't need this right now, but I'm looking to wrap my head around the more complex parts. The rest I can probably cobble together by myself.

# ? Jul 7, 2015 14:55

onionradish: Jul 6, 2006; That's spicy.

I have a file download function that I'd like to add some sort of KB/sec rate limiting to. I've tried adding time.sleep(n) calls using static and calculated values for 'n', but the results have been crap. What should I be doing?

Python code:

import requests

SESSION = requests.Session()

def download(url, filepath):

    with open(filepath, 'wb') as f:
        r = SESSION.get(url, stream=True)
        for chunk in r.iter_content(1024):
            f.write(chunk)

# ? Jul 7, 2015 16:04

Lyon: Apr 17, 2003

Gothmog1065 posted:

Okay, kind of a general coding question but since I'm in Python I'll ask here.

I'm doing the mars landing game over at codingames.com, and I have a "best way" question. What I'm trying to do is find the "flat" plane on a 2d grid. I know I can pound something together with a for loop that compares the Y values and if they're the same over a certain length, I keep going until I have the end, get the start and end points to get the length. Is there a better way to do this? The X and Y values are given in different lists, and the "surface" is created by drawing a line between each point sequentially. Here is the exact input as given by the game:
Python code:
for i in range(N):
    # LAND_X: X coordinate of a surface point. (0 to 6999)
    # LAND_Y: Y coordinate of a surface point. By linking all the points together in a sequential fashion, you form the surface of Mars.
    LAND_X, LAND_Y = [int(j) for j in input().split()]
I would probably do a separate list something similar to this:
code:
start Y = 0
for i in range(N):
    Compare Y points, if they are the same, do nothing
    if they are different, get the 'length' from the beginning, if it's < 2 keep going
   else break the loop
I don't need this right now, but I'm looking to wrap my head around the more complex parts. The rest I can probably cobble together by myself.

So you have a list of X coordinates and a list of Y coordinates and the indexes match up (which is kind of gross in my opinion but you didn't build the exercise). There are a few different approaches depending on what the output needs to be... if you want to find the longest stretch of flat X coordinates you probably need to iterate over the Y coordinates and create either a list of all the X coordinates that are in that flat stretch or record the start and end points.

code:

LAND_X = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
LAND_Y = [5, 1, 1, 3, 1, 1, 1, 1, 7, 8, 1, 9, 9]
x_min = 0
x_max = 0
flat_surfaces = []
temp_surfaces = []

assert(len(LAND_X) == len(LAND_Y))

for i in range(len(LAND_Y)):
    if LAND_Y[i] < 2 and LAND_Y[i-1] >= 2:        
        x_min = i
    elif LAND_Y[i] < 2 and LAND_Y[i+1] >= 2:
        x_max = i
        flat_surfaces.append((x_min, x_max))

print(flat_surfaces)

Output:
[(1, 2), (4, 7)]

This was a quick and dirty attempt but it outputs a list of tuples in the format of (start_of_flat_surface, end_of_flat_surface). It currently ignores any flat surface that has a length of 1 though this should be simple to fix if you put more thought into your if statements. Basically the two if statements check to see if we went from a high Y value to a low Y value, if that's the case then set that as a new starting point. The second if statement checks to see if we go from a low Y value to a high Y value, if that's the case then it sets that as the ending point and adds the tuple to the list. Because of the way these if statements are setup it will always skip a single low Y value.

The easiest way would just be to iterate over the Y values and build lists of the flat surfaces.

code:

for i in range(len(LAND_Y)):
    if LAND_Y[i] < 2:
        temp_surfaces.append(i)
    elif LAND_Y[i] >= 2 and temp_surfaces != []:
        flat_surfaces.append(temp_surfaces)
        temp_surfaces = []

Output:
[[1, 2], [4, 5, 6, 7], [10]]

This second bit of code uses all the same variables as the first bit so you can just swap in one for loop for the other (or you can run them both but make sure you reset flat_surfaces to [] in between the loops). I'm sure there will be more elegant solutions offered but I was bored at lunch.

Edit: My first solution will be full of interesting bugs because I didn't actually account for off-by-one errors. Because Python lists can be negatively indexed it seems to run without breaking, but I'm sure there are bugs/issues. I don't have the time now but I'll fix it later.
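
One possible alternative to the loops above, sketched with itertools.groupby. It treats "flat" the way the original question does (two or more neighbouring points with the same Y) rather than as "Y below 2", so its results differ from the outputs shown earlier. The flat_segments name is made up, and the sample data is the same test data as above.

```python
from itertools import groupby

LAND_X = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
LAND_Y = [5, 1, 1, 3, 1, 1, 1, 1, 7, 8, 1, 9, 9]


def flat_segments(xs, ys):
    """Return (x_start, x_end, y) for every run of two or more equal Y values."""
    segments = []
    index = 0  # position of the first point in the current run
    for y, run in groupby(ys):
        length = len(list(run))
        if length >= 2:
            segments.append((xs[index], xs[index + length - 1], y))
        index += length
    return segments


print(flat_segments(LAND_X, LAND_Y))
```

Because groupby only ever looks at the run it is handed, there is no i-1 or i+1 lookup to go wrong at the ends of the list. Picking the widest landing zone would then be something like max(segments, key=lambda s: s[1] - s[0]).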

Lyon fucked around with this message at 17:41 on Jul 7, 2015

# ? Jul 7, 2015 17:29

QuarkJets: Sep 8, 2008

onionradish posted:

I have a file download function that I'd like to add some sort of KB/sec rate limiting to. I've tried attempting to add time.sleep(n) values using static and calculated values for 'n', but results have been crap. What should I be doing?
Python code:
SESSION = requests.Session()

def download(url, filepath):

    with open(filepath, 'wb') as f:
        r = SESSION.get(url, stream=True)
        for chunk in r.iter_content(1024):
            f.write(chunk)

Google for "token bucket algorithm"
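
For reference, a minimal sketch of a token bucket wrapped around a chunk iterator. TokenBucket and throttle are hypothetical names, time.monotonic assumes Python 3.3 or later, and the clock and sleep parameters exist only so the logic can be exercised without real waiting.

```python
import time


class TokenBucket(object):
    """Allows roughly `rate` units per second, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def _refill(self):
        now = self.clock()
        earned = (now - self.last) * self.rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now

    def wait_time(self, amount):
        """Spend `amount` tokens and return how long the caller should sleep."""
        self._refill()
        self.tokens -= amount
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate  # time needed to pay off the debt


def throttle(chunks, bucket, sleep=time.sleep):
    """Yield chunks unchanged, sleeping as needed to respect the bucket."""
    for chunk in chunks:
        delay = bucket.wait_time(len(chunk))
        if delay > 0:
            sleep(delay)
        yield chunk
```

In the download function above this would presumably be used as "for chunk in throttle(r.iter_content(1024), bucket)", with something like TokenBucket(rate=50 * 1024, capacity=50 * 1024) for about 50 KB/sec. The sleep length is derived from how many bytes were actually consumed, which tends to behave better than a fixed time.sleep(n) per chunk.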

# ? Jul 7, 2015 18:56

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

accipter posted:

What happens if you use 'Content-Type': 'text/csv'? Do you still have to do the ...encode('utf-8') to get it to work?

Yup. With 'Content-Type': 'text/csv' and without encode('utf-8') it does not work correctly.

It does seem to work fine with 'Content-Type': 'text/csv' and encode('utf-8')

I still don't get it!

# ? Jul 8, 2015 03:45

Plasmafountain: Jun 17, 2008

EDIT: embarrassingly bad error. :derp:

Plasmafountain fucked around with this message at 14:04 on Jul 8, 2015

# ? Jul 8, 2015 10:47

Dominoes: Sep 20, 2007

Database question. Using Django, but it's more of a general ORM question. I'm looking to store data that can have one of a few set choices. It seems like a common way of doing this is by using integers, since they take up less space compared to writing the data out for each row. Is there a clean way to map these integers to their names in an ORM like Django's or SQLAlchemy?

Do databases intelligently store charfields to avoid repetition?

It seems like this article provides a solution, although its focus is more about ways to restrict choices than data storage; it uses integer for choices as its example. Also: It has a simple solution using Django's choices kwarg, and comments about Python not having enum. Maybe that, with 3.4's enum would be an ideal solution.

Dominoes fucked around with this message at 12:16 on Jul 8, 2015

# ? Jul 8, 2015 12:08

Helado: Mar 7, 2004

fletcher posted:

Yup. With 'Content-Type': 'text/csv' and without encode('utf-8') it does not work correctly.

It does seem to work fine with 'Content-Type': 'text/csv' and encode('utf-8')

I still don't get it!

What does your stringIO class say its encoding is set to (payload.encoding iirc)? Check if everything is getting opened as utf-8 all the way through. Secondly, I believe the post method allows data to be a file-like object, so rather than call getvalue() you should be able to just pass it the io.StringIO object (it's one less conversion to worry about). The other thing might be your source data; if they exported the CSV from Excel, I've had issues in the past of wonky file encodings.

# ? Jul 8, 2015 15:10

Munkeymon: Aug 14, 2003; Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.

Dominoes posted:

Database question. Using Django, but it's more of a general ORM question. I'm looking to store data that can have one of a few set choices. It seems like a common way of doing this is by using integers, since they take up less space compared to writing the data out for each row. Is there a clean way to map these integers to their names in an ORM like Django's or SQLAlchemy?

Do databases intelligently store charfields to avoid repetition?

It seems like this article provides a solution, although its focus is more about ways to restrict choices than data storage; it uses integer for choices as its example. Also: It has a simple solution using Django's choices kwarg, and comments about Python not having enum. Maybe that, with 3.4's enum would be an ideal solution.

You should probably have a table that stores the choices and use foreign keys in whatever table they're related to to reference them. That way you can keep your data (available choices) out of your logic layer and adding or removing one becomes much simpler.
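
A rough sketch of the lookup-table layout described above, using the standard library's sqlite3 so it runs without Django. The status and article tables and their rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE status (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE
    );
    CREATE TABLE article (
        id        INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        status_id INTEGER NOT NULL REFERENCES status(id)
    );
    INSERT INTO status (id, name) VALUES (1, 'draft');
    INSERT INTO status (id, name) VALUES (2, 'published');
    INSERT INTO article (title, status_id) VALUES ('First post', 2);
    INSERT INTO article (title, status_id) VALUES ('Second post', 1);
    INSERT INTO article (title, status_id) VALUES ('Third post', 2);
""")

# Each article row stores only a small integer; the name is joined in on read.
rows = conn.execute("""
    SELECT article.title, status.name
    FROM article
    JOIN status ON status.id = article.status_id
    ORDER BY article.id
""").fetchall()
print(rows)
```

Adding a new choice is then a single INSERT into status, with no schema or code change.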

# ? Jul 8, 2015 15:20

Dominoes: Sep 20, 2007

Munkeymon posted:

You should probably have a table that stores the choices and use foreign keys in whatever table they're related to to reference them. That way you can keep your data (available choices) out of your logic layer and adding or removing one becomes much simpler.

I feel like adding (and migrating, ugh) a table for each choice field might be overkill; this package seems like it would handle this situation elegantly. Apparently MySQL and Postgres have built-in enum fields, but Django's ORM doesn't natively support them, to keep the interface homogeneous across database types. Note that the package I linked doesn't use these, but provides an interface to char and int fields.

Dominoes fucked around with this message at 21:52 on Jul 8, 2015

# ? Jul 8, 2015 21:25

Hed: Mar 31, 2004; Fun Shoe

"Dominoes" posted:

It seems like a common way of doing this is by using integers, since they take up less space compared to writing the data out for each row. Is there a clean way to map these integers to their names in an ORM like Django's or SQLAlchemy?

Is there a reason you aren't satisfied with the built in model choices field?

# ? Jul 8, 2015 22:22

Dominoes: Sep 20, 2007

Hed posted:

Is there a reason you aren't satisfied with the built in model choices field?

It seems clumsy, since the point of using an integer (or single-letter char) for the choice is to reduce stored data size and improve performance (so you're storing 1 and 2 a million times instead of "The H.M.S. Pinafore" and "The Pirates of Penzance"). It would be nice not to expose it, and instead use what it stands for when modifying or accessing the database.

Perhaps using the built-in choices field, with its (value, description) tuples, is the best answer, but I'm looking at other options to compare it with.
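
A small sketch of how the stock enum (Python 3.4+) could hide the raw integers while still producing (value, description) pairs. The Opera enum, LABELS dict, and CHOICES list are hypothetical names, and nothing here touches Django itself.

```python
from enum import IntEnum


class Opera(IntEnum):
    PINAFORE = 1
    PIRATES = 2


# Human-readable descriptions, kept in one place.
LABELS = {
    Opera.PINAFORE: "H.M.S. Pinafore",
    Opera.PIRATES: "The Pirates of Penzance",
}

# (value, description) pairs, the shape a choices option takes.
CHOICES = [(member.value, LABELS[member]) for member in Opera]

stored = 2  # what would sit in the integer column
print(LABELS[Opera(stored)])
```

Application code can then refer to Opera.PIRATES instead of a bare 2, while the database still stores the small integer.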

Dominoes fucked around with this message at 22:52 on Jul 8, 2015

# ? Jul 8, 2015 22:47

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. - Bertrand Russell

Dominoes posted:

It seems clumsy, since the point of using an integer (or single-letter char) for the choice is to reduce stored data size and improve performance (so you're storing 1 and 2 a million times instead of "The H.M.S. Pinafore" and "The Pirates of Penzance"). It would be nice not to expose it, and instead use what it stands for when modifying or accessing the database.

Perhaps using the built-in choices field, with its (value, description) tuples, is the best answer, but I'm looking at other options to compare it with.

To be honest, this sounds like some premature optimization. Just use the easy built-in facilities.

If you run into performance or resource constraints then worry about it. I don't think you will.

# ? Jul 8, 2015 23:17

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

Thermopyle posted:

To be honest, this sounds like some premature optimization. Just use the easy built-in facilities.

If you run into performance or resource constraints then worry about it. I don't think you will.

I think this is the right answer.

If you are dead set on trying to optimize maybe at least look at something like this: https://pypi.python.org/pypi/django-enumfield

# ? Jul 9, 2015 00:00

Hed: Mar 31, 2004; Fun Shoe

That's really my followup... I haven't heard any good reasons against it and in the hours you've been researching it you could have implemented it, tested it against real data. If you ever need to switch over to something more complex, data migrations aren't very hard. I get it, it may seem weird to encounter it in the Django docs at first, but there's a real good chance it will suit your needs. If not, you can move to something else that does.

# ? Jul 9, 2015 00:15

Dominoes: Sep 20, 2007

Thanks for the info dudes. Currently, I have the db set up with a module named similarly to the one fletcher linked (in my post above), which uses the stock Python enum. Using Django's choices, or just storing everything in CharFields, should be fine for this project, but since I'm new to databases, I'm trying to learn what options are out there. Intuitively, it makes sense that using some type of optimization for repeated strings will significantly reduce database size.

Dominoes fucked around with this message at 00:38 on Jul 9, 2015

# ? Jul 9, 2015 00:29

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

Helado posted:

What does your stringIO class say its encoding is set to (payload.encoding iirc)? Check if everything is getting opened as utf-8 all the way through. Secondly, I believe the post method allows data to be a file-like object, so rather than call getvalue() you should be able to just pass it the io.StringIO object (it's one less conversion to worry about). The other thing might be your source data; if they exported the CSV from Excel, I've had issues in the past of wonky file encodings.

I double checked the source data; it seems to be OK and encoded properly.

payload.encoding was None. I tried setting it to utf-8 but it gave me:

quote:

AttributeError: attribute 'encoding' of '_io._TextIOBase' objects is not writable

I tried getting rid of getvalue() but I get a stack trace into the requests module with error message:

quote:

TypeError: 'str' does not support the buffer interface

I feel like I am totally missing something here...

# ? Jul 9, 2015 00:38

BigRedDot: Mar 6, 2008

SciPy videos up on YouTube surprisingly fast. Here's my Bokeh talk

https://www.youtube.com/watch?v=c9CgHHz_iYk

# ? Jul 9, 2015 05:45

Helado: Mar 7, 2004

fletcher posted:

I double checked the source data; it seems to be OK and encoded properly.

payload.encoding was None. I tried setting it to utf-8 but it gave me:

I tried getting rid of getvalue() but I get a stack trace into the requests module with error message:

I feel like I am totally missing something here...

I'll admit I'm not terribly familiar with the requests module, so I just did a quick lookup on the post method. Back to the encoding issue: when you use io.StringIO, it is basically a string in memory. If you check the parent class for StringIO, you'll see that StringIO doesn't really have a buffered backing (https://docs.python.org/3/library/io.html#io.TextIOBase.detach), which is why the getvalue() is necessary for the post method. So encoding is None because StringIO is just a string stored in memory; there is no encoding. It's stored as a unicode string in some internal format. When you need to write the data out (in this case sending bytes in a POST), you need to convert it to a stream of bytes encoded in the proper format, here by converting the unicode string to UTF-8 bytes (https://docs.python.org/3/howto/unicode.html#converting-to-bytes). If you were to read the data from your file as bytes into an io.BytesIO object, it would come out already encoded, because it was never decoded to a string.
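
A small sketch of the str versus bytes distinction described above, using only the io module under Python 3. The CSV content is made up, and the escapes stand for two non-ASCII letters so the difference in length is visible.

```python
import io

payload = io.StringIO()
# \u00eb and \u00f3 are single characters that need two bytes each in UTF-8.
payload.write(u"name,city\nZo\u00eb,Krak\u00f3w\n")

text = payload.getvalue()    # str: a sequence of Unicode characters
body = text.encode("utf-8")  # bytes: what actually goes over the wire

print(type(text).__name__, len(text))
print(type(body).__name__, len(body))

# A BytesIO buffer holds bytes from the start, so nothing needs encoding later.
raw = io.BytesIO(body)
round_trip = raw.getvalue().decode("utf-8")
```

The character count and the byte count differ, which is why an HTTP body cannot be sent until an encoding has been chosen for the text.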

# ? Jul 9, 2015 06:23
