  • Locked thread
KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
It seems like Pandas is just making numpy behave like data frames from R or tables from Matlab. Is that the basic idea or is there more to it?

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

rock2much posted:

I did this! :woop:
Any recommendations for more projects?

Just work through the problems on Project Euler.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
Is there a canonical way to get a column vector in numpy? The obvious a.T doesn't work on a normal vector. If I want to get a 1-5 multiplication table in Matlab I can do:

code:
a = 1:5;
a'*a
The similar numpy equivalent:

code:
a = np.arange(1,6)
a.T * a
Doesn't work because a.T is the same thing as a.

There's obviously a bunch of ways to do this:

code:
a = np.arange(1,6)
np.atleast_2d(a).T * a

a.reshape(len(a),1) * a

b = np.array([np.arange(1,6)]).T
a*b
All of those work, but they also seem rather crude.
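One more spelling of the same thing, as a sketch: indexing with np.newaxis adds the extra axis in place, so there is no reshape call and no length to pass in. The variable names col and table are just for illustration.

```python
import numpy as np

a = np.arange(1, 6)        # shape (5,), a plain 1-D vector

# Indexing with np.newaxis inserts a length-1 axis, giving a (5, 1) column.
col = a[:, np.newaxis]

# Broadcasting (5, 1) against (5,) produces the (5, 5) multiplication table.
table = col * a
print(table)
```

This is the same broadcasting trick as the reshape version above, just written as an index expression.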

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Symbolic Butt posted:

You can use the outer product, that's what I use when I need something like that: np.outer(a, a)

Maybe someone better than me can answer you why working with column vectors aren't that great. :v:

Well, there are other times when column vectors are useful I would think, unless there's some other more pythonic structure I don't know about. Say I have a 3x2 matrix M and want to add a third column that's a linear combination of the first two. In Matlab I can do:

code:
>> M = [1 2; 4 3; 3 1]

M =

     1     2
     4     3
     3     1

>> [M, 2*M(:,1)-M(:,2)]

ans =

     1     2     0
     4     3     5
     3     1     5

Whereas in python the things I've been able to come up with work, but seem really messy:

Python code:
>>> import numpy as np
>>> M = np.array([[1,2],[4,3],[3,1]])
>>> M
array([[1, 2],
       [4, 3],
       [3, 1]])

>>> np.vstack((M.T, 2*M[:,0] - M[:,1])).T  # the double transpose method
array([[1, 2, 0],
       [4, 3, 5],
       [3, 1, 5]])

>>> np.hstack((M, (2*M[:,0] - M[:,1]).reshape(3,1)))  # the explicit reshape method
array([[1, 2, 0],
       [4, 3, 5],
       [3, 1, 5]])
I'm sure there's a "right" way to do this, but it's not obvious to me what it is.
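For what it's worth, np.column_stack looks like the closest thing to the Matlab one-liner: it treats each 1-D input as a column, so neither the transpose nor the reshape is needed. A minimal sketch with the same M as above (new_col and result are names made up for the example):

```python
import numpy as np

M = np.array([[1, 2], [4, 3], [3, 1]])

# The new column is a 1-D array of length 3.
new_col = 2 * M[:, 0] - M[:, 1]

# column_stack accepts a mix of 2-D arrays and 1-D arrays,
# and stacks the 1-D ones as columns.
result = np.column_stack((M, new_col))
print(result)
```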

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Reformed Pissboy posted:

Nothing wrong with this approach, but you don't need to iterate over string.printable -- i in string.printable without the for will evaluate as "does string.printable contain i?"

Python code:
for i in input:
    if i in string.printable:
        answer += i

Also, to hell with list comprehensions :hehe:
Python code:
import string
input = raw_input()
answer = filter(lambda c: c in string.printable, input)
print answer


What's wrong with the list comprehension approach?

Python code:
>>> import string
>>> thedog = 'the dog'
>>> [c for c in thedog if c in string.hexdigits]
['e', 'd']

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

QuarkJets posted:

PEP 8 disagrees

Really the best reason to have anonymous functions in other languages is that other languages don't let you define named functions within the scope of other functions. But Python lets you define a function wherever you want, with or without invoking lambda, so lambda is just a superfluous way to define a function

You can use a lambda function inline without assigning it. PEP8 is fine with that usage.

edit:

By which I mean:

Python code:

# This...
map(lambda x: x*x, my_list)

# Is generally preferable to this...
def sq(x):
    return x*x

map(sq, my_list)

# whereas PEP8 prohibits this...
sq = lambda x: x*x
map(sq, my_list)

KernelSlanders fucked around with this message at 21:37 on Jul 24, 2014

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Crosscontaminant posted:

The standard library (e.g. operator) provides useful functions like itemgetter for use as key functions and with map/reduce, so there should be no need for a lambda for anything this simplistic. If it's more complex, give it a name and documentation so people who come after you know what the hell you're doing and can use it elsewhere without having to refactor.

How is map(operator.pow, my_list, [2]*len(my_list)) more readable? Or are you suggesting something else?

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

floppo posted:

apologies if this is too broad a question for this thread. I am interested getting data from twitter - specifically a list of all the people following a specific account. My goal is compare two such lists and see who is connected to who - this would require a second set of queries.

I know a little bit of python, but mostly from homework-type assignments that don't require getting data. I've found the twitter API website but I could use a bit more hand-holding. In short does anyone know a good guide to scraping data from twitter using python? I thought I would narrow my search to python stuff for now, but feel free to suggest alternatives if you know of them.

I assume this isn't some money-making venture and you're doing this to gain some experience from the project. In that case, the most valuable skill you will get out of a project like that is the ability to make sense of API docs and build a compliant interface to them. I would strongly suggest you just give it a try. If you're completely lost, the best first hint I can give is that the API will work by you formatting URLs a certain way (as described in the docs) and making web requests with them. The web service will reply with a text document with the data you want in a format also described in the docs.

I strongly suggest you start by just trying to get something (anything) from the API and build from there. I've done some twitter integration before, so if you get stuck come back or shoot me a PM.

If you don't care about learning how to do it, there is a module called python-twitter that may help, but I haven't really played with it.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

ShadowHawk posted:

Python code:
#!/usr/bin/env python3
import itertools
print("\n".join(("".join(fizzbuzz) or str(number) for number, fizzbuzz in itertools.islice(enumerate(zip(itertools.cycle(("fizz","","")), itertools.cycle(("buzz","","","","")))), 1, 101))))
Do I get the job?

I always hated questions like that because there are so many possible answers and I never know what they're looking for. Are you trying to show how you're competent to write professional code rather than homework assignments? Use the library. Do you know how to solve basic programming problems? Write a legible homework-style function with superfluous comments. Do you really know the ins and outs of the platform? Do something like ShadowHawk.

It's a bit like that old Google interview question of, "On a new hardware platform, how would you determine if the stack grows up or down in memory?" I always thought the answer should be, "Read the documentation." Anyone writing code to solve that one is tunnel-visioning and just wasting company time. Of course, I don't work for Google, so what do I know.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

QuarkJets posted:

Is this MySQLdb or something else? For whatever reason I never have to use commit() with MySQLdb

Depending on whether you're using MyISAM or InnoDB, MySQL may not support transactions. It's also possible on InnoDB to set autocommit on.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
You can also open a terminal, navigate to the PyCharm project directory, and run the script from the command line there.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
Is there a good way to filter on a list of tuples? For example, let's say I want to sum the second values where something about the first value is true. This can be done with a list comprehension, but feels a little clunky.

Python code:
my_array = [('a', 1), ('b', 2), ('c', 3), ('a', 4)]
sum([x[1] for x in my_array if x[0] == 'a'])
#> 5

' '.join([x[0] for x in my_array if x[1] % 2 == 0])
#> 'b a'
Obviously that works, but something about it doesn't seem quite right. I suppose I could write some filter/transform/accumulate function, but it really seems like there should be a better builtin.

Python code:
def fta(filter_func, transform_func, accumulate_func, input_list):
    return accumulate_func(map(transform_func, filter(filter_func, input_list)))


fta(lambda x: x[0]=='a', lambda x: x[1], lambda x: sum(x), my_array)
#> 5
Actually, now that I wrote it down, that looks much worse.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Symbolic Butt posted:

Your example is kind of odd, why can't you use a dictionary? Or even something like this:

Python code:
my_list = ['a', 'b', 'c', 'a']
sum(i+1 for i, c in enumerate(my_list) if c == 'a')
# 5

' '.join(c for i, c in enumerate(my_list) if (i+1)%2 == 0)
# 'b a'

Because my_list is the output of a library function that returns a list of tuples. I have no idea why it does this rather than returning a dictionary. Also, the fact that the second values of my tuples were 1,2,3,4 was just a poor choice in setting up my example and could be anything. I do think your tuple expansion combined with ShadowHawk's point about using the generator leads to something pretty clean.

Python code:
my_list = [('a', 1), ('b', 2), ('c', 3), ('a', 4)]
sum(num for string, num in my_list if string == 'a')
#> 5
I think the two problems with the original were the allocating of a list to be immediately tossed and the cryptic x[0], x[1] calls. This seems to fix both.

KernelSlanders fucked around with this message at 20:17 on Aug 17, 2014

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

dirtycajun posted:

Okay I made the changes, turns out I did need the break, and it would only work if I did another time call in the inner loop. It can probably get cleaned up some but this functions:

...

It wasn't turning off just from the while function, was I doing something wrong further or will it work if I just do a time call in the inner loop to keep it updated?

edit: yea, just needed the other time call, no break. DERP

If your plan is to leave it running until you CTRL-C the process from the terminal, you should probably look into trapping signals so that you can shut the light off right before your application quits.
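A rough sketch of what I mean, assuming a light object with on() and off() methods. FakeLight, run, and install_sigterm_handler are all made-up names for illustration, not anything from your script. CTRL-C shows up in Python as KeyboardInterrupt, so a try/finally already covers that case; the optional handler just makes SIGTERM take the same path.

```python
import signal


class FakeLight:
    """Hypothetical stand-in for whatever actually drives the light."""

    def __init__(self):
        self.is_on = False

    def on(self):
        self.is_on = True

    def off(self):
        self.is_on = False


def _raise_interrupt(signum, frame):
    # Turn SIGTERM into the same exception CTRL-C raises.
    raise KeyboardInterrupt


def install_sigterm_handler():
    # Optional: must be called from the main thread.
    signal.signal(signal.SIGTERM, _raise_interrupt)


def run(light, loop_body):
    """Call loop_body repeatedly; always switch the light off on exit."""
    light.on()
    try:
        while True:
            loop_body()
    except KeyboardInterrupt:
        pass  # CTRL-C (or SIGTERM, if the handler is installed)
    finally:
        light.off()
```

The finally block is the important part: it runs whether the loop ends by interrupt or by an unexpected error.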

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

BigRedDot posted:

Oh, I do too, hence the scare quotes. :) Let's go another direction and make the ultra-flexible enterprise scorer that can perform arbitrary accumulations and transforms on the input with configurable decoders:

code:
''' The score.py module provides functions for generalized word 
scoring, as well as specialized scorers for particular rulesets.

'''

def score_word(word, decode, transform=lambda x: x, accum=sum):
    ''' Score an input word according to given decode, transform, and 
    accumulation policies.

    Args:
        word (str): 
            a word to score
        decode (callable): 
            callable taking one letter as input that maps the letter to 
            its score
        transform(callable, optional) : 
            callable that performs any necessary transformation on each 
            letter before decoding (default: identity)
        accum (callable, optional): 
            callable that reduces a sequence of letter scores into a 
            final score (default: sum)

    Returns:
        float : score

    Examples:

    >>> score_word("za", decode=lambda x: 2)
    4
    >>>

    '''
    return accum(decode(letter) for letter in transform(word))

#
# Scrabble (tm) specific scoring functions
# 

_SCRABBLE_SCORE_BY_LETTER = {
    "c": 3,
    "b": 3,
    "d": 2,
    "g": 2,
    "f": 4,
    "h": 4,
    "k": 5,
    "j": 8,
    "m": 3, 
    "q": 10,
    "p": 3,
    "w": 4,
    "v": 4,
    "y": 4,
    "x": 8,
    "z": 10,
}


def scrabble_word(word):
    ''' Score a word according to Scrabble (tm) rules.

    Args:
        word (str) : 
            a word to score

    Returns:
        float : score

    Examples:

    >>> scrabble_word("za")
    11
    >>>

    '''
    return score_word(
        word, 
        decode=lambda x: _SCRABBLE_SCORE_BY_LETTER.get(x, 1), 
        transform=lambda x: x.lower()
    )

if __name__ == '__main__':
    import doctest
    doctest.testmod()
Sorry, it's Labor Day, and I'm, er... uh, working? I leave it as an exercise to the reader to extend this and define an accumulator to handle scoring (double, triple)-(letter, word) modifiers. Don't forget to update the doctests, and re-run Sphinx to generate new API documentation (you'll need sphinx-napoleon to handle the Google style docstrings). Pull requests welcome!

Edit: Unicode support is an open issue; see the GH tracker.

Let's take a simple problem and rather than taking five minutes to solve it, let's figure out if it's a special case of some much larger set of problems that we might possibly (but probably won't) want to solve at some point in the future. Then let's develop a spec for an extensible framework and class library that can be called upon to solve any of those larger classes of problems using unnecessarily cryptic chains of method calls to execute the same functionality as we could have done yesterday when we started all this.

Sorry, but solving a problem that can be solved in 20 lines of code (unless it's running a space shuttle or pacemaker) shouldn't involve opening powerpoint to make a presentation on your proposed "framework."

This is why I hate java.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Steampunk Hitler posted:

Now I want to be able to drop the installer_version however since there are multiple values there, I'd like to take any row where the only difference is the installer_version column, and sum() the count column. Anyone have an inkling how to do this?

It should be

Python code:
df = df.drop("installer_version", axis=1).drop_duplicates()
Edit: Oh, sum the count. Need to read better.

Python code:
columns = ['day', 'distribution_type', 'python_type', 'python_release', 'python_version', 'operating_system']
df = df.groupby(columns)['count'].sum()
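Here's a toy version to show the shape of the result. The data is invented, and I've only kept two of the grouping columns to keep it short. Passing as_index=False is one way to get a flat DataFrame back instead of a Series with a MultiIndex.

```python
import pandas as pd

# Made-up rows: the first two differ only in installer_version.
df = pd.DataFrame({
    "day": ["2014-09-01", "2014-09-01", "2014-09-02"],
    "python_version": ["2.7", "2.7", "3.4"],
    "installer_version": ["1.5.6", "1.4.1", "1.5.6"],
    "count": [10, 5, 7],
})

# Group on everything except installer_version and count.
keys = ["day", "python_version"]

# Rows sharing the same keys collapse into one, with their counts summed.
totals = df.groupby(keys, as_index=False)["count"].sum()
print(totals)
```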

KernelSlanders fucked around with this message at 16:18 on Sep 9, 2014

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Hughmoris posted:

Let me preface this by saying I'm not a programmer, just a guy trying to automate crap at work...

I have a folder with lots of text files. In the text files are account numbers and associated information. The text files are sorted and named by a range of dates. For instance:
code:
CQMB_01012014_01082014.txt
CQMB_01082014_01152014.txt
CQMB_01152014_01222014.txt
If I want to look for an account that was at the hospital on 01/10/2014 then I would open CQMB_01082014_01152014.txt etc...

I have my eyes on a larger program but right now I simply want to be able to provide the script a date, and the script knows which text file to open. If the date falls on the edge, for instance 01/08/2014, then it would open both text files with that date.

Any advice on how to tackle this?

Start by reading the entire file list into memory, parse the start and end date parts of the filename into datetime objects, then find the ones that overlap your target date. Hint: ranges that overlap your target date are the ones where the start date is less than or equal to your target and the end date is greater than or equal to your target. That kind of indexing is the sort of thing pandas does really cleanly, but it can certainly be done just using lists as well.
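A minimal sketch of the list-only version, assuming every filename looks like PREFIX_MMDDYYYY_MMDDYYYY.txt as in your examples. The function names are made up, and in practice the filename list would come from something like os.listdir rather than being typed in.

```python
from datetime import date, datetime

DATE_FORMAT = "%m%d%Y"  # e.g. 01082014 -> January 8, 2014


def parse_range(filename):
    """Return (start, end) dates parsed from PREFIX_START_END.txt."""
    stem = filename.rsplit(".", 1)[0]
    _prefix, start, end = stem.split("_")
    return (
        datetime.strptime(start, DATE_FORMAT).date(),
        datetime.strptime(end, DATE_FORMAT).date(),
    )


def files_for_date(filenames, target):
    """Return every filename whose date range includes target."""
    matches = []
    for name in filenames:
        start, end = parse_range(name)
        if start <= target <= end:
            matches.append(name)
    return matches


filenames = [
    "CQMB_01012014_01082014.txt",
    "CQMB_01082014_01152014.txt",
    "CQMB_01152014_01222014.txt",
]
print(files_for_date(filenames, date(2014, 1, 10)))
print(files_for_date(filenames, date(2014, 1, 8)))
```

Because both comparisons are inclusive, a date that sits on an edge matches both neighbouring files, which is the behaviour you asked for.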

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
Just chiming in to echo that HTMLParser is amazing and you should use it.

e: ^^ BeautifulSoup is also very nice, although they are quite different in the way you use them.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

kraftwerk singles posted:

I'm having a strange issue rounding numbers in pandas and exporting to dict.

my_series is the result of some aggregations on a dataframe with .astype('float') applied at the end.

Unrounded:
Python code:
my_series.unstack(level=0).to_dict()
{1410843600: {u'a': 0.081347309478284627,
  u'b': 0.035099699535645998,
  u'c': 0.61429595738869158,
  u'd': 0.019871619776017483,
  u'e': 0.24938541382136029},
 1410930000: {u'a': 0.074382538770821363,
  u'b': 0.039919586444572087,
  u'c': 0.59180547578020293,
  u'd': 0.017710128278766991,
  u'e': 0.27618227072563661}
...}
No arguments to np.round():
Python code:
np.round(my_series.unstack(level=0)).to_dict()
{1410843600: {u'a': 0.0,
  u'b': 0.0,
  u'c': 1.0,
  u'd': 0.0,
  u'e': 0.0},
 1410930000: {u'a': 0.0,
  u'b': 0.0,
  u'c': 1.0,
  u'd': 0.0,
  u'e': 0.0}
...}
Rounding to 2 decimal places (desired):
Python code:
np.round(my_series.unstack(level=0), 2).to_dict()
{1410843600: {u'a': 0.080000000000000002,
  u'b': 0.040000000000000001,
  u'c': 0.60999999999999999,
  u'd': 0.02,
  u'e': 0.25},
 1410930000: {u'a': 0.070000000000000007,
  u'b': 0.040000000000000001,
  u'c': 0.58999999999999997,
  u'd': 0.02,
  u'e': 0.28000000000000003}
...}
Some of the numbers are not being rounded as I would expect.
to_csv(), to_records(), etc. results in the correct rounding. Only to_dict() poses a problem. Is this a bug?

This appears to be working as intended. You told python you only cared about two decimal places and then complained when the 18th decimal place wasn't what you expected.

Basically, when you call round on 0.081347309478284627, you're asking for a number that is indistinguishable from 0.08 within floating point precision. That is exactly what you got. The problem is that 0.08 cannot be represented exactly in binary, just as 1/3 cannot be represented exactly in decimal. Try subtracting 0.08 from 0.080000000000000002: you should get 0, not 2E-18. If you want numbers represented exactly to two decimal places, use a decimal class.
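To make that concrete, here's a small sketch using the standard library's decimal module. The value is the first one from your output. Going through str() first is my choice here, so that the Decimal starts from the short repr of the float rather than its full binary expansion.

```python
from decimal import Decimal

value = 0.081347309478284627

# The float that round() returns is the same float as the literal 0.08.
same_float = round(value, 2) == 0.08

# A Decimal quantized to two places stores exactly 0.08, with no binary tail.
exact = Decimal(str(value)).quantize(Decimal("0.01"))

print(same_float)
print(exact)
```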

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

SurgicalOntologist posted:

The problem is that out.append(i) modifies out in-place and returns None. In a comprehension, you need to use the value itself.

So you should try:

Python code:
out = [i for i in ints if i not in out]
However, this won't work either, because the comprehension on the right is not assigned to out until after it's constructed. So the not in check is not going to find anything. Your original is probably fine if you care about the order of the items in the list. Otherwise look into set.

There's also the numpy.unique function.
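Worth noting that numpy.unique returns the unique values sorted, so it won't keep the original order. If order matters, a set for the membership check plus a list for the output does it in one pass. A quick sketch (unique_in_order is a made-up name, and it assumes the items are hashable):

```python
def unique_in_order(items):
    """Return the distinct items, keeping first-seen order."""
    seen = set()   # fast membership test
    out = []       # preserves the order items first appeared in
    for item in items:
        if item not in seen:
            seen.add(item)
            out.append(item)
    return out


ints = [3, 1, 3, 2, 1]
print(unique_in_order(ints))
```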

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Cultural Imperial posted:

Another vote for pycharm, Mac or PC.

4

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

regularizer posted:

Gotcha, thanks. I was trying to learn list comprehensions before the tutorial got to it. On a related note, when I finish the codecademy tutorial in a few days what should I do next? I'd like to get a more in-depth understanding of python before moving on to another language. I was thinking about trying to make a simple program I could run on osx, so is there something I could do to learn about making GUIs with python?

The answer to that depends entirely on why you are learning python. If it's to have fun, then pick something (simple, always start simple) that you think would be fun to try to build. It really doesn't matter what it is. If you're learning python to get a job, then it depends on what kind of job: general developer, web developer, front-end developer, data scientist all suggest different classes of projects you should try.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

This should probably just get added to the OP at this point.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Thermopyle posted:

I just wanted to stop in and say that I've been using mypy for the last week (I've just played with it previously) and it's pretty super if you've thought that Python could use option static type checking.

I won't use it all the time, but for larger programs its pretty nice (well it's still experimental so I won't use it for larger programs either yet).

It's an interesting idea, but seems to go against a lot of pythonic practices, especially the centrality of the dict or list with non-uniform types.

Dominoes posted:

I like how the type signatures provide a quick insight of what the function's doing. It's especially useful when inputting or outputting complex/nested data structures.

That can be a plus when things are not otherwise well documented, but it doesn't replace proper documentation. I've been doing a fair bit of scala programming lately and it's amazing how many library authors think "type safe" means "don't have to write documentation."

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
You seem to have a circular dependency.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

SurgicalOntologist posted:

Given this spec for communicating with a treadmill over serial, what data type should I be sending? I've never done this kind of thing before. I thought it should be b'C0' for example, but that doesn't work. Do I need to do some kind of encoding? I'm a bit out of my element here.

I'm also not entirely confident I've got the device/communication set up correctly with pyserial. I do get
code:
USB COM PS1
T
(no newline after the T)

when I power-cycle the treadmill, but after that I don't seem to get anything, even the error codes, no matter what I send in.

Is that all the spec you have? It looks like not really enough. They should also provide you the unit's desired baud rate and parity settings.

To test the communication, you should be able to write 0xA4 and read out 0xB4.

Python code:
ser.write(bytes([0xA4]))  # write() wants bytes, not a bare int
r = ser.read()            # reads a single byte by default
print(r.hex())
Remember, though, any time you have moving parts controlled by a computer you need to have an emergency stop button that physically disconnects power from the unit. University labs like to skip this step, but it can go really, really badly.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

SurgicalOntologist posted:

Figured out the bytes thing. It had to be dev.write([0xA1]) (i.e. it expected an iterable).

Yeah that makes sense. Glad you got it working.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
What's wrong with the comprehension here? It seems pretty clean.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
The operator and is the builtin logical operator you should probably be using unless you're dealing with numpy arrays.

For the loop, you could pre-filter with a generator or list comprehension, although I don't know that it's any clearer than your second example.

Python code:
for x in (y for y in sequence if meets_condition(y)):
    do_a_thing(x)        # placeholder for the real loop body
    do_another_thing(x)
or

Python code:
filtered_sequence = [x for x in sequence if meets_condition(x)]
for x in filtered_sequence:
    do_a_thing(x)        # placeholder for the real loop body
    do_another_thing(x)

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

SurgicalOntologist posted:

Is it possible to remove the leading zero when string formatting a float guaranteed to be between -1 and 1? I always look into this when I'm annotating correlations, and I always give up and do it in two steps. Now, though, I'm using a library that has you pass in the formatting string... so am I out of luck?

When all else fails, regex.

Python code:
import re

x = 0.31415925
re.sub(r'0\.(\d+)', r'.\1', "%.3f" % x)  # returns '.314'

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
If not for the downward-only constraint, the longest path is any of the many possibilities that touch every node. For the downward-only version, why wouldn't you just start at the bottom and work back to the top? For the row one up from the bottom, each node is worth its weight plus the greater of the two below it. Assign that as the new weight for the node and repeat for the row above. That's O(n) in the number of nodes.
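A sketch of that bottom-up pass, assuming the triangle is given as a list of rows where row k has k+1 weights. max_path_sum is a made-up name, and the small triangle is just a test case.

```python
def max_path_sum(triangle):
    """Best top-to-bottom total, moving down-left or down-right each step."""
    # Start with the bottom row: each node is worth just its own weight.
    best = list(triangle[-1])
    # Walk upward; each node becomes its weight plus the better child below.
    for row in reversed(triangle[:-1]):
        best = [w + max(best[i], best[i + 1]) for i, w in enumerate(row)]
    return best[0]


triangle = [
    [3],
    [7, 4],
    [2, 4, 6],
    [8, 5, 9, 3],
]
print(max_path_sum(triangle))
```

Each node is visited once and only one row of running totals is kept, so the extra memory is proportional to the width of the bottom row.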

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Dominoes posted:

Is there a way to make Pycharm stop weak-warning uppercase variable names? Unlike some inspections, the only alt-enter option is to rename the variable. I can't find a suitable error code to ignore on this page. I can disable the pep8 naming convention inspection, but that's broader than I'd like.

The easiest way, of course, is to use PEP8 compliant variable names.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Dominoes posted:

Scientific programming makes sense with uppercase variable names (including things like Δ) when convention calls for them. Popular libraries like scikit learn use them in documentation.

I'm not sure that arguing you want unicode variable names is much of an improvement.

I don't follow your point about sklearn. Are you saying that there is code in an sklearn docstring that doesn't follow sklearn coding standards?

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Symbolic Butt posted:

idk writing simply dt is a super old convention in scientific programming, graphics programming, maybe even programming in general I'd say

Yeah, changing to δt kind of misses the point. What's wrong with dt or deltaT?

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
Er... delta_t I mean. :bang:

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

SurgicalOntologist posted:

If you're on OSX you can also do os.system('say X') which can be a lot of fun.

This is amazing. Thank you.

I can now annoy the hell out of my office mates.

e: Holy poo poo git log 2>&1 | say -f -

KernelSlanders fucked around with this message at 18:48 on Jan 24, 2016

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Lysidas posted:

Some tools on Windows (like Notepad, as I recall) prepend U+FEFF to files before saving as UTF-8. Though this doesn't function as a byte order mark (since the concept of byte order doesn't apply to UTF-8), it's still technically valid and serves as a marker that "this file is definitely UTF-8 content".

Python defines a special codec called utf-8-sig that strips the leading UTF-8 "signature" from the beginning of the file.


I'm continually amazed how Windows manages to mangle files. We bought some Bing search ads and hooked up our normal analytics pipeline to their API to download some CSVs that say how the ads did. Care to guess how it broke?

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

qntm posted:

Line endings?

Yup. Even if you work at Microsoft, who sits down to write a web endpoint and thinks CR LF termination is a good idea?
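For anyone hitting the same thing, a minimal sketch of parsing that tolerates either line ending. The CSV text here is invented, not what the API actually sent. The csv module copes with CR LF as long as the underlying stream is opened with newline="" so nothing translates the line endings before the reader sees them.

```python
import csv
import io

# Made-up payloads: same data, different line terminators.
crlf_text = "ad,clicks\r\nfoo,12\r\nbar,7\r\n"
lf_text = "ad,clicks\nfoo,12\nbar,7\n"


def parse_csv(text):
    """Parse CSV text into rows, whatever the line terminator is."""
    # newline="" leaves line endings untouched for csv.reader to handle.
    return list(csv.reader(io.StringIO(text, newline="")))


print(parse_csv(crlf_text))
print(parse_csv(crlf_text) == parse_csv(lf_text))
```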

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.

Hammerite posted:

Wikipedia says... "Most textual Internet protocols (including HTTP, SMTP, FTP, IRC, and many others) mandate the use of ASCII CR+LF ('\r\n', 0x0D 0x0A) on the protocol level, but recommend that tolerant applications recognize lone LF ('\n', 0x0A) as well." I'm not sure whether you were downloading files rather than making web requests, but it seems to me that in that case Microsoft are quite entitled to send you CR+LF files, and if you don't account for that possibility then that's your bag.

Downloading files.

KernelSlanders
May 27, 2013

Rogue operating systems on occasion spread lies and rumors about me.
Is there a stupid simple unconstrained non-linear solver for python similar to Matlab's fminunc?

  • Locked thread