|
I want to find specific sequence patterns in a list of tuples, based on the second element in the tuple. For example, given a list like: code:
I can think of ways to do this by manually iterating through the list and testing the second tuple element against a bunch of IF statements, but suspect there's a smarter, more Python-like way to do it. I've just started with the language, and am already blown away by how much code Python's constructors eliminate.
|
# ¿ May 24, 2012 17:01 |
|
|
# ¿ May 14, 2024 16:11 |
|
Will that give me the sequence, though? The order will matter, and that's the part I'm not sure how to do "elegantly." For example, "animal, vegetable" vs. "vegetable, animal" should return [cow, corn] and [carrot, cat] respectively. Maybe a Regular Expression? However, the dictionary form would be helpful elsewhere in code as a list of items in a category. Is there an easy way to convert the list example (assumed assigned to a variable) to that dictionary format? I'm doing it currently by iterating the list with separate constructors with custom IF statements. The categories (second tuple parameter) are hard-wired, but the entries (first tuple parameter) will vary. onionradish fucked around with this message at 18:35 on May 24, 2012 |
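A minimal sketch of the list-to-dictionary conversion asked about above, using collections.defaultdict. The `items` list is a hypothetical stand-in for the example list (built only from the names mentioned in the post), and `group_by_category` is an illustrative name, not something from the thread.

```python
from collections import defaultdict

# Hypothetical stand-in data: (entry, category) tuples.
items = [
    ("cow", "animal"),
    ("corn", "vegetable"),
    ("carrot", "vegetable"),
    ("cat", "animal"),
]


def group_by_category(pairs):
    """Map each category (second tuple element) to its entries, in list order."""
    grouped = defaultdict(list)
    for name, category in pairs:
        grouped[category].append(name)
    return dict(grouped)


by_category = group_by_category(items)
```

Because the categories come from the data itself, nothing has to be hard-wired into IF statements; a new category simply becomes a new key.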
# ¿ May 24, 2012 18:23 |
|
Thanks for the ideas on pattern matching! A big help, especially learning about zip(); I'll play around with code this weekend.
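One way the zip() idea could look, as a sketch rather than the exact code suggested in the thread: zip the list against shifted copies of itself to get sliding windows, then compare each window's categories to the wanted pattern. The `items` list and the `find_sequence` name are assumptions made for illustration.

```python
# Hypothetical stand-in data: (entry, category) tuples.
items = [
    ("cow", "animal"),
    ("corn", "vegetable"),
    ("carrot", "vegetable"),
    ("cat", "animal"),
]


def find_sequence(pairs, pattern):
    """Return the entries of every run of consecutive items whose
    categories match `pattern`, in order."""
    width = len(pattern)
    # zip(pairs, pairs[1:], ...) yields overlapping windows of `width` items.
    windows = zip(*(pairs[offset:] for offset in range(width)))
    return [
        [name for name, _ in window]
        for window in windows
        if tuple(category for _, category in window) == tuple(pattern)
    ]


animal_then_vegetable = find_sequence(items, ("animal", "vegetable"))
vegetable_then_animal = find_sequence(items, ("vegetable", "animal"))
```

Each result is a list of matches, so a pattern that occurs more than once returns more than one inner list.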
|
# ¿ May 25, 2012 17:34 |
|
What is the "best practices" method for working with settings/INI-type files in Python? For example, I have a main script that downloads using a custom host and password to be specified in an INI-like file, and I want to update that INI with the "last-date-processed" so it doesn't process old files. The scripts are running under my complete control, and I'm not worried about someone adding malicious values or codes. Should I manually read/write a TXT file and parse it, should I use "import xxx.ini" to load it into my script, should I use something like JSON or pickle to save/load the variables, or is there a preferred library I should use?
|
# ¿ Oct 11, 2012 18:05 |
|
OnceIWasAnOstrich posted: There is ConfigParser built in if you like INI-style files and want to be able to hand-edit them easily, which can be a pain with JSON files for people who don't use JSON or JavaScript.
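A rough sketch of the ConfigParser approach for the "last-date-processed" case. It is written against Python 3, where the module is named configparser (ConfigParser in Python 2). The section name "download", the option names, and the two function names are assumptions for illustration only.

```python
import configparser


def read_settings(path):
    """Load the INI file at `path` and return the parser object."""
    config = configparser.ConfigParser()
    config.read(path)
    return config


def save_last_date(path, date_text):
    """Record the last-processed date and write the file back out."""
    config = read_settings(path)
    if not config.has_section("download"):
        config.add_section("download")
    config.set("download", "last_date_processed", date_text)
    with open(path, "w") as handle:
        config.write(handle)
```

Usage would be along the lines of `read_settings("settings.ini").get("download", "host")`. One thing to be aware of: when the file is rewritten this way, any comments that were hand-typed into it are not preserved.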
|
# ¿ Oct 11, 2012 19:15 |
|
I could use some advice on project structure so I'm not kicking myself later. I'm about to migrate to a new machine and start a large personal project "for reals" with versioning, proper classes, etc. instead of the folder of hodgepodge scripts I've been using to learn. When I've done development in other languages, all the code and assets were contained within a single \\dev\projectname path. It was easy to manage dependencies because third-party scripts, modules, frameworks, etc. were revisioned along with the rest of the code. This project will require some third-party modules (numpy, etc.). Since these get installed to Python's directory, how should I manage dependencies on these modules so I can re-create the development environment when I have to move machines, restore from backup, etc.? I can just keep copies of the module installation packages, but I'm guessing there are better ways to go about it.
|
# ¿ Dec 13, 2012 18:45 |
|
Thanks for the advice; and, the sample dirtree is a big help, so thanks for that detail, Haystack! I'll need to do some more reading and testing on virtualenv as soon as I finish uninstalling all the bloatware that came with the new system. In the example usage given in the docs: code:
Also, when I'm installing modules/packages, do I install to a project's particular virtualenv, to the native Python directory or both? For example, if I want a module like numpy or PIL to be available to every script, I'm assuming that I install it to the native Python directory. For my actual project, I'm assuming I'll also need to install it to its particular virtualenv so that version is the one that will run.
|
# ¿ Dec 13, 2012 21:44 |
|
Awesome! Thanks, everybody. Glad I asked!
|
# ¿ Dec 13, 2012 22:57 |
|
Following up on virtualenv, I'm having failures when trying to install packages, and I'm not sure what I'm doing wrong. I'm on Windows, and using CMD as the shell. I'm able to create a project folder: "virtualenv test". After I do that, I'm navigating to the test directory (D:\test\project) and entering "scripts\activate". Then I'm trying "pip install lxml" as an example. A bunch of stuff scrolls by and results in "failed with error code 1". I'm not sure what this means: code:
|
# ¿ Dec 14, 2012 20:20 |
|
is right -- for f's sake -- I picked lxml thinking it would be a stupidly simple test package! I'm torn between being happy it's not my fault and angry about the hoops I'm going to have to jump through.... I'll give the VS 2008 and "manually-copy-the-binary" methods a try.
|
# ¿ Dec 14, 2012 22:48 |
|
Hard NOP Life posted:Why is everyones first instinct to try and compile it instead of just installing the binaries? edit: onionradish fucked around with this message at 18:51 on Dec 15, 2012 |
# ¿ Dec 15, 2012 18:29 |
|
An update to my earlier post: the linked StackOverflow post was about using "easy_install" in a Windows virtualenv, but testing with binaries for PIL, it only appears to work, meaning it actually doesn't. The package only gets partially installed. The same SO post includes a suggestion to change the registry around, which seems really hacky.
onionradish fucked around with this message at 18:53 on Dec 15, 2012 |
# ¿ Dec 15, 2012 18:50 |
|
JetBrains' sale blew up their whole process. And of course it's kind of a cascading mess with some people not having a record of the purchase from the processor, others having ordered multiple times when it wasn't clear whether the order was going through. So now in addition to whatever keygen queue backlog they had from the sale, they've also got a backlog of inquiries to sales@jetbrains.com to work through. The conflicting communication is kind of the issue. One source says 48 hours, one says within 5-6 hours. I don't mind a couple days waiting for a key, as long as I know to expect that. I don't like needing to play Internet Detective to find Twitter and blog posts to figure out what I'm supposed to do: just chill for a couple days? contact sales? http://blog.jetbrains.com/blog/2012/12/21/to-all-who-placed-an-order-during-the-end-of-the-world-sale/ https://twitter.com/jetbrains In the end, after reading comments on the blog from people claiming they got their keys "no problem," I joined the herd and sent my info to their sales address too. I really like PyCharm and was going to buy it anyway in January, so I'm thrilled to pick it up for $25. edit: The email response from sales is: "If your license doesn't reach you by Monday, please let us know and we'll make sure to help you out." onionradish fucked around with this message at 15:40 on Dec 22, 2012 |
# ¿ Dec 22, 2012 14:45 |
|
I'm trying to whip up a script to convert between Unicode and ASCII HTML entities for some website work, but am failing somewhere on the Unicode conversion. The conversion to named entities works fine, but fails when I convert back to accented Unicode, and specifically on the & rsquo ; entity. What am I missing? Full code at Pastebin code:
onionradish fucked around with this message at 16:39 on Feb 19, 2013 |
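For comparison with the Pastebin script (which is not reproduced here), a minimal sketch of a round trip between Unicode text and named HTML entities using only the standard library. It is written for Python 3, where the html module provides unescape(); the function names are illustrative assumptions.

```python
import html
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace each non-ASCII character with a named entity when one
    exists, otherwise with a numeric character reference."""
    pieces = []
    for char in text:
        codepoint = ord(char)
        if codepoint < 128:
            pieces.append(char)  # plain ASCII passes through untouched
        elif codepoint in codepoint2name:
            pieces.append("&%s;" % codepoint2name[codepoint])
        else:
            pieces.append("&#%d;" % codepoint)
    return "".join(pieces)


def to_unicode(text):
    """Turn named and numeric entities back into Unicode characters."""
    return html.unescape(text)
```

Note that `to_named_entities` deliberately leaves ASCII characters such as & and < alone, so it is only suitable for text that is already valid HTML.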
# ¿ Feb 19, 2013 16:35 |
|
UTF/HTML stuff...The Insect Court posted:You're over-thinking this.
|
# ¿ Feb 20, 2013 16:38 |
|
A stupid post was here.
onionradish fucked around with this message at 20:39 on Mar 5, 2013 |
# ¿ Mar 5, 2013 20:33 |
|
I spent a day and a half fighting a similar "problem" and had myself convinced I didn't understand Unicode either. My code was actually fine -- the problem was the output console (Windows Powershell). Try running the code in IDLE, or writing the output to a file and see if you're getting the value you expect. It may be your output console that doesn't understand Unicode, not you. From IDLE: >>> m=u'm\u0325' >>> print m m̥ edit: ^^ the M-dot doesn't show when it's enclosed in a 'code' block on the forum
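A small sketch of the "write it to a file and check" suggestion, in Python 3 syntax (the IDLE session above is Python 2). The helper names are hypothetical. ascii() is used for console output because it escapes non-ASCII characters, so it is safe on a console that cannot display them.

```python
def save_text(text, path):
    """Write text to `path` as UTF-8, bypassing the console entirely."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


def load_text(path):
    """Read the UTF-8 file back so the value can be compared in code."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


sample = "m\u0325"
# Prints the escaped form, which any console can show.
print(ascii(sample))
```

If the text read back from the file equals the original string, the code is fine and the console is the part that is failing.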
|
# ¿ Apr 12, 2013 01:07 |
|
Is it bad practice to put a lot of the "prep work" for a class into its __init__? I'm writing a throwaway script that parses a recipe from a URL to get more familiar with lxml, writing classes, unit tests and exception-handling. In my first cut at the script, "recipe = Recipe(url)" fetches the HTML from the URL, parses it, then populates a bunch of class attributes. Should I instead be calling a method to do that on the object after initializing it? Something like "recipe = Recipe()" then "recipe.getfromurl(url)"?
|
# ¿ Jul 9, 2013 01:25 |
|
Thanks for the __init__ feedback. The googletesting link Chosen posted is great timing because I'll be trying to set up tests next so I can refactor now where needed. I'd actually done all of the "work" in the __init__ through methods as Ronald Raiden suggested, but the idea of making the parameter optional (even though it'd always be provided in practice) seems like it'd be better for testing, allowing creation of a "plain" instance and then assertions against the methods.
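A sketch of what the optional-parameter version might look like. Everything here is an assumption for illustration: the attribute and method names are made up, and a simple regular expression stands in for the lxml parsing so the example has no third-party dependency.

```python
import re
from urllib.request import urlopen


class Recipe:
    def __init__(self, url=None):
        self.url = url
        self.title = None
        # The URL is optional, so tests can build a "plain" instance
        # and call the methods directly without any network access.
        if url is not None:
            self.load_from_url(url)

    def load_from_url(self, url):
        """Fetch the page and hand the HTML to the parser."""
        with urlopen(url) as response:
            self.parse(response.read().decode("utf-8"))

    def parse(self, html_text):
        """Populate attributes from raw HTML (regex stand-in for lxml)."""
        match = re.search(r"<title>(.*?)</title>", html_text, re.S)
        self.title = match.group(1).strip() if match else None
```

A test can then do `recipe = Recipe()` followed by `recipe.parse(saved_html)` and assert against `recipe.title`, while normal use stays `Recipe(url)`.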
|
# ¿ Jul 9, 2013 14:01 |
|
Using lxml, foo.text_content() will return plain text contained within a particular level, but it strips out all HTML tags. How can I get the raw HTML contained within a particular level? For example, how do I get all the content between <div class='dummy'> and </div> not just "the quick brown fox"? (This is probably stupidly easy, but I'm not seeing it....)code:
onionradish fucked around with this message at 21:43 on Jul 25, 2013 |
# ¿ Jul 25, 2013 21:34 |
|
A helpful resource that should probably be added to the OP is the pyvideo.org website, which archives presentations given at various Python conferences. Some speakers and presentations are better than others, but there are gems of useful, practical information on unit tests and web frameworks (Flask to Django), standard and other modules (Requests, Pygame, etc.), details on core capabilities like iterators/generators, astronomy and other specialized topics, and so on, usually with links to sample code.
|
# ¿ Aug 30, 2013 15:39 |
|
I'm migrating some of the helper scripts I've written over the years in AutoIt to Python to improve my Python coding. Some of these AutoIt scripts have minimalist Windows GUI elements like a system tray icon that shows that the script is running and can display status as a tooltip on that icon. I'd like to replicate that functionality with the least-possible effort and module dependencies. wxPython seems to be a decent cross-platform library for basic GUI stuff like tray icons, though the documentation seems thin. Can anyone vouch for it and whether or not any of the reference books ("wxPython in Action", "wxPython 2.8 Application Development Cookbook") are worthwhile?
|
# ¿ Sep 26, 2013 17:21 |
|
I use both the Windows console and PyCharm, but when I started learning I was using just Notepad++ and the console. (I later moved up to Spyder before switching to PyCharm.) One of the early and recurring negative experiences I had with the Windows console is its inability to display any Unicode or Windows characters. I spent far too much time trying to understand what was wrong with some piece of code only to discover that there was nothing wrong with the code at all. The problem was "print"-ing to a crappy console. An IDE (even IDLE) can at least run code:
A full IDE like PyCharm can be overwhelming for sure, but it can be used as just an editor and output console without learning much about the rest of the IDE -- there's still plenty of stuff in it I've never used at all. While learning, I appreciated its auto-inspection to catch stupid typos and its "nagging" about missing docstrings, line length, etc. to remind me about good coding habits. When I ignore the guidelines, it's a conscious choice. It's also much easier to set breakpoints, step through and watch variables in a GUI for code as it starts to get more complex than it is at the console level.
|
# ¿ Nov 23, 2013 23:14 |
|
I built a basic web app using BaseHTTPServer that does simple formatting based on an sqlite database. It works totally fine as is, and importantly automatically launches my browser when I run the script:Python code:
|
# ¿ Dec 4, 2013 22:55 |
|
Pollyanna posted:Additionally, I can see that /AAPL,GOOG,IBM would be functionally identical to /GOOG,AAPL,IBM which sets off my "don't repeat yourself" alarm. I don't know if it's a false alarm, though.
|
# ¿ Dec 22, 2013 17:51 |
|
You can do conditional regex matching with a look-ahead regex. It can make the regex pattern really gnarly, so you'll have to decide whether it's worth it for code readability.Python code:
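As a separate, hypothetical illustration of the look-ahead technique (not the snippet that originally accompanied this post): the pattern below matches a name only when it is immediately followed by ".py", without including the extension in the match. The pattern and function names are assumptions.

```python
import re

# (?=...) is a look-ahead: it must match at this position,
# but the text it matches is not consumed or returned.
PY_STEM = re.compile(r"\w+(?=\.py\b)")


def python_stems(text):
    """Return the base names of every '*.py' file mentioned in text."""
    return PY_STEM.findall(text)
```

Even in a case this small the pattern takes a moment to read, which is the readability trade-off mentioned above.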
|
# ¿ Dec 22, 2013 18:42 |
|
I recently split a single script into separate files with grouped related functions so the project would be easier to manage. After I split the script, there were some global constants defined in the parent script that the imported scripts could no longer see. I was able to move the constants to the appropriate files, or just add them as passed function parameters -- so Python may be helping enforce good coding practices -- but it got me curious to understand the scope of variables across imported files better. As an example, if I wanted a "parent" script to set global constants or config that imported scripts might use, like the path to a master directory, is there an appropriate way to access those parent variables from an imported script? Is that what __init__.py is for, or is the idea bad practice in general, and an indication that they should be passed parameters?
|
# ¿ Feb 23, 2014 21:32 |
|
QuarkJets posted:For constant parameters used in more than one place I create a py file that just defines those values, and then whenever I need those constants they're just an import away.
|
# ¿ Feb 23, 2014 23:12 |
|
I'm writing some unit tests for a script that parses an HTML page. For testing, I'm using a reference HTML file that's a full WGET of the target page. What's best practice for the assert against a function that returns all tags according to a selector, like lxml's "cssselect()" or bs4's "find_all()"? Let's say that the reference page is supposed to return 14 <a> items as a list. Is it enough to just verify the "len()" of the results, just check a few of the actual values (maybe first and last), or verify that the full result list matches my list of expected results? The answer might be "it depends," and that's ok. Mostly I'm wondering if either or both of the first two methods would be considered insufficient or bad practice.
|
# ¿ Mar 11, 2014 19:37 |
|
Dren posted:Why do you have it in a special function? Isn't returning all the tags matching a selector what find_all() does? Why would you stuff that behavior in a new function then try to test it? Maybe I oversimplified the example or maybe I'm over-testing. In actual practice, it would only be a function because it's going to be called multiple times and has conditionals. Maybe a better example would be a function that returns the urls of leeched images found on a page. (I have a couple of clients that I've been unable to break of the habit when they make blog posts.) So a "leeched_images(url)" function might return 10 of 14 found <img> on site A, and 1 of 6 <img> on site B. Super-hacky pseudo-code below. If the right thing to do is test the "img_is_on_host(imgsrc)" function and not test "leeched_images()" at all, that's fine. Just trying to understand where to draw the line on testing. Python code:
onionradish fucked around with this message at 23:00 on Mar 11, 2014 |
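A runnable sketch of how the two functions named above might be split so each can be tested on its own. This is an assumption-laden illustration, not the pseudo-code from the post: here `leeched_images` takes a list of already-extracted src values plus the page URL instead of fetching the page itself, and `img_is_on_host` takes the page URL as a second argument.

```python
from urllib.parse import urlparse


def img_is_on_host(img_src, page_url):
    """True when the image is served by the page's own host.
    Relative URLs have no host part and count as local."""
    img_host = urlparse(img_src).netloc
    return img_host == "" or img_host == urlparse(page_url).netloc


def leeched_images(img_srcs, page_url):
    """Return the src values that point at some other host."""
    return [src for src in img_srcs if not img_is_on_host(src, page_url)]
```

With the fetching and parsing kept out of these functions, a test can assert against the full expected list for a small hand-written input rather than only checking a count.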
# ¿ Mar 11, 2014 22:57 |
|
Thanks for taking the time to write that out -- it was really helpful.
|
# ¿ Mar 12, 2014 17:51 |
|
I'm frequently using a pattern to read from a text "config" file with values on a single line, and want to ignore blanks and comments (lines that start with #). Examples of the config files are RSS URLs to be scraped, folders to be indexed, etc. Is there a more compact or better pattern than this generator? Python code:
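For reference, one common shape for this kind of generator, offered as a sketch rather than as the code from the post. The name `config_lines` is hypothetical; it accepts any iterable of lines, so an open file object or a plain list both work.

```python
def config_lines(lines):
    """Yield stripped, non-blank lines that are not # comments."""
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            yield line
```

With a file it would be used as `with open(path) as handle: urls = list(config_lines(handle))`.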
|
# ¿ May 28, 2014 20:07 |
|
What's the right way to gracefully handle missing imports -- and thus unavailable capabilities -- that are not critical to a script? For example, I have a script that I use on my home and work systems. At home, the script uses gntp/Growl to display a nice pop-up graphic "toaster" notification. On my work system, I don't have Growl installed and just displaying the notification text in the console is good enough. Is what I've done below 'right' or is there a more proper way? Python code:
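One widely used pattern for an optional dependency, shown as a sketch and not necessarily either of the approaches discussed in the thread: attempt the import once at module level, remember whether it worked, and branch on that later. The `notify` function and its `notifier` parameter are hypothetical, and the sketch assumes the mini() convenience helper in gntp.notifier.

```python
try:
    # Optional: only present on machines with gntp installed.
    import gntp.notifier as growl_notifier
except ImportError:
    growl_notifier = None


def notify(message, notifier=growl_notifier):
    """Show a Growl pop-up when available, else print to the console.
    Returns which channel was used."""
    if notifier is None:
        print(message)
        return "console"
    notifier.mini(message)  # assumed gntp helper for one-off messages
    return "growl"
```

Catching only ImportError keeps unrelated errors inside the optional module from being silently swallowed.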
|
# ¿ Jun 22, 2014 15:38 |
|
The second approach will be a lot cleaner for this script. Thanks!
|
# ¿ Jun 22, 2014 16:16 |
|
the posted:I'm using Beatbox to query a database in Salesforce. I'm grabbing a list of Account zip code fields. They get returned as type 'instance' and I convert them to a string before doing anything with them. the, when you initialize beatbox, are you using beatbox.Client() or beatbox.PythonClient()? I remember looking at the code for beatbox when you were posting about having to wrap everything in str() and shuddering -- as SurgicalOntologist has suggested, the API is godawful. However, it looks like it's because the default Client API is just a wrapper around some horrifying XML thing that is just dumping out results in a list that you then have to wrap in str(), which is part of what's causing your encoding issues. beatbox includes PythonClient which claims to turn "the returned objects into proper Python data types. e.g. integer fields return integers" and appears to return a list of dictionaries (their example): Python code:
|
# ¿ Jul 11, 2014 17:35 |
|
the posted:Huh, I've been using Client. Interesting. Switch and your code should be MUCH easier to work with. You'll have to change some of your code where you were doing list indexing to access a record's field, but result['FirstName'] will be a lot easier to read and work with than remembering that result[3] is supposed to be FirstName. It looks like PythonClient also supports accessing the dictionary in dot-notation, meaning result.FirstName would also work.
|
# ¿ Jul 11, 2014 17:50 |
|
the posted:FYI I just tried this and it's popping an error. Ideas? Are you getting that error on the svc = beatbox.PythonClient() line or somewhere else? If you do import beatbox; dir(beatbox) does PythonClient show up in the list? If not, see if your version is the same as the PyPi version. I'm assuming you retyped the code; the first colon looks like a typo here: for i in query[sf.records:]:
|
# ¿ Jul 16, 2014 21:08 |
|
the posted:How would I handle reading times that are like: %f treats the microsecond value as a fraction, so '.6' gets correctly parsed into '600000' microseconds. \/ \/ \/ onionradish fucked around with this message at 02:53 on Jul 17, 2014 |
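A quick sketch of the %f behaviour described above. The "%H:%M:%S.%f" layout and the sample strings are assumptions, since the times from the quoted question are not shown; `parse_time` is an illustrative name.

```python
from datetime import datetime


def parse_time(text):
    """Parse 'HH:MM:SS.fraction'; %f accepts one to six digits and
    treats them as a fraction of a second."""
    return datetime.strptime(text, "%H:%M:%S.%f")


short_fraction = parse_time("12:30:45.6")
full_fraction = parse_time("12:30:45.123456")
```

Here `short_fraction.microsecond` is 600000, not 6, because the digits are padded on the right.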
# ¿ Jul 17, 2014 02:16 |
|
thegasman2000 posted:Sorry I am sure your all fed up of answering this but whats the text editor of choice for a python newbie? Looking for something that will prompt and highlight errors such as indentation before compiling if possible. Trying out textmate at the moment but its not highlighting an error I am getting. It catches syntax errors before the script is run, warns when code isn't following PEP8 "best practices" in formatting so I learned good habits early, can pop-up documentation on functions and arguments, and includes breakpoint debugging and other tools to inspect variables, step through loops, etc. I've grown into using some of its advanced features like integration with version control and test automation, but you can ignore all of those kinds of features when starting and just use it to edit scripts. The Community Edition is free.
|
# ¿ Jul 18, 2014 13:19 |
|
|
thegasman2000 posted:Can i access that from pycharm?
|
# ¿ Aug 16, 2014 15:28 |