|
John DiFool posted: In 2006 I made ~30k on a 6-month *internship*. So ~60k a year. ~85k in 2022 dollars. You are absolutely shafting your interns.

Here's an idea, if you don't want to do the entry-level Python job for purportedly less money than the internship you had, then don't! And let people make their own decisions
|
# ? Oct 30, 2022 00:57 |
|
I think it's pretty reasonable to call out that something might be significantly below-market - often the sort of new-to-the-industry person who's the focus of that ad doesn't have a good idea of what typical compensation looks like. If it is actually good compensation for the sort of applicant being targeted, then it's not like it's hurting anybody if that applicant has a look around and realises what typical market rates are first.
|
# ? Oct 30, 2022 01:17 |
|
Macichne Leainig posted: Here's an idea, if you don't want to do the entry-level Python job for purportedly less money than the internship you had, then don't!

Nah, I’m going to call out people who are paying way below market. Rather help out those starting out who may not know better. Employers who talk about high expectations while offering low wages are almost universally exploiting workers. I didn’t live in a particularly high COL area either. Hopefully people considering this position will see this exchange and wonder why they’d want to work for someone that severely undervalues their time and labor.

Macichne Leainig posted: And let people make their own decisions

Maybe you should let me speak so they have more data to make their own decisions instead of being a passive aggressive prick.
|
# ? Oct 30, 2022 01:32 |
|
I have Pave.com benchmarking, have hired multiple interns and cited paying above market for a specific well known employer but otoh you did have one job more than a decade ago so maybe I’ll just ignore all that real data I have and listen to some goon.
|
# ? Oct 30, 2022 01:41 |
|
CarForumPoster posted: I have Pave.com benchmarking, have hired multiple interns and cited paying above market for a specific well known employer but otoh you did have one job more than a decade ago so maybe I’ll just ignore all that real data I have and listen to some goon.

Congrats on the cheap labor my dude.
|
# ? Oct 30, 2022 01:47 |
|
It’s funny how quickly hiring managers get in a huff if you dare suggest they might be paying poo poo wages.
|
# ? Oct 30, 2022 01:48 |
|
Yea it's weird how people acting ethically and transparently get in a huff when a belligerent moron accuses them of unethical behavior.
|
# ? Oct 30, 2022 04:57 |
|
Is there a 'fight about how capitalism sucks for everyone' thread y'all could go fight in instead?
|
# ? Oct 30, 2022 05:15 |
Falcon2001 posted: Is there a 'fight about how capitalism sucks for everyone' thread y'all could go fight in instead?

Yeah but there’s a global market interpretation lock
|
|
# ? Oct 30, 2022 05:16 |
|
Macichne Leainig posted: Here's an idea, if you don't want to do the entry-level Python job for purportedly less money than the internship you had, then don't!

Is posting in the thread somehow not letting people make their own decisions? IMO if it's okay to post a job posting then it's also ok for people to remark on it in whatever way they want.
|
# ? Oct 30, 2022 07:47 |
|
I have a large amount of blood glucose time series data from here: https://public.jaeb.org/datasets/diabetes. Each row of the time series data has at least 4 columns - an identifier of which trial the data belongs to, an identifier of which subject from the trial the data belongs to, the date and time of the blood glucose, and the actual value of the blood glucose, and conservatively there are over 20 million rows. Given a list of subjects and a date time for each subject, I would like to be able to quickly obtain the next 2 hours of blood glucose data for each of those subjects, and I will be performing this query repeatedly. I have no real experience with databases, but given the size of the data it seems like it would be best to store this data in a database and then run queries to extract the data I want. Is a Sqlite database with a single table the best way to store this data given the queries I want to run? The single table would have four columns (trial ID, subject ID, date time, and blood glucose value).
|
# ? Oct 30, 2022 19:34 |
|
Jose Cuervo posted: I have a large amount of blood glucose time series data from here: https://public.jaeb.org/datasets/diabetes. Each row of the time series data has at least 4 columns - an identifier of which trial the data belongs to, an identifier of which subject from the trial the data belongs to, the date and time of the blood glucose, and the actual value of the blood glucose, and conservatively there are over 20 million rows.

SQLite with a multi-column index on subject id and timestamp should work well. If you think you'll need to query by timestamp only but not subject only, reverse that order. https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys
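A minimal sketch of that index and the two-hour lookup, using Python's built-in sqlite3 module. The column names, the index name, the helper `next_two_hours`, and most of the sample rows are made up for illustration, and it assumes timestamps are stored as 'YYYY-MM-DD HH:MM:SS' text so that string order matches time order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database just for the demo
conn.execute(
    """
    CREATE TABLE blood_glucose (
        trial_id    TEXT,
        subject_id  TEXT,
        measured_at TEXT,  -- 'YYYY-MM-DD HH:MM:SS' sorts chronologically as text
        glucose     REAL
    )
    """
)
# Multi-column index: the equality column first, the range column last.
conn.execute(
    "CREATE INDEX idx_subject_time ON blood_glucose (subject_id, measured_at)"
)

# Hypothetical sample rows.
rows = [
    ("DCLP3", "DCLP3-001-001", "2017-12-02 00:05:33", 131),
    ("DCLP3", "DCLP3-001-001", "2017-12-02 01:05:33", 140),
    ("DCLP3", "DCLP3-001-001", "2017-12-02 03:05:33", 150),  # outside the window
    ("DCLP3", "DCLP3-001-002", "2017-12-02 00:30:00", 99),   # different subject
]
conn.executemany("INSERT INTO blood_glucose VALUES (?, ?, ?, ?)", rows)


def next_two_hours(conn, subject_id, start):
    """Return (measured_at, glucose) rows in [start, start + 2 hours)."""
    return conn.execute(
        """
        SELECT measured_at, glucose
        FROM blood_glucose
        WHERE subject_id = ?
          AND measured_at >= ?
          AND measured_at < datetime(?, '+2 hours')
        ORDER BY measured_at
        """,
        (subject_id, start, start),
    ).fetchall()


window = next_two_hours(conn, "DCLP3-001-001", "2017-12-02 00:00:00")
```

If subject IDs repeat across trials, the same idea should extend to an index on (trial_id, subject_id, measured_at) with an extra `trial_id = ?` condition.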
|
# ? Oct 30, 2022 19:52 |
|
Jose Cuervo posted: Is a Sqlite database with a single table the best way to store this data given the queries I want to run? The single table would have four columns (trial ID, subject ID, date time, and blood glucose value).

Yes that should be fine. You should be able to pick up the syntax of queries pretty easily
|
# ? Oct 30, 2022 20:37 |
|
Twerk from Home posted: SQLite with a multi-column index on subject id and timestamp should work well. If you think you'll need to query by timestamp only but not subject only, reverse that order.

The subject ID is not unique across different trials, so I would want a multi-column index with trial ID, subject ID, and timestamp, right? Or would it be more efficient to create a new column (say trial_subject) which concatenates trial ID and subject ID which would be a unique identifier for each subject's data, and then use a multi-column index with trial_subject and timestamp?

I will never need to query by timestamp only because even subjects in the same trial participated over different weeks / months, depending on when they enrolled, so I will always be specifying the trial, the subject, and the timestamp.
|
# ? Oct 30, 2022 21:37 |
|
Don't make a new column that's duplicating information from other columns. Having the primary key be (trial ID, subject ID, draw time timestamp) is fine if those are always unique. If those can be duplicated (i.e. data entry error, or two measurements from the same tube of blood), just have a surrogate primary key (autocreated meaningless ascending integer). Throw some indexes at it later if it actually is slow, but I'd be surprised if it matters.

You shouldn't overthink this; it's not a particularly large database and it's going to be bottlenecked by Python slowness, not the database. Even for a poorly designed schema, most of the time is going to be spent creating Python object versions of the results, not looking up the results themselves.

Making up pessimistic numbers:
- 64 bytes text trial ID
- 64 bytes text subject ID
- 64 bytes timestamp because SQLite does datatypes bizarrely and will be storing text
- 8 bytes of floating point measurement
= 200 bytes per record * 20 million rows => ~4GB of database. That isn't particularly big.
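As a rough illustration of the composite primary key idea, here is a small sketch with Python's sqlite3 module. The column names and the second trial name are hypothetical; the point is only that the three columns together identify a row, so an exact duplicate is rejected while the same subject ID under another trial is accepted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE blood_glucose (
        trial_id    TEXT NOT NULL,
        subject_id  TEXT NOT NULL,
        measured_at TEXT NOT NULL,
        glucose     REAL NOT NULL,
        PRIMARY KEY (trial_id, subject_id, measured_at)
    )
    """
)

row = ("DCLP3", "DCLP3-001-001", "2017-12-02 00:05:33", 131)
conn.execute("INSERT INTO blood_glucose VALUES (?, ?, ?, ?)", row)

# Inserting the exact same key again violates the primary key.
try:
    conn.execute("INSERT INTO blood_glucose VALUES (?, ?, ?, ?)", row)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Same subject ID in a different (made-up) trial is a different key.
conn.execute(
    "INSERT INTO blood_glucose VALUES (?, ?, ?, ?)",
    ("OTHER_TRIAL", "DCLP3-001-001", "2017-12-02 00:05:33", 120),
)
row_count = conn.execute("SELECT COUNT(*) FROM blood_glucose").fetchone()[0]
```

If duplicates are expected in the real data, the surrogate-key route would instead be a single `id INTEGER PRIMARY KEY` column, with the three columns left as ordinary data.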
|
# ? Oct 30, 2022 22:30 |
|
Foxfire_ posted: Don't make a new column that's duplicating information from other columns. Having the primary key be (trial ID, subject ID, draw time timestamp) is fine if those are always unique. If those can be duplicated (i.e. data entry error, or two measurements from the same tube of blood), just have a surrogate primary key (autocreated meaningless ascending integer). Throw some indexes at it later if it actually is slow, but I'd be surprised if it matters.

So I am running into my first problems. I create my database and table as follows:
Python code:
Python code:
code:
Jose Cuervo fucked around with this message at 02:08 on Oct 31, 2022 |
# ? Oct 31, 2022 02:04 |
|
Don't form your own string of insert values, use placeholder binding instead: https://docs.python.org/3/library/sqlite3.html#sqlite3-placeholders
|
# ? Oct 31, 2022 02:20 |
|
Because DCLP3 isn't quoted, it's being interpreted as a variable, not a string. Ideally you want your database api to handle string escaping for you because 1) it's a huge headache and 2) the whole sql injection thing once you need to worry about malicious data, etc. Should look more like this (can't remember if sqlite uses %s or just % as the placeholder): code:
but... if you've already got the data in memory as a dataframe, are you sure you want/need to use a database at all, instead of working directly on the dataframe?
|
# ? Oct 31, 2022 02:21 |
|
Jose Cuervo posted:
You're handing "INSERT INTO blood_glucose VALUES (DCLP3, DCLP3-001-001, 2017-12-02 00:05:33, 131)" with all the values substituted into a string to the database engine to execute. That's problematic because it has to parse the individual values back out, and its parsing expects to see strings in quotes, not bare. For this insert, it'd want: code:
Instead of handing a mixed data+instructions query to the database engine, you want to be telling it "Execute this purely-instructions query (that is coming from a string constant) using these purely-data parameter bindings."

To do that with Python's sqlite interface, you use either "?" for positional parameter values, or ":somename" for a named one, then provide the bindings when you run the query. QuarkJets is pointing you at the documentation for it. The query would look something like this:
Python code:
Python code:
Other stuff:
- If you don't ever want to have null/blank values, add "NOT NULL" to the columns when you create the table. It will yell at you if you try to do an insert or edit that does that.
- If your sqlite installation is new enough, adding STRICT to the table creation options will turn on strict mode and disable a bunch of JavaScript-esque "when I ask for something that doesn't make sense, apply type conversions instead of reporting an error" behavior.
- If it were me and this was something that I might keep around after a couple month gaps, I'd stick the units for the glucose measurement in the column name.
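The placeholder binding and NOT NULL advice above can be sketched roughly as follows with the standard sqlite3 module. The column names are hypothetical, mg/dL is an assumed unit used only to show the units-in-the-name idea, and every row after the first is invented sample data. STRICT is left out so the sketch runs on older SQLite builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE blood_glucose (
        trial_id      TEXT NOT NULL,
        subject_id    TEXT NOT NULL,
        measured_at   TEXT NOT NULL,
        glucose_mg_dl REAL NOT NULL  -- unit in the name; mg/dL is an assumption
    )
    """
)

# Positional "?" placeholders: the values travel separately from the SQL text,
# so no manual quoting or escaping is needed.
conn.execute(
    "INSERT INTO blood_glucose VALUES (?, ?, ?, ?)",
    ("DCLP3", "DCLP3-001-001", "2017-12-02 00:05:33", 131),
)

# Named ":name" placeholders take a dict instead of a tuple.
conn.execute(
    "INSERT INTO blood_glucose VALUES (:trial, :subject, :ts, :bg)",
    {
        "trial": "DCLP3",
        "subject": "DCLP3-001-001",
        "ts": "2017-12-02 00:10:33",
        "bg": 128,
    },
)

# executemany() binds each tuple in turn, which suits bulk loads.
more_rows = [
    ("DCLP3", "DCLP3-001-001", "2017-12-02 00:15:33", 126),
    ("DCLP3", "DCLP3-001-001", "2017-12-02 00:20:33", 125),
]
conn.executemany("INSERT INTO blood_glucose VALUES (?, ?, ?, ?)", more_rows)
conn.commit()

# NOT NULL turns a missing value into an error instead of a silent blank.
try:
    conn.execute(
        "INSERT INTO blood_glucose VALUES (?, ?, ?, ?)",
        ("DCLP3", "DCLP3-001-001", "2017-12-02 00:25:33", None),
    )
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

total = conn.execute("SELECT COUNT(*) FROM blood_glucose").fetchone()[0]
```

When the rows come from a DataFrame, something that yields plain tuples per row can be passed straight to `executemany()`.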
|
# ? Oct 31, 2022 02:54 |
|
Mods please rename thread "The Plaza of Python" tia
|
# ? Oct 31, 2022 06:26 |
|
QuarkJets posted: Is posting in the thread somehow not letting people make their own decisions?

Yeah sure but dumb quips about how much money you make/made are useless. I could say I make $500k and even if it was true it adds just as little as the rest of my posts because nobody cares.

Macichne Leainig fucked around with this message at 21:00 on Oct 31, 2022 |
# ? Oct 31, 2022 20:58 |
|
Macichne Leainig posted: Yeah sure but dumb quips about how much money you make/made are useless

That's an uncharitable reading of their posts; they were providing those numbers (which they probably didn't just make up) to support their opinion that the amount being offered in an earlier post is too low. That's informative; OP was offering data that may be useful to anyone interested in the offered position. And it provoked a discussion over what a fair internship salary should be, which was interesting
|
# ? Oct 31, 2022 21:45 |
|
QuarkJets posted: Don't form your own string of insert values, use placeholder binding instead:

Thanks. I am going to read through the documentation more carefully which is what I should have done in the first place.

Zoracle Zed posted: Because DCLP3 isn't quoted, it's being interpreted as a variable, not a string. Ideally you want your database api to handle string escaping for you because 1) it's a huge headache and 2) the whole sql injection thing once you need to worry about malicious data, etc. Should look more like this (can't remember if sqlite uses %s or just % as the placeholder):

I have a single one of the trials in a DataFrame, but there are multiple trials and I estimate there will be about 20 million rows or more once everything is read in, and I figured this would be a good time to use a database.

Foxfire_ posted: You're handing "INSERT INTO blood_glucose VALUES (DCLP3, DCLP3-001-001, 2017-12-02 00:05:33, 131)" with all the values substituted into a string to the database engine to execute. That's problematic because it has to parse the individual values back out, and its parsing expects to see strings in quotes, not bare. For this insert, it'd want:

This made things work just right. I am going to read through the documentation some more, but things seem to be working just fine now and I seem to be able to query with no issues.
|
# ? Nov 1, 2022 02:07 |
|
Ok, got an example I have to do for class, and since the guys in the Java thread have been helpful, I'm here to ask you for Python stuff. What is a good primer for how to use APIs? The assignment for this week is to use the API for DuckDuckGo to search "presidents of the united states." In the "related Topics" field of the JSON response, all the presidents should be listed. I know how to compare lists, but how do I properly use an API? All I got given by my teacher was a link to a Linkedin video.
|
# ? Nov 2, 2022 01:08 |
|
samcarsten posted: Ok, got an example I have to do for class, and since the guys in the Java thread have been helpful, I'm here to ask you for Python stuff. What is a good primer for how to use APIs? The assignment for this week is to use the API for DuckDuckGo to search "presidents of the united states." In the "related Topics" field of the JSON response, all the presidents should be listed. I know how to compare lists, but how do I properly use an API? All I got given by my teacher was a link to a Linkedin video.

if you type
code:
|
# ? Nov 2, 2022 01:23 |
|
In terms of libraries/etc to look into I think that Requests is probably the most straightforward one to use and the most popular. https://requests.readthedocs.io/en/latest/
|
# ? Nov 2, 2022 02:12 |
|
Falcon2001 posted: In terms of libraries/etc to look into I think that Requests is probably the most straightforward one to use and the most popular.

Yeah iirc with requests it's stupidly easy to get a dict if you're expecting to receive a json object, like:
Python code:
You may have to set a user-agent too, you can google how to do this
|
# ? Nov 2, 2022 04:32 |
Is there any way to "alias" the @property decorator to something else? I have a Django model which has a field called "property" already (i.e. referring to a real estate property) and it's making it impossible for me to make @property methods
|
|
# ? Nov 2, 2022 11:58 |
|
Can you just do _property = property somewhere in the file before the property field is defined?
|
# ? Nov 2, 2022 12:06 |
I'll be damned, I can. Thanks!
|
|
# ? Nov 2, 2022 12:19 |
|
Data Graham posted: Is there any way to "alias" the @property decorator to something else?

Disclaimer: I feel like a novice ITT and rarely use classes or make Django apps anymore. I’ve never made a decorator though I think I know how they work from using them frequently.

Having a variable named property is like having a variable named list, yea? It’s gonna overwrite the property() built-in that the decorator is calling, no?

I’d think you COULD alias it by recreating the built-in that the decorator @property is calling, but idk if you should. If it’s an already deployed thing where there WILL be consequences from changing the model’s redefinition of property, then perhaps a property_decorator = property (no parentheses, so it’s the built-in itself rather than an empty property object) that you import whenever you need the @property (now @property_decorator) would solve this? Again I’ve never done this in practice.

Simple solution is to not use variable names that are built-ins.
|
# ? Nov 2, 2022 12:22 |
Yeah it's an already-deployed thing. I'm trying to clean it up and adding @propertys so it isnt().all().a().chain().of().things().like().this() is part of the effort
|
|
# ? Nov 2, 2022 12:27 |
|
I have some DI pattern questions, so general compsci stuff. This is for a project I'm moving away from (new job starts on Monday) but I'm curious about what the downside of the method I worked out is. When I took over this project, we had a ton of module-level 'Global' variables for dependencies. This obviously was a nightmare for testing, but fixing it was difficult because I was working at the end of the 'chain' so to speak, so setting up full proper DI from the program start would involve a lot of refactoring. I was generally trying to move to factory patterns instead of overloaded __init__ anyway, and struck upon this setup for handling this. Python code:
Is this just a matter of 'when reading the code, you won't easily be able to see that a downstream function of this function uses this external dependency?' like an implicit vs explicit problem? Or are there other downsides of this pattern I'm not seeing due to lack of experience?
|
# ? Nov 2, 2022 19:10 |
|
Okay, so looking at the requests thing, it seems like it only takes dictionaries as queries. I'm only supposed to search with the string "presidents of the united states". How am I supposed to do that? I don't really understand the syntax.
|
# ? Nov 2, 2022 19:52 |
samcarsten posted: Okay, so looking at the requests thing, it seems like it only takes dictionaries as queries. I'm only supposed to search with the string "presidents of the united states". How am I supposed to do that? I don't really understand the syntax.

From this:

saintonan posted: if you type

See the q=presidents%20of%20the%20united%20states part? That's a key-value pair. The key is q and the value is presidents%20of%20the%20united%20states. Dictionaries are for mapping keys to values. So make a dict like this:
Python code:
(Requests will also handle URL-encoding the spaces for you, so you can write the value with plain spaces.)
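To make the dict-to-query-string mapping concrete without touching the network, here is a sketch using only the standard library rather than Requests. The endpoint `https://api.duckduckgo.com/`, the `format=json` key, and the User-Agent string are assumptions for illustration; only the `q` key and its value come from the thread.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

# Each key-value pair in the dict becomes one key=value pair in the URL.
params = {"q": "presidents of the united states", "format": "json"}

# quote_via=quote encodes spaces as %20 (the default would use "+";
# both forms are valid inside a query string).
query_string = urlencode(params, quote_via=quote)
url = "https://api.duckduckgo.com/?" + query_string

# Build (but do not send) a request carrying an explicit User-Agent header.
request = Request(url, headers={"User-Agent": "python-class-assignment/0.1"})
```

With Requests itself, the same dict would be passed as `requests.get(base_url, params=params, headers=...)`, and the library builds the query string for you.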
|
|
# ? Nov 2, 2022 20:12 |
|
ok, so my current code looks like this:
code:
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I've googled it but none of the answers seem to help.
|
# ? Nov 2, 2022 20:18 |
Those other keys in the query string I mentioned will be important.
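For context on that error message: the JSON decoder raises it when the very first character of the body is not valid JSON, which is what an HTML page or an empty response looks like to it. A minimal sketch with hypothetical bodies (the helper `describe_body` and the sample strings are made up; whether the real server returns HTML without the extra keys is an assumption):

```python
import json


def describe_body(body):
    """Try to decode a response body and report what happened."""
    try:
        return ("ok", json.loads(body))
    except json.JSONDecodeError as exc:
        return ("not json", str(exc))


# Hypothetical response bodies, not captured from the real API.
html_result = describe_body("<!DOCTYPE html><html>...</html>")
empty_result = describe_body("")
json_result = describe_body('{"RelatedTopics": []}')
```

Printing the raw response text before decoding it is usually the quickest way to see which of these cases you are in.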
|
|
# ? Nov 2, 2022 20:28 |
|
Worth noting that Python has a pretty good interactive experience, right out of the box. If I run your code:
Python code:
12 rats tied together fucked around with this message at 20:53 on Nov 2, 2022 |
# ? Nov 2, 2022 20:50 |
|
samcarsten posted:I've googled it but none of the answers seem to help. 12 rats are correct above, and I will add that you should try to become comfortable with the following as well: code:
code:
|
# ? Nov 2, 2022 20:58 |
|
I recommend IPython for a somewhat nicer command-line experience.
|
# ? Nov 3, 2022 00:31 |