Zoracle Zed
Jul 10, 2001
are you responsible for anonymizing PII? stay the hell away imo


huhu
Feb 24, 2006

Jabor posted:

One possibility is to pick a random direction and distance, instead of doing X and Y separately.

Another possibility is to draw Bezier curves with randomly generated control points.

You could bias your diffs in some direction - so that it looks generally random at the small scale, but slowly tracks across the plot when you look at it overall.

--

Really there are lots of fun things you can do with this.

How do I begin to think like this is really my question. Should I go find a book about math and art or something…
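One way in: Jabor's first and third suggestions are only a few lines of code. A sketch of a walk built from random heading/distance steps plus a small constant drift (step size and drift values are arbitrary, and the actual plotting is left out):

```python
# Random-heading steps with a constant drift: locally random, but the
# walk slowly tracks across the plot overall. Values here are arbitrary.
import math
import random

def biased_walk(n_steps, step=5.0, drift=(0.3, 0.0), seed=None):
    """Return a list of (x, y) points: random headings plus a steady drift."""
    rng = random.Random(seed)
    x = y = 0.0
    points = [(x, y)]
    for _ in range(n_steps):
        heading = rng.uniform(0, 2 * math.pi)     # pick a random direction...
        x += step * math.cos(heading) + drift[0]  # ...and bias the diff slightly
        y += step * math.sin(heading) + drift[1]
        points.append((x, y))
    return points
```

Feed the points to whatever plotting you're already using; cranking `drift` up or down changes how obviously the walk travels.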

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

Qtamo posted:

Knew I forgot something :doh: The original db tables stay in our internal environment, but the transformed data is a part of a larger dataset that gets sent to another party, and we can't guarantee that the data is safe there (well, can't really guarantee it in our own environment either but you get my point). Of course there's contract stipulations etc., but we'd rather keep things as safe as reasonably possible, since this other party doesn't need the original identifiers (but needs to know which have identical identifiers).

One really obvious place to start is to just have a database table internally that maps the true identifier to the opaque external identifier and back. Nobody without access to that table needs to do the conversion, so there's no need to use a hashing scheme or anything that would allow somebody else to do the transformation in one direction.

The big wrinkle is whether you're sure that providing this third party with a stable user identifier actually meets your requirements of not leaking the underlying user list. This is a tricky problem!

For example, if your data includes a location track of each user, knowing where a particular user is at 3am every night is basically all you need to know to figure out exactly who they are. This is a pretty obvious example but there are a whole lot of more subtle things that can break privacy and there's a ton of literature on reidentification that you should probably be aware of if you're designing such a scheme.
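A minimal sketch of that mapping table, assuming SQLite and invented table/column names. The external token is random rather than derived, so someone holding only the exported data has no transformation to run in either direction:

```python
# Internal table mapping true identifiers to random opaque tokens.
# Only code with access to this table can convert in either direction.
import secrets
import sqlite3

def get_external_id(conn, true_id):
    """Return the opaque token for true_id, minting one on first sight."""
    row = conn.execute(
        "SELECT external_id FROM id_map WHERE true_id = ?", (true_id,)
    ).fetchone()
    if row:
        return row[0]
    token = secrets.token_hex(16)  # random: nothing derivable from true_id
    conn.execute(
        "INSERT INTO id_map (true_id, external_id) VALUES (?, ?)",
        (true_id, token),
    )
    return token

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE id_map (true_id TEXT PRIMARY KEY, external_id TEXT UNIQUE)"
)
```

Identical true identifiers map to the same token (so the other party can still group rows), and the table itself never leaves your environment.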

Qtamo
Oct 7, 2012

Zoracle Zed posted:

are you responsible for anonymizing PII? stay the hell away imo

Thankfully this isn't about PII, though I do deal with it in my main, GIS-related work. With PII the correct solution is always "get the security/privacy people in the org to accept the solution in writing", which I'll do here as well but don't want to propose an improvement that's lovely.


Jabor posted:

One really obvious place to start is to just have a database table internally that maps the true identifier to the opaque external identifier and back. Nobody without access to that table needs to do the conversion, so there's no need to use a hashing scheme or anything that would allow somebody else to do the transformation in one direction.

The big wrinkle is whether you're sure that providing this third party with a stable user identifier actually meets your requirements of not leaking the underlying user list. This is a tricky problem!

For example, if your data includes a location track of each user, knowing where a particular user is at 3am every night is basically all you need to know to figure out exactly who they are. This is a pretty obvious example but there are a whole lot of more subtle things that can break privacy and there's a ton of literature on reidentification that you should probably be aware of if you're designing such a scheme.

The internal mapping table wasn't actually obvious to me before, guess I got too carried away by the current implementation! Thanks, I'll probably start working towards that solution. I also realized I don't even know if the reversibility is a hard requirement or just some sort of nebulous "we might need it in the future" kind of a thing. If it isn't, I'll try to get that part dropped entirely, which would make things way simpler.

This isn't PII or anything related (I realize my vague description and poorly thought-out Alice/Bob example probably made it seem like it is), and the other data that's included doesn't have things that could be used to identify single rows, so reidentification shouldn't be a problem here. If this was PII, I probably wouldn't even touch it if the reversibility of the identifier was a hard requirement; it'd seem like a breach and/or lawsuit waiting to happen.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Even if individual rows are suitably anonymous, since your whole plan is to have a single key linking many rows from the same user, it's possible that the aggregate is not actually anonymous. A good case study would be the 2006 AOL search logs release.

Qtamo
Oct 7, 2012

Jabor posted:

Even if individual rows are suitably anonymous, since your whole plan is to have a single key linking many rows from the same user, it's possible that the aggregate is not actually anonymous. A good case study would be the 2006 AOL search logs release.

This isn't user data. A vague-enough way of describing it would probably be that the non-unique identifier is a category of sorts, and we'd prefer the names of the categories to be unknown to others. Every single row bears no relation to the other rows aside from this one datapoint. Well, since they're categories of sorts there's obviously a relation, but that isn't included in the table. But it's true that reidentification is a massive pita whenever dealing with something that's even vaguely related to individual people or users. I learned this the hard way in my main GIS work, thankfully as an internal whoopsie instead of a lawsuit or anything. Since then I've started considering absolutely anything as PII unless I can be absolutely sure that there's no way to relate it to a person using any other data source. Turns out, in the GIS space pretty much anything is PII and computers were a massive mistake :suicide:

Jose Cuervo
Aug 25, 2004
I have been watching Corey Schafer's Python Flask tutorial on youtube and have made it past part 5:
https://www.youtube.com/watch?v=44PvX0Yv368
where he restructures the app to avoid issues with circular dependencies etc, and past part 7:
https://www.youtube.com/watch?v=803Ei2Sq-Zs
where he creates the user account etc.

The tutorial is somewhat old and uses the old style of SQLAlchemy queries, and I was trying to update the way in which the models were defined to be in the declarative mapping style.
This is the version which worked:
Python code:
File: __init__.py

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from flask_login import LoginManager

app = Flask(__name__)
app.config['SECRET_KEY'] = 'mysecretkeygoeshere'
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///tutorial_database.db'

db = SQLAlchemy(model_class=Base)

login_manager = LoginManager(app)
Python code:
File: models.py

from tutorial import db, login_manager
from flask_login import UserMixin

class User(db.Model, UserMixin):
    id = db.Column(db.Integer, primary_key=True)
    email = db.Column(db.String(120), unique=True, nullable=False)
    password = db.Column(db.String(60), nullable=False)
This code worked fine (it follows along with what Corey has in his tutorial).

After reading the SQLAlchemy documentation and the flask-sqlalchemy documentation (https://flask-sqlalchemy.palletsprojects.com/en/3.1.x/models/), I updated the files to the following:
Python code:
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from flask_login import LoginManager
from sqlalchemy import MetaData
from sqlalchemy.orm import DeclarativeBase


class Base(DeclarativeBase):
    metadata = MetaData(naming_convention={
        "ix": 'ix_%(column_0_label)s',
        "uq": "uq_%(table_name)s_%(column_0_name)s",
        "ck": "ck_%(table_name)s_%(constraint_name)s",
        "fk": "fk_%(table_name)s_%(column_0_name)s_%(referred_table_name)s",
        "pk": "pk_%(table_name)s"
    })


app = Flask(__name__)
app.config['SECRET_KEY'] = 'mysecretkeygoeshere'
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///tutorial_database.db'

db = SQLAlchemy(app=app, model_class=Base)

login_manager = LoginManager(app)
Python code:
File: models.py

from tutorial import db, login_manager
from flask_login import UserMixin
from sqlalchemy import Integer, String, Float, Boolean, DateTime, ForeignKey
from sqlalchemy.orm import Mapped, mapped_column, relationship


class User(db.Model, UserMixin):
    __tablename__ = 'users'

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    email: Mapped[str] = mapped_column(String(120), unique=True, nullable=False)
    password: Mapped[str] = mapped_column(String(60), nullable=False)
however, now when I attempt to run the website I get the following error message:
"sqlalchemy.exc.ArgumentError: Class '<class 'tutorial.models.User'>' already has a primary mapper defined."

I have tried looking this up but cannot find any results which point me towards what I am doing wrong.

boofhead
Feb 18, 2021

I'm not a flask/sqlalchemy person but is it possible that there already exists a table mapped to the User class (created during previous steps) called "user", and now in the latest step you're trying to tell it that the User class actually maps to an overriding table called "users" ? It looks like you added this in after already creating the table from the class:

code:
__tablename__ = 'users'
and I think flask-sqlalchemy might automatically create table names from class names by just snakecasing it, rather than automatically adding plurals. Because the error you've posted seems to suggest that there's a conflict where the 'User' class is already mapped to a table somewhere, and it's not the table specified by __tablename__

try removing the above line from the code and see if that works

Generic Monk
Oct 31, 2011

boofhead posted:

I'm not a flask/sqlalchemy person but is it possible that there already exists a table mapped to the User class (created during previous steps) called "user", and now in the latest step you're trying to tell it that the User class actually maps to an overriding table called "users" ? It looks like you added this in after already creating the table from the class:

code:
__tablename__ = 'users'
and I think flask-sqlalchemy might automatically create table names from class names by just snakecasing it, rather than automatically adding plurals. Because the error you've posted seems to suggest that there's a conflict where the 'User' class is already mapped to a table somewhere, and it's not the table specified by __tablename__

try removing the above line from the code and see if that works

I want to say that they're providing the table name like that because 'user' is a reserved word in postgres. Better solution than mine when following that tutorial which was to call it 'AppUser' lmao

Agree on starting the database afresh and doing a db.create_all() to just eliminate database fuckery as a cause tho

Jose Cuervo
Aug 25, 2004

boofhead posted:

I'm not a flask/sqlalchemy person but is it possible that there already exists a table mapped to the User class (created during previous steps) called "user", and now in the latest step you're trying to tell it that the User class actually maps to an overriding table called "users" ? It looks like you added this in after already creating the table from the class:

code:
__tablename__ = 'users'
and I think flask-sqlalchemy might automatically create table names from class names by just snakecasing it, rather than automatically adding plurals. Because the error you've posted seems to suggest that there's a conflict where the 'User' class is already mapped to a table somewhere, and it's not the table specified by __tablename__

try removing the above line from the code and see if that works

I tried that but then it fails with the following error:
"sqlalchemy.exc.InvalidRequestError: Class <class 'tutorial.models.User'> does not have a __table__ or __tablename__ specified and does not inherit from an existing table-mapped class."

I also tried using the name 'user' but that did not work either (threw the same error message).

Generic Monk posted:

I want to say that they're providing the table name like that because 'user' is a reserved word in postgres. Better solution than mine when following that tutorial which was to call it 'AppUser' lmao

Agree on starting the database afresh and doing a db.create_all() just eliminate database fuckery as a cause tho
I named the table 'users' because that is what was suggested in Section 2.3 of the book 'SQLAlchemy 2 in Practice' by Miguel Grinberg where he states "A very common naming convention for database tables is to use the plural form of the entity in lowercase, so in this case the table [whose class name is Product] is given the products name. This contrasts with the convention used for the model class names, which prefers the singular form in camel case."

Both errors are being thrown when I try to run my populate_db.py script where I have the following code:
Python code:
File populate_db.py

from tutorial import app, db, bcrypt
from tutorial.models import User  # User is referenced below but wasn't imported

with app.app_context():
    db.drop_all()
    db.create_all()

    # Add Test Users to DB
    db.session.add(User(email='test@tutorial.com',
                        password=bcrypt.generate_password_hash('password').decode('utf-8')))
    db.session.commit()
Surprisingly the code works if I do not try to include the Base class, that is, I keep the original code for __init__.py:
Python code:
File: __init__.py

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from flask_login import LoginManager

app = Flask(__name__)
app.config['SECRET_KEY'] = 'mysecretkeygoeshere'
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///tutorial_database.db'

db = SQLAlchemy(app)   # <------ I made a mistake when copy pasting originally and this is what I should have had in the original code for __init__.py, not db = SQLAlchemy(model_class=Base)

login_manager = LoginManager(app)
but have the declarative style of model class definition in models.py:
Python code:
File: models.py

from tutorial import db, login_manager
from flask_login import UserMixin
from sqlalchemy import Integer, String, Float, Boolean, DateTime, ForeignKey
from sqlalchemy.orm import Mapped, mapped_column, relationship


class User(db.Model, UserMixin):
    __tablename__ = 'users'

    id: Mapped[int] = mapped_column(Integer, primary_key=True)
    email: Mapped[str] = mapped_column(String(120), unique=True, nullable=False)
    password: Mapped[str] = mapped_column(String(60), nullable=False)

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Qtamo posted:

Knew I forgot something :doh: The original db tables stay in our internal environment, but the transformed data is a part of a larger dataset that gets sent to another party, and we can't guarantee that the data is safe there (well, can't really guarantee it in our own environment either but you get my point). Of course there's contract stipulations etc., but we'd rather keep things as safe as reasonably possible, since this other party doesn't need the original identifiers (but needs to know which have identical identifiers).
If anonymization is a problem you actually have, then every developer and developer client you have is also a risk. You should probably consider Baffle or Immuta or something and just export your normal tokenized fields when sending data to third parties

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug
Anyone have any experience with one of the Python-only frontend frameworks like Anvil or JustPy? I have some internal tooling I'd like to build with not a lot of time, and all things being equal, it'd be nice to not have to learn JS and a modern frontend library. The alternative would just be a bunch of CLI stuff using Click/etc, which isn't awful (or Textual if I decide to get really fancy) but it'd be nice to be able to just link someone to a URL.

Like sure, at some point I'll come back to it but I'd like to be able to slap together some basic UIs in front of some automation/diagnostic stuff. My only real requirements would be something that lets you build the whole UX in Python (so not just Django/Flask), and ideally something with a decent default style setup so stuff looks reasonably 'built this century'.

Falcon2001 fucked around with this message at 08:11 on Nov 19, 2023

His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.
Well I wrote my first small python app. I almost broke my brain with OAuth2; the problem was Microsoft's backend, god I hate MS. In the end it was not possible to get IMAP/SMTP working so I had to use microsoft.graph and that worked.

Now I got a little app that searches my inbox, finds a particular mail, processes it as desired and sends it along. A little manual work gone.

CarForumPoster
Jun 26, 2013

⚡POWER⚡

His Divine Shadow posted:

Well I wrote my first small python app. I almost broke my brain with OAuth2; the problem was Microsoft's backend, god I hate MS. In the end it was not possible to get IMAP/SMTP working so I had to use microsoft.graph and that worked.

Now I got a little app that searches my inbox, finds a particular mail, processes it as desired and sends it along. A little manual work gone.

As someone who has hated themselves for this: if you have to rotate your app's credentials in Azure AD PUT THAT ON YOUR CALENDAR RIGHT NOW.

Son of Thunderbeast
Sep 21, 2002
Haha, oauth lockout, classic

onionradish
Jul 6, 2006

That's spicy.
Speaking of OAuth, is there an easy way or recommended module that will allow a Windows PC to respond to the callback to get a token after passing in the initial key to an arbitrary service?

I just want to authorize access for a simple personal script on my home PC. The OAuth hassle to set up a public server vs using a simple authentication key is pushing a bunch of "I could automate that" projects to "UGH; maybe some other day..."
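If the service allows registering a localhost redirect URI (some providers do, some don't), one low-effort option is a throwaway local HTTP server that catches the single redirect, so no public server is needed. A sketch with the port and path made up:

```python
# Throwaway local server to catch one OAuth redirect and grab ?code=...
# Register e.g. http://localhost:8080/callback with the provider first
# (assuming it permits localhost redirect URIs).
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs

class CallbackHandler(BaseHTTPRequestHandler):
    code = None

    def do_GET(self):
        # The provider redirects the browser to /callback?code=XYZ after login.
        params = parse_qs(urlparse(self.path).query)
        CallbackHandler.code = params.get("code", [None])[0]
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"You can close this tab now.")

    def log_message(self, *args):  # keep the console quiet
        pass

def wait_for_code(port=8080):
    """Block until one request arrives, then return its ?code= value."""
    with HTTPServer(("localhost", port), CallbackHandler) as server:
        server.handle_request()  # serve exactly one request, then stop
    return CallbackHandler.code
```

Open the provider's authorization URL in a browser, call `wait_for_code()`, and exchange the returned code for a token as usual.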

CarForumPoster
Jun 26, 2013

⚡POWER⚡

onionradish posted:

Speaking of OAuth, is there an easy way or recommended module that will allow a Windows PC to respond to the callback to get a token after passing in the initial key to an arbitrary service?

I just want to authorize access for a simple personal script on my home PC. The OAuth hassle to set up a public server vs using a simple authentication key is pushing a bunch of "I could automate that" projects to "UGH; maybe some other day..."

I'm not 100% sure I understand the question. If you need to refresh tokens on some interval on Windows, just schedule a task to run every n minutes. If this is a PC that is intermittently on such that tokens may expire in that time, you'll need to have some scheduled or always-on thing handle that, such as an AWS Lambda and/or a Zapier task.

Lambda + Zapier make easy work of this if you don't wanna gently caress with configuring a Lambda to be able to reach the public internet.

Or is the callback you're referring to in this case requesting a user browse to a web page and log in?

EDIT: I forgot that you prob have to register a URL to make this request, and I am assuming that they won't let you register localhost. Yea, try Zapier; they might have something for this use case. Otherwise an AWS Lambda set up with Zappa should give you an API Gateway that you can register, and it would cost nothing for infrequent use.

CarForumPoster fucked around with this message at 04:52 on Nov 23, 2023

His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.

CarForumPoster posted:

As someone who has hated themselves for this: if you have to rotate your app's credentials in Azure AD PUT THAT ON YOUR CALENDAR RIGHT NOW.

Yeah the moment I started dealing with azure credentials I realized that this is gonna be a recurring PITA.

Pyromancer
Apr 29, 2011

This man must look upon the fire, smell of it, warm his hands by it, stare into its heart

onionradish posted:

Speaking of OAuth, is there an easy way or recommended module that will allow a Windows PC to respond to the callback to get a token after passing in the initial key to an arbitrary service?

I just want to authorize access for a simple personal script on my home PC. The OAuth hassle to set up a public server vs using a simple authentication key is pushing a bunch of "I could automate that" projects to "UGH; maybe some other day..."

Just use OAuth in a way where you don't need the callback - the client credentials flow to get a token from a client id and secret, or do what all the security recommendations tell you not to do and enable the implicit flow to get a token with username and password, without the authorization code callback step.
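A standard-library sketch of that first option. The token endpoint URL is illustrative and the field names follow the usual OAuth2 client-credentials token-request shape; check the specific provider's docs for scopes and endpoint paths:

```python
# Client-credentials flow: POST id + secret to the token endpoint,
# get an access token back. No browser, no callback.
import json
import urllib.parse
import urllib.request

def build_token_request(token_url, client_id, client_secret, scope=None):
    """Return (url, body_bytes) for a client_credentials token request."""
    fields = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }
    if scope:
        fields["scope"] = scope
    return token_url, urllib.parse.urlencode(fields).encode()

def fetch_token(token_url, client_id, client_secret, scope=None):
    url, body = build_token_request(token_url, client_id, client_secret, scope)
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        return json.load(resp)["access_token"]
```

Note the flow only works when the provider issues the script its own credentials (an "app" acting as itself); it doesn't cover acting on behalf of a user account.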

FISHMANPET
Mar 3, 2007

Sweet 'N Sour
Can't
Melt
Steel Beams
I'm confused because the callback is for your app's authentication, why does your app need to authenticate to itself to function?

FWIW, using Google's oauth provider allows setting localhost as the callback URL.

LightRailTycoon
Mar 24, 2017
I use client certificate auth for my Microsoft 365 apps.
If I remember correctly, they let you use arbitrarily long certificate expiration, so renewal is a problem for 10 years from now me.

onionradish
Jul 6, 2006

That's spicy.
Sorry for the ambiguity of my OAuth question earlier. Specifically, I'm wanting to access the Pocket API from my Windows desktop to pull all my "read or file this later" bookmarks I've dropped in there from my phone while traveling.

The part I'm scratching my head about is Step 2 and onwards of the Pocket authentication process -- getting a request token at the request_uri, since the calling app is a Python script on my home Windows PC and isn't a publicly-accessible address. That request_uri gets used in a couple of places downstream through the process.

One of the Pocket API packages I found on PyPI suggests some random person's website in Germany to send your callback to. Another one does its own redirect to an obfuscated/shortened goo.gl link. Running authorization through either of those sounds like a terrible idea that gives credentials for complete access to your account to some unknown entity.

I'll look at Zapier; I hadn't considered that as a possibility. If I'm over-complicating this or misunderstanding, I'd greatly appreciate advice. All of the other APIs I've worked with only need a simple "consumer key" passed through with the request, so the OAuth stuff is new and I've been putting off learning how to interact with it since it's just home hobby stuff.
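For what it's worth, Pocket's documented flow is three steps, and (if I'm reading their docs right) the redirect_uri is only ever opened in your own browser after you approve access, so it doesn't need to be publicly reachable; a localhost or even unserved URL should do for a desktop script. A rough sketch, endpoints per Pocket's v3 API docs, no error handling:

```python
# Pocket's three-step OAuth for a desktop script. The redirect_uri below
# is never fetched by Pocket's servers, only opened in your own browser.
import json
import urllib.request
import webbrowser

API = "https://getpocket.com/v3"

def pocket_post(path, payload):
    req = urllib.request.Request(
        API + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json",
                 "X-Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def authorize(consumer_key, redirect_uri="http://localhost:8080/done"):
    # Step 1: obtain a request token.
    code = pocket_post("/oauth/request",
                       {"consumer_key": consumer_key,
                        "redirect_uri": redirect_uri})["code"]
    # Step 2: send the user's browser to approve it.
    webbrowser.open(f"https://getpocket.com/auth/authorize"
                    f"?request_token={code}&redirect_uri={redirect_uri}")
    input("Approve access in the browser, then press Enter... ")
    # Step 3: swap the approved request token for an access token.
    return pocket_post("/oauth/authorize",
                       {"consumer_key": consumer_key,
                        "code": code})["access_token"]
```

The `input()` pause stands in for actually serving the redirect; combine with a one-shot local server if you want it fully hands-off.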

CarForumPoster
Jun 26, 2013

⚡POWER⚡
It looks like zapier might already have a pocket integration in which case you don’t need to worry about any of this.

https://zapier.com/apps/pocket/integrations

Send your stuff wherever you want like a Google sheet, MS Teams, or use webhooks to make your own api.

rowkey bilbao
Jul 24, 2023
re: Pocket, I'd like to know if there's a known/documented way to make a page Pocket friendly, as in I want to make my own web page, save it to pocket and access it with pocket's clean view, which is what it does for news sites for example.

I've tried hosting a boilerplate page with article tags and clean html on my own domain but pocket won't scrape it, and the app will just redirect me to a web view

StumblyWumbly
Sep 12, 2007

Batmanticore!
I'm writing a bunch of PyTest code to run some integration tests. We have a top level test dispatcher that sends out some JSON parameters to a system that runs the code, then we use PyTest to check that everything ran correctly. We need the PyTest code to have access to the parameters, so the parameters go into the execution environment, and in conftest.py we grab the test parameters and parameterize them to make it accessible to the individual tests.

This feels like an inelegant solution to a standard situation, so I wonder if there's a better solution. Specifically:
- Am I over using conftest.py? Is it really the best solution to manage input parameters or is there something I'm missing?
- As multiple people add tests, we have a bit of disagreement between philosophies of "Pass an object containing all parameters to each test, let the test grab what it wants" vs "Separate each test parameter to its own pytest parameter in conftest" Are there standards I should pay attention to here?
- Should I just not use PyTest at all? Our test case is more "Check that multiple files meet the parameters provided", not "check one thing against multiple parameters". I like PyTest in general but it definitely feels like an awkward fit here.

monochromagic
Jun 17, 2023

StumblyWumbly posted:

I'm writing a bunch of PyTest code to run some integration tests. We have a top level test dispatcher that sends out some JSON parameters to a system that runs the code, then we use PyTest to check that everything ran correctly. We need the PyTest code to have access to the parameters, so the parameters go into the execution environment, and in conftest.py we grab the test parameters and parameterize them to make it accessible to the individual tests.

This feels like an inelegant solution to a standard situation, so I wonder if there's a better solution. Specifically:
- Am I over using conftest.py? Is it really the best solution to manage input parameters or is there something I'm missing?
- As multiple people add tests, we have a bit of disagreement between philosophies of "Pass an object containing all parameters to each test, let the test grab what it wants" vs "Separate each test parameter to its own pytest parameter in conftest" Are there standards I should pay attention to here?
- Should I just not use PyTest at all? Our test case is more "Check that multiple files meet the parameters provided", not "check one thing against multiple parameters". I like PyTest in general but it definitely feels like an awkward fit here.

This does seem inelegant, and I'll freely admit I'm not quite sure what you're trying to solve here. I'm assuming that "parameters go into the execution environment" means something like passing them as environment variables. I'd use fixtures picking them up instead, rather than trying to use parametrize because that functionality is more for unit tests. With respect to separation, I prefer to separate/isolate tests as much as possible so I wouldn't pass an object around with all parameters in any case.

In general pytest is a super powerful testing suite, and probably has the best ergonomics for basically any language, and much of that power comes from its fixture concept. One thing I would add is to be careful when implementing fixtures as they can contain a lot of "magic behaviour" so isolation and good documentation is essential.
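A sketch of the fixture version, assuming the dispatcher really does export the JSON as an environment variable (the variable name here is invented). Tests just declare `test_parameters` as an argument and stay oblivious to where it came from:

```python
# conftest.py sketch: parse the dispatcher's JSON out of the environment
# once per session and expose it as a plain fixture.
import json
import os

import pytest

def load_params(env_var="INTEGRATION_TEST_PARAMS"):
    """Parse the dispatcher's JSON parameters out of the environment."""
    raw = os.environ.get(env_var)
    if raw is None:
        pytest.skip(f"{env_var} not set; not running under the dispatcher")
    return json.loads(raw)

@pytest.fixture(scope="session")
def test_parameters():
    return load_params()
```

The `skip` means the suite degrades gracefully when someone runs `pytest` locally without the dispatcher, rather than erroring on a missing variable.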

StumblyWumbly
Sep 12, 2007

Batmanticore!

monochromagic posted:

This does seem inelegant, and I'll freely admit I'm not quite sure what you're trying to solve here. I'm assuming that "parameters go into the execution environment" means something like passing them as environment variables. I'd use fixtures picking them up instead, rather than trying to use parametrize because that functionality is more for unit tests. With respect to separation, I prefer to separate/isolate tests as much as possible so I wouldn't pass an object around with all parameters in any case.

In general pytest is a super powerful testing suite, and probably has the best ergonomics for basically any language, and much of that power comes from its fixture concept. One thing I would add is to be careful when implementing fixtures as they can contain a lot of "magic behaviour" so isolation and good documentation is essential.

Right, the problem we're solving is testing out sensor hardware. The input parameters say, essentially, "take measurements on channels <A, B, C> at rate <X> for <Y> seconds, <Z> times". Other code runs those parameters on the sensors, this will give us Z files, we want to make sure channels A, B, C are present, with the specified rate and duration. In reality channels A, B, C will each have their own test criteria, like max and min values. We group all those channel specific values into the test parameter object, and in conftest.py we have code like:
Python code:
def pytest_generate_tests(metafunc):
    # Fixture names checked here must match the ones parametrized below.
    if "ide_file_name" in metafunc.fixturenames and "channel_id" in metafunc.fixturenames:
        test_chans = get_channels_from_input_parameters()
        test_order = []
        for file in get_files_from_input_parameters():
            for chan in test_chans:
                test_order.append((file, chan))
        metafunc.parametrize(["ide_file_name", "channel_id"], test_order, scope="session")
    # Separately handle situations where a test wants just the file name or just the channel
    if "test_parameters" in metafunc.fixturenames:
        test_parameter_object = get_test_param_from_whatever()
        metafunc.parametrize("test_parameters", [test_parameter_object])
This does a few weird things:
- Set the test order so we open the file, test all channels on it, then close the file, to minimize file overhead (not sure if this is a real problem)
- Each test gets the file, channel id, and the test parameters. The test will use the channel ID to pull its data out of the file, and get its parameters from the test parameter object. Parameterizing it at a higher level seems like it would increase overhead a lot
- I _think_ we parameterize the test parameters here because we may change them with command line arguments, or maybe we wanted to just generate the parameter object once?

Since these tests are linked to hardware, they all run on Raspis, so compute power is not huge, but I am going through some steps that reduce isolation, so maybe I should re-evaluate that choice.
In general, I might be better off if I just always write the test parameters to a file, and use a standard fixture to pull it out, instead of trying to handle command line vs environment.
I'm not sure on the best way to handle splitting up the file data or input parameter data without making more of a mess.

monochromagic
Jun 17, 2023

StumblyWumbly posted:

words & code

Right, so if I understand this correctly you are verifying that the measurements your hardware produces are correct? I definitely think you'd be better off writing to a file and reading from a fixture, because the current functionality seems bloated and less useful. Consider also if it's worth making files for channels individually (obviously not sure how much control you have over the generating code or if that even makes sense - I don't work with hardware).

Generally, being obvious rather than DRY or whatever is a boon when testing, in my opinion, because it makes tests easier to isolate and expand upon. Right now your conftest seems to work for some very specific cases, which might bite you in the future.

Also this

StumblyWumbly posted:

- Set the test order so we open the file, test all channels on it, then close the file, to minimize file overhead (not sure if this is a real problem)
seems... not great, but might be necessary in order to run on RasPis. But also my point about separate files for separate channels still stands here.

QuarkJets
Sep 8, 2008

StumblyWumbly posted:

I'm writing a bunch of PyTest code to run some integration tests. We have a top level test dispatcher that sends out some JSON parameters to a system that runs the code, then we use PyTest to check that everything ran correctly. We need the PyTest code to have access to the parameters, so the parameters go into the execution environment, and in conftest.py we grab the test parameters and parameterize them to make it accessible to the individual tests.

This feels like an inelegant solution to a standard situation, so I wonder if there's a better solution. Specifically:

StumblyWumbly posted:

- Am I over using conftest.py? Is it really the best solution to manage input parameters or is there something I'm missing?

Seems alright to me.

StumblyWumbly posted:

- As multiple people add tests, we have a bit of disagreement between philosophies of "Pass an object containing all parameters to each test, let the test grab what it wants" vs "Separate each test parameter to its own pytest parameter in conftest" Are there standards I should pay attention to here?

Each test should have a well-defined interface. Args are well-defined. A dataclass instance is well-defined. A dataclass is a great way to consolidate a large number of parameters in a test. Or if you're using something simpler, like a dictionary, that can be ok so long as it's well-documented.

Monolithic parameter objects should be avoided because they harm readability and make debugging harder when something goes wrong.
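To sketch the "well-defined interface" point (field names invented for illustration): a small frozen dataclass makes it obvious what a test consumes, and a helper only needs the fields it actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelTestParams:
    channel_id: int
    min_voltage: float
    max_voltage: float


def check_voltage_in_range(reading: float, params: ChannelTestParams) -> bool:
    """A check only touches the fields it needs; the interface stays narrow."""
    return params.min_voltage <= reading <= params.max_voltage
```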

StumblyWumbly posted:

- Should I just not use PyTest at all? Our test case is more "Check that multiple files meet the parameters provided", not "check one thing against multiple parameters". I like PyTest in general but it definitely feels like an awkward fit here.

It seems like an alright use to me

StumblyWumbly
Sep 12, 2007

Batmanticore!
Ok, thanks, sounds like things are a little ugly but I'm not missing anything big. Changing the file format is a neat idea but not possible because the format is part of the test, and pre-processing it would be a big change.
We are using a dataclass for the test parameters. I may move it into Pydantic for a little extra checking on input parameters.
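For what it's worth, a stdlib-only sketch of the kind of input checking a move to Pydantic would make declarative — a `__post_init__` guard on the dataclass (field names hypothetical):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcquisitionParams:
    sample_rate_hz: float
    num_channels: int

    def __post_init__(self) -> None:
        # The sort of validation Pydantic would express as field constraints.
        if self.sample_rate_hz <= 0:
            raise ValueError("sample_rate_hz must be positive")
        if self.num_channels < 1:
            raise ValueError("need at least one channel")
```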

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug
I asked over in the CI/CD thread, but I'm asking here as well: does anyone have experience with implementing Jupyter Notebooks as operational runbooks? Netflix apparently uses this approach, as do some other teams internally, but there are enough odd edges around it that I figure I should keep asking for advice.

Some minor notes:
- Our infra is mostly internal and proprietary (FAANG company), so we can't use most of the out-of-the-box integrations I see in other 'ops runbook' software/services. We do have Python clients for all of it, which is where Jupyter started looking pretty attractive.
- I'm reasonably confident I can tie this to organizational goals; we have a bunch of 'improve runbooks' and 'better impact tracking' goals that all seem pretty reasonable.
- The runbooks I'm looking to implement would mostly be incident response stuff at first, focusing on automating the 'gather information' section and then allowing incremental moves toward more advanced runbooks.
- We would be hosting some sort of centralized JupyterHub or other solution for this, rather than individual dev envs.

QuarkJets
Sep 8, 2008

Using git with jupyter notebooks is a huge clusterfuck, that's enough to keep me away from trying to use notebooks for ops

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

QuarkJets posted:

Using git with jupyter notebooks is a huge clusterfuck, that's enough to keep me away from trying to use notebooks for ops

Ugh yeah, I saw this. I heard there's a plugin or something to store them in a more sane format but haven't dug into it yet.
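For reference, the plugin people usually mean is Jupytext, which pairs each notebook with a plain-text script that diffs sanely (commands untested here; check the Jupytext docs before relying on them):

```shell
pip install jupytext
# Pair the notebook with a .py file in the "percent" cell format
jupytext --set-formats ipynb,py:percent runbook.ipynb
# After editing either file, sync the pair before committing
jupytext --sync runbook.ipynb
```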

Edit: Really, the problem here is that I need about half of the stuff Jupyter provides, so maybe it's better to just say that I'd like a platform that does the following:
- Web UI (so you can link to a specific runbook from tickets or other documentation, but also so we have a centralized, controlled environment for this - no risk of individual devs running outdated pipelines)
- Must allow for some form of authentication. I'll have to figure out how to tie in our internal auth systems no matter what, so I assume I can hack in most things as long as they aren't hardcoded to GitHub/etc.
- Can execute Python code inline with some form of documentation, Markdown preferred.
- Has the ability to set cells to read-only, while others can be writable (Doesn't matter if you can overcome the read-only via override steps, I just want to ensure there's a basic guardrail against accidental changes.)
- Must be OSS / Self-hostable
- Readable diffs and source control

It's the arbitrary code execution part that's trickiest, because understandably most documentation platforms are extremely wary of getting anywhere near it.

I looked at a few platforms, and while there are OSS options, a lot of them assume an agent setup for the infrastructures they support, and we basically already have solutions for most of that; monitoring etc. are solved problems that we shouldn't try to replace internally. Fiberplane and Runme.dev both fall into this category.

I know some internal teams are already doing this, so going to get some time with the people running their instances and get some feedback on the pros and cons they've run into and how they solve it.

Falcon2001 fucked around with this message at 22:37 on Nov 26, 2023

sugar free jazz
Mar 5, 2008

I'm trying to automate some analyses in UCINET6 on Windows to streamline a data analysis process and have no idea what to do at this point since this kind of automation isn't something i'm really experienced with. I've fiddled with powershell scripts, python, and AHK, but I can't get it to actually do anything after opening the application and I'm not sure how to even go about figuring that out outside of just emailing the guys who wrote it since it's a relatively small, technical piece of software and I'm not actually good at this type of programming. Any suggestions on what kind of avenues to go down to start figuring this out in Python?

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

sugar free jazz posted:

I'm trying to automate some analyses in UCINET6 on Windows to streamline a data analysis process and have no idea what to do at this point since this kind of automation isn't something i'm really experienced with. I've fiddled with powershell scripts, python, and AHK, but I can't get it to actually do anything after opening the application and I'm not sure how to even go about figuring that out outside of just emailing the guys who wrote it since it's a relatively small, technical piece of software and I'm not actually good at this type of programming. Any suggestions on what kind of avenues to go down to start figuring this out in Python?

If you're looking for a really hacky solution, then you could use something like https://pyautogui.readthedocs.io/en/latest/ but frankly that's not a particularly effective way of approaching the problem unless this is a personal workflow you can hand-manage.

I would look at if there's any way to run this from a commandline/etc or otherwise invoke the program in such a way that you can tell it to do something consistently. I don't see a documented setup for that from a quick google search, so reaching out to the support DL might not be a bad idea.

I'd approach this from the perspective of 'how can I make this program do its analysis with no human input, given appropriate files/etc in the right places'. Once you figure that out, Python is an excellent language for setting up the files/etc in the right places, invoking UCINET, and then analysing/manipulating the output.
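As a sketch of that last step — assuming you do find a way to invoke it non-interactively — the executable name and flags below are entirely hypothetical, since UCINET may not expose a command line interface at all:

```python
import subprocess
from pathlib import Path


def run_analysis(exe: str, input_file: Path, output_dir: Path) -> int:
    """Run an external analysis tool non-interactively and return its exit code."""
    output_dir.mkdir(parents=True, exist_ok=True)
    result = subprocess.run(
        [exe, str(input_file), "--out", str(output_dir)],
        capture_output=True,
        text=True,
        check=False,
    )
    # stdout/stderr are captured so a wrapper script can log or parse them.
    return result.returncode
```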

saintonan
Dec 7, 2009

Fields of glory shine eternal

Is there a tutorial somewhere on automating web scraping where I need the automaton to log in to an account, navigate to a page, then manipulate a GUI to produce a series of tables that the automaton scrapes and saves as a csv? I did this years ago using iMacros, but figured there'd be a relatively straightforward Python/Chromium solution.

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

saintonan posted:

Is there a tutorial somewhere on automating web scraping where I need the automaton to log in to an account, navigate to a page, then manipulate a GUI to produce a series of tables that the automaton scrapes and saves as a csv? I did this years ago using iMacros, but figured there'd be a relatively straightforward Python/Chromium solution.

Look into Selenium; it's the go-to for this sort of thing.
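A rough sketch of the shape of it — the login URL, field names, and selectors are placeholders you'd swap for whatever the real page uses, and Selenium is imported lazily so the CSV helper works on its own:

```python
import csv


def rows_to_csv(rows, path):
    """Write scraped table rows (lists of cell strings) to a CSV file."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def scrape_tables(url, username, password, out_path):
    # Imported here so this module loads even where Selenium isn't installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Placeholder selectors - inspect the real login form to find yours.
        driver.find_element(By.NAME, "username").send_keys(username)
        driver.find_element(By.NAME, "password").send_keys(password)
        driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
        rows = [
            [cell.text for cell in tr.find_elements(By.TAG_NAME, "td")]
            for tr in driver.find_elements(By.CSS_SELECTOR, "table tr")
        ]
        rows_to_csv(rows, out_path)
    finally:
        driver.quit()
```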

I. M. Gei
Jun 26, 2005

CHIEFS

BITCH



Okay can someone tell me what the gently caress this PyCharm error means and how the gently caress do I fix it? :ughh:

pre:
C:\Users\[USER_NAME]\PycharmProjects\pythonProject1\venv\Scripts\python.exe C:\Users\[USER_NAME]\PycharmProjects\pythonProject1\venv\[CODE_WINDOW_NAME].py
Unable to create process using '"C:\Users\[USER_NAME]\AppData\Local\Microsoft\WindowsApps\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\python.exe" C:\Users\[USER_NAME]\PycharmProjects\pythonProject1\venv\[CODE_WINDOW_NAME].py'

Process finished with exit code 101
I took a break from coding and just had to renew my PyCharm license to make it work again. I'm trying to run the code that's been up since I last used PyCharm in...... September, I think? It ran just fine back then, but now it's giving me this error message and not running at all.

What gives? How do I make my poo poo run again?

Hed
Mar 31, 2004

Fun Shoe
Did you uninstall Python 3.11 or upgrade to 3.12 since last time?
It looks like your venv is messed up. I’d take that “create process using” line one argument at a time and make sure all those files exist. Probably easier to bring up project settings and try to make a new venv dedicated to the project.
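If you'd rather do it by hand, something like this from a Windows terminal — the path is the one from your error message, and `py -3.11` assumes the Windows launcher plus whichever Python you actually have installed:

```shell
cd C:\Users\[USER_NAME]\PycharmProjects\pythonProject1
:: Remove the broken venv and rebuild it with a Python that exists
rmdir /s /q venv
py -3.11 -m venv venv
venv\Scripts\activate
:: Then reinstall whatever packages your project needs
```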

sugar free jazz
Mar 5, 2008

Falcon2001 posted:

If you're looking for a really hacky solution, then you could use something like https://pyautogui.readthedocs.io/en/latest/ but frankly that's not a particularly effective way of approaching the problem unless this is a personal workflow you can hand-manage.

I would look at if there's any way to run this from a commandline/etc or otherwise invoke the program in such a way that you can tell it to do something consistently. I don't see a documented setup for that from a quick google search, so reaching out to the support DL might not be a bad idea.

I'd approach this from the perspective of 'how can I make this program do its analysis with no human input, given appropriate files/etc in the right places'. Once you figure that out, Python is an excellent language for setting up the files/etc in the right places, invoking UCINET, and then analysing/manipulating the output.

this is in fact a personal workflow i can hand manage! pyautogui also feels very familiar since i've done some screen scraping with selenium, so this will hopefully work. wish it could be done quietly but don't let the perfect be the enemy of the good etc. appreciate it!
