jawbroken
Aug 13, 2007

messmate king
surely you can just dump the contents of the container, it’s not magic. if you’re trying to protect your IP you’re going to have to do it with copyright, patents, obfuscation, etc.

Foxfire_
Nov 8, 2010

Zugzwang posted:

Good to hear it's working reasonably well now. Being easy to reverse-engineer would be a dealbreaker though if I do need to make it closed-source.
Python is always going to be inherently easy to reverse engineer. You are ultimately feeding easily reversible bytecode into an interpreter. It's not going to be that hard to identify the bytecode in RAM, even if you obfuscated it a lot on the disk.
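A minimal sketch of the point about bytecode, using only the standard library. The function name `check_license` and the string `"hunter2"` are made up for illustration. It compiles a snippet, round-trips it through `marshal` (the serialization used inside .pyc files), and shows that names and literals come back intact.

```python
import dis
import marshal

# Hypothetical "protected" module source, invented for this example.
source = '''
def check_license(key):
    secret = "hunter2"
    return key == secret
'''

# Compile to a code object, as the interpreter does on import.
module_code = compile(source, "<protected>", "exec")

# Round-trip through marshal, the format used for the body of a .pyc file.
restored = marshal.loads(marshal.dumps(module_code))

# The function's own code object is stored among the module's constants.
func_code = next(c for c in restored.co_consts if hasattr(c, "co_code"))

print(func_code.co_name)      # the function name is still there
print(func_code.co_varnames)  # so are argument and local variable names
print(func_code.co_consts)    # and so are string literals
dis.dis(func_code)            # readable instruction listing
```

Anything that has to hand a code object to the interpreter can be inspected the same way once it is loaded, whatever was done to the file on disk.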

Zugzwang
Jan 2, 2005

You have a kind of sick desperation in your laugh.


Ramrod XTreme
Yeah, that makes sense. This is still hypothetical at this point, so I am mostly trying to figure out what paths I would take if we needed to go there.

To be frank, I’m not even sure it’s feasible for this to be closed source, and I’d rather it not be. Will ultimately be up to management at my organization though.

QuarkJets
Sep 8, 2008

jawbroken posted:

surely you can just dump the contents of the container, it’s not magic. if you’re trying to protect your IP you’re going to have to do it with copyright, patents, obfuscation, etc.

You can encrypt docker layers for in-flight protection, and singularity lets you encrypt the entire root file system of the image (including any source code living there). Singularity claims that their encryption is maintained even while the container is running

There's also pyce, which lets you deploy encrypted python bytecode (.pyce, which you just import like you would .pyc or .py modules). I'm assuming that'd be as reliable as encrypting an executable, no idea whether it actually works though

jawbroken
Aug 13, 2007

messmate king
i guess i don’t understand how that could work. i’m sure the file system can be encrypted on the drive, but it’s going to need to be decrypted so the program can be run because practical homomorphic encryption doesn’t exist. it must boil down to some form of DRM, essentially

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Reading the linked pyce post, it literally just decrypts the code into memory when loading it. So absolutely trivial to bypass for anyone who knows what they're doing. I don't care enough to go look up what singularity containers are, but based on that precedent I'm just gonna assume they're equally bad and won't stop a skilled attacker.

If you're worried about your software being ripped off by the people you're selling it to, the right answer is to put something in the contract they're signing with you that says they can't do that. (You should be doing that - specifying what they can and can't do with the software - even if you're not worried about them ripping it off!). Then if they breach it you can sue the pants off them. If your current legal team is not up to that sort of thing, fire them and get a new one that actually knows what they're doing as far as contract law goes.

Phobeste
Apr 9, 2006

never, like, count out Touchdown Tom, man
Lol I used to work for a company where I argued for making a Python API to interact with a device that we sold. Wouldn’t be useful without the physical object. They still wanted it “protected” (?????) so I used this library that obfuscated it by making every symbol named like 1lll1111l1, except it couldn’t do it for public symbols, and on and on. It loving sucked lol

Data Graham
Dec 28, 2009

📈📊🍪😋



Big "CEO just found out the Javascript code is just sent to the end user unencrypted and they can just steal all our code, find a way to encrypt it NOW" energy

QuarkJets
Sep 8, 2008

jawbroken posted:

i guess i don’t understand how that could work. i’m sure the file system can be encrypted on the drive, but it’s going to need to be decrypted so the program can be run because practical homomorphic encryption doesn’t exist. it must boil down to some form of DRM, essentially

Well yeah, it's DRM, I thought that's what OP was trying to do. The point that they advertise is that you can publish the encrypted bytecode to pypi and then you can give the keys to whoever pays you. It's the same thing you'd be doing with an encrypted executable but skips the step of building an executable

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Data Graham posted:

Big "CEO just found out the Javascript code is just sent to the end user unencrypted and they can just steal all our code, find a way to encrypt it NOW" energy

I once made a demo web app using Glide and it just sent my whole google sheet to the user.

boofhead
Feb 18, 2021

I just heard that our world wide website domain url is broadcasting an IP address!! Turn it off right now!!!!

QuarkJets
Sep 8, 2008

They've already pierced the firewall, we have to cut the hardline!

Data Graham
Dec 28, 2009

📈📊🍪😋



Make it so they can’t download our images!!

duck monster
Dec 15, 2004

Oysters Autobio posted:

Yeah I've yet to see any Python based GUI not just look like a slightly more modern version of Visual Basic. Like all the examples in qt designer or tkinter all just scream Microsoft office modal box. I think unfortunately we've all been exposed to too many web based GUIs which have defined the aesthetic and UX expectations in the past decade of "everything-as-a-webapp". Though it's surprising because all that it takes is some thicker/heavier input boxes, minimal gradients/colors/container lines, some highlight responsiveness and heavy padding so that your GUI looks less like this




and more like this:



Granted the latter is a React front end but still surprising you need these godawful webpack javascript monstrosity compilers to do anything in the front end but that's the web for you 🤷. The ppl who work on this as their day job all use javascript now so it's the most supported even if it's garbage.

Only exception as far as python based that I've seen is a few front end libraries like plotly Dash and streamlit. Granted these are aimed at the data world so I dunno how useful they are for anything more complex than a demo app.

If it's just as an internal tool another bit earlier option are Jupyter notebooks with ipywidgets and other ipython based widget libraries. ipyvuetify for example looks great. Google's collaboratory makes the best looking notebooks though they're all cloud based. If you still like working out of an IDE you can get VS Code extensions for them too.

QML stuff works great with Python and in the right hands it can do anything a web GUI can *and more*. You'll want a graphic/front end designer who's trained with QML, but my experience is that with a little training they love it. It's *far* saner than HTML/CSS/Frameworkhell. Same licensing bullshit as Qt, which can make bosses twitchy, buuuuut it's Python, not compiled, so as long as you just include the Qt DLLs as is (LGPL lets you work that way), it'll be fine.

Oysters Autobio
Mar 13, 2017
I'm a bit stuck on how to fix this beyond saying gently caress it and waiting for a sane data warehouse or API or something.

I'm trying to automate some processing of spreadsheets that eventually we turn into a Tableau dashboard but I'm completely stuck with password protection on Excel. It's one of these "encrypted password" level of passwords so unlike the Protect Workbook or Protect Worksheet I can't seem to remove the password from even within Excel.

I've run into this poo poo before so I decided to go for working this into pandas and openpyxl.

According to google and stack overflow, openpyxl should be able to read an .xlsx file even if "password protected", and then just saving it is enough to remove the password protection. Not clear if this also refers to the "encrypted password" setting or just the Protected Workbook passwords.

Problem is whenever I try to read one of the .xlsx files it outputs a "Bad Zip file error."

Tried this both with using openpyxl as a read engine on pandas and openpyxl on its own.

According to google this is usually because the sheets aren't actually configured as tables so when openpyxl tries to read the .xlsx file (which is basically a .zip) it doesn't see anything.

Lo and behold on manual inspection none of the sheets are formatted as tables, they're just text on cells.

Now, the solution in this case is usually just to read it in as a .csv, but I can't even open it in openpyxl to begin with so aside from doing it manually I don't really know what other ways I could actually work with these stupid things. Luckily aside from the formatting, the tables themselves are actually fairly clean.

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Oysters Autobio posted:

I'm a bit stuck on how to fix this beyond saying gently caress it and waiting for a sane data warehouse or API or something.

I'm trying to automate some processing of spreadsheets that eventually we turn into a Tableau dashboard but I'm completely stuck with password protection on Excel. It's one of these "encrypted password" level of passwords so unlike the Protect Workbook or Protect Worksheet I can't seem to remove the password from even within Excel.

I've run into this poo poo before so I decided to go for working this into pandas and openpyxl.

According to google and stack overflow, openpyxl should be able to read an .xlsx file even if "password protected", and then just saving it is enough to remove the password protection. Not clear if this also refers to the "encrypted password" setting or just the Protected Workbook passwords.

Problem is whenever I try to read one of the .xlsx files it outputs a "Bad Zip file error."

Tried this both with using openpyxl as a read engine on pandas and openpyxl on its own.

According to google this is usually because the sheets aren't actually configured as tables so when openpyxl tries to read the .xlsx file (which is basically a .zip) it doesn't see anything.

Lo and behold on manual inspection none of the sheets are formatted as tables, they're just text on cells.

Now, the solution in this case is usually just to read it in as a .csv, but I can't even open it in openpyxl to begin with so aside from doing it manually I don't really know what other ways I could actually work with these stupid things. Luckily aside from the formatting, the tables themselves are actually fairly clean.

I’m having a hard time understanding the particulars here.

If it’s actually an XLSX file can you unzip it as a password protected zip and copy the innards?

Alternatively, can this be done on a windows laptop that sits in a corner somewhere such that you use pyautogui or a cursed pywin32 interaction with excel? SOMETHING to get the data out, and into a place where it can be read into pandas.

CompeAnansi
Feb 1, 2011

I respectfully decline
the invitation to join
your hallucination
I know this isn't really helpful for solving your problem, but why is anything that ends up in a tableau dashboard starting out in spreadsheets? :negative:

Oysters Autobio
Mar 13, 2017

CarForumPoster posted:

I’m having a hard time understanding the particulars here.

If it’s actually an XLSX file can you unzip it as a password protected zip and copy the innards?

Alternatively, can this be done on a windows laptop that sits in a corner somewhere such that you use pyautogui or a cursed pywin32 interaction with excel? SOMETHING to get the data out, and into a place where it can be read into pandas.

By unzipping it do you mean with just like 7zip or something? I'll take a look and see if thats something I can do.

I have come across pywin32 but I wanted to exhaust more standard or common approaches and make sure I wasn't missing some more obvious solution

duck monster
Dec 15, 2004

CompeAnansi posted:

I know this isn't really helpful for solving your problem, but why is anything that ends up in a tableau dashboard starting out in spreadsheets? :negative:

Ah my naive child.

The real world is filled with these poisonous documents. Especially in the sciences and in government.

Wait till you get a load of how they do unicode.

Excel is hell, and excel is everywhere.

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Oysters Autobio posted:

By unzipping it do you mean with just like 7zip or something? I'll take a look and see if thats something I can do.

I have come across pywin32 but I wanted to exhaust more standard or common approaches and make sure I wasn't missing some more obvious solution

Docx and xlsx are literally zip files yea. The “document” is made from a couple XML files contained therein. I’m kinda assuming you’re right about your conclusions about the doc protection and proposing other, significantly worse, things to get it done. You def should not go with either of those if you can handle getting the data to Python via CLI or some mature library like pandas.
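One possible explanation for the "Bad Zip file" error, offered as a sketch rather than a diagnosis: an ordinary .xlsx is a zip, but a workbook saved with a password to open is, as far as I know, stored as an OLE compound file wrapping the encrypted package, so zip-based readers cannot open it regardless of how the sheets are laid out. Checking the first bytes of the file tells the two apart. The helper names `classify_header` and `sniff` are made up for this example.

```python
# Signature of a zip local file header (plain .xlsx/.docx start with this).
ZIP_MAGIC = b"PK\x03\x04"
# Signature of an OLE compound file (legacy .xls, and encrypted OOXML).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def classify_header(head: bytes) -> str:
    """Classify a file by its leading bytes."""
    if head.startswith(OLE_MAGIC):
        return "ole-container"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff(path: str) -> str:
    """Read the first 8 bytes of a file and classify them."""
    with open(path, "rb") as fh:
        return classify_header(fh.read(8))
```

If `sniff("report.xlsx")` (a hypothetical path) comes back as `"ole-container"`, the file needs to be decrypted with the password before pandas or openpyxl can read it.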

Zugzwang
Jan 2, 2005

You have a kind of sick desperation in your laugh.


Ramrod XTreme

duck monster posted:

Ah my naive child.

The real world is filled with these poisonous documents. Especially in the sciences and in government.

Wait till you get a load of how they do unicode.

Excel is hell, and excel is everywhere.
Just yesterday, Excel very helpfully removed the leading and trailing zeroes from numbers that were supposed to be strings. What would I do without assistance like this?

Biffmotron
Jan 12, 2007

Once I had some awful manually entered spreadsheet where a date column was in whatever format the interns thought was a good idea at the time: mm/dd/yy, mm-dd-yyyy, 22 Mar, 2023 and so on. Excel's "everything is a date" feature proved useful.

Once.

Oysters Autobio
Mar 13, 2017
Sigh, all this talk about the shittiness of Excel just keeps reminding me that while I really love UX/UI and data viz stuff, just thinking about JavaScript, CSS, HTML/DOM and bundle packages poo poo gives me nightmares.

C'mon, give me a full Python web dev stack. I know I've already complained about this but it still just doesn't make sense given Python's ubiquity, especially with ML and DS being all the rage.

Is it too much to ask for an entire industry of professionals to completely replace a tech stack? Huh???

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Oysters Autobio posted:

Sigh, all this talk about the shittiness of Excel just keeps reminding me that while I really love UX/UI and data viz stuff, just thinking about JavaScript, CSS, HTML/DOM and bundle packages poo poo gives me nightmares.

C'mon, give me a full Python web dev stack. I know I've already complained about this but it still just doesn't make sense given Python's ubiquity, especially with ML and DS being all the rage.

Does Dash count? I have multiple deployed Dash apps that are written 100% in python, but technically it’s react under the hood.

Oysters Autobio
Mar 13, 2017
Oh ya true, I haven't fully explored Dash enough and I really should.

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Oysters Autobio posted:

Oh ya true, I haven't fully explored Dash enough and I really should.

Dash + Dash bootstrap components is where it’s at for any single page app IMO.

Seventh Arrow
Jan 26, 2005

So I just got back from a hackerrank coding test and was whammied with the following:

quote:

Rearrange an array of integers so that the calculated value U is maximized. Among the arrangements that satisfy that test, choose the array with minimal ordering. The value of U for an array with n elements is calculated as:

U = arr[1] x arr[2] x (1 % arr[3]) x arr[4] x ... x arr[n-1] x (1 % arr[n]) if n is odd

It's not the whole thing. There was another algorithm for if n was even. I knew I wouldn't be able to solve it, so I just wrote down as much as I could until the timer ran out, so I could analyze it later.

I guess this requires dynamic programming(?) Does anyone think they could have solved this in 30 minutes? Hypothetically you're not supposed to use any outside materials, even websites, but that seems like a pretty goofy requirement. Being a developer often means using whatever tools you have at your disposal.

12 rats tied together
Sep 7, 2006

assuming those "x" are multiplication i would guess you're supposed to realize that multiplication is commutative and that modulus isn't (?), so the way to maximize the value of U is to maximize the result of 1 % arr[n].

which seems weird because 1 % x is either going to be a 0, a 1, or a negative number. if you have two negative numbers you want to get the largest negative number you can out of it, otherwise, you want the smallest.

if you have zero negative numbers you just want to make sure arr[n] is nonzero, i guess? weird question
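A quick check of how `1 % x` behaves in Python specifically, where the result takes the sign of the divisor. The helper `one_mod` is made up for this example; it returns `None` for zero, where Python raises `ZeroDivisionError`.

```python
def one_mod(x: int):
    """Return 1 % x, or None when x is zero (Python raises in that case)."""
    try:
        return 1 % x
    except ZeroDivisionError:
        return None


# Positive divisors above 1 give 1, plus or minus 1 gives 0,
# and divisors below -1 give a negative result equal to x + 1.
for x in (5, 2, 1, -1, -2, -5, 0):
    print(x, one_mod(x))
```

Other languages (C, Java, JavaScript) truncate toward zero instead, so the negative cases would differ there.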

Jabor
Jul 16, 2010

#1 Loser at SpaceChem
Trying to do this without any outside resources is some bullshit "do you know exactly how the Python % operator works".

Even once you know that, it's more of a bullshit math problem rather than a programming one.

Anyway the way to solve it is:
- Partition the input into two groups - one group that contains all the largest numbers (by absolute value), and the other group with the smallest numbers.
- Sort each group separately, in ascending order
- Fill the output array such that you take the lowest value from the large-number array whenever it will be multiplied directly, and the lowest value from the small-number array when it will be used in the % calculation.

Then there are a bunch of edge cases to cover:
- if there are any zeros, instead you just sort the array in ascending order and tweak it so that zero never gets used in a % calculation.
- if there are any ones (or negative ones), those go in the large-number partition. if there are so many ones that they'd spill over to the small-number partition regardless, ignore the partitions and just sort ascending.
- if there are an odd number of negative inputs, you actually want to multiply the smallest numbers directly (so that the final result is least negative).

Bad question imo.

Seventh Arrow
Jan 26, 2005

I'm not even sure how one would go about learning that kind of thing. On the one hand, I need to be able to pass coding tests if I want to get a job (that salesforce-based company gave me the boot btw), on the other hand, my job is usually building ETL pipelines and I've never once been in a situation where that required hashmaps or outer space algorithms.

SporkOfTruth
Sep 1, 2006

this kid walked up to me and was like man schmitty your stache is ghetto and I was like whatever man your 3b look like a dishrag.

he was like damn.
This might be more of a general OOP question, but it's Python-originated:
When should a variable be an input to a method of a class versus a class property?

Pseudospecifics:
I have a new class I'm writing, NewThing, that subclasses the original Thing class. Thing has a method called update() whose signature is update(self, input, **kwargs).

Here, **kwargs gets passed to an internal method that all implementing versions of update() use. I want to overwrite that with a NewThing.update() method that depends on a second (potentially external) variable, param1, so it'd be NewThing.update(self, inputs, param1, **kwargs) in the definition or at least uses param1 in its implementation.

But:
1. I know that instinct is likely wrong, because the methods have to have the same signature due to the class inheritance and the LSP.
2. If I just "kept" the original signature, I could parse **kwargs for param1 inside NewThing.update(), and pass the rest on to the internal method as a new **kwargs, but that seems like a bad idea in terms of documentation, type hinting, etc.
3. I could make param1 a class property, but that would mean I'd need a new instance every time the parameter changes, which seems too heavyweight.

This last point is relevant to how Thing/NewThing are used.
Usually, input comes from a loop:
code:
thing_instance = Thing()
for input in inputs:
    result = thing_instance.update(input)
But with NewThing the process will be more like:
code:
# "Functional" NewThing version 
# newthing_instance = NewThing()
for event in events:
    inputs = foo[event]
    param1_event = bar(inputs)
    # "Property" New Thing version?
    # newthing_instance = NewThing(param1=param1_event)
    for input in inputs:
        # Functional version
        result = newthing_instance.update(input, param1_event)
        # Property version 
        result = newthing_instance.update(input)
where foo is some dictionary-like thing, and bar() is some other function not specifically tied to the class.

What's the best practice here?

a dingus
Mar 22, 2008

Rhetorical questions only
Fun Shoe
My first instinct is that NewThing can't be a subclass of Thing because the signatures are different. Hiding a required param in **kwargs is just a sneaky workaround like you've said. I'd probably inspect your input within the events loop and either feed it to Thing or NewThing depending on what it goes to.

SporkOfTruth
Sep 1, 2006

this kid walked up to me and was like man schmitty your stache is ghetto and I was like whatever man your 3b look like a dishrag.

he was like damn.

a dingus posted:

My first instinct is that NewThing can't be a subclass of Thing because the signatures are different.

Well, I get to choose if the signatures are different, is the issue. Thing is a class from a package I'm using and I'm making NewThing to make an algorithm I'm working on compatible with the package's API. Strictly speaking, Thing is a subclass with specific other methods that I'm harnessing, derived from a parent metaclass MetaThing that defines the update function signature.

Mathematically and conceptually speaking, NewThing should be a subclass of Thing -- the equations implemented in update() are very similar, but for how param1 shows up.

a dingus posted:

I'd probably inspect your input within the events loop and either feed it to Thing or NewThing depending on what it goes to.

Unfortunately, at least in how I envision these classes being used, you would never use instances of both classes in the example work loop I used.

It occurs to me now that might have been confusing -- I want to use *only one* of the NewThing implementations in that processing loop, but I listed them together. The more correct way would have been to say that either we use the “param1 as function parameter way”
code:
# "Functional" NewThing version 
newthing_instance = NewThing()
for event in events:
    inputs = foo[event]
    param1_event = bar(inputs)
    for input in inputs:
        # Functional version
        result = newthing_instance.update(input, param1_event)
OR
the property way:
code:
for event in events:
    inputs = foo[event]
    param1_event = bar(inputs)
    # "Property" New Thing version
    newthing_instance = NewThing(param1=param1_event)
    for input in inputs: 
        result = newthing_instance.update(input)

a dingus
Mar 22, 2008

Rhetorical questions only
Fun Shoe
IMO I wouldn't change the function signature of NewThing.update() if it's going to be a subclass of Thing. Changing it means you can't use NewThing in place of Thing without changing the way things are written. NewThing wouldn't conform to the metaclass either.

I'd be skeptical of using **kwargs as a way to pass in your new param1, since in reality it sounds like it would be required in NewThing and it's confusing to the user without it being explicit. You'd have to know how NewThing.update() works to know you actually need to pass that in.

If I'm understanding correctly maybe NewThing would be better off as a new type which is composed of Thing instead of inheriting it.

Then I'd probably do this:

SporkOfTruth posted:

code:
# "Functional" NewThing version 
newthing_instance = NewThing()
for event in events:
    inputs = foo[event]
    param1_event = bar(inputs)
    for input in inputs:
        # Functional version
        result = newthing_instance.update(input, param1_event)

Another idea is inheriting Thing and overriding update to add param1_event as a parameter with a default value of None and doing different things depending on whether param1_event is passed in or not.

edit* oh boy I don't know how code snippets work

a dingus fucked around with this message at 21:54 on Mar 29, 2023

12 rats tied together
Sep 7, 2006

a dingus posted:

If I'm understanding correctly maybe NewThing would be better off as a new type which is composed of Thing instead of inheriting it.

Agree with this, had half a reply typed out but cancelled it because I was phone posting.

From the examples/text posted it seems like there is just Thing and it has one of two different types of UpdateBehavior, so I would model it like that, making sure that I understand fully what the difference between the UpdateBehaviors is and that I name them after it instead of having OldUpdateBehavior and NewUpdateBehavior.
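A rough sketch of that composition idea, under heavy assumptions: the real Thing comes from a third-party package, so everything here (`UpdateBehavior`, `PlainUpdate`, `ScaledUpdate`, the arithmetic) is a toy stand-in, and the behavior names are placeholders for whatever the real difference turns out to be.

```python
from dataclasses import dataclass
from typing import Protocol


class UpdateBehavior(Protocol):
    """Anything that can turn (state, value) into a new state."""

    def apply(self, state: float, value: float) -> float: ...


class PlainUpdate:
    """Toy stand-in for the original update rule."""

    def apply(self, state: float, value: float) -> float:
        return state + value


@dataclass
class ScaledUpdate:
    """Toy stand-in for the variant that needs param1."""

    param1: float

    def apply(self, state: float, value: float) -> float:
        return state + self.param1 * value


class Thing:
    """Holds state and delegates the update rule to its behavior."""

    def __init__(self, behavior: UpdateBehavior):
        self.behavior = behavior
        self.state = 0.0

    def update(self, value: float, **kwargs) -> float:
        # kwargs are accepted to keep the original signature; unused in this toy.
        self.state = self.behavior.apply(self.state, value)
        return self.state
```

With this shape, `Thing(PlainUpdate())` and `Thing(ScaledUpdate(param1=0.5))` share one `update()` signature, and param1 lives on the behavior object rather than in the call.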

QuarkJets
Sep 8, 2008

Are you sure that you need a subclass? Do you need additional persistence of state beyond what Thing provides or are you just wrapping its update method with some pre and post processing?

SporkOfTruth
Sep 1, 2006

this kid walked up to me and was like man schmitty your stache is ghetto and I was like whatever man your 3b look like a dishrag.

he was like damn.

QuarkJets posted:

Are you sure that you need a subclass? Do you need additional persistence of state beyond what Thing provides or are you just wrapping its update method with some pre and post processing?

I guess I should stop beating around the bush about the functionality, because as much as I appreciate the help, I think my obfuscation is loving with the understanding.

This is a recursive filter class (like a Kalman filter) that handles the update step, but that also involves calculations of the filter gain, pre-update predictions, and (potentially) linearizations of the measurement function around the prior predicted state. Each of those elements is an internal method of the base class, which get called in sequence in the update() method. The lowest level meta class wraps both an update() method and an explicit observation model object as a property, and operates on bigger structures that carry metadata about the state being filtered.

My new subclass is a variant of the filter that takes in a particular reduction parameter that only enters through an overwritten version of one internal method if I do it right, so I do, indeed, need to subclass if I want those other internal methods "for free" (and for super().update() to automatically grab my new method).

This makes the property form more appealing, because I can just add a property for the reduction parameter in the subclass and call self.param1 inside my implementation of the internal method. It then would persist if I need to call it for multiple observations. If I used the functional form, I’d have to rewrite update() and all the methods that pass into it up to the internal method where it’s finally used.

a dingus posted:

Another idea is inheriting Thing and overriding update to add param1_event as a parameter with a default value of None and doing different things depending on whether param1_event is passed in or not.

This is somewhat intriguing though. For future reference, would you write this like
code:
class NewThing(Thing):
    # other things go here
    def update(self, input, param1=None, **kwargs):
        # do poo poo
        ...
or am I misunderstanding?

SporkOfTruth fucked around with this message at 03:56 on Mar 30, 2023

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug
Inheritance is the original sin of OOP, so it's probably good to be wary about it. Is there a reason you can't just subclass it but add a new method instead of overwriting the old one? 'fancy_update()' or something like that.

There's also a question of how far this is going to be used; if you're vending this out or expecting anyone else to reuse it the bar is higher for 'confusing footguns to leave people' but if it's a project that only you are going to use...I mean lord knows there's no programming police.

SporkOfTruth
Sep 1, 2006

this kid walked up to me and was like man schmitty your stache is ghetto and I was like whatever man your 3b look like a dishrag.

he was like damn.
A new method name would make no sense here, since the existing update() method would still exist if I subclassed it from the Kalman updater class and it would be mathematically wrong for the new algorithm represented by my class.

Additionally, any other downstream class that relies on Updaters would absolutely break....aha, which is why we can't change the interface. Property it is!

QuarkJets
Sep 8, 2008

Yeah, if it makes sense to define a subclass, and if you'd otherwise have to pass in this parameter to a bunch of calls of the same method anyway, then I think a property sounds like a good choice
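A minimal sketch of the property approach settled on above, with a toy base class standing in for the package's updater. The method name `_gain`, the arithmetic, and the per-event values are all hypothetical. It also shows that a plain attribute can be reassigned between events, so a new instance per event is not strictly required; whether carried-over state should persist across events depends on the real filter.

```python
class Thing:
    """Toy stand-in for the package's base updater."""

    def __init__(self):
        self.state = 0.0

    def _gain(self) -> float:
        # Internal step that update() calls; subclasses may override it.
        return 1.0

    def update(self, value: float, **kwargs) -> float:
        self.state += self._gain() * value
        return self.state


class NewThing(Thing):
    """Same update() signature; param1 only enters via the internal method."""

    def __init__(self, param1: float = 1.0):
        super().__init__()
        self.param1 = param1

    def _gain(self) -> float:
        return self.param1 * super()._gain()


# Hypothetical events, each with its own inputs and reduction parameter.
events = {"a": [1.0, 2.0], "b": [4.0]}
params = {"a": 0.5, "b": 2.0}

thing = NewThing()
results = []
for name, inputs in events.items():
    thing.param1 = params[name]  # update the attribute once per event
    for value in inputs:
        results.append(thing.update(value))  # inherited update(), unchanged
```

Because `update()` keeps its original signature, anything downstream that expects a Thing can be handed a NewThing without changes.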
