Howard Phillips
May 4, 2008

His smile; it shines in the darkest of depths. There is hope yet.

TooMuchAbstraction posted:

PIL, the Python Imaging Library, makes it pretty easy to input/output images, but for output you still have to arrange the data as RGB pixel data. I'm not sure what you're trying to accomplish with the voronoi diagram, but if you want a simple way to output vector graphics, you can print HTML containing an <svg> tag, use the vector directives (e.g. <line>, <polygon>, etc.) to add drawings, and then save the result to file and open it in your browser. I absolutely would not be bothering to read/write even a simple file format like BMP directly. Let the libraries do the heavy lifting for you.

How are you going to define the "most significant points" of the image? Start simple with the periodic sampling. If you have pixel data as a 3D numpy array (one axis width, one axis height, one axis for RGB) named pixels, you can make a view of that array that has every 10th pixel by doing pixels[::10, ::10]

Thanks this helps a lot.

The "most significant" points... I guess what I'm trying to accomplish is to represent the original image in an "artistic" fashion using the Voronoi diagram algorithm.

For example, if I were to sample your avatar, I'd want to make sure the sampling doesn't miss any information that would make it hard to make out the critical details in the output. So the upper left portion is almost one color with relatively little information, whereas the face has lots of information and would obviously need more samples per unit area. If that makes sense.

The point of the Voronoi diagram is just to demonstrate one of the algorithms we discussed in class. Nothing practical but to kind of show an implementation of the theory.
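One way to get "more samples where there's more information" is to weight a random draw by local contrast. This is only a sketch of one possible heuristic, not anything from the class material: it assumes the image is already a grayscale list of rows, and detail_map, sample_points, and the floor parameter are names made up for illustration.

```python
import random

def detail_map(gray):
    """Per-pixel detail score: absolute difference to the right and lower neighbours."""
    h, w = len(gray), len(gray[0])
    scores = []
    for y in range(h):
        row = []
        for x in range(w):
            dx = abs(gray[y][x] - gray[y][min(x + 1, w - 1)])
            dy = abs(gray[y][x] - gray[min(y + 1, h - 1)][x])
            row.append(dx + dy)
        scores.append(row)
    return scores

def sample_points(gray, n, floor=1.0, seed=0):
    """Draw n (x, y) points, favouring high-detail pixels.

    floor is a hypothetical knob that keeps flat regions from getting zero weight.
    Points are drawn with replacement, so duplicates are possible.
    """
    scores = detail_map(gray)
    coords = [(x, y) for y in range(len(gray)) for x in range(len(gray[0]))]
    weights = [scores[y][x] + floor for x, y in coords]
    rng = random.Random(seed)
    return rng.choices(coords, weights=weights, k=n)

# Toy image: flat left half, checkerboard right half.
image = [[0 if x < 8 else 255 * ((x + y) % 2) for x in range(16)] for y in range(16)]
points = sample_points(image, 200)
busy = sum(1 for x, y in points if x >= 8)  # samples that landed in the detailed half
```

The sampled points would then serve as the Voronoi sites. A real image would probably want a smoother detail measure than a two-neighbour difference, but the weighting idea stays the same.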


Xerophyte
Mar 17, 2008

This space intentionally left blank
Note that PIL itself is no longer being developed; the fork that's actually being developed is Pillow. PIL and Pillow are also the only Python image libraries I've actually used. I can confirm that they both work, but they're also kind of slow and limited and crappy. I may be biased because Pillow is one of the very many libraries made by people who think images contain at most 4 interleaved channels of byte-precision 2D data, which is the sort of widespread Bad Assumption that makes my daily life harder. This quite likely doesn't matter as much to you.

There are also python bindings for the excellent openimageio. I haven't used OIIO through python so it's possible that those bindings suck, but the C++ library is awesome.*

As for your "most significant" points, you're basically talking about finding a set of N voronoi centers and colors that minimize the error of your voronoi approximation when compared to the original image according to some metric. This problem is, in general, hard. Both in finding a good metric -- you probably care more about edges than average color, so even a normally "good" image error metric like SSIM might not actually give great results -- and in finding a global minimum. Optimizing a Voronoi tessellation generally smells a lot like not-polynomial to me. You can do something generic like simulated annealing to find a decent tessellation, but don't expect to be able to compute the global optimum.
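To make "error of your voronoi approximation" concrete, here is a rough sketch of the cost function a search like simulated annealing would try to minimize. It assumes a grayscale image stored as a list of rows and uses plain mean squared error, which is a stand-in rather than a recommended metric; nearest_site, voronoi_approximation, and mse are illustrative names.

```python
def nearest_site(x, y, sites):
    """Index of the site closest to pixel (x, y) by squared Euclidean distance."""
    return min(
        range(len(sites)),
        key=lambda i: (sites[i][0] - x) ** 2 + (sites[i][1] - y) ** 2,
    )

def voronoi_approximation(gray, sites):
    """Fill every Voronoi cell with the mean grey value of the pixels it covers."""
    h, w = len(gray), len(gray[0])
    owner = [[nearest_site(x, y, sites) for x in range(w)] for y in range(h)]
    totals = [0.0] * len(sites)
    counts = [0] * len(sites)
    for y in range(h):
        for x in range(w):
            totals[owner[y][x]] += gray[y][x]
            counts[owner[y][x]] += 1
    means = [t / c if c else 0.0 for t, c in zip(totals, counts)]
    return [[means[owner[y][x]] for x in range(w)] for y in range(h)]

def mse(a, b):
    """Mean squared error between two equally sized grayscale images."""
    h, w = len(a), len(a[0])
    return sum((a[y][x] - b[y][x]) ** 2 for y in range(h) for x in range(w)) / (h * w)

# Toy image: dark left half, bright right half.
image = [[0 if x < 4 else 200 for x in range(8)] for y in range(8)]
good = mse(image, voronoi_approximation(image, [(1, 4), (6, 4)]))  # cells follow the edge
bad = mse(image, voronoi_approximation(image, [(4, 1), (4, 6)]))   # cells cut across it
```

With two sites, placing them on either side of the edge reproduces the toy image exactly, while stacking them vertically averages both halves together. An optimizer would repeatedly nudge the sites and keep moves that lower this number.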


[E:] I did a quick investigation, mostly for my own benefit, and, yeah, the python oiio bindings kinda suck. python2 only, no PyPI package, basically require building oiio itself from scratch which is a giant goddamn pain on windows. Dammit Larry, spend some of that sweet Emoji Movie money on improving your nice open source tools...

Xerophyte fucked around with this message at 07:03 on Nov 1, 2017

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



TooMuchAbstraction posted:

PIL, the Python Imaging Library, makes it pretty easy to input/output images, but for output you still have to arrange the data as RGB pixel data. I'm not sure what you're trying to accomplish with the voronoi diagram, but if you want a simple way to output vector graphics, you can print HTML containing an <svg> tag, use the vector directives (e.g. <line>, <polygon>, etc.) to add drawings, and then save the result to file and open it in your browser. I absolutely would not be bothering to read/write even a simple file format like BMP directly. Let the libraries do the heavy lifting for you.

Speaking of having a library do the heavy lifting, pretty sure https://bokeh.pydata.org/en/latest/ will make Voronoi diagrams in SVG for you.
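If you'd rather not pull in a plotting library, the hand-rolled <svg> route from the quote is only a few lines. A minimal sketch, assuming you already have each cell as a list of (x, y) vertices plus a fill color; polygon_svg and the two example cells are made up for illustration:

```python
def polygon_svg(polygons, width, height):
    """Return an SVG document string with one <polygon> per (points, fill) pair."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for points, fill in polygons:
        coords = " ".join(f"{x},{y}" for x, y in points)
        parts.append(f'  <polygon points="{coords}" fill="{fill}" stroke="black"/>')
    parts.append("</svg>")
    return "\n".join(parts)

# Two hypothetical cells sharing the edge from (100, 0) to (60, 100).
cells = [
    ([(0, 0), (100, 0), (60, 100), (0, 100)], "#336699"),
    ([(100, 0), (200, 0), (200, 100), (60, 100)], "#cc9933"),
]
svg = polygon_svg(cells, 200, 100)
# Write `svg` to a file ending in .svg and open it in a browser to view it.
```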

keep it down up there!
Jun 22, 2006

How's it goin' eh?

Quick MySQL question. I have currency values stored right now without the decimal place. What's a quick way I can replicate the column to add them?
I tried FORMAT(value,2) but it seems to add 2 trailing zeroes rather than adding the decimal 2 characters in. The data always has the cents as the last 2 characters, even if there is an even dollar value.

Example:

4000 becomes 40.00
1535 becomes 15.35

MrMoo posted:

FORMAT(value/100,2) ?

Wow I'm dumb. Not sure why it slipped my mind to divide by 100.
Thanks.

keep it down up there! fucked around with this message at 22:05 on Nov 1, 2017

MrMoo
Sep 14, 2000

FORMAT(value/100,2) ?

Pollyanna
Mar 5, 2005

Milk's on them.


Regex question: how do I capture only the first part of a potential float? Say I have a few strings like “85.”, “85.98.”, “85.98.9”, “.80.9”, and “8..5”. I want to capture “85”, “85.98”, “85.98”, “.80”, and “8” from each string, respectively. I’m having trouble figuring out a regex that captures the group I want, and I’m basically doing it all manually/with substrings, and that’s hella buggy :(

Basically, I’m trying to sanitize potentially-nonsensical float-like strings.

Eela6
May 25, 2007
Shredded Hen

Pollyanna posted:

Regex question: how do I capture only the first part of a potential float? Say I have a few strings like “85.”, “85.98.”, “85.98.9”, “.80.9”, and “8..5”. I want to capture “85”, “85.98”, “85.98”, “.80”, and “8” from each string, respectively. I’m having trouble figuring out a regex that captures the group I want, and I’m basically doing it all manually/with substrings, and that’s hella buggy :(

Basically, I’m trying to sanitize potentially-nonsensical float-like strings.

This should work:


code:
^([0-9]*[.]?[0-9]+)[0-9.]*?$
"starting at the beginning of the string, capture any number of digits, then at most one period, then at least one digit. continue to absorb additional digits or periods, but don't capture them."

This satisfies all your requests: https://regex101.com/r/QKR9PG/2
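For anyone who wants to check this locally instead of on regex101, here is a small harness that runs the pattern over the five example strings using Python's re module. The first_float wrapper name is just for illustration.

```python
import re

PATTERN = re.compile(r"^([0-9]*[.]?[0-9]+)[0-9.]*?$")

def first_float(s):
    """Return the captured float-like prefix, or None if the string doesn't match."""
    m = PATTERN.match(s)
    return m.group(1) if m else None

# Inputs and the captures requested in the original question.
cases = {
    "85.": "85",
    "85.98.": "85.98",
    "85.98.9": "85.98",
    ".80.9": ".80",
    "8..5": "8",
}
results = {s: first_float(s) for s in cases}
```

Strings with no digits at all, or with characters other than digits and periods, don't match and come back as None.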

Eela6 fucked around with this message at 21:38 on Nov 1, 2017

lifg
Dec 4, 2000
<this tag left blank>
Muldoon
/^ ( (?:\d+)? (?:\.\d+)? ) .* $/x

ulmont
Sep 15, 2010

IF I EVER MISS VOTING IN AN ELECTION (EVEN AMERICAN IDOL) ,OR HAVE UNPAID PARKING TICKETS, PLEASE TAKE AWAY MY FRANCHISE

Pollyanna posted:

Regex question: how do I capture only the first part of a potential float? Say I have a few strings like “85.”, “85.98.”, “85.98.9”, “.80.9”, and “8..5”. I want to capture “85”, “85.98”, “85.98”, “.80”, and “8” from each string, respectively. I’m having trouble figuring out a regex that captures the group I want, and I’m basically doing it all manually/with substrings, and that’s hella buggy :(

Substrings should be fine here, since you want to capture from the beginning until the second "."? Check the first index of . and the second index of ., and then stop before the second one if it exists?
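A rough sketch of that index-based idea, using str.find to locate the first and second period. It assumes the input contains only digits and periods and does no validation; first_float_prefix is an illustrative name.

```python
def first_float_prefix(s):
    """Keep everything before the second '.', then drop a dangling trailing '.'."""
    first = s.find(".")
    if first == -1:
        return s  # no period at all, nothing to cut
    second = s.find(".", first + 1)
    prefix = s if second == -1 else s[:second]
    return prefix.rstrip(".")

examples = ["85.", "85.98.", "85.98.9", ".80.9", "8..5"]
cleaned = [first_float_prefix(s) for s in examples]
```

The rstrip handles inputs like "85." and "8..5", where the kept prefix would otherwise end in a period.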

Eela6
May 25, 2007
Shredded Hen

ulmont posted:

Substrings should be fine here, since you want to capture from the beginning until the second "."? Check the first index of . and the second index of ., and then stop before the second one if it exists?

I agree, actually. Regexes are not actually necessary here.

EG:
Python code:
_digits = frozenset("0123456789")

def parse_float(s: str) -> float:
    seen_dot = False  # True once the first "." has been consumed
    for i, c in enumerate(s):
        if c == "." and seen_dot:
            # Second ".": keep only what comes before it.
            return float(s[:i])
        elif c == ".":
            seen_dot = True
        elif c not in _digits:
            raise ValueError("not remotely a valid float")
    return float(s)

The Fool
Oct 16, 2003


code:
^((\d*\.\d+)|(\d*))
This is my pass at it.

edit: regex golf

lifg
Dec 4, 2000
<this tag left blank>
Muldoon

Eela6 posted:

I agree, actually. Regexes are not actually necessary here.

EG:
Python code:
_digits = frozenset("0123456789")

def parse_float(s: str) -> float:
    seen_dot = False  # True once the first "." has been consumed
    for i, c in enumerate(s):
        if c == "." and seen_dot:
            # Second ".": keep only what comes before it.
            return float(s[:i])
        elif c == ".":
            seen_dot = True
        elif c not in _digits:
            raise ValueError("not remotely a valid float")
    return float(s)

This code is iterating over the characters of a string, looking for patterns, with a small state machine. This is exactly the time for a regexp.

(Admission: I actually like regexps, so I may be insane.)

Eela6
May 25, 2007
Shredded Hen

lifg posted:

This code is iterating over the characters of a string, looking for patterns, with a small state machine. This is exactly the time for a regexp.

(Admission: I actually like regexps, so I may be insane.)

I think either way is fine. I tend to prefer non-regexps where possible because I can more easily reason about the possible bounds and they mean there are 'less languages' in my code.

Eg, with this python snippet here, I know for sure that the string will be iterated through exactly once, and a reader who is remotely familiar with python should understand it in its entirety. Obviously, most experienced developers understand RE, but your audience when writing code is not always other experienced developers.

Also, at the risk of extreme bikeshedding, I am of the opinion you should always use [0-9] rather than \d in your regular expressions. It's faster and more explicit. Technically, "\d" means 'any unicode digit' - it would allow non-ASCII digit characters, which shouldn't be anywhere near a float sanitizer.
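A quick way to see the difference in Python 3, where \d on str patterns matches Unicode decimal digits unless the re.ASCII flag is set. The Arabic-Indic digits below are just a convenient example input:

```python
import re

arabic_indic = "\u0663\u0664"  # ARABIC-INDIC DIGIT THREE and DIGIT FOUR

# \d accepts any Unicode decimal digit on str patterns.
unicode_d = re.fullmatch(r"\d+", arabic_indic) is not None

# An explicit ASCII class rejects them.
ascii_class = re.fullmatch(r"[0-9]+", arabic_indic) is not None

# re.ASCII restricts \d to [0-9] without rewriting the pattern.
ascii_flag = re.fullmatch(r"\d+", arabic_indic, flags=re.ASCII) is not None
```

Here unicode_d comes out True while ascii_class and ascii_flag are both False.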

I don't think you're crazy for liking to write regexes. You're crazy if you like reading them.

Eela6 fucked around with this message at 22:30 on Nov 1, 2017

Linear Zoetrope
Nov 28, 2011

A hero must cook
I'd like regular expressions more if they tended to be the pure kind. In my ideal programming world there'd be commonly used DFA regex compilers and PDA BNF compilers. Often with the backtracking flavor, you pay for that even if what you're using could be done by a finite automaton. The backtracking syntax also just makes them a pain to read, while finite-automata equivalent ones are, while not pleasant, fairly scannable with a quick look. As is, regexes just strike me as a golfing language that's embedded in one way or another in every other one.
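For a sense of what the finite-automaton version looks like, here is a hand-written DFA for the float-prefix problem from earlier in the thread. It is only a sketch: the state names and the float_prefix function are invented for illustration, and a real DFA regex compiler would generate a table like this for you.

```python
DIGITS = frozenset("0123456789")

# (state, input class) -> next state; a missing entry means "stop scanning".
TRANSITIONS = {
    ("start", "digit"): "int",
    ("start", "dot"): "dot",
    ("int", "digit"): "int",
    ("int", "dot"): "dot",
    ("dot", "digit"): "frac",
    ("frac", "digit"): "frac",
}
ACCEPTING = {"int", "frac"}  # states where the text read so far is a valid number

def float_prefix(s):
    """Return the longest float-like prefix of s in one pass, with no backtracking."""
    state = "start"
    last_accept = 0  # end index of the longest accepted prefix seen so far
    for i, c in enumerate(s):
        kind = "digit" if c in DIGITS else "dot" if c == "." else "other"
        state = TRANSITIONS.get((state, kind))
        if state is None:
            break
        if state in ACCEPTING:
            last_accept = i + 1
    return s[:last_accept]

examples = ["85.", "85.98.", "85.98.9", ".80.9", "8..5"]
prefixes = [float_prefix(s) for s in examples]
```

Each character is looked at once, and the only extra state is the position of the last accepting prefix.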

Linear Zoetrope fucked around with this message at 00:37 on Nov 2, 2017

Love Stole the Day
Nov 4, 2012
Please give me free quality professional advice so I can be a baby about it and insult you

Eela6 posted:

This satisfies all your requests: https://regex101.com/r/QKR9PG/2

Pro click!

Eela6
May 25, 2007
Shredded Hen

Regex101 owns and I wish it or something like it were built into every IDE.

LongSack
Jan 17, 2003

Anyone into sharepoint development? I have a web site I set up that would allow my coworkers to download installers and documentation for programs I have written, as well as allowing the programs to check for updated versions. My boss wants me to move this to an internal site using sharepoint, and I have zero clue as to what that involves. The current site uses PHP on the back end, javascript/jquery/css on the front end.

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.
The only thing I’ve ever heard about SharePoint is "don't".

You can multiply that anecdote by the 0 minutes of my life I’ve ever used SharePoint to determine its true value.

The Fool
Oct 16, 2003


LongSack posted:

Anyone into sharepoint development? I have a web site I set up that would allow my coworkers to download installers and documentation for programs I have written, as well as allowing the programs to check for updated versions. My boss wants me to move this to an internal site using sharepoint, and I have zero clue as to what that involves. The current site uses PHP on the back end, javascript/jquery/css on the front end.

Online or on prem? If on prem, which version? If online, modern sites or classic sites?

The good news: your front end stuff can be done with the same technologies. If using modern sites and the sharepoint framework (https://docs.microsoft.com/en-us/sharepoint/dev/spfx/sharepoint-framework-overview), you could even use react. (This is the best of a bunch of bad options.)

The bad news: your back end is going to be using something like document libraries or lists. Get to reading that API documentation.

Capri Sun Tzu
Oct 24, 2017

by Reene

LongSack posted:

Anyone into sharepoint development? I have a web site I set up that would allow my coworkers to download installers and documentation for programs I have written, as well as allowing the programs to check for updated versions. My boss wants me to move this to an internal site using sharepoint, and I have zero clue as to what that involves. The current site uses PHP on the back end, javascript/jquery/css on the front end.
I worked on SharePoint 2010 and 2013 for a few years. It's not a terrible platform, but you'll sink a lot of time into learning how SharePoint works. You'll be storing stuff in lists and libraries, which work fine as long as you don't have a butt-ton of rows. There's an out of the box JavaScript API for your frontend, and for your PHP you can use the REST API. When I worked on SP I just built my own layer over the REST API because the JS API was confusing and difficult to work with.

You get pretty decent security and access control out of the box as well as document versioning which might be useful for what you're doing.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Has modern ML or deep learning had any effect on client-side OCR?

I ask because I was looking around at commercial OCR packages and they don't really seem to have updated much in the past few years, but that might just be the UI and features stayed the same but the OCR engine got better...

LP0 ON FIRE
Jan 25, 2006

beep boop
Is it retarded to think of using Swift as a replacement for PHP on a Linux server?

taqueso
Mar 8, 2004


:911:
:wookie: :thermidor: :wookie:
:dehumanize:

:pirate::hf::tinfoil:

Probably not, you are replacing PHP so it could be almost anything and be better. (I don't know much about swift but it doesn't seem to be a dumpster fire.)

mystes
May 31, 2006

Thermopyle posted:

Has modern ML or deep learning had any effect on client-side OCR?

I ask because I was looking around at commercial OCR packages and they don't really seem to have updated much in the past few years, but that might just be the UI and features stayed the same but the OCR engine got better...
No I've actually been complaining about this to random people recently, too. It seems like there is no off-the-shelf commercial OCR that uses machine learning at all. Same for voice recognition and machine translation. It's all cloud services now. (At least the techniques are getting published openly, but the lack of training data would make it hard to roll your own alternative.)

sarehu
Apr 20, 2007

(call/cc call/cc)

LP0 ON FIRE posted:

Is it retarded to think of using Swift as a replacement for PHP on a Linux server?

No. Might be a little immature tech though.

Goonerousity
Sep 25, 2017

aww yeah

LP0 ON FIRE posted:

Is it retarded to think of using Swift as a replacement for PHP on a Linux server?

Try WebASM. It's pretty elite.

LLSix
Jan 20, 2010

The real power behind countless overlords

mystes posted:

No I've actually been complaining about this to random people recently, too. It seems like there is no off-the-shelf commercial OCR that uses machine learning at all. Same for voice recognition and machine translation. It's all cloud services now. (At least the techniques are getting published openly, but the lack of training data would make it hard to roll your own alternative.)

http://ai.stanford.edu/~btaskar/ocr/ Literally the first google result for OCR dataset is a free dataset.

I know from doing facial recognition work that there are lots of free datasets of faces. I'd be shocked if there weren't several training sets for all the fields you listed.

mystes
May 31, 2006

LLSix posted:

http://ai.stanford.edu/~btaskar/ocr/ Literally the first google result for OCR dataset is a free dataset.

I know from doing facial recognition work that there are lots of free datasets of faces. I'd be shocked if there weren't several training sets for all the fields you listed.
My impression was that most of the datasets that are available are toy datasets intended purely for researching/evaluating machine learning techniques, and not sufficient for building a working product in themselves, but perhaps I'm wrong.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

mystes posted:

My impression was that most of the datasets that are available are toy datasets intended purely for researching/evaluating machine learning techniques, and not sufficient for building a working product in themselves, but perhaps I'm wrong.

That's exactly what I thought, but now that I think about it, I'm not sure where I came across that impression.

carry on then
Jul 10, 2010

by VideoGames

(and can't post for 10 years!)

LP0 ON FIRE posted:

Is it retarded to think of using Swift as a replacement for PHP on a Linux server?

No there are a few projects out there that let you do that.

https://stormpath.com/blog/swift-on-the-server-today

Love Stole the Day
Nov 4, 2012
Please give me free quality professional advice so I can be a baby about it and insult you
Not sure if this has been asked before here, but are there any tips or common tricks/themes to navigating and acquainting yourself with an open source project that you're interested in trying to contribute to? I ask because I was critiqued that my Github repos are all small projects and so the documentation and code bases are not impressive enough to make people want to call back for an interview. So I'm looking around for stuff to try and work on in the hopes that it can help with finding a full-time job.

LP0 ON FIRE
Jan 25, 2006

beep boop
Thank you for all the Swift server opinions.

csammis
Aug 26, 2003

Mental Institution

Love Stole the Day posted:

Not sure if this has been asked before here, but are there any tips or common tricks/themes to navigating and acquainting yourself with an open source project that you're interested in trying to contribute to? I ask because I was critiqued that my Github repos are all small projects and so the documentation and code bases are not impressive enough to make people want to call back for an interview. So I'm looking around for stuff to try and work on in the hopes that it can help with finding a full-time job.

I don't have a good answer to your question - my answer is "look at open issues and try to solve them" - but who critiqued you for that? An actual hiring manager, a recruiter, a peer reviewer, a not-peer reviewer? It sounds like a very bullshit thing to get called out on.

huhu
Feb 24, 2006

csammis posted:

I don't have a good answer to your question - my answer is "look at open issues and try to solve them" - but who critiqued you for that? An actual hiring manager, a recruiter, a peer reviewer, a not-peer reviewer? It sounds like a very bullshit thing to get called out on.

This. Care to share your Github?

As far as getting involved - I just joined a company where I'm collaborating for the first time. I'd say read up on their guidelines for formatting, coding standards, and how pull requests are done. Keep an eye on bugs and see if there is an easy one you could solve. The first issue I solved at my current job was just to change some text on a webpage but it was enough to learn a lot about their workflow.

huhu fucked around with this message at 19:58 on Nov 3, 2017

Ciaphas
Nov 20, 2005

> BEWARE, COWARD :ovr:


Is it possible to pass a command line switch to MSBuild that suppresses all warnings from C++ solutions/projects it tries to build?

I'm trying to automate a build for a legacy application of some 20 separate solutions and 400 projects using TFS 2015, and oh crackers I do not want to have to go through those and suppress all warnings individually. (The massive number of warnings is making the TFS build web interface choke.)

Linear Zoetrope
Nov 28, 2011

A hero must cook
To be honest, the best way to get involved in open source projects is to naturally use a library, go "holy poo poo what the hell why can't I do this? I need this" and then dig in and fix it. I don't think it's really worth it to go hunting for issues to solve just for the sake of solving them. Unfortunately, that's a bit of a chicken/egg problem because you need to be doing work already to find things that need to be fixed and if that's not in a business/contracting context it's almost certainly a hobby or self-improvement one (which likely means you already have somewhat of a portfolio).

The better answer, IMO, is don't try to get hired by people who expect you to have a non-trivial open source history because they probably expect you to live your job. Granted that's easier said than done if you're desperate for a job now.

cmndstab
May 20, 2006

Huge Internet Celebrity!
Edit: Oh geez, finally found the bug just minutes after posting (after hours looking for it before!). I had a generic "copy_array" function I was using that was going one index too far each time, I assume it was overwriting the information of the next malloc'd variable or something along those lines. Whoops! I'll leave the post below anyway.


I'm pretty new to C, though I've used other languages like Java for a while. I'm midway through writing a fairly large piece of code and I've encountered an annoying bug I'm struggling to fix, and can't really create a minimal example for.

I use malloc a lot throughout the code, and then free the variables once I no longer need them. However, I reach a point in my code where freeing variables causes a crash. I assume this would be caused by an overflow issue?

At this point in my code I've done a little test. I write:

long *k1 = malloc(200*sizeof(long));
long *k2 = malloc(200*sizeof(long));
free(k1);

This causes an immediate crash. If I comment out the k2 line, it can get past it fine. I don't understand how simply calling another malloc can prevent me from freeing up k1. I don't assign any values to either of them, and anything I've done for previous variables has already happened before I create and then try to free k1. Obviously if I just put this in a separate piece of code it doesn't cause a problem, so presumably the error is caused by something I've done earlier in the code.


Can anyone suggest what might be happening? Is it just as simple as possibly trying to write to unallocated memory earlier in the code and that snowballing into this error?


cmndstab fucked around with this message at 06:45 on Nov 4, 2017

Mr Shiny Pants
Nov 12, 2012
Any of you have experience in using text to speech software that is open source? And works? Or is Dragon still one of the best out there?

Extortionist
Aug 31, 2001

Leave the gun. Take the cannoli.

quote:

code:
^([0-9]*[.]?[0-9]+)[0-9.]*?$
code:
/^ ( (?:\d+)? (?:\.\d+)? ) .* $/x
code:
^((\d*\.\d+)|(\d*))
edit: regex golf

Oh boy, regex golf! Here's par:

code:
(\d*\.?\d+)
It'll depend on the data you're checking against if you need to use ^ or $. If you're only checking strings of the kind you pasted, there's no need at all.

If you need to do a global match or if there's data you don't care about in the string, it's better to clean the data up front before doing the regex search.

code:
import re
text = "85. 85.98 85.98.9 .80.9 8..5"
text = re.sub(r"(\d*\.?\d+)(?:[\d.]+)?", r"\1", text)
matches = re.findall(r"(\d*\.?\d+)", text)
print(matches)
# prints ['85', '85.98', '85.98', '.80', '8']


Love Stole the Day
Nov 4, 2012
Please give me free quality professional advice so I can be a baby about it and insult you

csammis posted:

I don't have a good answer to your question - my answer is "look at open issues and try to solve them" - but who critiqued you for that? An actual hiring manager, a recruiter, a peer reviewer, a not-peer reviewer? It sounds like a very bullshit thing to get called out on.

huhu posted:

This. Care to share your Github?

As far as getting involved - I just joined a company where I'm collaborating for the first time. I'd say read up on their guidelines for formatting, coding standards, and how pull requests are done. Keep an eye on bugs and see if there is an easy one you could solve. The first issue I solved at my current job was just to change some text on a webpage but it was enough to learn a lot about their workflow.
As far as I know, it was a hiring manager?

Here's the Github link: https://github.com/wanderrful (I added larger codebase projects I contribute to at the top of my pinned repo thing in response to the guy I mentioned earlier in my previous post itt)

I'd share a sanitized resume link, but I'm currently iterating yet again on it. I feel like I get advice telling me to go back and forth between one thing and the other, most of the time. Neither way seems to work when it comes to getting any interviews, though.
