cinci zoo sniper
Mar 15, 2013




Raenir Salazar posted:

I have a project where I need to train a CNN to classify images of cars into different categories, let's say Ford, Ferrari, Mitsubishi, and Toyota. While I can get datasets with lots of cars, and maybe a dataset that is only Ford, I cannot seem to find a convenient dataset that has all four neatly divided into categories.

Is there a trick I can do on a dataset that I know has all four mixed together to neatly divide them (like, I dunno, some sort of unsupervised clustering to pre-divide/sort the data and cut down on the work I have to do manually), or do I basically have to divide it all manually?

Or do we have a more dedicated data science thread for this question?

That trick is called Mechanical Turk, or a specialised data labelling service. I do NLU, not CV, but unsupervised or semi-supervised clustering sounds like a crap idea here, if that’s what you’re thinking.

We have a dead-ish DS thread, and should probably get around to merging those together. Maybe merge in the scientific computing thread as well, make a boffin central. And revive the LaTeX thread :anime:

Fake edit: 1,200 images? Just brew a pot of coffee and label them by hand.

Edit: Should’ve probably finished reading before replying. :v:

cinci zoo sniper fucked around with this message at 11:36 on Jun 9, 2022


cinci zoo sniper
Mar 15, 2013




quarantinethepast posted:

That's Natural Language Understanding, right? Do you know of some good intro books to that field? I have an interest in linguistics and NLP + NLU.

That’s it indeed. The area you’ve outlined is really broad, and I don’t know what your background is like, so maybe take a look at https://web.stanford.edu/~jurafsky/slp3/ ? I’m not sure I know anything up to date that would be more beginner-friendly than this, as far as books go.

cinci zoo sniper
Mar 15, 2013




bob dobbs is dead posted:

dan jurafsky the years i knew him was kinda 3/4 of the way to being a crying shambling wreck because the neural net peeps were running roughshod over his life's work. he's fundamentally not a neural nets guy and neither is martin iirc. of course they are forced to shove it in any nlu book now

Yeah, it’s not a DL book per se, but jumping directly into DL without understanding what you’re trying to do or why is like trying to do a backflip with a motorcycle when you can’t do one in water, imo.

cinci zoo sniper
Mar 15, 2013




quarantinethepast posted:

I've been taking Andrew Ng's deep learning Coursera series, so I've got some DL background, though the courses could use more practical focus on projects beyond "add these 2 lines which we have spelled out for you to an almost complete function".

I reviewed that course recently, and for a practitioner imo it’s only good for tying up disparate knowledge about network-based ML, e.g., if you’re moving from credit risk into computer vision. I really hated the assignments, most of which just had you copy and paste code provided above, verbatim, and I struggle to imagine how anyone could learn from that.

Apparently, though, even that was too much, as the course now seems to be getting redone to be simpler.

Also, the audio quality was really poo poo. I loved the random loud high-pitched noises in half of the videos, because apparently no one on the editorial side had functional hearing.

Not sure what a good alternative to it would be, though, for a general NN intro. One of the things on my docket is figuring out a replacement curriculum for future hires that may need it.

cinci zoo sniper
Mar 15, 2013




Boris Galerkin posted:

I want to learn machine learning so that I can work for Google and make headlines about falling in love with a chatbot but I'm not sure where to start. The OP mentions PyTorch, Tensorflow, and scikit-learn. I'm very familiar with numpy/scipy so I was leaning towards scikit-learn but I'd rather pick a package that is most used.

I did skim through this tutorial for PyTorch real quick and got to this point:

I'm just not really sure how this is any different from Newton's method, or basically any numerical method that minimizes an objective/loss function.

e: Maybe I need something more technical instead of a quickstart, as my background is numerical methods.

I would say that you could try going into https://developers.google.com/machine-learning/crash-course raw and seeing how it goes. Depending on the particulars of your background, you could give plenty of ML practitioners a run for their money.

On the 3 libraries, the difference is in focus. TF/PyTorch focus on neural network-based ML, while scikit-learn focuses on classical ML methods. You could (kind of) think of the difference as differentiable programming vs probabilistic programming.
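To make that focus difference concrete, here’s a minimal sketch of the same toy binary classification done both ways; the synthetic dataset, layer sizes, and hyperparameters are made up purely for illustration, not a recommendation:

code:
# Same toy problem, two styles: scikit-learn's fit/predict estimators vs.
# writing and optimising a small differentiable model yourself in PyTorch.
import torch
import torch.nn as nn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Classical ML with scikit-learn: pick an estimator, call fit().
clf = LogisticRegression().fit(X, y)
print("sklearn accuracy:", clf.score(X, y))

# Network-based ML with PyTorch: define the model and run the training loop.
Xt = torch.tensor(X, dtype=torch.float32)
yt = torch.tensor(y, dtype=torch.float32).unsqueeze(1)

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(Xt), yt)
    loss.backward()
    opt.step()

preds = (torch.sigmoid(model(Xt)) > 0.5).float()
print("torch accuracy:", (preds == yt).float().mean().item())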

Where to go after the crash course depends a bit on what your plans are. Do you want to work for a particular company or company class? They may have a niche to specialise for where learning classical ML wouldn’t make sense, or the other way around. It could also be the case that your specific kind of numerical methods are actually repackaged as (more expensive) ML somewhere else, if you just want to hop into the job title.

If you’re comfortable with academic literature, in my explicitly subjective opinion you could easily do worse than read https://hastie.su.domains/ElemStatLearn/ regardless of what’s down the line for your ML career, though.

Some other things you’ll need to take care of, likely sooner rather than later:
- SQL
- Data engineering fundamentals
- Not grimacing when colleagues ask about sick AI features

cinci zoo sniper
Mar 15, 2013




Talking about real-life application targets for this:

1) Would the text there be printed by a machine exclusively?

2) Would it all be in English?

3) Could these parcels of text be considered documents (i.e. clean background, standard font, focus on legibility)?

4) Would they be described well by a low number of templates?

cinci zoo sniper
Mar 15, 2013




Discendo Vox posted:

1) Yes

2) No, a very small proportion, probably ~1%, will be duplicated in another language, and in theory there will be a handful not in English. It should be possible to exclude these from the intended use, and/or from the training set.

3) Yes.

4) I’m not certain what you mean by “template”, but the images would be in a handful, maybe six, standardized forms where text is positioned in proportionately consistent relative positions and sizes. The theoretical training data is not currently labeled to separate these formats.

“OCR” in current parlance lumps 3 different areas:

1) Optical Character Recognition – read the text [in a traditional document]*
2) Scene Text Detection – identify text areas in a [naturally occurring] scene
3) Scene Text Recognition – read the text in those areas

2 and 3 are difficult problems, challenging state-of-the-art methods. 1 can be challenging for difficult handwriting or small languages, but is basically a solved problem for machine print of major languages (unless you’re dealing with 50-year-old photos of 500-year-old parchments or similar).

Your second point is unlikely to be an issue, and your 4th point basically takes care of text detection - you can probably distinguish between the forms with rote heuristics, no fuzzy ML needed. Thus, I’m not sure you even need anything more than regular developers with experience integrating off-the-shelf OCR toolkits like EasyOCR, PaddleOCR, Tesseract, the respective CV APIs of the major cloud providers, or whatever else you have access to, to digitize the collection in question.
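For a sense of scale, the off-the-shelf route really is only a few lines of code for a clean machine-printed scan. A hedged sketch with two of the toolkits named above; "scan.png" is a placeholder filename, and pytesseract additionally needs the Tesseract binary installed on the machine:

code:
# Minimal OCR sketch for a clean, machine-printed document image.
# "scan.png" is a placeholder; point it at one of your own scans.

# Option 1: Tesseract via pytesseract (requires the tesseract binary installed).
from PIL import Image
import pytesseract

text = pytesseract.image_to_string(Image.open("scan.png"), lang="eng")
print(text)

# Option 2: EasyOCR, which also returns a bounding box and confidence per line.
import easyocr

reader = easyocr.Reader(["en"])
for box, line, confidence in reader.readtext("scan.png"):
    print(f"{confidence:.2f} {line}")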

Caveat - I'm assuming that the text here is text, and not, e.g., physics formulas. Dealing with math symbols or fancy sub/superscripting is not something I've encountered, but should be researchable enough with the aforementioned keywords.

*Traditional document – sanely laid-out text on a high-contrast background, typeset in a generic font with large enough letters (relative to image resolution, for meaningful contrast areas), legible spacing, and no fancy formatting.

cinci zoo sniper fucked around with this message at 15:52 on Jul 3, 2022

cinci zoo sniper
Mar 15, 2013




w00tmonger posted:

Sorry if this has been answered somewhere else, but I feel like I'm kind of in the deep end and don't know where to start.

I'm looking to work on a script to create product tags for some Shopify listings, and an image recognition API seems like it would be ideal. I know GPT-4 has some image recognition in it, but access seems a bit weird/inconsistent and might be overkill.

What I want to do is input a product title with an image (it's a 3D printed miniature, so "vampire lord" and a pic of the sculpt), and have it spit out a list of product tags for my search/product-categories. I spend a ton of time indexing search terms, so some API would save me a ton of time.

To clarify, you have a collection of tags, and for each pair of title+photo you want an automated way to select the best matching tags from the collection?


cinci zoo sniper
Mar 15, 2013




w00tmonger posted:

Sort of. So Shopify works off collections, which for me would be broad categories of sculpts (undead, human, beasts, terrain, etc.), with some sub-categories (undead would have vampires, ghosts, zombies, etc.).

Each product has tags which describe the product, which I can use to pop an item into a category, but which are also used for searchability.

I want it to output any matching tags I've made so a listing can be assigned to any relevant categories, but also potentially add some tags I haven't thought of for searchability. The second part might be overkill; it may just make more sense to generate a huge list of predefined tags to constrain it.

Ex: I have a vampire castle sculpt. I want to run it through a tool and have it output that it has the tags undead, vampire, and terrain. It would then further give me a handful of adjacent tags like Dracula, Transylvania, count, etc. for search.

Edit: I feel like I would need to train my own model, given I want it to know what a mini of a vampire looks like? I have titles tied to 1.5k+ miniatures, but I feel like I might need to do something broader unless there's an existing solution.

So, this is 2 separate tasks – 1) get tags for an image, 2) generate new tags based on existing tags. The latter is something you can credibly do with any text model starting with GPT-3, after some prodding. The former, in industry terminology, would be “image classification”, assuming you have clean photos where the miniature is the only “feature” of the image. GPT-4 is multimodal and accepts image inputs, but the image ingestion API is not enabled at the moment in the public OpenAI service, and it may ultimately depend on access to the 32k context length model, which is not publicly available either, as yet. So you may have to shop around for some other model, and either consider training your own right away, or investigate fine-tuning an existing model with your data at a later point.
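If you want to prototype the “tags for an image” half before committing to training anything yourself, one cheap thing to try (my suggestion, not something you mentioned) is zero-shot image classification with a pretrained image-text model like CLIP: score the photo against your own tag list and keep the top matches. A minimal sketch with the Hugging Face transformers wrapper; the model name is a real public checkpoint, but the tag list and filename are made-up placeholders:

code:
# Hedged sketch: zero-shot tag scoring with a pretrained CLIP model.
# "vampire_castle.jpg" and the tag list are placeholders for your own data.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

tags = ["undead", "vampire", "ghost", "zombie", "human", "beast", "terrain"]
image = Image.open("vampire_castle.jpg")

# Phrase each tag as a short caption so it resembles CLIP's training data.
inputs = processor(text=[f"a 3d printed miniature of {t}" for t in tags],
                   images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    scores = model(**inputs).logits_per_image.softmax(dim=1)[0]

# Softmax gives relative scores across the tag list; keep the top few.
for tag, score in sorted(zip(tags, scores.tolist()), key=lambda x: -x[1]):
    print(f"{tag}: {score:.2f}")

If the off-the-shelf scores aren’t good enough on your sculpts, this is also the kind of model you could later fine-tune on your 1.5k labelled miniatures.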
