QuarkJets
Sep 8, 2008

Xun posted:

When you look at a GitHub page for a research paper, what do you guys want to see in the readme? I'm updating mine for a publication and sadly the only advice I'm getting from my labmates is "the code is all there and there's a bibtex citation, why are you worrying :confused:"

Honestly I'm usually pretty happy with just "type this to run model" but idk if that's the norm lol

I review research papers from time to time, and the amount of stuff in a project's README tends to be all over the map, from less than useless to comprehensive.

Put in whatever you personally would want to see if you were coming in fresh, having no experience with the specific project. Include details on how to run the model and any additional details that you feel are important. It doesn't need to be a comprehensive SDK, but it should provide high-level details that would be helpful to anyone seeking to build upon your work.

Assume that you'll want to show this project and its README to future prospective employers, too - put in the extra 60 seconds to give it a title, section headings, pass it through `aspell`, etc.
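For the "extra 60 seconds" pass, a scripted sanity check can help too. This is just a sketch, and the required section names are my own assumptions about what a fresh reader would want:

```python
import re

# Hypothetical check: does a README contain the high-level sections
# a fresh reader would look for? Section names are assumptions.
REQUIRED = ["installation", "usage", "citation", "license"]

def missing_sections(readme_text):
    # Collect markdown-style headings, lowercased.
    headings = {h.strip().lower() for h in re.findall(r"^#+\s*(.+)$", readme_text, re.M)}
    return [s for s in REQUIRED if not any(s in h for h in headings)]

readme = "# My Paper\n## Installation\n## Usage\n## Citation\n"
print(missing_sections(readme))  # -> ['license']
```

Run it against your README before pushing; anything it reports missing is a candidate heading to add.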


CarForumPoster
Jun 26, 2013

⚡POWER⚡

Xun posted:

When you look at a GitHub page for a research paper, what do you guys want to see in the readme? I'm updating mine for a publication and sadly the only advice I'm getting from my labmates is "the code is all there and there's a bibtex citation, why are you worrying :confused:"

Honestly I'm usually pretty happy with just "type this to run model" but idk if that's the norm lol

Ideally, a working demo, or at least a GIF of what to expect, if it's something that has dataviz or can be made into a web app fairly easily.


QuarkJets posted:

I review research papers from time to time, and the amount of stuff in a project's README tends to be all over the map, from less than useless to comprehensive.

Put in whatever you personally would want to see if you were coming in fresh, having no experience with the specific project. Include details on how to run the model and any additional details that you feel are important. It doesn't need to be a comprehensive SDK, but it should provide high-level details that would be helpful to anyone seeking to build upon your work.

Assume that you'll want to show this project and its README to future prospective employers, too - put in the extra 60 seconds to give it a title, section headings, pass it through `aspell`, etc.

This

flappin fish
Jul 4, 2005

Xun posted:

When you look at a GitHub page for a research paper, what do you guys want to see in the readme? I'm updating mine for a publication and sadly the only advice I'm getting from my labmates is "the code is all there and there's a bibtex citation, why are you worrying :confused:"

Honestly I'm usually pretty happy with just "type this to run model" but idk if that's the norm lol

You'd think this would be a basic one (who wouldn't include a requirements.txt?) but please please please list all required libraries and versions. It takes long enough to get research code running as it is. I once lost a couple of days because someone's requirements.txt didn't pin library versions and the code depended on old versions that weren't easily interoperable.

It's nice if "type this to run model" replicates exactly what was done in the paper.

Also, please stick a license on your code.
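One hedged way to generate a fully pinned requirements file straight from the environment the paper was actually run in, using only the stdlib (no reliance on `pip freeze` being available):

```python
import importlib.metadata

# Emit "name==version" pins for every package installed in the
# current environment, suitable for pasting into requirements.txt.
def pinned_requirements():
    return sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in importlib.metadata.distributions()
        if dist.metadata["Name"]  # skip distributions with broken metadata
    )

for line in pinned_requirements():
    print(line)
```

Redirect the output to requirements.txt and readers get the exact versions the results were produced with.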

Keisari
May 24, 2011

Has anyone had a chance to really fiddle with the custom GPTs and, more specifically, the "knowledge" part of them? You can upload all kinds of poo poo to be their knowledgebase. I have tried to make one to help me build a program that uses a certain API, and uploaded a bunch of JSON files that describe the API. Another was inspired by the built-in board game explainer: I made one focused on explaining and clarifying the rules, and uploaded some game manuals.

The API GPT basically completely errored out and either ignored all the knowledge or just crashed.

Meanwhile, the game rules explainer worked a bit better, but is poor at applying the knowledge or combining it with other sources of information. It did reasonably OK at finding specific points from the manuals, but was poor at answering certain types of fringe questions that required combining different rules from the manual.

Below are some chatlogs and my commentary about two boardgames: A Distant Plain, a COIN game about the War in Afghanistan, and War of the Ring, a super nerdy (and loving amazing) LOTR boardgame.

Chatlog, Custom Board Game GPT and A Distant Plain, the boardgame posted:

User
In a Distant Plain, how does the Taliban recruit?

GPT
I wasn't able to find specific information on how the Taliban recruit in the game "A Distant Plain" from the provided rules document. However, based on my knowledge, in "A Distant Plain," the Taliban's recruitment typically involves placing their guerrillas on the board. This process often depends on various factors like the presence of opposition forces, control of provinces, and the game's current political climate.

The recruitment action generally allows the Taliban to increase their presence and influence on the board, which is crucial for their strategy. The specifics, such as the cost and limitations of recruitment actions, may vary depending on the game's scenario and the evolving situation on the board.

For precise details on the Taliban's recruitment mechanics in "A Distant Plain," I recommend consulting the game's rulebook, specifically sections discussing Taliban operations and actions.

User
In a Distant Plain, how does the Taliban rally?


GPT
I have searched through the "A Distant Plain" rulebook in my knowledge source but was unable to find specific details regarding how the Taliban faction conducts a Rally in the game. This information might be in a section of the rulebook that I couldn't access or might not be explicitly detailed in the material I have. For precise rules and guidelines on how the Taliban Rally is performed in "A Distant Plain," I would recommend referring directly to the game's rulebook, particularly the sections that detail the operations and activities of the Taliban faction.



User
Can the Coalition player move Government police cubes with sweep?
GPT
In "A Distant Plain," the Coalition player's ability to move Government police cubes is not explicitly mentioned in the context of the Sweep operation. According to the rules, during a Sweep operation, the Coalition player may move Troops to activate Taliban and Warlord Guerrillas, primarily focusing on the highest Population space possible and aiming to add COIN Control​​. However, the rules do not specifically state that Government police cubes can be moved as part of this operation. Therefore, based on the available information, it appears that the Sweep operation for the Coalition player involves moving Troops rather than Government police cubes. [CORRECT]




The rules specify a "rally" action, which allows the insurgent factions, such as the Taliban and Warlords, to recruit guerrillas on the board. I tried asking about both "recruit" and "rally", and it couldn't find either in the rules. Meanwhile, I asked if the Coalition player can move Government police cubes with their Sweep action, and it evidently did find the Sweep part of the rulebook, which proves that the file is readable by the GPT. It was also correctly able to deduce that because the rule only mentions Troops and not police, Sweep doesn't include police.

The problem is that it seems very hit-or-miss whether it can locate the correct info in the data. I've tried to find good sources that explain the intricacies of this feature, but haven't been able to find any. (The promising Medium article was hidden behind a paywall.) Does anyone have any?

The good news is that during my testing, hallucination has been basically zero. I'll take "I have no idea" over hallucinating any day. I also wonder how I can have the AI fall back on its internal knowledge when it can't find anything in the rulebook; it'd be even better if it could combine knowledge from the rulebook with its internal knowledge.
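For anyone curious why retrieval over uploaded files is hit-or-miss, here's a toy sketch of the chunk-and-rank step such systems are generally assumed to use. Plain bag-of-words stands in for real embeddings, and the rule text is paraphrased, not from the actual rulebook:

```python
import math
import re
from collections import Counter

# Chunk the manual, "embed" chunks as bags of words, and return the
# chunks most similar to the question. If the right rule uses different
# vocabulary ("rally" vs "recruit"), retrieval misses it entirely.

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    shared = set(a) & set(b)
    num = sum(a[t] * b[t] for t in shared)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def top_chunks(question, chunks, k=1):
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(c))), c) for c in chunks]
    return [c for _, c in sorted(scored, key=lambda s: -s[0])[:k]]

chunks = [
    "Rally: insurgent factions place guerrillas in selected spaces.",
    "Sweep: the Coalition player may move Troops to activate guerrillas.",
]
print(top_chunks("How does the Taliban recruit?", chunks))
```

Note that the "recruit" question ranks the Sweep chunk first, because it shares no vocabulary with the Rally chunk, which mirrors the recruit/rally failure described above.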

Here’s another test in case anyone is interested, this time I uploaded the War of the Ring rulebook:

Chatlog, Custom Board Game GPT, War of the Ring posted:

User
I am playing war of the ring. Explain how I can move the fellowship to mordor?
GPT
In "War of the Ring", moving the Fellowship to Mordor is a critical part of the quest to destroy the One Ring. Here's how you can achieve this:

Reaching Minas Morgul or Morannon: The Fellowship must first reach either Minas Morgul or Morannon. This is a mandatory requirement and must be done during a Fellowship phase to begin the last leg of the journey to Mount Doom. [CORRECT]

Entering Mordor: Once the Fellowship is in Minas Morgul or Morannon during a Fellowship phase, the Free Peoples player may decide to enter Mordor. At this point, you will proceed with the next steps. [CORRECT]

On the Mordor Track: Upon entering Mordor, you need to:

Place the Ring-bearers figure on the first step of the Mordor Track, which is marked with the Elven numeral ‘0’. From this moment, the Fellowship is considered to be “On the Mordor Track.” [CORRECT]
The Fellowship Progress counter is no longer advanced on the Fellowship Track but is still used to indicate whether the Fellowship is Hidden or Revealed. [CORRECT]
A new Hunt Pool is created by placing all Eye tiles previously drawn back with the remaining tiles of the Hunt Pool, and also adding any special tiles put in play by Event cards. Note that any Eye tile permanently removed from the game should not be placed back in the pool. [CORRECT]
Remember, in Mordor, the rules for the Fellowship change, reflecting the increasing danger and the burden of the Ring as they draw closer to Mount Doom.
Here the GPT again managed to answer a simple query from the rulebook. Next, I wanted to test whether I could "gotcha" the AI. Normally in the game, the Shadow player rolls "hunt dice" to determine if they catch the Fellowship and do damage. If the Shadow player has any Nazgul, including the Witch-King, they can reroll one of these dice. When the Fellowship reaches Mordor, these dice are no longer rolled. So the "gotcha" here is that normally the Witch-King can reroll a die, but specifically in Mordor it can't.

quote:

User
Can the Witch King be used to reroll hunt dice if the fellowship is in mordor?
GPT
My search within the provided rulebook did not yield specific information about whether the Witch King can be used to reroll hunt dice when the Fellowship is in Mordor in "War of the Ring". Based on the available information, I cannot confirm or deny this rule. For detailed rules regarding unique character abilities like those of the Witch King, especially in specific scenarios like the Fellowship being in Mordor, it would be best to consult the complete rulebook or additional expansion rules if applicable.
The GPT wasn't able to, in essence, "combine" the information in the rulebook to answer the query. The rulebook states that Nazgul allow a reroll, but elsewhere it says that in Mordor there are no rolls (and therefore no rerolls).

I was able to prompt it to search from other sources, and then it was successful. If I recall correctly, it used Bing to find a website that answered the question and it gave a good answer.

quote:

User
Can you search other sources?
GPT
Certainly! I'll search for information regarding whether the Witch King can be used to reroll hunt dice when the Fellowship is in Mordor in "War of the Ring." Let me look that up for you.

In "War of the Ring", the rules for the Hunt and rerolls change when the Fellowship is on the Mordor Track:

Hunt Re-rolls Outside Mordor: Normally, if the Fellowship is in a region with a Shadow Stronghold, Shadow Army units, or one or more Nazgul, the Shadow player may reroll one failed Hunt roll die for each of these conditions. For example, if there are both Army units and Nazgul in the region with the Fellowship, the Shadow player can reroll two dice. [CORRECT]

Special Rules on the Mordor Track: However, when the Fellowship is on the Mordor Track, the rules differ. The Hunt dice are not rolled during the Action Resolution phase. Instead, one tile is automatically drawn from the Hunt Pool. The effects of this tile are applied as for a successful Hunt, with specific rules for tiles showing an Eye or a Stop icon. [CORRECT]

Therefore, the Witch King's ability to reroll hunt dice would not be applicable when the Fellowship is on the Mordor Track, as the hunt dice are not rolled in this scenario. The game shifts to a different mechanism for representing the Hunt through the drawing of tiles from the Hunt Pool. [CORRECT]
Next, I asked an in-depth question about how casualties are taken. The correct answer is that if there are no regular units among the casualties, one can be taken from reinforcements. If there are none in reinforcements either, the entire elite unit must be taken as a casualty with no downgrading (= more damage).

quote:

User
What if the free people's player needs to downgrade an elite unit but there is no regular unit as a casualty?
GPT
In "War of the Ring", if the Free Peoples player is required to downgrade an Elite unit but there are no regular units available as casualties (i.e., in the dead pile), the usual practice is that the Elite unit is removed from the board and considered destroyed. This simulates the situation where there are no reinforcements available to replace the Elite unit's losses, effectively resulting in the unit's elimination. This rule maintains the balance and realism of the game, reflecting the logistical and tactical challenges of maintaining an army in a prolonged conflict. [PARTIALLY WRONG]

User
Can you look from the rulebook?
GPT
According to the "War of the Ring" rulebook, when removing casualties, a player can either remove one Regular unit or replace one Elite unit with a Regular unit of the same Nation for each hit scored by the opponent. If an Elite unit is to be replaced but there are no Regular units available either from previous casualties or from the available reinforcements, the rulebook does not explicitly state what happens. It implies that if no replacements are available, the Elite unit cannot be replaced. This suggests that in such a case, the Elite unit would likely be removed as a casualty without replacement. [CORRECT]

It would appear that the AI first answered based on its internal memory, and got the answer wrong. When I specifically prompted it to use the rulebook, it got the answer right, although it had to guess.

Diva Cupcake
Aug 15, 2005

https://x.com/openai/status/1727206187077370115?s=46&t=DcBXErlGIUJUj8quAgYfkQ

BAD AT STUFF
May 10, 2012

We choose to go to the moon in this decade and do the other things, not because they are easy, but because fuck you.

Keisari posted:

Has anyone had a chance to really fiddle with the custom GPTs and, more specifically, the "knowledge" part of them? You can upload all kinds of poo poo to be their knowledgebase. I have tried to make one to help me build a program that uses a certain API, and uploaded a bunch of JSON files that describe the API. Another was inspired by the built-in board game explainer: I made one focused on explaining and clarifying the rules, and uploaded some game manuals.

The API GPT basically completely errored out and either ignored all the knowledge or just crashed.

Did you give it an OpenAPI spec? And were you asking it to write code that queries an API or setting that up as an action for a custom GPT?

I got good results with regular ChatGPT when uploading a basic JSON file describing a service I wanted it to create using FastAPI. I had much less success trying to set up an action for my own GPT. It gave an unhelpful error message and crashed each time. I think the root of the issue was that I gave it the full JSON file that I pulled from Swagger. I want to go back with a cut down file to try adding endpoints one at a time. That might help with finding the source of the error, but also I think it should perform better if I can limit extraneous context.
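Cutting the spec down can also be scripted rather than done by hand. This is a minimal sketch; the spec contents and path names here are made up:

```python
import json

# Trim a Swagger/OpenAPI JSON file to a handful of endpoints before
# uploading it, to limit extraneous context.
def trim_spec(spec, keep_paths):
    slim = {k: v for k, v in spec.items() if k != "paths"}
    slim["paths"] = {p: spec.get("paths", {})[p] for p in keep_paths if p in spec.get("paths", {})}
    return slim

full = {
    "openapi": "3.0.0",
    "info": {"title": "demo", "version": "1"},
    "paths": {
        "/items": {"get": {}},
        "/items/{id}": {"get": {}},
        "/admin/audit": {"get": {}},
    },
}
slim = trim_spec(full, ["/items"])
print(json.dumps(slim["paths"]))  # only /items survives
```

Adding endpoints back one at a time from there should also narrow down which one triggers the error.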

If anyone has found things that work well for cleaning up knowledge inputs to GPTs, I'd love to hear about that.

...also this OpenAI drama makes me feel better about the dumb politics poo poo that happens at my company. :munch:

BAD AT STUFF fucked around with this message at 20:48 on Nov 22, 2023

Nektu
Jul 4, 2007

FUKKEN FUUUUUUCK
Cybernetic Crumb

Keisari posted:

Meanwhile, the game rules explainer worked a bit better, but is poor at applying the knowledge or combining it with other sources of information. It did reasonably OK at finding specific points from the manuals, but was poor at answering certain types of fringe questions that required combining different rules from the manual.

This fits the restrictions of LLMs described here: https://aiguide.substack.com/p/can-large-language-models-reason.

They are good at identifying patterns in the training data that fit your prompt, but they are incapable of actual reasoning.

Keisari
May 24, 2011

BAD AT STUFF posted:

Did you give it an OpenAPI spec? And were you asking it to write code that queries an API or setting that up as an action for a custom GPT?

I got good results with regular ChatGPT when uploading a basic JSON file describing a service I wanted it to create using FastAPI. I had much less success trying to set up an action for my own GPT. It gave an unhelpful error message and crashed each time. I think the root of the issue was that I gave it the full JSON file that I pulled from Swagger. I want to go back with a cut down file to try adding endpoints one at a time. That might help with finding the source of the error, but also I think it should perform better if I can limit extraneous context.

If anyone has found things that work well for cleaning up knowledge inputs to GPTs, I'd love to hear about that.

...also this OpenAI drama makes me feel better about the dumb politics poo poo that happens at my company. :munch:

Completely forgot to answer this, sorry! I am unsure what spec it was, I just plonked it in.

Nektu posted:

This fits the restrictions of LLMs described here: https://aiguide.substack.com/p/can-large-language-models-reason.

They are good at identifying patterns in the training data that fit your prompt, but they are incapable of actual reasoning.

That was an interesting read, thanks! Well, even in its current state it does provide some value as what basically amounts to an improved PDF search engine.

some kinda jackal
Feb 25, 2003

 
 
This is SUPER abstract, but as someone who has zero interest in learning python, is learning the basic concepts of modern ML/AI in a Javascript environment feasible?

The context is that my org is starting to delve into ML data science. DS is not my domain, but I do interact with these teams at a system design level so I feel it would be useful to speak their "language", if only to better facilitate relationships within the org.

I spent a fair amount of time bootstrapping JS/TS via some Node projects last year and I know that the majority of this space lives and breathes Python, which I have no real desire to pivot to.

Like I said, this is super abstract and open ended -- I expect the answer is definitely yes, the basics are all easily replicated in JS and are not really bound to a specific language. The fact that I just said "learn ML" probably speaks volumes in that I don't actually know what I want out of this other than to be able to talk to people about how we use ML, and that's probably super specific to toolsets and libraries etc., but at some point I think I just need a jumping off point.

For better or for worse, I learn by doing. So I can watch a thousand hours of youtube video tutorials but until I write the code it's usually in one ear out the other. So I'm wondering if anyone has any good recommendations for a foundational level ML "course" or good series focusing on a JS/TS toolchain.


And if this is a really stupid question, I'm willing to eat my foot here. I typically dislike questions like this where someone asks "how do I learn this technology with no real goal?" because it's super hard to give direction, but I find myself unable to articulate it better than this, given I know almost nothing about this space.


e: On python, I know I can probably learn enough to just start with some basic tutorials, but I know that I'm the kind of person who if I get frustrated not being able to write something in python or get it working I'm liable to just put the whole project down, so I'm trying not to stack the deck against me.

Xun
Apr 25, 2010

some kinda jackal posted:

This is SUPER abstract, but as someone who has zero interest in learning python, is learning the basic concepts of modern ML/AI in a Javascript environment feasible?

The context is that my org is starting to delve into ML data science. DS is not my domain, but I do interact with these teams at a system design level so I feel it would be useful to speak their "language", if only to better facilitate relationships within the org.

I spent a fair amount of time bootstrapping JS/TS via some Node projects last year and I know that the majority of this space lives and breathes Python, which I have no real desire to pivot to.

Like I said, this is super abstract and open ended -- I expect the answer is definitely yes, the basics are all easily replicated in JS and are not really bound to a specific language. The fact that I just said "learn ML" probably speaks volumes in that I don't actually know what I want out of this other than to be able to talk to people about how we use ML, and that's probably super specific to toolsets and libraries etc., but at some point I think I just need a jumping off point.

For better or for worse, I learn by doing. So I can watch a thousand hours of youtube video tutorials but until I write the code it's usually in one ear out the other. So I'm wondering if anyone has any good recommendations for a foundational level ML "course" or good series focusing on a JS/TS toolchain.


And if this is a really stupid question, I'm willing to eat my foot here. I typically dislike questions like this where someone asks "how do I learn this technology with no real goal?" because it's super hard to give direction, but I find myself unable to articulate it better than this, given I know almost nothing about this space.


e: On python, I know I can probably learn enough to just start with some basic tutorials, but I know that I'm the kind of person who if I get frustrated not being able to write something in python or get it working I'm liable to just put the whole project down, so I'm trying not to stack the deck against me.

Not sure if this is exactly what you're looking for, but I heard Tensorflow has a javascript library, and the docs point to this series as a tutorial. I haven't watched it but the topics look faaaiirrlly complete?

https://www.youtube.com/playlist?list=PLOU2XLYxmsILr3HQpqjLAUkIPa5EaZiui

Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

some kinda jackal posted:

e: On python, I know I can probably learn enough to just start with some basic tutorials, but I know that I'm the kind of person who if I get frustrated not being able to write something in python or get it working I'm liable to just put the whole project down, so I'm trying not to stack the deck against me.

You're better off just sucking it up and learning enough python to get by. I have sympathy for you because I personally dislike python, but the vast majority of the ecosystem and examples for stuff like tensorflow etc. are going to be written in python and frankly learning python is going to be much easier than learning some js binding for tensorflow and trying to figure out why it doesn't work or expose the same features as in the python example.

Xun
Apr 25, 2010

Bruegels Fuckbooks posted:

You're better off just sucking it up and learning enough python to get by. I have sympathy for you because I personally dislike python, but the vast majority of the ecosystem and examples for stuff like tensorflow etc. are going to be written in python and frankly learning python is going to be much easier than learning some js binding for tensorflow and trying to figure out why it doesn't work or expose the same features as in the python example.

NGL I'd agree with this 100% if he was looking to do much actual implementation or looking for a deeper understanding. But if the goal is just to understand some of the basics to communicate with data scientists, while using js to get an idea of the concepts and workflow, I thiiinnkk it should be fine?

But yeah, as soon as you get into actual implementation outside of the simplest models or prepackaged known-to-work-in-js models, python is 100% the way to go. There's also, like, apache spark in java? I used that once...

Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

Xun posted:

NGL I'd agree with this 100% if he was looking to do much actual implementation or looking for a deeper understanding. But if the goal is to just understand some of the basics to communicate with datascientists while using js to get an idea of the concepts and workflow, I thiiinnkk it should be fine?

It's really up to you but I think you've probably used more brainpower to rationalize not learning python than learning python would take you.

some kinda jackal
Feb 25, 2003

 
 
Thanks for the input, gang. I'll see where I get with the tutorials that the tensorflow site links out to first. I didn't mean to position python as some kind of hill to die on -- if I really feel the limitations I'll probably do some rudimentary uptake on python, I just didn't want it to be a thing where I have a goal and like five dependencies I need to get through to achieve it. Appreciate the frank feedback though!

some kinda jackal
Feb 25, 2003

 
 
Ehh, so I'm kind of backpedaling on my "no python" stance, and I'm just hunting for a "just enough python to be useful in ML" type tutorial/resource that will let me competently follow/code along with random tutorials. It looks like Anaconda has a few good leads on practical tutorials and quick courses.

Ultimately I'm happy to eat a "told you so" because there's something to be said for just speaking the native language of the space when you're trying to learn the space.

But thanks to everyone in this thread for a bunch of high-level links on concepts, etc. The more I poke at the concepts, the deeper I want to go, and some of that means putting time into math, which I have the same amount of interest in as python, and then being able to put things into practice will be a nice way to validate my knowledge.

Anyway, a lot of words to say thanks.

some kinda jackal fucked around with this message at 19:10 on Jan 12, 2024

Biffmotron
Jan 12, 2007

I'm coming from the opposite place as you, since I started with Python and try very hard to live my life in such a way that I don't need to learn Javascript.

When I was starting out, I really benefitted from following along with Raschka's Python Machine Learning, which I thought had a very nice coding style that made it clear what every part of the code was doing, as opposed to some examples which are more like

Python code:

import requests
import magic

bar = requests.get('http://somedataset.com/data')

foo = magic.invocation(bar)

print(foo)
But I digress. Anaconda and Jupyter Notebooks are a great place to get started poking around with Python, because it's very easy to see why an error is showing up and to iterate rapidly. The basic ML workflow is that you have some data somewhere (database, flat file on disk, web, etc.), and you read it into a Pandas DataFrame. You munge the data until there are clean features, which you convert to Numpy arrays. Then you toss the arrays into your ML model to train it, and get a trained model you can use to predict stuff. TensorFlow is internally a lot more complex than a scikit-learn random forest, but the workflow is basically the same. I'm happy to answer more specific questions.
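That workflow, shrunk to a stdlib-only toy: a nearest-centroid classifier stands in for a real scikit-learn model, and the CSV data is invented:

```python
import csv
import io
import statistics

# "Read data" -> "clean features" -> "train" -> "predict",
# with no pandas/numpy/sklearn so the shape of the pipeline is visible.
raw = io.StringIO("sepal,petal,label\n5.1,1.4,a\n4.9,1.3,a\n6.7,4.7,b\n6.3,4.9,b\n")

# Read + clean: parse numbers, drop rows with missing fields.
rows = [r for r in csv.DictReader(raw) if r["sepal"] and r["petal"]]
data = [((float(r["sepal"]), float(r["petal"])), r["label"]) for r in rows]

# Train: compute per-class feature means (centroids).
labels = {lbl for _, lbl in data}
centroids = {
    lbl: tuple(statistics.mean(x[i] for x, l in data if l == lbl) for i in range(2))
    for lbl in labels
}

# Predict: assign the label of the nearest centroid.
def predict(x):
    return min(centroids, key=lambda lbl: sum((a - b) ** 2 for a, b in zip(x, centroids[lbl])))

print(predict((5.0, 1.5)))  # -> a
print(predict((6.5, 4.8)))  # -> b
```

Swap the CSV for a real dataset and the centroid step for model.fit()/model.predict() and it's the same pipeline.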

Biffmotron fucked around with this message at 07:00 on Jan 13, 2024

PierreTheMime
Dec 9, 2004

Hero of hormagaunts everywhere!
Buglord
Does anyone have any opinion about the current "best" vector database python library that supports cloud storage and metadata filtering? For a bunch of local POC stuff Chroma has been fine, but I'd like to set up persistent storage in an S3 bucket.
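No library recommendation from me, but for anyone unfamiliar, the two features being asked about amount to roughly this. A toy in-memory store, not Chroma's actual API:

```python
import math

# Minimal vector store: similarity search plus metadata filtering.
class TinyVectorStore:
    def __init__(self):
        self.items = []  # (vector, metadata, payload)

    def add(self, vector, metadata, payload):
        self.items.append((vector, metadata, payload))

    def query(self, vector, where=None, k=3):
        def cos(a, b):
            num = sum(x * y for x, y in zip(a, b))
            den = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return num / den if den else 0.0
        # Metadata filter first, then rank the survivors by similarity.
        pool = [
            it for it in self.items
            if not where or all(it[1].get(key) == val for key, val in where.items())
        ]
        return [p for _, _, p in sorted(pool, key=lambda it: -cos(vector, it[0]))[:k]]

store = TinyVectorStore()
store.add([1.0, 0.0], {"source": "manual"}, "doc-a")
store.add([0.9, 0.1], {"source": "blog"}, "doc-b")
print(store.query([1.0, 0.0], where={"source": "manual"}, k=1))  # -> ['doc-a']
```

A real library adds persistence and an ANN index on top, which is exactly the part that needs the S3-backed storage.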

street doc
Feb 20, 2019

Is there a thread anywhere on SA focused on machine vision?

Diva Cupcake
Aug 15, 2005

This is ridiculous.
https://twitter.com/OpenAI/status/1758192957386342435?s=20

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Yeeeeeesh. I'm a commoner but from the outside looking in it's hard to believe those short prompts generated the videos we see in those tweets.

That space-man trailer looks absolutely real to me from 3 ft away.

Hughmoris fucked around with this message at 22:00 on Feb 15, 2024

SporkOfTruth
Sep 1, 2006

this kid walked up to me and was like man schmitty your stache is ghetto and I was like whatever man your 3b look like a dishrag.

he was like damn.
As it turns out, it's mostly style transfer applied to memorized sequential material:

https://twitter.com/bcmerchant/status/1758537510618304669?s=46&t=XB441enUkiQ32sYge-f10A
(Merchant corrected himself, it's the video on the right that is the Shutterstock stock footage.)

Diva Cupcake
Aug 15, 2005

Still pretty incredible imo. That it bears a resemblance to stock source material isn't surprising.

Nektu
Jul 4, 2007

FUKKEN FUUUUUUCK
Cybernetic Crumb
I read something saying that generative AI is apparently being used to recognize diseases at very early stages, and I realized that I have no idea at all how AI can be applied to problems, or why it returns the results it does.

Why is recognizing diseases early a use case for generative AI? At first glance it looks like something for a pattern-matching algorithm (one which recognizes symptoms).

So I guess the only way to get an understanding of how to apply AI to real-life problems is to learn python and then implement some examples, isn't it?

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


Nektu posted:

I read something saying that generative AI is apparently being used to recognize diseases at very early stages, and I realized that I have no idea at all how AI can be applied to problems, or why it returns the results it does.

Can you link us to what you read?

Xun
Apr 25, 2010

Yeah, can't say for sure without knowing what you read, but I'm currently working on a time series anomaly detection paper that technically uses generative AI.

You basically train a model to reconstruct/regenerate inputs when they're normal. The idea is that when the model can't reconstruct the input (hopefully because something is weird), there is a problem. Maybe that's similar :shrug:
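A stdlib-only sketch of that idea, with invented data; a real version would use an autoencoder or similar rather than a memorized mean profile:

```python
import statistics

# Reconstruction-based anomaly detection: "reconstruct" each window as
# the mean profile learned from normal data, and flag windows whose
# reconstruction error is far above anything seen in training.
normal_windows = [[10, 12, 11], [11, 11, 12], [10, 13, 11], [12, 11, 10]]

# "Train": the model memorizes the mean profile of normal windows.
profile = [statistics.mean(w[i] for w in normal_windows) for i in range(3)]

def error(window):
    return sum((x - p) ** 2 for x, p in zip(window, profile))

# Threshold: worst reconstruction error on normal data, plus slack.
threshold = max(error(w) for w in normal_windows) * 2

def is_anomaly(window):
    return error(window) > threshold

print(is_anomaly([11, 12, 11]))  # -> False
print(is_anomaly([30, 2, 25]))   # -> True
```

The disease-screening story is plausibly the same shape: train on healthy scans only, and anything the model reconstructs badly gets flagged for a closer look.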

Nektu
Jul 4, 2007

FUKKEN FUUUUUUCK
Cybernetic Crumb

ultrafilter posted:

Can you link us to what you read?
I won't bother; it was basically just a blog post mentioning what I said above, without any more details.

Entropist
Dec 1, 2007
I'm very stupid.

Nektu posted:

Why is recognizing diseases early a use case for generative AI? At first glance it looks like something for a pattern matching algorithm (which recognizes symptoms).

Generative AI is just a fancy new word for pattern matching algorithm, it seems.

People seem to use it for any huge-scale machine learning model that uses the attention mechanism, so you could say that pretty much any machine learning task is a use case for generative AI. Alternatively, it may refer to prompt-based interaction with any large-scale model; in that case I guess it's just an alternative interface to the patterns that are detected.

Insurrectionist
May 21, 2007
I don't have a lot of AI experience. A pal and I took ML/NN as an elective during our bachelor's back in spring 2021, and we kept using it for basically every project afterwards since it was fun. This was before the generative AI explosion, though, as we finished in spring 2022 (also the last time I worked on ML), and we focused a lot on recognition/classification using keras/TF libraries. We made a cool sudoku-solver app that would solve from photos you took of unsolved or partially solved sudokus, though I mostly worked on the decidedly mundane solving algorithm, as well as a shameless ripoff of google's quickdraw and poo poo. To be honest, I haven't kept up at all in the past 2 years.

Basically, I understand a lot of the underlying principles, but that's about it. So I have a few questions for those who are more involved. Some of these may be too generalised or just industry secrets, but w/e, just call me an idiot and skip those.

- How much progress in the design and understanding of ML models has actually been made in the past, say, 5 years? That's pretty difficult to answer, especially since my impression is that the big companies and foundations were keeping their cards close to the chest even before GPT-2. I was mostly building from scratch with Google Colab instances as my hardware limitation, so nothing I've worked with compares to what they were using in the biz even back then. Are the recent commercial models reaching a plateau in what can be achieved, and are they using smoke and mirrors to hide weaknesses?

- What kinds of network designs are favored for generative NNs? Is there a focus on width (nodes per layer) over depth (# of layers) or vice versa? Do they utilise input-reducing layers like CNNs do (pooling etc.)?

- What kind of activation functions are modern generative AIs using? Is it still plain old sigmoid and the like? I remember we used ReLU a decent amount, but I can't imagine that would be useful for anything more complex than the data analysis we were working with, since it's hardly a sophisticated function.

- I remember most of the hardware bottleneck for neural networks was in training the NN, not running the completed model. How advanced are the various free, locally runnable generative NNs nowadays? What kind of hardware do you need to run them?

BAD AT STUFF
May 10, 2012

We choose to go to the moon in this decade and do the other things, not because they are easy, but because fuck you.

Insurrectionist posted:

- What kinds of network designs are favored for generative NNs? Is there a focus on width (nodes per layer) over depth (# of layers) or vice versa? Do they utilise input-reducing layers like CNNs do (pooling etc.)?

GPT stands for Generative Pre-trained Transformer, and the "transformer" bit is what to look at if you want to understand how these new generative models are different. The paper that really kicked things off is called "Attention Is All You Need". I don't think I understand attention and transformers enough to do it justice, but that's a good starting point.
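For the curious, the core mechanism from that paper fits in surprisingly little code. Here's a minimal, pure-Python sketch of scaled dot-product attention (single head, no batching, no learned Q/K/V projections; real implementations do all of this with batched tensor ops on the GPU):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    Q, K, V are lists of d_k-dimensional vectors, one per token."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted mix of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Two tokens, d_k = 2; each query attends mostly to its matching key,
# so each output row is pulled toward the corresponding value vector.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = attention(Q, K, V)
```

The "attention" part is just that softmax-weighted mixing: every token's output is a data-dependent blend of every other token's value vector.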

ultrafilter
Aug 23, 2007

It's okay if you have any questions.


If a generative AI produces images that are similar to a target image, how strongly does that suggest that the target image is AI-generated?

Lord Of Texas
Dec 26, 2006

Entropist posted:

Generative AI is just a fancy new word for pattern matching algorithm, it seems.

In the same sense that it's correct to call both firecrackers and Tsar Bomba "explosives", yes.

overeager overeater
Oct 16, 2011

"The cosmonauts were transfixed with wonderment as the sun set - over the Earth - there lucklessly, untethered Comrade Todd on fire."



Insurrectionist posted:

- What kinds of network designs are favored for generative NNs? Is there a focus on width (nodes per layer) over depth (# of layers) or vice versa? Do they utilise input-reducing layers like CNNs do (pooling etc.)?

It depends on the application, but for language models, BAD AT STUFF has it right that most have settled on decoder-only transformers as the architecture of choice. (e.g. Mistral, Llama and Gemma)

For generative tasks with images, diffusion models are where it's at (but people are still working with GANs). The diffusion denoising is typically done with the venerable U-Net, which does have the CNN-pooling layer structure you're probably familiar with.

Insurrectionist posted:

- What kind of activation functions are modern generative AIs using? Is it still plain old sigmoid and the like? I remember we used ReLU a decent amount, but I can't imagine that would be useful for anything more complex than the data analysis we were working with, since it's hardly a sophisticated function.

The language models linked above generally pick one of the GLU variants. However, there's at least one ICLR 2024 paper which argues that plain ReLU is actually completely fine.
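To make that concrete, here's a sketch of the activations in question. The `swiglu` function is a scalar toy (in a real FFN layer `w` and `v` are learned weight matrices and the product is elementwise), so treat it as an illustration rather than anyone's actual implementation:

```python
import math

def relu(x):
    """Plain ReLU: zero out negatives, pass positives through."""
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def silu(x):
    """SiLU ("swish"): x * sigmoid(x), a smooth relative of ReLU."""
    return x * sigmoid(x)

def swiglu(x, w, v):
    """SwiGLU gating, scalar version: silu(w*x) gates a second
    projection (v*x). Used in the FFN blocks of e.g. Llama."""
    return silu(w * x) * (v * x)
```

The point of the GLU variants is the gating (one projection modulating another), not any great sophistication in the nonlinearity itself, which is why "plain ReLU is fine" results aren't that shocking.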

Insurrectionist posted:

- I remember most of the hardware bottleneck for neural networks was in training the NN, not running the completed model. How advanced are the various free, locally runnable generative NNs nowadays? What kind of hardware do you need to run them?

The local LLMs are surprisingly good, is what I would say: they generate coherent answers and are pretty nifty for their size, but emphatically, don't expect a full ChatGPT replacement.

Loading 7B language models as-is requires a GPU with 16 GB of VRAM, but with quantization you can get away with 8 GB.

(Comedy option: OnnxStream is set up to use the least memory necessary, letting you run Stable Diffusion XL on a Raspberry Pi Zero 2, albeit at the cost of requiring hours to generate an image)
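Those VRAM figures fall straight out of parameter-count arithmetic. A back-of-the-envelope sketch (deliberately ignoring activations and the KV cache, which add real overhead on top of the weights):

```python
def model_vram_gb(n_params_billion, bits_per_weight):
    """Rough weight-storage footprint in decimal GB.
    Ignores activation memory and KV cache, so treat it as a floor."""
    bytes_total = n_params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9

fp16 = model_vram_gb(7, 16)  # 7B params at 16 bits: ~14 GB, hence the 16 GB card
q4 = model_vram_gb(7, 4)     # 4-bit quantized: ~3.5 GB, fits in 8 GB easily
```

That's why quantization is the whole ballgame for local inference: dropping from 16 to 4 bits per weight cuts the weight footprint by 4x before you've changed anything else.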

Charles 2 of Spain
Nov 7, 2017

Not sure where to post this so I'll put it here.

I'm streaming audio data from a microphone and continuously running a PyTorch model on it (speech recognition and other stuff). On a Linux laptop with Ubuntu and a GeForce GPU, the inference time is around 8 ms, which is nice and fast. When I run the exact same code and model on a Windows desktop, also with a GeForce GPU, the inference time is around 20 ms: more than twice as slow.

What could be the reason for this? The GPUs are the same on both systems, and both are being used as far as I can tell. I would understand a slight difference in performance depending on the operating system, but this is quite large. Is it something to do with how the model was trained?

Tippecanoe
Jan 26, 2011

Do you know where the latency is coming from? I'm not an audio expert by any means but I am wondering if your audio interface is introducing latency. Are you using a USB mic?

Charles 2 of Spain
Nov 7, 2017

USB mic on both systems, and I'm using PyAudio for streaming. The specific bottleneck seems to be the call to the model's encoder. Unfortunately I'm not familiar with the details of the transformer model, but that call appears to be doing the bulk of the inference.

overeager overeater
Oct 16, 2011

"The cosmonauts were transfixed with wonderment as the sun set - over the Earth - there lucklessly, untethered Comrade Todd on fire."



(e: posted before I saw the reply, whoops)

I would also start by looking at the audio end: in addition to the interface itself, there are different audio backends on Windows (MME, DirectSound, WASAPI); if you happen to be using PyAudio, it's worth checking which one is actually being used.

Alternatively, to rule out anything happening in the model itself, try setting up the PyTorch profiler and compare the time spent in inference.

e2: Is the Windows version running natively or under WSL? My other thought was that the Linux version might be taking advantage of Triton or similar, and falling back to a slower implementation on Windows.

overeager overeater fucked around with this message at 03:58 on Mar 19, 2024
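One more thing worth ruling out: make sure the 8 ms and 20 ms are measured the same way on both machines. A minimal timing-harness sketch in pure Python; the commented-out PyTorch usage at the bottom is hypothetical, and the `torch.cuda.synchronize()` matters because CUDA kernel launches return before the GPU has finished:

```python
import statistics
import time

def measure_latency(fn, warmup=10, runs=100):
    """Median wall-clock latency of fn() in milliseconds.
    Warmup iterations keep one-time costs (CUDA context creation,
    cudnn autotuning, JIT compilation) out of the measurement."""
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times.append((time.perf_counter() - t0) * 1000.0)
    return statistics.median(times)

# Hypothetical usage for a real PyTorch model; the synchronize call
# forces the timer to wait for the GPU to actually finish:
#   fn = lambda: (model(batch), torch.cuda.synchronize())
#   print(measure_latency(fn))
```

If the Windows box still reports ~20 ms with identical warmup and synchronization, the gap is real and the profiler comparison is the right next step.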

Charles 2 of Spain
Nov 7, 2017

Thanks, I'll check those out. I'm running Windows natively.

Charles 2 of Spain
Nov 7, 2017

According to the profiler, the Windows machine is taking longer than the Linux one for basically all the operations. So it does seem to confirm that there is an issue with this particular model on Windows. I've tried a couple of other desktops as well, with the same results.

Xun
Apr 25, 2010

Oh this is fun: working on my next paper on anomaly detection, I'm implementing and running some baseline methods. I saved PCA for last because it's EZ, and I figured it'd be nice to include a non-machine-learning baseline.

Uh. The out-of-the-box sklearn PCA does a lot better than most of these baselines, and my method performs about the same. Lmao wtf, some of these papers report a difference of 60% between their methods and PCA. These are AAAI papers too, with basically identical preprocessing...
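For anyone wanting to reproduce that kind of baseline without sklearn: in the 2-D case the whole idea fits in a few lines, with the anomaly score being the reconstruction error after projecting onto the principal axis. This is a toy closed-form sketch, not the sklearn implementation:

```python
import math

def fit_pca_line(points):
    """Fit mean + top principal direction for 2-D data, using the
    closed form for the dominant eigenvector of the 2x2 covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    a = sum((p[0] - mx) ** 2 for p in points) / n
    c = sum((p[1] - my) ** 2 for p in points) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    vx, vy = (b, lam - a) if abs(b) > 1e-12 else (1.0, 0.0)
    norm = math.hypot(vx, vy)
    return (mx, my), (vx / norm, vy / norm)

def recon_error(point, mean, direction):
    """Anomaly score: distance from the point to its projection
    onto the principal axis (the PCA reconstruction)."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    t = dx * direction[0] + dy * direction[1]
    return math.hypot(dx - t * direction[0], dy - t * direction[1])

normal = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
mean, direction = fit_pca_line(normal)
ok = recon_error((2.0, 2.0), mean, direction)     # ~0: lies on the axis
weird = recon_error((0.0, 3.0), mean, direction)  # large: far off the axis
```

In higher dimensions sklearn's `PCA(n_components=k)` plus `inverse_transform` gives the same reconstruction-error score, which is probably why it makes such an awkward baseline for the fancier methods.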

Well that paper is getting derailed :v:


ultrafilter
Aug 23, 2007

It's okay if you have any questions.


Replication crisis in AI/ML when?
