|
Xun posted:When you look at a GitHub page for a research paper, what do you guys want to see in the readme? I'm updating mine for a publication and sadly the only advice I'm getting from my labmates is "the code is all there and there's a bibtex citation, why are you worrying " I review research papers from time to time, and the amount of stuff in a project's README tends to be all over the map, from less than useless to comprehensive. Put in whatever you personally would want to see if you were coming in fresh, with no experience with the specific project. Include details on how to run the model and any additional details that you feel are important. It doesn't need to be a comprehensive SDK, but it should provide high-level details that would be helpful to anyone seeking to build upon your work. Assume that you'll want to show this project and its README to future prospective employers, too - put in the extra 60 seconds to give it a title, section headings, pass it through `aspell`, etc.
|
# ? Nov 22, 2023 01:13 |
|
|
Xun posted:When you look at a GitHub page for a research paper, what do you guys want to see in the readme? I'm updating mine for a publication and sadly the only advice I'm getting from my labmates is "the code is all there and there's a bibtex citation, why are you worrying " Ideally, a working demo or at least a GIF of what to expect, if it's something that has dataviz or can be made into a web app fairly easily. QuarkJets posted:I review research papers from time to time, and the amount of stuff in a project's README tends to be all over the map, from less than useless to comprehensive. This
|
# ? Nov 22, 2023 05:55 |
|
Xun posted:When you look at a GitHub page for a research paper, what do you guys want to see in the readme? I'm updating mine for a publication and sadly the only advice I'm getting from my labmates is "the code is all there and there's a bibtex citation, why are you worrying " You'd think this would be a basic one (who wouldn't include a requirements.txt?) but please please please list all required libraries and versions. It takes long enough to get research code running as it is. I once lost a couple of days because someone's requirements.txt didn't list library versions and the code depended on old versions that weren't easily interoperable. It's also nice if "type this to run the model" replicates exactly what was done in the paper. And please stick a license on your code.
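To be concrete, what I'm asking for is something like this (package names and versions below are purely illustrative, typically generated with `pip freeze > requirements.txt`):

```text
# requirements.txt -- exact pins, not open-ended ranges
numpy==1.24.4
torch==2.0.1
torchvision==0.15.2
```

Loose specs like `torch>=1.0` are exactly how you end up with the multi-day dependency archaeology described above.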
|
# ? Nov 22, 2023 08:13 |
|
Has anyone had a chance to really fiddle with the custom GPTs, and more specifically the "knowledge" part of them? You can upload all kinds of poo poo to be their knowledgebase. I tried making one to help me build a program against a certain API, and uploaded a bunch of JSON files that describe the API. Another was inspired by the built-in board game explainer: I made one focused on explaining and clarifying the rules, and uploaded some game manuals. The API GPT basically completely errored out and either ignored all the knowledge or just crashed. Meanwhile, the game rules explainer worked a bit better, but is poor at applying the knowledge or combining it with other sources of information. It did reasonably OK at finding specific points from the manuals, but was poor at answering certain types of fringe questions that required combining different rules from the manual. Below are some chatlogs and my commentary about two boardgames: A Distant Plain, a COIN game about the War in Afghanistan, and War of the Ring, a super nerdy (and loving amazing) LOTR boardgame. Chatlog, Custom Board Game GPT and A Distant Plain, the boardgame posted:User The rules specify a "rally" action, which allows the insurgent factions, such as the Taliban and Warlords, to recruit guerillas on the board. I tried asking about both "recruit" and "rally", and it couldn't find either in the rules. Meanwhile, I asked if the Coalition player can move Government police cubes with their Sweep action, and it evidently did find the Sweep part of the rulebook, which proves that the file is readable by the GPT. It was also correctly able to deduce that because Sweep only mentions troops and not police, it doesn't include police. The problem is that it seems very hit and miss whether it can locate the correct info from the data. I've tried to find good sources that explain the intricacies of this feature, but haven't been able to find any good ones.
(The promising "Medium" article was hidden behind a paywall.) Does anyone have any? The good news is that during my testing, hallucination has been basically zero. I'll take "I have no idea" over hallucinating any day. I also wonder how I can have the AI access its internal knowledge in case it can't find anything in the rulebook, and it'd be even better if it could combine knowledge from the rulebook with its internal knowledge. Here's another test in case anyone is interested, this time I uploaded the War of the Ring rulebook: Chatlog, Custom Board Game GPT, War of the Ring posted:User quote:User I was able to prompt it to search other sources, and then it was successful. If I recall correctly, it used Bing to find a website that answered the question, and it gave a good answer. quote:User quote:User It would appear that the AI first answered based on its internal memory, and got the answer wrong. When I specifically prompted it to use the rulebook, it got the answer right, although it had to guess.
|
# ? Nov 22, 2023 10:49 |
|
https://x.com/openai/status/1727206187077370115?s=46&t=DcBXErlGIUJUj8quAgYfkQ
|
# ? Nov 22, 2023 12:24 |
|
Keisari posted:Has anyone been able to really well fiddle with the custom GPTs and more specifically, the "knowledge" part of them? You can upload all kinds of poo poo to be their knowledgebase. I have tried to make one to help me build a program to use a certain API, and uploaded a bunch of JSON files that describe the API. Another I made was inspired by the built in board game explainer, I made one that was focused on explaining and clarifying the rules, and uploaded some game manuals. Did you give it an OpenAPI spec? And were you asking it to write code that queries an API or setting that up as an action for a custom GPT? I got good results with regular ChatGPT when uploading a basic JSON file describing a service I wanted it to create using FastAPI. I had much less success trying to set up an action for my own GPT. It gave an unhelpful error message and crashed each time. I think the root of the issue was that I gave it the full JSON file that I pulled from Swagger. I want to go back with a cut down file to try adding endpoints one at a time. That might help with finding the source of the error, but also I think it should perform better if I can limit extraneous context. If anyone has found things that work well for cleaning up knowledge inputs to GPTs, I'd love to hear about that. ...also this OpenAI drama makes me feel better about the dumb politics poo poo that happens at my company. BAD AT STUFF fucked around with this message at 20:48 on Nov 22, 2023 |
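For reference, the kind of cut-down spec I have in mind — a single endpoint with just the essentials, to add back one at a time (all names here are made up):

```json
{
  "openapi": "3.0.3",
  "info": { "title": "Example service", "version": "0.1.0" },
  "paths": {
    "/items/{item_id}": {
      "get": {
        "operationId": "get_item",
        "summary": "Fetch a single item by id",
        "parameters": [
          {
            "name": "item_id",
            "in": "path",
            "required": true,
            "schema": { "type": "string" }
          }
        ],
        "responses": {
          "200": { "description": "The requested item" }
        }
      }
    }
  }
}
```

A full Swagger export usually drags in every schema and example response, which is a lot of extraneous context compared to something like this.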
# ? Nov 22, 2023 20:45 |
|
Keisari posted:Meanwhile, the game rules explainer worked a bit better, but is poor at applying the knowledge or combining it with other sources of information. it did reasonably ok at finding specific points from the manuals, but was poor at answering certain type of fringe questions that required combining different rules from the manual. This fits the limitations of LLMs described here: https://aiguide.substack.com/p/can-large-language-models-reason. They are good at identifying patterns in the training data that match your prompt, but they are incapable of actual reasoning.
|
# ? Dec 26, 2023 23:44 |
|
BAD AT STUFF posted:Did you give it an OpenAPI spec? And were you asking it to write code that queries an API or setting that up as an action for a custom GPT? Completely forgot to answer this, sorry! I'm not sure what spec it was, I just plonked it in. Nektu posted:This fits to the restrictions of LLMs described here: https://aiguide.substack.com/p/can-large-language-models-reason. That was an interesting read, thanks! Well, even in its current state it does provide some value as what basically amounts to an improved PDF search engine.
|
# ? Dec 27, 2023 09:35 |
|
This is SUPER abstract, but as someone who has zero interest in learning python, is learning the basic concepts of modern ML/AI in a Javascript environment viable? The context is that my org is starting to delve into ML data science. DS is not my domain, but I do interact with these teams at a system design level, so I feel it would be useful to speak their "language", if only to better facilitate relationships within the org. I spent a fair amount of time bootstrapping JS/TS via some Node projects last year, and I know that the majority of this space lives and breathes Python, which I have no real desire to pivot to. Like I said, this is super abstract and open ended -- I expect the answer is definitely yes, the basics are all easily replicated in JS and are not really bound to a specific language. The fact that I just said "learn ML" probably speaks volumes in that I don't actually know what I want out of this other than to be able to talk to people about how we use ML, and that's probably super specific to toolsets and libraries etc., but at some point I just need a jumping off point. For better or for worse, I learn by doing, so I can watch a thousand hours of youtube video tutorials but until I write the code it's usually in one ear and out the other. So I'm wondering if anyone has any good recommendations for a foundational ML "course" or series focusing on a JS/TS toolchain. And if this is a really stupid question, I'm willing to eat my foot here. I typically dislike questions like this where someone asks "how do I learn this technology with no real goal?" because it's super hard to give direction, but I find myself unable to articulate it better than this given I know almost nothing about this space.
e: On python, I know I can probably learn enough to just start with some basic tutorials, but I know that I'm the kind of person who if I get frustrated not being able to write something in python or get it working I'm liable to just put the whole project down, so I'm trying not to stack the deck against me.
|
# ? Jan 9, 2024 13:41 |
|
some kinda jackal posted:This is SUPER abstract, but as someone who has zero interest in learning python, is learning the basic concepts of modern ML/AI in a Javascript environment? Not sure if this is exactly what you're looking for, but I've heard TensorFlow has a javascript library, and it points to this series as a tutorial. I haven't watched it but the topics look faaaiirrlly complete? https://www.youtube.com/playlist?list=PLOU2XLYxmsILr3HQpqjLAUkIPa5EaZiui
|
# ? Jan 9, 2024 17:03 |
|
some kinda jackal posted:e: On python, I know I can probably learn enough to just start with some basic tutorials, but I know that I'm the kind of person who if I get frustrated not being able to write something in python or get it working I'm liable to just put the whole project down, so I'm trying not to stack the deck against me. You're better off just sucking it up and learning enough python to get by. I have sympathy for you because I personally dislike python, but the vast majority of the ecosystem and examples for stuff like tensorflow etc. are going to be written in python and frankly learning python is going to be much easier than learning some js binding for tensorflow and trying to figure out why it doesn't work or expose the same features as in the python example.
|
# ? Jan 9, 2024 21:12 |
|
Bruegels Fuckbooks posted:You're better off just sucking it up and learning enough python to get by. I have sympathy for you because I personally dislike python, but the vast majority of the ecosystem and examples for stuff like tensorflow etc. are going to be written in python and frankly learning python is going to be much easier than learning some js binding for tensorflow and trying to figure out why it doesn't work or expose the same features as in the python example. NGL I'd agree with this 100% if he were looking to do much actual implementation or looking for a deeper understanding. But if the goal is just to understand some of the basics to communicate with data scientists, while using js to get an idea of the concepts and workflow, I thiiinnkk it should be fine? But yeah, as soon as you get into actual implementation outside of the simplest models or prepackaged known-to-work-in-js models, python is 100% the way to go. There's also like, apache spark in java? I used that once...
|
# ? Jan 9, 2024 23:35 |
|
Xun posted:NGL I'd agree with this 100% if he was looking to do much actual implementation or looking for a deeper understanding. But if the goal is to just understand some of the basics to communicate with datascientists while using js to get an idea of the concepts and workflow, I thiiinnkk it should be fine? It's really up to you but I think you've probably used more brainpower to rationalize not learning python than learning python would take you.
|
# ? Jan 10, 2024 16:48 |
|
Thanks for the input gang. I'll see where I get with the tutorials that the tensorflow site links out to first. I didn't mean to position python as some kind of hill to die on -- if I really feel the limitations I'll probably do some rudimentary uptake on python, I just didn't want it to be a thing where I have a goal and five dependencies I need to get through to achieve it. Appreciate the frank feedback though!
|
# ? Jan 10, 2024 16:52 |
|
Ehh so I'm kind of backpedaling on my "no python" stance and I'm just hunting for a "just enough python to be useful in ML" type tutorial/resource that will let me competently follow/code along with random tutorials. It looks like Anaconda has a few good leads on practical tutorials or quick courses. Ultimately I'm happy to eat a "told you so" because there's something about just speaking the native language of the space in an effort to learn the space. But thanks to everyone and this thread for a bunch of high level links on concepts, etc. The more I poke at the concepts the deeper I want to go, and some of that is putting time into math, which I have the same amount of interest in as python, and then being able to put things into practice will be a nice way to validate my knowledge. Anyway, a lot of words to say thanks. some kinda jackal fucked around with this message at 19:10 on Jan 12, 2024 |
# ? Jan 12, 2024 18:49 |
|
I'm coming from the opposite place as you, since I started with Python and try very hard to live my life in such a way that I don't need to learn Javascript. When I was starting out, I really benefitted from following along with Raschka's Python Machine Learning, which I thought had a very nice coding style that made it clear what every part of the code was doing, as opposed to some examples that are much denser and harder to follow.
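To illustrate the kind of style I mean (my own toy example, not code from the book): every intermediate quantity gets a name, instead of collapsing everything into one expression.

```python
# Spelled-out style: each step is named and inspectable.
def mean_squared_error(y_true, y_pred):
    errors = [t - p for t, p in zip(y_true, y_pred)]   # residuals
    squared_errors = [e ** 2 for e in errors]          # penalize big misses
    return sum(squared_errors) / len(squared_errors)   # average

# Versus the dense style that hides the intermediate quantities:
mse = lambda yt, yp: sum((t - p) ** 2 for t, p in zip(yt, yp)) / len(yt)

print(mean_squared_error([1.0, 2.0], [0.0, 4.0]))  # → 2.5
```

Both compute the same thing, but the first one is far easier to step through when you're learning what the algorithm actually does.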
Biffmotron fucked around with this message at 07:00 on Jan 13, 2024 |
# ? Jan 13, 2024 06:48 |
|
Does anyone have any opinion about the current “best” vector database python library that supports cloud storage and metadata filtering? For a bunch of local POC stuff Chroma has been fine, but I’d like to setup persistent storage in an S3 bucket.
|
# ? Feb 1, 2024 17:06 |
|
Is there a thread anywhere on SA focused on machine vision?
|
# ? Feb 8, 2024 15:16 |
|
This is ridiculous. https://twitter.com/OpenAI/status/1758192957386342435?s=20
|
# ? Feb 15, 2024 20:43 |
|
Diva Cupcake posted:This is ridiculous. Yeeeeeesh. I'm a commoner but from the outside looking in it's hard to believe those short prompts generated the videos we see in those tweets. That space-man trailer looks absolutely real to me from 3 ft away. Hughmoris fucked around with this message at 22:00 on Feb 15, 2024 |
# ? Feb 15, 2024 21:58 |
|
As it turns out, it's mostly style transfer applied to memorized sequential material: https://twitter.com/bcmerchant/status/1758537510618304669?s=46&t=XB441enUkiQ32sYge-f10A (Merchant corrected himself, it's the video on the right that is the Shutterstock stock footage.)
|
# ? Feb 16, 2024 22:07 |
|
Still pretty incredible imo. That it bears a resemblance to stock source material isn't surprising.
|
# ? Feb 17, 2024 03:53 |
|
I read something about generative AI apparently being used to recognize diseases at very early stages, and I realized that I have no idea how AI can be applied to problems, or why it returns the results it does. Why is recognizing diseases early a use case for generative AI? At first glance it looks like something for a pattern matching algorithm (which recognizes symptoms). So I guess the only way to get an understanding of how to apply AI to real life problems is to learn python and then implement some examples, isn't it?
|
# ? Feb 25, 2024 10:56 |
|
Nektu posted:I read something that apparently generative AI is used to recognize diseases in very early stages, and I realized that I have no idea at all how AI can be applied to problems, and why it returns result. Can you link us to what you read?
|
# ? Feb 25, 2024 15:51 |
|
Yeah, can't say for sure without knowing what you read, but I'm currently working on a time series anomaly detection paper that technically uses generative AI. You basically train a model to reconstruct/regenerate inputs when they're normal. The idea is that when the model can't reconstruct the input (hopefully because something is weird), there is a problem. Maybe that's similar.
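A deliberately dumb, stdlib-only sketch of that shape — here the "model" is just a neighborhood average standing in for a trained network, so this is purely illustrative:

```python
import statistics

def anomaly_flags(series, window=5, k=4.0):
    """Reconstruct each point as the mean of its neighbors; flag points the
    'model' cannot reconstruct (residual much larger than is typical)."""
    reconstructed = []
    for i in range(len(series)):
        lo, hi = max(0, i - window), min(len(series), i + window + 1)
        neighbors = [x for j, x in enumerate(series[lo:hi], lo) if j != i]
        reconstructed.append(sum(neighbors) / len(neighbors))
    residuals = [abs(x, ) if False else abs(x - r) for x, r in zip(series, reconstructed)]
    scale = statistics.median(residuals) or 1e-9  # robust "typical" residual
    return [r > k * scale for r in residuals]

series = [0.0] * 20
series[10] = 5.0  # one injected spike
print([i for i, flagged in enumerate(anomaly_flags(series)) if flagged])  # → [10]
```

Swap the neighborhood mean for a trained autoencoder's reconstruction and that's the core loop of a reconstruction-based detector.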
|
# ? Feb 25, 2024 16:07 |
|
ultrafilter posted:Can you link us to what you read?
|
# ? Feb 25, 2024 16:19 |
|
Nektu posted:Why is recognizing diseases early a use case for generative AI? On first glance it looks like something for a pattern matching algorithm (which recognizes symptoms). Generative AI is just a fancy new word for pattern matching algorithm, it seems. People use it for any huge-scale machine learning model that uses the attention mechanism, so you could say that pretty much any machine learning task is a use case for generative AI. Alternatively, it may refer to prompt-based interaction with any large-scale model, in which case I guess it's just an alternative interface to the patterns that are detected.
|
# ? Feb 26, 2024 19:31 |
|
I don't have a lot of AI experience; a pal and I took ML/NN as an elective during our bachelor's back in spring 2021, and we kept using it for like every project afterwards since it was fun. This was before the generative AI explosion though, as we finished spring 2022 (also the last time I worked on ML), and we focused a lot on recognition/classification using keras/TF libraries. We made a cool sudoku-solver app that would solve from photos you took of unsolved or partially solved sudoku, though I mostly worked on the decidedly mundane solving algorithm there, as well as a shameless ripoff of google's quickdraw and poo poo. To be honest, I haven't kept up at all in the past 2 years. Basically, I understand a lot of the underlying principles, but that's about it. So I have a few questions for those who are more involved. Some of these may be too generalised or just industry secrets, but w/e, just call me an idiot and skip those. - How much progress in design and understanding of ML models has actually been made in the past, say, 5 years? Pretty difficult to answer, especially since my impression is that the big companies and foundations were keeping cards close to the chest even before GPT-2. I was mostly building from scratch with Google Colab instances as a hardware limitation, so nothing I've worked with compares to what they were using in the biz even back then anyway. Are the recent commercial models reaching a plateau in what can be achieved, and are they using smoke and mirrors to hide weaknesses at all? - What kinds of network designs are favored for generative NNs? Is there a focus on width (nodes per layer) over depth (# of layers) or vice versa? Do they utilise input-reducing layers like CNNs (pooling etc)? - What kind of activation functions are modern generative AIs using? Is it still plain old Sigmoid and the like?
I remember we used ReLU a decent amount, but I can't imagine that would be useful for anything more complex than the data analysis we were working with, since it is hardly a sophisticated function. - I remember most of the hardware bottleneck for neural networks was in training the NN, not running the completed model. How advanced are the various free, locally runnable generative NNs nowadays? What kind of hardware do you need to run them?
|
# ? Mar 5, 2024 10:48 |
|
Insurrectionist posted:- What kinds of network designs are favored for generative NNs? Is there a focus of width (nodes per layer) over depth (# of layers) or vise versa? Do they utilise input-reducing layers like CNNs (pooling etc)? GPT stands for Generative Pre-trained Transformer, and the "transformer" bit is what to look at if you want to understand how these new generative models are different. The paper that really kicked things off is called "Attention Is All You Need". I don't think I understand attention and transformers enough to do it justice, but that's a good starting point.
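The core operation itself is small enough to sketch, though. Here's my numpy paraphrase of the scaled dot-product attention from that paper (single head, no masking, shapes simplified — treat it as an illustration, not a reference implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)  # query-key similarity
    weights = softmax(scores)                       # each query's weights sum to 1
    return weights @ V, weights                     # weighted mix of the values

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))  # 4 query positions, d_k = 8
K = rng.normal(size=(6, 8))  # 6 key positions
V = rng.normal(size=(6, 8))  # one value vector per key
out, weights = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # → (4, 8)
```

Every output position is a data-dependent weighted average of the value vectors, which is the "attention" part; the transformer stacks this with feed-forward layers.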
|
# ? Mar 6, 2024 04:38 |
|
If a generative AI produces images that are similar to a target image, how strongly does that suggest that the target image is AI-generated?
|
# ? Mar 12, 2024 16:57 |
|
Entropist posted:Generative AI is just a fancy new word for pattern matching algorithm, it seems. In the same sense that it's correct to call both firecrackers and Tsar Bomba "explosives", yes.
|
# ? Mar 13, 2024 00:58 |
|
Insurrectionist posted:- What kinds of network designs are favored for generative NNs? Is there a focus of width (nodes per layer) over depth (# of layers) or vise versa? Do they utilise input-reducing layers like CNNs (pooling etc)? It depends on the application, but for language models, BAD AT STUFF has it right that most have settled on decoder-only transformers as the architecture of choice. (e.g. Mistral, Llama and Gemma) For generative tasks with images, diffusion models are where it's at (but people are still working with GANs). The diffusion denoising is typically done with the venerable U-Net, which does have the CNN-pooling layer structure you're probably familiar with. Insurrectionist posted:- What kind of activation functions are modern generative AIs using? Is it still plain old Sigmoid and the like? I remember we used ReLU a decent amount but I can't imagine that would be useful for anything more complex than the data analysis we were working with since it is hardly a sophisticated function. The language models linked above generally pick one of the GLU variants. However, there's at least one ICLR 2024 paper which argues that plain ReLU is actually completely fine. Insurrectionist posted:- I remember most of the hardware bottleneck for neural networks was for training the NN, and not running the completed model. How advanced are the various free, locally runnable generative NNs nowadays? What kind of hardware so you need to run them? The local LLMs are surprisingly good, is what I would say - they generate coherent answers and are pretty nifty for their size, but emphatically - don't expect a full ChatGPT replacement Loading 7B language models as-is requires a GPU with 16 GB of VRAM, but with quantization you can get away with 8 GB. (Comedy option: OnnxStream is set up to use the least memory necessary, letting you run Stable Diffusion XL on a Raspberry Pi Zero 2, albeit at the cost of requiring hours to generate an image)
|
# ? Mar 14, 2024 17:21 |
|
Not sure where to post this so I'll put it here. I'm streaming audio data from a microphone and am continuously running a Pytorch model on it (speech recognition and other stuff). On a Linux laptop with Ubuntu and a GeForce GPU the inference time is around 8ms, which is nice and fast. When I run the exact same code and model on a Windows desktop also with a GeForce GPU the inference time is around 20ms, more than twice as slow. What could be the reason for this? GPUs are the same on both systems and they are both being used as far as I can tell. I would understand a slight difference in performance depending on the operating system but this is quite large. Is it something to do with how the model was trained?
|
# ? Mar 19, 2024 03:22 |
|
Do you know where the latency is coming from? I'm not an audio expert by any means but I am wondering if your audio interface is introducing latency. Are you using a USB mic?
|
# ? Mar 19, 2024 03:30 |
|
USB mic on both systems, and I'm using PyAudio for streaming. The specific bottleneck seems to be the call to the model's encoder, which appears to be doing the bulk of the inference. Unfortunately I'm not familiar with the details of the transformer model.
|
# ? Mar 19, 2024 03:44 |
|
(e: posted before I saw the reply, whoops) I would also start looking at the audio end first - in addition to the interface itself, there are different audio backends on Windows (MME, DirectSound, WASAPI); if you happen to be using PyAudio, it's worth checking which one is actually being used. Alternatively, to rule out anything happening in the model itself, try setting up the PyTorch profiler and compare the time spent in inference. e2: Is the Windows version running natively or under WSL? My other thought was the Linux version possibly taking advantage of Triton or similar and falling back to a slower implementation on Windows. overeager overeater fucked around with this message at 03:58 on Mar 19, 2024 |
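One more cheap sanity check before the profiler: time the call with a bare-bones harness so warm-up and measurement noise are ruled out. This is stdlib-only with the model call stubbed in as `fn` (when timing CUDA, `fn` should synchronize — e.g. call `torch.cuda.synchronize()` before returning — or you only measure the kernel launch, not the kernel):

```python
import time
import statistics

def median_latency_ms(fn, warmup=10, runs=100):
    """Median wall-clock latency of fn() in milliseconds.

    Warm-up iterations are discarded, since the first calls pay one-off costs
    (allocator growth, kernel compilation, cache warming)."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)

# Example with a stand-in workload instead of a real model call:
print(median_latency_ms(lambda: sum(range(100_000))))
```

If the median still differs 8 ms vs 20 ms under identical warm-up and batch conditions, it's a real backend difference and the profiler output is worth trusting.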
# ? Mar 19, 2024 03:52 |
|
Thanks, I'll check those out. I'm running Windows natively.
|
# ? Mar 19, 2024 04:14 |
|
According to the profiler the Windows machine is taking longer for inference than the Linux one for basically all the operations. So it does seem to confirm that there is an issue with this particular model and Windows. I've tried on a couple of other desktops as well with the same results.
|
# ? Mar 19, 2024 05:09 |
|
Oh this is fun. Working on my next paper on anomaly detection, I'm implementing and running some baseline methods. I saved PCA for last because it's EZ, and I figured it'd be nice to include a non-machine-learning baseline. Uh. The out-of-the-box sklearn PCA does a lot better than most of these baselines, and my method is performing about the same. Lmao wtf, some of these papers report a difference of 60% between their methods and PCA. These are like AAAI papers too, with basically identical preprocessing... Well, that paper is getting derailed
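For anyone curious, the entire PCA baseline is about this much code (synthetic data below — my real pipeline obviously has the usual windowing/preprocessing on top):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic "normal" data living near a 2-D subspace of a 10-D space.
X_train = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 10))
X_train += 0.01 * rng.normal(size=X_train.shape)

pca = PCA(n_components=2).fit(X_train)

def anomaly_score(X):
    # Reconstruction error after a round trip through the PCA subspace.
    X_hat = pca.inverse_transform(pca.transform(X))
    return np.linalg.norm(X - X_hat, axis=1)

print(anomaly_score(X_train).mean())                    # small for normal data
print(anomaly_score(rng.normal(size=(50, 10))).mean())  # larger for random points
```

Threshold the score and you're done — which makes the reported 60% gaps over this baseline pretty suspicious.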
|
# ? Mar 19, 2024 21:05 |
|
|
Replication crisis in AI/ML when?
|
# ? Mar 19, 2024 21:17 |