|
Carp posted:Wow, yeah, LoRA sounds very interesting, and I agree that it would make fine-tuning a large model much easier. How has it worked out for you? There is so much new information out there about deep learning and LLMs. If I come across the paper again, I'll be sure to let you know.
|
# ¿ Mar 30, 2023 16:27 |
|
Carp posted:What have you used them for in the past and what do you think of ChatGPT and GPT-4? Here's a typical example of where I'd use something like this: Suppose I have a bunch of consumer reviews with 1-5 stars, with each review associated with a product, user demo info, etc., but that also contains a text review field with natural language. There's potentially a lot of good info locked up in the review text field. However, there's probably too little data to train my own text model, most of the text is short (which means the words have little context of their own), and it would be too much of a PITA anyway. So, instead of doing that, I just get the word vector embedding. Each word in each review gets its vector. Then, those vectors are averaged over each review, so now each review has a semantic vector associated with it (there's other/better ways to do that, but whatever). Just like with the words, semantically similar review texts have similar embedding vectors. These embeddings can then be used alongside the other review data in a downstream model to actually predict the rating. I'm no expert on the newer models, but the main part of the transfer learning task is still going to be text => numeric vectors, except going through a more sophisticated prediction-context-aware transformer model rather than being essentially a dictionary of words to vectors. cat botherer fucked around with this message at 00:54 on Mar 31, 2023 |
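A minimal sketch of that averaging step, with made-up 4-dimensional vectors standing in for real word2vec/GloVe embeddings (which would have hundreds of dimensions and come from a trained model):

```python
import numpy as np

# Toy word vectors - hypothetical values for illustration only.
word_vecs = {
    "great":    np.array([ 0.9,  0.1, 0.0, 0.2]),
    "terrible": np.array([-0.8, -0.2, 0.1, 0.0]),
    "battery":  np.array([ 0.0,  0.7, 0.6, 0.1]),
    "screen":   np.array([ 0.1,  0.6, 0.7, 0.0]),
}

def review_embedding(text):
    """Average the vectors of the known words in a review."""
    vecs = [word_vecs[w] for w in text.lower().split() if w in word_vecs]
    return np.mean(vecs, axis=0) if vecs else np.zeros(4)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

r1 = review_embedding("great battery")
r2 = review_embedding("great screen")
r3 = review_embedding("terrible battery")

# Semantically similar reviews end up with similar averaged vectors,
# which can then feed a downstream rating model as extra features.
assert cosine(r1, r2) > cosine(r1, r3)
```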
# ¿ Mar 31, 2023 00:50 |
|
Aramis posted:More specifically, anything related to structuring information for human consumption is definitely going to be dead in the water real quick. Technical writers, copy editors, etc... Carp posted:That's a pretty good summary. Much better than my notes earlier in the thread, which are a little confused.
|
# ¿ Mar 31, 2023 15:47 |
|
SaTaMaS posted:Another promising area is the possibility that ChatGPT can look at really old languages like Cobol and Fortran and not just improve the documentation but translate it into modern languages using cleaner code. I think it's one area where ChatGPT would really be led astray. ChatGPT only understands text (including code) and textual contexts. Good code written in a modern structured programming language will usually have a pretty decent mapping between syntax and computational semantics. ChatGPT has no idea about any kind of computational semantics, but for such languages it is possible that there exists a faithful enough mapping.
Cobol is not modern or structured - the relationship between syntax and semantics, in CS terminology, is "hosed up." Because the code is unstructured, it's a massively complex ball of entropy with all sorts of non-local interactions. A piece of code might do very different things depending on current program state, etc. The program can only be understood as a whole, and only fully upon running it many, many times with different inputs - so it's just something that ChatGPT can't do. cat botherer fucked around with this message at 17:28 on Mar 31, 2023 |
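A toy sketch, in Python rather than actual Cobol, of the kind of label-and-jump control flow meant here - the same jump target does different things depending on mutable state set elsewhere, so no block can be read in isolation:

```python
# Simulated unstructured control flow: a dispatch loop stands in for
# GO TO. Behavior at each "paragraph" depends on global mutable state.
def run(flag):
    state = {"total": 0, "flag": flag}
    label = "START"
    while label != "DONE":
        if label == "START":
            state["total"] += 1
            # Same point in the code, different successor per state.
            label = "ADJUST" if state["flag"] else "FINISH"
        elif label == "ADJUST":
            state["total"] *= 10
            state["flag"] = False
            label = "START"   # backwards jump: non-local interaction
        elif label == "FINISH":
            label = "DONE"
    return state["total"]
```

Two runs from almost the same starting point diverge completely: `run(False)` returns 1, `run(True)` returns 11 - which is why such programs are only understood by running them, not by reading any one piece.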
# ¿ Mar 31, 2023 16:55 |
|
Seyser Koze posted:So all we need is a society completely unlike the one we live in, run by people completely unlike the ones running it, and a ton of people losing their jobs will be no issue. Great.
|
# ¿ Mar 31, 2023 17:27 |
|
SCheeseman posted:Impede, obstruct, whatever. In any case what you want isn't going to make AI art generators uneconomical, it'll make the 'legal' ones economical only for the entrenched IP hoarders. What you want changes nothing about how people will in actuality be exploited and may even serve to make it worse! Whatever your problem is, the answer is not "expand copyright protections."
|
# ¿ Apr 4, 2023 23:59 |
|
KwegiboHB posted:https://hub.jhu.edu/2023/02/28/organoid-intelligence-biocomputers/
|
# ¿ Apr 5, 2023 01:42 |
|
StratGoatCom posted:This isn't some ip cartel, this is enforcement of long standing rights in law in basically every legal system.
|
# ¿ Apr 5, 2023 01:47 |
|
StratGoatCom posted:For the hundredth time, this isn't. This is merely using already existing rules. Allowing it to be otherwise in fact will have that effect you fear, because it makes literally anything free real estate for billionaire bandits. Indeed, the point is laundering this behavior, much as crypto was laundering for securities bs.
|
# ¿ Apr 5, 2023 01:59 |
|
StratGoatCom posted:So? By training those models, they clearly crossed long established lines on copyright law.
|
# ¿ Apr 5, 2023 02:26 |
|
StratGoatCom posted:Given the commercial nature of these models and that they create similar outputs WITHOUT permission, no I do not think fair use harbor applies.
|
# ¿ Apr 5, 2023 02:36 |
|
eXXon posted:There's a class action lawsuit over GitHub CoPilot currently ongoing, filed in November. Microsoft asked to dismiss it in January. No idea what to expect next.
|
# ¿ Apr 5, 2023 03:40 |
|
KwegiboHB posted:How about we... and I know I'm being crazy over here... NOT torture the disembodied brains. Or the embodied brains either.
|
# ¿ Apr 5, 2023 03:41 |
|
BrainDance posted:There was this really cool artificial life game series called Creatures back in the day. It was really what got me into the early internet (it had a scripting language to create objects, the creatures had artificial DNA and could evolve so you could export them and share them. It was actually incredibly cool and way better than I'm explaining it.) Like a really complex tamagotchi.
|
# ¿ Apr 6, 2023 16:53 |
|
ChatGPT cannot control a robot. JFC people, come back to reality.
|
# ¿ Apr 8, 2023 23:53 |
|
KillHour posted:Mods!?
|
# ¿ Apr 23, 2023 16:24 |
|
Delthalaz posted:Regarding the fears of AI superintelligence and world domination, I'll be a lot more concerned if Paradox can ever develop an "AI" that can beat an average human player without cheating. Those games are pretty complicated, but not nearly as complicated as the real world, so...
|
# ¿ May 3, 2023 19:14 |
|
Folks, that was a joke post.
|
# ¿ May 3, 2023 22:04 |
|
Bar Ran Dun posted:Another AI showing human reasoning article, this time in the times. Based on a Microsoft paper. Also a new funny Bard thing just dropped: https://twitter.com/goodside/status/1657396491676164096
|
# ¿ May 16, 2023 19:16 |
|
SubG posted:There really isn't any plausible argument for capitalism ending mediaeval feudalism. The normal framing is that agrarian feudalism (roughly the thousand years preceding the 16th Century, although you can fiddle with the endpoints a lot) was supplanted by mercantilism (roughly the 16th to 18th Centuries) which led to capitalism (somewhere around the late 18th/early 19th Century).
|
# ¿ May 16, 2023 22:13 |
|
PT6A posted:Yeah, I think one aspect of feudalism that gets dismissed or ignored a lot is that it's based on mutual obligation in a semi-theocratic framework. If you upset God by failing to execute the obligations of your divinely-ordained station, it's open season on you! During most of the middle ages, cavalry was king. This needed horses and armor, which the great mass of peasants couldn't supply - but landowners could. Until the advent of longbows and guns, knights were essentially invincible against peasants, so peasant uprisings were easy to squash. The lords needed the peasants to farm the land, and the peasants needed the land to eat, but the military balance still shifted power towards landowners, within limits. Any kind of spiritual obligation only existed on Sundays. People were animated by material concerns, same as now.
|
# ¿ May 17, 2023 00:07 |
|
Count Roland posted:I believe LLMs are poor at logic. Dealing with facts requires the AI to state things are true or false. Such statements can be logically modified, e.g., if x is true then y. A model that is guessing the next symbols in a phrase will sometimes pull this off but can't itself be reliable. The AI needs to do logical operations. Which I assume is possible, given how logic-based computing is.
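For what it's worth, here is a minimal sketch of what an explicit logical operation looks like - a truth-table check that *evaluates* "if x then y" over every assignment rather than guessing likely tokens (toy code, nothing to do with how any LLM works internally):

```python
from itertools import product

def implies(x, y):
    """Material implication: x -> y is false only when x is true and y is false."""
    return (not x) or y

def is_tautology(formula, nvars):
    """Check a formula over all truth assignments - a definite yes/no
    computation, unlike statistical next-token prediction."""
    return all(formula(*vals) for vals in product([False, True], repeat=nvars))

# Modus ponens as a formula: ((x -> y) and x) -> y is always true.
modus_ponens = lambda x, y: implies(implies(x, y) and x, y)
assert is_tautology(modus_ponens, 2)

# By contrast, x -> y on its own is not a tautology.
assert not is_tautology(lambda x, y: implies(x, y), 2)
```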
|
# ¿ May 19, 2023 17:13 |
|
Liquid Communism posted:The AI's entire 'memory' consists of its training set. Hence why you cannot remove something from said training set without retraining the AI, or it will continue to use what has been indexed.
|
# ¿ May 22, 2023 14:41 |
|
Clarste posted:The idea is to define the program as something that cannot "learn" and can only "copy" so therefore anything in its training set is copying by definition. Like tracing. A computer cannot have a style, it can only trace things.
|
# ¿ May 22, 2023 17:04 |
|
Yeah you cannot copyright styles. That's never been the case, and doing it on a computer does not change that. https://www.thelegalartist.com/blog/you-cant-copyright-style
|
# ¿ May 22, 2023 17:41 |
|
Liquid Communism posted:Yep. Even the 'draw a thing in Bob Ross' style' prompt is a dodge, because the algorithm has no idea what Bob Ross' style is. It knows there were files in its training set that were human-tagged as being produced by or similar to Bob Ross, and will now iterate on parts of them to generate an image that the human user will then decide is or is not what they wanted.
|
# ¿ May 22, 2023 18:11 |
|
This thread would be a lot easier if people argued based on the ML models and copyright laws that actually exist. It seems that people think these models are some kind of database.
|
# ¿ May 22, 2023 18:13 |
|
Clarste posted:I am saying the law can declare it so regardless of what you or anyone thinks, and a lot of people with a lot of money have a vested interest in strong copyright laws. This isn't a philosophical discussion, the law is a tool that you use to get what you want. Clarste posted:I super do not see how this actually matters. You input copyrighted material into the machine. Whether it happened before or after "training" is 100% irrelevant to the issue of whether we want that to be a thing and how we might stop it.
|
# ¿ May 22, 2023 18:25 |
|
Clarste posted:Case law can go wherever it wants, but if people with money don't like where it went they can buy a senator or 50. All I have ever been saying is that the law can stop it if it wants to, and all these arguments about the internal workings of the machine or the nature of art are pretty irrelevant to that. cat botherer fucked around with this message at 18:36 on May 22, 2023 |
# ¿ May 22, 2023 18:34 |
|
StratGoatCom posted:AI, or very likely to have been trained on such, yes.
|
# ¿ May 22, 2023 23:44 |
|
StratGoatCom posted:Because it isn't
|
# ¿ May 23, 2023 00:00 |
|
It’s actually kind of astonishing how basic most of the math is. It’s just intuition on the best way to use it.
|
# ¿ May 23, 2023 23:57 |
|
SubG posted:No, the thing they cite is not The Bell Curve. They cite an opinion piece in The Wall Street Journal signed by 52 scientists. It's called Mainstream Science on Intelligence.
|
# ¿ May 24, 2023 02:18 |
|
Languages are indeed fuzzy, which is probably why computational linguistics hasn't made nearly as much progress as simpler statistical models on things like machine translation.
|
# ¿ May 24, 2023 16:45 |
|
English isn’t really “less structured” than Russian. What Russian conveys in conjugation and declension, English conveys with word order and sometimes more words, like auxiliary verbs. In linguistic terms, English is analytic in that it breaks things down, with a small ratio of morphemes (word parts) to words. Russian is the opposite in that it is a synthetic language. Speech in both languages can exist on a wide continuum of ambiguous to exact.
|
# ¿ May 24, 2023 17:33 |
|
NoiseAnnoys posted:exactly, thank you. gurragadon posted:So, its basically just elitism and snobbery from the people in Moscow? Like how French people think (used to think?) that regional accents weren't really French. e: ambiguous typo cat botherer fucked around with this message at 18:09 on May 24, 2023 |
# ¿ May 24, 2023 17:55 |
|
SubG posted:With the VM? Nah. There's no "curve of possibilities" because there's nothing indicating what the underlying distribution is. We can estimate e.g. the entropy of the script and from that estimate the amount of information the VM encodes...but to a first order approximation that just tells us how much additional information we'd need in order to produce a meaningful "solution". As you say, you never know what the underlying distribution is, but you also never can know the actual entropy of the script, because it depends on an unknown optimal code, or equivalently, a distribution, to describe it. Information entropy is just the expected value of the log probability, but that requires knowledge of the probability distribution to calculate in the first place. Thus, information theory is inseparable from probability. No matter what, there are assumptions that must be made, and anything we infer is colored by those decisions. cat botherer fucked around with this message at 00:10 on May 26, 2023 |
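A quick sketch of that dependence on assumptions: the same string gets a different entropy estimate under two different (both arbitrary) models, because H = -Σ p log₂ p only has a value once you've picked a distribution p.

```python
import math
from collections import Counter

def empirical_entropy(text):
    """Bits/char under the *assumed* model that characters are i.i.d.
    with their empirical frequencies."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def uniform_entropy(text):
    """Bits/char under a different assumption: uniform over the
    observed alphabet."""
    return math.log2(len(set(text)))

sample = "abracadabra"
h_emp = empirical_entropy(sample)   # ~2.04 bits/char
h_uni = uniform_entropy(sample)     # log2(5) ~ 2.32 bits/char

# Same text, two models, two different entropy estimates.
assert h_emp < h_uni
```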
# ¿ May 26, 2023 00:05 |
|
SubG posted:I understand what you're trying to say, but I don't see how this contradicts anything I said. The issue is that any estimation of the entropy of Voynichese, and therefore information in the VM, just tells us how much there is, not what it is. Put in slightly different terms: it lets us figure out how to compress the text, not how to decrypt/translate it.
|
# ¿ May 26, 2023 00:43 |
|
Bar Ran Dun posted:That’s amazing to me. These models are only copyrighted? They aren’t patenting these models? Even with an algorithmic patent, it’s extremely hard to prove anyone else is using it (it’s almost impossible to reach a level of evidence/suspicion to sue/get a subpoena) without access to the code. Getting a patent means you have to show the whole rear end of your algorithm, which is thus not a good idea given how easy it is to infringe. With a lot of this stuff, places like OpenAI will publish papers on sometimes innovative aspects of what they’re doing. However, if it’s anything like some places I’ve worked, they’re holding back some important but non-obvious practical details. They aren’t idiots. The concept of patents on algorithms is incoherent. “Math” results or techniques cannot be patented, but courts consider algorithms to not be part of math. Mathematicians and computer scientists disagree. cat botherer fucked around with this message at 01:34 on May 27, 2023 |
# ¿ May 27, 2023 01:22 |
|
StratGoatCom posted:If your model ate someone's stuff and it emulates it, you are not covered under fair use.
|
# ¿ May 27, 2023 01:27 |