|
Bob Morales posted:What the gently caress is a 'factory'? I know the Java joke is ObjectFactoryFactoryBeanFactory() but what is it? A factory is just a class that lets you make instances of other classes without having to figure out which constructor to use yourself. Let's say there's an interface called Shape, and it's got the methods draw() and getArea(). You might write a ShapeFactory class that returns an object of a different implementing class depending on what arguments you give its getShape() method. So say you have the method ShapeFactory.getShape(String shapeName, int size): when shapeName is "CIRCLE" it might return a Circle object with radius size, but when shapeName is "SQUARE" it might return a Square object with side length size. They'll have different draw() methods and different getArea() methods, but they can both be generated by ShapeFactory.getShape(). So you do all the work up front to figure out which lists of parameters lead to which constructors, and then you never worry about it again. If you still don't get it, maybe read this: http://www.tutorialspoint.com/design_pattern/factory_pattern.htm
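A minimal sketch of the ShapeFactory described above. The names Shape, draw(), getArea(), ShapeFactory.getShape(), Circle and Square come from the post; the method bodies and the exception for unknown names are assumptions made for illustration.

```java
// Sketch only: the bodies below are illustrative, not taken from any real library.
interface Shape {
    void draw();
    double getArea();
}

class Circle implements Shape {
    private final int radius;
    Circle(int radius) { this.radius = radius; }
    @Override public void draw() { System.out.println("Circle with radius " + radius); }
    @Override public double getArea() { return Math.PI * radius * radius; }
}

class Square implements Shape {
    private final int side;
    Square(int side) { this.side = side; }
    @Override public void draw() { System.out.println("Square with side " + side); }
    @Override public double getArea() { return (double) side * side; }
}

class ShapeFactory {
    // The caller names the shape it wants; the factory decides which constructor to call.
    static Shape getShape(String shapeName, int size) {
        switch (shapeName) {
            case "CIRCLE": return new Circle(size);
            case "SQUARE": return new Square(size);
            default: throw new IllegalArgumentException("Unknown shape: " + shapeName);
        }
    }
}
```

Calling code only ever sees the Shape interface, e.g. ShapeFactory.getShape("SQUARE", 3).getArea(), and never mentions Circle or Square directly.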
|
# ? Mar 23, 2016 22:08 |
|
HappyHippo posted:This is a classic example of Java forcing you to "noun your verbs" Java has static methods. So if you need a special method to create a Foo object a certain way, you can add it as a static method to the Foo class itself. Sure, "Foo" is a noun, but it's also a namespace. So that much is like any modern language. The problem in Java (prior to Java 8) is that it didn't allow static methods in interfaces. So, if "Foo" is an interface with multiple implementations, you couldn't add static methods to create different flavors of Foo objects. The traditional solution was to have a separate class, FooFactory, to store the different factory methods for creating Foo objects. Sometimes the factory class name doesn't actually contain "Factory", like how the Collections class contains a bunch of methods for creating different kinds of Collection objects. Java 8 allows for static methods in interfaces now, so Java no longer forces you to put factory methods in a separate class, but most APIs will probably continue to do so for Java 6/7 compatibility. HappyHippo posted:(you can't just build something, you need to get a builder and then tell it to build).
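To make the before/after concrete, here is a small sketch of both styles. Foo and FooFactory are the names used in the post; describe(), plain(), fancy(), PlainFoo and FancyFoo are hypothetical names invented for this example.

```java
interface Foo {
    String describe();

    // Java 8+: static factory methods can live on the interface itself.
    static Foo plain() { return new PlainFoo(); }
    static Foo fancy(String label) { return new FancyFoo(label); }
}

class PlainFoo implements Foo {
    @Override public String describe() { return "plain foo"; }
}

class FancyFoo implements Foo {
    private final String label;
    FancyFoo(String label) { this.label = label; }
    @Override public String describe() { return "fancy foo: " + label; }
}

// Pre-Java 8 style: the same factory methods parked in a separate class.
final class FooFactory {
    private FooFactory() {}  // not meant to be instantiated
    static Foo plain() { return new PlainFoo(); }
    static Foo fancy(String label) { return new FancyFoo(label); }
}
```

Either way the caller writes Foo.fancy("x") or FooFactory.fancy("x") and gets back a Foo without naming the implementing class.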
|
# ? Mar 23, 2016 22:54 |
|
ExcessBLarg! posted:That's an oversimplification, and not quite right either. Yes I was being a little facetious there. However a lot of tutorials on Design Patterns™ skip over little details like "do you even need this?" and people tend to cargo cult the poo poo out of them so it's important to have a critical eye. Case in point: Ok first of all let's look at the factory method itself: code:
code:
code:
|
# ? Mar 23, 2016 23:10 |
|
ExcessBLarg! posted:You can "just build" something. However, since Java lacks keyword arguments, the Builder pattern avoids having to have constructors with many arguments, or having to pass in a "config" object. It's solving a different problem than the lack of global-scope methods. Builders also make it a lot more pleasant to create large numbers of similar objects, especially if they're immutable and can thus share internal state. You fill in the bulk of the builder with the stuff that's identical across all objects you're creating, then you tweak the few things that are different, create an object, tweak again, create another object, etc.
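A rough sketch of the "fill in the common parts, tweak, build, tweak, build" workflow described above. Request, its fields and their defaults are hypothetical and exist only to show the shape of the pattern.

```java
// Immutable value object; the only way to create one is through its Builder.
final class Request {
    private final String host;
    private final int port;
    private final int timeoutMillis;
    private final String path;

    private Request(Builder b) {
        this.host = b.host;
        this.port = b.port;
        this.timeoutMillis = b.timeoutMillis;
        this.path = b.path;
    }

    String describe() {
        return host + ":" + port + path + " (timeout " + timeoutMillis + " ms)";
    }

    static final class Builder {
        // Defaults stand in for the keyword arguments Java does not have.
        private String host = "localhost";
        private int port = 80;
        private int timeoutMillis = 1000;
        private String path = "/";

        Builder host(String host) { this.host = host; return this; }
        Builder port(int port) { this.port = port; return this; }
        Builder timeoutMillis(int timeoutMillis) { this.timeoutMillis = timeoutMillis; return this; }
        Builder path(String path) { this.path = path; return this; }

        // Each call snapshots the builder's current state into a new immutable object.
        Request build() { return new Request(this); }
    }
}
```

With this, one builder configured with host and port can produce many requests that differ only in path: set the path, call build(), set another path, call build() again.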
|
# ? Mar 23, 2016 23:17 |
|
Bob Morales posted:What the gently caress is a 'factory'? I know the Java joke is ObjectFactoryFactoryBeanFactory() but what is it? An ObjectFactoryFactoryBeanFactory is obviously a class that has a method that produces an object with simple get/set methods that has a member that is a class that has a method that produces an object of a class that has a method that produces an object of class Object. I mean, it's right there in the name.
|
# ? Mar 24, 2016 05:08 |
|
I feel like I post this link a lot, but http://java.metagno.me/ A lot of the real horrorshow classes that give the builder paradigm a bad name come from Spring.
|
# ? Mar 24, 2016 06:16 |
|
EDIT: Never mind I was being an idiot and trying to use a Ruby Controller when I should have been using the model.
Tea Bone fucked around with this message at 14:25 on Mar 24, 2016 |
# ? Mar 24, 2016 13:14 |
|
This is kind of a vague and open-ended question... I'm trying to think of any actual or hypothetical cool/interesting/novel uses of machine learning in line of business applications and I can't think of anything. I'm thinking like...something a store owner would use to manage inventory, a property owner would use to manage their property, a trucking company would use to manage their fleet, etc. Anything come to anyone's mind?
|
# ? Mar 24, 2016 14:37 |
|
Thermopyle posted:This is kind of a vague and open-ended question... The problem you're running into is probably that most of these tasks either don't have the data to train a good model, exact methods exist, good approximate methods exist, or they don't want to deal with something that spits out an answer that could be wrong and you have no idea how often it's wrong, how confident the prediction is, or why it made that guess. Most of the stuff would probably just be predictive recommendations, like "hey, your store sold a lot of Barbies, maybe you'd like to stock Monster High dolls too next season?" Or it just finds obvious answers. Like, my university once spent a few terms on a research project with the local fire department trying to optimize their dispatch routes, only to discover they didn't need an entire fire station. The fire chief knew this, it was only built because they got a grant on the insistence they build a new fire station. And the predictions were basically what their two or three dispatchers were doing already anyway, so the automation didn't provide any extra insight. Personally I'm working on some research in the field of confident predictions -- an agent that can recognize when it has no earthly idea what it's doing and ask for help from an expert. The end goal is to make something that can safely automate things like power plants by taking the burden off human controllers, but not just go off the rails and send the core into meltdown when it's not sure what to do. But it's pretty nascent technology. Edit: There's been some talk in my department about trying to develop agents for things like police or construction training simulations, it's being considered semi-seriously so at least some smart people think it's feasible, but there's not much work on it. Linear Zoetrope fucked around with this message at 14:58 on Mar 24, 2016 |
# ? Mar 24, 2016 14:52 |
|
Thermopyle posted:This is kind of a vague and open-ended question... I think you need to first figure out if you could ever convince the business user that an AI would be helpful, and if you could ever recoup costs. Watson is being used for medical diagnosis and offering advice for lawyers, but that's really just serving as a specialized version of Google. Machine learning still requires some scale to be worthwhile.
|
# ? Mar 24, 2016 16:18 |
|
Web-based line of business apps could have some sort of scale, but maybe not in a way that's useful. Imagine a truckers.com provides something where fleet managers can sign up for an account and input data about their trucks and truckers and the truckers can input data about their trips on the fly and devices attached to the trucks upload data...you could have some sort of scale across all of the accounts. Of course you have privacy/security implications to think about. Anyway, I'm not actually proposing any solution, just wondering if anyone has done it or thought of doing it.
|
# ? Mar 24, 2016 16:44 |
|
I was thinking you could essentially print money if you could make a system that could learn to reliably transform business data. There's already tons of money in that space, so if you could undercut Data Transform Consultants Local Number 5032 by 10% with the kind of margins that you could get by spinning up instances on AWS/Azure on demand you'd probably be raking in billions in under five years. I kind of doubt it's possible to do with ML in its modern state, but I'm no expert.
|
# ? Mar 24, 2016 19:29 |
|
Munkeymon posted:I was thinking you could essentially print money if you could make a system that could learn to reliably transform business data. There's already tons of money in that space, so if you could undercut Data Transform Consultants Local Number 5032 by 10% with the kind of margins that you could get by spinning up instances on AWS/Azure on demand you'd probably be raking in billions in under five years. I kind of doubt it's possible to do with ML in its modern state, but I'm no expert. Most of the difficulty of data transformation stuff ends up being either accommodating dumb business demands (has to always be upper case and have three trailing spaces), or actually understanding the rats nest people have gotten themselves into. The rules creation part is usually rather easy, it's the people surrounding things that make it so difficult.
|
# ? Mar 24, 2016 19:33 |
|
Munkeymon posted:I was thinking you could essentially print money if you could make a system that could learn to reliably transform business data. There's already tons of money in that space, so if you could undercut Data Transform Consultants Local Number 5032 by 10% with the kind of margins that you could get by spinning up instances on AWS/Azure on demand you'd probably be raking in billions in under five years. I kind of doubt it's possible to do with ML in its modern state, but I'm no expert. You're basically talking about every ETL platform plus maybe Drools.
|
# ? Mar 24, 2016 19:53 |
|
Unrelated to my brain droppings above, I did a stupid in WebStorm. I opened a .dust file and dismissed some message that popped up telling me the highlighting was being overridden by some setting, thinking I'd just install a plugin and it wouldn't matter. Well, I installed the Dust plugin and I'm still not getting highlighting, so clearly I need to (un)tick a box somewhere but I don't see where. Does this sound familiar to anyone?
Blinkz0rz posted:You're basically talking about every ETL platform plus maybe Drools. But you still have to have a well-paid expert understand the rules well enough to describe the transforms in some formal language. I was thinking about using some before/after examples as training data and having the system derive the transform rules on its own. Again, it seems unlikely to be feasible (possible?) for the foreseeable future, especially given how weird edge cases crop up that would probably gently caress up the training. Maybe an intermediate step could be software that figures out some of the rules and auto-gens some code for an expert to start from to save some skilled labor, though.
|
# ? Mar 24, 2016 21:13 |
|
Thermopyle posted:This is kind of a vague and open-ended question... I spent a couple years doing quantitative retail for a large company that you've heard of. It's not entirely clear what's machine learning and what's statistics, stochastic optimization or stochastic control, but there are a lot of fairly advanced methods being used in getting sweaters from a warehouse to you. The major areas where quants work that I can think of off the top of my head are sales forecasting, inventory management, shipping, pricing, competitor price discovery, ad placement, search and recommendations. There are bodies of literature describing how to solve any of those problems, so if you're looking to do something novel, you definitely have to argue that what you have in mind is better than the standard solutions. Nobody ever really thought about applying machine learning to ETL. The problem there, in my experience, is that 80% of the data is regular enough to be described by simple rules, and the remaining 20% are a bunch of weird edge cases that you have to get right. That's really not the sort of scenario where an ML system is going to look good in terms of ROI. Thermopyle posted:Web-based line of business apps could have some sort of scale, but maybe not in a way that's useful. Imagine a truckers.com provides something where fleet managers can sign up for an account and input data about their trucks and truckers and the truckers can input data about their trips on the fly and devices attached to the trucks upload data...you could have some sort of scale across all of the accounts. Of course you have privacy/security implications to think about. I'm not really sure how much of a market there is in providing services like what I did to smaller businesses. It'd have to be pretty cheap because they don't have the sort of problems where a reasonable solution is that much worse than the optimal solution, and the people who can do these sorts of models tend to be somewhat expensive. 
On the other hand, there are companies trying to do third party analytics, so maybe it can work.
|
# ? Mar 25, 2016 00:34 |
|
What is the difference between casting an object and calling a decorator function on an object? I had a dream where someone explained this to me and something clicked but I just woke up and forgot their argument. It had something to do with arguments and value vs reference.
Marx Headroom fucked around with this message at 15:10 on Mar 26, 2016 |
# ? Mar 26, 2016 15:05 |
|
Mr. Jive posted:What is the difference between casting an object and calling a decorator function on an object? I had a dream where someone explained this to me and something clicked but I just woke up and forgot their argument. It had something to do with arguments and value vs reference. Casting an object is re-interpreting some location in memory as a different type. You usually explicitly cast things when going from a less-specific (superclass) to a more specific (subclass) thing. This more specific thing usually has something "additional" you want to be able to use. If you squint hard enough, this is kinda similar to decorators, which accept some thing (an object, a function) and usually add some functionality to it.
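A small Java sketch of the contrast, with hypothetical names (Greeter, LoudGreeter, PoliteGreeter) invented for the example. In Java specifically, a reference cast does not touch the object at all: it only changes the static type you view it through and is checked at runtime. A decorator, by contrast, is a new object that wraps the old one.

```java
interface Greeter {
    String greet();
}

class LoudGreeter implements Greeter {
    @Override public String greet() { return "HELLO"; }
    // Extra capability that is only reachable through the more specific type.
    String shout() { return "HELLO!!!"; }
}

// Decorator: wraps any Greeter and adds behaviour around the original call.
class PoliteGreeter implements Greeter {
    private final Greeter inner;
    PoliteGreeter(Greeter inner) { this.inner = inner; }
    @Override public String greet() { return inner.greet() + ", if you please"; }
}
```

Given Greeter g = new LoudGreeter(), the cast (LoudGreeter) g yields the very same object and merely makes shout() callable, while new PoliteGreeter(g) yields a different object whose greet() builds on g's.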
|
# ? Mar 26, 2016 17:25 |
|
Gravity Pike posted:An ObjectFactoryFactoryBeanFactory is obviously a class that has a method that produces an object with simple get/set methods that has a member that is a class that has a method that produces an object of a class that has a method that produces an object of class Object. I mean, it's right there in the name. Just reminded of the old joke: quote:I had a problem and used Java.
|
# ? Mar 27, 2016 21:51 |
|
Not sure if this is the best place to ask but is BitTorrent resilient to any kind and magnitude of data corruption? For instance, if I have a completed torrent on disk and I randomly flip, delete and add bits all over the place, is there enough error correction data in the torrent to be able to correct all of those data alterations? Of course there's a point where the data will no longer resemble the original data at all but still, will BT just clobber the entire bad file or will it just throw its hands up?
|
# ? Mar 28, 2016 00:20 |
|
Shaocaholica posted:Not sure if this is the best place to ask but is BitTorrent resilient to any kind and magnitude of data corruption? For instance, if I have a completed torrent on disk and I randomly flip, delete and add bits all over the place, is there enough error correction data in the torrent to be able to correct all of those data alterations? Of course there's a point where the data will no longer resemble the original data at all but still, will BT just clobber the entire bad file or will it just throw its hands up? I don't think the torrent itself has any error correction built in, but it will be able to detect that a given block's hash is no longer good and then download that block again.
|
# ? Mar 28, 2016 00:25 |
|
Skandranon posted:I don't think the torrent itself has any error correction built in, but it will be able to detect that a given block's hash is no longer good and then download that block again. Yeah, the torrent file itself is basically just a list of filenames and metadata plus a checksum for each block of data. Barring hash collisions or absurd luck, anywhere between a single bit flipped in each block and a file of entirely random data will just end up resulting in the entire file being redownloaded from peers.
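The per-block check can be sketched roughly as follows. In the original BitTorrent format the blocks are fixed-size "pieces" and the checksum is a SHA-1 hash per piece; PieceChecker and its method names are hypothetical, and a real client reads the expected hashes from the .torrent file rather than computing them locally as the example does.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PieceChecker {
    // Hash every fixed-size piece of the data (the last piece may be shorter).
    static List<byte[]> pieceHashes(byte[] data, int pieceLength) {
        List<byte[]> hashes = new ArrayList<>();
        for (int off = 0; off < data.length; off += pieceLength) {
            int len = Math.min(pieceLength, data.length - off);
            hashes.add(sha1(data, off, len));
        }
        return hashes;
    }

    // Indices of pieces whose hash no longer matches; a client would fetch these again.
    static List<Integer> badPieces(byte[] data, int pieceLength, List<byte[]> expected) {
        List<byte[]> actual = pieceHashes(data, pieceLength);
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < expected.size(); i++) {
            if (i >= actual.size() || !Arrays.equals(actual.get(i), expected.get(i))) {
                bad.add(i);
            }
        }
        return bad;
    }

    private static byte[] sha1(byte[] data, int off, int len) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            md.update(data, off, len);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);  // SHA-1 ships with every Java platform
        }
    }
}
```

A flipped bit only invalidates the piece it falls in. An inserted or deleted byte shifts everything after it, so in this scheme every later piece would be expected to fail its check as well.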
|
# ? Mar 28, 2016 02:46 |
|
Plorkyeran posted:Yeah, the torrent file itself is basically just a list of filenames and metadata plus a checksum for each block of data. Barring hash collisions or absurd luck, anywhere between a single bit flipped in each block and a file of entirely random data will just end up resulting in the entire file being redownloaded from peers. Skandranon posted:I don't think the torrent itself has any error correction built in, but it will be able to detect that a given block's hash is no longer good and then download that block again. Thanks. What if the file size is different which would result in blocks being offset? I guess that would just fall into the 'all blocks bad, get all blocks'? I guess that's really down to the fine details of how BT checks blocks.
|
# ? Mar 28, 2016 04:10 |
|
Shaocaholica posted:Thanks. What if the file size is different which would result in blocks being offset? I guess that would just fall into the 'all blocks bad, get all blocks'? I guess that's really down to the fine details of how BT checks blocks. The torrent contains all the file metadata as well, so if the file size/name changed, then it would recognize it's not right, and a client would either throw an error or start getting all blocks again.
|
# ? Mar 28, 2016 04:52 |
|
I'm trying to implement the Newton-Raphson iteratively-reweighted least squares method of logistic regression in Matlab for homework. However, I'm finding that I'm encountering a lot of matrices which are "singular to working precision". The update function does this: (For ease of typing I'll call Phi in the above image X.) I calculate y, R, and z, then plug everything in to the second line in that image. I've narrowed down the problematic matrix to X' * R * X. Is there some other way I should be doing this?
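The image with the update formula did not survive. Assuming it showed the standard textbook form of the IRLS update for logistic regression, which is consistent with the y, R and z named above, it would read roughly as follows. Here Φ is the N×M design matrix (called X in the post), t is the vector of targets and σ is the logistic sigmoid; these symbol choices are assumptions.

```latex
\begin{aligned}
\mathbf{w}^{\text{new}}
  &= \mathbf{w}^{\text{old}} - (\Phi^{\top} R\, \Phi)^{-1} \Phi^{\top} (\mathbf{y} - \mathbf{t}) \\
  &= (\Phi^{\top} R\, \Phi)^{-1} \Phi^{\top} R\, \mathbf{z}, \\
\mathbf{z} &= \Phi \mathbf{w}^{\text{old}} - R^{-1} (\mathbf{y} - \mathbf{t}), \\
R_{nn} &= y_n (1 - y_n), \qquad
y_n = \sigma\!\left( (\mathbf{w}^{\text{old}})^{\top} \boldsymbol{\phi}_n \right).
\end{aligned}
```

In this form R is an N×N diagonal matrix and Φ^T R Φ is M×M, so the inverse in the second line only exists when Φ has full column rank and no R_nn has collapsed to zero.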
|
# ? Mar 28, 2016 19:09 |
|
hooah posted:I'm trying to implement the Newton-Raphson iteratively-reweighted least squares method of logistic regression in Matlab for homework. However, I'm finding that I'm encountering a lot of matrices which are "singular to working precision". The update function does this: Not unless X is square (and invertible), in which case you can compute inv(X)*inv(R)*inv(X'), but X usually isn't square, so I don't think that's it. The usual culprits in logistic regression tend to be either a subtly incorrect logistic function, or not normalizing the features. Also make sure you don't have a flipped sign somewhere, that's bitten me before. It could also be that you're not detecting convergence quickly enough/doing too many iterations. What does the result look like if you just escape when you get the error and use the trained function? Does it have reasonable accuracy? Linear Zoetrope fucked around with this message at 19:29 on Mar 28, 2016 |
# ? Mar 28, 2016 19:26 |
|
Given a log file with two rows like this:
code:
[line A] "\"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0\""
[line B] "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
(I added the [line A] and [line B] tags; they are not part of the log file.) I am trying to import each line into a database table where double quotes indicate the start and end of a text string, and the weird \" sequences at the front and back of the string in [line A] are breaking the import process. Other patterns in the thousands of log files I am trying to import also involve nested double quotes that break my quote-delimited strings. I would like a regex that eliminates the nested double quotes inside a text string without stripping the enclosing double quotes of a "good" string. Or is there a better way of making the strings consistent?
|
# ? Mar 28, 2016 19:49 |
Agrikk posted:I would like to identify a regex string that will eliminate the nested double quotes from within the text string, without stripping the double quotes in a "good" string. Or is there a better way of making the strings consistent? If you have a CSV parsing library available, and you can configure all delimiters it uses, try telling it to use space as field delimiter, and double quotes with backslash escapes for quoted strings. That should get you correct handling.
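If no configurable CSV library is at hand, the rule described above (space as field delimiter, double-quoted strings, backslash escapes inside quotes) is small enough to sketch by hand. LogFields.split is a hypothetical helper, not part of SSIS or any real library, and it makes no attempt to cover other escaping conventions.

```java
import java.util.ArrayList;
import java.util.List;

class LogFields {
    // Split on spaces, except inside double-quoted fields.
    // Inside quotes, a backslash makes the next character literal (so \" is a quote).
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char ch = line.charAt(i);
            if (inQuotes) {
                if (ch == '\\' && i + 1 < line.length()) {
                    cur.append(line.charAt(++i));  // escaped character, taken literally
                } else if (ch == '"') {
                    inQuotes = false;              // closing quote ends the quoted part
                } else {
                    cur.append(ch);
                }
            } else if (ch == '"') {
                inQuotes = true;                   // opening quote, not part of the value
            } else if (ch == ' ') {
                fields.add(cur.toString());        // field boundary
                cur.setLength(0);
            } else {
                cur.append(ch);
            }
        }
        fields.add(cur.toString());                // last field on the line
        return fields;
    }
}
```

On a line shaped like [line A], the whole user agent comes back as one field with its inner quotes preserved, regardless of how many words it contains. A field that legitimately ends in a backslash would be misread, so this is only a starting point.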
|
|
# ? Mar 28, 2016 19:58 |
|
Agrikk posted:I would like to identify a regex string that will eliminate the nested double quotes from within the text string, without stripping the double quotes in a "good" string. Or is there a better way of making the strings consistent? Jesus Christ, no. You want to bind your parameters into a query, and let your query library handle escaping quotes, etc. for you. In general, if you are generating a query by inserting variables into strings, You Are What Is Wrong With Databases. A proper query gen should look something like this: code:
|
# ? Mar 28, 2016 19:59 |
|
TooMuchAbstraction posted:Jesus Christ, no. You want to bind your parameters into a query, and let your query library handle escaping quotes, etc. for you. In general, if you are generating a query by inserting variables into strings, You Are What Is Wrong With Databases.
|
# ? Mar 28, 2016 20:02 |
|
Your Φ isn't rank-deficient, is it? ΦT R Φ will be singular if so, and that's a simple case that's worth ruling out early. Lysidas fucked around with this message at 20:13 on Mar 28, 2016 |
# ? Mar 28, 2016 20:02 |
|
TooMuchAbstraction posted:Jesus Christ, no. You want to bind your parameters into a query, and let your query library handle escaping quotes, etc. for you. In general, if you are generating a query by inserting variables into strings, You Are What Is Wrong With Databases. mystes posted:I want to somehow believe the question was about parsing the string to insert different parts of it into different columns, but then the database part would have been irrelevant to the question, so I don't know what to think. This is correct. SSIS requires that I have columns defined for the ETL job where the columns match up with the delimiters (in this case the delimiter is a space, and text strings are wrapped with double quotes). Because there exist nested double quotes, the ETL job is parsing columns incorrectly and the import fails. All I'm looking for is to remove any nested double quotes. Agrikk fucked around with this message at 20:28 on Mar 28, 2016 |
# ? Mar 28, 2016 20:25 |
Agrikk posted:This is correct. SSIS requires that I have columns defined for the ETL job where the columns match up with the delimiters (in this case the delimiter is a space, and text strings are wrapped with double quotes). Because there exist nested double quotes, the ETL job is parsing columns incorrectly and the import fails. As I wrote, any chance of handling it as if it was CSV, just with spaces instead of commas? Because then you should get the quote handling for free.
|
|
# ? Mar 28, 2016 20:39 |
|
Okay, sorry for leaping to bad conclusions, but I heard "use a regex to fix my database input" and everything went red for a moment there. So what exactly are you looking to do here? Just remove the quotation mark? If it's always preceded by a \ then that's pretty easy: code:
Do nested quotation marks only ever show up in the last "entry" on the line? Then you can do a greedy match up to the last quotation mark that precedes a "bad" mark, split the line into "good" and "bad" entries, and remove quotation marks from the "bad" section.
|
# ? Mar 28, 2016 20:43 |
|
nielsm posted:As I wrote, any chance of handling it as if it was CSV, just with spaces instead of commas? Because then you should get the quote handling for free. No, that won't work because the userAgent string can also be a random number of words that would then get broken up into a random number of columns. TooMuchAbstraction posted:Okay, sorry for leaping to bad conclusions, but I heard "use a regex to fix my database input" and everything went red for a moment there. No problem. The problem is the random nature of the user agent string. It is wrapped by quotes to indicate it is a string, but it can have a number of patterns within it, including varying number of words, included double and single quotes and whatnot, and it doesn't appear at the end of the line. What I'm trying to avoid is having the ETL process bomb out and then me having to add another find/replace special case to a preprocessing step that filters out the weird double quote combinations.
|
# ? Mar 28, 2016 21:01 |
|
Lysidas posted:Your Φ isn't rank-deficient, is it? ΦT R Φ will be singular if so, and that's a simple case that's worth ruling out early. Yeah, evidently it is. The original data is from a spam database which has 4,600 examples and 57 attributes and one class label. For whatever reason, the professor modified the data "by splitting above and below the mean count". This results in 114 attributes and 2 class labels (which don't seem to have any relation to the labels file he gave us, but whatever). I'm assuming the rank-deficiency is due to this meddling with the data, since the rank of the data matrix is only 58. I'll ask him tomorrow what we're supposed to do with that weird format. Thanks for helping me narrow it down.
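One possible explanation for the rank of 58, assuming (this is a guess about the preprocessing, not something stated in the post) that each original attribute j was replaced by two complementary indicator columns, a_j for "above the mean" and b_j for "below the mean":

```latex
\mathbf{a}_j + \mathbf{b}_j = \mathbf{1} \quad (j = 1, \dots, 57)
\;\Longrightarrow\;
\operatorname{span}\{\mathbf{a}_1, \mathbf{b}_1, \dots, \mathbf{a}_{57}, \mathbf{b}_{57}\}
\subseteq
\operatorname{span}\{\mathbf{1}, \mathbf{a}_1, \dots, \mathbf{a}_{57}\}
\;\Longrightarrow\;
\operatorname{rank}(\Phi) \le 58,
\qquad
\operatorname{rank}(\Phi^{\top} R\, \Phi) \le \operatorname{rank}(\Phi) \le 58 < 114 .
```

Under that assumption the 114×114 matrix Φ^T R Φ can never be inverted, and dropping one column from each pair while keeping a single constant column would remove the exact linear dependence.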
|
# ? Mar 28, 2016 21:09 |
Agrikk posted:No, that won't work because the userAgent string can also be a random number of words that would then get broken up into a random number of columns.
Really?
1. Replace \" sequences with "" sequences
2. Import as CSV with space as field delimiter
Works in Excel at least, and if you're using MS SSIS I'd assume it uses compatible CSV reader code. Other CSV readers may use different escaping rules for quote marks. Yes, I'm really set on this idea. If you can use a real parser, do so. The format seems to be written for CSV-style reading.
nielsm fucked around with this message at 21:30 on Mar 28, 2016 |
|
# ? Mar 28, 2016 21:19 |
|
nielsm posted:Really? Hrm... Closer. But SSIS barfs on one of the quotes: MS is notoriously bad at consistency between SQL Server, Excel, Access, etc in terms of how it handles data. Oh well, I think I'll just create a library of substitutions to make.
|
# ? Mar 28, 2016 21:36 |
|
TooMuchAbstraction posted:
You suffer from the same affliction I try to rid myself of: gratuitous use of cat (sed can take a filename as an arg) Other commands I used to do that poo poo with regularly: less, grep, and their z* variants
|
# ? Mar 28, 2016 22:03 |
|
No Safe Word posted:You suffer from the same affliction I try to rid myself of: gratuitous use of cat (sed can take a filename as an arg) You're right, of course. But I didn't know about zless, zgrep, etc. That's cool!
|
# ? Mar 28, 2016 22:35 |