So another way to formulate the problem would be finding a smallest set of items to remove from the current ordering, to leave the remaining items in correct order. After solving that it's just determining ordering numbers for the removed items to insert them correctly which should be trivial. The problem is just that the number of possible subsets is exponential in the size of the set. My best guess at a heuristic is still finding correctly ordered subsets, you should be able to do that in linear time. If the actual data is already partially sorted the number of these subsets should be relatively low, and you might be able to just do an exhaustive search for the best set of those to get the smallest set to remove. If your data is hopelessly shuffled, I don't think you're going to find an efficient reordering solution in short time, and your time is better spent sending off reordering requests for each item.
|
|
# ? Apr 14, 2015 19:24 |
|
|
# ? Jun 11, 2024 04:42 |
|
What would you do if the list came down reversed?
|
# ? Apr 14, 2015 19:39 |
|
ultrafilter posted:What would you do if the list came down reversed? I'd probably just renumber the whole thing even though in the case of odd number of items I could do it in len(items) -1 requests. As the most common case is likely that a large percentage of the items are already numbered correctly, I'm going to investigate some heuristics around this idea: nielsm posted:My best guess at a heuristic is still finding correctly ordered subsets, you should be able to do that in linear time. If the actual data is already partially sorted the number of these subsets should be relatively low, and you might be able to just do an exhaustive search for the best set of those to get the smallest set to remove.
|
# ? Apr 14, 2015 19:54 |
|
My intuition tells me your problem is related to the Levenshtein distance between the initial string and the final in order string. You might want to look that up.
|
# ? Apr 14, 2015 20:13 |
|
I'm not sure if this question is appropriate for this thread or not, but let's give it a shot... I'm part support, part developer, part QA, part DBA at my company. Recently I've had to do a lot more code diving than what my Not-Computer-Science degree prepared me for. There is little documentation on how it works and I'm kind of out my league. What is the best way for me to go about learning how my company's program works and documenting it so that it both sticks with me and the poor sap after me has an easier time? I understand it's going to be a long, grueling process of going through code line-by-line to try and see why and how things work and it'll likely take months of work, but I feel like it's the only way I'll learn. Could anyone suggest books, blogs, methods, or anything to help actually dive into byzantine code and understand it?
|
# ? Apr 14, 2015 22:10 |
|
HappyHippo posted:My intuition tells me your problem is related to the Levenshtein distance between the initial string and the final in order string. You might want to look that up. The optimal number of renumerations is certainly an edit distance, but not Levenshtein. You can still use Wagner-Fischer (the standard edit distance algorithm) with arbitrarily large transpositions as your one edit operation to get the answer, but you'll end up with O(n^2) complexity. However: you should be able to boil the problem down to operating on correctly ordered subsequences of the list, where correctly ordered means the elements in the subsequence are in exactly the same relative position as they would be in the sorted list. Finding those is potentially somewhat expensive, but should be cheap for nearly-sorted inputs. Then you can solve it greedily by renumerating the shortest out-of-order subsequence until done and still be optimal. I think. Hopefully. [E:] I should mention the above is still O(n^2) in general and if your list isn't nearly sorted then it sucks. Still, in that case you need to renumerate everything so aiming for fast in the nearly-sorted case is probably the best call you can make. Xerophyte fucked around with this message at 00:32 on Apr 15, 2015 |
# ? Apr 15, 2015 00:25 |
|
If you take all of the sorted sequential subarrays and merge them together, you can use something like insertion sort to place the remaining elements into the result. That might be the best approach here.
|
# ? Apr 15, 2015 00:29 |
|
My gut feeling is that most potential complexity issues will be completely overshadowed by the saved network overhead at average list lengths for the average out-of-orderness. We'll see when I tackle the problem again in a couple days (so I have time to think about it some more and tackle it afresh). Also, thanks for talking about it, you guys are giving me some good ideas to try out.
|
# ? Apr 15, 2015 00:32 |
|
Thermopyle posted:I've muddled through a suboptimal solution for this, but it feels gross and way complicated and I don't really know what to Google for. Is there a general strategy for this type of operation? You have a permutation that you're trying to reverse, so maybe try to decompose it as disjoint cycles and work on each one separately?
|
# ? Apr 15, 2015 03:36 |
|
I updated my rbenv Ruby version to from 2.1.0 to 2.3.0dev, and now I get this, whenever I run a CLI gem: Ignoring [gem] because its extensions are not built. Try: gem pristine [gem] —version x.y.z I get dozens of lines of this for different gems. I also get this, after I switched back to 2.1.0. I use Ruby exclusively for gems, so I suck utterly at managing versions, PATH, and whatnot. I also have rvm installed, so the chance of blowing it all up is non-zero. Any idea why this blew up, and why it stays that way, even when I switch back to 2.1.0?
|
# ? Apr 15, 2015 10:20 |
|
I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane.
|
# ? Apr 15, 2015 14:09 |
|
Ederick posted:I'm not sure if this question is appropriate for this thread or not, but let's give it a shot... I'm part support, part developer, part QA, part DBA at my company. Recently I've had to do a lot more code diving than what my Not-Computer-Science degree prepared me for. There is little documentation on how it works and I'm kind of out my league. What is the best way for me to go about learning how my company's program works and documenting it so that it both sticks with me and the poor sap after me has an easier time? I'd recommend stepping through code in a debugger as much as possible - especially if you're unsure on how something works. Although when dealing with old code that has been heavily modified over the years (and not cleaned up), even something that looks obvious may not be. As you go along, be sure to document why code does what it does when you figure it out. In the older product that I work on, that's always the tricky part. Figuring out what the code is doing isn't as difficult as understanding why. What language is it in? If you have a decent IDE, that can help a lot with tracking down where functions are called and things like that. Something older might involve a lot of grepping.
|
# ? Apr 15, 2015 14:20 |
|
EAT THE EGGS RICOLA posted:I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane. That's pretty standard. Did you ask this person why he/she thought it was insane?
|
# ? Apr 15, 2015 14:25 |
|
EAT THE EGGS RICOLA posted:I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane. Do they have a better idea? 00:00 seems sane enough to me.
|
# ? Apr 15, 2015 14:26 |
|
EAT THE EGGS RICOLA posted:I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane.
|
# ? Apr 15, 2015 14:32 |
|
KARMA! posted:That's pretty standard. Did you ask this person why he/she thought it was insane? They want it to be 08:30, which will be the standard release datetime for products in the future (but all existing dates are already in the past)
|
# ? Apr 15, 2015 14:33 |
|
EAT THE EGGS RICOLA posted:I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane. 00:00 in which time zone?
|
# ? Apr 15, 2015 14:50 |
|
csammis posted:00:00 in which time zone? ISO8601 requires either local time, utc, or utc+offset, but all of these dates are standardized for the same time zone.
|
# ? Apr 15, 2015 14:54 |
|
EAT THE EGGS RICOLA posted:They want it to be 08:30, which will be the standard release datetime for products in the future (but all existing dates are already in the past) Systems, processes, and standard release times can and will change in the future. I wouldn't add some arbitrary time to previous data based on that. 00:00 also acts as a good marker to show where your data previously did not have that level of precision.
|
# ? Apr 15, 2015 14:58 |
|
Bognar posted:Systems, processes, and standard release times can and will change in the future. I wouldn't add some arbitrary time to previous data based on that. 00:00 also acts as a good marker to show where your data previously did not have that level of precision. YES THANK YOU
|
# ? Apr 15, 2015 15:04 |
|
If all that logic isn't helping, maybe clearly laying out the other 1438 possible values and coming up with pros/cons would help. Here's the list right now: 1. 00:00 - Simple, shows when real time was added 2. 08:30 - Magical value from the future that will be tacked onto everything 3. 08:29 - Sorts before 08:30, also shows when real time was added
|
# ? Apr 15, 2015 15:09 |
|
EAT THE EGGS RICOLA posted:ISO8601 requires either local time, utc, or utc+offset, but all of these dates are standardized for the same time zone. If you're 100% positive it's only going to ever be in the same timezone it's okay to leave it local, but I'd consider adding the correct utc+offset anyway to future proof yourself. Our application has a datatype stored as YYYY-MM-DD and often we get requests to update fields of that type to YYYY-MM-DDTHH:MM - which today is used globally with instances crossing timezones routinely, but back in the day someone thought "hey a single instance of this application won't ever be used across timezones." Now there's no single zone which will make the dates "correct" for everyone if we add a time part. Well, time makes fools of us all And for the record I'm voting 00:00 too for all the reasons above.
|
# ? Apr 15, 2015 15:14 |
|
csammis posted:If you're 100% positive it's only going to ever be in the same timezone it's okay to leave it local, but I'd consider adding the correct utc+offset anyway to future proof yourself. This is an application for a government that is limited in geolocation, so although I absolutely can't be 100% positive that it won't ever need to expand beyond the area, it's the likeliest scenario I can think of where local time is okay (and the pushback from people that don't understand things to trying to do something 'weird' like use utc/utc+offset will be ridiculous).
|
# ? Apr 15, 2015 15:23 |
|
EAT THE EGGS RICOLA posted:This is an application for a government that is limited in geolocation,
|
# ? Apr 15, 2015 18:40 |
|
I've been spending the past few weeks debugging stuff in a web service, the behavior of which is highly dependent on the local time of the client. Which the client is not required to provide in their request. Which means that we use geoIP location. Except for mobile, where we use lat/long. But sometimes zip code. And sometimes we have no idea and treat them as if they could be anywhere in the continental US. gently caress timezones forever.
|
# ? Apr 15, 2015 21:34 |
|
Got this problem that seemed easy at first, thought I'd solved it and was wrong. Pretty sure I have it now, but it feels clunky. I scan a directory of files and create 'groups.' Let's say the file names are NAME+sequenceNumber. AAA-001, AAA-002, BBB-001, etc. Each group is just the name plus every other file in it's sequence, so out of these three files There would be 2 groups, AAA(containing 2 files) and BBB containing 1 file. I only want a complete group. If all files in the sequence aren't there yet, I want to ignore it... but there is no way to know how many items are in a sequence(some files might not yet be written). A sequence is always written at the same time though, so all of group AAA would be written within a 1-10 second span. At first, I just scanned the files and ignored files younger than 30 seconds. That appeared to work but didn't really of course, because a sequence could possibly straddle the 30 second mark. Now I scan the files, and build two sets of groups, files older than 30 seconds and files younger than 30 seconds. After the scan, I check every file in the younger groups to see if a group name matches one in the older groups and add it in. Is there a better way?
|
# ? Apr 16, 2015 06:00 |
|
Well, I don't think you're going to be able to solve that problem deterministically. You're going to have to accept some error, or convince whoever wrote whatever process is outputting these files to start off with writing an A-METADATA file or something that encodes the ultimate length. I think the "simplest" solution is just to scan the directory, sleep for 10 seconds (since that's your upper bound), and then scan again, ignoring any new groups created within that time and assuming all sequences you got during the first read are complete. If a 10 second wait is too slow, you can tune it down. In the worst case you have a 0 second wait and you're just doing what you're already doing. You could get more complex in learning to recognize how long sequences are based on some sort of properties of the files, assuming there even is a relation, but applying that sort of machine learning to the problem is killing a fly with a bazooka; not to mention still only approximately correct. You're basically trying to solve interprocess race conditions, so good luck with that. Linear Zoetrope fucked around with this message at 06:26 on Apr 16, 2015 |
# ? Apr 16, 2015 06:18 |
|
You know how when connecting a device to some wifi networks a login/T&C HTML is shown - what is that page called, and is it possible to set one up for an ad hoc network?
|
# ? Apr 16, 2015 09:54 |
|
The term you're looking for is a "captive portal". Why do you need one on an ad-hoc network? It should be technically possible to set up, but once you're going through the effort to do that you might as well just run the machine as an actual AP.
|
# ? Apr 16, 2015 11:25 |
|
Ah, yes, "captive portal" is precisely the term I need. Thanks! This is for a quick prototype of a potential art installation project that includes audience participation. I'll look into making the box a first-class AP, too. That's great advice, thank you again.
|
# ? Apr 16, 2015 12:03 |
|
ynohtna posted:You know how when connecting a device to some wifi networks a login/T&C HTML is shown - what is that page called, and is it possible to set one up for an ad hoc network? I found the easiest way to do that is to use a freeBSD box as default gateway which redirects all requested urls to its own internal "login" page. I used it several times when putting together pay per use wifi hotspots, The reason for FreeBSD is that they still use IPChains which support this redirecting and unlocking based on Mac Address, where iptables which linux uses does not (at least not as easy to setup)
|
# ? Apr 16, 2015 12:28 |
|
ExcessBLarg! posted:What about daylight savings? That doesn't matter for a releasedate. Historical release dates are irrelevant, it's just a "release this thing at this time" date, so if it says 09:30, then DST doesn't matter, it's just 9:30 in the relevant time zone that day.
|
# ? Apr 16, 2015 13:05 |
|
I need a little help with design pattern trivia. Let's say I have two rather distinct entities that I want to work together. I would normally--not referencing a pattern here--talk about creating a bridge, but the bridge pattern appears to mean something else. Would this be a mediator? A little more detail--while staying abstract. I have system A1 and B1. I own B1. I have to interoperate with A1, so I come up with some abstraction I forgot on things common to A that I know, with a specific implementation for A1. The idea is I expect to see A2, and I have no motivation to make a B2. What would one call that intermediary?
|
# ? Apr 16, 2015 18:54 |
|
That's just an interface.
|
# ? Apr 16, 2015 19:09 |
|
GrumpyDoctor posted:That's just an interface. If he has access to extract the interface from A (and thereby modifying A). Otherwise, it would more properly be an adapter pattern.
|
# ? Apr 16, 2015 19:19 |
|
baquerd posted:If he has access to extract the interface from A (and thereby modifying A). Otherwise, it would more properly be an adapter pattern. That's what I was thinking, it just sounds like he needs an adapter.
|
# ? Apr 16, 2015 19:21 |
|
This question is too high-level and abstract for me to contribute meaningfully.
|
# ? Apr 16, 2015 21:50 |
|
Yeah, my question needs some work. I'll elaborate tonight. I think it's something like facades and adapters, but I omitted some information.
|
# ? Apr 16, 2015 22:58 |
|
I just got a mysterious e-mail that only included a Base64 encoding of a 24-digit decimal number. Is that a format that any particular thing comes in?
|
# ? Apr 17, 2015 03:36 |
|
|
# ? Jun 11, 2024 04:42 |
|
Akumu posted:I just got a mysterious e-mail that only included a Base64 encoding of a 24-digit decimal number. Is that a format that any particular thing comes in? If it were 23 digits or 25 digits, that would be something, but 24? Don't think so.
|
# ? Apr 17, 2015 03:43 |