Ask General Programming Questions Not Worth Their Own Thread

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Ask General Programming Questions Not Worth Their Own Thread

«‹›672 »

nielsm: Jun 1, 2009

So another way to formulate the problem would be finding a smallest set of items to remove from the current ordering, to leave the remaining items in correct order. After solving that it's just determining ordering numbers for the removed items to insert them correctly which should be trivial.

The problem is just that the number of possible subsets is exponential in the size of the set.
My best guess at a heuristic is still finding correctly ordered subsets, you should be able to do that in linear time. If the actual data is already partially sorted the number of these subsets should be relatively low, and you might be able to just do an exhaustive search for the best set of those to get the smallest set to remove.

If your data is hopelessly shuffled, I don't think you're going to find an efficient reordering solution in short time, and your time is better spent sending off reordering requests for each item.

# ? Apr 14, 2015 19:24

Adbot: ADBOT LOVES YOU

# ? Jun 11, 2024 11:03

ultrafilter: Aug 23, 2007; It's okay if you have any questions.

What would you do if the list came down reversed?

# ? Apr 14, 2015 19:39

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. �Bertrand Russell

ultrafilter posted:

What would you do if the list came down reversed?

I'd probably just renumber the whole thing even though in the case of odd number of items I could do it in len(items) -1 requests.

As the most common case is likely that a large percentage of the items are already numbered correctly, I'm going to investigate some heuristics around this idea:

nielsm posted:

My best guess at a heuristic is still finding correctly ordered subsets, you should be able to do that in linear time. If the actual data is already partially sorted the number of these subsets should be relatively low, and you might be able to just do an exhaustive search for the best set of those to get the smallest set to remove.

If your data is hopelessly shuffled, I don't think you're going to find an efficient reordering solution in short time, and your time is better spent sending off reordering requests for each item.

# ? Apr 14, 2015 19:54

HappyHippo: Nov 19, 2003; Do you have an Air Miles Card?

My intuition tells me your problem is related to the Levenshtein distance between the initial string and the final in order string. You might want to look that up.

# ? Apr 14, 2015 20:13

Ederick: Jan 2, 2013

I'm not sure if this question is appropriate for this thread or not, but let's give it a shot... I'm part support, part developer, part QA, part DBA at my company. Recently I've had to do a lot more code diving than what my Not-Computer-Science degree prepared me for. There is little documentation on how it works and I'm kind of out my league. What is the best way for me to go about learning how my company's program works and documenting it so that it both sticks with me and the poor sap after me has an easier time?

I understand it's going to be a long, grueling process of going through code line-by-line to try and see why and how things work and it'll likely take months of work, but I feel like it's the only way I'll learn. Could anyone suggest books, blogs, methods, or anything to help actually dive into byzantine code and understand it?

# ? Apr 14, 2015 22:10

Xerophyte: Mar 17, 2008; This space intentionally left blank

HappyHippo posted:

My intuition tells me your problem is related to the Levenshtein distance between the initial string and the final in order string. You might want to look that up.

The optimal number of renumerations is certainly an edit distance, but not Levenshtein. You can still use Wagner-Fischer (the standard edit distance algorithm) with arbitrarily large transpositions as your one edit operation to get the answer, but you'll end up with O(n^2) complexity.

However: you should be able to boil the problem down to operating on correctly ordered subsequences of the list, where correctly ordered means the elements in the subsequence are in exactly the same relative position as they would be in the sorted list. Finding those is potentially somewhat expensive, but should be cheap for nearly-sorted inputs. Then you can solve it greedily by renumerating the shortest out-of-order subsequence until done and still be optimal. I think. Hopefully.

[E:] I should mention the above is still O(n^2) in general and if your list isn't nearly sorted then it sucks. Still, in that case you need to renumerate everything so aiming for fast in the nearly-sorted case is probably the best call you can make.

Xerophyte fucked around with this message at 00:32 on Apr 15, 2015

# ? Apr 15, 2015 00:25

ultrafilter: Aug 23, 2007; It's okay if you have any questions.

If you take all of the sorted sequential subarrays and merge them together, you can use something like insertion sort to place the remaining elements into the result. That might be the best approach here.

# ? Apr 15, 2015 00:29

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. �Bertrand Russell

My gut feeling is that most potential complexity issues will be completely overshadowed by the saved network overhead at average list lengths for the average out-of-orderness. We'll see when I tackle the problem again in a couple days (so I have time to think about it some more and tackle it afresh).

Also, thanks for talking about it, you guys are giving me some good ideas to try out.

# ? Apr 15, 2015 00:32

fritz: Jul 26, 2003

Thermopyle posted:

I've muddled through a suboptimal solution for this, but it feels gross and way complicated and I don't really know what to Google for. Is there a general strategy for this type of operation?

You have a permutation that you're trying to reverse, so maybe try to decompose it as disjoint cycles and work on each one separately?

# ? Apr 15, 2015 03:36

ufarn: May 30, 2009

I updated my rbenv Ruby version to from 2.1.0 to 2.3.0dev, and now I get this, whenever I run a CLI gem:

Ignoring [gem] because its extensions are not built. Try: gem pristine [gem] �version x.y.z

I get dozens of lines of this for different gems.

I also get this, after I switched back to 2.1.0. I use Ruby exclusively for gems, so I suck utterly at managing versions, PATH, and whatnot. I also have rvm installed, so the chance of blowing it all up is non-zero.

Any idea why this blew up, and why it stays that way, even when I switch back to 2.1.0?

# ? Apr 15, 2015 10:20

EAT THE EGGS RICOLA: May 29, 2008

I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane.

# ? Apr 15, 2015 14:09

zerofunk: Apr 24, 2004

Ederick posted:

I'm not sure if this question is appropriate for this thread or not, but let's give it a shot... I'm part support, part developer, part QA, part DBA at my company. Recently I've had to do a lot more code diving than what my Not-Computer-Science degree prepared me for. There is little documentation on how it works and I'm kind of out my league. What is the best way for me to go about learning how my company's program works and documenting it so that it both sticks with me and the poor sap after me has an easier time?

I understand it's going to be a long, grueling process of going through code line-by-line to try and see why and how things work and it'll likely take months of work, but I feel like it's the only way I'll learn. Could anyone suggest books, blogs, methods, or anything to help actually dive into byzantine code and understand it?

I'd recommend stepping through code in a debugger as much as possible - especially if you're unsure on how something works. Although when dealing with old code that has been heavily modified over the years (and not cleaned up), even something that looks obvious may not be. As you go along, be sure to document why code does what it does when you figure it out. In the older product that I work on, that's always the tricky part. Figuring out what the code is doing isn't as difficult as understanding why.

What language is it in? If you have a decent IDE, that can help a lot with tracking down where functions are called and things like that. Something older might involve a lot of grepping.

# ? Apr 15, 2015 14:20

karms: Jan 22, 2006; by Nyc_Tattoo; Yam Slacker

EAT THE EGGS RICOLA posted:

I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane.

That's pretty standard. Did you ask this person why he/she thought it was insane?

# ? Apr 15, 2015 14:25

Bognar: Aug 4, 2011; I am the queen of France; Hot Rope Guy

EAT THE EGGS RICOLA posted:

I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane.

Do they have a better idea? 00:00 seems sane enough to me.

# ? Apr 15, 2015 14:26

ufarn: May 30, 2009

EAT THE EGGS RICOLA posted:

I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane.

Seems sensible, as long as the dates won�t be changed by timezone differences.

# ? Apr 15, 2015 14:32

EAT THE EGGS RICOLA: May 29, 2008

KARMA! posted:

That's pretty standard. Did you ask this person why he/she thought it was insane?

They want it to be 08:30, which will be the standard release datetime for products in the future (but all existing dates are already in the past)

# ? Apr 15, 2015 14:33

csammis: Aug 26, 2003; Mental Institution

EAT THE EGGS RICOLA posted:

I have a field that contains YYYY-MM-DD dates. I need to convert it to iso8601 (YYYY-MM-DDTHH:MM). Only future values will have actual HH:MM values. What do I put as the time for existing data? I was going to just go with 00:00 but someone else seems to think that's insane.

00:00 in which time zone?

# ? Apr 15, 2015 14:50

EAT THE EGGS RICOLA: May 29, 2008

csammis posted:

00:00 in which time zone?

ISO8601 requires either local time, utc, or utc+offset, but all of these dates are standardized for the same time zone.

# ? Apr 15, 2015 14:54

Bognar: Aug 4, 2011; I am the queen of France; Hot Rope Guy

EAT THE EGGS RICOLA posted:

They want it to be 08:30, which will be the standard release datetime for products in the future (but all existing dates are already in the past)

Systems, processes, and standard release times can and will change in the future. I wouldn't add some arbitrary time to previous data based on that. 00:00 also acts as a good marker to show where your data previously did not have that level of precision.

# ? Apr 15, 2015 14:58

EAT THE EGGS RICOLA: May 29, 2008

Bognar posted:

Systems, processes, and standard release times can and will change in the future. I wouldn't add some arbitrary time to previous data based on that. 00:00 also acts as a good marker to show where your data previously did not have that level of precision.

YES THANK YOU

# ? Apr 15, 2015 15:04

JawnV6: Jul 4, 2004; So hot ...

If all that logic isn't helping, maybe clearly laying out the other 1438 possible values and coming up with pros/cons would help. Here's the list right now:
1. 00:00 - Simple, shows when real time was added
2. 08:30 - Magical value from the future that will be tacked onto everything
3. 08:29 - Sorts before 08:30, also shows when real time was added

# ? Apr 15, 2015 15:09

csammis: Aug 26, 2003; Mental Institution

EAT THE EGGS RICOLA posted:

ISO8601 requires either local time, utc, or utc+offset, but all of these dates are standardized for the same time zone.

If you're 100% positive it's only going to ever be in the same timezone it's okay to leave it local, but I'd consider adding the correct utc+offset anyway to future proof yourself.

Our application has a datatype stored as YYYY-MM-DD and often we get requests to update fields of that type to YYYY-MM-DDTHH:MM - which today is used globally with instances crossing timezones routinely, but back in the day someone thought "hey a single instance of this application won't ever be used across timezones." Now there's no single zone which will make the dates "correct" for everyone if we add a time part. Well, time makes fools of us all :v:

And for the record I'm voting 00:00 too for all the reasons above.

# ? Apr 15, 2015 15:14

EAT THE EGGS RICOLA: May 29, 2008

csammis posted:

If you're 100% positive it's only going to ever be in the same timezone it's okay to leave it local, but I'd consider adding the correct utc+offset anyway to future proof yourself.

Our application has a datatype stored as YYYY-MM-DD and often we get requests to update fields of that type to YYYY-MM-DDTHH:MM - which today is used globally with instances crossing timezones routinely, but back in the day someone thought "hey a single instance of this application won't ever be used across timezones." Now there's no single zone which will make the dates "correct" for everyone if we add a time part. Well, time makes fools of us all

And for the record I'm voting 00:00 too for all the reasons above.

This is an application for a government that is limited in geolocation, so although I absolutely can't be 100% positive that it won't ever need to expand beyond the area, it's the likeliest scenario I can think of where local time is okay (and the pushback from people that don't understand things to trying to do something 'weird' like use utc/utc+offset will be ridiculous).

# ? Apr 15, 2015 15:23

ExcessBLarg!: Sep 1, 2001

EAT THE EGGS RICOLA posted:

This is an application for a government that is limited in geolocation,

What about daylight savings?

# ? Apr 15, 2015 18:40

DimpledChad: May 14, 2002; Rigging elections since '87.

I've been spending the past few weeks debugging stuff in a web service, the behavior of which is highly dependent on the local time of the client. Which the client is not required to provide in their request. Which means that we use geoIP location. Except for mobile, where we use lat/long. But sometimes zip code. And sometimes we have no idea and treat them as if they could be anywhere in the continental US.

gently caress timezones forever.

# ? Apr 15, 2015 21:34

bobua: Mar 23, 2003; I'd trade it all for just a little more.

Got this problem that seemed easy at first, thought I'd solved it and was wrong. Pretty sure I have it now, but it feels clunky.

I scan a directory of files and create 'groups.' Let's say the file names are NAME+sequenceNumber. AAA-001, AAA-002, BBB-001, etc. Each group is just the name plus every other file in it's sequence, so out of these three files There would be 2 groups, AAA(containing 2 files) and BBB containing 1 file.

I only want a complete group. If all files in the sequence aren't there yet, I want to ignore it... but there is no way to know how many items are in a sequence(some files might not yet be written). A sequence is always written at the same time though, so all of group AAA would be written within a 1-10 second span.

At first, I just scanned the files and ignored files younger than 30 seconds. That appeared to work but didn't really of course, because a sequence could possibly straddle the 30 second mark. Now I scan the files, and build two sets of groups, files older than 30 seconds and files younger than 30 seconds. After the scan, I check every file in the younger groups to see if a group name matches one in the older groups and add it in.

Is there a better way?

# ? Apr 16, 2015 06:00

Linear Zoetrope: Nov 28, 2011; A hero must cook

Well, I don't think you're going to be able to solve that problem deterministically. You're going to have to accept some error, or convince whoever wrote whatever process is outputting these files to start off with writing an A-METADATA file or something that encodes the ultimate length.

I think the "simplest" solution is just to scan the directory, sleep for 10 seconds (since that's your upper bound), and then scan again, ignoring any new groups created within that time and assuming all sequences you got during the first read are complete. If a 10 second wait is too slow, you can tune it down. In the worst case you have a 0 second wait and you're just doing what you're already doing.

You could get more complex in learning to recognize how long sequences are based on some sort of properties of the files, assuming there even is a relation, but applying that sort of machine learning to the problem is killing a fly with a bazooka; not to mention still only approximately correct.

You're basically trying to solve interprocess race conditions, so good luck with that.

Linear Zoetrope fucked around with this message at 06:26 on Apr 16, 2015

# ? Apr 16, 2015 06:18

ynohtna: Feb 16, 2007; backwoods compatible; Illegal Hen

You know how when connecting a device to some wifi networks a login/T&C HTML is shown - what is that page called, and is it possible to set one up for an ad hoc network?

# ? Apr 16, 2015 09:54

Jabor: Jul 16, 2010; #1 Loser at SpaceChem

The term you're looking for is a "captive portal".

Why do you need one on an ad-hoc network? It should be technically possible to set up, but once you're going through the effort to do that you might as well just run the machine as an actual AP.

# ? Apr 16, 2015 11:25

ynohtna: Feb 16, 2007; backwoods compatible; Illegal Hen

Ah, yes, "captive portal" is precisely the term I need. Thanks!

This is for a quick prototype of a potential art installation project that includes audience participation. I'll look into making the box a first-class AP, too. That's great advice, thank you again.

# ? Apr 16, 2015 12:03

TheresaJayne: Jul 1, 2011

ynohtna posted:

You know how when connecting a device to some wifi networks a login/T&C HTML is shown - what is that page called, and is it possible to set one up for an ad hoc network?

I found the easiest way to do that is to use a freeBSD box as default gateway which redirects all requested urls to its own internal "login" page.

I used it several times when putting together pay per use wifi hotspots,

The reason for FreeBSD is that they still use IPChains which support this redirecting and unlocking based on Mac Address, where iptables which linux uses does not (at least not as easy to setup)

# ? Apr 16, 2015 12:28

EAT THE EGGS RICOLA: May 29, 2008

ExcessBLarg! posted:

What about daylight savings?

That doesn't matter for a releasedate. Historical release dates are irrelevant, it's just a "release this thing at this time" date, so if it says 09:30, then DST doesn't matter, it's just 9:30 in the relevant time zone that day.

# ? Apr 16, 2015 13:05

Rocko Bonaparte: Mar 12, 2002; Every day is Friday!

I need a little help with design pattern trivia. Let's say I have two rather distinct entities that I want to work together. I would normally--not referencing a pattern here--talk about creating a bridge, but the bridge pattern appears to mean something else. Would this be a mediator?

A little more detail--while staying abstract. I have system A1 and B1. I own B1. I have to interoperate with A1, so I come up with some abstraction I forgot on things common to A that I know, with a specific implementation for A1. The idea is I expect to see A2, and I have no motivation to make a B2. What would one call that intermediary?

# ? Apr 16, 2015 18:54

raminasi: Jan 25, 2005; a last drink with no ice

That's just an interface.

# ? Apr 16, 2015 19:09

baquerd: Jul 2, 2007; by FactsAreUseless

GrumpyDoctor posted:

That's just an interface.

If he has access to extract the interface from A (and thereby modifying A). Otherwise, it would more properly be an adapter pattern.

# ? Apr 16, 2015 19:19

carry on then: Jul 10, 2010; by VideoGames
(and can't post for 10 years!)

baquerd posted:

If he has access to extract the interface from A (and thereby modifying A). Otherwise, it would more properly be an adapter pattern.

That's what I was thinking, it just sounds like he needs an adapter.

# ? Apr 16, 2015 19:21

Suspicious Dish: Sep 24, 2011; 2020 is the year of linux on the desktop, bro; Fun Shoe

This question is too high-level and abstract for me to contribute meaningfully.

# ? Apr 16, 2015 21:50

Rocko Bonaparte: Mar 12, 2002; Every day is Friday!

Yeah, my question needs some work. I'll elaborate tonight. I think it's something like facades and adapters, but I omitted some information.

# ? Apr 16, 2015 22:58

Akumu: Apr 24, 2003

I just got a mysterious e-mail that only included a Base64 encoding of a 24-digit decimal number. Is that a format that any particular thing comes in?

# ? Apr 17, 2015 03:36

Adbot: ADBOT LOVES YOU

# ? Jun 11, 2024 11:03

baquerd: Jul 2, 2007; by FactsAreUseless

Akumu posted:

I just got a mysterious e-mail that only included a Base64 encoding of a 24-digit decimal number. Is that a format that any particular thing comes in?

If it were 23 digits or 25 digits, that would be something, but 24? Don't think so.

# ? Apr 17, 2015 03:43

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Ask General Programming Questions Not Worth Their Own Thread

«‹›672 »