ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

LP0 ON FIRE posted:

I don't want to take "safe risks", but all that information is individually encrypted, and the IVs will be stored on a separate server.

The fact that you think storing IVs separately helps is part of the problem here; storing IVs on a separate server adds approximately zero security, but you don't understand enough about the process to know that. Having the information "individually encrypted", depending on how you did it, might be less secure than encrypting it all in a single batch. There are a lot of little details that you have to pay very close attention to if you want to do this right, and I don't think that we can talk you through it over an Internet forum.

I'm all for experimentation in order to learn, but the setup you've described is already bordering on needlessly-complex-and-probably-broken, and if you want to experiment you should begin with something simple that doesn't involve any real data.

TooMuchAbstraction posted:

Hey, we're talking encryption and security! That's awesome. I want to implement a basic client/server API to allow our client programs to send notifications to the server. The only big trick here is that only authorized clients should be able to use this service, and the details of the information they're sending should be kept private (i.e. not sent in the clear).

As nielsm said, use TLS with client certificates.
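
For anyone following along, here's a rough sketch of what "TLS with client certificates" can look like using Python's standard `ssl` module. The function names and file-path parameters are made up for illustration, and the paths are optional only so the policy can be inspected without any key material lying around; a real deployment would always pass them.

```python
import ssl


def server_context(certfile=None, keyfile=None, client_ca=None):
    """Server side: present our certificate and demand one from every client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a CA we trust.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if certfile is not None:
        ctx.load_cert_chain(certfile, keyfile)
    if client_ca is not None:
        # Hypothetical private CA that signs the authorized clients' certs.
        ctx.load_verify_locations(cafile=client_ca)
    return ctx


def client_context(certfile=None, keyfile=None, server_ca=None):
    """Client side: verify the server and present our own certificate."""
    # PROTOCOL_TLS_CLIENT turns on certificate and hostname checks by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if server_ca is not None:
        ctx.load_verify_locations(cafile=server_ca)
    if certfile is not None:
        ctx.load_cert_chain(certfile, keyfile)
    return ctx
```

Each side would then wrap its socket with `ctx.wrap_socket(...)`. The point is that authentication and privacy both come from the handshake, so there's no hand-rolled challenge/response step to get wrong.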

LP0 ON FIRE
Jan 25, 2006

beep boop

ShoulderDaemon posted:

The fact that you think storing IVs separately helps is part of the problem here; storing IVs on a separate server adds approximately zero security, but you don't understand enough about the process to know that. Having the information "individually encrypted", depending on how you did it, might be less secure than encrypting it all in a single batch. There are a lot of little details that you have to pay very close attention to if you want to do this right, and I don't think that we can talk you through it over an Internet forum.

I'm all for experimentation in order to learn, but the setup you've described is already bordering on needlessly-complex-and-probably-broken, and if you want to experiment you should begin with something simple that doesn't involve any real data.


As nielsm said, use TLS with client certificates.

I'm really interested to know why storing IVs on a separate server, especially if it's a vault, adds almost no security. I've read in lots of sources, and been told by others, that you must do this. If you were to store the IVs in the same database, on the same server, it would seem to me that would serve no purpose, as an attacker would have all the IVs and encrypted values available to them right there if they broke into the database, and all they would need to guess right is one key to decrypt everything, assuming all the keys are the same and not stored in the database.

Making your information more secure by assigning every encrypted value the same IV also makes zero sense to me. If someone had to just guess one IV vs one for every entry, it kind of makes the former seem more dangerous. I don't know, maybe I'm wrong, and I'm really interested to hear why! Thanks!

TooMuchAbstraction
Oct 14, 2012

I spent four years making
Waves of Steel
Hell yes I'm going to turn my avatar into an ad for it.
Fun Shoe
Okay, thanks for the advice guys! I'll need to read up on exactly how TLS works, so I know how to use the tool correctly, but that does sound preferable to a roll-your-own approach.

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

LP0 ON FIRE posted:

I'm really interested to know why storing IVs on a separate server, especially if it's a vault, adds almost no security.

Here's a message that I'm going to encrypt:
code:
$ echo "This is a test message that I am going to encrypt, and then decrypt with the wrong IV." > message.clear
$ xxd message.clear
00000000: 5468 6973 2069 7320 6120 7465 7374 206d  This is a test m
00000010: 6573 7361 6765 2074 6861 7420 4920 616d  essage that I am
00000020: 2067 6f69 6e67 2074 6f20 656e 6372 7970   going to encryp
00000030: 742c 2061 6e64 2074 6865 6e20 6465 6372  t, and then decr
00000040: 7970 7420 7769 7468 2074 6865 2077 726f  ypt with the wro
00000050: 6e67 2049 562e 0a                        ng IV..
OK, let's encrypt it with some key and IV:
code:
$ openssl enc -e -aes-128-cbc -nosalt -iv 0123456789abcdef -k "some sort of passphrase" -in message.clear -out message.crypt
$ xxd message.crypt
00000000: 36cc 0431 7047 f913 f1ea 8d96 f84d 6eb0  6..1pG.......Mn.
00000010: ce0c 30f6 804e fe80 1129 f881 2c38 32d4  ..0..N...)..,82.
00000020: 2bca a0d2 fe86 3457 3154 ae8b 94e6 c58b  +.....4W1T......
00000030: a730 3b5f b317 3e9d 042e f1c0 63e6 c97c  .0;_..>.....c..|
00000040: 4c82 9199 cbd0 ea54 3085 64bd c5d4 e445  L......T0.d....E
00000050: ef1d 12d0 8efe e71a 7d28 936f 6046 0ac9  ........}(.o`F..
And now let's decrypt it, but whoops, I forgot the IV and had to guess:
code:
$ openssl enc -d -aes-128-cbc -nosalt -iv 0000000000000000 -k "some sort of passphrase" -in message.crypt -out message.decrypt
$ xxd message.decrypt
00000000: 554b 2c14 a9c2 becf 6120 7465 7374 206d  UK,.....a test m
00000010: 6573 7361 6765 2074 6861 7420 4920 616d  essage that I am
00000020: 2067 6f69 6e67 2074 6f20 656e 6372 7970   going to encryp
00000030: 742c 2061 6e64 2074 6865 6e20 6465 6372  t, and then decr
00000040: 7970 7420 7769 7468 2074 6865 2077 726f  ypt with the wro
00000050: 6e67 2049 562e 0a                        ng IV..
Well, at least the first 8 bytes of it were safe. Of course, now that I have the rest of the message, I've got a much better chance of guessing those.

IVs are simply not intended to be secret. They are effectively part of the encrypted message; analysis of crypto algorithms assumes that any attacker with access to a message also has access to the IV. Often, they are not generated in a hard to predict fashion; typically only uniqueness matters, so the software generating them might not care to keep IVs from different messages uncorrelated. Keeping them secret doesn't help, it just makes your life harder, and encourages you to falsely believe that you are more secure than you actually are.
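
If you want to poke at this without openssl, here's a minimal sketch of CBC chaining in plain Python. Big assumption, stated loudly: `toy_encrypt_block` is a throwaway 4-round Feistel construction built on SHA-256 that only stands in for AES so the example needs nothing outside the standard library. It is not secure and every name here is made up. The part that matters is the chaining in `cbc_decrypt`: the IV is only ever XORed into the first block. In the openssl run above only 8 bytes came out garbled, presumably because the 16-hex-digit IV was zero-padded to the full 16-byte block and the guessed all-zero IV already matched the padding; in general the whole first block is affected.

```python
import hashlib

BLOCK = 16          # bytes, the same block size as AES
HALF = BLOCK // 2


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, i, half):
    # Round function of the toy Feistel cipher (NOT a real cipher).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]


def toy_encrypt_block(key, block):
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def toy_decrypt_block(key, block):
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def cbc_encrypt(key, iv, plaintext):
    # No padding in this sketch: len(plaintext) must be a multiple of BLOCK.
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, _xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key, iv, ciphertext):
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        # Only the first iteration uses the IV; afterwards "prev" is the
        # previous ciphertext block, which the attacker already has.
        out.append(_xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)


key = b"some sort of key"
iv = bytes.fromhex("0123456789abcdef0123456789abcdef")
message = b"This is a test message that I am going to encrypt".ljust(64, b".")

ciphertext = cbc_encrypt(key, iv, message)
guess = cbc_decrypt(key, bytes(BLOCK), ciphertext)   # wrong (all-zero) IV

first_block_garbled = guess[:BLOCK] != message[:BLOCK]
rest_recovered = guess[BLOCK:] == message[BLOCK:]
# With an all-zero guess, the garbled block is just plaintext XOR real IV.
garble_is_plain_xor_iv = guess[:BLOCK] == _xor(message[:BLOCK], iv)
```

All three flags come out true: everything after the first block decrypts correctly without the IV, and the first block is off by exactly the XOR of the real and guessed IVs.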

LP0 ON FIRE posted:

I've read in lots of sources, and been told by others, that you must do this. If you were to store the IVs in the same database, on the same server, it would seem to me that would serve no purpose, as an attacker would have all the IVs and encrypted values available to them right there if they broke into the database, and all they would need to guess right is one key to decrypt everything, assuming all the keys are the same and not stored in the database.

As I showed, you can trivially decrypt all but the first block of a message with just the key when you're using CBC and don't have the IV. If your messages are e.g. email addresses, then the first part of people's email addresses is likely to correlate with their real name or user name, so if I was only missing the first 8 bytes of each address, then for a large number of messages I would expect to be able to completely recover the address. IVs being secret just doesn't help you here in any particularly meaningful sense.

LP0 ON FIRE posted:

Making your information more secure by assigning every encrypted value the same IV also makes zero sense to me. If someone had to just guess one IV vs one for every entry, it kind of makes the former seem more dangerous. I don't know, maybe I'm wrong, and I'm really interested to hear why! Thanks!

Oh, you absolutely can't ever let yourself re-use the same IV. That would lead to all sorts of other attacks. You should just store the IV as part of the encrypted messages, and use a different IV for every message.
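
A minimal sketch of the "store the IV as part of the encrypted message" layout, assuming a 16-byte IV as for AES in CBC mode. This is framing only: the ciphertext is treated as opaque bytes that would come from a vetted library, and `fresh_iv`, `pack`, and `unpack` are hypothetical names.

```python
import secrets

IV_LEN = 16  # bytes; the AES block size


def fresh_iv():
    """Generate a new random IV; call this once per message, never reuse."""
    return secrets.token_bytes(IV_LEN)


def pack(iv, ciphertext):
    """Store the IV in the clear as a prefix of the ciphertext."""
    if len(iv) != IV_LEN:
        raise ValueError("IV must be exactly %d bytes" % IV_LEN)
    return iv + ciphertext


def unpack(blob):
    """Split a stored blob back into (iv, ciphertext)."""
    if len(blob) < IV_LEN:
        raise ValueError("blob is too short to contain an IV")
    return blob[:IV_LEN], blob[IV_LEN:]
```

One column (or one file) then holds everything needed for decryption except the key, and the key is the only thing that has to stay secret.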

Look, this poo poo is hard to do right. If it's interesting to you, you should learn it; the world needs more cryptographers. But you shouldn't make your "I'm going to wade into encryption technologies" project involve real data. Hire a contractor who knows their poo poo, and ask if you can watch over their shoulder and ask dumb questions. Get familiar with the tools and algorithms one at a time, so you know what each step of the process is for and how to do it in isolation. Start developing an intuition for how protocols fit together and where weak spots are likely to appear. Look at some real protocols like the X.509 family of messages, or OpenPGP, or TLS. Note how the same sorts of designs keep reappearing over and over, because they are well-understood and well-studied. Most importantly, build relationships in the community, so that in the rare case where you find yourself actually needing a new protocol, you can get it reviewed by people who aren't you, because nobody ever sees the problems in their own protocols.

Just, please, don't encrypt anything that you or your customers care about with a protocol that you designed by yourself. Especially not if it's your first time playing with cryptography.

ShoulderDaemon fucked around with this message at 21:58 on Nov 24, 2015

ExcessBLarg!
Sep 1, 2001

LP0 ON FIRE posted:

I'm really interested to know why storing IVs on a separate server, especially if it's a vault, adds almost no security.
Because the IV isn't intended to be a secret. In CBC mode, the IV for all-but-the-first block is determined from the previous block. At best, you're obfuscating the first block of your message, not subsequent blocks.

This sounds like an XY problem though. What are you going to do with the encrypted values in Table A when Table B (and its IVs) are in a vault and not available? What's the value in having encrypted, but not decryptable, values in Table A?

LP0 ON FIRE
Jan 25, 2006

beep boop

ShoulderDaemon posted:

Here's a message that I'm going to encrypt..

Thanks, this is really good information! Usually IVs are more like a salt than anything, right? I'm making something for a trial for a small group, and I told my boss we will need a contractor when we get ready for the full release. I do want to preserve as much as we can as we migrate though, so it's important to get this stuff right the first time.

ExcessBLarg! posted:

Because the IV isn't intended to be a secret. In CBC mode, the IV for all-but-the-first block is determined from the previous block. At best, you're obfuscating the first block of your message, not subsequent blocks.

This sounds like an XY problem though. What are you going to do with the encrypted values in Table A when Table B (and its IVs) are in a vault and not available? What's the value in having encrypted, but not decryptable, values in Table A?


And thank you. If someone were to store a key in a vault, is its purpose only to allow or deny access to something, and never to send back information to the user that was decrypted by that key?

raminasi
Jan 25, 2005

a last drink with no ice

LP0 ON FIRE posted:

And thank you. If someone were to store a key in a vault, is its purpose only to allow or deny access to something, and never to send back information to the user that was decrypted by that key?

I think the question is "why not just store the data in the vault."

ExcessBLarg!
Sep 1, 2001

LP0 ON FIRE posted:

If someone were to store a key in a vault, is its purpose only to allow or deny access to something, and never to send back information to the user that was decrypted by that key?
I'm not sure I understand what you're asking.

Forget IVs though, let's assume that in your scenario you're encrypting the incoming data with a cryptographically-secure, randomly-generated per-user key, storing the encrypted data in Table A, and putting the per-user key in Table B. I think that's effectively the scenario you wanted to achieve with the IVs.

Now, to do anything useful with that data you'd have to decrypt it. That means you have to access both the data in Table A, and the key in Table B. What benefit do you gain, though, from having two separate tables? Why not store both the data and the key in Table A? Why not store the data as plaintext in Table A?

So, one reasonable answer is that the encrypted data is quite large in volume, and so you want to store "Table A" on a large cloud-hosted provider, while "Table B" is much smaller (just keys) and you intend to locate it on premises in a physically secure environment. The goal here would be not having to worry about sensitive data accidentally being leaked by the cloud provider. Is that your scenario?

Illusive Fuck Man
Jul 5, 2004
RIP John McCain feel better xoxo 💋
Taco Defender
While we're on the subject, I was playing with an idea / writing some code a while ago. If I'm encrypting (with say, aes128-gcm) a bunch of immutable butts, and randomly generating the key for each butt / never re-encrypting with that buttkey again, is it fine to use a zero IV? That was my assumption at the start, but then I started thinking about birthday poo poo / the chances of randomly choosing the same key twice with a large number of butts, and the consequences of that. Never got around to running the numbers.

Steve French
Sep 8, 2003

TooMuchAbstraction posted:

Hey, we're talking encryption and security! That's awesome. I want to implement a basic client/server API to allow our client programs to send notifications to the server. The only big trick here is that only authorized clients should be able to use this service, and the details of the information they're sending should be kept private (i.e. not sent in the clear).

First off, I'm happy to use existing solutions so long as they're secure and open-source compatible; any recommendations? Failing that, here's the sketch of my design:

Clients that are authorized are provided with a client ID and a client private key, which they plug into their versions of the program.

Server-side: we expose an HTTP server that accepts three types of requests:

1) HTTP GET request for the server's public key ("Key Request").

2) HTTP GET request to request an authentication challenge ("Challenge"). This request will return a random string encrypted with the client's public key.

3) HTTP POST request to send a new notification ("Notification"). This request contains the following information:

- The decrypted random string from a Challenge request (usable once only, and only within a short time window)
- JSON string describing the notification, encrypted with the server's public key

All requests also include the client ID; we can revoke client IDs/keys if necessary by updating a table on the server side. We can also update the server's public/private keys at any time since each interaction with the server should involve first getting the new key -- it's not embedded into the client program anywhere.

The notification process should be fairly obvious from this: request the server's public key, request a challenge token, decrypt the token with the client's private key, encrypt the notification data with the server's public key, send both to the server. The server verifies the token matches what it encrypted and that the client is authorized, and if so, decrypts the notification data and takes appropriate action based on what it finds there. As far as I can tell, the client's information should remain private so long as they keep their private key, well, private -- which is down to proper operational security on their computers and is not something I think I should be particularly worried about.

This seems fairly straightforward, but I'm worried that I'm the guy going "computer security is easy! :downs:" while walking off a cliff into oceans full of Screaming Eels. So, uh, sanity check please?

Others have (correctly) told you that you should just be using standard techniques here, but it may be helpful to point out some ways that what you laid out could go wrong. I'm not even remotely a security expert, so the fact that I could quickly and easily identify these problems is a clear sign that this is why you ought to always use standard proven techniques.

What you proposed is vulnerable to man-in-the-middle attacks, both for viewing supposedly secret information and for impersonating a client to send fake notifications.

For example, if I intercept the POST request, I can read the challenge response and forward it to the real server, and since the JSON describing the notification is encrypted with the server's public key (and not signed in any way), I can easily craft my own notification, encrypt it with the server's public key myself, and send anything that I want to be accepted as legitimate by the server.

Additionally, since the client gets the server's public key from an HTTP GET from the server, and it knows that it can change at any time, if I also intercept that GET request, I can return a different public key that I have the corresponding private key to. Then, later, when the POST request with the notification is sent, I can easily decrypt it and read the contents (and modify it before forwarding to the server as well).

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

Illusive gently caress Man posted:

While we're on the subject, I was playing with an idea / writing some code a while ago. If I'm encrypting (with say, aes128-gcm) a bunch of immutable butts, and randomly generating the key for each butt / never re-encrypting with that buttkey again, is it fine to use a zero IV? That was my assumption at the start, but then I started thinking about birthday poo poo / the chances of randomly choosing the same key twice with a large number of butts, and the consequences of that. Never got around to running the numbers.

Use a unique IV for every message, even if you aren't reusing keys.

There's a fair amount of "healthy paranoia" in crypto. We don't currently know of any serious issues with AES, but we have good reason to suspect that when some start to appear, they will (at least at first) be fairly narrow attacks; blocks with particular structures will be more vulnerable to cryptanalysis. Using unique, and ideally random, IVs gives us some hope that any such structures will be randomized within our message corpus, which makes them unlikely and hard to find, and thus probably increases our resistance to future attacks long enough for us to get wind of AES being likely-to-be-compromised-soon and allow a migration plan. Otherwise we might get unlucky and discover that all of our messages begin with easy-to-break blocks, which means that all of our keys may become suddenly vulnerable to attack.

GCM and related counter mode ciphers are particularly noteworthy in this regard because the actual encryption is being done on the counter, which is initialized by the IV, and if I had to pick any single block as being "most likely to cause problems" or "most likely to have some well-known acceleration structure for breaking" it'd be the all-zeroes block. Random IVs serve to make attackers' lives harder by giving them as little room for acceleration structures as possible.
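
Since the numbers never got run: here's a quick sketch using the standard birthday approximation, p ≈ 1 - exp(-n(n-1)/2 / 2^k), for n randomly generated k-bit keys. It assumes the keys come from a good random source, and `collision_probability` is just a name picked for this example. It only answers the "same key twice" part of the question; it isn't an argument for the zero IV.

```python
import math


def collision_probability(n_keys, key_bits):
    """Approximate chance that at least two of n_keys random keys coincide."""
    pairs = n_keys * (n_keys - 1) // 2
    # expm1 keeps precision when the probability is astronomically small.
    return -math.expm1(-pairs / 2 ** key_bits)


# About four billion messages, each with its own random 128-bit key.
p_four_billion = collision_probability(2 ** 32, 128)

# Roughly how many keys it takes before a collision becomes plausible.
p_two_to_64 = collision_probability(2 ** 64, 128)
```

With 2^32 keys the result is on the order of 2^-65, and it takes around 2^64 keys before the odds get anywhere near even. The catch is the consequence rather than the likelihood: a repeated key with a fixed IV means a repeated (key, IV) pair, which is exactly the case GCM cannot tolerate.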

LP0 ON FIRE
Jan 25, 2006

beep boop

ExcessBLarg! posted:

I'm not sure I understand what you're asking.

Forget IVs though, let's assume that in your scenario you're encrypting the incoming data with a cryptographically-secure, randomly-generated per-user key, storing the encrypted data in Table A, and putting the per-user key in Table B. I think that's effectively the scenario you wanted to achieve with the IVs.

Now, to do anything useful with that data you'd have to decrypt it. That means you have to access both the data in Table A, and the key in Table B. What benefit do you gain, though, from having two separate tables? Why not store both the data and the key in Table A? Why not store the data as plaintext in Table A?

So, one reasonable answer is that the encrypted data is quite large in volume, and so you want to store "Table A" on a large cloud-hosted provider, while "Table B" is much smaller (just keys) and you intend to locate it on premises in a physically secure environment. The goal here would be not having to worry about sensitive data accidentally being leaked by the cloud provider. Is that your scenario?

These were my original thoughts, (except replacing IVs here with keys) but now it looks like it's really, really wrong:

My potential plan was to have the encrypted data in a database table on server A, and I'd have the keys in a database table on server B. If someone broke into server A, they would not have the keys until they broke into server B. Furthermore, server B could be set up as a vault to accept the encrypted data, decrypt it with its keys and send back the result to the user.

That other scenario is interesting, but I don't think it would be used with this.

LP0 ON FIRE fucked around with this message at 17:54 on Nov 25, 2015

Skandranon
Sep 6, 2008
fucking stupid, dont listen to me

LP0 ON FIRE posted:

These were my original thoughts, (except replacing IVs here with keys) but now it looks like it's really, really wrong:

My potential plan was to have the encrypted data in a database table on server A, and I'd have the keys in a database table on server B. If someone broke into server A, they would not have the keys until they broke into server B. Furthermore, server B could be set up as a vault to accept the encrypted data, decrypt it with its keys and send back the result to the user.

If server A requests keys from server B, and A is compromised, how can B tell if the requests for keys are legitimate or not? Also, sending the data to be decrypted on B will be a huge performance bottleneck.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

I wonder if a programming security questions thread would be a good idea. It seems like security issues are something that the majority of programmers brush up against peripherally, but from my very limited perspective, they're hard to address without being subtly wrong. Because it's something that is a seemingly minor part of what they do, most programmers (me included!) don't have the background to know when they're loving it up.


Anyway, I didn't come in here to say that stuff. I came in here to ask for any links to layperson-accessible descriptions of neural networks, deep learning, AI-sorts-of-thingamabobs.


edit: lol, I typed "no" instead of "know"

Thermopyle fucked around with this message at 18:38 on Nov 25, 2015

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord

Thermopyle posted:

Anyway, I didn't come in here to say that stuff. I came in here to ask for any links to layperson-accessible descriptions of neural networks, deep learning, AI-sorts-of-thingamabobs.

I like the way Peter Norvig quickly explains some of these in this talk: http://www.infoq.com/presentations/machine-learning-general-programming

LP0 ON FIRE
Jan 25, 2006

beep boop

Thermopyle posted:

I wonder if a programming security questions thread would be a good idea. It seems like security issues are something that the majority of programmers brush up against peripherally, but from my very limited perspective, they're hard to address without being subtly wrong. Because it's something that is a seemingly minor part of what they do, most programmers (me included!) don't have the background to no when they're loving it up.

Sounds great to me.


Skandranon posted:

If server A requests keys from server B, and A is compromised, how can B tell if the requests for keys are legitimate or not? Also, sending the data to be decrypted on B will be a huge performance bottleneck.

If A is compromised, by default I don't think there's any way that B could tell if the requests for keys are legitimate. My guess is that you would need some kind of software to raise an alert and deny all access if a bizarre pattern of requests started coming in.
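
Something like this is what I have in mind, as a very rough sketch: a sliding-window counter on server B that locks itself once requests arrive faster than some threshold. The class name, the threshold, and the window are all made up for illustration, and it does nothing against an attacker who stays under the limit.

```python
from collections import deque


class KeyRequestMonitor:
    """Deny all key requests once too many arrive within a time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self._times = deque()
        self.locked = False  # stays set until an operator resets it

    def allow(self, now):
        """Record a request made at time `now` (seconds); return True if allowed."""
        # Forget requests that have fallen out of the window.
        while self._times and now - self._times[0] >= self.window:
            self._times.popleft()
        self._times.append(now)
        if len(self._times) > self.max_requests:
            self.locked = True  # this is where the alert would be raised
        return not self.locked
```

With `KeyRequestMonitor(3, 60)`, a fourth request inside the same minute trips the lock, and every request after that is refused until someone looks at it.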

Volmarias
Dec 31, 2002

EMAIL... THE INTERNET... SEARCH ENGINES...

Thermopyle posted:

I wonder if a programming security questions thread would be a good idea. It seems like security issues are something that the majority of programmers brush up against peripherally, but from my very limited perspective, they're hard to address without being subtly wrong. Because it's something that is a seemingly minor part of what they do, most programmers (me included!) don't have the background to no when they're loving it up.

Well, there's always the Security Fuckup Megathread, which is surprisingly helpful, but probably not what you always want.

ExcessBLarg!
Sep 1, 2001

LP0 ON FIRE posted:

If someone broke into server A, they would not have the keys until they broke into server B.
Why would someone who can break into A not immediately break into B? If they're running the same operating system, daemons, and update cycles, then a security vulnerability against A would equally apply against B.

Again you could say "A is a public facing, shared server with more user accounts, daemons, and generally greater risk to being attacked." In that case, I'd probably store all the data (encrypted or not) on B. "A has much more storage though" is a good reason to store encrypted data on A and keys on B.

It's all kind of a silly exercise though, since if you're going to handle unencrypted, user-sensitive data at some point, you want to do that on a dedicated (public facing) server. If you're going to store anything on A, at all, A itself should always handle it in encrypted form. Also if you're handling user-sensitive data, you should really audit the security of your entire platform, including physical security, to make sure that you're meeting the right objectives. Separating keys from data might be an appropriate means to meet those objectives, but without context, we can't say if it's the appropriate thing to do.

Duct Tape
Sep 30, 2004

Huh?
I've got a regular expression problem, and I know one solution, but it's damned ugly and I'm wondering if there is a cleaner way of doing it.

I want to match a specific string, let's say "Apple", but when matching it I want to ignore all numbers. So "Apple1", "0A1p23p45l6e7", and "111111Apple" all match, but "Appleb" doesn't.

I can use "^\d*A\d*p\d*p\d*l\d*e\d*$", but it feels like there should be a cleaner way of doing this. Any ideas?

ExcessBLarg!
Sep 1, 2001
Strip digits from the string first, then check if it matches "\AApple\z"
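
In Python that could look roughly like the sketch below. The function name is made up, and a plain equality test stands in for the anchored `\AApple\z` check since the whole cleaned string has to match anyway.

```python
import re


def matches_ignoring_digits(text, word):
    """True if `text` equals `word` once every digit has been removed."""
    return re.sub(r"\d", "", text) == word


results = {
    s: matches_ignoring_digits(s, "Apple")
    for s in ("Apple1", "0A1p23p45l6e7", "111111Apple", "Appleb")
}
```

The first three strings match and "Appleb" doesn't, the same as with the long `\d*`-interleaved pattern.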

Skandranon
Sep 6, 2008
fucking stupid, dont listen to me

Duct Tape posted:

I've got a regular expression problem, and I know one solution, but it's damned ugly and I'm wondering if there is a cleaner way of doing it.

I want to match a specific string, let's say "Apple", but when matching it I want to ignore all numbers. So "Apple1", "0A1p23p45l6e7", and "111111Apple" all match, but "Appleb" doesn't.

I can use "^\d*A\d*p\d*p\d*l\d*e\d*$", but it feels like there should be a cleaner way of doing this. Any ideas?

I'm a little confused by your use case, why would you ever need to match "0A1p23p45l6e7"?

TheresaJayne
Jul 1, 2011

LP0 ON FIRE posted:

These were my original thoughts, (except replacing IVs here with keys) but now it looks like it's really, really wrong:

My potential plan was to have the encrypted data in a database table on server A, and I'd have the keys in a database table on server B. If someone broke into server A, they would not have the keys until they broke into server B. Furthermore, server B could be set up as a vault to accept the encrypted data, decrypt it with its keys and send back the result to the user.

That other scenario is interesting, but I don't think it would be used with this.

Well I have some info on keys etc but I can't post it in open discourse :( something about "If you go into details we will send someone in a black car wearing sunglasses to zap you in the face with a flashy light thing"

sigh!

Hang on , there is someone at the door!

...
....
......
Forget everything i said, because I did. (Lp0 check private message)

TheresaJayne fucked around with this message at 08:20 on Nov 26, 2015

qntm
Jun 17, 2009

Skandranon posted:

I'm a little confused by your use case, why would you ever need to match "0A1p23p45l6e7"?

How would knowing the use case change your answer to the question?

Skandranon
Sep 6, 2008
fucking stupid, dont listen to me

qntm posted:

How would knowing the use case change your answer to the question?

It sounds like you are doing something you don't really need to. However, if you really need to be matching something like "Apple" inside of "0A1p23p45l6e7", then I think you have the solution you want. You can probably write a function which, given a word, will automatically generate your regex.

You could also iterate over the string and check for the letters in your match word, one at a time. Will probably be a lot faster than a regex, assuming there are no other edge cases and you just need a true/false that the letters in "Apple" appear somewhere, in order, in "0A1p23p45l6e7".
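
A rough sketch of the regex-generating function, in Python. The names are hypothetical, and the `lru_cache` is my assumption so that each distinct word is only compiled once.

```python
import re
from functools import lru_cache


@lru_cache(maxsize=None)
def digit_tolerant_pattern(word):
    """Build a pattern for `word` with optional digits around every letter."""
    # re.escape guards against words containing regex metacharacters.
    body = r"\d*".join(re.escape(ch) for ch in word)
    return re.compile(r"\d*" + body + r"\d*")


def matches(text, word):
    # fullmatch plays the role of the ^...$ anchors in the hand-written regex.
    return digit_tolerant_pattern(word).fullmatch(text) is not None
```

For "Apple" this produces the same `\d*A\d*p\d*p\d*l\d*e\d*` pattern that was written out by hand.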

ExcessBLarg! posted:

Strip digits from the string first, then check if it matches "\AApple\z"

This is a good suggestion/question. Is it only going to be digits in between letters, or any possible character?

b0lt
Apr 29, 2005

Skandranon posted:

It sounds like you are doing something you don't really need to. However, if you really need to be matching something like "Apple" inside of "0A1p23p45l6e7", then I think you have the solution you want. You can probably write a function which, given a word, will automatically generate your regex.
This is terrible, do not do this. Regex compilation is not a free operation.

quote:

You could also iterate over the string and check for the letters in your match word, one at a time. Will probably be a lot faster than a regex, assuming there are no other edge cases and you just need a true/false that the letters in "Apple" appear somewhere, in order, in "0A1p23p45l6e7".
If you're using a non-poo poo regex implementation, no, it won't. This is just compiling the regex by hand.

nielsm
Jun 1, 2009



Duct Tape posted:

I've got a regular expression problem, and I know one solution, but it's damned ugly and I'm wondering if there is a cleaner way of doing it.

I want to match a specific string, let's say "Apple", but when matching it I want to ignore all numbers. So "Apple1", "0A1p23p45l6e7", and "111111Apple" all match, but "Appleb" doesn't.

I can use "^\d*A\d*p\d*p\d*l\d*e\d*$", but it feels like there should be a cleaner way of doing this. Any ideas?

As others have posted, no neat solution in a single regex. What you could do is regex replace all characters you don't want away from the string, then test for the words in the cleaned string.
That is, unless you actually need to find that kind of "hidden words" inside a larger context and that context is important.

Skandranon
Sep 6, 2008
fucking stupid, dont listen to me

b0lt posted:

If you're using a non-poo poo regex implementation, no, it won't. This is just compiling the regex by hand.

Pretty sure it would be faster. Done correctly, it only requires a single pass through the search string & the matching string.

Ophidia
Oct 20, 2012
Is there anyone here who has experience in developing Windows Phone 8.1 apps? I wrote a small app and want to deploy it to my Windows Phone, and I cannot make it work. I created an appx file, but I have no idea how to install it on my phone. Could someone explain the necessary steps or link me a tutorial? I spent the last 3 days trying to find something on Google and nothing has worked so far.

Ophidia fucked around with this message at 22:29 on Nov 26, 2015

b0lt
Apr 29, 2005

Skandranon posted:

Pretty sure it would be faster. Done correctly, it only requires a single pass through the search string & the matching string.

You are wrong. It is equivalent to a regex.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

b0lt posted:

You are wrong. It is equivalent to a regex.

On the other hand, you can write an equivalent-to-a-regex version that doesn't require a compilation step every time you change the word you're looking for.

Duct Tape
Sep 30, 2004

Huh?

Skandranon posted:

I'm a little confused by your use case, why would you ever need to match "0A1p23p45l6e7"?

nielsm posted:

As others have posted, there's no neat solution in a single regex. What you could do is use a regex replace to strip all the characters you don't want from the string, then test for the words in the cleaned string.
That is, unless you actually need to find that kind of "hidden words" inside a larger context and that context is important.

Skandranon posted:

It sounds like you are doing something you don't really need to, however, if you really need to be matching something like "Apple" inside of "0A1p23p45l6e7", then I think you have the solution you want. You can probably write a function which, given a word, will automatically generate your regex.

You could also iterate over the string and check for the letters in your match word, one at a time. Will probably be a lot faster than a regex, assuming there are no other edge cases and you just need a true/false that the letters in "Apple" appear somewhere, in order, in "0A1p23p45l6e7".


This is a good suggestion/question. Is it only going to be digits in between letters, or any possible character?

I simplified the problem to describe the solution I'm looking for. The actual problem is that I have a system where users can set up a series of files to be created. When creating the series, they're given options on how to create distinct file names by using any values from a DateTime and/or an auto-incrementing number. So you see names like "BobJoesSuperCoolFile_2015-11-16T14-34.ext" or "SprungerDailyAnalysis26.ext" or "Performance2012-2015Revision06.ext". By stripping out all the numbers from a file name, you can find the series' pattern.

At this point in the code, I'm searching for the files that match the pattern. Unfortunately, the pattern to search for is user-editable, so I can't assume they automatically want to strip out numbers. The best I can do is pre-populate the search box with a regular expression that will do what I'm describing: match the pattern but ignore any numbers.

But "1Apples2" is an easier way of describing it.

b0lt posted:

This is terrible, do not do this. Regex compilation is not a free operation.

I wound up going with the \d* solution, as it works. However, your point reminded me that the way I did it winds up recompiling the regex again and again. I'll fix that.
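
For anyone following along, here is roughly what generating the pattern from a word and compiling it only once could look like. This is a Java sketch, not my actual code: the class name, the cache, and the helper names are all made up, and in Java `\d` only matches ASCII digits by default, which may differ in other regex engines.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

final class DigitTolerantPatterns {
    // One compiled Pattern per word, so repeated searches skip recompilation.
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    // Builds a pattern source of the form ^\d*A\d*p\d*p\d*l\d*e\d*$ for any word.
    // Each character is quoted so that "." or "+" in a file name stays literal.
    static String buildRegex(String word) {
        StringBuilder sb = new StringBuilder("^\\d*");
        word.codePoints().forEach(cp ->
            sb.append(Pattern.quote(new String(Character.toChars(cp)))).append("\\d*"));
        return sb.append('$').toString();
    }

    // Returns the cached Pattern for `word`, compiling it on first use only.
    static Pattern patternFor(String word) {
        return CACHE.computeIfAbsent(word, w -> Pattern.compile(buildRegex(w)));
    }

    static boolean matches(String word, String candidate) {
        return patternFor(word).matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("Apple", "0A1p23p45l6e7"));
        System.out.println(matches("Apple", "Appleb"));
    }
}
```

The cache is keyed on the word, so changing the search word costs one compilation and every later search for the same word reuses it.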

pokeyman
Nov 26, 2006

That elephant ate my entire platoon.

Ophidia posted:

Is there anyone here who has experience in developing windows phone 8.1 apps? I wrote a small app and want to deploy it to my windows phone and I cannot make it work. I created an appx file, but I have no idea how to install it on my phone - could someone explain the necessary steps or link me a tutorial? I spent the last 3 days trying to find something on google and nothing worked so far.

A guess: copy the .appx to the phone over USB then open it on the phone. May need to turn on some kind of developer mode on the phone.

Gul Banana
Nov 28, 2003

the keyword to google is "sideloading"

Ophidia
Oct 20, 2012
Thank you for your answers. I tried to move the appx file to my phone and install it, but for some reason my lumia 925 is missing the "install local apps" option. Googling sideloading also didn't bring any results.

I made it work now, though: the solution was to deploy from Visual Studio. Deploying a solution to "device" while the phone is connected to the PC is the easy way to get an app onto the phone. It took me so long to find out because "Windows 8.1 Universal App" was, it seems, the wrong kind of app. When creating a new app I chose Visual C# -> Windows 8 -> Blank App (Universal Windows 8.1), which is apparently wrong. Instead it has to be Visual C# -> Windows 8 -> Windows Phone -> Blank App (Windows Phone). With that project type there is an option to deploy the app to the device, and now I have it on my phone.

Just wanted to explain in case someone else has this problem.

Rockybar
Sep 3, 2008

I'm making a program that graphs based on input from an arduino. Currently all values and their timestamps are stored in an ArrayList, and then I use this code:

code:
for (int i = TrackedPoints.size() - 1; i >= 0; i--) {
  point(centreX - (width/2) + TrackedPoints.get(i).getTime()*10,
        centreY + TrackedPoints.get(i).getTemp() - (height/2));
}
to draw a graph as a series of points from the array. However, when it reaches about 1000 objects in the array (maybe a minute in) it starts lagging a lot. What is the best way to reduce this? Currently I've thought of two options: merging multiple points in the array, so only a maximum of 600 or so are rendered (displaying the whole graph); or having the graph shift to the left as it goes along (and perhaps making a mode to switch between the two). Is either of these solutions acceptable? Or is there a better way to keep all my data displayed without having to resort to other ways of making it efficient? Just stopping the background update isn't really an option, as there'll be other things updating on screen, some of which I'd like in real time.

baby puzzle
Jun 3, 2011

I'll Sequence your Storm.
You probably don't need to draw every point. Decrement your loop's counter by a number that depends on the size of the array.

Edit: For example, if you can draw 500 points fast enough, and there are 2000 items in the array, then decrease i by 4 (2000/500) on each loop.
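
Here's a sketch of how the step could be computed, in plain Java so it can be tried outside the drawing loop. The names are hypothetical and the 500-point budget is just the number from my example. It returns the indices you would pass to TrackedPoints.get(i), newest first.

```java
import java.util.ArrayList;
import java.util.List;

final class StrideSampling {
    // Returns the indices to draw, newest first, never more than maxPoints of them.
    static List<Integer> indicesToDraw(int size, int maxPoints) {
        if (maxPoints <= 0) {
            throw new IllegalArgumentException("maxPoints must be positive");
        }
        // Ceiling division: the step is at least 1 and large enough to stay in budget.
        int step = Math.max(1, (size + maxPoints - 1) / maxPoints);
        List<Integer> indices = new ArrayList<>();
        for (int i = size - 1; i >= 0; i -= step) {
            indices.add(i);
        }
        return indices;
    }

    public static void main(String[] args) {
        List<Integer> picked = indicesToDraw(2000, 500);
        System.out.println(picked.size() + " points, starting at index " + picked.get(0));
    }
}
```

Below the budget the step stays at 1, so nothing is skipped until the array actually gets big.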

baby puzzle fucked around with this message at 20:15 on Nov 29, 2015

Rockybar
Sep 3, 2008

drat that's a really simple solution, why didn't I think of that. I was going to average out point values but this will be much quicker. cheers.

edit: ↓↓↓ I would do that but the background is cleared at the end of each frame, meaning each point needs to be redrawn. I've already made the chart axes dynamic so they can accommodate a graph being 'squished' once it reaches the end.

Rockybar fucked around with this message at 20:24 on Nov 29, 2015

JawnV6
Jul 4, 2004

So hot ...
Assuming that's Processing? I've done that in C#; the graphing facilities there allow you to add a single point to a Series. That code is looping over every point in the array every time, so check if there's some way to only add the most recent sample.

As an aside, if you're plotting samples coming in at 10Hz, you're going to run out of pixels in a few minutes. I've found graphs that do a sliding window are more useful than compressing samples. See if you can simply adjust the axis values instead of re-drawing every point.
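
A minimal sketch of the sliding-window idea in plain Java, assuming you only care about the most recent N samples. The class name and capacity are made up; this is not the C# Series API, just a fixed-size buffer that drops the oldest sample when a new one arrives.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SlidingWindow {
    private final int capacity;
    private final Deque<Double> samples = new ArrayDeque<>();

    SlidingWindow(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Adds the newest sample, evicting the oldest one once the window is full.
    void add(double sample) {
        if (samples.size() == capacity) {
            samples.removeFirst();
        }
        samples.addLast(sample);
    }

    // Copies the current window, oldest sample first.
    double[] snapshot() {
        double[] out = new double[samples.size()];
        int i = 0;
        for (double s : samples) {
            out[i++] = s;
        }
        return out;
    }

    public static void main(String[] args) {
        SlidingWindow window = new SlidingWindow(3);
        for (int t = 1; t <= 5; t++) {
            window.add(t);
        }
        System.out.println(java.util.Arrays.toString(window.snapshot()));
    }
}
```

The number of points drawn per frame is then bounded by the window size, no matter how long the program has been running.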

TooMuchAbstraction
Oct 14, 2012

I spent four years making
Waves of Steel
Hell yes I'm going to turn my avatar into an ad for it.
Fun Shoe

Rockybar posted:

I'm making a program that graphs based on input from an arduino. Currently all values and their timestamps are stored in an ArrayList, and then I use this code:

code:
for (int i = TrackedPoints.size() - 1; i >= 0; i--) {
  point(centreX - (width/2) + TrackedPoints.get(i).getTime()*10,
        centreY + TrackedPoints.get(i).getTemp() - (height/2));
}
to draw a graph as a series of points from the array. However, when it reaches about 1000 objects in the array (maybe a minute in) it starts lagging a lot. What is the best way to reduce this? Currently I've thought of two options: merging multiple points in the array, so only a maximum of 600 or so are rendered (displaying the whole graph); or having the graph shift to the left as it goes along (and perhaps making a mode to switch between the two). Is either of these solutions acceptable? Or is there a better way to keep all my data displayed without having to resort to other ways of making it efficient? Just stopping the background update isn't really an option, as there'll be other things updating on screen, some of which I'd like in real time.

You can draw the graph points to an image object, and just update that image object each time a new point is added. Then in your display code you just draw the image again. This kind of approach can also be used to do a shifting window (like your second option): just draw the image to itself, but offset by a bit, and the oldest points will disappear while new room opens up for new ones.
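
To make that concrete, here's a sketch using a plain java.awt.image.BufferedImage as the off-screen image. In Processing you would presumably use its own off-screen buffer instead, so treat the class and method names here as hypothetical. Each new point costs one pixel write, and the shifting window is a copy of the image onto itself a few columns to the left.

```java
import java.awt.image.BufferedImage;

final class IncrementalPlot {
    private static final int BACKGROUND = 0xFF000000; // opaque black
    private static final int INK = 0xFFFFFFFF;        // opaque white

    private final BufferedImage canvas;

    IncrementalPlot(int width, int height) {
        canvas = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        clearColumns(0, width);
    }

    // Draws only the newly arrived point; earlier points are already in the image.
    void plot(int x, int y) {
        if (x >= 0 && x < canvas.getWidth() && y >= 0 && y < canvas.getHeight()) {
            canvas.setRGB(x, y, INK);
        }
    }

    // Shifts the whole image left by dx columns and blanks the freed columns.
    void scrollLeft(int dx) {
        int w = canvas.getWidth();
        int h = canvas.getHeight();
        if (dx <= 0) {
            return;
        }
        if (dx >= w) {
            clearColumns(0, w);
            return;
        }
        int[] kept = canvas.getRGB(dx, 0, w - dx, h, null, 0, w - dx);
        canvas.setRGB(0, 0, w - dx, h, kept, 0, w - dx);
        clearColumns(w - dx, w);
    }

    // The display code would draw this image once per frame.
    BufferedImage image() {
        return canvas;
    }

    private void clearColumns(int fromX, int toX) {
        for (int x = fromX; x < toX; x++) {
            for (int y = 0; y < canvas.getHeight(); y++) {
                canvas.setRGB(x, y, BACKGROUND);
            }
        }
    }
}
```

Because the background is cleared every frame, the image has to be drawn again each frame, but that is one image draw instead of one call per stored point.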


hooah
Feb 6, 2006
WTF?
I'm building a genetic algorithm to stress-test elevator control schemes. Currently, fitness is based on the number of floors traveled, but this is resulting in a lot of requests going from the top floor to the bottom floor. I would like to reward more randomness in the request start and destination floors. How can I go about doing this?
