furtheraway
Dec 19, 2005

this milchkuh has given its last pound of flesh
I understand that we're supposed to keep language specific questions in their related threads, but I don't see any current threads on shell scripting or sed. If I'm wrong, feel free to ridicule me.

With that out of the way, my question is about using regular expressions with sed. I'm searching a line of numbers and symbols and only want a certain group of numbers. My input string will look like this:
code:
16(100%)
I only want the numerals between the opening parenthesis and the percent sign. That group can contain one to three digits. How do I extract it?

I've tried the following (and many variations) with no luck. This regex makes the most sense to me in this context:
code:
([0-9]*(\([0-9]{1,3})%\))
Any ideas? I know it's something simple, so if anyone can spare an epiphany...


furtheraway
Dec 19, 2005

this milchkuh has given its last pound of flesh

bitprophet posted:

Must be sed's interpretation of regex. I wasn't able to convince it to do what you want (not even something as simple as '[0-9]+' matched, so either I was misusing sed or that's a part of the syntax it doesn't support...?), but Python's RE library works fine; I think it's largely PCRE-compatible:


Upon further investigation, it looks like you need the -E flag for sed to use extended regex instead of basic:


EDIT: I used + instead of {1,3} there but same difference in this case :v: Also, I used + instead of * for the earlier part, which is probably better unless your input sometimes lacks the leading number entirely; I also removed the all-encompassing parentheses which I am pretty sure were extraneous :)

It doesn't look like my version of sed (4.0.7) supports -E, which seems odd. Maybe it's a GNU thing? Either way, your post (and a cup of tea) showed me exactly what I was missing, and it's totally embarrassing. I was forgetting to give sed a search command! :doh: I'm so ashamed of myself right now.

With a little fiddling I was able to get your version to work and my problem is solved. Thank you very much!
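
For anyone landing here later, this is a minimal sketch of one way the extraction could look. It is not the exact command from the thread; the function name extract_percent is made up for illustration. It sticks to basic regex syntax, so it sidesteps the question of whether a given sed accepts -E (GNU sed has long spelled that flag -r).

```shell
# Hypothetical helper: print the digits between "(" and "%" in a
# string shaped like 16(100%).
extract_percent() {
  # -n suppresses default output; the trailing p prints only when the
  # substitution matched. In basic regex syntax, \( \) is a capture
  # group, \{1,3\} is the repeat count, and a bare ( or ) is literal.
  printf '%s\n' "$1" | sed -n 's/^[0-9]*(\([0-9]\{1,3\}\)%)$/\1/p'
}

extract_percent '16(100%)'   # prints 100
```

Input that doesn't fit the pattern prints nothing, which makes a bad line easy to spot in a larger pipeline.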

furtheraway
Dec 19, 2005

this milchkuh has given its last pound of flesh

axolotl farmer posted:

Here's my very ugly take on it!

echo "16(100%)" |sed 's/[()%]/ /g'|awk '{print $2}'

That's a cool method as well (took me a second to spot the space), and it's not nearly as ugly as the entire command I'm using to get my string.

Thanks to everyone for the replies.

furtheraway
Dec 19, 2005

this milchkuh has given its last pound of flesh

Plastic Jesus posted:

It might be simpler (and possibly faster) to just 'cut' twice instead of sed:

code:
[jacob@tw-133]$ echo '16(100%)'|cut -d'(' -f2|cut -d'%' -f1
100
[jacob@tw-133]$ 

That's an excellent point. Using a regex in my situation is overkill. I think I'll just use this method.
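
In the same no-regex spirit, here is a sketch that uses only the shell's own parameter expansion, so no external process is started at all. The function name extract_percent_pe is hypothetical, and it assumes the input always contains a "(" followed later by a "%".

```shell
# Hypothetical helper: same extraction, using only POSIX parameter expansion.
extract_percent_pe() {
  s=$1           # e.g. 16(100%)
  s=${s#*\(}     # strip the shortest prefix ending in "(", leaving 100%)
  s=${s%%\%*}    # strip everything from the first "%" onward, leaving 100
  printf '%s\n' "$s"
}

extract_percent_pe '16(100%)'   # prints 100
```

Unlike the cut version, this does no validation: if the delimiters are missing, the string comes back unchanged.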

furtheraway
Dec 19, 2005

this milchkuh has given its last pound of flesh
I want to use a Unix shell script to generate a plain text file of sequential ten-digit numbers (from 0000000000 to 9999999999), one number per line. Something like this:

0000000000
0000000001
...
9999999998
9999999999

What is the most efficient way to do this?

EDIT: Thanks for all the helpful answers!

furtheraway fucked around with this message at 17:36 on May 28, 2010


furtheraway
Dec 19, 2005

this milchkuh has given its last pound of flesh
Awesome. Thanks for the tips, everyone. Especially about the write bottleneck, since I hadn't even considered that mess.
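
For reference, a minimal sketch of one way to generate the list, assuming a seq that supports -f (GNU coreutils and the BSDs both do) and newline-separated output. The function name gen_numbers and the file name numbers.txt are made up for illustration.

```shell
# Hypothetical helper: print zero-padded ten-digit numbers from $1 to $2,
# one per line.
gen_numbers() {
  # seq formats with floating-point conversions; %010.0f pads with
  # leading zeros to a width of ten and prints no decimal places.
  seq -f '%010.0f' "$1" "$2"
}

gen_numbers 0 3
# 0000000000
# 0000000001
# 0000000002
# 0000000003
```

The full run would be gen_numbers 0 9999999999 > numbers.txt. At 11 bytes per line (ten digits plus a newline) and 10^10 lines, that file comes to about 110 GB, so the disk is likely to be the limiting factor rather than the command generating the numbers.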
