prefect
Sep 11, 2001

No one, Woodhouse.
No one.




Dead Man’s Band

MononcQc posted:

Schemes and Lisp often toyed with the idea of alternative function-based approaches to regular expressions. One of them (SRE) had the POSIX regex:
...

:stare: :wow: :tipshat:


uG
Apr 23, 2003

by Ralp

MononcQc posted:

Schemes and Lisp often toyed with the idea of alternative function-based approaches to regular expressions. One of them (SRE) had the POSIX regex:

...

This is all macro-time poo poo, but there was run-time stuff prepared for it with a different function (use csl instead of rx)

It's an interesting approach that ultimately seems to have never caught on (much like Scheme itself), and can be somewhat obtained by concatenating strings (although groups of backreferences and poo poo are not safe to merge all willy-nilly)

how is this different than using code blocks inside regular expressions?

Zombywuf
Mar 29, 2008

FamDav posted:

i call it more powerful because it's more general and has less structure imposed on it. this can also lead to it interacting poorly with other features of the language resulting in ridiculous errors.

Exactly, the power of a language feature is measured by the size of the hole it leaves in your foot.

tef
May 30, 2004

-> some l-system crap ->

FamDav posted:

<c++ feature> is a bad feature because what you really wanted was something like <thought out feature>. instead you got something that allowed you to kind of <useful thing> but also allowed you to get really weird behavior because <c++>

c++ is a world where everything is possible but the price is unreasonable

Malcolm XML
Aug 8, 2009

I always knew it would end like this.

MononcQc posted:

Schemes and Lisp often toyed with the idea of alternative function-based approaches to regular expressions. One of them (SRE) had the POSIX regex:

code:
"[[:<:]]([b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z])+[[:>:]]"

Be represented in Scheme as:

code:
(w/nocase (word+ (~ ("aeiou"))))

Where a sequence of functions would just build a matching expression. The interesting thing you could do with these is that you can define matches as functions and compose them:

code:

(define ws (rx (+ whitespace))) ; Seq of whitespace

(define date (rx (: (| "Jan" "Feb" "Mar" ...) ; A month/day date.
                    ,ws
                    (| ("123456789")          ; 1-9
                       (: ("12") digit)       ; 10-29
                       "30" "31"))))          ; 30-31

In that case we're defining two expressions: ws, for whitespace, and date, for a month followed by whitespace and the numbers 1-31. Later on, I can define and compose regular expressions by doing:

code:

(rx ... ,date ... (* ... ,date ...)	    
          ... .... ,date)

This is all macro-time poo poo, but there was run-time stuff prepared for it with a different function (use csl instead of rx)

It's an interesting approach that ultimately seems to have never caught on (much like Scheme itself), and can be somewhat obtained by concatenating strings (although groups of backreferences and poo poo are not safe to merge all willy-nilly)

So...parsec? Limited to regular languages?

libcxx
Mar 15, 2013
thread_local post<shit> shit_post("lol if u");

tef posted:

c++ is a world where everything is possible but the price is unreasonable

for a lot of applications, this is a better starting point than a language that comes pre-restricted for the sake of convenience

PleasingFungus
Oct 10, 2012
idiot asshole bitch who should fuck off

Shaggar posted:

The fact that the language convention tends towards readability is one of its strengths. also the auto complete literally makes it faster than an untyped shortcut that the autocomplete cant figure out for you.

Shaggar posted:

if you get mad about verbosity you're the biggest whiney babby idiot.

lol at "verbosity == readability"

Shaggar posted:

if lambdas ever become useful im sure they'll be added to java

lol @ this

gucci void main posted:

weve hit peak shaggar

PleasingFungus
Oct 10, 2012
idiot asshole bitch who should fuck off

coffeetable posted:

it's the sierpinski gasket. probably seen it on the cover of some textbook or other

e: and the other one is the sierpinski arrowhead curve, though i dunno why you'd be more likely to have seen that one than the gasket

also lollin' at people who browse w/avs off

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

PleasingFungus posted:

lol at "verbosity == readability"

http://en.wikipedia.org/wiki/Trimming_%28computer_programming%29#Usage

coffeetable
Feb 5, 2006

TELL ME AGAIN HOW GREAT BRITAIN WOULD BE IF IT WAS RULED BY THE MERCILESS JACKBOOT OF PRINCE CHARLES

YES I DO TALK TO PLANTS ACTUALLY

PleasingFungus posted:

also lollin' at people who browse w/avs off

ffff

Zlodo
Nov 25, 2006

MononcQc posted:

Schemes and Lisp often toyed with the idea of alternative function-based approaches to regular expressions. One of them (SRE) had the POSIX regex:

code:
"[[:<:]]([b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z])+[[:>:]]"
Be represented in Scheme as:

code:
(w/nocase (word+ (~ ("aeiou"))))
Where a sequence of functions would just build a matching expression. The interesting thing you could do with these is that you can define matches as functions and compose them:

code:
(define ws (rx (+ whitespace))) ; Seq of whitespace

(define date (rx (: (| "Jan" "Feb" "Mar" ...) ; A month/day date.
                    ,ws
                    (| ("123456789")          ; 1-9
                       (: ("12") digit)       ; 10-29
                       "30" "31"))))          ; 30-31
In that case we're defining two expressions: ws, for whitespace, and date, for a month followed by whitespace and the numbers 1-31. Later on, I can define and compose regular expressions by doing:

code:
(rx ... ,date ... (* ... ,date ...)	    
          ... .... ,date)
This is all macro-time poo poo, but there was run-time stuff prepared for it with a different function (use csl instead of rx)

It's an interesting approach that ultimately seems to have never caught on (much like Scheme itself), and can be somewhat obtained by concatenating strings (although groups of backreferences and poo poo are not safe to merge all willy-nilly)

well basically this is what happens when you want to make regex not poo poo: you turn it into an actual parser with multiple independently defined grammar rules that can be reused, and with the right language / library writing such rules can be made really nice and easy so why even bother with regex

this is typically the type of thing that ~the power of C++~ is very good at. using templates and operator overloading you can have something very similar to that scheme thing above that lets you define and compose parsing grammar rules like this:

code:
    auto slsl = Term( '/' ) & Term( '/' );
    auto slst = Term( '/' ) & Term( '*' );
    auto stsl = Term( '*' ) & Term( '/' );

    auto NewLine = Term( '\n' ) >> [&]( const range_t& r ){ ++lineNum; lineStart = r.begin() - source.begin(); };

    auto Comment = slsl & Until( Any(), End() | NewLine );
    auto CommentBlock = slst & Until( NewLine | Any(), stsl );
    auto Blank = InString( " \t\r" ) | NewLine;
    auto Space = +( Blank | Comment | CommentBlock );
then it all compiles down into the bunch of functions you'd write for each of those rules to implement that grammar as a recursive descent parser
boost::spirit offers just such a parsing library
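
for anyone wondering what sits underneath rules like that, here is a minimal sketch. it is not the library used above and not boost::spirit: the Parser struct is made up for illustration, and the meaning of Until (run the body until the stop rule matches, then consume the stop rule) is my guess at the intent. it also uses std::function instead of templates, so unlike the real thing it does not compile down to plain functions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string_view>

// Hypothetical parser type: given the input and a position, return the
// position after the match, or nothing if the rule does not match there.
struct Parser {
    std::function<std::optional<std::size_t>(std::string_view, std::size_t)> run;
};

// Match one specific character.
Parser Term(char c) {
    return {[c](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (pos < in.size() && in[pos] == c) return pos + 1;
        return std::nullopt;
    }};
}

// Match any single character.
Parser Any() {
    return {[](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (pos < in.size()) return pos + 1;
        return std::nullopt;
    }};
}

// Match only at the end of the input, consuming nothing.
Parser End() {
    return {[](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (pos == in.size()) return pos;
        return std::nullopt;
    }};
}

// Sequence: a, then b starting where a stopped.
Parser operator&(Parser a, Parser b) {
    return {[a, b](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        auto mid = a.run(in, pos);
        if (!mid) return std::nullopt;
        return b.run(in, *mid);
    }};
}

// Ordered choice: try a, fall back to b at the same position.
Parser operator|(Parser a, Parser b) {
    return {[a, b](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        if (auto r = a.run(in, pos)) return r;
        return b.run(in, pos);
    }};
}

// One or more repetitions; stops if a repetition consumes nothing.
Parser operator+(Parser p) {
    return {[p](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        auto cur = p.run(in, pos);
        if (!cur) return std::nullopt;
        while (true) {
            auto next = p.run(in, *cur);
            if (!next || *next == *cur) break;
            cur = next;
        }
        return cur;
    }};
}

// Assumed semantics: repeat body until stop matches, then consume stop.
Parser Until(Parser body, Parser stop) {
    return {[body, stop](std::string_view in, std::size_t pos) -> std::optional<std::size_t> {
        while (true) {
            if (auto r = stop.run(in, pos)) return r;
            auto next = body.run(in, pos);
            if (!next || *next == pos) return std::nullopt;
            pos = *next;
        }
    }};
}

// Whitespace and comments, composed from the small rules above.
Parser Space() {
    auto Blank = Term(' ') | Term('\t') | Term('\r') | Term('\n');
    auto Comment = Term('/') & Term('/') & Until(Any(), End() | Term('\n'));
    auto CommentBlock = Term('/') & Term('*') & Until(Any(), Term('*') & Term('/'));
    return +(Blank | Comment | CommentBlock);
}
```

Space().run(text, 0) hands back the offset of the first character that is neither whitespace nor part of a comment, or nothing at all if the input doesn't start with either.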

gonadic io
Feb 16, 2011

>>=

Malcolm XML posted:

So...parsec?

Wheany
Mar 17, 2006

Spinyahahahahahahahahahahahaha!

Doctor Rope

Zlodo posted:

regex are gross and only useful to sift through the dejections of another program to find some data

if you need to write regex you are doing a job so unimportant that no one bothered to output the data you need in an usable format

sometimes regexes just own.

FamDav
Mar 29, 2008

like always the haskell version is infinitely better

Shaggar
Apr 26, 2006

PleasingFungus posted:

lol at "verbosity == readability"

in the case of java the "verbosity" idiots are whining about is readability. Having class and variable names that describe what they are instead of p-lang style
Python code:
var ____xXXx4;

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
ah, i see. so int ____xXXx4; is better than var ____xXXx4;. gotcha.

MononcQc
May 29, 2007

uG posted:

how is this different than using code blocks inside regular expressions?

Not sure given I haven't used or seen code blocks inside regexes in any significant manner. My guess is that code inside regexes is about changing or augmenting the regex execution while the scheme approach is to treat the regex as a data structure that is more powerful than a string -- it just turns out that lisps consider data to be code, so you can both run code and represent the regex as sequences of possibly nested data structures that can be composed together.

Malcolm XML posted:

So...parsec? Limited to regular languages?

Possibly. Not that it matters given the context of the original question was "is there a language where a nice pile of regex isn't going to look like that, though?"

Shaggar
Apr 26, 2006

Suspicious Dish posted:

ah, i see. so int ____xXXx4; is better than var ____xXXx4;. gotcha.

yes but int ____bonerCount is even better and then int bonerCount would be best.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
wouldnt it be BONER_COUNT?

Shaggar
Apr 26, 2006
not unless its a constant which it wouldn't be unless maybe u have ed?

FamDav
Mar 29, 2008

MononcQc posted:

Not sure given I haven't used or seen code blocks inside regexes in any significant manner. My guess is that code inside regexes is about changing or augmenting the regex execution while the scheme approach is to treat the regex as a data structure that is more powerful than a string -- it just turns out that lisps consider data to be code, so you can both run code and represent the regex as sequences of possibly nested data structures that can be composed together.


Possibly. Not that it matters given the context of the original question was "is there a language where a nice pile of regex isn't going to look like that, though?"

is your question "is there a language where the default library implementation of regex is parser combinators?" because i'd say it's between maybe and no.

but like all languages have parser combinator libraries

FamDav
Mar 29, 2008
also you know how like sometimes people are like "programmers are dumb they cant understand this language feature"

well add post/pre increment to this

http://forums.somethingawful.com/showthread.php?threadid=3376083&userid=0&perpage=40&pagenumber=153#post418468778

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Shaggar posted:

not unless its a constant which it wouldn't be unless maybe u have ed?

yospos bitch

weird
Jun 4, 2012

by zen death robot

FamDav posted:

also you know how like sometimes people are like "programmers are dumb they cant understand this language feature"

well add post/pre increment to this

http://forums.somethingawful.com/showthread.php?threadid=3376083&userid=0&perpage=40&pagenumber=153#post418468778

isnt that ub?

FamDav
Mar 29, 2008
it takes awhile but someone eventually says that yes

PleasingFungus
Oct 10, 2012
idiot asshole bitch who should fuck off
gently caress people who try to pull dumb poo poo with pre/post-increment. the fanciest thing you're allowed to do with them is traverse an array with a pointer in one step, i.e. foo = *(i++). anything more complex than that is (a) unreadable, (b) probably undefined and (c) makes you look like a douche.
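
a small sketch of where that line sits. the sum function is my own example, not the code from the linked thread, and the undefined expression is left in a comment on purpose so nothing here depends on it.

```cpp
#include <cassert>
#include <cstddef>

// Walk an array with a pointer, reading and advancing in one step.
int sum(const int* p, std::size_t n) {
    int total = 0;
    while (n-- > 0) {
        total += *p++;  // same as *(p++): read through the old p, then bump it
    }
    return total;
}

// The kind of thing to avoid:
//   i = i++ + ++i;
// Both operands of + modify i and nothing sequences them relative to each
// other, so the behavior is undefined even though it compiles.
```

the pointer walk is fine because each expression touches p exactly once.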

Bloody
Mar 3, 2013

FamDav posted:

also you know how like sometimes people are like "programmers are dumb they cant understand this language feature"

well add post/pre increment to this

http://forums.somethingawful.com/showthread.php?threadid=3376083&userid=0&perpage=40&pagenumber=153#post418468778

if i got this in an interview i would almost certainly tell the interviewer to blow it out their rear end which is really the only correct answer

Bloody
Mar 3, 2013

PleasingFungus posted:

gently caress people who try to pull dumb poo poo with pre/post-increment. the fanciest thing you're allowed to do with them is traverse an array with a pointer in one step, i.e. foo = *(i++). anything more complex than that is (a) unreadable, (b) probably undefined and (c) makes you look like a douche.

but but but my artisanal bespoke handcrafted shitcode!!!

Tiny Bug Child
Sep 11, 2004

Avoid Symmetry, Allow Complexity, Introduce Terror

Zlodo posted:

regex are gross and only useful to sift through the dejections of another program to find some data

if you need to write regex you are doing a job so unimportant that no one bothered to output the data you need in an usable format

if you never need to write regex your job is so boring that someone has already gotten to the data you need and carefully digested + regurgitated it so your babby self can handle it

MononcQc
May 29, 2007

lmao at the idea that a factor to identify a fun job is having regexes as a necessity

Janitor Prime
Jan 22, 2004

PC LOAD LETTER

What da fuck does that mean

Fun Shoe

PleasingFungus posted:

gently caress people who try to pull dumb poo poo with pre/post-increment. the fanciest thing you're allowed to do with them is traverse an array with a pointer in one step, i.e. foo = *(i++). anything more complex than that is (a) unreadable, (b) probably undefined and (c) makes you look like a douche.

Similar in vein is using it to fill in prepared statements in jdbc code.
Java code:
int i = 1;
ps.setLong(i++, 80085);
ps.setString(i++, "poo poo");
ps.setString(i++, "lord");
ps.setString(i++, "posting");
ps.setString(i++, "inc");
ps.executeUpdate();

Workaday Wizard
Oct 23, 2009

by Pragmatica
why does that c code even compile? are c compilers too lazy to throw an error for that case or is there legit reason?


also re: regexs. if your regex has multiple components what's wrong with creating string patterns for each component and +ing them when you compile regex object??
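
for the simple case that works fine. a minimal sketch with std::regex, assuming the default ECMAScript grammar; the names ws, date and is_date_range are made up here and only loosely mirror the scheme example quoted earlier. the catch already mentioned in the thread is capture groups: a fragment that contains ( ... ) shifts the numbering of every group and backreference that comes after it, so the fragments below stick to non-capturing (?: ... ) groups.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical fragment: one or more whitespace characters.
const std::string ws = R"(\s+)";

// Month name, whitespace, then a day from 1 to 31.
// Non-capturing groups keep this fragment from renumbering the capture
// groups of whatever larger pattern it gets glued into.
const std::string date = "(?:Jan|Feb|Mar)" + ws + "(?:[12][0-9]|3[01]|[1-9])";

// "<date> to <date>", built by plain string concatenation.
bool is_date_range(const std::string& text) {
    static const std::regex re(date + ws + "to" + ws + date);
    return std::regex_match(text, re);
}
```

the other thing plain strings don't give you is any check that a fragment is a valid pattern on its own; a stray paren in one piece only blows up when the whole thing is compiled.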

Shaggar
Apr 26, 2006

Hard NOP Life posted:

Similar in vein is using the it to fill in prepared statements in jdbc code.
Java code:
int i = 1;
ps.setLong(i++, 80085);
ps.setString(i++, "poo poo");
ps.setString(i++, "lord");
ps.setString(i++, "posting");
ps.setString(i++, "inc");
ps.executeUpdate();

ugghhhhhhh

JewKiller 3000
Nov 28, 2006

by Lowtax

FamDav posted:

also you know how like sometimes people are like "programmers are dumb they cant understand this language feature"

well add post/pre increment to this

http://forums.somethingawful.com/showthread.php?threadid=3376083&userid=0&perpage=40&pagenumber=153#post418468778

jesus christ i read farther in that thread, and there's someone about to graduate with a 4.0 in cs who doesn't know what a hashmap is, with the excuse "i only know java"

Cocoa Crispies
Jul 20, 2001

Vehicular Manslaughter!

Pillbug

Shinku ABOOKEN posted:

why does that c code even compile? are c compilers too lazy to throw an error for that case or is there legit reason?

it's not illegal, it's just undefined

that these are distinct concepts, well

PleasingFungus
Oct 10, 2012
idiot asshole bitch who should fuck off

Shinku ABOOKEN posted:

why does that c code even compile? are c compilers too lazy to throw an error for that case or is there legit reason?

compiler maintainers feel a need to support "legacy code" - if it 'works' then people get mad when the new compiler version breaks it, even if it should never have worked in the first place

-pedantic -Wall exist for a reason but sadly aren't on by default

FamDav
Mar 29, 2008

Cocoa Crispies posted:

it's not illegal, it's just undefined

that these are distinct concepts, well


PleasingFungus posted:

compiler maintainers feel a need to support "legacy code" - if it 'works' then people get mad when the new compiler version breaks it, even if it should never have worked in the first place

-pedantic -Wall exist for a reason but sadly aren't on by default

my variation on this is that certain types of behavior were left undefined to make things easier for compiler writers to optimize, or something.

of course this is a bad thing
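
a commonly cited illustration of the optimization angle, as a sketch. this is my own example, not something from the linked thread, and what any given compiler actually does with it depends on the compiler and flags.

```cpp
#include <cassert>

// Signed integer overflow is undefined, so the compiler is allowed to assume
// x + 1 never wraps and may treat this whole function as "return true".
// With unsigned x the wraparound is defined and the comparison has to stay.
bool next_is_bigger(int x) {
    return x + 1 > x;
}
```

for every input that doesn't overflow the answer is true either way; the undefined case is only x == INT_MAX, which is exactly the case the optimizer gets to ignore.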

Brain Candy
May 18, 2006

JewKiller 3000 posted:

jesus christ i read farther in that thread, and there's someone about to graduate with a 4.0 in cs who doesn't know what a hashmap is, with the excuse "i only know java"

this person will get a job writing your medical billing software

tef
May 30, 2004

-> some l-system crap ->

MononcQc posted:

Possibly. Not that it matters given the context of the original question was "is there a language where a nice pile of regex isn't going to look like that, though?"

Off the top of my head:

prolog uses definite clause grammars, it's like peg but with no memoisation, but way more control over search/backtracking.

icon uses a neat but somewhat weird system of generators and goal directed evaluation

snobol uses pattern matching and conditional jumps

a lot of languages use combinator parsing, which is basically like building a recursive descent parser from functional composition

but, heh, lua has lpeg, which is a backtracking combinator library, which then a regex library is built on. it's well nice


Zombywuf
Mar 29, 2008

JewKiller 3000 posted:

jesus christ i read farther in that thread, and there's someone about to graduate with a 4.0 in cs who doesn't know what a hashmap is, with the excuse "i only know java"

I'm pleasantly surprised when I meet high level CS degree holders who know what an array is.

Or a variable.
