EIDE Van Hagar
Dec 8, 2000

Beep Boop

Inverse Icarus posted:

the tides rise and fall

a star explodes

time marches ever forward

never a miscommunication.

Toad King
Apr 23, 2008

Yeah, I'm the best

Gazpacho posted:

well yeah, because that information isn't in a raster font to be gotten

the actual part i was using was grabbing a bitmap of the glyph at a specified size, which seems like it would be easy to do for a raster font, but alas there appears to be no way to do it with the windows api

but i did find this when searching for a solution: http://wine.1045685.n5.nabble.com/Re-1-2-gdi32-GetGlyphOutline-should-fail-for-a-bitmap-font-td5675449.html

"well this is incorrect behavior but the correct behavior breaks other things soooooo gently caress it"

Gazpacho
Jun 18, 2004

by Fluffdaddy
Slippery Tilde

Toad King posted:

the actual part i was using was grabbing a bitmap of the glyph at a specified size, which seems like it would be easy to do for a raster font, but alas there appears to be no way to do it with the windows api
oh that, you can make a bitmap DC and draw the character into it

tef
May 30, 2004

-> some l-system crap ->

JawnV6 posted:

i feel like i should read higher order perl but i mash text around just fine w/ perl right now

a good book.

tef
May 30, 2004

-> some l-system crap ->

Hammerite posted:

oh. i assumed it would be something more interesting than that. idk why you'd care much about that

it's a little more annoying than that

python strings act like lists, that contain lists, etc. there is no notion of a character. the problem is telling a list of things and a string apart becomes awkward.

you'll have hit this when you've passed some arg "foo" instead of ("foo",)

it causes some other issues, for some reason, python lacks a flatten operator (guido :argh:), and so everyone ends up rewriting it in their own unique way: for example

code:
def flatten(input):
    out = []
    try:
        for x in input:
            out.extend(flatten(x))
    except StandardError, e:
        out.append(input)
    return out


print flatten([1,[2,3],4])
print flatten("abc")

flatten "abc" shouldn't work but it does, because you hit the recursion limit and the except swallows it :q:

aside: array indexing on strings isn't a good idea when you actually use unicode rather than sticking your fingers in your ears *la la la* everything is ascii *la la la* nonsense
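A minimal Python 3 sketch of the string-versus-list problem described above. It is an illustration only, not code from the thread, and `count_args` is a hypothetical helper. It shows that iterating a string yields more strings, which is why a naive flatten never bottoms out, and why passing "foo" where ("foo",) was meant goes wrong quietly.

```python
# Python 3 sketch: a str iterates into length-1 strs, which iterate into themselves.
s = "foo"
first = next(iter(s))       # "f", which is itself a str
again = next(iter(first))   # still "f": there is no separate character type


def count_args(args):
    """Hypothetical helper: counts the items it was handed."""
    return len(list(args))


as_string = count_args("foo")     # iterates the three characters
as_tuple = count_args(("foo",))   # iterates the single item

print(type(first).__name__, first == again, as_string, as_tuple)
# prints: str True 3 1
```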

rotor
Jun 11, 2001

classic case of pineapple derangement syndrome

ahhh spiders posted:

How would that even work

returns bounding box

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

tef posted:

it's a little more annoying than that

python strings act like lists, that contain lists, etc. there is no notion of a character. the problem is telling a list of things and a string apart becomes awkward.

you'll have hit this when you've passed some arg "foo" instead of ("foo",)

it causes some other issues, for some reason, python lacks a flatten operator (guido :argh:), and so everyone ends up rewriting it in their own unique way: for example

code:
def flatten(input):
    out = []
    try:
        for x in input:
            out.extend(flatten(x))
    except StandardError, e:
        out.append(input)
    return out


print flatten([1,[2,3],4])
print flatten("abc")

flatten "abc" shouldn't work but it does, because you hit the recursion limit and the except swallows it :q:

oh. yeah that seems quite a nasty thing to do. Presumably there is a python version of is_string you can use to avoid doing that any time you want to flatten a list of (lists of) strings?
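One way such a check could look, sketched in Python 3 (in Python 2 the type to test against would be `basestring`). This is an illustration under those assumptions, not anyone's code from the thread: strings and bytes are treated as atoms, so nothing relies on the recursion limit.

```python
from collections.abc import Iterable


def flatten(value):
    """Flatten nested iterables, treating str and bytes as single items."""
    if isinstance(value, (str, bytes)) or not isinstance(value, Iterable):
        return [value]
    out = []
    for item in value:
        out.extend(flatten(item))
    return out


print(flatten([1, [2, 3], 4]))           # [1, 2, 3, 4]
print(flatten(["ab", ["cd", ["ef"]]]))   # ['ab', 'cd', 'ef']
print(flatten("abc"))                    # ['abc']
```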

tef posted:

aside: array indexing on strings isn't a good idea when you actually use unicode rather than sticking your fingers in your ears *la la la* everything is ascii *la la la* nonsense

Why? because combining characters?

tef
May 30, 2004

-> some l-system crap ->

Hammerite posted:

Why? because combining characters?

that and variable width encodings. unless you use utf-32 to store strings, you will have to deal with variable width encodings in utf-8 or utf-16 (surrogate pairs). being able to leap n chars into a string without scanning is a holdover from ascii.
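To make the variable-width point concrete, here is a small Python 3 sketch that counts the bytes each encoding needs for a few sample characters. The samples and the `encoded_widths` helper are chosen for illustration; the little-endian codec names are used only so that no byte order mark is counted.

```python
def encoded_widths(ch):
    """Bytes needed to store one code point in each encoding (no BOM)."""
    return {enc: len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}


# "a", e-acute, euro sign, and an emoji outside the Basic Multilingual Plane
for ch in ["a", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", encoded_widths(ch))

# U+0061  {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
# U+00E9  {'utf-8': 2, 'utf-16-le': 2, 'utf-32-le': 4}
# U+20AC  {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}
# U+1F600 {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 4}  (surrogate pair in utf-16)
```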

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

tef posted:

that and variable width encodings. unless you use utf-32 to store strings, you will have to deal with variable width encodings in utf-8 or utf-16 (surrogate pairs). being able to leap n chars into a string without scanning is a holdover from ascii.

But what's the problem with array indexing of strings if you index codepoints or characters rather than bytes? (Characters to avoid the combining characters thing.) The language should be able to do that surely.

Sweevo
Nov 8, 2007

i sometimes throw cables away

i mean straight into the bin without spending 10+ years in the box of might-come-in-handy-someday first

im a fucking monster

when each character is a variable size then you have to walk the entire string every time you try to access the Nth entry
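A rough Python 3 sketch of that walk over raw UTF-8 bytes. `nth_codepoint_utf8` is a made-up name, and the function assumes the input is valid UTF-8. The point is only that reaching entry N means looking at every byte before it.

```python
def nth_codepoint_utf8(data: bytes, n: int) -> str:
    """Return the n-th code point (0-based) of valid UTF-8 by scanning from the start."""
    if n < 0:
        raise IndexError("negative index not supported in this sketch")
    index = -1
    start = None
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:        # not a continuation byte: a code point starts here
            index += 1
            if index == n:
                start = pos            # remember where the wanted code point begins
            elif index == n + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError("code point index out of range")
    return data[start:].decode("utf-8")   # wanted code point was the last one


data = "a\u20ac\U0001F600z".encode("utf-8")   # 1 + 3 + 4 + 1 bytes
print(len(data))                              # 9
print(nth_codepoint_utf8(data, 3))            # z
```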

karms
Jan 22, 2006

by Nyc_Tattoo
Yam Slacker

Sweevo posted:

when each character is a variable size then you have to walk the entire string every time you try to access the Nth entry

not if the internal representation is utf 32!! :q:

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

tef posted:

code:
    except StandardError, e:

really tef? invalid syntax in your example code?

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Sweevo posted:

when each character is a variable size then you have to walk the entire string every time you try to access the Nth entry

so don't do it if performance is crucial? idgaf and if i did gaf i'd just find a more efficient way to do whatever it is i wanted to do

tef
May 30, 2004

-> some l-system crap ->

Lysidas posted:

really tef? invalid syntax in your example code?

eat it python-3 havers

tef
May 30, 2004

-> some l-system crap ->

Hammerite posted:

But what's the problem with array indexing of strings if you index codepoints or characters rather than bytes?

tef posted:

that and variable width encodings. unless you use utf-32 to store strings,

KARMA! posted:

not if the internal representation is utf 32!! :q:


basically you have to use 4 bytes for each character. it's expensive.


quote:

(Characters to avoid the combining characters thing.) The language should be able to do that surely.

combining characters are different from variable width encodings, that's a whole separate issue altogether. :q:

tef
May 30, 2004

-> some l-system crap ->
from dive into mark (rip)


I was walking across a bridge one day, and I saw a man standing on the edge, about to jump off. So I ran over and said, “Stop! Don’t do it!”

“I can’t help it,” he cried. “I’ve lost my will to live.”

“What do you do for a living?” I asked.

He said, “I create web services specifications.”

“Me too!” I said. “Do you use REST web services or SOAP web services?”

He said, “REST web services.”

“Me too!” I said. “Do you use text-based XML or binary XML?”

He said, “Text-based XML.”

“Me too!” I said. “Do you use XML 1.0 or XML 1.1?”

He said, “XML 1.0.”

“Me too!” I said. “Do you use UTF-8 or UTF-16?”

He said, “UTF-8.”

“Me too!” I said. “Do you use Unicode Normalization Form C or Unicode Normalization Form KC?”

He said, “Unicode Normalization Form KC.”

“Die, heretic scum!” I shouted, and I pushed him over the edge.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

tef posted:

combining characters are different from variable width encodings, that's a whole separate issue altogether. :q:

yes I know.

with the variable width thing, you're saying: you either have to store in utf-32 (expensive in terms of memory), or scan through the string to find the offset (expensive in terms of computation). I'm saying I'm fine with that, gimme my array indexing.

as regards normalisation forms and all that, I have to concede I don't know enough to have an opinion.

tef
May 30, 2004

-> some l-system crap ->

Hammerite posted:

with the variable width thing, you're saying: you either have to store in utf-32 (expensive in terms of memory), or scan through the string to find the offset (expensive in terms of computation).

yup. utf-32 is quite expensive in terms of memory. utf-8 is what a large proportion of documents are serialised in

quote:

I'm saying I'm fine with that, gimme my array indexing.

fwiw, when I read at an offset, i'm likely scanning the string in the first place.

(regarding python, it's more that i don't think "foo"[1] should be valid, because strings don't provide list semantics or list performance in python, guido :argh:).


quote:

as regards normalisation forms and all that, I have to concede I don't know enough to have an opinion.

enjoy http://www.unicode.org/reports/tr15/ :catdrugs:
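For a quick taste before reading the report, a small Python 3 sketch using the standard library's `unicodedata.normalize`. The sample strings are picked for illustration. It shows NFC composing a base letter with its combining accent, and NFKC additionally folding a compatibility ligature that NFC leaves alone.

```python
import unicodedata

decomposed = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT
nfc = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(nfc), nfc == "\u00e9")   # 2 1 True

ligature = "\ufb01"       # LATIN SMALL LIGATURE FI
nfc_lig = unicodedata.normalize("NFC", ligature)
nfkc_lig = unicodedata.normalize("NFKC", ligature)
print(nfc_lig == ligature, nfkc_lig == "fi")        # True True
```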

qntm
Jun 17, 2009

tef posted:

yup. utf-32 is quite expensive in terms of memory.

quite expensive meaning that the first 11 of those 32 bits are always guaranteed to be zeroes
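The arithmetic behind that figure, as a short Python 3 check. It assumes Python 3.3 or later, where `sys.maxunicode` is always 0x10FFFF.

```python
import sys

max_code_point = sys.maxunicode             # highest Unicode code point
bits_needed = max_code_point.bit_length()   # bits required to represent it
unused_bits = 32 - bits_needed              # always-zero high bits in a UTF-32 unit

print(hex(max_code_point), bits_needed, unused_bits)   # 0x10ffff 21 11
```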

blorpy
Jan 5, 2005

i dont care about characters for some weirdo in cambodia hth

penus de milo
Mar 9, 2002

CHAR CHAR

qntm posted:

quite expensive meaning that the first 11 of those 32 bits are always guaranteed to be zeroes

oh no, a whole 0.000001086543 cents worth of memory is wasted how expensive :rolleyes:

tef
May 30, 2004

-> some l-system crap ->

Markov Chain Chomp posted:

i dont care about characters for some weirdo in cambodia hth

racist

MononcQc
May 29, 2007

Is UTF-32 actually changing anything regarding grapheme clusters and whatnot? From my understanding, you can go to a given code point as you want thanks to being able to index everything right, but you don't have any better concept of 'character' in there, only stable code points.

I mean, unless you're happily handling code point strings with code point-based length and whatnot, using UTF-32 won't give you a big advantage come text manipulation.

To handle text, do truncation, word splitting, length calculation, word or letter replacement, you still need to analyze the whole drat thing, figure out your grapheme clusters and whatnot.

So basically, you still need to do 95% (or some other arbitrary value) of the work of UTF-8 when in UTF-32, at a higher memory cost. UTF-8 makes sense, and variable code point size is the least of your worries when working with Unicode strings.
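A Python 3 sketch of the gap between code points and what a reader would call a character. `rough_clusters` is a hypothetical helper and only a crude approximation: it glues combining marks onto the preceding code point, whereas real grapheme segmentation follows the rules in UAX #29, which the standard library does not implement.

```python
import unicodedata


def rough_clusters(text):
    """Group each code point with the combining marks that follow it.

    Crude approximation for illustration; not a UAX #29 implementation.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch      # attach the mark to the previous cluster
        else:
            clusters.append(ch)
    return clusters


word = "cafe\u0301"                 # "café" with a combining accent

code_points = len(word)                             # what indexing sees
utf32_units = len(word.encode("utf-32-le")) // 4    # same count in UTF-32
perceived = len(rough_clusters(word))               # closer to what a reader sees
truncated = word[:4]                                # slicing by code point drops the accent

print(code_points, utf32_units, perceived, truncated)   # 5 5 4 cafe
```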

Tiny Bug Child
Sep 11, 2004

Avoid Symmetry, Allow Complexity, Introduce Terror
there are 256 characters. many are used for drawing boxes with single/double borders. all the other characters are fake ones made up by some guys in sweden. i'm not saying you can't use the fake characters, just understand that they're technically 2 or more real characters and the computer is pretending they're something else

Rufus Ping
Dec 27, 2006

I'm a Friend of Rodney Nano

JawnV6 posted:

i feel like i should read higher order perl but i mash text around just fine w/ perl right now

read it its good

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
YOSPOS: Computer is pretending

Police Academy III
Nov 4, 2011
computers are basically pretending machines when you think about it

JawnV6
Jul 4, 2004

So hot ...
yeah every layer pretends it's something it's not to the other layers

penus de milo
Mar 9, 2002

CHAR CHAR
I'm pretending to be paying attention to this meeting but really I'm being a dick in IRC.

rotor
Jun 11, 2001

classic case of pineapple derangement syndrome

JawnV6 posted:

yeah every layer pretends it's something it's not to the other layers

interface? More like poseur imo.

trex eaterofcadrs
Jun 17, 2005
My lack of understanding is only exceeded by my lack of concern.
java would be much more fun to read

code:
public poser List<T> { ... } 

JawnV6
Jul 4, 2004

So hot ...

rotor posted:

interface? More like poseur imo.

multiple clique inheritance

Toady
Jan 12, 2009

ahhh spiders posted:

the lua book

hey spiders what do you think of squirrel

JawnV6
Jul 4, 2004

So hot ...
oh sure, sure, we'll just execute those instructions.. one at a time.. in the order u listed...

double sulk
Jul 2, 2010

wheres shaggar he hasnt been around for a lil bit

python owns

Rufus Ping
Dec 27, 2006

I'm a Friend of Rodney Nano
:pwn:thon

The Best Christmas
May 13, 2011

programming is confusing :(

CaptainMeatpants
Jun 1, 2010

Sulk posted:

python owns

vapid cutlery
Apr 17, 2007

php:
<?
"it's george costanza" ?>

Toady posted:

hey spiders what do you think of squirrel

why would i use it over lua

Shaggar
Apr 26, 2006
python is trash for hobbyists who cant just use java or c# cause those languages have already solved or trivialized all the problems people use python for.
