|
Inverse Icarus posted:the tides rise and fall
never a miscommunication.
|
# ? May 8, 2012 07:45 |
|
Gazpacho posted:well yeah, because that information isn't in a raster font to be gotten
the actual part i was using was grabbing a bitmap of the glyph at a specified size, which seems like it would be easy to do for a raster font, but alas there appears to be no way to do it with the windows api
but i did find this when searching for a solution: http://wine.1045685.n5.nabble.com/Re-1-2-gdi32-GetGlyphOutline-should-fail-for-a-bitmap-font-td5675449.html
"well this is incorrect behavior but the correct behavior breaks other things soooooo gently caress it"
|
# ? May 8, 2012 08:33 |
|
Toad King posted:the actual part i was using was grabbing a bitmap of the glyph at a specified size, which seems like it would be easy to do for a raster font, but alas there appears to be no way to do it with the windows api
|
# ? May 8, 2012 09:14 |
|
JawnV6 posted:i feel like i should read higher order perl but i mash text around just fine w/ perl right now
a good book.
|
# ? May 8, 2012 09:28 |
|
Hammerite posted:oh. i assumed it would be something more interesting than that. idk why you'd care much about that
it's a little more annoying than that
python strings act like lists that contain lists, etc. there is no notion of a character. the problem is that telling a list of things and a string apart becomes awkward. you'll have hit this when you've passed some arg "foo" instead of ("foo",)
it causes some other issues: for some reason, python lacks a flatten operator (guido), and so everyone ends up rewriting it in their own unique way: for example code:
aside: array indexing on strings isn't a good idea when you actually use unicode rather than the sticking your fingers in your ears *la la la* everything is ascii *la la la nonsense*
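a minimal sketch of the flatten problem described above, in python 3. this is not the code from the original post; `flatten` and the sample data are made up for illustration. the `isinstance` check is the "is this a string" test: without it, a one-character string would keep unpacking into itself forever.

```python
from collections.abc import Iterable


def flatten(items):
    """Yield the leaves of a nested iterable, treating str and bytes as leaves."""
    for item in items:
        # A str is itself iterable (and each element is again a str),
        # so it has to be special-cased or the recursion never ends.
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item


nested = ["foo", ["bar", ("baz", ["qux"])], "quux"]
flat = list(flatten(nested))
print(flat)

# The "foo" vs ("foo",) mix-up: a bare string is iterated character by character.
print(list(flatten(("foo",))))
print(list(flatten("foo")))
```

note that the special case only protects strings nested inside something else. a bare "foo" passed as the top-level argument is still walked one character at a time.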
|
# ? May 8, 2012 09:54 |
|
ahhh spiders posted:How would that even work
returns bounding box
|
# ? May 8, 2012 10:06 |
|
tef posted:it's a little more annoying than that
oh. yeah that seems quite a nasty thing to do. Presumably there is a python version of is_string you can use to avoid doing that any time you want to flatten a list of (lists of) strings?
tef posted:aside: array indexing on strings isn't a good idea when you actually use unicode rather than the sticking your fingers in your ears *la la la* everything is ascii *la la la nonsense*
Why? because combining characters?
|
# ? May 8, 2012 10:11 |
|
Hammerite posted:Why? because combining characters?
that and variable width encodings. unless you use utf-32 to store strings, you will have to deal with variable width encodings in utf-8 or utf-16 (surrogate pairs). being able to leap n chars into a string without scanning is a holdover from ascii.
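a small sketch of what "variable width" means here, using python's built-in codecs. `encoded_widths` is a hypothetical helper name, and the four sample characters are just picked to hit each utf-8 length. the `-le` codec variants are used so no byte order mark gets counted.

```python
def encoded_widths(ch):
    """Return the byte length of one code point in UTF-8, UTF-16 and UTF-32."""
    return tuple(len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le"))


samples = {
    "a": "a",               # U+0061, plain ASCII
    "e-acute": "\u00e9",    # U+00E9
    "euro": "\u20ac",       # U+20AC
    "g-clef": "\U0001d11e", # U+1D11E, outside the BMP: a surrogate pair in UTF-16
}

for name, ch in samples.items():
    print(name, encoded_widths(ch))
```

only the utf-32 column stays constant, which is why jumping n characters in without scanning only works there.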
|
# ? May 8, 2012 11:14 |
|
tef posted:that and variable width encodings. unless you use utf-32 to store strings, you will have to deal with variable width encodings in utf-8 or utf-16 (surrogate pairs). being able to leap n chars into a string without scanning is a holdover from ascii.
But what's the problem with array indexing of strings if you index codepoints or characters rather than bytes? (Characters to avoid the combining characters thing.) The language should be able to do that surely.
|
# ? May 8, 2012 12:01 |
|
when each character is a variable size then you have to walk the entire string every time you try to access the Nth entry
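a rough sketch of that walk over utf-8 bytes. `utf8_offset` is a made-up name and it assumes the input is already valid utf-8; it only shows why finding the Nth code point costs a scan from the start.

```python
def utf8_offset(data: bytes, n: int) -> int:
    """Return the byte offset where code point n (0-based) starts in valid UTF-8."""
    seen = 0
    for offset, byte in enumerate(data):
        # Continuation bytes look like 0b10xxxxxx; any other byte starts a code point.
        if byte & 0xC0 != 0x80:
            if seen == n:
                return offset
            seen += 1
    raise IndexError("code point index out of range")


# 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes), U+1D11E (4 bytes), 'z' (1 byte)
data = "a\u00e9\u20ac\U0001d11ez".encode("utf-8")
offsets = [utf8_offset(data, i) for i in range(5)]
print(offsets)
```

every lookup restarts from byte 0, so indexing in a loop is quadratic unless you remember where you got to.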
|
# ? May 8, 2012 12:16 |
|
Sweevo posted:when each character is a variable size then you have to walk the entire string every time you try to access the Nth entry
not if the internal representation is utf 32!!
|
# ? May 8, 2012 12:41 |
|
tef posted:
really tef? invalid syntax in your example code?
|
# ? May 8, 2012 12:42 |
|
Sweevo posted:when each character is a variable size then you have to walk the entire string every time you try to access the Nth entry
so don't do it if performance is crucial? idgaf and if i did gaf i'd just find a more efficient way to do whatever it is i wanted to do
|
# ? May 8, 2012 13:07 |
|
Lysidas posted:really tef? invalid syntax in your example code?
eat it python-3 havers
|
# ? May 8, 2012 13:21 |
|
Hammerite posted:But what's the problem with array indexing of strings if you index codepoints or characters rather than bytes?
tef posted:that and variable width encodings. unless you use utf-32 to store strings,
KARMA! posted:not if the internal representation is utf 32!!
basically you have to use 4 bytes for each character. it's expensive.
quote:(Characters to avoid the combining characters thing.) The language should be able to do that surely.
combining characters are different from variable width encodings, that's a whole separate issue altogether.
|
# ? May 8, 2012 13:47 |
|
from dive into mark (rip)
I was walking across a bridge one day, and I saw a man standing on the edge, about to jump off. So I ran over and said, “Stop! Don’t do it!”
“I can’t help it,” he cried. “I’ve lost my will to live.”
“What do you do for a living?” I asked.
He said, “I create web services specifications.”
“Me too!” I said. “Do you use REST web services or SOAP web services?”
He said, “REST web services.”
“Me too!” I said. “Do you use text-based XML or binary XML?”
He said, “Text-based XML.”
“Me too!” I said. “Do you use XML 1.0 or XML 1.1?”
He said, “XML 1.0.”
“Me too!” I said. “Do you use UTF-8 or UTF-16?”
He said, “UTF-8.”
“Me too!” I said. “Do you use Unicode Normalization Form C or Unicode Normalization Form KC?”
He said, “Unicode Normalization Form KC.”
“Die, heretic scum!” I shouted, and I pushed him over the edge.
|
# ? May 8, 2012 13:48 |
|
tef posted:combining characters are different from variable width encodings, that's a whole separate issue altogether.
yes I know. with the variable width thing, you're saying: you either have to store in utf-32 (expensive in terms of memory), or scan through the string to find the offset (expensive in terms of computation). I'm saying I'm fine with that, gimme my array indexing.
as regards normalisation forms and all that, I have to concede I don't know enough to have an opinion.
|
# ? May 8, 2012 13:56 |
|
Hammerite posted:with the variable width thing, you're saying: you either have to store in utf-32 (expensive in terms of memory), or scan through the string to find the offset (expensive in terms of computation).
yup. utf-32 is quite expensive in terms of memory. utf-8 is what a large proportion of documents are serialised in
quote:I'm saying I'm fine with that, gimme my array indexing.
fwiw, when I read at an offset, i'm likely scanning the string in the first place. (regarding python, it's more that I don't think "foo"[1] should be valid, because strings don't provide list semantics or list performance in python, guido).
quote:as regards normalisation forms and all that, I have to concede I don't know enough to have an opinion.
enjoy http://www.unicode.org/reports/tr15/
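for a quick feel of what the normalisation forms in that report do, here is a small sketch using python's standard `unicodedata` module. the two sample strings are chosen for illustration only: one has a canonical composition, the other only a compatibility mapping.

```python
import unicodedata

decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
ligature = "\ufb01"     # LATIN SMALL LIGATURE FI

# Canonical composition: both forms turn e + accent into the single code point U+00E9.
nfc_accent = unicodedata.normalize("NFC", decomposed)
nfkc_accent = unicodedata.normalize("NFKC", decomposed)

# Compatibility mapping: only the K forms rewrite the ligature into plain "fi".
nfc_ligature = unicodedata.normalize("NFC", ligature)
nfkc_ligature = unicodedata.normalize("NFKC", ligature)

print(len(decomposed), len(nfc_accent))
print(nfc_ligature == ligature, nfkc_ligature)
```

form KC throws away distinctions that form C keeps, so two strings that compare equal under one form can differ under the other.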
|
# ? May 8, 2012 14:17 |
|
tef posted:yup. utf-32 is quite expensive in terms of memory.
quite expensive meaning that the first 11 of those 32 bits are always guaranteed to be zeroes
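the arithmetic behind the "11 bits" figure, as a sketch: the highest unicode code point is U+10FFFF, which fits in 21 bits. the variable names are just for illustration.

```python
import sys

# sys.maxunicode is the largest code point a Python 3 str can hold: 0x10FFFF.
bits_needed = sys.maxunicode.bit_length()
unused_bits = 32 - bits_needed

# Encode the largest code point as big-endian UTF-32 and look at the top bits.
as_int = int.from_bytes("\U0010ffff".encode("utf-32-be"), "big")
top_bits = as_int >> bits_needed

print(hex(sys.maxunicode), bits_needed, unused_bits, top_bits)
```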
|
# ? May 8, 2012 15:18 |
|
i dont care about characters for some weirdo in cambodia hth
|
# ? May 8, 2012 15:38 |
|
qntm posted:quite expensive meaning that the first 11 of those 32 bits are always guaranteed to be zeroes
oh no, a whole 0.000001086543 cents worth of memory is wasted how expensive
|
# ? May 8, 2012 15:39 |
|
Markov Chain Chomp posted:i dont care about characters for some weirdo in cambodia hth
racist
|
# ? May 8, 2012 16:03 |
|
Is UTF-32 actually changing anything regarding grapheme clusters and whatnot? From my understanding, you can go to a given code point as you want thanks to being able to index everything right, but you don't have any better concept of 'character' in there, only stable code points. I mean, unless you're happily handling code point strings with code point-based length and whatnot, using UTF-32 won't give you a big advantage come text manipulation. To handle text, do truncation, word splitting, length calculation, word or letter replacement, you still need to analyze the whole drat thing, figure out your grapheme clusters and whatnot. So basically, you still need to do 95% (or some other arbitrary value) of the work of UTF-8 when in UTF-32, at a higher memory cost. UTF-8 makes sense, and variable code point size is the least of your worries when working with Unicode strings.
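a sketch of the point about code points versus characters. fixed-width storage gives you O(1) access to code points, but a user-visible character can still span several of them. `rough_clusters` is a hypothetical helper that only glues combining marks onto the preceding code point; it is a loud simplification and not the full grapheme cluster segmentation rules.

```python
import unicodedata


def rough_clusters(text):
    """Group each combining mark with the code point before it (approximation only)."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a non-zero class for combining marks.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


text = "e\u0301a"  # 'e' + COMBINING ACUTE ACCENT, then 'a': two visible characters

code_points = len(text)
utf32_units = len(text.encode("utf-32-le")) // 4
clusters = rough_clusters(text)
truncated = text[:1]  # code point slicing cuts the accent off the 'e'

print(code_points, utf32_units, len(clusters), truncated)
```

the utf-32 count matches the code point count, not the number of things a reader would call characters, so truncation and length still need the extra pass.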
|
# ? May 8, 2012 16:13 |
|
there are 256 characters. many are used for drawing boxes with single/double borders. all the other characters are fake ones made up by some guys in sweden. i'm not saying you can't use the fake characters, just understand that they're technically 2 or more real characters and the computer is pretending they're something else
|
# ? May 8, 2012 16:14 |
|
JawnV6 posted:i feel like i should read higher order perl but i mash text around just fine w/ perl right now
read it its good
|
# ? May 8, 2012 16:38 |
|
YOSPOS: Computer is pretending
|
# ? May 8, 2012 16:55 |
|
computers are basically pretending machines when you think about it
|
# ? May 8, 2012 17:06 |
|
yeah every layer pretends it's something it's not to the other layers
|
# ? May 8, 2012 17:06 |
|
I'm pretending to be paying attention to this meeting but really I'm being a dick in IRC.
|
# ? May 8, 2012 17:09 |
|
JawnV6 posted:yeah every layer pretends it's something it's not to the other layers
interface? More like poseur imo.
|
# ? May 8, 2012 17:12 |
|
java would be much more fun to read code:
|
# ? May 8, 2012 17:14 |
|
rotor posted:interface? More like poseur imo.
multiple clique inheritance
|
# ? May 8, 2012 17:19 |
|
ahhh spiders posted:the lua book
hey spiders what do you think of squirrel
|
# ? May 8, 2012 17:19 |
|
oh sure, sure, we'll just execute those instructions.. one at a time.. in the order u listed...
|
# ? May 8, 2012 17:19 |
|
wheres shaggar he hasnt been around for a lil bit
python owns
|
# ? May 8, 2012 17:21 |
|
thon
|
# ? May 8, 2012 17:22 |
|
programming is confusing
|
# ? May 8, 2012 17:22 |
|
Sulk posted:python owns
|
# ? May 8, 2012 17:23 |
|
Toady posted:hey spiders what do you think of squirrel
why would i use it over lua
|
# ? May 8, 2012 17:27 |
|
python is trash for hobbyists who cant just use java or c# cause those languages have already solved or trivialized all the problems people use python for.
|
# ? May 8, 2012 17:30 |