His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.
I'm working through learning C with the Dennis* Ritchie & Brian Kernighan book. I've spent more time fiddling with the environment than programming the last few days, however... I am on exercise 1.9 and have been able to figure them out so far. I found a site with all the exercises from the book, in the same order, so I started going there afterwards to check how my solutions fare against the ones there.

This is how I approached the exercise of removing all double or longer blanks, and it works when I test it on a text file I have set up.


The site solution seems to be doing what I came up with though I think my syntax is more compact.
https://www.learntosolveit.com/cprogramming/ex_1.9_sinblank

I note they set up an initial value for the previous char as well as used a symbolic constant. I didn't do any of that since I didn't see the point. But is this a practice I should consider when doing C? Just trying to get good habits in from the start.

*I share the same name as the creator of C, that bodes well :)

His Divine Shadow fucked around with this message at 18:44 on Jan 1, 2023


OddObserver
Apr 3, 2009
Pretty sure not initializing pc means that the program is permitted to make your computer explode, or at least delete all your files and send nasty e-mails to all people you care about.

(The value can be accessed while not initialized if the first character is a space; the term is "undefined behavior").

OddObserver fucked around with this message at 18:54 on Jan 1, 2023

OddObserver
Apr 3, 2009
Edit: quote is not edit, my apologies

RPATDO_LAMD
Mar 22, 2013

🐘🪠🍆
Pro tip: if you have a variable named "pc" that needs a comment to explain it means "previous character", just name it previous_character instead. Variable names are comments too.

That's not the style in the K&R book but we don't program on 80 character wide mainframe terminals anymore.

And yeah, if the first character is ' ' you read from pc without setting its value, which is undefined behavior. In most cases that means you get whatever random garbage data was already sitting in that spot in memory.
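
For reference, here is a minimal sketch of exercise 1-9 with both suggestions applied: the previous character gets a descriptive name and an initial value, so it is never read before being set. The function name `squeeze_blanks` and the `FILE *` parameters are assumptions made for illustration (K&R's own examples read standard input with `getchar`), and this is not the code that was originally posted.

```c
#include <stdio.h>

/* Copy `in` to `out`, replacing each run of blanks with a single blank.
 * Hypothetical helper for illustration; not the code posted in the thread. */
void squeeze_blanks(FILE *in, FILE *out)
{
    int current_character;
    /* Initialized to a non-blank value so the first comparison is well defined. */
    int previous_character = EOF;

    while ((current_character = getc(in)) != EOF) {
        /* Skip a blank only when the character before it was also a blank. */
        if (current_character != ' ' || previous_character != ' ') {
            putc(current_character, out);
        }
        previous_character = current_character;
    }
}
```

Calling `squeeze_blanks(stdin, stdout)` from `main` should give the filter behaviour the exercise asks for.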

RPATDO_LAMD fucked around with this message at 19:02 on Jan 1, 2023

His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.
Thanks that makes sense, learned the importance of initializing variables.

StumblyWumbly
Sep 12, 2007

Batmanticore!
More descriptive variable names and initializing values are important. Couple of other small things that matter for larger, longer lived programs:
- You should indent the putchar line since it follows the if statement. The indents make the execution flow easier to read, and it is just a universal standard. I prefer to also always have braces, even for one line if statements, because it makes the code clearer and easier to expand.
- c and pc are ints, seems like they should be char, I assume that's the type that get_char returns.

His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.
The book used ints in the examples and said this is because a single character inside '' quotes is actually a small integer, so it works with ints.

Apparently you can use char, but chars don't like negative values, which might be returned by some non-ascii characters. So not a problem with ints, but apparently you can use a signed char to make it allow negative values. Or use ints?

My understanding might be flawed.

csammis
Aug 26, 2003

Mental Institution
Learning C from K&R in the year 2023 seems really counterproductive unless your goal is explicitly to learn how C was written forty years ago. The modern C standard has more expressive power and safety than K&R had at the time.

That said I’ve been in C++ for long enough that I don’t know any modern resources for learning modern C, so. Maybe someone else here does?

Zopotantor
Feb 24, 2013

...und ist er drin dann lassen wir ihn niemals wieder raus...

StumblyWumbly posted:

- c and pc are ints, seems like they should be char, I assume that's the type that get_char returns.

No, they have to be ints, because they must be able to hold the value EOF which is outside the range of type char.
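
A small sketch of why the type matters, using a hypothetical helper named `count_bytes`: `getc` and `getchar` return an `int` so that `EOF` (a negative value) stays distinct from every valid byte. If the result were stored in a `char`, a 0xFF byte could compare equal to `EOF` where plain `char` is signed, and `EOF` could never match where it is unsigned.

```c
#include <stdio.h>

/* Count every byte in the stream, including 0xFF bytes.
 * Hypothetical helper; the int variable is the point of the example. */
long count_bytes(FILE *in)
{
    long count = 0;
    int c; /* int, not char: must hold every byte value plus EOF */

    while ((c = getc(in)) != EOF) {
        ++count;
    }
    return count;
}
```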

His Divine Shadow
Aug 7, 2000

I'm not a fascist. I'm a priest. Fascists dress up in black and tell people what to do.

csammis posted:

Learning C from K&R in the year 2023 seems really counterproductive unless your goal is explicitly to learn how C was written forty years ago. The modern C standard has more expressive power and safety than K&R had at the time.

That said I’ve been in C++ for long enough that I don’t know any modern resources for learning modern C, so. Maybe someone else here does?

I like this way personally. It's more interesting I find.

And I asked around the net before starting, asking about books and so on, and this one was still recommended to me by loads of people. This is the 2nd edition, which uses ANSI C, and that is close enough to modern C that it should not be an issue.

fawning deference
Jul 4, 2018

I'm learning it through Algorithms in C and I'm enjoying it a lot.

yippee cahier
Mar 28, 2005

If you’re going to stick with K&R you can still crank up the warnings offered by your compiler. Reading known uninitialized variables would be a compilation error in the codebase at my job. Sometimes they can get annoying to deal with, but a little poking into why something’s considered a warning/error can be a real learning experience.

qsvui
Aug 23, 2003
some crazy thing

csammis posted:

That said I’ve been in C++ for long enough that I don’t know any modern resources for learning modern C, so. Maybe someone else here does?

Here's a book that's literally called Modern C :v:. It has some questionable advice that I disagree with from place to place and the style reeks of being academic, but I still think it should be better than K&R. Edit: It looks like the author has a free version of this book here.

Beej's Guide to C Programming looks to be updated regularly. I haven't read through it but based on the table of contents, it seems to cover all the modern features.

qsvui fucked around with this message at 23:58 on Jan 1, 2023

Dijkstracula
Mar 18, 2003

You can't spell 'vector field' without me, Professor!

Agreeing with folks ITT that K&R is a book that everyone recommends but nobody actually reads and whose value is pretty marginal these days - I consider C99 an absolute minimum in terms of language revisions, and as others have said make sure you've enabled compiler warnings with -Wall or potentially -Wextra so the compiler can let you know when you're doing something potentially wrong.

Can confirm that Beej's book is good; I used it when I taught sophomores their intro to C class.

His Divine Shadow posted:

Apparently you can use char, but chars don't like negative values, which might be returned by some non-ascii characters. So not a problem with ints, but apparently you can use a signed char to make it allow negative values. Or use ints?
This is sort of correct - whether chars "like negative values" refers to its signedness, and this is actually compiler-specific since the C standard doesn't specify anything about chars beyond "they're one byte in size". So, if it's signed, a char can hold values from -128 through to 127, and if it's unsigned, it holds values from 0 to 255 - different bits in the one byte value will potentially encode different numbers depending on the signedness of the variable. Traditional ASCII characters are all between 0 and 127 so it doesn't strictly matter about char's signedness here, unless you start caring about "extended ASCII" for things like vowels with accents and whatnot, which gets into a whole big ball of yarn which I'd suggest ignoring until you're completely solid on the fundamentals.
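
To make the two ranges concrete, here is a hedged sketch using `CHAR_MIN` from `<limits.h>`. The function names are made up for illustration, and 0xE9 is picked only because it is 'é' in Latin-1. The signed result depends on the platform: on the usual two's complement machines with 8-bit chars it comes out as -23.

```c
#include <limits.h>

/* Nonzero when plain char is signed on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The same bit pattern read through the two explicit flavours of char. */
int value_as_unsigned_char(unsigned char byte)
{
    return byte;
}

int value_as_signed_char(unsigned char byte)
{
    /* Out-of-range conversion is implementation-defined before C23;
     * two's complement platforms wrap, so 0xE9 typically becomes -23. */
    return (signed char)byte;
}
```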

Wipfmetz
Oct 12, 2007

Sitzen ein oder mehrere Wipfe in einer Lore, so kann man sie ueber den Rand der Lore hinausschauen sehen.

Dijkstracula posted:

This is sort of correct - whether chars "like negative values" refers to its signedness, and this is actually compiler-specific since the C standard doesn't specify anything about chars beyond "they're one byte in size".
I might confuse C and C++ here, but I think the standard doesn't even say that it's one byte.
It's just "the smallest integer type, as large as the smallest addressable memory thing". Good and honest systems translate this to "one byte at 8 bits", but there were some weirdos out there with 6bit-chars.

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

Wipfmetz posted:

I might confuse C and C++ here, but I think the standard doesn't even say that it's one byte.
It's just "the smallest integer type, as large as the smallest addressable memory thing". Good and honest systems translate this to "one byte at 8 bits", but there were some weirdos out there with 6bit-chars.

The data type determines the size, though, and an unsigned or signed modifier affects signedness, with a signed default.

I don't remember if representation of signedness as two's complement is specified in the standard; I'd guess it's implementation defined for those weird 70s machines that have a hardware sign bit.

nielsm
Jun 1, 2009



No, two's complement is specifically not in the C standard. That's (part of?) why signed integer overflow is undefined behavior.

giogadi
Oct 27, 2009

In fact, C++ only started requiring two’s complement for signed numbers in C++20

more falafel please
Feb 26, 2005

forums poster

Wipfmetz posted:

I might confuse C and C++ here, but I think the standard doesn't even say that it's one byte.
It's just "the smallest integer type, as large as the smallest addressable memory thing". Good and honest systems translate this to "one byte at 8 bits", but there were some weirdos out there with 6bit-chars.

One byte is not 8 bits, it's "as large as the smallest addressable memory thing".

Dijkstracula
Mar 18, 2003

You can't spell 'vector field' without me, Professor!

Wipfmetz posted:

I might confuse C and C++ here, but I think the standard doesn't even say that it's one byte.
It's just "the smallest integer type, as large as the smallest addressable memory thing". Good and honest systems translate this to "one byte at 8 bits", but there were some weirdos out there with 6bit-chars.
So C99 at least specifies SCHAR_MIN / UCHAR_MIN and SCHAR_MAX / UCHAR_MAX as the expected values (section 5.2.4.2), but at the same time defines a byte as the datatype big enough to hold any member of the environment's character set (section 6.2.5), which section 5.2.1 defines as essentially the 7-bit clean ASCII characters plus some control codes (which wouldn't fit into a six-bit value, so I honestly don't know what the architecture you're referring to would do here).

Dijkstracula fucked around with this message at 19:00 on Jan 2, 2023

RPATDO_LAMD
Mar 22, 2013

🐘🪠🍆

Wipfmetz posted:

I might confuse C and C++ here, but I think the standard doesn't even say that it's one byte.
It's just "the smallest integer type, as large as the smallest addressable memory thing". Good and honest systems translate this to "one byte at 8 bits", but there were some weirdos out there with 6bit-chars.

There are/were architectures out there with 5, 6, or 9-bit bytes! C was designed to work on a variety of architectures back when poo poo was way less standardized/interoperable

RPATDO_LAMD fucked around with this message at 20:04 on Jan 2, 2023

Wipfmetz
Oct 12, 2007

Sitzen ein oder mehrere Wipfe in einer Lore, so kann man sie ueber den Rand der Lore hinausschauen sehen.

more falafel please posted:

One byte is not 8 bits, it's "as large as the smallest addressable memory thing".
Oh! Well, learned something new then.

That Turkey Story
Mar 30, 2003

RPATDO_LAMD posted:

There are/were architectures out there with 5, 6, or 9-bit bytes! C was designed to work on a variety of architectures back when poo poo was way less standardized/interoperable

The more common case (one that can still be encountered) is implementations whose "byte" is 16 or 32 bits wide. I never worked with such an implementation directly, but when I was in robotics not that long ago, people there had worked with them, as well as with sign-magnitude implementations.
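
Code that wants to stay portable across such implementations can ask for the byte width instead of assuming 8. A minimal sketch with made-up function names: `CHAR_BIT` from `<limits.h>` is the number of bits in a `char`, and `sizeof(char)` is 1 by definition. The C standard requires `CHAR_BIT` to be at least 8, so a conforming C implementation on a machine with narrower native cells has to use a wider `char`.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits in one C "byte" (one char) on this implementation. */
int bits_per_byte(void)
{
    return CHAR_BIT;
}

/* Storage size of an int in bits, without assuming 8-bit bytes. */
size_t bits_in_int_storage(void)
{
    return sizeof(int) * CHAR_BIT;
}
```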

Foxfire_
Nov 8, 2010

nielsm posted:

That's (part of?) why signed integer overflow is undefined behavior.
The main reason I think it's still around is that an optimizer being able to assume that things like "x+1 is always greater than x" is useful

mmkay
Oct 21, 2010

Foxfire_ posted:

The main reason I think it's still around is that an optimizer being able to assume that things like "x+1 is always greater than x" is useful

Why would this differ between signed and unsigned integers?

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

mmkay posted:

Why would this differ between signed and unsigned integers?

Because unsigned integers acting modulo 2^(bit size) is also a useful behaviour that some legitimate programs want to intentionally take advantage of.

Dylan16807
May 12, 2010

mmkay posted:

Why would this differ between signed and unsigned integers?

Many decades ago, enough systems handled unsigned overflow identically, but not signed overflow.

You also want at least one type that wraps, for easy math use.

pseudorandom name
May 6, 2007

mmkay posted:

Why would this differ between signed and unsigned integers?

Because everybody writes for (int i = 0; ...) and nobody writes for (unsigned int i = 0; ...).

RPATDO_LAMD
Mar 22, 2013

🐘🪠🍆
There's 0 technical need to make signed overflow undefined, it could've just been implementation defined in the same way that sizeof(int) is. Each compiler can do its own thing but they have to be consistent and sensible about whatever thing they chose.

Marking it as UB just lets the compiler pretend/assume "this will never happen" and enables a few optimizations as a result. They could've done the same thing with unsigned overflow! But it would break a lot more programs and piss a lot more programmers off since using unsigned overflow for meaningful things is more common in real codebases.
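
A short sketch of the practical difference, with hypothetical function names: unsigned arithmetic wraps by definition, while signed overflow has to be ruled out before the addition happens, because the overflowing addition itself is the undefined part.

```c
#include <limits.h>

/* Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1. */
unsigned int wrapping_increment(unsigned int x)
{
    return x + 1u;
}

/* Nonzero if a + b would overflow int. Checks before adding, because
 * performing the overflowing signed addition is itself undefined. */
int add_would_overflow(int a, int b)
{
    if (b > 0) {
        return a > INT_MAX - b;
    }
    return a < INT_MIN - b;
}
```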

Zopotantor
Feb 24, 2013

...und ist er drin dann lassen wir ihn niemals wieder raus...

pseudorandom name posted:

Because everybody writes for (int i = 0; ...) and nobody writes for (unsigned int i = 0; ...).

:eng99: Did the latter. Had a bug. Compiler turned my code into an endless loop. Was schooled on the GCC mailing list on how this was my bug.

Presto
Nov 22, 2002

Keep calm and Harry on.
Once in a while you'll see someone do:

code:

for (unsigned int i = 10; i >= 0; --i)

Oops.
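
For anyone wondering what the "oops" is: `i >= 0` is always true for an unsigned type, and decrementing 0 wraps around to `UINT_MAX`, so the loop never ends. One common way to count down to zero with an unsigned counter is to test before decrementing. This is only a sketch, and `count_down_iterations` is a made-up name.

```c
/* Visit i = start, start - 1, ..., 0 with an unsigned counter and
 * return how many iterations ran. Assumes start < UINT_MAX, since
 * start + 1u would itself wrap to 0 otherwise. */
unsigned int count_down_iterations(unsigned int start)
{
    unsigned int iterations = 0;

    /* Test-then-decrement: the loop stops when i is 0 before the decrement,
     * so the body never sees a wrapped value. */
    for (unsigned int i = start + 1u; i-- > 0u; ) {
        ++iterations;
    }
    return iterations;
}
```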

go play outside Skyler
Nov 7, 2005


Presto posted:

Once in a while you'll see someone do:

code:
for (unsigned int i = 10; i >= 0; --i)
Oops.

i had that happen to me last year even though i've been writing embedded c for 10 years.

roomforthetuna
Mar 22, 2005

I don't need to know anything about virii! My CUSTOM PROGRAM keeps me protected! It's not like they'll try to come in through the Internet or something!

Presto posted:

Once in a while you'll see someone do:
A much easier mistake to make with a type that obscures the unsignedness, e.g. size_t.

Xarn
Jun 26, 2015
Even better:
C++ code:
for (size_t i = 0; i < vec.size() - 1; ++i) {
    for (size_t j = 0; j < vec.size(); ++j) {
        ...
    }
}
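
The likely trap in that snippet is the empty container: the size is unsigned, so `size - 1` wraps to the largest representable value instead of going to -1, and the outer loop runs (nearly) forever. A C sketch of the same arithmetic with `size_t`, using hypothetical function names, plus one way to write the bound so it cannot wrap:

```c
#include <stddef.h>
#include <stdint.h>

/* What `size - 1` evaluates to for an empty container: size_t wraps. */
size_t last_index_unchecked(size_t size)
{
    return size - 1; /* SIZE_MAX when size == 0 */
}

/* Count iterations of a loop over all elements except the last, written
 * so that an empty container runs zero times. */
size_t iterations_excluding_last(size_t size)
{
    size_t iterations = 0;

    /* `i + 1 < size` avoids the subtraction that wraps in `i < size - 1`. */
    for (size_t i = 0; i + 1 < size; ++i) {
        ++iterations;
    }
    return iterations;
}
```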

vote_no
Nov 22, 2005

The rush is on.

pseudorandom name posted:

Because everybody writes for (int i = 0; ...) and nobody writes for (unsigned int i = 0; ...).

I wrote the latter all the time for brevity (with size_t in particular) and to avoid signed-unsigned comparisons, though now of course they’re all range-based fors wherever possible.

giogadi
Oct 27, 2009

Sigh, kinda wish they had just made vector sizes a signed int. Who actually needs that last bit of storage afforded by 2^64 vs 2^63?

Unsigned arithmetic is important but I feel like it should be totally opt-in, not mandated by the most basic vector api

more falafel please
Feb 26, 2005

forums poster

giogadi posted:

Sigh, kinda wish they had just made vector sizes a signed int. Who actually needs that last bit of storage afforded by 2^64 vs 2^63?

Unsigned arithmetic is important but I feel like it should be totally opt-in, not mandated by the most basic vector api

Of all the issues I have with the design of the STL, "doesn't allow negative vector sizes" is not among them tbh

giogadi
Oct 27, 2009

It’s not like they’d actually allow negative sizes: the sizes are all controlled by the vector api. But then it would allow completely sane and sensible comparison with signed ints which is what people should be using all the time anyway. “Unsigned int tells people it can’t be negative” is not actually useful when the failure mode ends up being a silent overflow.

E: or underflow, loving whatever.

giogadi fucked around with this message at 18:13 on Jan 4, 2023

Volte
Oct 4, 2004

woosh woosh
Unsigned int should usually mean "raw bit pattern" not "integer value that logically can't be negative". Sometimes even values that can't be negative can be subtracted from and you end up with really stupid bugs when the difference would be negative. Subtracting two vector sizes to get the difference between them? Better make drat sure to check which one's bigger first.
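
A minimal sketch of that "check which one's bigger first" step, assuming plain `size_t` values rather than any particular container type; `size_difference` is a hypothetical helper.

```c
#include <stddef.h>

/* Absolute difference of two sizes without ever computing a negative
 * intermediate, which would wrap to a huge value in size_t. */
size_t size_difference(size_t a, size_t b)
{
    return a > b ? a - b : b - a;
}
```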


Xarn
Jun 26, 2015
The C integer promotion rules aren't helping. Did you add two unsigned values? Well if their size was smaller than int's, you now have a signed int, get hosed.
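
A small sketch of the promotion in action, assuming the usual case where `int` can represent every `unsigned char` value (the function names are made up). Both operands become `int` before the addition, so the sum is not reduced to the narrow type until it is converted back.

```c
/* Both operands are promoted to int before the addition, so the result
 * is an int and can exceed UCHAR_MAX. */
int add_bytes(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Converting the sum back to unsigned char reduces it modulo UCHAR_MAX + 1. */
unsigned char add_bytes_wrapped(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}
```

With 8-bit chars, 200 + 100 gives 300 from the first function and 44 from the second.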
