|
It would be very bad if optimizer hints which the optimizer can't use produced warnings or errors. It would make supporting more than exactly one version of one compiler a nightmare.
|
# ? Jul 5, 2022 02:57 |
|
Plorkyeran posted:It would be very bad if optimizer hints which the optimizer can't use produced warnings or errors. It would make supporting more than exactly one version of one compiler a nightmare.
|
# ? Jul 5, 2022 03:05 |
|
I feel like there should be at least an optional diagnostic for that sort of mistake. It's not like it'd be the only warning that doesn't get turned on by -Wall. You can just turn it off if you're compiling with multiple compiler versions and some of them don't understand a subset of your optimizer hints.
|
# ? Jul 5, 2022 03:05 |
|
A linter might catch that but I don't know if you actually want it to be an official warning.
|
# ? Jul 5, 2022 03:07 |
|
roomforthetuna posted:As a general thing that seems appropriate, but for "the compiler recognizes this token but this is explicitly documented as the wrong place for it" it seems like a warning would probably be more useful than ignoring it in case a different compiler might use the token there. [[likely]] and [[unlikely]] are part of the C++ standard, so another compiler doing something useful with them in places where clang does not is not exactly wildly implausible. Last I checked, each of the three major compilers does in practice have different rules for what the annotations actually do.
|
# ? Jul 5, 2022 06:42 |
|
You do get a warning and have the hints ignored if you give conflicting hints for the sides of a branch.
|
# ? Jul 5, 2022 07:01 |
|
[[likely]] and [[unlikely]] are complete rear end and their design sucks and should've never made it into the standard. Also C++ attributes are stupid, but that is a much longer and involved discussion.
|
# ? Jul 5, 2022 20:36 |
|
Xarn posted:[[likely]] and [[unlikely]] are complete rear end and their design sucks and should've never made it into the standard. I'll continue to use builtin expect macros, which work great; not sure why I'd want to plop weird bracket things into my ifs instead, harder to read imo
|
# ? Jul 5, 2022 22:23 |
|
You prefer writing `__attribute__((__foo__))` to `[[foo]]` ?
|
# ? Jul 5, 2022 22:44 |
|
Macros are horrible and inconsistent and hard to parse. Also here are 300+ additional uses of existing punctuation marks in weird ways to sometimes provide slight improvements to your code and output, if your programmers bother learning about it and your toolchain gets around to supporting it.
|
# ? Jul 5, 2022 22:46 |
|
Beef posted:You prefer writing `__attribute__((__foo__))` to `[[foo]]` ? I prefer code:
|
# ? Jul 5, 2022 22:54 |
|
Sheesh, at least use all caps for those
|
# ? Jul 6, 2022 04:20 |
|
Beef posted:You prefer writing `__attribute__((__foo__))` to `[[foo]]` ?

I assume this is about my assertion that C++ attributes are stupid in general. The issue isn't the syntax (although it could've been better); the issue is that there isn't real agreement on what attributes actually mean, and what their standard semantics should be.

The original design for attributes is that they should be ignorable and thus forward compatible, the idea being that if a compiler doesn't understand attributes from a newer standard, that's fine, and code that is well-formed with the attribute should also be well-formed without it. But this means that it is much easier to add more attributes than it is to add keywords, and people started backdooring in keywords as attributes with some furious handwaving about how they are technically still attributes (and thus ignorable). This culminated with C++20's [[no_unique_address]], which really stretches the ignorability of attributes, because people want to use it as a replacement for costly EBO-based compressible-pair impls. But if your code can be used with a compiler where the attribute might be ignored, then your code suddenly becomes a lot shittier, and likely violates a bunch of assumptions you've made about it.

Also, to make things really suck, MSVC understands, but silently ignores, [[no_unique_address]]... to get the correct behaviour, you need [[msvc::no_unique_address]]. The reason behind this boils down to ABI stability
|
# ? Jul 6, 2022 23:00 |
|
Sweeper posted:I prefer Same, at least this version has well-defined semantics
|
# ? Jul 6, 2022 23:01 |
|
*ahem*code:
|
# ? Jul 6, 2022 23:08 |
|
pseudorandom name posted:*ahem* How wasteful. code:
|
# ? Jul 7, 2022 19:16 |
|
OK, whatever, but the example I was replying to doesn't work correctly.
|
# ? Jul 7, 2022 19:25 |
|
Zopotantor posted:How wasteful. code:
|
# ? Jul 7, 2022 19:42 |
|
b0lt posted:
|
# ? Jul 7, 2022 19:49 |
|
Zopotantor posted:How wasteful. code:
|
# ? Jul 7, 2022 21:25 |
|
Foxfire_ posted:Gotta get rid of the magic numbers to pass code review Style guide indicates that these should be meaningful. code:
|
# ? Jul 7, 2022 21:28 |
|
I'm back with more built-in questions, specifically about https://gcc.gnu.org/onlinedocs/gcc/x86-Built-in-Functions.html

quote:If you specify command-line switches such as -msse, the compiler could use the extended instruction sets even if the built-ins are not used explicitly in the program. For this reason, applications that perform run-time CPU detection must compile separate files for each supported architecture, using the appropriate flags. In particular, the file containing the CPU detection code should be compiled without these options.

This suggests that if I use something like -mavx2, the compiler may emit AVX2 instructions as optimizations even in places where I am not explicitly using AVX2 intrinsics myself, right? That makes it seem like to create a binary that uses AVX2 or AVX-512 if available, I am required to compile multiple binaries and have an entrypoint binary that detects CPU feature support and launches the correct binary based on that. Am I reading this right? How does anything get packaged correctly such that it uses features if available, but is able to run on CPUs that don't support those instructions?
|
# ? Jul 7, 2022 21:36 |
|
Most stuff doesn't, but no, you don't need separate binaries. Suppose you currently have the following: foo.h: C++ code:
C++ code:
foo_impl.h: C++ code:
C++ code:
C++ code:
C++ code:
C++ code:
|
# ? Jul 7, 2022 22:07 |
|
Or you use something like multiversioning to explicitly enable architecture support only in specific functions. But yes, -mavx is the “I guarantee my target CPU supports AVX” switch, not the “stop complaining if I use AVX intrinsics, but it’s my fault if I use them wrong” switch. Instruction set extensions are frequently useful for normal code generation.
|
# ? Jul 7, 2022 22:52 |
|
Compiling from source with -march=native makes my programs go brrrrr. Grep for xmm and ymm in your objdump and you will see the compiler using them everywhere. E.g. passing struct args in the wider vector registers and just vmoving instead of calling memcpy.
|
# ? Jul 7, 2022 23:08 |
|
My mental model of compiler SIMD flags was extremely primitive and wrong. Now I'm curious how effectively the compiler will auto-vectorize tight numerical loops with -march=native -mtune=native -O3 (or is Linus right that -O2 is better in most situations due to smaller generated code?). I'm guessing that the compiler still isn't willing to do things like pad vectors or emit a vector loop followed by a scalar loop to get the leftovers, but I hadn't even thought of being able to fit small structs entirely in 256- or even 512-bit registers.

It looks like the default target is only SSE2 if you don't pass any of these flags, but also that passing, say, -mavx2 implies AVX1, SSE4.2, POPCNT and all. My interest in this is more than academic: we've got a cluster of varying vintages and also extremely vectorizable workloads. The sooner we're able to more effectively use AVX-512 while still being able to execute on Broadwell, the better.

Edit: Also, it looks like -mtune is pretty important. I'm seeing that the default -mtune=generic produces some code that leaves a good amount of performance on the table for modern Intel or AMD CPUs: https://stackoverflow.com/questions/52626726/why-doesnt-gcc-resolve-mm256-loadu-pd-as-single-vmovupd . I can think of several applications off the top of my head that are using -mavx2 -mfma without -mtune. Twerk from Home fucked around with this message at 05:00 on Jul 8, 2022 |
# ? Jul 8, 2022 04:57 |
Plorkyeran posted:Most stuff doesn't, but no, you don't need separate binaries.

One thing to be careful of when doing this: if the source files you compile with additional -m flags include any headers with inline functions or templates, those functions could be instantiated with AVX opcodes in them and linked into the rest of the program due to the One Definition Rule. Then your entire program is dependent on AVX regardless. So you either have to avoid including headers that have templates or inline functions in your source files using intrinsics, or alternatively only enable the machine flags on a per-function basis. OpenTTD got a bug report about this issue recently, and it was solved with the per-function machine flag. Microsoft C++ lets you use any intrinsics regardless of compiler flags, and doesn't have this issue.
|
|
# ? Jul 8, 2022 05:27 |
|
Twerk from Home posted:My interest in this is more than academic, we've got a cluster of varying vintages and also extremely vectorizable workloads. The sooner we're able to more effectively use AVX-512 while still being able to execute on Broadwell, the better. If you're just running programs on machines you control and not distributing software to third parties, then it seems like it'd be easiest to just build a separate version for each architecture and push the complexity to however you're deploying the software on your cluster (unless your deployment mechanism is an utter nightmare or something, I guess).
|
# ? Jul 8, 2022 06:12 |
|
Compilers can give you vectorization advice, in the form of annotations added to your code. It can be really helpful. I used to rely on the Intel compiler and the vector advisor tool, but gcc was producing pretty good vectorized code too. You don't have to split your own loops into canonical form etc. The compiler does that for you. The vectorization feedback can help you make slight tweaks to your code so the compiler can do that job better.
|
# ? Jul 8, 2022 17:52 |
|
Beef posted:Compilers can give you vectorization advice, in the form of annotations added to your code. It can be really helpful. I used to rely on the Intel compiler and the vector advisor tool, but gcc was producing pretty good vectorized code too. Yeah, my thought through all of this is that I really would prefer to just try to keep the hot regions of code as tight loops without branching, and seeing if the compiler can get to a good-enough result. GCC's target_clones looks like a great approach towards that. In fact, while poking around a bit in Godbolt, I found out that GCC's target_clones will produce code that uses the AVX-512 registers, while compiling with -O3 -march=icelake-server -mtune=icelake-server will not! code:
|
# ? Jul 8, 2022 20:31 |
|
What's the most comfortable pattern for doing I/O stream filtering type work, like compression or deserializing, on a separate thread? Ideally I'm looking for a way to do something like a Boost filtering_streambuf that you can super easily pass a boost::iostreams::gzip_compressor() to to compress or decompress a stream, but have that work scheduled on a different thread than what is writing to or reading from the I/O stream.

The sanest way that jumps out at me is using a concurrent queue, and having the reader or writer just read or write from the queue and the actual source or sink has the filtering_streambuf, but then I'm adding another layer of buffering and batching to worry about.

I'd also appreciate a lay of the land for concurrent queues: in Java land I default to a basic BlockingQueue if performance isn't critical and the LMAX Disruptor if performance is, but in C++ I have no idea what reputation boost::lockfree::queue or just doing std::queue with a basic mutex have.
|
# ? Jul 13, 2022 04:02 |
|
Twerk from Home posted:The sanest way that jumps out at me is using a concurrent queue, and having the reader or writer just read or write from the queue and the actual source or sink has the filtering_streambuf, but then I'm adding another layer of buffering and batching to worry about. Envoyproxy does this kind of thing with moveable buffer-chunks and an event queue though, so if you need that degree of control you're not barking up the wrong tree.
|
# ? Jul 13, 2022 12:25 |
|
What is it with C++ tools and being incredibly half-baked?

clang-format? Sounds good, but if you do something relatively normal like include all headers with angle brackets, its main header detection won't work, because that's just too novel.

IWYU? Sounds good, but it will suggest you remove all <> includes and replace them with "" includes, because that's what Google does, and the issue to maybe let users change that has been open for 5 years now.

vcpkg? Some things work properly in manifest mode, some work properly in classic mode, and both of them are shite when you do something incredible like having transitive dependencies.

make? lol there is a space in some path, get loving hosed.

cmake? A miserable pile of backwards-compatible hacks that lets you do 70% of what you want with only a reasonable amount of effort, if you are up to date with the latest best practices. These are never actually documented anywhere, and people like to trade off good eng. design for ease of use.

Compilers? DON'T GET ME STARTED ON THE loving COMPILERS

Xarn fucked around with this message at 21:03 on Jul 16, 2022 |
# ? Jul 16, 2022 21:01 |
|
Xarn posted:you do something relatively normal like include all headers with angle brackets
|
# ? Jul 16, 2022 21:22 |
|
It is pretty normal, see e.g. Boost. It is also the objectively superior option.
|
# ? Jul 16, 2022 21:28 |
|
I use both. Angle for system headers, quotes for local ones. Isn't that the intention?
|
# ? Jul 16, 2022 22:04 |
|
Xarn posted:It is pretty normal, see e.g. Boost. It is also the objectively superior option. e.g. Locale's std\numeric.cpp has code:
|
# ? Jul 16, 2022 22:26 |
|
Foxfire_ posted:Boost doesn't do that. The header-only parts of it inconsistently use angle brackets or quotes when accessing other public headers in itself. The few separately compiled parts of it consistently use quotes when including their internal headers as far as I can tell. Ok, this might be an artifact of different libraries being made by different people. I opened Boost.Container and that looks like this code:
Foxfire_ posted:Why do you think always using <> is better?

Combination of multiple factors:

1) Includes using relative paths are stupid. I am not interested in reasoning about the current file's location to know whether #include "../utils.hpp" will include cool-project/audio-manipulators/utils.hpp or cool-project/video-manipulators/utils.hpp.

2) The only part of include resolution the standard actually defines is that if the quoted form's search fails, the directive is reprocessed as if it had been written with <>; everything else about how the paths are searched is implementation-defined.

3) I can tell whether a specific header is from my project based on the path prefix; I don't need to check for "" vs <>.

Taken together, using "" for includes introduces compile time overhead*, allows you to include a different file than you thought**, and doesn't provide any advantage in return.

* failed stat when the preprocessor looks for the path relative to the current path

** I've actually had to deal with this just this week. At some point someone got sloppy, didn't include a file with the full path from the project root, and due to somewhat messy include dirs, suddenly the include was picking up a different file than intended.

Xarn fucked around with this message at 23:02 on Jul 16, 2022 |
# ? Jul 16, 2022 22:56 |
|
Xarn posted:so I will admit that it is not a uniform policy across Boost. Actually if I am reading this right, you are supposed to use angled includes everywhere, but uh, it isn't enforced and people are bad at this consistency thing. https://www.boost.org/development/header.html quote:Then both your code and user code specifies the sub-directory in #include directives.
|
# ? Jul 16, 2022 23:26 |
|
I think that's just about the publicly visible headers that would end up somewhere like /usr/local/include/boost, since you want #includes for peer boost stuff to not find conflicting names in the local project. Headers specific to separately compiled libraries whose binaries end up in /usr/local/lib (the ones that are under libs/LibraryName/src/, not boost/include/LibraryName/ in the boost source code) use normal quotes, since if there's a local vs system conflict while building that library, they want the local file. Like if you put a file named cpuid.hpp in /usr/local/include/, compiling Boost's Atomic library still wants its own file, not that one.

Using from-the-base-of-your-project paths seems like a reasonable goal to me, but it'd be better implemented by sticking to the normal <> vs "" convention and modifying the user include path to include the root of the project instead of modifying the system include path (-iquote on gcc vs -I). One extra stat doesn't seem likely to actually matter to compile time enough to be worth doing something unusual.
|
# ? Jul 17, 2022 00:19 |