crazypenguin
Mar 9, 2005
nothing witty here, move along

I LIKE TO SMOKE WEE posted:

Am I the only one sitting over here with a 2600k (~4.6Ghz, H80 cooler) wondering when I'm going to see something actually worth upgrading for? I built this machine in DEC 2011 for fucks sake.

Who / what can I blame for this?

I'm sitting on a 2500k.

I'm hoping that they'll decide Cannonlake will have PCIe 4.0, and that motherboard makers will start adopting 1/2.5/5/10 Gbps ethernet as standard around the same time. Hopefully they'll also decide to start moving the standard desktop core count up from 4 around then, too.

With a little luck (like Cannonlake getting PCIe 4 and not being delayed; that's wishful thinking at the moment, not anything announced afaik), we might even see this 1.5-2 years from now.

crazypenguin
Mar 9, 2005
nothing witty here, move along

Shimrra Jamaane posted:

It's disappointing that the 7700k still doesn't seem like a worthwhile upgrade over my OC 2500k. I want to build a whole new PC already but I might as well continue to wait as SSDs continue to fall in price.

Are there any motherboards with more than one USB 3.1C port?

My plan is 2nd half of 2018 to build a new machine to replace mine. (I don't even overclock the 2500k. My motherboard doesn't do overclocks without disabling lower power states for some dumb reason, and really, I don't play demanding enough games to care! Although, I'm going to pay more attention to the motherboard when I do my next build... drat it.)

I'm hoping it'll be a sweet spot that year. I think things are internally bandwidth starved right now, and PCIe 4 might be out then. I think that extra bandwidth will make a difference for things like multiple USB-C ports that do 3.1gen2 or whatever bizarre versioning they come up with next. Maybe 10 Gbps NICs. SSDs are likely to get even faster, and need that bandwidth, too. In addition to all that, maybe 6 cores from Coffee Lake. Hopefully monitors that do 1440p at 144Hz with *Sync come down in price a bit, and the GPUs capable of driving them might be more affordable.

Crossing my fingers for a good couple of years here.

crazypenguin
Mar 9, 2005
nothing witty here, move along

ufarn posted:

Are they counting on people to buy PCI expansions or something? Or hooking them up to hubs. It does seem a bit curious if it's supposed to be the future.

To expand on my last post, here's the thing I think might be standing in the way right now. (Lacking any insider knowledge, just guessing from the outside here.)

A single PCIe 3.0 lane offers 980 MB/s of bandwidth. A single USB 2.0 port was 60 MB/s, so you could easily fit like 16 ports into a single PCIe lane allocation. A USB 3.1gen2 port is 1200 MB/s. So now you need more than one PCIe 3.0 lane per port!

With PCIe 4.0, you get 1900 MB/s bandwidth per lane. Which means with 2 lanes, you can fit 3 ports at full speed.

So my suspicion is that we're waiting on PCIe 4 for things to become standard here. They don't want to offer more ports because people might get upset when the port doesn't offer the speed it's supposed to, and allocating more PCIe lanes to offer that speed gets expensive fast.

(And for similar reasons, hopefully we'll also get 10 Gbps NICs standard as well. They'll fit into 1 PCIe 4.0 lane, too.)
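
To put rough numbers on that lane math, here's a quick sketch (the bandwidth figures are the approximate usable rates from this post, not spec-exact):

code:
PCIE3_LANE_MBPS = 980     # ~1 GB/s usable per PCIe 3.0 lane
PCIE4_LANE_MBPS = 1900    # ~2 GB/s usable per PCIe 4.0 lane
USB2_MBPS = 60            # 480 Mbit/s
USB31_GEN2_MBPS = 1200    # 10 Gbit/s

def full_speed_ports(lane_mbps, lanes, port_mbps):
    # How many ports can run flat-out inside a given PCIe lane budget.
    return (lane_mbps * lanes) // port_mbps

print(full_speed_ports(PCIE3_LANE_MBPS, 1, USB2_MBPS))        # 16 USB 2.0 ports per 3.0 lane
print(full_speed_ports(PCIE3_LANE_MBPS, 1, USB31_GEN2_MBPS))  # 0 -> one 3.1g2 port needs >1 lane
print(full_speed_ports(PCIE4_LANE_MBPS, 2, USB31_GEN2_MBPS))  # 3 ports on two 4.0 lanes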

crazypenguin
Mar 9, 2005
nothing witty here, move along

Anime Schoolgirl posted:

x86 as it is is very efficient in a straight line, so the third thread won't see much work being put in it if it existed.

Well, not really to do with x86. In fact, those Phi cards Intel makes have 4 threads per core.

Basically, it's just that it trades off against single thread performance. Larger hardware thread counts are good for systems that care mostly about throughput. You can extract every last bit of bang for buck out of the core, at the cost of all the threads running more slowly overall.
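
If you want a toy model of that throughput-vs-latency tradeoff, something like this (the throughput gain per extra hardware thread is a made-up parameter, purely for illustration):

code:
def per_thread_speed(threads_per_core, smt_yield=0.30):
    # Total core throughput grows by `smt_yield` per extra hardware thread
    # (an illustrative number), but it gets split across all the threads.
    total_throughput = 1 + smt_yield * (threads_per_core - 1)
    return total_throughput / threads_per_core

for t in (1, 2, 4):
    total = per_thread_speed(t) * t
    print(f"{t} thread(s)/core: {per_thread_speed(t):.2f}x per thread, {total:.2f}x total throughput")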

crazypenguin
Mar 9, 2005
nothing witty here, move along

fishmech posted:

You mean the sort of physical attacks that you could do with most laptops with say, Thunderbolt, or Firewire, or ExpressCard, and I think maybe PCMCIA? That is, attacks that let you extract a bunch of information from the running system and allow for the possibility of injecting some manner of malicious executable code.

Isn't that why Skylake brought VT-d mainstream? IOMMU these things off, so these attacks don't work anymore.

crazypenguin
Mar 9, 2005
nothing witty here, move along
So, with all the excitement being in the AMD thread at the moment, I thought I'd ask about some crystal ball gazing here.

What's coming that we know of in the next few years, in terms of non-CPU, non-GPU stuff? (Because we already pay enough attention to Vega, Ryzen, Cannon/Coffee Lake, Volta, etc.)

I can think of:

  • 2.5/5 Gbps ethernet. Starting to show up already, sometimes even with 10G on consumer boards.
  • HDMI 2.1. Spec out last month. 48 Gbps cables.
  • DisplayPort 1.5 spec soon, supposedly. (32 Gbps cables, 4K 144 Hz capable)
  • PCIe 4. This year, right?
  • DDR5 spec is supposed to come in 2020, I believe.

And, I suppose as a process thing, 3D NAND is getting re-tooled, so we should see bigger SSDs for cheaper in a year or so, supposedly.

Anything else interesting that we know is in the works?

crazypenguin
Mar 9, 2005
nothing witty here, move along

PerrineClostermann posted:

Part of the problem is that USB 3.1 Gen 2 requires a lot of bandwidth and thus, multiple ports would require a not-insignificant amount of PCI-E lanes, iirc.

Yeah, on a 2-lane budget, PCIe 4 would let us go from 1 port to 3 ports.

Though to be honest, I'm not sure why they don't double up ports and share bandwidth between pairs. This would be fine in many cases. Maybe the chips don't support such a thing? (Yet?) idk

priznat posted:

PCIe gen4 will start showing up more in 2018, when Ice Lake CPUs are supposed to arrive in the 2nd half of the year as part of the Tinsley platform.

Is any of this confirmed at all? :( I was really hoping to see PCIe 4 sooner rather than later.

Internals are bandwidth constrained these days with 40G Thunderbolt, multiple 10G USB, 5+G Ethernet, 32G NVMe... all bottlenecked on a 32G connection between CPU and chipset. It hasn't turned nasty for consumers because it's rare to use that much bandwidth at once, but it's already quite possible to do. (e.g. VR sensor poo poo likes 15G of total USB bandwidth. So just using VR is going to slow NVMe performance by up to half... which doesn't really matter yet, but it means we're already affected. Who knows how many more sensors they'd like to have if it wouldn't overload our systems. Oculus is already on record as wishing they could use more USB 3 ports, but USB controllers seem to have trouble keeping up.)
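
A quick tally of how oversubscribed that chipset link already is on paper (nominal peaks only; which devices actually sit behind the chipset varies by board, e.g. Thunderbolt often hangs off CPU lanes, so treat this as illustrative):

code:
DMI3_GBPS = 32  # ~32 Gbps DMI 3.0 link between CPU and chipset

behind_chipset_gbps = {
    "thunderbolt 3":          40,
    "2x usb 3.1 gen2":        2 * 10,
    "5 gig ethernet":         5,
    "nvme ssd (x4 pcie 3.0)": 32,
}

demand = sum(behind_chipset_gbps.values())
print(f"peak demand {demand} Gbps vs DMI {DMI3_GBPS} Gbps "
      f"({demand / DMI3_GBPS:.1f}x oversubscribed)")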

crazypenguin
Mar 9, 2005
nothing witty here, move along

Oh neat. I didn't know this was possible, let alone at only a $500 premium.

crazypenguin
Mar 9, 2005
nothing witty here, move along
Custom design of these multi-chip assemblies is going to be enterprise-only, but that method of manufacturing is going to be standard even for consumer chips before too long. (e.g. 5 years)

Tokamak posted:

EUV is so behind and below projections that they would want to get as high of a yield as possible.

More than that, too. HBM of course. Multiple configurations with GPU/FPGA. The main one they might be trying to treat as semi-secret here is silicon photonics. That process will never be the same one they use for compute, so a technique like this is absolutely necessary. So this is kinda the first step towards chips shooting light at each other, instead of electrical IO.

crazypenguin
Mar 9, 2005
nothing witty here, move along
I don't really know what the breakdown looks like for CPU pins, but I doubt they actually need more to add more cores.

Pins are mostly PCIe lanes and DRAM traces, right? Now I'm curious.

e: http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/4th-gen-core-family-desktop-vol-1-datasheet.pdf

okay, mostly voltage sinks/sources (power and ground pins), followed by RAM, followed by other poo poo. I guess a socket could be inadequate, power-wise, for handling higher core counts without dropping the clock speed or something.

crazypenguin fucked around with this message at 00:02 on Apr 20, 2017

crazypenguin
Mar 9, 2005
nothing witty here, move along

Saukkis posted:

In what kind of situations do those extra PCIe lanes provide noticeable benefit?

Most people only ever plug in one graphics card and, nowadays, an NVMe SSD, so it's a reasonable question.

But: more than one x16 graphics card (the bandwidth is actually useful for deep learning, if not gaming). More NVMe SSDs without being bottlenecked. Thunderbolt cards. FPGAs. RAID controllers. More high-speed USB ports (less of an issue now that we're getting those on the newest chipsets...)

The CPU-chipset bottleneck is DMI 3.0 (or x4 PCIe 3.0), or 32 Gbps of bandwidth. With single USB ports offering 10 Gbps, and an SSD that can eat the whole 32 Gbps, we're actually surprisingly close to this bottleneck being a problem even for normal consumers. (Not yet a big one, mind, but still... it's supposed to be a total non-issue.)

Here's hoping it won't be too long until PCIe 4.0 is actually adopted (and we get DMI 4.0). And I also kinda hope Intel will follow AMD's lead and add another 4 CPU lanes to their consumer CPUs (for the SSD). It's a nice design.

crazypenguin
Mar 9, 2005
nothing witty here, move along

Paul MaudDib posted:

The people who are buying these are going to be relatively price-insensitive

The only problem with this logic is that it's describing the existing market, not the market that could potentially be created here. It's the same kind of logic that's kept Intel on 4 cores for consumers. They're right, frustratingly, that consumers don't need more, but that is in part because consumers don't have more than 4 cores for applications to be designed to take advantage of. Ouroboros, eat tail.

I know a lot of software developers who are stuck using an i5-class machine, because the alternative is 6 times more expensive for only 3 times the performance.

A $1000 16c32t would be, like, 7 times the performance for maybe 50% more. That completely changes things.
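
To spell out the perf-per-dollar argument with the rough ratios from this post (these are the post's ballpark numbers, not benchmarks, and the 16c/32t part is hypothetical):

code:
options = {
    "i5-class box":  (1.0, 1.0),   # (relative price, relative perf) baseline
    "HEDT today":    (6.0, 3.0),   # 6x the price for 3x the performance
    "$1000 16c/32t": (1.5, 7.0),   # ~50% more money for ~7x the performance
}

for name, (price, perf) in options.items():
    print(f"{name:>14}: {perf / price:.2f}x perf per dollar")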

crazypenguin
Mar 9, 2005
nothing witty here, move along
There's a PCI-SIG conference going on now, and some announcements. It looks like PCIe 4 is now all-but-published:

http://techreport.com/news/32064/pcie-4-0-specification-finally-out-with-16-gt-s-on-tap

Just waiting on the lawyers for the spec to be final. Still no word from AMD or Intel about a timeline for support. :(

SMI has a roadmap with PCIe 4 SSD controllers at end of next year: https://tweakers.net/ext/f/tv4gcfsSQVuoif5T3coYAc28/full.png

PCIe 5 announced, and they're actually shooting for 32 GT/s per lane, which is awesome. Also, apparently a super-aggressive 2019 spec target date, though at this point they're famous for delays...

https://www.techpowerup.com/234167/pci-sig-fast-tracks-evolution-to-32-gt-s-with-pci-express-5-0-architecture

crazypenguin
Mar 9, 2005
nothing witty here, move along

Cygni posted:

I don't think there's gonna be a sudden explosion in CPU parallelism. People predicted that 8 years ago (hence Bulldozer), and it never happened.

At the risk of being a parallelism-apologist, I think it's going to happen.

The trouble with Bulldozer was that its IPC sucked, it was worse than Intel's 4-core chips, and so everyone just regarded it as bad and ignored it.

The trouble with parallelizing game engines is that it has to seem worth it to the developers, and 4 cores is a god drat rut. Parallel algorithms usually come with higher overhead (except the "embarrassingly parallel" ones). So your complex single-threaded algorithm, naively parallelized, gives you 160% of the performance when using all 4 cores. Which sounds like poo poo, so efforts are usually abandoned at that point. But at 8 cores it'd be 320% of the performance, and if they then decide to put the effort in to improve it, they might be able to achieve 600-700% of the performance eventually.
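
As a toy version of that arithmetic (the 40% efficiency and 10% serial fraction are made-up illustrative parameters, not measurements from any engine):

code:
def naive_speedup(cores, efficiency=0.40):
    # Linear scaling, but overhead wastes most of each core's time; 0.40 is
    # picked to reproduce the 160%-at-4-cores figure above.
    return cores * efficiency

def tuned_speedup(cores, serial_fraction=0.10):
    # Amdahl's law for an engine that later gets properly optimized; the 10%
    # serial fraction is also just illustrative.
    return 1 / (serial_fraction + (1 - serial_fraction) / cores)

for cores in (4, 8):
    print(f"naive port at {cores} cores: {naive_speedup(cores):.1f}x")
for cores in (8, 16):
    print(f"tuned engine at {cores} cores: {tuned_speedup(cores):.1f}x")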

I'll grant that game engines are probably the most challenging thing to parallelize: you've got 16 ms deadlines at 60 fps. There are really hard upper limits to the communication costs you can suffer here.

But we only just got parallel rendering APIs like Vulkan/DX12 a year ago. We're only just getting good mainstream CPUs with more than 4 cores this year. We're starting to see some game engines that prove you can actually use that power. I think the logjam will break.

crazypenguin
Mar 9, 2005
nothing witty here, move along
fwiw, I kinda like having 2 forum threads. One thread can get hyped about some new release, and the other thread can talk about other general hardware stuff that doesn't have a thread.

Like, USB 3.2 got announced recently. They're going to open up a second "lane" in USB-C, doubling bandwidth to 20 Gbps for USB 3.2 Gen 2x2, because this is how we do versioning now.

http://www.usb.org/press/USB_3.2_PR_USB-IF_Final.pdf

We're going to be even more crunched for internal bandwidth in the future.

crazypenguin
Mar 9, 2005
nothing witty here, move along
I very much doubt that we'll ever see more than an x4 link between chipset and CPU. There's very little reason to want it, except for wanting to plug in higher bandwidth devices, which generally just call for CPU lanes.

In my ideal world, Icelake will come out with PCIe 4 (+ DMI 4, improving bandwidth to chipset) and 20 CPU lanes on the consumer boards (GPU + NVMe), and I'll squeal with delight. Who the hell knows what will happen, though, everybody's mum about a time frame for improving this stuff.

I'm not sure if I'm a weirdo for wanting improved IO, or if they're just blind to the demand for it. I guess NVMe has kinda splashed onto the stage pretty quick, but hell, AMD saw the demand for 20 CPU lanes because of it already, right?

crazypenguin
Mar 9, 2005
nothing witty here, move along

BangersInMyKnickers posted:

Hyper threading adds MAYBE a 5% performance improvement under very specific workloads that typically aren't games. You're really overselling its value.

Curious what you're basing this on? Digital Foundry's tests for i5 vs i7, for example, show 30% for games that multithread well. Which is consistent with the gains for many other (non-gaming) parallel workloads.

crazypenguin
Mar 9, 2005
nothing witty here, move along
Are these high-speed trading or scientific numerical simulations?

Pauses waiting on RAM are always going to be present for games. Trees and graphs are too natural a data structure for pretty much everything non-graphics (and even a lot of graphics, scene graphs and the like). Those will always involve a lot of pointer indirection, even if it can be reduced, and let's face it... Game devs aren't always the bestest most efficient coders imaginable. (e: I suppose I should say this is because the business only demands "good enough" and rarely offers the time to do an actual good job.)
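
A directional illustration of the pointer-indirection point, if anyone cares (CPython interpreter overhead blunts the gap a lot compared to a native engine, so take the numbers as a hint about access patterns, not a hardware measurement):

code:
import random
import time

N = 1_000_000
order = list(range(N))
random.shuffle(order)

nxt = [0] * N                  # "linked list" as a next-index table,
for i in range(N - 1):         # with nodes in shuffled (cache-hostile) order
    nxt[order[i]] = order[i + 1]

def chase(start, steps):       # pointer chasing: each hop lands somewhere random
    i, total = start, 0
    for _ in range(steps):
        total += i
        i = nxt[i]
    return total

def scan(steps):               # same arithmetic, but sequential access
    total = 0
    for i in range(steps):
        total += i
    return total

t0 = time.perf_counter(); chase(order[0], N - 1); t1 = time.perf_counter()
scan(N - 1); t2 = time.perf_counter()
print(f"pointer chase: {t1 - t0:.3f}s   sequential scan: {t2 - t1:.3f}s")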

crazypenguin fucked around with this message at 21:21 on Oct 17, 2017

crazypenguin
Mar 9, 2005
nothing witty here, move along
I've posted about that before in the SSD thread: basically, beyond SATA speeds you're actually CPU limited. Even for well-coded things, decompression of the data can't keep up, even on 8-core CPUs. The things that actually benefit from NVMe are workloads on non-compressed data (so boot, VMs, databases, etc.)

I'm kinda hoping we someday see CPUs with hardware decompression cores or FPGAs. Then there's also the problem of getting software to make use of it... but I want games with no load times at all, darn it.
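
If anyone wants to sanity-check the decompression claim on their own machine, here's a quick single-core sketch (synthetic data, half random and half zeros, so it's only good for an order-of-magnitude feel next to NVMe read speeds):

code:
import os
import time
import zlib

raw = os.urandom(16 * 1024 * 1024) + bytes(48 * 1024 * 1024)  # ~64 MB mixed payload
blob = zlib.compress(raw, 6)

t0 = time.perf_counter()
out = zlib.decompress(blob)
elapsed = time.perf_counter() - t0

assert out == raw
print(f"{len(raw) / elapsed / 1e6:.0f} MB/s of decompressed output on one core")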

crazypenguin
Mar 9, 2005
nothing witty here, move along

WAR DOGS OF SOCHI posted:

Really? Geez, that's interesting, though depressing. So if you already have an SSD, improving loading times on a game like Civ V will really only happen through a CPU upgrade?

Yep. (Though for Civ V, the big problem is just lazy coding... they didn't care about load times at all. So much gameplay data is scattered about in a huge number of xml files that it's only a moderate exaggeration to say the load times are so long because the game is compiling itself every run. If they just cached that work, it'd literally be 1000 times faster.)

crazypenguin
Mar 9, 2005
nothing witty here, move along

EoRaptor posted:

The shadow cannot be read or written by any normal process, and only gets pushed up to the main cache if speculation succeeds.

This is an especially expensive approach, silicon-wise, but a ton of approaches exist that strike different trade-offs. This is certainly a solvable problem. All they have to do is make sure the cache cannot be used as a side channel from speculative execution.

crazypenguin
Mar 9, 2005
nothing witty here, move along

Malcolm XML posted:

Pcie 4 and 5 are for networking and storage appliances

PCIe 4 has plenty of reason to come to consumer hardware. (PCIe 5 who knows though. AFAIK, that really is motivated by the devices you mention.)

And besides the non-bandwidth features: USB 20 Gbps is coming. Wi-Fi closer to 10 Gbps is coming. We're overdue for faster Ethernet by default in consumer hardware. NVMe will eat all the bandwidth it's given.

And everything but the graphics card gets put on a 4 lane bottleneck.

crazypenguin
Mar 9, 2005
nothing witty here, move along

PerrineClostermann posted:

Intel HT typically gives around 15-20% more performance, iirc.

For workloads that would scale well with more real cores, HT is worth about 30%. It's really quite decent.

crazypenguin
Mar 9, 2005
nothing witty here, move along

Space Racist posted:

Has anything been confirmed about exactly *why* Intel’s 10nm process is so boned compared to the rest of the industry? Global Foundries and TSMC seem rather confident of hitting 7nm, while Intel is 3+ years past their target date and still sweating the ramp for 10nm.

It's all rumors and guesswork at the moment, but the guesses I hear are:
  • Intel was pretty ambitious about 10nm (iirc, there are some important ways it's better than competitors' 7nm) and was trying to "prove Moore's law still held"
  • This had them focused a little too much on improving density, rather than improving costs, which is arguably more important.
  • Intel is pretty arrogant and huffed their own farts about their unassailable process leadership
  • Other foundries ran into trouble scaling down their processes, and, get this, it seems they worked together and shared a fair amount of work to help each other figure out the problems. Intel probably considers that a weakness, lol.
  • Management gently caress ups.

crazypenguin
Mar 9, 2005
nothing witty here, move along

movax posted:

but I’m wondering what the next IP core to get laid down is

If it were up to me, an FPGA. Or, if that seems too complex for the benefit, several general-purpose compression/decompression cores. Like hardware zlib.

A big part of the reason game load times aren't better between SATA and NVMe is that they're CPU-bound, and a big reason they're CPU-bound seems to be decompression.

For decompression, 8-core CPUs can handle only like 600 MB/s of compressed data, while a hardware core should be able to handle the full 7 GB/s of a PCIe 4 NVMe SSD.

crazypenguin
Mar 9, 2005
nothing witty here, move along

Combat Pretzel posted:

Decompression is the least of an issue. I'd be surprised, if games use bog-standard compression algorithms these days. Audio and video is compressed with adequate codecs, gamedata is in god knows what shape. Latter usually has to be processed during load, that's what's limiting load times.

When I was figuring that bit out, I took a look at Skyrim. The file format they use generally consists of mostly zlib-compressed data. Audio doesn't usually need up-front decompression, so that's not relevant to load times, and textures were usually S3-compressed images that were then also zlib-compressed in the pack file. (S3 and friends are designed to be stream-decompressed inside the GPU, so they're not actually very good compression overall, and zlib can improve upon them for storage on disk.)

Glancing around, this seems to be commonplace in modern games: more modern BC7 texture compression, then further compressed with standard zlib on disk.

Taking the decompression load off the CPU would be a big help. With many existing games, you're right that there'd still be a lot of other processing going on, but that's something that's subject to optimization if the developers want to do it.

crazypenguin
Mar 9, 2005
nothing witty here, move along

Malcolm XML posted:

zlib is fundamentally not optimized for either speed or compression ratio but for ease of programming. for speed you can hit ram speed with something like LZ4 or zstd

The first part is true: there's worthwhile stuff that's more than 4 times faster than zlib. But I'm really quite sure it's not near RAM speed, not at all.

Looking at the Xeon on https://quixdb.github.io/squash-benchmark and digging around, it looks like you can get 4 GB/s out of 25 GB/s of theoretical memory bandwidth (and I'd guess about 20 GB/s practical) with a fast LZ4 algorithm on all 4 cores. (Don't put too much stock in the 'copy' numbers on that page; I don't think they've really tried to make that efficient, it's just a point of comparison.)

Anyway, I picked zlib mostly because (a) I know there are games that use it and (b) I know FPGA cores exist that can eat it at GB/s, so it can definitely be accelerated in hardware.
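
If you want to see the zlib vs LZ4 gap yourself, here's a rough sketch in the spirit of the squash numbers (it assumes the third-party lz4 package, i.e. pip install lz4, and the payload is synthetic, so the ratio between the two matters more than the absolute MB/s):

code:
import time
import zlib
import lz4.frame

data = b"moderately repetitive game-asset-ish payload " * 200_000  # ~9 MB

z_blob = zlib.compress(data, 6)
l_blob = lz4.frame.compress(data)

def mb_per_s(decompress, blob, repeats=5):
    # Best-of-N decompression throughput for one codec on one core.
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        decompress(blob)
        best = min(best, time.perf_counter() - t0)
    return len(data) / best / 1e6

print(f"zlib: {mb_per_s(zlib.decompress, z_blob):.0f} MB/s")
print(f"lz4:  {mb_per_s(lz4.frame.decompress, l_blob):.0f} MB/s")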

crazypenguin
Mar 9, 2005
nothing witty here, move along

MaxxBot posted:

This put the onus on the compiler to extract parallelism and it turns out that's really loving hard when running general purpose code.

Just to expand on this remarkably succinct explanation (nice job): general purpose code branches a lot.

When it's the compiler's job to create big honking mega-instructions that explain how to use the processor's resources to their fullest potential, it needs to know what's being computed to do that. You can't cram more compute in when you don't know what else needs to happen. When you have a branch that can go two different ways, it basically has to throw up its hands and go "idk sry." It can't really predict how the code will run. (And god help you if there's more than 2 different ways it can go!)

The CPU can do that dynamically just fine with speculation though. (Like, even considering recent security problems.) So modern CPUs just do this scheduling of instructions onto ALUs dynamically while running the code, mostly unimpeded by branching.

This whole failure is sometimes blamed on insufficiently smart compilers, and that's sort of true, but the truth is they designed an architecture that's bad at branching, wanted to run branch-heavy code on it, and said "this problem left to the compiler devs lol."

crazypenguin
Mar 9, 2005
nothing witty here, move along

FRINGE posted:

To be honest the fact that gen4 makes so much heat they had to go back to doing on-mobo fans makes me want to avoid them anyway.

I think it's more likely than not that that chip was just the cheapest thing they could get done fast, and without much optimization for power efficiency.

I doubt the next generation of PCIe 4-supporting chipsets will need a fan.

crazypenguin
Mar 9, 2005
nothing witty here, move along
All you need to want more bandwidth is to find something where bandwidth reduces the latency of some action. You don't have to somehow use it continuously.

Play games without spending any time downloading anything first. Backup software that just instantly mirrors your disk remotely in the background without ever bothering you or being even remotely noticeable. Upload a video with a tap and without committing to waiting for 10 minutes for it to slowly finish. Software delivered as virtual machine images that you don't even have to install before running.

In the middle of working on a big project, but want to work on it with a different computer (e.g. moving between home/work/laptop)? Too disruptive to close everything down, save, and then reopen and get back to what you were doing? Replicate the machine state from one computer to another. What could it be? 300 gigs? ezpz. The applications don't even have to know they were moved. With a nice 100G link, that's less than 30s to accomplish from scratch.
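
The arithmetic behind that "less than 30s" claim is just size divided by link speed, ignoring protocol overhead and whether the storage on each end keeps up:

code:
def transfer_seconds(size_gigabytes, link_gbps):
    # Bytes-to-bits conversion (x8), then divide by the link rate in Gbit/s.
    return size_gigabytes * 8 / link_gbps

for link in (100, 10, 1):
    print(f"300 GB over {link:>3}G: {transfer_seconds(300, link):.0f} s")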

crazypenguin
Mar 9, 2005
nothing witty here, move along
I was under the impression that motherboards already needed their own circuitry for feeding USB, and that the power supply's output wasn't acceptable for that purpose. And that the only real drawback of 12VO was legacy SATA.

Is that wrong?

crazypenguin
Mar 9, 2005
nothing witty here, move along
Seems neat. Hopefully it helps fix some of the boot-up complexity, because their own description of how boot works is massively over-simplified, especially when a TPM (+ microcode updates + ...) is in the mix.

Here's a good article: https://mjg59.dreamwidth.org/66109.html

Intel's summary hosed up the most basic part of the pitch, though. "What are the benefits?" "It removes features." lol, remember to actually say the benefits, people!
