Server CPUs, market differentiation, the e-thread hatetrain, and power efficiency

AMD dominates, Intel remains a solid contender, and ARM reaches new markets while continuing to advance into traditionally Intel- and x86_64-dominated areas such as supercomputing.

What is the state of the CPU market?

  • Don’t consider small PC or workstation CPU architectures for high-performance compute.

Well, you typically don’t consider small PC or workstation CPU SKUs for the high-performance computing market…

Consider strenuous, CPU-demanding tasks outside the usual server role, such as processing metrics, observables, and distributions of data, or simulating those distributions and their outputs. The results can then be used to model error and noise profiles, which is useful for rigorously testing edge cases and how often simulated errors occur. Simulating such distributions and analyzing the data requires considerable memory, and resources are quickly exhausted. At the CPU level, the best advantage is a large local cache.

The AMD Epyc parts and the Zen 3 and later architectures are considered revolutionary because of 3D V-Cache, which stacks extra memory on the die, primarily to enlarge the L3 cache. The Epyc generations are Milan (Zen 3), Genoa (Zen 4), and Turin (Zen 5). Even among the modest SKUs, a part such as the Milan Epyc 7703 wins the value proposition for a 64-core, 128-thread part. For intensive server applications (scientific compute, simulation, video and illustration editing, animation, etc.), the quality of a server part may be categorized by clock speed (3.35 GHz, for example; most parts from ca. 2019-2024 fall between 2 and 4 GHz), instruction set (x86_64 vs ARM), L3 cache, PCIe lanes (128?), and additional features.
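To make the cache argument concrete, here is a minimal sketch that estimates whether a simulation's working set fits in L3. The function names are made up for illustration, and the 128 MiB cache size is a placeholder assumption, not a measured figure for any specific part; check the spec sheet of the CPU you are actually considering.

```python
# Rough working-set check: does an array of samples fit in the L3 cache?
# ASSUMPTION: 128 MiB of L3 is a placeholder, not a spec for a specific SKU.
L3_MIB = 128


def working_set_bytes(n_samples: int, n_arrays: int = 1, bytes_per_value: int = 8) -> int:
    """Bytes needed to hold n_arrays arrays of n_samples values (float64 by default)."""
    return n_samples * n_arrays * bytes_per_value


def fits_in_cache(working_set: int, cache_mib: float) -> bool:
    """True if the working set is no larger than the cache (1 MiB = 1024 * 1024 bytes)."""
    return working_set <= cache_mib * 1024 * 1024


for n in (1_000_000, 10_000_000, 100_000_000):
    ws = working_set_bytes(n)
    verdict = "fits" if fits_in_cache(ws, L3_MIB) else "spills to DRAM"
    print(f"{n:>11,} float64 samples = {ws / 2**20:8.1f} MiB -> {verdict}")
```

This ignores everything else competing for the cache (other threads, the OS, the code itself), so treat it as an optimistic upper bound on what stays local.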

My recommendation, as stated, is the Milan Epyc 7— series models, which come in 8- to 64-core configurations with something like 128 MB of L3 cache. Obvious winner.

Yeah, Intel parts are there, but the area of compute that interests me is distinctly memory-heavy and resource-constrained: high-performance compute. There’s a lot of buzz and micro-rationalization around AMD, and it’s not totally justified. The Intel parts are a little more expensive than the AMD parts, and some benchmarks show AMD pulling ahead. Then again, Intel and AMD both use sus marketing gimmicks and benchmarking strategies as it suits them. These aren’t professional benchmarking experts funded by academia or anything anyway. Long story short, the price-per-core ratio tends to be pretty favorable for AMD APUs and CPUs. It’s trendy.
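The price-per-core comparison can be sketched in a few lines. Everything here is hypothetical: the part names, core counts, prices, and cache sizes are invented for illustration and are not real SKUs or real quotes, so plug in current numbers before drawing conclusions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """A CPU described by the few numbers this comparison needs."""
    name: str
    cores: int
    price_usd: float
    l3_mib: int

    @property
    def usd_per_core(self) -> float:
        return self.price_usd / self.cores

    @property
    def l3_mib_per_core(self) -> float:
        return self.l3_mib / self.cores


# HYPOTHETICAL parts: names, prices, and cache sizes are made up.
parts = [
    Part("hypothetical-A", cores=64, price_usd=3200.0, l3_mib=256),
    Part("hypothetical-B", cores=32, price_usd=2400.0, l3_mib=60),
]

# Lowest dollars per core wins this (deliberately one-dimensional) ranking.
best = min(parts, key=lambda p: p.usd_per_core)

for p in parts:
    print(f"{p.name}: ${p.usd_per_core:.2f}/core, {p.l3_mib_per_core:.2f} MiB L3/core")
print(f"Best price per core: {best.name}")
```

Price per core says nothing about clock speed or per-core performance, which is part of why vendor benchmarks are so easy to spin.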

It’s not all hype, though. I have a friend who uses Threadripper exclusively and the part burns. If you’re doing data science, statistics, or other mathematics workflows, AMD parts are marginally dominant over Intel products in this generation.

Specifically, the multithreading would be terrific with the appropriate backplane.

The obvious choice is based on architecture, i.e. x86_64

The x86_64 instruction set architecture and its intermediary forms give AMD and other licensed partners the ability to run binaries largely optimized for Intel, including historically awesome programs written for earlier Intel chips running Windows or Linux. It means we get better ecosystems of software that largely don’t affect you and me at all.

ARM CPUs are really hot. IYKYK

It’s actually the PCIe lanes

Yeah, you can run an extraordinary number of different applications on a single motherboard. Multi-GPU still remains the punchline, much like AMD’s GPU and compute support, though that is modernizing… I’m aware that there are big moves towards better libraries for graphics developers, as well as quality of life for game devs and graphics engine support, and… well, there are a lot of players modernizing, and it’s hard to miss the bus by choosing Nvidia for graphics compute.

And I’m not even at that level completely, only modestly so. I haven’t written my first CUDA kernel yet; I’m still working on Leetcode fundamentals. I write clean programs, defensively, with “lean” TDD? But the topic wasn’t me.

So yeah, the sauce is actually the PCIe connectivity

and Gen 3 vs Gen 4 is essentially the entire issue. High-speed doesn’t mean high-IOPS, and I’m not as of yet network certified by Cisco, IBM, AWS, or others.
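For a sense of scale, here is a back-of-the-envelope sketch of the Gen 3 vs Gen 4 difference. The transfer rates (8 and 16 GT/s per lane) and the 128b/130b encoding are from the PCIe specifications; the function names and the x4 link with 4 KiB blocks are illustrative assumptions, and packet overhead is ignored, so these are ceilings rather than what a device will deliver.

```python
# Per-lane raw transfer rate in GT/s for each PCIe generation.
GT_PER_S = {3: 8.0, 4: 16.0}
# Gen 3 and Gen 4 both use 128b/130b line encoding.
ENCODING = 128 / 130


def lane_gbytes_per_s(gen: int) -> float:
    """Usable bandwidth of one lane in GB/s (decimal), before protocol overhead."""
    return GT_PER_S[gen] * ENCODING / 8  # divide by 8: bits -> bytes


def link_gbytes_per_s(gen: int, lanes: int) -> float:
    """Bandwidth of a link with the given number of lanes, one direction."""
    return lane_gbytes_per_s(gen) * lanes


def iops_ceiling(gen: int, lanes: int, block_bytes: int) -> float:
    """Upper bound on I/O operations per second if the link were the only limit."""
    return link_gbytes_per_s(gen, lanes) * 1e9 / block_bytes


for gen in (3, 4):
    bw = link_gbytes_per_s(gen, lanes=4)
    ceiling = iops_ceiling(gen, lanes=4, block_bytes=4096)
    print(f"Gen {gen} x4: {bw:.2f} GB/s, at most {ceiling:,.0f} IOPS at 4 KiB")
```

The link only sets the ceiling. A drive's real IOPS depends on its latency and how many requests it keeps in flight, which is why a fast link and high IOPS are not the same thing.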

Storage and caching are evergreen areas of compute, because data locality is a system-level (as opposed to subsystem-level) issue in designing data-center or data-center-aware solutions. My thought is to get into switches and fabric, specifically Mellanox InfiniBand, because I’d like the right connection. But… I don’t know anything about it, and it’s probably too boring or too deep a subject to pivot into, actually.

We’ll see how things go as I learn about storage, networking, fabric, switches, and servers.

But it’s actually the instruction set

ARM stands out for its low-power CPUs, made famous by phones and, later, Apple’s M-series MacBook and MacBook Pro products (the PowerPC Macs of the early 2000s were a different architecture). But low power typically means lower boost clocks and weaker handling of reactive workloads. Server processes and asynchronous tasks are better suited to lower clock speeds. While these PC parts remain strong, even with small form-factor competitors such as the Raspberry Pi in the other markets where ARM excels, IMO these aren’t the parts you’re looking for in servers yet.

So the x86_64 architecture, which grew out of Intel’s x86, brings in essentially nothing for the average person, aside from the ability to run standard x86_64 operating systems such as Fedora, CentOS, Debian, Ubuntu, Manjaro, Gentoo, Arch, Slackware, and others. BSD even, hell.

So yeah, the sauce is in the clock speed. Higher juice means the CPU is tuned and power-profiled differently than other parts. It’s a modern part.

So in conclusion, it’s the L3 cache

All you have to worry about is what’s in your lane.

Fin.

Just kidding, so what about Windows Server?

Windows Server remains a viable option on permissive systems, homelabs, and rackmount servers. Just host a virtual machine service, assume responsibility for all of the hardening, and launch Windows servers (and others) with no obstacle. VMware remains a popular vendor.

For the other kind of people

You need some perspective on what the current server market looks like. Xeons are still heavy hitters, and can pack well over a hundred cores into a single package.

Without a clear leader, vendors including AMD and Intel are free to raise prices; that’s why you see so many HP and Dell laptops in the $1-3k price range these days. The processes have become more efficient, and prices have obviously gone up on the consumer end. The chips are better and newer too, so you see some true value there, just like the average customer does with a portable office or home hardware solution.

Without that leader, it’s just a matter of time before we run out of angstroms to shave off the transistors, and we’ll be on to new challenges with optimizing what we can.

Conclusion

But seriously make your decision entirely on L3 cache.


This post is licensed under CC BY 4.0 by the author.