sb hermit
Dec 13, 2016





FlapYoJacks posted:

Nah, it’s because running j1 makes the logs easier to parse as everything is sequential.

:hmmyes:


ryanrs
Jul 12, 2011

FlapYoJacks posted:

huh, I’m pretty sure I submitted a patch for that and it was merged upstream on the master branch.

I'm building OpenWrt 23.05.3, not current? I think I just need to turn on IGNORE_ERRORS=1.

FlapYoJacks
Feb 12, 2009

ryanrs posted:

I'm building OpenWrt 23.05.3, not current? I think I just need to turn on IGNORE_ERRORS=1.

ah yeah. All of my PRs are for the master branch.

Captain Foo
May 11, 2004

we vibin'
we slidin'
we breathin'
we dyin'

make -jo

ryanrs
Jul 12, 2011

Building Linux: The Quest for USB

ryanrs posted:

I can't unfuck this SoC's USB phy in python.

code:
[   64.599866] usb 1-1: new high-speed USB device number 2 using xhci-hcd
[   64.782404] usb-storage 1-1:1.0: USB Mass Storage device detected
[   64.783418] scsi host0: usb-storage 1-1:1.0
[   66.110317] scsi 0:0:0:0: Direct-Access     Samsung  Flash Drive      1100 PQ: 0 ANSI: 6
[   66.112488] sd 0:0:0:0: [sda] 62656641 512-byte logical blocks: (32.1 GB/29.9 GiB)
[   66.196124] sd 0:0:0:0: [sda] Write Protect is off
[   66.284708] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   67.944804]  sda: sda1 sda2
[   67.945549] sd 0:0:0:0: [sda] Attached SCSI removable disk


BusyBox v1.36.1 (2024-04-24 12:12:15 UTC) built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt SNAPSHOT, r0+26009-6ca8305598
 -----------------------------------------------------
=== WARNING! =====================================
There is no root password defined on this device!
Use the "passwd" command to set up a new password
in order to prevent unauthorized SSH logins.
--------------------------------------------------
root@OpenWrt:/# lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux 6.1.86 xhci-hcd xHCI Host Controller
Bus 001 Device 002: ID 090c:1000 Samsung Flash Drive
Bus 002 Device 001: ID 1d6b:0003 Linux 6.1.86 xhci-hcd xHCI Host Controller
root@OpenWrt:/# mkdir /mnt/fart
root@OpenWrt:/# mount -t f2fs /dev/sda2 /mnt/fart
[  188.702867] F2FS-fs (sda2): Found nat_bits in checkpoint
[  189.284809] F2FS-fs (sda2): Mounted with checkpoint version = 57fdb94e
root@OpenWrt:/# cat /mnt/fart/test 
yospos
:cool:

The Aruba AP-303H is based on a Qualcomm IPQ4029 SoC. This chip has a USB 2.0 controller AND a USB 3.0 controller. Different boards use 0/1/both of these USB controllers. The AP-303H has a single USB 2.0 port, so the current device tree only describes the SoC's USB 2.0 controller.

But the physical port is wired to the USB 3.0 controller's phy, lol. If you change the device tree definitions to enable the USB 3 controller usb3@8af8800, the port will start working. The superspeed lines are not connected, so you only get USB 2.0 speed.

I'll do some more testing today and prepare a PR.

e: https://github.com/openwrt/openwrt/pull/15264

ryanrs fucked around with this message at 06:18 on Apr 25, 2024

sb hermit
Dec 13, 2016





ryanrs posted:

Building Linux: The Quest for USB

code:
[   64.599866] usb 1-1: new high-speed USB device number 2 using xhci-hcd
[   64.782404] usb-storage 1-1:1.0: USB Mass Storage device detected
[   64.783418] scsi host0: usb-storage 1-1:1.0
[   66.110317] scsi 0:0:0:0: Direct-Access     Samsung  Flash Drive      1100 PQ: 0 ANSI: 6
[   66.112488] sd 0:0:0:0: [sda] 62656641 512-byte logical blocks: (32.1 GB/29.9 GiB)
[   66.196124] sd 0:0:0:0: [sda] Write Protect is off
[   66.284708] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   67.944804]  sda: sda1 sda2
[   67.945549] sd 0:0:0:0: [sda] Attached SCSI removable disk


BusyBox v1.36.1 (2024-04-24 12:12:15 UTC) built-in shell (ash)

  _______                     ________        __
 |       |.-----.-----.-----.|  |  |  |.----.|  |_
 |   -   ||  _  |  -__|     ||  |  |  ||   _||   _|
 |_______||   __|_____|__|__||________||__|  |____|
          |__| W I R E L E S S   F R E E D O M
 -----------------------------------------------------
 OpenWrt SNAPSHOT, r0+26009-6ca8305598
 -----------------------------------------------------
=== WARNING! =====================================
There is no root password defined on this device!
Use the "passwd" command to set up a new password
in order to prevent unauthorized SSH logins.
--------------------------------------------------
root@OpenWrt:/# lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux 6.1.86 xhci-hcd xHCI Host Controller
Bus 001 Device 002: ID 090c:1000 Samsung Flash Drive
Bus 002 Device 001: ID 1d6b:0003 Linux 6.1.86 xhci-hcd xHCI Host Controller
root@OpenWrt:/# mkdir /mnt/fart
root@OpenWrt:/# mount -t f2fs /dev/sda2 /mnt/fart
[  188.702867] F2FS-fs (sda2): Found nat_bits in checkpoint
[  189.284809] F2FS-fs (sda2): Mounted with checkpoint version = 57fdb94e
root@OpenWrt:/# cat /mnt/fart/test 
yospos
:cool:

The Aruba AP-303H is based on a Qualcomm IPQ4029 SoC. This chip has a USB 2.0 controller AND a USB 3.0 controller. Different boards use 0/1/both of these USB controllers. The AP-303H has a single USB 2.0 port, so the current device tree only describes the SoC's USB 2.0 controller.

But the physical port is wired to the USB 3.0 controller's phy, lol. If you change the device tree definitions to enable the USB 3 controller usb3@8af8800, the port will start working. The superspeed lines are not connected, so you only get USB 2.0 speed.

I'll do some more testing today and prepare a PR.

:nice:

outhole surfer
Mar 18, 2003

gnome wayland is really starting to get under my skin.

any time i have high disk io, the ui will start stuttering. launching steam after updates have been collecting a while is a sure fire way to trigger it.

FlapYoJacks
Feb 12, 2009

outhole surfer posted:

gnome wayland is really starting to get under my skin.

any time i have high disk io, the ui will start stuttering. launching steam after updates have been collecting a while is a sure fire way to trigger it.

Use Plasma. It's made by competent people OP.

shackleford
Sep 4, 2006

outhole surfer posted:

gnome wayland is really starting to get under my skin.

any time i have high disk io, the ui will start stuttering. launching steam after updates have been collecting a while is a sure fire way to trigger it.

does the GNOME compositor not do this trick

https://github.com/swaywm/sway/blob/646019cad9e8a075911e960fc7645471d9c26bf6/sway/realtime.c#L20-L37

code:
	int prio = sched_get_priority_min(SCHED_RR);
	int old_policy;
	int ret;
	struct sched_param param;

	ret = pthread_getschedparam(pthread_self(), &old_policy, &param);
	if (ret != 0) {
		sway_log(SWAY_DEBUG, "Failed to get old scheduling priority");
		return;
	}

	param.sched_priority = prio;

	ret = pthread_setschedparam(pthread_self(), SCHED_RR, &param);
	if (ret != 0) {
		sway_log(SWAY_INFO, "Failed to set scheduling priority to %d", prio);
		return;
	}
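
For anyone who wants to poke at the same trick outside of sway, here is a rough Python sketch of the excerpt above. It is not sway's or GNOME's code, only an illustration using the standard `os.sched_*` wrappers on Linux; the helper name `try_realtime_round_robin` is made up for this example. It asks for SCHED_RR at the lowest realtime priority and quietly gives up if the kernel refuses, which is what normally happens without CAP_SYS_NICE or a suitable rtprio limit.

```python
import os


def try_realtime_round_robin() -> bool:
    """Request SCHED_RR at the minimum realtime priority for the caller.

    Returns True if the policy was applied, False if the platform lacks
    the sched_* calls or the kernel refused (typically a permission error).
    """
    if not hasattr(os, "sched_setscheduler") or not hasattr(os, "SCHED_RR"):
        return False  # platform does not expose these calls

    # Lowest realtime priority, mirroring sched_get_priority_min(SCHED_RR)
    prio = os.sched_get_priority_min(os.SCHED_RR)
    try:
        # pid 0 means "the calling thread"
        os.sched_setscheduler(0, os.SCHED_RR, os.sched_param(prio))
    except OSError:
        # Includes PermissionError when the process lacks the privilege
        return False
    return True
```

Calling `try_realtime_round_robin()` once at startup and logging the result is enough to see whether a given session is allowed to go realtime at all. Even the lowest SCHED_RR priority outranks every normal SCHED_OTHER task, which is the point of the trick: the compositor gets scheduled ahead of whatever is competing for the CPU.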

Tankakern
Jul 25, 2007

i think chromium 125 fixed my make-chromium-crash-by-trying-random-mouse-gestures-in-wayland issue

Silver Alicorn
Mar 30, 2008

𝓪 𝓻𝓮𝓭 𝓹𝓪𝓷𝓭𝓪 𝓲𝓼 𝓪 𝓬𝓾𝓻𝓲𝓸𝓾𝓼 𝓼𝓸𝓻𝓽 𝓸𝓯 𝓬𝓻𝓮𝓪𝓽𝓾𝓻𝓮
chome

ryanrs
Jul 12, 2011

This new USB 2 port is amazing, btw. I'm getting 35 MB/s read and 11 MB/s write.

Compare those numbers to the built-in SPI NAND which runs at 1 MB/s read and write. I think it's using classic 1-bit SPI running at 24 MHz.
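
A quick back-of-the-envelope check of those numbers, as a sketch only. It takes the 1-bit, 24 MHz guess above at face value and ignores command/address overhead, ECC, and driver cost, so the results are raw ceilings rather than predictions. The function name is made up for this example.

```python
def raw_ceiling_mb_s(clock_hz: float, data_lines: int = 1) -> float:
    """Upper bound in MB/s, assuming one bit per data line per clock."""
    return clock_hz * data_lines / 8 / 1e6


# Guess from the post: classic 1-bit SPI clocked at 24 MHz
spi_ceiling = raw_ceiling_mb_s(24e6, data_lines=1)

# USB 2.0 high-speed signalling rate is 480 Mbit/s
usb2_ceiling = raw_ceiling_mb_s(480e6, data_lines=1)

print(f"1-bit SPI @ 24 MHz ceiling: {spi_ceiling:.1f} MB/s (measured: 1 MB/s)")
print(f"USB 2.0 high-speed ceiling: {usb2_ceiling:.1f} MB/s (measured: 35 read / 11 write)")
```

If the 24 MHz guess is right, the raw ceiling is 3 MB/s, so the measured 1 MB/s loses roughly two thirds to overhead. The USB port's 35 MB/s read is a bit over half of the 60 MB/s signalling rate.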

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN

FlapYoJacks posted:

Nah, it’s because running j1 makes the logs easier to parse as everything is sequential.

that and running -j7 would probably start/continue building something unrelated too. i've always found openwrt's -j1 V=s thing needs suiting, they haven't changed that message much in like 20 years for a reason lol

BlankSystemDaemon
Mar 13, 2009



Antigravitas posted:

Accessing certain files or scrubbing the file system would print a fairly meaningless stacktrace in dmesg, and btrfs would swallow all i/o forever. Any process trying to do any i/o to the fs would get stuck in kernel land.

I blkdiscard-ed the entire drive and restored from backup (saved on a mirror zpool on a small server sitting in a basement 20km from me).

btrfs is fine if you treat it as an utterly disposable thing.
treating a filesystem like a disposable thing is anathema to me, but yeah that's how facebook uses it
meanwhile, their persistent storage solution is seemingly proprietary - they certainly haven't opensourced tectonic

eschaton posted:

I have it on good authority that Cantrill is pretty lovely and that response is just the tip of the iceberg

can’t say more (not my story to tell) but regardless of his technical chops, he’s persona non grata to some folks I know and trust
yeah, i've unfortunately also heard some stories
on the other hand, he does appear to have grown since then

Tankakern
Jul 25, 2007

you're holding it wrong

Tankakern
Jul 25, 2007

if you had a crash with zfs you'd blame either the hw or some wrong setting or whatnot, but since it's btrfs you immediately go to "this fs is a trash fire"

outhole surfer
Mar 18, 2003

i'm in the process of migrating off btrfs to dmraid for a huge number of nodes. on most of our nodes, we have seven 1 TB nvme drives with each nvme drive encrypted at the hypervisor level with keys thrown away any time the vm is stopped or destroyed. since there's no persistence anyway, we trained everyone that the volume was to be strictly for throwaway data or for local staging. striped all drives with btrfs raid0 and all was happy for many months

after some unknown kernel update that i haven't pinned down yet, btrfs went to total poo poo for us. heavy io results in an io error and toasts the filesystem in a way btrfs check won't recover.

so gently caress this, gonna strip out btrfs and do a dmraid raid0 with ext4 on top

ryanrs
Jul 12, 2011

The insidious thing about open source software is that it will sometimes reward your stubborn persistence. If this router had been closed source, I could have thrown it out days ago and moved on with my life. Instead I'm paying hourly for a fast build server in the cloud while I wait for nerds to approve my fixes. :cloud:

ipq40xx: fix USB on Aruba AP-303H #15264
ipq40xx: use nvmem ethernet MACs on Aruba AP-303H #15272

But I need to pull the plug on this build server before the weekend. I'm not going to pay more in computer rental than I did for the router.

This weekend I can see how well the ARM server runs with ccache. Or maybe scrounge up the parts for a real server. I think I have enough stuff at the warehouse.

e: Why wait? I pushed the changes and shredded the server. Tonight I will sleep $0.143/hr easier.

ryanrs fucked around with this message at 07:34 on Apr 26, 2024

eschaton
Mar 7, 2007

Don't you just hate when you wind up in a store with people who are in a socioeconomic class that is pretty obviously about two levels lower than your own?

outhole surfer posted:

after some unknown kernel update that i haven't pinned down yet, btrfs went to total poo poo for us. heavy io results in an io error and toasts the filesystem in a way btrfs check won't recover.

Tankakern posted:

you're holding it wrong

ryanrs
Jul 12, 2011

engaging in heated filesystem chat is really tempting fate, imo

ryanrs
Jul 12, 2011

outhole surfer posted:

so gently caress this, gonna strip out btrfs and do a dmraid raid0 with ext4 on top

I think I tried this 5+ years ago, and the performance sucked compared to 4x nvme ssds, each with their own filesystem. This was for recording raw video, so big streaming writes. Our software could split its output to N directories, and that was way faster than writing to a single filesystem over raid0. There was some bottleneck, either in the raid layer, or maybe in having everything go through one filesystem created contention? We were pushing close to the hardware limits, or at least within a factor of 2, I think.

But maybe the situation has changed?


In fact, I'm about to go grab that old machine from the warehouse to turn into a local build server for making OpenWrt. How exactly should I configure my nvme drives for max compiler performance? It's a single-socket Xeon Silver 4114 10-core 2.20 GHz, turbo 3.00. It was pretty fast in 2018.

Poopernickel
Oct 28, 2005

electricity bad
Fun Shoe
they call it butterfs because it raises your blood pressure

ziasquinn
Jan 1, 2006

Fallen Rib
just use zfs, the gentleman's file system

ryanrs
Jul 12, 2011

Noooo! Someone beat me to the nvme array!

I have the case, mobo, cpu, ram, and a 250 GB boot drive. Just need to find a power supply.

That Fucking Sned
Oct 28, 2010

Instead of worrying about file systems, raid arrays and backups, I allow my memories and creations to be fleeting and ephemeral

Antigravitas
Dec 8, 2019

Die Rettung fuer die Landwirte:
I'm afraid the people giving me their data have the expectation that I could resurrect it all even if the entire institute burns down.

Doing software raid on nvme flash devices introduces a bottleneck, because that's a ton of data to be pushing through your CPU. That's especially true on server CPUs that are clocked low.

For a build server, just get more RAM and build in tmpfs. A single nvme ssd is also more than fast enough.

BlankSystemDaemon
Mar 13, 2009



Antigravitas posted:

I'm afraid the people giving me their data have the expectation that I could resurrect it all even if the entire institute burns down.

Doing software raid on nvme flash devices introduces a bottleneck, because that's a ton of data to be pushing through your CPU. That's especially true on server CPUs that are clocked low.

For a build server, just get more RAM and build in tmpfs. A single nvme ssd is also more than fast enough.
sounds like you have insufficient disaster recovery?

software raid on nvme flash is only a bottleneck because of all the nonsense we've been asking the computers to do to make i/o on spinning rust faster
optimizing the i/o schedulers and datapaths for nvme isn't a trivial amount of work, but it's also a lot simpler than trying to optimize for spinning rust, because the most important thing is that you use as many queues as possible

Antigravitas
Dec 8, 2019

Die Rettung fuer die Landwirte:
Our disaster recovery is fine, just slow. ZFS send/receive of 500TB is going to take a while, to say nothing of tape.

spankmeister
Jun 15, 2008






Antigravitas posted:

Our disaster recovery is fine, just slow. ZFS send/receive of 500TB is going to take a while, to say nothing of tape.

at some point it's faster to load up an entire array and drive it over

BlankSystemDaemon
Mar 13, 2009



spankmeister posted:

at some point it's faster to load up an entire array and drive it over
never underestimate the bandwidth of a van full of harddisks hurtling down the freeway

Sapozhnik
Jan 2, 2005

Nap Ghost
i think it's okay to underestimate it sometimes. every now and again. as a special treat

Truga
May 4, 2014
Lipstick Apathy
the problem is that you're i/o limited regardless unless you just plug in your backup array and say it's now the real array

which is fair play if you have 2 off-site backups, but i imagine most people don't and just use a backblaze or similar

Antigravitas
Dec 8, 2019

Die Rettung fuer die Landwirte:
In our case the place that collects those datasets sits a few km away, connected via 40Gbps links in the backbone. However, that dwindles down to 10Gbps to the actual servers. And we can't just take their arrays away and cart them on site. We don't have access to their systems and they don't have access to ours (beyond the zfs send/receive rules), and that's on purpose.

In any case, data safety is more important than recovery speed. We have LTO6 and LTO8 tapes sitting around, and recovery from those is glacial in comparison.

What I can't abide, however, is bad file systems. Bit flips in research data are bad, hmkay.
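
To put rough numbers on "slow": a sketch of the best-case wire time for the 500TB mentioned earlier over the 10Gbps and 40Gbps links described above. It assumes decimal terabytes and a link that stays saturated the whole time. Real zfs send/receive will be slower; the `efficiency` parameter is a made-up knob for modelling that.

```python
def transfer_days(size_bytes: float, link_bits_per_s: float, efficiency: float = 1.0) -> float:
    """Days needed to move size_bytes over a link at the given efficiency (0..1]."""
    seconds = size_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / 86_400


SIZE_BYTES = 500e12  # 500 TB, decimal

for label, rate in (("10 Gbps", 10e9), ("40 Gbps", 40e9)):
    print(f"{label}: {transfer_days(SIZE_BYTES, rate):.1f} days at line rate")
```

At line rate that is about 4.6 days on the 10Gbps server links and about 1.2 days on the 40Gbps backbone, before any protocol or disk overhead.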

spankmeister
Jun 15, 2008






Antigravitas posted:



What I can't abide, however, is bad file systems. Bit flips in research data are bad, hmkay.

then stop trying to create a resonance cascade or whatever it is you eggheads are doing over there! <:mad:>

oh no blimp issue
Feb 23, 2011

i don't have any backups because none of my data is important

oh no blimp issue
Feb 23, 2011

on a slightly related note, i have a LUKS encrypted boot drive which was working fine until apt updated the kernel, the /boot partition didn't have enough space and apt bailed out but didn't reverse every step of the process i think? now if i don't choose the right kernel in grub i just can't decrypt the file system
any ideas what might have gone wrong and how to fix it?
probably the easiest is to unLUKS the thing and then reencrypt on the newest kernel but if it's possible to just get all the ducks back in a row that'd be nicer

fresh_cheese
Jul 2, 2014

MY KPI IS HOW MANY VP NUTS I SUCK IN A FISCAL YEAR AND MY LAST THREE OFFICE CHAIRS COMMITTED SUICIDE
maybe not enough space in /boot for apt to build the initrd after putting the new kernel on?

try running the “remove older kernels” script, removing the busted new kernel packages, and then reinstalling the new kernel
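
One way to test that theory before removing anything is a small Python sketch that lists kernels in /boot with no usable initrd next to them. It assumes the Debian/Ubuntu naming scheme (`vmlinuz-<version>` and `initrd.img-<version>`). The function names are made up for this example, and an initrd that is non-empty but truncated would not be caught.

```python
import shutil
from pathlib import Path


def kernels_missing_initrd(boot_dir: str = "/boot") -> list[str]:
    """Return kernel versions that have a vmlinuz-* but no non-empty initrd.img-*."""
    boot = Path(boot_dir)
    kernels = {p.name.removeprefix("vmlinuz-") for p in boot.glob("vmlinuz-*")}
    initrds = {
        p.name.removeprefix("initrd.img-")
        for p in boot.glob("initrd.img-*")
        if p.stat().st_size > 0  # a zero-byte image counts as missing
    }
    return sorted(kernels - initrds)


def free_mib(boot_dir: str = "/boot") -> float:
    """Free space on the filesystem holding boot_dir, in MiB."""
    return shutil.disk_usage(boot_dir).free / 2**20


if __name__ == "__main__":
    print("free in /boot (MiB):", round(free_mib()))
    print("kernels without initrd:", kernels_missing_initrd())
```

If the new kernel shows up in that list, the initramfs that carries the LUKS unlock tooling was never written. That would explain why only the older grub entry can decrypt the disk. On Debian-family systems `apt autoremove --purge` normally clears out unused kernels, and `update-initramfs -c -k <version>` rebuilds a missing image once there is room.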

ziasquinn
Jan 1, 2006

Fallen Rib

fresh_cheese posted:

maybe not enough space in /boot for apt to build the initrd after putting the new kernel on?

try running the “remove older kernels” script, removing the busted new kernel packages, and then reinstalling the new kernel

oh no blimp issue
Feb 23, 2011

i'll give that a go!


oh no blimp issue
Feb 23, 2011

is there actually a script that does that automatically? i tried to do that manually by removing the unused kernels, purging the packages and then reinstalling and that didn't fix anything before
