The Linux Questions Thread: a bunch of pitfalls, but technically it's possible

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Linux Questions Thread: a bunch of pitfalls, but technically it's possible

VictualSquid: Feb 29, 2012; Gently enveloping the target with indiscriminate love.

So, I spent the last weekend trying out microos on my new practice machine. It is pretty cool. Switching to the transactional mode worked well and was easy.

SELinux, on the other hand, does not work well and is not easy. I suspect I have to give up on using rootless podman, or switch off SELinux. Or spend a stupid amount of time learning that stuff. Considering that microos had defaulted to reserving 400GB for root containers and 20GB for home including rootless containers, I can guess what they want. Though that isn't documented anywhere.

Speaking of stupid time, I also finally managed to get quadlets to function well enough to start understanding them. After around 2 years of bouncing off their terrible documentation.
They are pretty cool, and probably even more fun for those people who actually remember the systemd syntax.
I still don't understand why I had to convert my self-built container to a kube instead of a container.
And I don't understand how or even if I can use other backends with the volume unit system.

# ? May 6, 2024 19:18

Computer viking: May 30, 2011; Now with less breakage.

On a related note, I've been playing with podman on FreeBSD. It seems very close to usable - random linux containers will fail at step 4/11 when pulling them, and then the same container works fine if I build it locally. The truly custom things, like the ZFS storage backend, seem to work fine?

Of course I did all this in service of booting Fedora CoreOS over PXE because I want to test running a small cluster on our retired servers and workstations - but that doesn't mean I can't use FreeBSD as the DHCP/DNS/PXE server.

# ? May 6, 2024 22:02

xzzy: Mar 5, 2009

The biggest downside to podman is it's still in rapid development and has a lot of quirks and poor documentation. Early on it was pretty clearly a gateway drug into k8s (that redhat hoped they could turn into an openshift sale) but that's tapered recently. With RHEL9 and derivatives it's a pretty painless container service.

I like it a lot more than docker (which is still totally fine, it just feels like it's getting crushed under the weight of its age). Quadlets are a really cool idea.

# ? May 6, 2024 22:15

cruft: Oct 25, 2007

xzzy posted:

Quadlets

Oh, rad, this will let me get rid of runit and my dozen+ permutations of a "run this container in podman" startup script.

# ? May 6, 2024 22:28

Nitrousoxide: May 30, 2011; do not buy a oneplus phone

VictualSquid posted:

So, I spent the last weekend trying out microos on my new practice machine. It is pretty cool. Switching to the transactional mode worked well and was easy.

SELinux, on the other hand, does not work well and is not easy. I suspect I have to give up on using rootless podman, or switch off SELinux. Or spend a stupid amount of time learning that stuff. Considering that microos had defaulted to reserving 400GB for root containers and 20GB for home including rootless containers, I can guess what they want. Though that isn't documented anywhere.

I really recommend against turning off SELinux. If you do, turning it back on later is a pain, since files created in the meantime end up unlabeled and the filesystem needs a relabel.

Just use the

code:

--security-opt label=disable

flag in your podman run command if you want it to ignore an SELinux headache that you can't figure out.

# ? May 6, 2024 22:31

xzzy: Mar 5, 2009

I even have users using quadlets to run rootless elasticsearch containers.

The best of all worlds.. I don't have to keep ES running, and I don't have to give out root so they can maintain it.

# ? May 6, 2024 22:33

Nitrousoxide: May 30, 2011; do not buy a oneplus phone

Quadlet has undergone a lot of improvement lately. The most recent version of podman should let you define pods for quadlets without having to use kube files. Which makes it dramatically easier to group a stack of containers that need to work together nicely.

# ? May 6, 2024 22:45

Mantle: May 15, 2004

I'm excited to see that Linux 6.9 will have support for larger console fonts. Is it possible to see if 6.9 actually includes the larger fonts or is it up to someone else (distros?) to provide them now that the support is there?

https://www.phoronix.com/news/Linux-6.9-Larger-FBCON-Fonts

# ? May 6, 2024 22:54

Klyith: Aug 3, 2007; GBS Pledge Week

VictualSquid posted:

SELinux, on the other hand, does not work well and is not easy. I suspect I have to give up on using rootless podman, or switch off SELinux. Or spend a stupid amount of time learning that stuff. Considering that microos had defaulted to reserving 400GB for root containers and 20GB for home including rootless containers, I can guess what they want. Though that isn't documented anywhere.

Do you want to learn SELinux in particular? It looks like microos can use either SELinux or AppArmor, and normally uses AppArmor (also what suse defaults to for their other distros). The "container host" role is what picks SELinux.

And if you picked container host that also might be why the storage reserve.

Mantle posted:

I'm excited to see that Linux 6.9 will have support for larger console fonts. Is it possible to see if 6.9 actually includes the larger fonts or is it up to someone else (distros?) to provide them now that the support is there?

It'll be up to distros to ship bigger fonts (easy enough, for high-res fonts you can just bitmap a real font). And then it'll be up to you to set one of the new big fonts to be used.

# ? May 6, 2024 23:23

Inceltown: Aug 6, 2019

Nitrousoxide posted:

Quadlet has undergone a lot of improvement lately. The most recent version of podman should let you define pods for quadlets without having to use kube files. Which makes it dramatically easier to group a stack of containers that need to work together nicely.

I've been migrating over from compose files to quadlet pods lately and it's amazing how painless it is.

# ? May 7, 2024 01:23

xzzy: Mar 5, 2009

The systemd dependencies you can set up are super slick too. Tired of nginx barfing because a reverse proxy backend isn't running? Systemd will start that for you. :smug:
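For anyone who wants to see what that dependency looks like on disk, here is a rough sketch that renders two quadlet files from Python. The backend image name and the port mapping are made up for illustration. The sketch assumes rootless quadlets live in ~/.config/containers/systemd/ and that a file named backend.container produces a unit called backend.service.

```python
from configparser import ConfigParser
from io import StringIO


def render_quadlet(sections):
    """Render nested dicts as systemd-style unit text (Key=Value lines)."""
    parser = ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the CamelCase keys that systemd expects
    parser.read_dict(sections)
    out = StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Hypothetical backend container; the image name is a placeholder.
backend = render_quadlet({
    "Unit": {"Description": "Hypothetical reverse proxy backend"},
    "Container": {"Image": "localhost/my-backend:latest"},
})

# nginx declares that it needs the backend and must start after it.
nginx = render_quadlet({
    "Unit": {
        "Description": "nginx reverse proxy",
        "Requires": "backend.service",
        "After": "backend.service",
    },
    "Container": {
        "Image": "docker.io/library/nginx:latest",
        "PublishPort": "8080:80",
    },
})

print(nginx)
```

Saving the two strings as backend.container and nginx.container and then running `systemctl --user daemon-reload` should be enough for systemd to pull in the backend whenever nginx is started.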

By far my biggest complaint is that rhel9 has deprecated iptables and podman doesn't speak nftables yet. Everything works but it makes managing rules a stupid(er) chore because we converted all our configuration management to use nftables.

# ? May 7, 2024 01:59

Nitrousoxide: May 30, 2011; do not buy a oneplus phone

Inceltown posted:

I've been migrating over from compose files to quadlet pods lately and it's amazing how painless it is.

I like how it handles auto-updates too. It brings down the current container, pulls the new one, spins it up, and then, most importantly, rolls back to the previous image if the healthcheck for the container fails.

Obviously things could still be wrong in a way that doesn't completely bork the container after an update, but that check is already significantly superior to the updates that docker does.

# ? May 7, 2024 02:05

FAT32 SHAMER: Aug 16, 2012

You guys are really starting to sell me on podman over docker for my fast-approaching server build

# ? May 7, 2024 02:07

cruft: Oct 25, 2007

FAT32 SHAMER posted:

You guys are really starting to sell me on podman over docker for my fast-approaching server build

# ? May 7, 2024 02:32

bolind: Jun 19, 2005; Pillbug

Computer viking posted:

For which combination of OSes?

BlankSystemDaemon posted:

As computer viking was hinting, it's gonna depend on the OS.
For Linux, I think maybe you're limited to POSIX 1e ACLs on NFSv3 and v4, but FreeBSD, Solaris and Illumos-derivatives, macOS, and even Windows Server does NFSv4 ACLs.
NFSv4 ACLs are pretty much compatible with Windows/SMB ACLs (well, except the NFS client in Windows..),

EDIT: For FreeBSD, the wiki has everything you should need, until it gets moved into the handbook.

Sorry for abandoning this; I eventually figured it out. For the record, it's Rocky Linux 8.9 on both server and client.

It was easy, actually. I think I got fooled by a combination of inexperience, firewall rules and services not being started.

Full solution here: https://serverfault.com/a/1158965/600891

# ? May 7, 2024 07:08

VictualSquid: Feb 29, 2012; Gently enveloping the target with indiscriminate love.

Klyith posted:

Do you want to learn SELinux in particular? It looks like microos can use either SELinux or AppArmor, and normally uses AppArmor (also what suse defaults to for their other distros). The "container host" role is what picks SELinux.

And if you picked container host that also might be why the storage reserve.

Yes, I picked container host. Though, like I said, I was mostly surprised that it ships with podman in a configuration that makes rootless hard.

When I moved the rootless container storage to /var I needed to copy some selinux rules. So I assumed it was selinux. Unless those commands are identical.

# ? May 7, 2024 09:47

VictualSquid: Feb 29, 2012; Gently enveloping the target with indiscriminate love.

FAT32 SHAMER posted:

You guys are really starting to sell me on podman over docker for my fast-approaching server build

Do it. I was using an app called podlet to convert my commands to quadlets and it worked great. Use it before it becomes outdated.

Just remember to set the install option. Which doesn't do what you think, it enables the quadlets.
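A minimal sketch of what "set the install option" means in practice: quadlet units are generated, so the [Install] section inside the .container file is what gets them started automatically. The helper name below is hypothetical and the check is deliberately simple. It only looks for WantedBy or RequiredBy and does not validate the target.

```python
from configparser import ConfigParser

SAMPLE = """\
[Container]
Image=docker.io/library/nginx:latest

[Install]
WantedBy=default.target
"""


def has_install_target(unit_text):
    """Return True if the quadlet text has an [Install] section with a target."""
    # strict=False tolerates repeated keys such as several Volume= lines.
    parser = ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # systemd keys are case sensitive
    parser.read_string(unit_text)
    return (parser.has_option("Install", "WantedBy")
            or parser.has_option("Install", "RequiredBy"))


print(has_install_target(SAMPLE))
```

A file that fails this check can still be started by hand with `systemctl --user start`, it just will not come up on its own.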

E: add Android's spellcheck to people who hate podman and quadlets.

VictualSquid fucked around with this message at 09:55 on May 7, 2024

# ? May 7, 2024 09:52

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

VictualSquid posted:

SELinux, on the other hand, does not work well and is not easy. I suspect I have to give up on using rootless podman, or switch off SELinux.

Nitrousoxide posted:

Just use the
code:
--security-opt label=disable
flag in your podman run command if you want it to ignore an SELinux headache that you can't figure out.

The only thing I ever had to do was add ":z" at the end of bind mounts, and that took care of SELinux.

# ? May 7, 2024 13:27

Nitrousoxide: May 30, 2011; do not buy a oneplus phone

That usually works yeah. Though if you want to do certain things like mount all of /home/$USER SELinux will refuse and you have to tell it to gently caress off for this container.

Edit: variables in the mount path, like pwd, can demand that too, for example if you want a container to manipulate some file in the current directory, since you don't know what SELinux labels would be set on an arbitrary directory on your system.
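To make the two options concrete, here is a small sketch that builds the podman command line either way. The function name, image and paths are placeholders. Only the `:z` volume suffix and the `--security-opt label=disable` flag come from the posts above.

```python
import shlex


def podman_run_args(image, host_dir, container_dir, relabel=True):
    """Build a podman run argv that deals with SELinux for one bind mount."""
    args = ["podman", "run", "--rm"]
    if relabel:
        # :z asks podman to relabel the host directory so containers can share it.
        args += ["--volume", f"{host_dir}:{container_dir}:z"]
    else:
        # For paths you do not want relabeled (a whole home directory, an
        # arbitrary working directory), turn label separation off instead.
        args += ["--security-opt", "label=disable",
                 "--volume", f"{host_dir}:{container_dir}"]
    args.append(image)
    return args


print(shlex.join(podman_run_args("docker.io/library/alpine", "/srv/data", "/data")))
print(shlex.join(podman_run_args("docker.io/library/alpine", "/home/user", "/work",
                                 relabel=False)))
```

The relabel path changes labels on the host files, which is why it is a poor fit for directories that other things on the system also depend on.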

Nitrousoxide fucked around with this message at 13:45 on May 7, 2024

# ? May 7, 2024 13:42

cruft: Oct 25, 2007

NihilCredo posted:

The only thing I ever had to do was add ":z" at the end of bind mounts, and that took care of SELinux.

LET ME TELL YOU A STORY

So we had this 3.2PB cephfs with user home directories in it, and we were trying to spin up a sort of "Shell As A Service" that users could provision to do science or whatever they want, with their home directory mounted.

Turns out Docker, even with :z, will do a recursive directory listing to "fix" SELinux contexts (or whatever they're called) on files. There is no option to disable this behavior, it's hard-coded.

And that is why it took 16 hours to spin up a shell until we disabled SELinux.

# ? May 7, 2024 15:51

Yaoi Gagarin: Feb 20, 2014

I worked at a place that had a similar program: developers could click a button on a website and in a few minutes they'd get an IP address to vnc/ssh to, with a pre-prepared checkout of all the source code, tools, etc. It used lxc, I think.

# ? May 7, 2024 18:08

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

So it seems one of my external hard drives has given up the ghost. I have backups so no worries. What I'm trying to learn here, instead, is how I would be supposed to diagnose it. The drive appears in lsusb but doesn't make it to lsblk, which is why I figure it for a hardware issue.

Dmesg gives pretty clear logs, except for one thing - how am I supposed to find out what error code -71 stands for? All I found while googling was this ancient thread where the guy only got anywhere by finding the source code for the drivers and finding a luckily commented enum.

Is that still the way to go in 2024? It would be really nice if I could, for example, judge whether it could be a problem with the SATA drive or with the SATA-USB connector.

quote:

[21625.541276] usb 2-2: new SuperSpeed USB device number 18 using xhci_hcd
[21625.555018] usb 2-2: New USB device found, idVendor=0bc2, idProduct=61b7, bcdDevice= 0.00
[21625.555020] usb 2-2: New USB device strings: Mfr=2, Product=3, SerialNumber=1
[21625.555021] usb 2-2: Product: M3 Portable
[21625.555022] usb 2-2: Manufacturer: Seagate
[21625.555023] usb 2-2: SerialNumber: NM16T0HM
[21625.612048] scsi host9: uas
[21625.612439] scsi 9:0:0:0: Direct-Access Seagate M3 Portable 9300 PQ: 0 ANSI: 6
[21625.613657] sd 9:0:0:0: Attached scsi generic sg4 type 0
[21625.613828] sd 9:0:0:0: [sde] Spinning up disk...
[21626.658544] ..........ready
[21665.908007] sd 9:0:0:0: [sde] tag#1 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
[21665.908010] sd 9:0:0:0: [sde] tag#1 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
[21665.914016] scsi host9: uas_eh_device_reset_handler start
[21670.291649] usb 2-2: Device not responding to setup address.
[21674.461134] usb 2-2: Device not responding to setup address.
[21674.668110] usb 2-2: device not accepting address 18, error -71
[21678.791005] usb 2-2: Device not responding to setup address.
[21682.960651] usb 2-2: Device not responding to setup address.
[21683.172222] usb 2-2: device not accepting address 18, error -71
[21687.290319] usb 2-2: Device not responding to setup address.
[21691.459984] usb 2-2: Device not responding to setup address.
[21691.667325] usb 2-2: device not accepting address 18, error -71
[21695.789836] usb 2-2: Device not responding to setup address.
[21699.959344] usb 2-2: Device not responding to setup address.
[21700.171415] usb 2-2: device not accepting address 18, error -71
[21700.184444] scsi host9: uas_eh_device_reset_handler FAILED err -19
[21700.184449] sd 9:0:0:0: Device offlined - not ready after error recovery
[21700.184454] usb 2-2: USB disconnect, device number 18
[21700.184468] sd 9:0:0:0: [sde] 7814037167 512-byte logical blocks: (4.00 TB/3.64 TiB)
[21700.184474] sd 9:0:0:0: rejecting I/O to offline device
[21700.184477] sd 9:0:0:0: [sde] Test WP failed, assume Write Enabled
[21700.184479] sd 9:0:0:0: [sde] Asking for cache data failed
[21700.184480] sd 9:0:0:0: [sde] Assuming drive cache: write through
[21700.184483] sd 9:0:0:0: [sde] Preferred minimum I/O size 512 bytes
[21700.184484] sd 9:0:0:0: [sde] Optimal transfer size 33553920 bytes
[21700.184686] sd 9:0:0:0: [sde] Attached SCSI disk
[21704.289003] usb 2-2: Device not responding to setup address.
[21708.458739] usb 2-2: Device not responding to setup address.
[21708.667526] usb 2-2: device not accepting address 19, error -71
[21712.788330] usb 2-2: Device not responding to setup address.
[21716.957818] usb 2-2: Device not responding to setup address.
[21717.171629] usb 2-2: device not accepting address 20, error -71
[21717.184532] usb usb2-port2: attempt power cycle
[21722.249861] usb 2-2: Device not responding to setup address.
[21722.460062] usb 2-2: Device not responding to setup address.
[21722.667864] usb 2-2: device not accepting address 21, error -71

# ? May 7, 2024 19:02

pseudorandom name: May 6, 2007

NihilCredo posted:

So it seems one of my external hard drives has given up the ghost. I have backups so no worries. What I'm trying to learn here, instead, is how I would be supposed to diagnose it. The drive appears in lsusb but doesn't make it to lsblk, which is why I figure it for a hardware issue.

Dmesg gives pretty clear logs, except for one thing - how am I supposed to find out what error code -71 stands for? All I found while googling was this ancient thread where the guy only got anywhere by finding the source code for the drivers and finding a luckily commented enum.

Is that still the way to go in 2024? It would be really nice if I could, for example, judge whether it could be a problem with the SATA drive or with the SATA-USB connector.

It doesn't really matter, the interesting stuff is the text of the message and all the surrounding messages. The USB stack is complaining that it can't talk to the device.

(You look for 71 in /usr/include/errno.h and follow the includes to /usr/include/asm-generic/errno-base.h and /usr/include/asm-generic/errno.h and see that it is EPROTO.)
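If grepping headers is a chore, Python's errno module carries the same table for whatever platform it runs on. A rough sketch that pulls the negative codes out of dmesg lines and names them follows. The regex only covers the "error -N" and "err -N" spellings seen in the log above and is an assumption about the format, not a general dmesg parser.

```python
import errno
import os
import re

# Matches "error -71" and "err -19" as they appear in the log above.
ERR_RE = re.compile(r"\berr(?:or)? -(\d+)\b")


def explain_kernel_errors(line):
    """Return (code, symbolic name, description) for each error code in a line."""
    results = []
    for match in ERR_RE.finditer(line):
        code = int(match.group(1))
        name = errno.errorcode.get(code, "unknown")
        results.append((code, name, os.strerror(code)))
    return results


for line in (
    "usb 2-2: device not accepting address 18, error -71",
    "scsi host9: uas_eh_device_reset_handler FAILED err -19",
):
    print(explain_kernel_errors(line))
```

The name only tells you which errno the driver picked, so the surrounding messages are still the part worth reading.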

# ? May 7, 2024 19:13

Klyith: Aug 3, 2007; GBS Pledge Week

NihilCredo posted:

So it seems one of my external hard drives has given up the ghost. I have backups so no worries. What I'm trying to learn here, instead, is how I would be supposed to diagnose it. The drive appears in lsusb but doesn't make it to lsblk, which is why I figure it for a hardware issue.

Dmesg gives pretty clear logs, except for one thing - how am I supposed to find out what error code -71 stands for? All I found while googling was this ancient thread where the guy only got anywhere by finding the source code for the drivers and finding a luckily commented enum.

Is that still the way to go in 2024? It would be really nice if I could, for example, judge whether it could be a problem with the SATA drive or with the SATA-USB connector.

I'm able to see a lot more results on google by being selective about putting quotes around search terms. linux usb "error -71" gives a bunch of miscellaneous plausible stuff like:
https://daniel-lange.com/archives/183-Linux-kernel-USB-errors-71-and-110.html
https://askubuntu.com/questions/262141/usb-error-71-eproto-with-a-gamepad
and "device not accepting address" has:
https://paulphilippov.com/articles/how-to-fix-device-not-accepting-address-error

From which I'd say that not all hope is lost. Plausibly you just need to try a different USB port, if you used one that doesn't have enough juice to spin up a full-size HDD -- that model doesn't have an external power brick right? Does the drive actually spin up?

Also plausible that the controller is bad but the drive inside the box is ok. I think I'd expect a dead HDD in an external box to not fail like that. Like, if the controller is ok it should be able to negotiate a connection, but then mounting the drive would fail. Unless maybe the drive failed so badly that the motor is hosed and trying to spin it makes the controller brown out or something.

# ? May 7, 2024 19:24

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

pseudorandom name posted:

It doesn't really matter, the interesting stuff is the text of the message and all the surrounding messages. The USB stack is complaining that it can't talk to the device.

(You look for 71 in /usr/include/errno.h and follow the includes to /usr/include/asm-generic/errno-base.h and /usr/include/asm-generic/errno.h and see that it is EPROTO.)

Thanks! So is it safe to assume that those error codes are standardized and anything from the kernel (i.e. all drivers) will use them? I thought every driver would have their own set, or at least specific to the device class they support.

Klyith posted:

From which I'd say that not all hope is lost. Plausibly you just need to try a different USB port, if you used one that doesn't have enough juice to spin up a full-size HDD -- that model doesn't have an external power brick right? Does the drive actually spin up?

Also plausible that the controller is bad but the drive inside the box is ok. I think I'd expect a dead HDD in an external box to not fail like that. Like, if the controller is ok it should be able to negotiate a connection, but then mounting the drive would fail. Unless maybe the drive failed so badly that the motor is hosed and trying to spin it makes the controller brown out or something.

I happen to own a separate USB-SATA adapter (a powered one as well, to support 3.5" drives, even though the presumed-dead drive is 2.5"), so a few minutes ago I shucked the drive out and connected it using the other adapter. Still no go, although the dmesg log was a little different:

quote:

[23530.538228] usb 1-3: new high-speed USB device number 9 using xhci_hcd
[23530.758212] usb 1-3: New USB device found, idVendor=2109, idProduct=0715, bcdDevice= 0.00
[23530.758216] usb 1-3: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[23530.758217] usb 1-3: Product: HDD Enclosure
[23530.758218] usb 1-3: Manufacturer: Inateck Technology Inc
[23530.758219] usb 1-3: SerialNumber: ABCDEFA74566
[23530.811174] scsi host9: uas
[23541.300423] scsi 9:0:0:0: Direct-Access VirtualDisk PQ: 0 ANSI: 6
[23541.303995] sd 9:0:0:0: Attached scsi generic sg4 type 0
[23541.304384] sd 9:0:0:0: [sde] Read Capacity(10) failed: Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[23541.304387] sd 9:0:0:0: [sde] Sense Key : Illegal Request [current]
[23541.304389] sd 9:0:0:0: [sde] Add. Sense: Invalid command operation code
[23541.304814] sd 9:0:0:0: [sde] 0 512-byte logical blocks: (0 B/0 B)
[23541.304815] sd 9:0:0:0: [sde] 0-byte physical blocks
[23541.305058] sd 9:0:0:0: [sde] Test WP failed, assume Write Enabled
[23541.305144] sd 9:0:0:0: [sde] Asking for cache data failed
[23541.305145] sd 9:0:0:0: [sde] Assuming drive cache: write through
[23541.305225] sd 9:0:0:0: [sde] Preferred minimum I/O size 4096 bytes not a multiple of physical block size (0 bytes)
[23541.305226] sd 9:0:0:0: [sde] Optimal transfer size 33553920 bytes not a multiple of physical block size (0 bytes)
[23541.305371] sd 9:0:0:0: [sde] Attached SCSI disk
[23605.255620] usb 1-3: USB disconnect, device number 9

The drive also now appears in lsblk, but with 0MB capacity and fails to unlock with cryptsetup, so no good.

I admit I'm a little surprised that the behaviour would change with a different adapter, but it still sounds like it has given up the ghost. It's not impossible (though unlikely) I might have damaged it further while shucking it out, too.

The next step in a proper investigation ought to be trying the shucked adapter with a known-good drive, but I'm not gonna put more drives in harm's way until I've acquired a new backup.

# ? May 7, 2024 19:48

Klyith: Aug 3, 2007; GBS Pledge Week

NihilCredo posted:

I happen to own a separate USB-SATA adapter (a powered one as well, to support 3.5" drives, even though the presumed-dead drive is 2.5"), so a few minutes ago I shucked the drive out and connected it using the other adapter. Still no go, although the dmesg log was a little different:

The drive also now appears in lsblk, but with 0MB capacity and fails to unlock with cryptsetup, so no good.

That might be responsive enough to get smartctl to read from it, if you were interested in a post-mortem. (Though it's also pretty much a coin-flip whether a drive-killing problem shows up in smart. Spinning rust, what a medium.)

NihilCredo posted:

The next step in a proper investigation ought to be trying the shucked adapter with a known-good drive, but I'm not gonna put more drives in harm's way until I've acquired a new backup.

Eh probably not worth it, trash both.

# ? May 7, 2024 20:05

pseudorandom name: May 6, 2007

NihilCredo posted:

Thanks! So is it safe to assume that those error codes are standardized and anything from the kernel (i.e. all drivers) will use them? I thought every driver would have their own set, or at least specific to the device class they support.

Nope, the numbers have an assigned name and sort of have a meaning, but what individual drivers or subsystems actually use them to signify (or whether they use them at all) is entirely up to them.

# ? May 7, 2024 21:51

Subjunctive: Sep 12, 2006; ✨sparkle and shine✨

not every error code is meant to be mapped to errno.h. that's really just for the kernel and libc

# ? May 7, 2024 23:10
