#StackBounty: #linux #debian #kernel #nfs Poor NFS performance under high load

Bounty: 100

I have two Debian servers connected to a shared NFS server.

Mounts:

nfs.internal.com:/volume1/SHARE on /share type nfs (rw,nosuid,relatime,sync,vers=3,rsize=131072,wsize=131072,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,soft,noac,nordirplus,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.0.100,mountvers=3,mountport=892,mountproto=tcp,fsc,lookupcache=none,local_lock=none,addr=172.16.0.100)

Sometimes the load on the server increases. There seems to be a tipping point where the load slows NFS operations down enough that a runaway load condition sets in, and we eventually reach the process limit configured for the service running on this server (1500).

When the load increases on one server, the other is not affected, so I don’t believe networking or the NFS server to be the culprit.

Under normal conditions (load ~10):

ops/s       rpc bklog
20637.000           0.000

getattr:           ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)
               10453.000        2334.176           0.223        0 (0.0%)           0.339           0.467
access:            ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)
                4893.000        1139.078           0.233        0 (0.0%)           0.338           0.469

Under "medium" load of ~100:

ops/s       rpc bklog
13374.200           0.000

read:              ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)
                  10.600        1202.169         113.412        0 (0.0%)           0.623          26.868
write:             ops/s            kB/s           kB/op         retrans    avg RTT (ms)    avg exe (ms)
                   7.600           5.462           0.719        0 (0.0%)           0.553           7.395

I don’t have the output saved, but at the highest load levels (~1500) the avg exe time reaches 4-5 seconds.

From what I understand, an exe value that is much higher than RTT means that the requests are queuing on the local server rather than waiting on the NFS server.

It sounds like the number of simultaneous requests to the NFS server is controlled by sunrpc.tcp_slot_table_entries. In recent kernel versions this value is increased dynamically up to sunrpc.tcp_max_slot_table_entries; however, I have never seen it go above the default of two.

Could this be the cause of the performance issues, and what would prevent this value from scaling like it should be?
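For reference, both knobs are exposed under /proc/sys/sunrpc once the sunrpc module is loaded, and the floor can be raised at module load time. A minimal sketch, assuming a modprobe.d file name of my own choosing (sysctl cannot set the value before the module loads):

```
# /etc/modprobe.d/sunrpc.conf  (hypothetical file name)
# Raise the floor of concurrent in-flight RPC slots; the kernel can still
# grow the table dynamically up to tcp_max_slot_table_entries.
options sunrpc tcp_slot_table_entries=128
```

The live values can be checked with cat /proc/sys/sunrpc/tcp_slot_table_entries and cat /proc/sys/sunrpc/tcp_max_slot_table_entries after the module is loaded.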


Get this bounty!!!

#StackBounty: #sound #kernel #alsa #hdmi No HDMI audio device on intel_snd_hda (Ubuntu 20.04)

Bounty: 100

I have a home server that I want to use as a media server.

But when I connect to a TV, there is no sound.

I’ve tried updating to the latest available kernel and switching the snd-hda-intel kernel module to the generic model. That fixed the missing sound device, and the 3.5 mm jack now works, but there is still no HDMI audio.
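For context, switching the snd-hda-intel driver to the generic codec model is usually done with a module option along these lines (the file name is an assumption):

```
# /etc/modprobe.d/alsa-hda.conf  (assumed file name)
options snd-hda-intel model=generic
```

Note that HDA HDMI outputs also depend on the display driver being bound to the GPU; the lshw output further down shows the display controller as UNCLAIMED, which fits the missing HDMI device in aplay -l.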

Some extra info:

inxi

System:    Kernel: 5.13.13-051313-generic x86_64 bits: 64 Console: tty 1 
           Distro: Ubuntu 20.04.3 LTS (Focal Fossa) 
Machine:   Type: Desktop Mobo: ASRock model: Z590M-ITX/ax serial: M80-E1004401592 UEFI: American Megatrends LLC. v: P1.00 
           date: 01/11/2021 
Audio:     Device-1: Intel driver: snd_hda_intel 
           Sound Server: ALSA v: k5.13.13-051313-generic

aplay -l

**** List of PLAYBACK Hardware Devices ****
card 0: PCH [HDA Intel PCH], device 0: Generic Analog [Generic Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

Video card info (it is the integrated GPU of an i5-10400):

Intel Corporation Device 9bc8 (rev 03)

  *-display UNCLAIMED       
       description: VGA compatible controller
       product: Intel Corporation
       vendor: Intel Corporation
       physical id: 2
       bus info: pci@0000:00:02.0
       version: 03
       width: 64 bits
       clock: 33MHz
       capabilities: pciexpress msi pm vga_controller bus_master cap_list
       configuration: latency=0
       resources: memory:a0000000-a0ffffff memory:90000000-9fffffff ioport:4000(size=64) memory:c0000-dffff


Get this bounty!!!

#StackBounty: #linux #rhel #kernel #syslog #logrotate dmesg + how to enable dmesg history logs

Bounty: 50

We have a RHEL 7.2 server, and we noticed that dmesg files from previous sessions are not being created under /var/log.

All we have under /var/log is:

ls -ltr | grep dmesg

-rw-r--r--  1 root   root    123011 Jan  3 04:03 dmesg

instead of getting something like:

    -rw-r--r--  1 root   root    123011 Jan  3 04:03 dmesg.0
    -rw-r--r--  1 root   root    123011 Jan  2 04:03 dmesg.1
    -rw-r--r--  1 root   root    123011 Jan  1 04:03 dmesg.2
.
.
.

What configuration enables saving the old kernel messages in backup files?
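For comparison, the rotated dmesg.0/dmesg.1 files come from distributions that snapshot the kernel ring buffer and rotate it; RHEL 7 does not ship such a rotation by default. A hedged sketch of achieving the same with logrotate (the file name is hypothetical):

```
# /etc/logrotate.d/dmesg  (hypothetical; not shipped by RHEL)
/var/log/dmesg {
    daily
    rotate 7
    missingok
    notifempty
    postrotate
        /bin/dmesg > /var/log/dmesg
    endscript
}
```

logrotate renames the previous snapshots to dmesg.1, dmesg.2, and so on, and the postrotate step captures a fresh copy of the ring buffer.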


Get this bounty!!!

#StackBounty: #kernel #memory #amd-graphics #amd #kernel-parameters Disable GTT on amdgpu?

Bounty: 50

How can I disable GTT on amdgpu?

According to https://www.kernel.org/doc/html/v4.20/gpu/amdgpu.html, adding amdgpu.gttsize=0 should work, but it has no effect.

The reason I am asking is that the Vega 8 graphics of my AMD Ryzen 7 PRO 5850U CPU already has 4 GB of VRAM and takes another 4 GB of GTT. Since the GTT memory is carved out of my "regular" 16 GB of RAM, and I mainly use the CPU rather than the GPU (compiling, databases…), I can only use 12 GB of it.
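For reference, the parameter is normally passed on the kernel command line; a sketch of limiting (rather than disabling) GTT via GRUB, assuming gttsize is interpreted in MiB as the amdgpu docs state:

```
# /etc/default/grub  (then regenerate with grub2-mkconfig or update-grub)
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amdgpu.gttsize=256"
```

Whether the driver honors 0, clamps it, or silently falls back to a default is version-dependent, which could explain the "no effect" above.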


Get this bounty!!!

#StackBounty: #debian #kernel #rsyslog #debian-buster #journald rsyslog seems to be triggering sdhci dumps when writing in external sto…

Bounty: 100

According to what I have read, rsyslog is usually used to process logs and send them to other locations, either local (external storage, a specific partition, etc.) or remote (a logging server, for example). I am trying to configure rsyslog to store the logs on an external storage device (an SD card), but I am having problems with the sdhci driver in the kernel. First, here is the rsyslog configuration.

As you can see, the logs are stored in /data/logs, which is actually the SD card, as the lsblk output shows. Nevertheless, I see something strange in dmesg. The kernel creates the node /dev/mmcblk0 for the external storage, which is listed as mmc0:

jul 26 11:03:40 pabx2 kernel: mmc0: new ultra high speed SDR104 SDXC card at address 59b4
jul 26 11:03:40 pabx2 kernel: mmcblk0: mmc0:59b4 SD    58.9 GiB 
jul 26 11:03:40 pabx2 kernel:  mmcblk0: p1

However, the sdhci dumps refer to mmc1, which is the internal storage, not the external one!

jul 26 11:50:18 pabx2 kernel: mmc1: Timeout waiting for hardware interrupt.
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Sys addr:  0x00000008 | Version:  0x00001002
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Blk size:  0x00007200 | Blk cnt:  0x00000008
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Argument:  0x0056c808 | Trn mode: 0x0000002b
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Present:   0x1fff0001 | Host ctl: 0x0000003c
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Power:     0x0000000a | Blk gap:  0x00000080
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Wake-up:   0x00000000 | Clock:    0x00000207
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Timeout:   0x00000006 | Int stat: 0x00000000
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Int enab:  0x03ff000b | Sig enab: 0x03ff000b
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Caps:      0x546ec881 | Caps_1:   0x00000805
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Cmd:       0x0000193a | Max curr: 0x00000000
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Resp[0]:   0x00000000 | Resp[1]:  0x00000000
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Resp[2]:   0x00000000 | Resp[3]:  0x00000000
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: Host ctl2: 0x0000000c
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: ADMA Err:  0x00000000 | ADMA Ptr: 0x0000000182ff6200
jul 26 11:50:18 pabx2 kernel: mmc1: sdhci: ============================================

Why do you think this happens? What does a hardware-interrupt timeout on the internal storage device have to do with the external storage? When these dumps are triggered, the system (Debian 10 Buster) becomes extremely slow, to the point of being inoperative and unusable.

Thank you all.


Get this bounty!!!

#StackBounty: #kernel #raspberrypi #21.04 Ubuntu 21.04 Failed to apply overlay '0_rpi-poe' (kernel)

Bounty: 50

I have Ubuntu 21.04 installed on a Raspberry Pi 4. I want to control the fan on the POE+ HAT (https://www.raspberrypi.org/products/poe-plus-hat/)

By default the fan does not spin at all. When I add dtoverlay=rpi-poe to /boot/firmware/config.txt the fan works in a seemingly default mode.
But when I run sudo dtoverlay -l, no overlays are listed, so I don’t know why it makes a difference.

Custom settings like these do not work as expected. When I add them, the fan either stops completely or falls back to the default mode.

dtparam=poe_fan_temp0=50000
dtparam=poe_fan_temp1=58000
dtparam=poe_fan_temp2=64000
dtparam=poe_fan_temp3=68000

When I try to load the rpi-poe overlay, I get the response * Failed to apply overlay '0_rpi-poe' (kernel), which perhaps means it is not available in this kernel?
GNU/Linux 5.11.0-1012-raspi aarch64
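For what it's worth, in config.txt a dtparam line applies to the most recently mentioned dtoverlay above it (bare dtparam lines target the base device tree), so the overlay and its fan thresholds would normally be grouped together, e.g.:

```
# /boot/firmware/config.txt  (thresholds taken from the question)
dtoverlay=rpi-poe
dtparam=poe_fan_temp0=50000
dtparam=poe_fan_temp1=58000
dtparam=poe_fan_temp2=64000
dtparam=poe_fan_temp3=68000
```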


Get this bounty!!!
