Thanks for checking this one out.
jonathan@melange:~$ top
top - 05:21:08 up 44 min,  2 users,  load average: 1.21, 1.68, 1.98
Tasks: 351 total,   2 running, 349 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.3 us, 14.0 sy,  2.1 ni, 70.4 id,  8.9 wa,  0.0 hi,  0.3 si,  0.0 st
GiB Mem : 15.579 total,  0.173 free,  4.141 used, 11.264 buff/cache
GiB Swap: 15.910 total, 15.868 free,  0.042 used. 11.014 avail Mem

  PID  PPID   UID USER     RUSER    TTY          TIME+  %CPU %MEM S COMMAND
   67     2     0 root     root     ?         22:22.40 100.0  0.0 R kworker/0:1
The setup: Ubuntu 16.10, kernel 4.8.0-41-generic, on a modern Intel-based laptop with Nvidia drivers and not-quite-perfect wifi. I have both working acceptably and I don’t see any reason to believe they are involved in this issue. Let me know and I can provide you with whatever info you need.
I’ve actually already asked this on askubuntu and a couple of times over at Freenode #ubuntu over the last week, but no one will even respond to my question 🙁
I’ve taken some perf reports with
sudo perf record -a -g sleep 10
sudo perf report
With some results
Samples: 92K of event 'cycles:ppp', Event count (approx.): 58330337004406
  Children      Self  Command  Shared Object      Symbol
+   94.27%     0.00%  swapper  [kernel.kallsyms]  [k] cpu_startup_entry
+   94.27%     0.00%  swapper  [kernel.kallsyms]  [k] start_secondary
+   77.29%     0.00%  swapper  [kernel.kallsyms]  [k] schedule_preempt_disabled
-   77.29%    77.29%  swapper  [kernel.kallsyms]  [k] __schedule
     77.29% start_secondary
        cpu_startup_entry
      - schedule_preempt_disabled
         - 77.29% schedule
              __schedule
+   77.29%     0.00%  swapper  [kernel.kallsyms]  [k] schedule
+   16.99%     0.00%  swapper  [kernel.kallsyms]  [k] call_cpuidle
+   16.99%     0.00%  swapper  [kernel.kallsyms]  [k] cpuidle_enter
+   16.99%     0.00%  swapper  [kernel.kallsyms]  [k] cpuidle_enter_state
-   16.99%    16.99%  swapper  [kernel.kallsyms]  [k] intel_idle
     16.98% start_secondary
        cpu_startup_entry
        call_cpuidle
      - cpuidle_enter
         - 16.98% cpuidle_enter_state
              intel_idle
+    5.65%     0.00%  pool     [unknown]          [.] 0000000000000000
+    5.65%     5.65%  pool     libc-2.24.so       [.] re_compile_internal
+    5.65%     0.00%  pool     [unknown]          [.] 0x00007f049804d628
+    5.65%     0.00%  pool     [unknown]          [.] 0x00007f049804d6a8
+    5.65%     0.00%  pool     [unknown]          [.] 0x00007f049804d3d8
+    5.65%     0.00%  pool     [unknown]          [.] 0x00007f049804d768
Cannot load tips.txt file, please install perf!
I’ve checked dmesg: there are overheating messages (that’s why I’m here) and some other messages about MSFT0101:00, which I believe has something to do with the kernel not recognising my BIOS-enabled TPM module. I think that should be insignificant in this matter. I also tried tracing the workqueue events:
$ echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
$ cat /sys/kernel/debug/tracing/trace_pipe > out.txt
(wait a few secs)
^C
but it doesn’t work!
jonathan@melange:~$ sudo mount -t debugfs nodev /sys/kernel/debug
mount: nodev is already mounted or /sys/kernel/debug busy
jonathan@melange:~$ sudo echo workqueue:workqueue_queue_work > /sys/kernel/debug/tracing/set_event
bash: /sys/kernel/debug/tracing/set_event: Permission denied
jonathan@melange:~$ sudo cat /proc/67/stack
[<ffffffffffffffff>] 0xffffffffffffffff
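The Permission denied on set_event is most likely because the `>` redirection is opened by the unprivileged shell before sudo ever runs, so only echo gets root. A minimal sketch of one common workaround, piping into tee under sudo, is below. The helper name write_as_root and the SUDO override variable are hypothetical, introduced only for illustration.

```shell
# write_as_root VALUE FILE
# Pipe VALUE into tee so that the process opening FILE runs under sudo,
# instead of relying on a shell redirection owned by the normal user.
# SUDO is a hypothetical override (assumption, not a standard variable):
# setting it to empty lets the helper be tried on a scratch file without root.
write_as_root() {
    printf '%s\n' "$1" | ${SUDO-sudo} tee "$2" > /dev/null
}
```

It would be called as `write_as_root workqueue:workqueue_queue_work /sys/kernel/debug/tracing/set_event`. Running the whole command in a root shell achieves the same thing.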
Before submitting this question I had been using “Kworker, what is it and why is it hogging so much CPU?” as a reference. So I had tried disabling/uninstalling long-running processes such as dropbox, insync (google drive), crashplan, keybase, Variety background, multiload indicator, psensor and guake (I feel like I have a pretty slick setup most of the time…), but nothing seemed to help.
There had been other questions lurking around suggesting malfunctioning wifi, nvidia drivers or usb drivers, but nothing in my logs suggested this either. I’m somewhat thankful, as the solution in those was almost always simply to find newer nvidia drivers, update the kernel, or “Deal with it.” My laptop is pretty up to date already: I have no enterprise reason to stay on 16.04, and I already have the nvidia ppa activated, as with the intel drivers, so this wasn’t much help.
Perhaps the kworker was actually the result of the laptop overheating (cpu throttling + cpu fan management), not the cause, as suggested by “Stop cpu from overheating”. So I’ve just used some compressed air to clean out the fans (I didn’t think this would be a problem on a laptop only 9 months old, yet there was actually a bit of dust) and am investigating the thermal-conf.xml, which suggests that the fan kicks in at 55°C (although I’m still working out what I can do here).
Thinking this may actually be the solution. Will report back soon.
So doing the Acer bios update totally ruined everything related to my secureboot setup and corrupted the efi files, so it took me a few days to work out how to regenerate the ubuntu efi keys and the windows efi keys.
I tried cleaning out the dust, and it definitely helped for the two days until I started with the bios issues.
But the kworker is back (and yes, it is the same as far as I can tell). I also have some more information now. I can see that the cpu is not throttling down, but rather staying at the maximum. The fan is running, but the device is only sitting around the 60°C mark, so I wouldn’t call this serious overheating.
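For anyone wanting to check the same two things (temperature and whether the cpu is throttling), here is a rough sketch that reads them straight from sysfs. It assumes the usual /sys/class/thermal and cpufreq paths are present on the machine. The helper name millideg_to_c is made up for this example.

```shell
# Convert a sysfs millidegree reading (e.g. 60000) to degrees Celsius.
# millideg_to_c is a hypothetical helper name used only in this sketch.
millideg_to_c() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Print every readable thermal zone with its type and temperature.
for zone in /sys/class/thermal/thermal_zone*; do
    [ -r "$zone/temp" ] || continue
    printf '%s: %s C\n' "$(cat "$zone/type")" "$(millideg_to_c "$(cat "$zone/temp")")"
done

# Print the current frequency of each core in MHz (sysfs reports kHz).
for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq; do
    [ -r "$f" ] || continue
    awk -v khz="$(cat "$f")" -v path="$f" 'BEGIN { printf "%s: %d MHz\n", path, khz / 1000 }'
done
```

Running it a few times while the kworker is busy should show whether the frequency ever drops from the maximum.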
The commands from the other thread require becoming the root user, not just using sudo. So after sudo su, getting the stack trace gives the following.
[<ffffffff98a9dcea>] worker_thread+0xca/0x500
[<ffffffff98aa40d8>] kthread+0xd8/0xf0
[<ffffffff992a071f>] ret_from_fork+0x1f/0x40
[<ffffffffffffffff>] 0xffffffffffffffff
Doesn’t look particularly helpful to me.
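Since the stack only shows the generic worker loop, the out.txt capture from the workqueue tracing is probably the more useful place to look once set_event can be written as root. A small sketch for summarising it follows. It assumes each workqueue_queue_work line in the capture carries a `function=<name>` field. The helper name summarise_trace is hypothetical.

```shell
# summarise_trace FILE
# Count how often each work function was queued in a workqueue_queue_work
# capture and list the ten most frequent first.
# Assumption: every event line contains a "function=<name>" field.
summarise_trace() {
    grep -o 'function=[^ ]*' "$1" | sort | uniq -c | sort -rn | head -n 10
}
```

Calling `summarise_trace out.txt` should put whichever function is being queued most often at the top, which hints at the driver or subsystem keeping the kworker busy.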