T3 Memory Leak / Crash...
Hi,
I am experiencing a memory leak freeze/crash on my T3. I have attached the serial port log just prior to the crash.
During normal TV or video watching I am concerned by the number of times the spinner appears when there is no user interaction or apparent need or cause for the spinner.
I am running the 2017-04-22 08:49 version of the firmware.
Any suggestions on what is going wrong? Would any additional information be helpful?
Regards,
Ian.
- Attachments
- Leak.txt: Serial port log from problematic T3. (34.45 KiB)
- Wizard God
- Posts: 32709
- Joined: Tue Sep 04, 2007 13:49
- Location: Canberra; Black Mountain Tower transmitters
Re: T3 Memory Leak / Crash...
Why do you think it's a memory leak?
The crash info in the log is for a kernel panic, though the process that invoked the system call that caused the panic is enigma2.
The panic message is "CPU 0 Unable to handle kernel paging request at virtual address 0000001c, epc == e09c7820, ra == e09daa98".
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV
Re: T3 Memory Leak / Crash...
I'd be worried if I *ever* saw the spinner without any user interaction let alone on a frequent basis.
Logitech Harmony Ultimate+Elite RCs
Beyonwiz T2/3/U4/V2, DP-S1 PVRs
Denon AVR-X3400h, LG OLED65C7T TV
QNAP TS-410 NAS, Centos File Server (Hosted under KVM)
Ubiquiti UniFi Managed LAN/WLAN, Draytek Vigor130/Asus RT-AC86U Internet
Pixel 4,5&6, iPad 3 Mobile Devices
Re: T3 Memory Leak / Crash...
My first suspect would be AutoTimer and/or its interaction with EPGRefresh, memory leak or not.
Peter
Re: T3 Memory Leak / Crash...
Looking at the log, I can see this:
- You pressed FAV, selected a new service and pressed OK.
- As soon as you have done this, the audio decoder crashed. This was deep in the driver code.
- About 4 seconds later the fatal spinner appeared. (All the other spinner instances appear to be unrelated to this. I'd be suspecting autotimer, but there isn't enough evidence in the log to squarely point a finger at it.)
- Once the driver crashes, the UI is bound to lock up pretty quickly and it looks like it did that.
I suspect that this is a rare crash scenario that you will not be able to reproduce.
I also see that you are using the Beyonwiz USB tuner. Do you happen to know if either the service prior or after the channel change was being received using the USB tuner? It may or may not be related. I don't think the WiFi dongle would have been involved.
Re: T3 Memory Leak / Crash...
Hi,
I just stopped at home and am on my way out again, but when I turned on the screen to update some timers I noticed that the frequent spinners have returned, appearing every minute or so.
I have attached some more serial port logs showing some odd errors and lots of spinners:
Code: Select all
{18029}<100073.466> [eDVBPESReader] ERROR reading PES (fd=67): Value too large for defined data type
{18029}<100073.466> [eMainloop::processOneEvent] unhandled POLLERR/HUP/NVAL for fd 67(8)
{18030}<100127.545> [gRC] main thread is non-idle! display spinner!
UPDATE: While I was typing this message the unit spontaneously rebooted while recording Star Trek. I will grab the extra logs in case they help identify the issue. I believe the reboot is the RAM leak problem I reported originally.
Regards,
Ian.
- Attachments
- Leak2.txt: More serial port log from problematic T3. (144.77 KiB)
Re: T3 Memory Leak / Crash...
Hi PeterU,
Unfortunately it has been happening a lot recently. Could this be a hint of the T3 power supply problem?
To be honest the USB tuner was plugged in when I got the T3 and it has been sitting there ever since. I doubt it has ever been needed or called into service. This is the T3 I use for development so it doesn't have significant recording duties.
Regards,
Ian.
Re: T3 Memory Leak / Crash...
The reboot seems to be because of an OOM (Out Of Memory) error in the kernel.
The total VM size reported for enigma2 (87463 pages, 342MB) is about double what it is for a reasonably recently rebooted T4 (42371 pages, 166MB).
The sizes of the next two largest VM processes, djmount and hbbtv.app, don't seem to be out of the ordinary compared to the T4.
If this is a test machine, can you try disabling AutoTimer for a bit and see whether that gets rid of the memory use problem? Alternatively, put the active machine's autotimer.xml file on a test machine and see whether that causes a memory blowout.
You can check the virtual memory use of enigma2 with:
Code: Select all
root@beyonwizt4:~# ps lx | egrep 'PI\D|enigm\a2$'
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
4 0 552 535 20 0 169484 67844 poll_s Sl ? 0:42 /usr/bin/enigma2
root@beyonwizt4:~#
VSZ is the total virtual size in kB. RSS is the resident set size, also in kB.
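The page counts and the ps columns can be cross-checked with a short sketch. It assumes a 4 kB page size, which is not stated in the thread but agrees with the T4 figures above (42371 pages and a VSZ of 169484 kB). The function names are illustrative only.

```python
# Assumed page size in kB; not stated in the thread, but 42371 pages * 4 kB
# equals the 169484 kB VSZ shown for the T4.
PAGE_KB = 4


def pages_to_mb(pages, page_kb=PAGE_KB):
    """Convert a page count from the kernel OOM report to megabytes."""
    return pages * page_kb / 1024.0


def parse_ps_line(line):
    """Extract PID, VSZ and RSS (both in kB) from one data line of `ps lx`.

    Column order: F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
    """
    fields = line.split()
    return {
        "pid": int(fields[2]),
        "vsz_kb": int(fields[6]),
        "rss_kb": int(fields[7]),
    }


if __name__ == "__main__":
    t4 = parse_ps_line("4 0 552 535 20 0 169484 67844 poll_s Sl ? 0:42 /usr/bin/enigma2")
    print(f"T4 enigma2: VSZ {t4['vsz_kb'] / 1024:.0f} MB, RSS {t4['rss_kb'] / 1024:.0f} MB")
    print(f"T3 at OOM:  VSZ {pages_to_mb(87463):.0f} MB")
```

Run against each pasted ps line, this gives figures that can be compared directly with the sizes in the OOM report.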
Peter
Re: T3 Memory Leak / Crash...
Hi,
It seems that this machine has more AutoTimers than the other machines. When I have a chance I will kill all the AutoTimers and see what the effect may be on memory usage.
I should also confirm that this machine is not currently running any custom or development code. The code is present on the disk but not active.
Regards,
Ian.
Re: T3 Memory Leak / Crash...
Hi Prl,
This is the current data for the "ps":
Code: Select all
root:~# ps lx | egrep 'PI\D|enigm\a2$'
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
4 0 2965 2955 20 0 160008 57740 215578 Sl ? 2:28 /usr/bin/enigma2
root:~#
I will check again later tonight.
Regards,
Ian.
Re: T3 Memory Leak / Crash...
Hi Prl,
Just ran the "ps" again:
Code: Select all
root:~# date
Wed Jun 21 18:48:35 AEST 2017
root:~# ps lx | egrep 'PI\D|enigm\a2$'
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
4 0 2965 2955 20 0 162224 60192 215578 Sl ? 3:37 /usr/bin/enigma2
root:~#
Regards,
Ian.
Re: T3 Memory Leak / Crash...
It's ~2MB in 10-15 min. If it keeps growing like that, it could be a problem. More data points are needed.
Peter
Re: T3 Memory Leak / Crash...
Hi Prl,
Just got home. Here is the latest 'ps':
Code: Select all
root:~# date
Wed Jun 21 22:32:41 AEST 2017
root:~# ps lx | egrep 'PI\D|enigm\a2$'
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
4 0 3750 3737 20 0 180364 94740 - Rl ? 32:37 /usr/bin/enigma2
root:~#
Regards,
Ian.
Re: T3 Memory Leak / Crash...
I think you may have a couple of issues here.
First of all, it looks like you are getting a lot of noise on the serial port. This could be due to any number of factors, including bad SMPS, long cables, poor connections or external noise. I'd probably try to address this on my development system so that I can work on a platform that isn't introducing some kind of random issues into the mix.
However, it's unlikely that those hardware issues would be related to what looks like OOM (Out Of Memory) errors. Something is not freeing memory and also possibly causing memory fragmentation. It's good that you can reproduce the problem, but now we need to figure out how to track it down. I think it is definitely worthwhile to disable some autotimers. It also looks like you have quite a lot of timeshifted material in both of the crash logs, so perhaps a timeshift buffer of a day or so could also be involved.
If you want to dig deeper, be aware that enigma2 can be configured with --with-memcheck, which may be of use. I haven't tried that option, but it's there if you want to give it a go.
Re: T3 Memory Leak / Crash...
Hi Prl,
Here is another 'ps':
Code: Select all
root:~# date
Thu Jun 22 02:32:07 AEST 2017
root:~# ps lx | egrep 'PI\D|enigm\a2$'
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
4 0 3750 3737 20 0 207976 122412 215578 Sl ? 57:25 /usr/bin/enigma2
root:~#
Regards,
Ian.
Re: T3 Memory Leak / Crash...
Hi PeterU,
peteru wrote: ↑Thu Jun 22, 2017 02:00
First of all, it looks like you are getting a lot of noise on the serial port. This could be due to any number of factors, including bad SMPS, long cables, poor connections or external noise. I'd probably try to address this on my development system so that I can work on a platform that isn't introducing some kind of random issues into the mix.
The development T3 is in my study and near the PC capturing the logs. There are, however, a large number of computers and other devices in close proximity.
The serial logs never used to feature this much noise. I used to think it was related to serial port buffer overrun. It seems to have gotten worse when the T3 started to misbehave. Could this be related to the T3 power supply?
peteru wrote: ↑Thu Jun 22, 2017 02:00
However, it's unlikely that those hardware issues would be related to what looks like OOM (Out Of Memory) errors. Something is not freeing memory and also possibly causing memory fragmentation. It's good that you can reproduce the problem, but now we need to figure out how to track it down. I think it is definitely worthwhile to disable some autotimers. It also looks like you have quite a lot of timeshifted material in both of the crash logs, so perhaps a timeshift buffer of a day or so could also be involved.
I will organise to remove all AutoTimers over the next few days and see how this affects the problem.
Is --with-memcheck a command line or compile time option?
Regards,
Ian.
Re: T3 Memory Leak / Crash...
It's possible. Could be a deteriorating cable or even a cold solder joint somewhere. Buffer overruns are unlikely.
"Is this a command line or compile time option?"
It is an option you can give to configure. The build system will generate binaries with memory leak checks. Have a look at configure.ac in the root of the enigma2 source tree. From memory, I think it just defines MEMLEAK_CHECK in enigma2_config.h.
Re: T3 Memory Leak / Crash...
Hi,
Here is another 'ps':
Code: Select all
root:~# date
Fri Jun 23 13:58:41 AEST 2017
root:~# ps lx | egrep 'PI\D|enigm\a2$'
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
4 0 3750 3737 20 0 293412 182452 215578 Sl ? 220:30 /usr/bin/enigma2
root:~#
I don't have the ability to build an Enigma2 binary.
Regards,
Ian.
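To turn the posted samples into growth rates, a rough sketch follows. It uses only the three readings taken from PID 3750, since the earlier readings came from PID 2965 (enigma2 had been restarted in between) and are not comparable. The function name and the kB-per-hour unit are choices made for this illustration.

```python
from datetime import datetime

# Samples copied from the `date` and `ps lx` output posted for PID 3750.
# Each tuple is (timestamp, VSZ in kB, RSS in kB).
SAMPLES = [
    (datetime(2017, 6, 21, 22, 32, 41), 180364, 94740),
    (datetime(2017, 6, 22, 2, 32, 7), 207976, 122412),
    (datetime(2017, 6, 23, 13, 58, 41), 293412, 182452),
]


def growth_rates(samples):
    """Return (hours, VSZ kB/hour, RSS kB/hour) for each consecutive pair of samples."""
    rates = []
    for (t0, vsz0, rss0), (t1, vsz1, rss1) in zip(samples, samples[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        rates.append((hours, (vsz1 - vsz0) / hours, (rss1 - rss0) / hours))
    return rates


if __name__ == "__main__":
    for hours, vsz_rate, rss_rate in growth_rates(SAMPLES):
        print(f"{hours:6.2f} h: VSZ {vsz_rate:+8.0f} kB/h, RSS {rss_rate:+8.0f} kB/h")
```

Both intervals show VSZ and RSS still climbing, which is consistent with a leak rather than a one-off allocation, although three points are too few to say whether the rate is steady.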
Re: T3 Memory Leak / Crash...
You could try disabling AutoTimer (MENU>AutoTimer>Setup>Poll automatically) and see whether that stops the memory leaks. Obviously there's a tradeoff between how long you test it for and missing timer updates.
If that does identify AutoTimer as the culprit, perhaps then a binary search disabling parts of the autotimer.xml file to see whether it's a particular timer.
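The binary search over autotimer.xml could be organised along these lines. This is only a sketch: `leaks` is a hypothetical callback standing in for the manual step (enable just that subset of timers, run the box for a few hours, and report whether enigma2's memory still grows), and it assumes a single timer is responsible.

```python
def find_culprit(timers, leaks):
    """Narrow a list of timers down to one suspect by repeated halving.

    `leaks(subset)` is a hypothetical callback that returns True if memory
    still grows with only `subset` enabled. Assumes exactly one culprit.
    """
    candidates = list(timers)
    while len(candidates) > 1:
        half = len(candidates) // 2
        first, second = candidates[:half], candidates[half:]
        # If the first half still leaks, the culprit is in it; otherwise
        # it must be in the second half.
        candidates = first if leaks(first) else second
    return candidates[0] if candidates else None


if __name__ == "__main__":
    # Made-up timer names, with "news" playing the part of the leaking timer.
    timers = ["movies", "docos", "sport", "news", "kids"]
    print(find_culprit(timers, lambda subset: "news" in subset))
```

With N timers this needs about log2(N) test runs, but each run costs hours of observation, so the tradeoff against missed timer updates still applies.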
Peter
Re: T3 Memory Leak / Crash...
Hi Prl,
I am hoping to find some time over the weekend to play around with the AutoTimers.
Regards,
Ian.