T3 Memory Leak / Crash...

Moderators: Gully, peteru

IanSav
Uber Wizard
Posts: 16846
Joined: Tue May 29, 2007 15:00
Location: Melbourne, Australia

T3 Memory Leak / Crash...

Post by IanSav » Tue Jun 20, 2017 13:56

Hi,

I am experiencing a memory leak freeze/crash on my T3. I have attached the serial port log just prior to the crash.

During normal TV or video watching I am concerned by the number of times the spinner appears when there is no user interaction or apparent need or cause for the spinner.

I am running the 2017-04-22 08:49 version of the firmware.

Any suggestions on what is going wrong? Would any additional information be helpful?

Regards,
Ian.
Attachment: Leak.txt (34.45 KiB), serial port log from problematic T3.

prl
Wizard God
Posts: 32709
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: T3 Memory Leak / Crash...

Post by prl » Tue Jun 20, 2017 14:17

Why do you think it's a memory leak?

The crash info in the log is for a kernel panic, though the process that invoked the system call that caused the panic is enigma2.

The panic message is "CPU 0 Unable to handle kernel paging request at virtual address 0000001c, epc == e09c7820, ra == e09daa98".
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

MrQuade
Uber Wizard
Posts: 11844
Joined: Sun Jun 24, 2007 13:40
Location: Perth

Post by MrQuade » Tue Jun 20, 2017 16:27

IanSav wrote (Tue Jun 20, 2017 13:56):
    I am concerned by the number of times the spinner appears when there is no user interaction or apparent need or cause for the spinner.

I'd be worried if I *ever* saw the spinner without any user interaction let alone on a frequent basis.

Logitech Harmony Ultimate+Elite RCs
Beyonwiz T2/3/U4/V2, DP-S1 PVRs
Denon AVR-X3400h, LG OLED65C7T TV
QNAP TS-410 NAS, Centos File Server (Hosted under KVM)
Ubiquiti UniFi Managed LAN/WLAN, Draytek Vigor130/Asus RT-AC86U Internet
Pixel 4,5&6, iPad 3 Mobile Devices

Post by prl » Tue Jun 20, 2017 16:52

MrQuade wrote (Tue Jun 20, 2017 16:27):
    IanSav wrote (Tue Jun 20, 2017 13:56):
        I am concerned by the number of times the spinner appears when there is no user interaction or apparent need or cause for the spinner.
    I'd be worried if I *ever* saw the spinner without any user interaction let alone on a frequent basis.

My first suspect would be AutoTimer and/or its interaction with EPGRefresh, memory leak or not.

peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia

Post by peteru » Wed Jun 21, 2017 04:38

Looking at the log, I can see this:
  • You pressed FAV, selected a new service and pressed OK.
  • As soon as you did this, the audio decoder crashed. This was deep in the driver code.
  • About 4 seconds later the fatal spinner appeared.
    (All the other spinner instances appear to be unrelated to this. I'd be suspecting autotimer, but there isn't enough evidence in the log to squarely point a finger at it.)
  • Once the driver crashes, the UI is bound to lock up pretty quickly and it looks like it did that.
It's hard to tell what caused the audio decoder to crash, but judging by the values in the registers, the ultimate crash cause was a NULL pointer. This would have been in response to enigma2 issuing an AUDIO_PLAY ioctl.

I suspect that this is a rare crash scenario that you will not be able to reproduce.

I also see that you are using the Beyonwiz USB tuner. Do you happen to know if either the service prior or after the channel change was being received using the USB tuner? It may or may not be related. I don't think the WiFi dongle would have been involved.

"Beauty lies in the hands of the beer holder."
Blog.

Post by IanSav » Wed Jun 21, 2017 17:12

Hi,

I just stopped at home and am on my way out again, but when I turned on the screen to update some timers I noticed that the frequent spinners have returned, appearing every minute or so.

I have attached some more serial port logs showing some odd errors and lots of spinners:

{18029}<100073.466> [eDVBPESReader] ERROR reading PES (fd=67): Value too large for defined data type
{18029}<100073.466> [eMainloop::processOneEvent] unhandled POLLERR/HUP/NVAL for fd 67(8)
{18030}<100127.545> [gRC] main thread is non-idle! display spinner!
UPDATE: While I was typing this message the unit spontaneously rebooted while recording Star Trek. I will grab the extra logs in case they help identify the issue. I believe the reboot is the RAM leak problem I reported originally.
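
To put a number on "every minute or so", the spinner entries can be pulled out of a saved serial log. This is only a minimal sketch: it assumes the value in angle brackets on each log line is a timestamp in seconds, and the function names `spinner_times` and `spinner_gaps` are made up for illustration.

```shell
# Print the <...> timestamp of every spinner line read from stdin.
# Assumption: lines look like "{18030}<100127.545> [gRC] ... display spinner!"
spinner_times() {
    awk -F'[<>]' '/display spinner/ { print $2 }'
}

# Print the gap in seconds between consecutive spinner timestamps.
spinner_gaps() {
    spinner_times | awk 'NR > 1 { printf "%.3f\n", $1 - prev } { prev = $1 }'
}
```

For example, `spinner_times < Leak2.txt | wc -l` would count the spinner events in the attached log, and `spinner_gaps < Leak2.txt` would show how far apart they are.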

Regards,
Ian.
Attachment: Leak2.txt (144.77 KiB), more serial port log from problematic T3.

Post by IanSav » Wed Jun 21, 2017 17:18

Hi PeterU,

peteru wrote (Wed Jun 21, 2017 04:38):
    I suspect that this is a rare crash scenario that you will not be able to reproduce.

Unfortunately it has been happening a lot recently. Could this be a hint of the T3 power supply problem?

peteru wrote (Wed Jun 21, 2017 04:38):
    I also see that you are using the Beyonwiz USB tuner. Do you happen to know if either the service prior or after the channel change was being received using the USB tuner? It may or may not be related. I don't think the WiFi dongle would have been involved.

To be honest the USB tuner was plugged in when I got the T3 and it has been sitting there ever since. I doubt it has ever been needed or called into service. This is the T3 I use for development so it doesn't have significant recording duties.

Regards,
Ian.

Post by prl » Wed Jun 21, 2017 18:16

IanSav wrote:
Wed Jun 21, 2017 17:12
... I believe the reboot is the RAM leak problem I reported originally. ...

The reboot seems to have been caused by an OOM (Out Of Memory) error in the kernel.

The total VM size reported for enigma2 (87463 pages, 342MB) is about double what it is for a reasonably recently rebooted T4 (42371 pages, 166MB).
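
The page counts in the OOM report convert to megabytes as pages × page size. A quick check of the figures above, assuming the 4 KiB page size that those figures imply (the helper name `pages_to_mb` is made up for this example):

```shell
# Convert a page count to MiB, assuming 4 KiB pages.
pages_to_mb() {
    awk -v pages="$1" 'BEGIN { printf "%.1f\n", pages * 4 / 1024 }'
}

pages_to_mb 87463   # T3 enigma2 at the time of the OOM
pages_to_mb 42371   # recently rebooted T4
```

These print 341.7 and 165.5, which round to the 342MB and 166MB quoted. As a cross-check, 42371 pages × 4 is 169484 kB, the same number that ps reports as VSZ for enigma2 on the T4.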

The sizes of the next two largest VM processes, djmount and hbbtv.app, don't seem to be out of the ordinary compared to the T4.

If this is a test machine, can you try disabling AutoTimer for a bit and see whether that gets rid of the memory use problem? Alternatively, put the active machine's autotimer.xml file on a test machine and see whether that causes a memory blowout.

You can check the virtual memory use of enigma2 with:

root@beyonwizt4:~# ps lx | egrep 'PI\D|enigm\a2$'
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0   552   535  20   0 169484 67844 poll_s Sl   ?          0:42 /usr/bin/enigma2
root@beyonwizt4:~#
[the backslashes in the egrep RE stop egrep from matching itself in the ps listing 8)]

VSZ is the total virtual size in kB. RSS is the resident set size, also in kB.
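
For logging it can be handy to pull just the two memory columns out of that listing. A small sketch, assuming the column layout shown above (VSZ in column 7, RSS in column 8) and a command line that is exactly /usr/bin/enigma2; `enigma2_mem` is a made-up helper name. Matching on the last field in awk also sidesteps the self-match problem, because awk's own ps entry ends in its script text rather than in the enigma2 path.

```shell
# Read `ps lx` output on stdin and print "VSZ RSS" (both in kB) for enigma2.
enigma2_mem() {
    awk '$NF == "/usr/bin/enigma2" { print $7, $8 }'
}
```

Typical use would be `ps lx | enigma2_mem`.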

Post by IanSav » Wed Jun 21, 2017 18:33

Hi,

It seems that this machine has more AutoTimers than the other machines. :o When I have a chance I will kill all the AutoTimers and see what the effect may be on memory usage.

I should also confirm that this machine is not currently running any custom or development code. The code is present on the disk but not active.

Regards,
Ian.

Post by IanSav » Wed Jun 21, 2017 18:37

Hi Prl,

This is the current data for the "ps":

root:~# ps lx | egrep 'PI\D|enigm\a2$'
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0  2965  2955  20   0 160008 57740 215578 Sl   ?          2:28 /usr/bin/enigma2
root:~#
I will check again later tonight.

Regards,
Ian.

Post by IanSav » Wed Jun 21, 2017 18:50

Hi Prl,

Just ran the "ps" again:

root:~# date
Wed Jun 21 18:48:35 AEST 2017
root:~# ps lx | egrep 'PI\D|enigm\a2$'
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0  2965  2955  20   0 162224 60192 215578 Sl   ?          3:37 /usr/bin/enigma2
root:~#
Regards,
Ian.

Post by prl » Wed Jun 21, 2017 19:01

It's ~2MB in 10-15 min. If it keeps growing like that, it could be a problem. More data points needed :)
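
One way to collect more data points without pasting ps output by hand is to log a timestamped sample at a fixed interval. A rough sketch, assuming a Linux /proc filesystem whose per-process status file reports VmSize and VmRSS in kB; the helper names and the log path used below are hypothetical.

```shell
# Print "VmSize_kB VmRSS_kB" from a /proc/<pid>/status-style file.
vm_sample() {
    awk '/^VmSize:/ { size = $2 } /^VmRSS:/ { rss = $2 } END { print size, rss }' "$1"
}

# Append one timestamped sample for the given PID to a log file.
# Args: pid logfile
log_sample() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') $(vm_sample "/proc/$1/status")" >> "$2"
}
```

Something like `while sleep 600; do log_sample "$(pidof enigma2)" /tmp/enigma2-mem.log; done` would then take a sample every ten minutes, assuming pidof is available on the box.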

Post by IanSav » Wed Jun 21, 2017 22:34

Hi Prl,

Just got home. Here is the latest 'ps':

root:~# date
Wed Jun 21 22:32:41 AEST 2017
root:~# ps lx | egrep 'PI\D|enigm\a2$'
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0  3750  3737  20   0 180364 94740 -      Rl   ?         32:37 /usr/bin/enigma2
root:~#
Regards,
Ian.

Post by peteru » Thu Jun 22, 2017 02:00

I think you may have a couple of issues here.

First of all, it looks like you are getting a lot of noise on the serial port. This could be due to any number of factors, including bad SMPS, long cables, poor connections or external noise. I'd probably try to address this on my development system so that I can work on a platform that isn't introducing some kind of random issues into the mix.

However, it's unlikely that those hardware issues would be related to what looks like OOM (Out Of Memory) errors. Something is not freeing memory and also possibly causing memory fragmentation. It's good that you can reproduce the problem, but now we need to figure out how to track it down. I think it is definitely worthwhile to disable some autotimers. It also looks like you have quite a lot of timeshifted material in both of the crash logs, so perhaps a timeshift buffer of a day or so could also be involved.

If you want to dig deeper, be aware that enigma2 can be configured with --with-memcheck, which may be of use. I haven't tried that option, but it's there if you want to give it a go.

"Beauty lies in the hands of the beer holder."
Blog.

Post by IanSav » Thu Jun 22, 2017 02:37

Hi Prl,

Here is another 'ps':

root:~# date
Thu Jun 22 02:32:07 AEST 2017
root:~# ps lx | egrep 'PI\D|enigm\a2$'
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0  3750  3737  20   0 207976 122412 215578 Sl  ?         57:25 /usr/bin/enigma2
root:~#
Regards,
Ian.

Post by IanSav » Thu Jun 22, 2017 02:41

Hi PeterU,

peteru wrote (Thu Jun 22, 2017 02:00):
    First of all, it looks like you are getting a lot of noise on the serial port. This could be due to any number of factors, including bad SMPS, long cables, poor connections or external noise. I'd probably try to address this on my development system so that I can work on a platform that isn't introducing some kind of random issues into the mix.

The development T3 is in my study and near the PC capturing the logs. There is, however a serious number of computers and other devices in close proximity.

The serial logs never used to feature this much noise. I used to think this related to serial port buffer overrun. This seems to have gotten worse when the T3 started to misbehave. Could this be related to the T3 power supply?

peteru wrote (Thu Jun 22, 2017 02:00):
    However, it's unlikely that those hardware issues would be related to what looks like OOM (Out Of Memory) errors. Something is not freeing memory and also possibly causing memory fragmentation. It's good that you can reproduce the problem, but now we need to figure out how to track it down. I think it is definitely worthwhile to disable some autotimers. It also looks like you have quite a lot of timeshifted material in both of the crash logs, so perhaps a timeshift buffer of a day or so could also be involved.

I will organise to remove all AutoTimers over the next few days and see how this affects the problem.

peteru wrote (Thu Jun 22, 2017 02:00):
    If you want to dig deeper, be aware that enigma2 can be configured with --with-memcheck, which may be of use. I haven't tried that option, but it's there if you want to give it a go.

Is this a command line or compile time option?

Regards,
Ian.

Post by peteru » Thu Jun 22, 2017 04:39

IanSav wrote (Thu Jun 22, 2017 02:41):
    The serial logs never used to feature this much noise. I used to think this related to serial port buffer overrun. This seems to have gotten worse when the T3 started to misbehave. Could this be related to the T3 power supply?

It's possible. Could be a deteriorating cable or even a cold solder joint somewhere. Buffer overruns are unlikely.

IanSav wrote:
    peteru wrote:
        enigma2 can be configured with --with-memcheck, which may be of use. I haven't tried that option, but it's there if you want to give it a go.
    Is this a command line or compile time option?

It is an option you can give to configure. The build system will generate binaries with memory leak checks. Have a look at configure.ac in the root of the enigma2 source tree. I think from memory it just defines MEMLEAK_CHECK in enigma2_config.h.

"Beauty lies in the hands of the beer holder."
Blog.

Post by IanSav » Fri Jun 23, 2017 14:01

Hi,

Here is another 'ps':

root:~# date
Fri Jun 23 13:58:41 AEST 2017
root:~# ps lx | egrep 'PI\D|enigm\a2$'
F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0  3750  3737  20   0 293412 182452 215578 Sl  ?        220:30 /usr/bin/enigma2
root:~#
I don't have the ability to build an Enigma2 binary.

Regards,
Ian.
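
The three samples taken from the same enigma2 process (PID 3750) give a rough growth rate. A back-of-the-envelope sketch using integer shell arithmetic; the helper name `growth_kb_per_hour` is made up, and the elapsed seconds are worked out by hand from the `date` lines in the posts above.

```shell
# Average growth in kB per hour between two samples (integer division).
# Args: first_kb second_kb elapsed_seconds
growth_kb_per_hour() {
    echo $(( ($2 - $1) * 3600 / $3 ))
}

# Jun 21 22:32:41 -> Jun 22 02:32:07 is 14366 s
growth_kb_per_hour 180364 207976 14366    # VSZ
growth_kb_per_hour 94740 122412 14366     # RSS

# Jun 22 02:32:07 -> Jun 23 13:58:41 is 127594 s
growth_kb_per_hour 207976 293412 127594   # VSZ
growth_kb_per_hour 122412 182452 127594   # RSS
```

That works out to 6919 and 6934 kB per hour (VSZ and RSS) over the first interval, and 2410 and 1693 kB per hour over the second. Three samples are too few to say whether the growth is steady or comes in bursts.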

Post by prl » Fri Jun 23, 2017 14:27

You could try disabling AutoTimer (MENU>AutoTimer>Setup>Poll automatically) and see whether that stops the memory leaks. Obviously there's a tradeoff between how long you test it for and missing timer updates.

If that does identify AutoTimer as the culprit, perhaps then a binary search disabling parts of the autotimer.xml file to see whether it's a particular timer.
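
The binary search suggested above halves the set of candidate timers on every test run. This is a sketch of the bookkeeping only, assuming the timers are numbered 1..N in file order and that a single timer is responsible; `narrow` is a hypothetical helper, and actually enabling or disabling entries in autotimer.xml is left as a manual step.

```shell
# Given the current candidate range and whether the leak still occurred
# with only the first half of that range enabled, print the next range.
# Args: lo hi leaked(yes|no)
narrow() (
    lo=$1 hi=$2 leaked=$3
    mid=$(( (lo + hi) / 2 ))
    if [ "$leaked" = yes ]; then
        echo "$lo $mid"            # culprit is in the half that was enabled
    else
        echo "$((mid + 1)) $hi"    # culprit is in the half that was disabled
    fi
)
```

With 16 timers, `narrow 1 16 no` prints `9 16` and a following `narrow 9 16 yes` prints `9 12`, so a single bad timer is isolated in at most four test runs.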

Post by IanSav » Fri Jun 23, 2017 15:13

Hi Prl,

I am hoping to find some time over the weekend to play around with the AutoTimers.

Regards,
Ian.
