IceTV data causing denial of service

Discuss the IceTV EPG and Recording Apps here

Moderators: Gully, peteru

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

IceTV data causing denial of service

Post by peteru » Thu Oct 29, 2020 22:39

There is some evidence of IceTV servers sending out data that causes the IceTV plugin to consume large amounts of memory. On machines with low free memory, such as the T2, this results in the machine becoming unresponsive or crashing. A classic case of denial of service. Presumably this is accidental and IceTV cannot figure out the root cause, and thus cannot prevent the bad data from being transmitted.

I'd like to see if the IceTV plugin could be made more resilient to this problem. However, I do not have access to IceTV and therefore am not in a position to investigate.

Has anyone figured out the cause of this? What is the IceTV server sending that causes the plugin to chew up memory? Is it in fact a case of memory use?

Should I push out a firmware update that disables IceTV for all users and gives them a warning that continuing to use IceTV could result in their system crashing? Users would have the option to force-enable IceTV to continue using the service at their own risk.

"Beauty lies in the hands of the beer holder."
Blog.

User avatar
Gully
Moderator
Posts: 7736
Joined: Thu Aug 30, 2007 22:08
Location: Melbourne

Re: IceTV data causing denial of service

Post by Gully » Thu Oct 29, 2020 23:31

peteru wrote:
Thu Oct 29, 2020 22:39

Should I push out a firmware update that disables IceTV for all users and gives them a warning that continuing to use IceTV could result in their system crashing? Users would have the option to force-enable IceTV to continue using the service at their own risk.
That might be a bit too much of a jump and cause a lot of confusion. If there was a way of adding a warning without disabling, that would be better.

And of course if someone who uses IceTV can help track down the problem better still.
Cheers
Gully
_____________
Beyonwiz U4
Logitech Harmony Elite
Google Pixel 6 Pro

User avatar
Brainz
Apprentice
Posts: 62
Joined: Wed May 14, 2014 21:27
Location: Brisbane

Re: IceTV data causing denial of service

Post by Brainz » Thu Oct 29, 2020 23:46

I spent 30 minutes typing in a reply, hit submit and it disappeared, so here is a short version.

When I try to upgrade to 19.3 my T2 becomes flaky and I revert to 17.5. This has happened many times.
The symptoms are similar to those experienced on the 25th with the IceTV crash.
17.5 has a different version of the plugin to 19.3.
I can't go without IceTV at the moment, so has anyone compared the memory usage of both plugins?

I must learn to copy text before posting! :evil:
V2 1TB (19.3.20211010) w/ Hauppauge Dual Tuner
T2 500GB (17.5.20190207) w/ Bluey USB Tuner
DP-P1 500Gb (Retired)

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

Re: IceTV data causing denial of service

Post by peteru » Fri Oct 30, 2020 00:28

Agreed, a fix would be best. However, from what I am seeing, there are still reports of IceTV users having issues with the stability of their machines. To the best of my knowledge, IceTV never disclosed the nature of the problem, the root cause, mitigation strategies or other pertinent information.

I am concerned that there is no obvious correlation between the problems and IceTV. To many users this looks like a hardware problem, and there have been instances where users were advised to reflash the firmware, investigate hardware faults or even replace power supplies.

From my point of view, the risk analysis suggests that leaving IceTV enabled without informing the users of possible crashes or lockups is riskier than forcing the users to choose whether to stop using IceTV or take a gamble. My proposal is to only have this warning enabled until the root cause has been tracked down and the underlying problem resolved.

And, yes, I'd much rather invest the effort towards a proper fix, rather than a giant warning sign. :roll:

"Beauty lies in the hands of the beer holder."
Blog.

Paul_oz53
Wizard
Posts: 2791
Joined: Sat Jun 13, 2009 02:34
Location: Melbourne

Re: IceTV data causing denial of service

Post by Paul_oz53 » Fri Oct 30, 2020 02:56

Peteru,

Now that it is apparent that T2 users with IceTV are seeing ongoing problems, your proposed approach seems reasonable. My only concern would be how effectively this will reach all affected users. There are few options and probably none better, so I support it as a temporary measure pending a fix by IceTV.

I haven't had any problems with the T3 or T4 but one of the U4s did reboot when the problem first appeared.

As I now know to look out for it, I will probably ignore the update should you push one out, unless something material is also included. But that's just a personal choice.
__________________________________
Paul
Beyonwiz T4, 2 x U4: FW - 19.3 20211010
Samsung QA85Q80BAWXXY 4K TV
Samsung QA65Q80TAWXXY 4K TV
Samsung HW Q800BXY soundbar
OverlayHD 1.70, IceTV, Foxtel IQ4
2 x Win7 PCs, 2 x Win10 PCs
Denon AVR -X2400H

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Fri Oct 30, 2020 08:50

peteru wrote:
Thu Oct 29, 2020 22:39
Has anyone figured out the cause of this? What is the IceTV server sending that causes the plugin to chew up memory? Is it in fact a case of memory use?

I was contacted by Daniel Hall@IceTV on Sunday afternoon to see why IceTV seemed to be failing for a lot of Beyonwiz users. I was just getting going trying to find the source of the problem when Daniel restarted the server and the most obvious problem, the EPG importer throwing an exception because of an out-of-range Python int conversion to a C long, went away.
peteru wrote:
Thu Oct 29, 2020 22:39
Should I push out a firmware update that disables IceTV for all users and gives them a warning that continuing to use IceTV could result in their system crashing? Users would have the option to force-enable IceTV to continue using the service at their own risk.

Since Daniel's restart of the IceTV server, I haven't noticed any problems, apart from the introduction of some new genres, on either my test V2 or on our in-use T4. I'm happy to look into the problem if I can get some idea of how to replicate it.

I just did two full fetches of the IceTV EPG on my test V2 without any issues. That's not saying that there aren't lingering problems, but if I can't see the problem it's hard to work out what it might be.

Crash logs from affected machines (if crash logs are created) would be a start.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Fri Oct 30, 2020 08:57

I just checked the logs on our in-use T4 and there are no crash files, and no hint of any IceTV failures (apart from the new genres) in the logs.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Fri Oct 30, 2020 09:14

Following up on peteru's post in another topic:
peteru wrote:
Thu Oct 29, 2020 22:22
Apparently the IceTV servers are sending out data that causes the IceTV plugin to consume large amounts of memory. On machines with low free memory, such as the T2, this results in the machine becoming unresponsive or crashing. A classic case of denial of service, but probably not worth raising a CVE.

I tried IceTV on my test T2, but I can't see any problems with it there, either.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

Grumpy_Geoff
Uber Wizard
Posts: 6490
Joined: Thu Mar 05, 2009 22:54
Location: Perth

Re: IceTV data causing denial of service

Post by Grumpy_Geoff » Fri Oct 30, 2020 11:38

prl wrote:
Fri Oct 30, 2020 09:14
I tried IceTV on my test T2, but I can't see any problems with it there, either.


Our T2 was booted this morning and has been up for more than 3 hours and has done many IceTV fetches. There were 3 full EPG fetches, the first at startup, the other 2 forced by disabling/re-enabling IceTV then fetching.
For those 2 "forced full guide fetches" I observed through "top" that the %MEM for the enigma2 process peaked at 43.4% but quickly dropped to 39.9% for the fetch duration and then down again to 34.7% once the fetch had finished.

Currently, it's looking fine -

Code: Select all

root@beyonwizt2:~# top -bn1 -o '%MEM' | head -n 15
top - 09:32:29 up  3:21,  2 users,  load average: 0.15, 0.10, 0.09
Tasks:  95 total,   1 running,  94 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.3 us,  5.3 sy,  0.0 ni, 86.8 id,  0.0 wa,  0.0 hi,  2.6 si,  0.0 st
MiB Mem :    226.2 total,     14.9 free,    146.9 used,     64.4 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.     55.8 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  599 root      20   0  189544  80016   5572 S   0.0  34.5   5:55.53 enigma2
  860 root      20   0  120104   6904   3292 S   0.0   3.0   0:24.01 hbbtv.app
  737 root      20   0   49452   4904   2948 S   0.0   2.1   0:00.45 smbd
  532 root      20   0   39840   3452   1776 S   0.0   1.5   0:00.48 smbd
  519 root      20   0   52872   2684   1364 S   0.0   1.2   0:02.36 minidlnad
 5165 root      20   0    5428   2596   2256 S   0.0   1.1   0:00.88 sshd
  540 root      20   0   26124   2548   1252 S   0.0   1.1   0:00.20 nmbd
  535 root      20   0   37356   2352    744 S   0.0   1.0   0:00.01 smbd-notifyd
root@beyonwizt2:~#

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Fri Oct 30, 2020 12:08

Is there any evidence of a problem that didn't exist before 24/25 Oct continuing after 26 Oct? The (possibly) OOM problems that csutak40 (Judy) reported existed before the IceTV problems on 24/25 Oct, and continued after the IceTV server restart at around 17:00 on 26 Oct.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

Re: IceTV data causing denial of service

Post by peteru » Fri Oct 30, 2020 15:27

The date range you quote has seen a large number of users being impacted severely. This would suggest that whatever the trigger condition was, it was widespread. However, that does not mean that the same issue has not been present in a limited number of configurations for a much longer period. There is at least one user who suggests that his system becomes unstable with 19.3, but is fine with 17.5, and who now thinks that IceTV may be a factor. I know, I know, single data point with no solid evidence...

And therein lies the problem. We know that there is potential for the IceTV server to send data that causes a denial of service on Beyonwiz machines. No root cause is known and there is not much evidence available to figure out what went wrong. My enquiries at IceTV revealed that in order to fix the problem, IceTV restarted the API service and as a result lost any data that would enable further troubleshooting or analysis.

I have now reconsidered the need for the update that would warn users and suggest disabling IceTV. Earlier in the week it appeared as if the issue was still present for some users, but perhaps it was a case of delayed reporting of the issues experienced last weekend. I think that for now there's no need to make changes, but the situation should be monitored.

I will try to build a new beta firmware release that contains the recent IceTV plugin updates and release it sometime this weekend.

It would be helpful if some IceTV users could run with debug logging enabled over the next few weeks and occasionally examine the logs for signs of anything out of the ordinary.

"Beauty lies in the hands of the beer holder."
Blog.

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Fri Oct 30, 2020 15:45

peteru wrote:
Fri Oct 30, 2020 15:27
I will try to build a new beta firmware release that contains the recent IceTV plugin updates and release it sometime this weekend.

I wouldn't expect those changes to make any difference, apart from allowing the new IceTV genres to appear, but I think it's better to have them than not.
peteru wrote:
Fri Oct 30, 2020 15:27
It would be helpful if some IceTV users could run with debug logging enabled over the next few weeks and occasionally examine the logs for signs of anything out of the ordinary.

+1

I already do this, and apart from the known IceTV problems on 24/5 Oct and the known issues surrounding the new genres, I've seen nothing in my debug logs.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

Re: IceTV data causing denial of service

Post by peteru » Fri Oct 30, 2020 17:24

It may be worthwhile to audit the IceTV plugin code and proactively free any data structures as soon as they are no longer needed. It could be a case of some reference being held for too long.

Given some of the descriptions and log archaeology, it sounds like the amount of data received from IceTV and the number of times IceTV is contacted makes a difference. That would suggest that memory usage is accumulating and user specific configuration could be a significant factor.

The bug could be anywhere: the IceTV plugin, the Python/C++ interface or the C++ code. Or it may not be a bug at all; it could be the nature of the data being sent, combined with the algorithm selection, that results in a very large footprint.
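As a sketch of what that kind of proactive cleanup could look like (all helper names here are hypothetical stand-ins, not the plugin's actual API), each large intermediate is dropped as soon as it has been consumed, with a gc.collect() to mop up reference cycles. In CPython, deleting the last reference frees the object immediately:

```python
import gc
import weakref

class Shows(list):
    """Weakref-able stand-in for the large parsed JSON EPG."""

def build_map(shows):
    # Stand-in for converting the JSON into per-channel event lists.
    return {"ch1": list(shows)}

imported = []

def import_into_cache(channel_show_map):
    # Stand-in for the EPG cache import.
    imported.append(sum(len(v) for v in channel_show_map.values()))

def process_epg(fetch_json):
    shows = fetch_json()              # large parsed JSON structure
    probe = weakref.ref(shows)        # only here to observe collection
    channel_show_map = build_map(shows)
    del shows                         # release the raw JSON before importing
    import_into_cache(channel_show_map)
    del channel_show_map              # don't hold the map past the import
    gc.collect()                      # mop up any lingering reference cycles
    return probe

probe = process_epg(lambda: Shows(range(3)))
print(probe() is None)                # True: the big structure is gone
```

If some reference is being held too long, the weakref would still be live at that point, which makes this pattern usable as a diagnostic as well as a fix.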

"Beauty lies in the hands of the beer holder."
Blog.

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sat Oct 31, 2020 09:54

The two largest data structures in the IceTV plugin are shows (representation of the JSON EPG) and channel_show_map (EPG converted ready to import into the EPG cache). They are both variables local to EPGFetcher.doWork(), so I would normally expect them to be dereferenced and made available for garbage collection as soon as doWork() exits. The only thing that might cause problems there would be if somehow the EPG cache import held references into channel_show_map (or at least the Python garbage collector thought it did).

However, I also have code that I developed for the ATV IceTV plugin that reduces the amount of data space required to do an IceTV load. It was developed for the German IceTV, which has a 14-day EPG and can have up to 150 channels (though if the user configures properly for their region, it's usually quite a lot less).

That code was developed and initially tested on the Beyonwiz version of the plugin, so it should merge in quite easily.

The two main changes are:
  • Fetch the EPG in channel batches. The default for this is 30 channels at a time, which wouldn't make much difference for AU IceTV, but it's easy to change (even dynamically on the basis of things like available memory).
  • The incoming JSON EPG is processed and imported channel at a time, so that the new code equivalent of channel_show_map only ever has data for a single channel in it.
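A minimal sketch of those two changes (helper and callback names are hypothetical, not the actual plugin code): the fetch callback only ever sees one batch of channels, and each channel's events are handed off and released before the next is processed.

```python
def batched(seq, size):
    """Yield successive slices of seq, each at most `size` items."""
    size = max(1, size)
    for i in range(0, len(seq), size):
        yield seq[i:i + size]

def fetch_all(channels, fetch_batch, import_channel, batchsize=30):
    # Fetch the JSON EPG for one batch of channels at a time, then import
    # it channel by channel, popping each entry so that at most one batch
    # of JSON and one channel's events are live at any moment.
    for batch in batched(channels, batchsize):
        epg = fetch_batch(batch)                  # dict: channel -> events
        for channel_id in list(epg):
            import_channel(channel_id, epg.pop(channel_id))
```

With batchsize=30 and, say, 70 subscribed channels, the server would be hit three times (30, 30, 10), but peak memory is bounded by the largest single batch rather than the whole guide.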
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

Re: IceTV data causing denial of service

Post by peteru » Sat Oct 31, 2020 14:51

Trying to work backwards from the exception error message...

There was a value supplied that was expected to fit in a long, but was too large. It looks like the conversion failed in the C++ code for eEPGCache::importEvents. There are several possible values that fit that profile:
  • single serviceReference as a tuple:
    • onid
    • tsid
    • sid
  • serviceReferences as a list using tuples:
    • onid
    • tsid
    • sid
  • for each event in list of events
    • start
    • duration
    • event type
    • event id
    • parental rating
That information should reduce the search space, assuming that this is actually where the problem originated.
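As a sketch of how that field list could be used (a hypothetical guard, not the plugin's actual code), each candidate numeric value can be range-checked against the native C long before being handed to importEvents:

```python
import ctypes

# Range of a native signed C long (what PyArg_ParseTuple's "l" format
# converts to); 32 bits on the target hardware, 64 on a desktop build.
LONG_BITS = 8 * ctypes.sizeof(ctypes.c_long)
LONG_MIN = -(1 << (LONG_BITS - 1))
LONG_MAX = (1 << (LONG_BITS - 1)) - 1

def check_event(event):
    """Hypothetical guard: raise before handing importEvents a value
    that would overflow the C long conversion."""
    start, duration, event_type, event_id, rating = event
    for name, value in (("start", start), ("duration", duration),
                        ("event type", event_type), ("event id", event_id),
                        ("parental rating", rating)):
        if not LONG_MIN <= value <= LONG_MAX:
            raise ValueError("%s out of C long range: %d" % (name, value))
    return event
```

Rejecting (or logging) a bad event in Python, before the C++ conversion, would turn a fetch-aborting exception into a single skipped event.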

"Beauty lies in the hands of the beer holder."
Blog.

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sat Oct 31, 2020 15:02

I had already worked back through that and was starting to add debug code to localise where the error was when Daniel Hall restarted the IceTV server and I lost any chance of finding the problem.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

Re: IceTV data causing denial of service

Post by peteru » Sat Oct 31, 2020 15:08

If IceTV got a list of the fields that were likely contributors, could they figure out where things went wrong?

One would hope that their logs may have some information that could potentially remove some of the suspects from the list.

"Beauty lies in the hands of the beer holder."
Blog.

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

Re: IceTV data causing denial of service

Post by peteru » Sat Oct 31, 2020 15:10

My guess would be that duration is the most likely suspect, followed by start and then the DVB triplet.

"Beauty lies in the hands of the beer holder."
Blog.

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sat Oct 31, 2020 15:14

I have my code from ATV ready to submit if you think the changes are worthwhile. There are also some speed enhancements as well as the reduction in processing space requirements.

The code as it is now would always use a channel batch size of 30, which would fetch all channels at once in most regions.

I've been thinking of changing that fixed size to something like:

Code: Select all

# Fetch EPG in batches of channels no larger than this size.
# 0 disables batching - fetch EPG for all channels in 1 batch

batchsize = 30
try:
    for l in open("/proc/meminfo"):
        f = l.split()
        if len(f) >= 2 and f[0] == "MemTotal:" and int(f[1]) < 512 * 1024:
            batchsize = 10
except IOError:
    pass

config.plugins.icetv.batchsize = ConfigSelectionNumber(0, 50, 1, default=batchsize)
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sat Oct 31, 2020 15:58

peteru wrote:
Sat Oct 31, 2020 15:10
My guess would be that duration is the most likely suspect, followed by start and then the DVB triplet.
When I was talking to Daniel, he checked that the durations and start times were reasonable.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sat Oct 31, 2020 15:59

peteru wrote:
Sat Oct 31, 2020 15:08
One would hope that their logs may have some information that could potentially remove some of the suspects from the list.

I emailed Daniel Hall earlier today asking whether he had a handle on what had gone wrong with the IceTV servers on 25 Oct.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

User avatar
grampus
Wizard
Posts: 1553
Joined: Sun Sep 16, 2007 07:16
Location: Williamstown Vic

Re: IceTV data causing denial of service

Post by grampus » Sat Oct 31, 2020 16:45

Hope this doesn't send you guys off in the wrong direction, but I was having a problem on both my devices that required me to re-enter my IceTV password. It seemed that the password field was being cleared. This would be resolved for a day or so, then it would appear again.

I raised a problem report with Ice, and it seemed it was a problem for a number of folk.

After this seemed to be resolved, the next iteration in the Ice problems was that I would have reboots which appeared to coincide with doing a manual "pull" for EPG and timers. Naturally, this would affect any current active recordings, i.e. the same recording being split and restarted. I have no crash reports to share, unfortunately.

Currently there are no indications of either problem reoccurring. Maybe they were influenced by the recent server reboot?

But just put this out there for what it's worth.
Screen: Panasonic TH-60ST60A; BeyonWiz U4; T3; Panasonic BD-35 Blu_Ray player

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sat Oct 31, 2020 17:21

Every now and again, IceTV makes me log in for no apparent reason. It happens infrequently enough that I've never bothered to try to work out why.

It wouldn't surprise me if the server restart dropped current logins, though.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

Re: IceTV data causing denial of service

Post by peteru » Sun Nov 01, 2020 00:42

prl wrote:
Sat Oct 31, 2020 15:14
...
I've been thinking of changing that fixed size something like:

Code: Select all

...
        if len(f) >= 2 and f[0] == "MemTotal:" and int(f[1]) < 512 * 1024:
...

MemTotal is definitely not a good metric. It does not take into account the individual system configuration. If you are going to dynamically read the memory usage, you might as well use a metric that reflects the current running state of the system.

I'd be inclined to use MemFree as a base metric so that running the IceTV plugin does not result in evicting pages from the buffers/caches if it does not need to. I have no idea how much memory is needed to process a typical EPG channel. If we take 20MB as a wild guess, I'd make the batch size batchsize = max(1, ((MemFree/1024 - 50) / 20)). That is, at least 1 service, reserving 50MB headroom and then allowing one service for each 20MB available. Looking at a V2 (which does not exactly have a huge amount of RAM) would give a batch size of about 26 and a T4 would have a batch size of 25. I expect that the numbers on a T2 would be lower, but not down to 1.

I don't think this needs to be a user configurable value. If you do want to make it user configurable, let the user specify a max limit in configuration and use the smaller value of the dynamically calculated batch size and the user provided max batch size: batchsize = min(config.plugins.icetv.max_batchsize.value, batchsize)
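A sketch of that calculation (the function names are mine, and the 20MB-per-channel figure is the wild guess above):

```python
def memfree_kb(path="/proc/meminfo"):
    """Read MemFree (in kB) from /proc/meminfo; 0 if unavailable."""
    try:
        with open(path) as meminfo:
            for line in meminfo:
                fields = line.split()
                if len(fields) >= 2 and fields[0] == "MemFree:":
                    return int(fields[1])
    except IOError:
        pass
    return 0

def dynamic_batchsize(free_kb, headroom_mb=50, per_channel_mb=20,
                      user_max=None):
    """At least one channel; reserve 50MB of headroom, then allow one
    channel per 20MB free, optionally capped by a user-configured max."""
    size = max(1, (free_kb // 1024 - headroom_mb) // per_channel_mb)
    if user_max is not None:
        size = min(user_max, size)
    return size
```

For example, a box with ~660MB free gets a batch size of 30, while one with only ~12MB free (less than the headroom) is crowbarred up to 1.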

"Beauty lies in the hands of the beer holder."
Blog.

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sun Nov 01, 2020 08:20

I'll have a bit more of a think about it. If it's to be done on MemFree, it probably needs to be done on each fetch, not just at startup.

Batchsize is a config variable, but it's not made available for user configuration.

I'm going to have a look at trying to simulate the original IceTV data problem on the current plugin and see if I can produce the same responses on memory-limited models and models with more memory.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sun Nov 01, 2020 11:23

peteru wrote:
Sun Nov 01, 2020 00:42
I have no idea how much memory is needed to process a typical EPG channel. If we take 20MB as a wild guess, I'd make the batch size batchsize = max(1, ((MemFree/1024 - 50) / 20)). That is, at least 1 service, reserving 50MB headroom and then allowing one service for each 20MB available. Looking at a V2 (which does not exactly have a huge amount of RAM) would give a batch size of about 26 and a T4 would have a batch size of 25. I expect that the numbers on a T2 would be lower, but not down to 1.
From awk '/MemFree:/ { print int((int($2/1024) - 50) / 20) }' /proc/meminfo I get:

Code: Select all

Model   MemFree Batchsize
T2     12128 kB        -1 (1)
T3     86168 kB         1
T4    677320 kB        30
U4    371060 kB        15
V2    216616 kB         8
These are mostly smaller than you suggest they might be. Assuming 20MB/channel may be too conservative. My T2 didn't have 50MB of headroom, so it returned -1, which would be crowbarred to 1.

Also, a per day per channel EPG size estimate might be better, since incremental updates will be much smaller than a full update.

The EPG cache file for my 27 IceTV channels is 890kB, which is only about 32kB/channel. A JSON capture of the ACT IceTV EPG (slightly older, with 25 channels) is 5 MB, or about 178kB/channel.

I'll try to capture the actual memory increment for an IceTV load. I've done it before, but I now can't remember how I did it.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sun Nov 01, 2020 12:45

For my 26 channels, processing IceTV for a full fetch the first time after a GUI start, it needs about 34MB peak (measured on the process's VmRSS in /proc/self/status), so ~1.3MB/chan.

Subsequent full fetches (Disable IceTV/Enable IceTV/Fetch EPG etc) taper off in the peak extra memory needed: 25MB, 15MB, 13MB, 12MB, 12MB.

There also seems to be an odd thing in the RSS after doWork() has completed (and its local variables have been made potentially available for garbage collection), with further memory being retained after subsequent full fetches:

Code: Select all

Additional RSS use
after doWork()
completes (kB)
              16060 (normal initial full EPG load)
              12360 (subsequent manual full loads...)
               2072
               1548
                508
               1024
Is it worth putting a gc.collect() at the end of doWork()?
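For anyone wanting to repeat the measurement, the VmRSS figure can be sampled like this (a sketch; the field is plain text in /proc):

```python
def vmrss_kb(status_path="/proc/self/status"):
    """Return the process's resident set size in kB, as reported by the
    VmRSS line of /proc/<pid>/status (0 if the line is missing)."""
    with open(status_path) as status:
        for line in status:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])   # field is already in kB
    return 0
```

Sampling it before doWork(), at intervals during the fetch, and again after an explicit gc.collect() would show how much of the retained increment the collector can actually claw back.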
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Sun Nov 01, 2020 15:54

OK, interesting.

I've simulated the "Python int too large" error by changing the start time of the last entry in the EPG to the square of its original value.
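That failure mode can be reproduced in isolation: on the 32-bit target hardware a C long is 32 bits, which struct's standard-size "=l" format mimics, so packing applies essentially the same range check as the Python-to-C long conversion (a sketch, not the plugin's actual code path):

```python
import struct

def fits_c_long32(value):
    """True if value fits a 32-bit signed C long, as on the T2's CPU.
    struct's standard-size "=l" format is always 4 bytes, so packing
    performs the same range check the C conversion would."""
    try:
        struct.pack("=l", value)
        return True
    except struct.error:
        return False

start = 1603700000                   # a plausible epoch start time
print(fits_c_long32(start))          # True: fits in 32 bits
print(fits_c_long32(start * start))  # False: the squared value overflows
```

Note that the squared value still fits a 64-bit long, which is one reason the bug would bite the 32-bit receivers while being hard to reproduce on a desktop build.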

With that in place, I was also able to replicate IanL-S's problem of the T2 having a number of OOM kills of enigma2, and going completely non-responsive. In my case, the freeze happened when the T2 was fully running, and the video and audio kept running, but there was no response to the remote control.

IanL-S reported at least one case where his T2 locked, but there was also no video or audio. I'd expect that if the T2 was in standby when the lockup happened, but I'm not certain of the exact circumstances of IanL-S's lockup.

In the locked state, it's definitely due to memory exhaustion. I happened to have an ssh connection running, and when I tried to list the contents of the log directory, I got:

Code: Select all

root@beyonwizt2:/media/hdd/logs# ls
-bash: fork: Cannot allocate memory
root@beyonwizt2:/media/hdd/logs#
I was then able to run init 4 and look at /var/log/messages, and copy it off for examination. I'm not sure why that worked: I wouldn't expect init to be a shell builtin.

At the time of the last oom event, it printed this memory use info:

Code: Select all

Nov  1 16:18:25 beyonwizt2 user.warn kernel: Mem-Info:
Nov  1 16:18:25 beyonwizt2 user.warn kernel: Normal per-cpu:
Nov  1 16:18:25 beyonwizt2 user.warn kernel: CPU    0: hi:   90, btch:  15 usd:  47
Nov  1 16:18:25 beyonwizt2 user.warn kernel: CPU    1: hi:   90, btch:  15 usd:  78
Nov  1 16:18:25 beyonwizt2 user.warn kernel: active_anon:41502 inactive_anon:30 isolated_anon:0
Nov  1 16:18:25 beyonwizt2 user.warn kernel:  active_file:10 inactive_file:34 isolated_file:0
Nov  1 16:18:25 beyonwizt2 user.warn kernel:  unevictable:0 dirty:0 writeback:0 unstable:0
Nov  1 16:18:25 beyonwizt2 user.warn kernel:  free:2126 slab_reclaimable:452 slab_unreclaimable:2348
Nov  1 16:18:25 beyonwizt2 user.warn kernel:  mapped:4 shmem:246 pagetables:248 bounce:0
Nov  1 16:18:25 beyonwizt2 user.warn kernel:  free_cma:0
Nov  1 16:18:25 beyonwizt2 user.warn kernel: Normal free:8504kB min:8192kB low:10240kB high:12288kB active_anon:166008kB inactive_anon:120kB active_file:40kB inactive_file:136kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:231644kB mlocke
Nov  1 16:18:25 beyonwizt2 user.warn kernel: lowmem_reserve[]: 0 0
Nov  1 16:18:25 beyonwizt2 user.warn kernel: Normal: 620*4kB (UEM) 7*8kB (EMR) 3*16kB (MR) 3*32kB (R) 1*64kB (R) 1*128kB (R) 0*256kB 1*512kB (R) 1*1024kB (R) 0*2048kB 1*4096kB (R) = 8504kB
Nov  1 16:18:25 beyonwizt2 user.warn kernel: 299 total pagecache pages
Nov  1 16:18:25 beyonwizt2 user.warn kernel: 0 pages in swap cache
Nov  1 16:18:25 beyonwizt2 user.warn kernel: Swap cache stats: add 0, delete 0, find 0/0
Nov  1 16:18:25 beyonwizt2 user.warn kernel: Free swap  = 0kB
Nov  1 16:18:25 beyonwizt2 user.warn kernel: Total swap = 0kB
Nov  1 16:18:25 beyonwizt2 user.warn kernel: 131072 pages RAM
Nov  1 16:18:25 beyonwizt2 user.warn kernel: 0 pages HighMem/MovableOnly
Nov  1 16:18:25 beyonwizt2 user.warn kernel: 73161 pages reserved
Nov  1 16:18:25 beyonwizt2 user.info kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  358]   998   358      793       42       4        0             0 dbus-daemon
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  368]     0   368     1357       72       4        0         -1000 sshd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  405]     0   405      801       20       3        0             0 udhcpc
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  415]     0   415      558       20       4        0             0 odhcp6c
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  428]   995   428      653       40       3        0             0 rpcbind
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  447]     0   447    13367      104       6        0             0 automount
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  457]   994   457      810      150       3        0             0 rpc.statd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  466]     0   466    28448      483      19        0             0 djmount
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  479]     0   479      801       20       4        0             0 inetd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  489]     0   489      511       22       4        0             0 llmnrd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  492]     0   492    14081      355      14        0             0 minidlnad
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  504]     0   504     9830      414      12        0             0 smbd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  508]     0   508     9212      395      11        0             0 smbd-notifyd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  509]     0   509     9210      395      11        0             0 cleanupd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  515]     0   515     6513      326       9        0             0 nmbd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  523]     0   523      540       19       4        0             0 wsdd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  527]     0   527      801       22       4        0             0 syslogd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  530]     0   530      801       26       3        0             0 klogd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  534]     0   534      801       20       4        0             0 telnetd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  541]   999   541     1006      112       4        0             0 avahi-daemon
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  542]   999   542      949       71       4        0             0 avahi-daemon
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  548]     0   548      670       55       5        0             0 vsftpd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  552]     0   552      818      131       4        0             0 crond
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  725]     0   725     1357       80       4        0             0 sshd
Nov  1 16:18:25 beyonwizt2 user.info kernel: [  727]     0   727      897      306       5        0             0 bash
Nov  1 16:18:25 beyonwizt2 user.info kernel: [ 6886]     0  6886      840      284       6        0             0 enigma2.sh
Nov  1 16:18:25 beyonwizt2 user.info kernel: [ 6899]     0  6899    65863    37143      60        0             0 enigma2
Nov  1 16:18:25 beyonwizt2 user.info kernel: [ 6929]     0  6929    27977      900      25        0             0 hbbtv.app
Nov  1 16:18:25 beyonwizt2 user.err kernel: Out of memory: Kill process 6899 (enigma2) score 623 or sacrifice child
Nov  1 16:18:25 beyonwizt2 user.err kernel: Killed process 6899 (enigma2) total-vm:263452kB, anon-rss:148568kB, file-rss:4kB
Enigma2 has the largest total VM & RSS. None of the other processes with large total VM have large RSS.

I think that there's a possibility of a major difference between this scenario and IanL-S's. Because of the "Python int too large" exception, the "last updated" time doesn't get updated in the plugin, so every IceTV fetch is a full fetch. If IanL-S isn't getting that error (and I'm not seeing it), then his IceTV fetches should be normal incremental fetches after the first one.

Next thing to try is to see if I can get the OOMs and the lockup without forcing the "int too large" exception.

For some reason, the AVAHI daemon is being rather noisy, too, for example:

Code:

Nov  1 15:51:58 beyonwizt2 daemon.warn avahi-daemon[541]: Record [FTP\032file\032server\032on\032beyonwizt2._ftp._tcp.local     IN      TXT  ; ttl=4500] not fitting in legacy unicast packet, dropping.
Nov  1 15:51:58 beyonwizt2 daemon.warn avahi-daemon[541]: Record [FTP\032file\032server\032on\032beyonwizt2._ftp._tcp.local     IN      SRV 0 0 21 beyonwizt2.local ; ttl=120] not fitting in legacy unicast packet, dropping.
Nov  1 15:51:58 beyonwizt2 daemon.warn avahi-daemon[541]: Record [_ssh._tcp.local       IN      PTR beyonwizt2._ssh._tcp.local ; ttl=4500] not fitting in legacy unicast packet, dropping.
Nov  1 15:51:58 beyonwizt2 daemon.warn avahi-daemon[541]: Record [beyonwizt2._ssh._tcp.local    IN      TXT  ; ttl=4500] not fitting in legacy unicast packet, dropping.
Nov  1 15:51:58 beyonwizt2 daemon.warn avahi-daemon[541]: Record [beyonwizt2._ssh._tcp.local    IN      SRV 0 0 22 beyonwizt2.local ; ttl=120] not fitting in legacy unicast packet, dropping.
I'm not yet sure whether that's typical.
Peter
T4 HDMI
U4, T4, T3, T2, V2 test/development machines
Sony BDV-9200W HT system
LG OLED55C9PTA 55" OLED TV

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

Re: IceTV data causing denial of service

Post by peteru » Sun Nov 01, 2020 21:50

prl wrote:
Sun Nov 01, 2020 08:20
If it's to be done on MemFree, it probably needs to be done on each fetch, not just at startup.

Definitely. The whole point is to avoid an out of memory condition when the IceTV plugin executes an update, so the MemFree figures have to reflect the current state of the system at that point in time.

prl wrote:
Sun Nov 01, 2020 11:23
Which are mostly smaller than you suggest they might be.

My bad. When I looked at MemFree on the two test systems, they were both in standby. When running, there is a lot less memory available. Note that I pulled the numbers used in calculation out of thin air. The real figures need to be derived empirically.

prl wrote:
Sun Nov 01, 2020 11:23
per day per channel EPG size estimate might be better, since incremental updates will be much smaller than a full update.

I'd probably recommend being conservative and basing the estimated memory requirement on a full fetch. i.e. the amount of EPG data you'd expect on a system that has not received any updates for 10+ days and has an average of 5 IceTV recordings per-day.

prl wrote:
Sun Nov 01, 2020 11:23
The EPG cache file for my 27 IceTV channels is 890kB, which is only about 32kB/channel. A JSON capture of the ACT IceTV EPG (slightly older, with 25 channels) is 5 MB, or about 178kB/channel.

I expect that the working set will have a much, much larger memory footprint, maybe even as much as a whole order of magnitude.

prl wrote:
Sun Nov 01, 2020 12:45
There also seems to be an odd thing in the RSS after doWork() has completed (and its local variables been made potentially available for garbage collection), with further memory being retained after subsequent full fetches

Those figures alone do not provide much insight. If these are incremental deltas then I'd be concerned that we are losing several MB with every fetch. If those values are deltas from a baseline, then they do not really provide much insight.

However, it is good to know that a run for Canberra needs about 25 MB for 26 services. I think as a rule of thumb we should probably budget 1MB per service plus some baseline slack - say 5MB.
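As a sketch, that rule of thumb is just (function name hypothetical):

```python
def estimate_fetch_budget_mb(num_services, baseline_mb=5, per_service_mb=1):
    # Worst-case memory budget for a full IceTV fetch: a fixed
    # baseline of slack plus ~1 MB for each subscribed service.
    return baseline_mb + per_service_mb * num_services
```

For the 26 Canberra services that gives a 31 MB budget, comfortably above the ~25 MB observed.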

prl wrote:
Sun Nov 01, 2020 12:45
Is it worth putting a gc.collect() at the end of doWork()?

Maybe for debugging. I don't think normal code should need to interfere with the normal course of garbage collection. I'm not too familiar with the finer details of Python memory management and how it interacts with the enigma2 C++ code that hosts the interpreter, so I'll refrain from dispensing too much advice. :wink:

I know some other enigma2 distros run with swap enabled, however I don't want to do that. Once you start swapping you enter a world of unpredictable performance issues. I'd much rather see the system run out of memory and address the issue.

"Beauty lies in the hands of the beer holder."
Blog.


Re: IceTV data causing denial of service

Post by peteru » Sun Nov 01, 2020 21:54

Above post, out of sequence. That's what happens when you get 95% of the way through posting and then real life interferes....
:shock:


Re: IceTV data causing denial of service

Post by peteru » Sun Nov 01, 2020 22:13

prl wrote:
Sun Nov 01, 2020 15:54
I've simulated the "Python int too large" error by changing the start time of the last entry in the EPG to the square of its original value.

Interesting, indeed.

The line of code that populates the value does quite a lot:
start = int(timegm(strptime(iceTimer["start_time"].split("+")[0], "%Y-%m-%dT%H:%M:%S")))

I wonder if the IceTV server sent out malformed timestamps that ended up being incorrectly parsed.

I suppose the right thing to do here is to sanity check every value sent by the IceTV server and reject nonsensical values. For example, only accept start time values that are in the range of now - 7 days to now + 14 days, durations that are no more than 48 hours, etc. As much as I don't generally like to put limits on acceptable API values, it seems to be a sensible thing to do here.

Any out of range values should be logged and perhaps reported back to the IceTV server as error conditions. That way issues can be debugged from both ends.
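A sketch of such a per-timer check, using the bounds suggested above. Only `start_time` appears in this thread's code; the `duration_minutes` field name is an assumption for illustration:

```python
from calendar import timegm
from time import gmtime, strptime

def sane_timer(ice_timer, now=None):
    # Reject timers whose start time or duration is outside plausible
    # bounds, before the values reach the EPG cache: start within
    # now - 7 days .. now + 14 days, duration no more than 48 hours.
    now = timegm(gmtime()) if now is None else now
    try:
        start = timegm(strptime(ice_timer["start_time"].split("+")[0],
                                "%Y-%m-%dT%H:%M:%S"))
        duration = int(ice_timer["duration_minutes"]) * 60
    except (KeyError, ValueError, OverflowError):
        # Malformed or missing fields are rejected (and should be logged).
        return False
    day = 24 * 60 * 60
    return (now - 7 * day <= start <= now + 14 * day
            and 0 < duration <= 48 * 60 * 60)
```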


Re: IceTV data causing denial of service

Post by peteru » Sun Nov 01, 2020 22:24

Oh man, I think I'm too tired to look at this now. I was looking at the timer start values, but now I realise that you were talking about the EPG event start values. So my code snippet above is from the wrong function, but mostly the same.

However, pretty much everything I said about checking for sane values on timers should be applicable to EPG entries too.

Unfortunately, I don't think there is any way of letting the IceTV server know that it's sending bogus data.


prl
Wizard God
Posts: 32711
Joined: Tue Sep 04, 2007 13:49
Location: Canberra; Black Mountain Tower transmitters

Re: IceTV data causing denial of service

Post by prl » Mon Nov 02, 2020 14:01

peteru wrote:
Sat Oct 31, 2020 15:10
My guess would be that duration is the most likely suspect, followed by start and then the DVB triplet.

Daniel Hall got back to me with a JSON sample of a user's full IceTV download from before and then after the IceTV restart.

The error in those samples was in the genres, which I didn't expect, either. Both errors were in the two "QLD Votes 2020" EPG entries (ABC News, Sat Oct 31, 18:30, then 21:00), and were in both:

Code:

    "category": [
        {
            "eit": "0xbb009f27", 
            "name": ""
        }
    ], 
Those errors weren't in the "after the server restart" data.

I haven't coded the tests of the timer data yet, but that genre data isn't present in the timers anyway.

Re: IceTV data causing denial of service

Post by prl » Mon Nov 02, 2020 14:18

peteru wrote:
Sun Nov 01, 2020 22:13
prl wrote:
Sun Nov 01, 2020 15:54
I've simulated the "Python int too large" error by changing the start time of the last entry in the EPG to the square of its original value.

Interesting, indeed.

I only changed that field because it was easy to create a known bad value.
peteru wrote:
Sun Nov 01, 2020 22:13
The line of code that populates the value does quite a lot:
start = int(timegm(strptime(iceTimer["start_time"].split("+")[0], "%Y-%m-%dT%H:%M:%S")))

The new code that I have waiting takes advantage of a new field (only in the EPG, not in the timers) that Daniel introduced to help speed up the DE IceTV load:
start = int(show["start_unix"])
peteru wrote:
Sun Nov 01, 2020 22:13
I suppose the right thing to do here is to sanity check every value sent by the IceTV server and and reject non-sensical values. For example, only accept start time values that are in the range of now - 7 days and now + 14 days, durations that are no more than 48 hours, etc. As much as I don't generally like to put limits on acceptable API values, it seems to be a sensible thing to do here.

Any out of range values should be logged and perhaps reported back to the IceTV server as error conditions. That way issues can be debugged from both ends.

Yes, though there's currently really only the facility to send back error reports for timers. I'll ask Daniel about it, but currently the server isn't expecting a response for the EPG data.

Re: IceTV data causing denial of service

Post by prl » Mon Nov 02, 2020 15:00

prl wrote:
Mon Nov 02, 2020 14:01
The error in those samples was in the genres, which I didn't expect, either. Both errors were in the two "QLD Votes 2020" EPG entries (ABC News, Sat Oct 31, 18:30, then 21:00), and were in both::

Code:

    "category": [
        {
            "eit": "0xbb009f27", 
            "name": ""
        }
    ], 

Curiouser and curiouser: that precise "eit" value causes an error to be logged to the debug file. The "int too large" exception only happens if
category[0]["name"] == "" and int(category[0]["eit"]) & 0x80000000 and int(category[0]["eit"]) & 0xff == 0

Re: IceTV data causing denial of service

Post by peteru » Tue Nov 03, 2020 00:05

Is it not as simple as an unsigned to signed overflow? As in, Python treats the hex value as an unsigned integer and the conversion to C++ can not fit the value into a signed data type.


Re: IceTV data causing denial of service

Post by peteru » Tue Nov 03, 2020 00:23

I see what's happening - Python is treating the value as a long, rather than an int.

Code:

>>> i = 0x1234
>>> type(i)
<type 'int'>
>>> i = 0xbb009f27
>>> type(i)
<type 'long'>
I wonder if this should be addressed in the C++ code by perhaps changing the use of PyInt_AsLong to PyInt_AsUnsignedLongMask, or if the IceTV data should be sanitised in the plugin.

The value is clearly illegal, as eit events are a 16-bit quantity.


Re: IceTV data causing denial of service

Post by peteru » Tue Nov 03, 2020 03:31

I pushed changes that should improve the behaviour with some out of range values coming from the IceTV server.

I don't think that these changes will fix the issues that cause the T2 to stop responding, however they should make the Python exceptions go away.

Unless the problem was caused by the IceTV EPG sending all entries with an invalid EIT that resulted in the event_id always being None. If that was the case, then commit 4f7e60d9c00 may fix/hide the issue. I say hide, because if there is an issue (resource leak) related to an event being replaced, the commit reduces the frequency of this happening by a factor of ~64k. Or rather, the bug from 576dc34ccc, together with invalid data from IceTV, would have increased the frequency of encountering a resource leak 65500 times.


Re: IceTV data causing denial of service

Post by prl » Tue Nov 03, 2020 09:11

peteru wrote:
Tue Nov 03, 2020 00:23
The value is clearly illegal, as eit events are a 16-bit quantity.

It's not an EIT event id, it's a content identifier (i.e. genre id). That's 8 bits unsigned, but yes, that just makes it further out of range.

When I write the checking code I'll be checking for valid ranges of the values, not just whether they fit into a C++ int or long. For example, conversion functions for the MJD dates used for timestamps in the EPG are limited to 1900-03-01 .. 2100-02-28 inclusive.
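That limit comes from the standard DVB MJD conversion formula itself (ETSI EN 300 468, Annex C); a sketch of the date part, without any range checking:

```python
def mjd_to_ymd(mjd):
    # Modified Julian Date -> (year, month, day), using the DVB
    # conversion formula, which is only valid for dates in
    # 1900-03-01 .. 2100-02-28 inclusive.
    yp = int((mjd - 15078.2) / 365.25)
    mp = int((mjd - 14956.1 - int(yp * 365.25)) / 30.6001)
    day = mjd - 14956 - int(yp * 365.25) - int(mp * 30.6001)
    k = 1 if mp in (14, 15) else 0  # January/February fall in the next year
    return 1900 + yp + k, mp - 1 - k * 12, day
```

For example, MJD 15079 is exactly 1900-03-01, the bottom of the valid range.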

Re: IceTV data causing denial of service

Post by prl » Tue Nov 03, 2020 09:20

peteru wrote:
Tue Nov 03, 2020 03:31
I don't think that these changes will fix the issues that cause the T2 to stop responding, however they should make the Python exceptions go away.

My suspicion is that an exception thrown from the C++ code in importEvents() is causing problems with garbage collection and hence creating a memory leak. That memory leak could be as much as the whole Python temporary storage used for an EPG load. Because the exception stops the "last update" value being updated, the box continues doing full EPG fetches, throwing the exception and continuing to leak memory.

If that's the case, making the exceptions go away should make OOM GUI restarts and hangs on the T2 at least less frequent.

There's also a strange message that appears in the debug log just before the exception stack traceback when importEvents() throws an exception: RuntimeWarning: tp_compare didn't return -1 or -2 for exception.

I don't know what it means (and Googling hasn't helped much), but I don't like the look of it.

Re: IceTV data causing denial of service

Post by prl » Tue Nov 03, 2020 10:16

I disagree with the change to start in this change to epgcache.cpp:

Code:

-		long start = PyLong_AsLong(PyTuple_GET_ITEM(singleEvent, 0));
-		long duration = PyInt_AsLong(PyTuple_GET_ITEM(singleEvent, 1));
+		long start = PyInt_AsUnsignedLongMask(PyTuple_GET_ITEM(singleEvent, 0));
+		long duration = PyInt_AsUnsignedLongMask(PyTuple_GET_ITEM(singleEvent, 1));
The MJD dates in the EPG start times are permitted in the range 1900-03-01 .. 2100-02-28 UTC inclusive. That means that negative values for start are permissible (though they probably shouldn't appear), and they should not be converted as unsigned (then cast back to signed).

Also, the end of the valid time range is past the "end-of-time" for 32-bit signed Unix dates (03:14:07 19 Jan 2038 UTC), so IMO the conversion in the Python code should be:
start = long(timegm(strptime(show["start"].split("+")[0], "%Y-%m-%dT%H:%M:%S")))
rather than
start = int(timegm(strptime(show["start"].split("+")[0], "%Y-%m-%dT%H:%M:%S")))
That's mainly a clarity issue, since int(long_value) returns a long, not an int.

The conversion in the C++ should remain as:
long start = PyLong_AsLong(PyTuple_GET_ITEM(singleEvent, 0));

The times themselves should be range-checked against their valid date range.

The time validity check should probably be something like (though with the time ranges precalculated, of course):
timegm((1900, 3, 1, 0, 0, 0, 0, 0, 0)) <= start < timegm((2100, 3, 1, 0, 0, 0, 0, 0, 0))
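With the bounds precalculated once at load time, that check is cheap; a sketch (the constant and function names are hypothetical):

```python
from calendar import timegm

# Valid DVB EPG time range as Unix timestamps, computed once.
# timegm() is pure arithmetic, so dates before 1970 simply go negative.
EPG_TIME_MIN = timegm((1900, 3, 1, 0, 0, 0, 0, 0, 0))
EPG_TIME_MAX = timegm((2100, 3, 1, 0, 0, 0, 0, 0, 0))

def valid_epg_time(start):
    # Half-open range, matching the check suggested above.
    return EPG_TIME_MIN <= start < EPG_TIME_MAX
```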

The other changes in epgcache.cpp are fine, but I think that checks should also be put into the Python code so that they can be logged (and when/if possible, reported to IceTV).

Re: IceTV data causing denial of service

Post by prl » Tue Nov 03, 2020 11:37

Here's a revised calculation for the IceTV batch size.

Code:

def _getBatchsize():
    batchsize = config.plugins.icetv.batchsize.value
    if batchsize:
        try:
            for l in file("/proc/meminfo"):
                f = l.split()
                # Reduce batch size on machines with < 512MiB total available memory
                if len(f) >= 2 and f[0] == "MemFree:":
                    memMb = int(f[1]) / 1024
                    # Allow 50MB headroom & 1.5 MB/channel &
                    # clamp in the range 1 .. config.plugins.icetv.batchsize
                    batchsize = min(batchsize, max(1, (memMb - 50) * 2 / 3))
                    break
        except IOError:
            pass
    print "[EPGFetcher] batch size:", batchsize # ZZ
    return batchsize
On a T2, the first fetch can start before a lot of memory has been allocated in enigma2, and I've seen batch sizes on that in the range of 1-8 channels (on a small sample size).

Subsequent full fetches (Disable IceTV/Enable IceTV/Fetch EPG...) use a batch size of 1 on the T2. Subsequent (incremental) fetches also use a batch size of 1.

Batching with a size 1 on a full fetch doesn't slow the T2 fetch down by much: from 10-20 sec for a single fetch, and 15-20 sec for single-channel batches. There's a proportionately bigger effect for incremental fetches: an incremental fetch that's expected to have more-or-less no new information goes from 2-3 sec to 3-5 sec.

I am a bit concerned about hitting the IceTV server so often when doing a fetch, especially an incremental fetch that's only likely to return a small amount of data. Perhaps the minimum batch size might be set a bit higher, say 4-5?

Or perhaps MemAvailable might be a better measure to use than MemFree?
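Both figures come from the same file, so the fetcher could read both and compare; a sketch of a small parser (helper name hypothetical):

```python
def read_meminfo(path="/proc/meminfo"):
    # Parse /proc/meminfo into {field: kB}. "MemAvailable" (kernels
    # >= 3.14) is the kernel's own estimate of memory usable without
    # swapping; "MemFree" ignores reclaimable page cache.
    info = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 2 and fields[0].endswith(":"):
                info[fields[0][:-1]] = int(fields[1])
    return info
```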

Re: IceTV data causing denial of service

Post by peteru » Tue Nov 03, 2020 23:07

prl wrote:
Tue Nov 03, 2020 10:16
I disagree with the change to start in this change to epgcache.cpp:

Code:

-		long start = PyLong_AsLong(PyTuple_GET_ITEM(singleEvent, 0));
-		long duration = PyInt_AsLong(PyTuple_GET_ITEM(singleEvent, 1));
+		long start = PyInt_AsUnsignedLongMask(PyTuple_GET_ITEM(singleEvent, 0));
+		long duration = PyInt_AsUnsignedLongMask(PyTuple_GET_ITEM(singleEvent, 1));

Those changes may not make much sense until you read the Python source code. In particular intobject.c and longobject.c.

There's a lot going on under the covers, but it boils down to avoiding the overflow and underflow detection that results in the error message and ((long)-1) being returned.

If correctly range checked values are used, the change will have no effect. If the code has to process out of range values, it will handle sign extension correctly, truncate the value and avoid the error condition associated with the overflow/underflow. More specifically, we avoid complete data loss (getting -1 for all out of range values) and end up with value truncation.

If you can spot another solution, I'd be happy to see an alternative implementation. It will have to be done in the C++ code as that is the boundary between the Python and C++ APIs.
Also, the end of the valid time range is past the "end-of-time" for 32-bit signed Unix dates (03:14:07 19 Jan 2038 UTC), so IMO the conversion in the Python code should be:
start = long(timegm(strptime(show["start"].split("+")[0], "%Y-%m-%dT%H:%M:%S")))
rather than
start = int(timegm(strptime(show["start"].split("+")[0], "%Y-%m-%dT%H:%M:%S")))
That's mainly a clarity issue, since int(long_value) returns a long, not an int.

The conversion in the C++ should remain as:
long start = PyLong_AsLong(PyTuple_GET_ITEM(singleEvent, 0));

The lower level APIs use time_t, which is currently implemented as long int. Yes, we have a Y2K38 problem, but the above code will not help with that. Both int and long are 32-bit quantities on all Beyonwiz architectures. On the Python side, integral types will automatically extend from 32-bit ints to arbitrarily large longs as required. In fact, the Python code does not really need the int() or long() explicit conversions, since timegm already returns a Unix timestamp value.

I see the nature of the argument you are making - avoid data loss by preserving the full range. Unfortunately there are many layers of APIs that get in the way.


Re: IceTV data causing denial of service

Post by peteru » Tue Nov 03, 2020 23:54

prl wrote:
Tue Nov 03, 2020 11:37
Here's a revised calculation for the IceTV batch size.

Code:

def _getBatchsize():
    batchsize = config.plugins.icetv.batchsize.value
    if batchsize:
        try:
            for l in file("/proc/meminfo"):
                f = l.split()
                # Reduce batch size on machines with < 512MiB total available memory
                if len(f) >= 2 and f[0] == "MemFree:":
                    memMb = int(f[1]) / 1024
                    # Allow 50MB headroom & 1.5 MB/channel &
                    # clamp in the range 1 .. config.plugins.icetv.batchsize
                    batchsize = min(batchsize, max(1, (memMb - 50) * 2 / 3))
                    break
        except IOError:
            pass
    print "[EPGFetcher] batch size:", batchsize # ZZ
    return batchsize

I would rename config.plugins.icetv.batchsize to config.plugins.icetv.max_batchsize to be explicit about this being a limit and also to allow possibility for a matching min limit.

I would also not make this an optional behaviour. Batching should always be enabled to prevent OOM conditions. You never know what other things the user is doing. They may be using Opera, YouTube plugin or Kodi. Maybe recording from HDMI. All those things can change the playing field dramatically.

In terms of the memory calculation, maybe 10MB headroom is sufficient. I'm not sure. What you would ideally want to do is make sure that if a user decides to play a media file from the network, or a couple of recordings start during an IceTV fetch, things don't start failing.

On a T2, the first fetch can start before a lot of memory has been allocated in enigma2, and I've seen batch sizes on that in the range of 1-8 channels (on a small sample size).

Subsequent full fetches (Disable IceTV/Enable IceTV/Fetch EPG...) use a batch size of 1 on the T2. Subsequent (incremental) fetches also use a batch size of 1.

That sounds about right. The memory estimates are based on worst-case scenarios, to prevent problems. A different approach may be to base estimates on the typical case: download the EPG data and, if the downloaded data set exceeds the estimated typical case, discard it and make several smaller requests.

Batching with a size 1 on a full fetch doesn't slow the T2 fetch down by much: from 10-20 sec for a single fetch, and 15-20 sec for single-channel batches. There's a proportionately bigger effect for incremental fetches: an incremental fetch that's expected to have more-or-less no new information goes from 2-3 sec to 3-5 sec.

See above suggestion for one approach to reducing the number of requests. You could add another estimation variable to decide how much to fetch. Maybe use the duration since last fetch as a scaling factor for the expected size of data and therefore the required memory. Say 100% for a 7 day update, 20% for an hourly update. You probably have a good feel for how often IceTV update their data and how big those updates tend to be. IceTV may even give you some metrics if you tell them you are trying to put in measures to balance the number of requests against their servers vs the reliability of the client firmware. (And thus the user experience.)
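Something along these lines, using the example percentages above (a sketch with illustrative numbers, not tuned values):

```python
def fetch_size_factor(seconds_since_last_fetch):
    # Scale the expected data size by the age of the last fetch:
    # ~20% for an hourly update, rising linearly to 100% at a week.
    hour, week = 3600.0, 7 * 24 * 3600.0
    if seconds_since_last_fetch <= hour:
        return 0.2
    if seconds_since_last_fetch >= week:
        return 1.0
    # Linear interpolation between the two endpoints.
    return 0.2 + 0.8 * (seconds_since_last_fetch - hour) / (week - hour)
```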

I am a bit concerned about hitting the IceTV server so often when doing a fetch, especially an incremental fetch that's only likely to return a small amount of data. Perhaps the minimum batch size might be set a bit higher, say 4-5?

I think keeping the memory footprint on the individual devices within limits is a lot more important. There is nothing that a user can do about the amount of memory that their device has available. On the other hand, IceTV are in full control of the infrastructure they provide on a commercial basis. They should be able to scale their service to deal with the load that the paying customers generate. Even though there are more fetches, each request will put less load on the server as it only needs a fraction of the data. See above suggestions for some of the measures that can be taken to find a balance.

Or perhaps MemAvailable might be a better measure to use than MemFree?

Maybe. I am a bit wary of using that value as it will likely cause the buffers/caches to be evicted, thus affecting I/O performance. I guess testing and empirical evidence is the key to making an informed decision. If you are going to switch, I would suggest having generous headroom. Once you reach that limit, you really are out of memory.


Re: IceTV data causing denial of service

Post by prl » Wed Nov 04, 2020 09:10

peteru wrote:
Tue Nov 03, 2020 23:54
You could add another estimation variable to decide how much to fetch. Maybe use the duration since last fetch as a scaling factor for the expected size of data and therefore the required memory. Say 100% for a 7 day update, 20% for an hourly update.

I thought that you weren't in favour of that approach:
peteru wrote:
Sun Nov 01, 2020 21:50
prl wrote:
Sun Nov 01, 2020 11:23
per day per channel EPG size estimate might be better, since incremental updates will be much smaller than a full update.

I'd probably recommend being conservative and basing the estimated memory requirement on a full fetch. i.e. the amount of EPG data you'd expect on a system that has not received any updates for 10+ days and has an average of 5 IceTV recordings per-day.

Re: IceTV data causing denial of service

Post by prl » Wed Nov 04, 2020 09:26

peteru wrote:
Tue Nov 03, 2020 23:07
Both int and long are 32-bit quantities on all Beyonwiz architectures.

I had assumed (or perhaps, hoped) that long would be 64 bits. If long is 32 bits then the current interface isn't capable of carrying the full range of DVB allowed times. I'll adjust my Python range checking accordingly.

Do you know whether there is any way to ask Python what the size of time_t is? Or should I just assume that it's always going to be -sys.maxint-1 .. sys.maxint?
peteru wrote:
Tue Nov 03, 2020 23:07
I see the nature of the argument you are making - avoid data loss by preserving the full range. Unfortunately there are many layers of APIs that get in the way.
I've had a look at the code you linked to, and yes, the way you've done it is probably the best of a bad lot of choices.

Re: IceTV data causing denial of service

Post by peteru » Wed Nov 04, 2020 09:56

prl wrote:
Wed Nov 04, 2020 09:10
peteru wrote:
Tue Nov 03, 2020 23:54
You could add another estimation variable to decide how much to fetch. Maybe use the duration since last fetch as a scaling factor for the expected size of data and therefore the required memory. Say 100% for a 7 day update, 20% for an hourly update.

I thought that you weren't in favour of that approach:
peteru wrote:
Sun Nov 01, 2020 21:50
prl wrote:
Sun Nov 01, 2020 11:23
per day per channel EPG size estimate might be better, since incremental updates will be much smaller than a full update.

I'd probably recommend being conservative and basing the estimated memory requirement on a full fetch, i.e. the amount of EPG data you'd expect on a system that has not received any updates for 10+ days and has an average of 5 IceTV recordings per day.

Yes, initially I wanted to keep it as simple as possible and ensure that the solution would be robust and conservative. However, you have provided some real world results that convince me that more complexity is warranted. The trick here is to find a balance between performance and reliability, while keeping implementation complexity low. At the end of the day we want to prevent an out of memory situation (and excessive CPU use), preferably without paying a large performance penalty.
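As a sketch of that scaling idea (the function name, the full-fetch size and the floor fraction are illustrative placeholders, not measurements from the plugin):

```python
# Assumed size of a full 7-day EPG fetch, in kB (illustrative only).
FULL_FETCH_KB = 2048
FULL_WINDOW_HOURS = 7 * 24
# Floor: even an hourly incremental update has some fixed cost.
MIN_FRACTION = 0.2

def estimated_fetch_kb(hours_since_last_fetch):
    # Scale the expected fetch size by how stale the local data is,
    # saturating at a full fetch once we're a whole EPG window behind.
    fraction = min(1.0, hours_since_last_fetch / float(FULL_WINDOW_HOURS))
    return FULL_FETCH_KB * max(MIN_FRACTION, fraction)
```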

prl wrote:
Wed Nov 04, 2020 09:26
Do you know whether there is any way to ask Python what the size of time_t is?

It's int. You can ask Python for the type of a value, as in the following example. I believe there is also a way of asking for the size of a type, but in this instance we already know it's 32 bits.

Code: Select all

>>> type(timegm(gmtime()))
<type 'int'>

In C++ it is ultimately defined as long int, so again a signed 32-bit quantity.
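If you do want to ask the interpreter directly, a sketch (ctypes only gained c_time_t in Python 3.12; on older interpreters this falls back to the assumption that time_t has the same width as long, which holds on the Beyonwiz builds):

```python
import ctypes

# c_time_t exists from Python 3.12 onwards; older interpreters need
# the assumption that time_t matches the platform long.
time_t = getattr(ctypes, "c_time_t", ctypes.c_long)
print("time_t is %d bits" % (ctypes.sizeof(time_t) * 8))
```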

"Beauty lies in the hands of the beer holder."
Blog.

Grumpy_Geoff
Uber Wizard
Posts: 6490
Joined: Thu Mar 05, 2009 22:54
Location: Perth

Re: IceTV data causing denial of service

Post by Grumpy_Geoff » Sat Nov 07, 2020 00:07

prl wrote:
Tue Nov 03, 2020 11:37
Or perhaps MemAvailable might be a better measure to use than MemFree?

I don't know if you are still pursuing memory usage with the IceTV plugin, but...
I had a script running on my T2 for a few hours monitoring memory usage.
It reported the lowest MemFree of 11,264 KB (MemAvailable 69,584 KB), which was about 5 minutes after the GUI had started. At the initial IceTV fetch time it was around 18,328 KB.
Later it settled to range between 49,000 and 60,000 KB (MemAvailable 71,000 - 75,000 KB).

The T4 had its lowest MemFree at 34,296 KB (MemAvailable of 927,504 KB).
MemAvailable ranged from 922,000 to 996,000 KB.
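A minimal version of that kind of monitoring script might look like this (a sketch, assuming the standard Linux /proc/meminfo format; values are reported in kB):

```python
import os

def read_meminfo(path="/proc/meminfo"):
    # Parse /proc/meminfo into a dict of field name -> value in kB.
    values = {}
    with open(path) as f:
        for line in f:
            key, rest = line.split(":", 1)
            fields = rest.split()
            if fields and fields[0].isdigit():
                values[key] = int(fields[0])
    return values

if os.path.exists("/proc/meminfo"):
    info = read_meminfo()
    print("MemFree %d kB, MemAvailable %d kB"
          % (info["MemFree"], info.get("MemAvailable", -1)))
```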

User avatar
peteru
Uber Wizard
Posts: 9741
Joined: Tue Jun 12, 2007 23:06
Location: Sydney, Australia
Contact:

Re: IceTV data causing denial of service

Post by peteru » Sat Nov 07, 2020 02:45

I merged the first pull request and followed it up with a commit that puts in some TODO: markers near the code dealing with quantities that ought to be range checked. Well, at least near the things that seemed the most obvious.

"Beauty lies in the hands of the beer holder."
Blog.
