Light-O-Rama Forums

Massive errors on some E1.31 universes


k6ccc


This was brought up a year or more ago, but I could not find the thread, so starting anew.  I'm sure this will end up being a @MattBrown issue, but other comments / opinions / suggestions are welcome.

My show runs mostly on E1.31 with an assortment of E1.31 controllers (various Falcons), plus a 3W x 2H P5 matrix and a 4W x 3H P10 matrix. Each matrix uses a PocketBeagle running Falcon Player (FPP) and a PocketScroller to take in E1.31 data and drive the panels. The P10 matrix uses Universes 101 - 136 and the P5 matrix uses Universes 201 - 272. The P10 matrix works perfectly, but the FPP for the P5 reports LOTS of errors for the higher universes (starting at about 240 or so), the bottom third of the matrix often tears, and changes can take several seconds to catch up. It has been this way since day 1 with S6, but worked fine in S5. It does not matter which of the three computers is being used to play the sequence, and it behaves the same whether playing from Sequencer or as part of a show. And no, only one computer at any given time is sending data to the matrix panels.
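For readers checking the universe counts, here is a rough sketch of the arithmetic. The panel resolutions (64x32 for P5, 32x16 for P10) are assumptions, not stated in the post; they are common sizes and happen to reproduce the 72 and 36 universes quoted above when every universe carries a full 512 channels.

```python
import math

CHANNELS_PER_PIXEL = 3       # RGB
CHANNELS_PER_UNIVERSE = 512  # a full E1.31 universe


def universes_needed(panels_wide, panels_high, panel_w_px, panel_h_px):
    """Universes required for a matrix built from identical panels."""
    pixels = (panels_wide * panel_w_px) * (panels_high * panel_h_px)
    channels = pixels * CHANNELS_PER_PIXEL
    return math.ceil(channels / CHANNELS_PER_UNIVERSE)


# ASSUMED panel resolutions: 64x32 (P5) and 32x16 (P10).
p5_universes = universes_needed(3, 2, 64, 32)
p10_universes = universes_needed(4, 3, 32, 16)
print(p5_universes, p10_universes)  # 72 36
```

So the P5 matrix receives twice as many packets per frame as the P10, which fits with only the P5 showing trouble.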

Over the past year or two since this was noticed, the FPP instances have run many different versions, currently 7.x-master-417-gddec08e9, and I have run every S6 version that has ever been released.  The FPP instance for the P5 reports less than 35% CPU usage, and the P10 reports less than 25% CPU usage.

As a test today, I copied my preview and deleted the P10 matrix and tested with only the P5 matrix.  Same problem.  Next test was to change the P5 matrix so it is using Universes 101 - 172.  Again, no change.  Obviously for these two tests, only the P5 matrix was playing.

Currently testing this from my primary sequencing computer:

Light-O-Rama S6 Sequencer (64-bit)
Version 6.2.18
Copyright 2016-2023 Light-O-Rama, Inc.
Pro Edition
Registered to: Jim Walls

Microsoft Windows 10 Pro 64-bit
Intel® Core™ i7-8700 CPU @ 3.20GHz
NVIDIA GeForce GTX 1070 Ti
Intel® UHD Graphics 630
OpenGL version: 4.6.0 NVIDIA 457.51

Note that during the off season, both matrix panels are hanging in my family room, so it is very easy to test this.  One more note.  All connectivity between any of the three computers and the matrix panels either when in storage or deployed for use is either 1Gb/s or 10Gb/s.

Any ideas?

 


Jim,

1) I first set up a test with the P5 matrix in my lab (288 universes). This is controlled by FPP 5.5 running on a Pi 4 and a Colorlight card.

2) Driving this matrix from the Show Player in the S6 Control Panel did produce errors on the FPP status screen.

3) Next I looked at the FPP source code to see what conditions cause it to report an error. It turns out that errors only get reported in Bridge mode when E1.31 packets are received out of order, e.g. if one is skipped (each packet has a sequence number). The relevant code is here (lines 459-468): https://github.com/FalconChristmas/fpp/blob/master/src/e131bridge.cpp#L452

4) I reviewed the S6 code that sends out E1.31 packets. The way this code is structured, it is impossible for it to send packets out of order. To check, I used Wireshark to capture packets sent by the Control Panel. Wireshark showed all packets were in order even as FPP was reporting errors.

5) From this, I suspected that the problem was that FPP was dropping packets. I verified this by artificially reducing the rate at which the Control Panel was sending packets to FPP. I was able to drop the rate to a point where FPP no longer reported errors.
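To illustrate steps 3 to 5, here is a minimal sketch of a per-universe sequence check. It is not FPP's actual code; the function name and the input format are made up for illustration. It relies only on the fact that the E1.31 sequence number is an 8-bit counter kept per universe, so it wraps from 255 back to 0. A receiver that drops a packet sees a gap, even though the sender transmitted everything in order.

```python
def count_sequence_errors(packets):
    """Count sequence gaps per universe.

    packets: iterable of (universe, sequence) tuples in arrival order.
    Hypothetical helper for illustration, not taken from FPP.
    """
    last_seen = {}  # universe -> last sequence number received
    errors = {}     # universe -> number of gaps detected
    for universe, seq in packets:
        prev = last_seen.get(universe)
        if prev is not None:
            expected = (prev + 1) % 256  # 8-bit counter wraps at 256
            if seq != expected:
                errors[universe] = errors.get(universe, 0) + 1
        last_seen[universe] = seq
    return errors


# Universe 201 wraps cleanly; universe 240 is missing sequence 11.
received = [(201, 254), (201, 255), (201, 0),
            (240, 10), (240, 12), (240, 13)]
print(count_sequence_errors(received))  # {240: 1}
```

Running the same check on a Wireshark capture taken at the sender and on what the receiver actually processed would show gaps only on the receiving side, consistent with dropped packets.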

Therefore, I don't believe there is any problem with S6. The problem is that FPP cannot process the incoming packets fast enough when in bridge mode.

Why does it work ok with S5 and not S6? Because S6 sends E1.31 packets out much faster than S5 or S4.


Thanks for doing all that research. I am assuming that in addition to the errors shown on the FPP status screen, the bottom portion of the P5 matrix was also displaying incorrectly. Out of curiosity, how many universes was it getting before the errors started (that relates to the next paragraph)? Now to figure out my solution. You can now better see the reason for my query on Tuesday about whether Master / Slave (if / when implemented) will be compatible with the remote mode in FPP. Although it took a couple of tweaks, I successfully exported a sequence from S6 as an .fseq file, uploaded that to the PocketBeagle running my P5 matrix, and played it from FPP on the P5 matrix - and it worked perfectly.

Although no expert, I am going to speculate that the PocketBeagle is not exactly the fastest FPP compatible processor out there.  So wondering if I should plan on upgrading the PocketBeagles to something faster - either a faster BeagleBone or a RasPi.  I used a ColorLight one year with RasPi 3B for each matrix, but dealing with ColorLight config was such a pain in the backside that I was happy to go with the PocketBeagles.  Although the RasPis have been re-purposed for a ham radio application, coming up with a Pi-4 or even Pi-5 (once that is fully supported by FPP) would not be a big deal.  I still have the ColorLight cards.

The last few years, this has not been an operational problem because I have only been deploying my P10 matrix (36 universes), and not the P5 matrix (72 universes).  However, I am planning on rebuilding my P10 matrix to make it larger so it will also become 72 universes.  And dammit, I'm GOING to build a waterproof enclosure for the P5 so I can deploy it next Christmas!

Now here's a novel approach.  Do you see any reason that this would not work?  Build the matrix with the upper half driven by one FPP instance and the lower half driven by another.  The matrix in the Preview would remain as 72 consecutive universes (201 - 272 in my case), but in the Control Panel Networking tab, it would be set up as two networks - i.e. universes 201 - 236 go to 192.168.131.81 and universes 237 - 272 go to 192.168.131.82.  That would not be hard for me to test.  Granted, this is a band-aid, but may be a workable solution - paragraph 2 may be a better plan...
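A small sketch of the split described above, just to make the mapping concrete. The universe ranges and IP addresses come from the post; the table layout and function name are hypothetical and are not how the Control Panel stores its networks.

```python
from ipaddress import ip_address

# Each entry: (universes handled, destination FPP instance).
ROUTES = [
    (range(201, 237), ip_address("192.168.131.81")),  # upper half: 201 - 236
    (range(237, 273), ip_address("192.168.131.82")),  # lower half: 237 - 272
]


def destination_for(universe):
    """Return the FPP host that should receive the given universe."""
    for universes, host in ROUTES:
        if universe in universes:
            return host
    raise ValueError(f"universe {universe} is not routed anywhere")


for u in (201, 236, 237, 272):
    print(u, destination_for(u))
```

Each FPP instance would then receive 36 universes, the same load as the P10 matrix that already works.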

 

