-
Notifications
You must be signed in to change notification settings - Fork 4
Performance Data Propagation #7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
81d9e57
84866e8
84527dc
260f613
cb026ee
e940f05
adcb5a9
7cc6cc9
5f6fefd
c4bd5c2
1d1f8a9
32948f3
c461e7e
115aae1
891bff0
4288719
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,179 @@ | ||
| - Feature Name: `performance-data-propagation` | ||
| - Start Date: 2021-01-29 | ||
| - RFC PR: [RFC#7](https://github.com/Unity-Technologies/com.unity.multiplayer.rfcs/pull/7) | ||
| - Issue: [MLAPI#484](https://github.com/Unity-Technologies/com.unity.multiplayer.mlapi/issues/484) | ||
|
|
||
| # Summary | ||
| [summary]: #summary | ||
|
|
||
| Currently there is Transport performance data and MLAPI performance data that is not being propagated up externally for any consumer to render or consume. The use cases for this would be useful for technologies such as a **Network Stats Overlay** or a **Profiler** (**Network Messages** and **Network Operation** need this data in the vanilla Unity Profiler). Users need a viable and low overhead way to obtain this data (such as RPCs sent this frame/tick) from MLAPI and utilize this data in the way that they want for their own metrics. | ||
|
|
||
| # Motivation | ||
| [motivation]: #motivation | ||
|
|
||
| This is needed to provide both a default and user specific implementations for future Network Profilers and Network Stats overlays. The MLAPI source code can remain Open Source, and yet, the data being streamed can be hooked up to external and low level systems. | ||
|
|
||
| # Guide-level explanation | ||
| [guide-level-explanation]: #guide-level-explanation | ||
|
|
||
| In order to allow first and third party rendering of the data we will need to allow both MLAPI and its transports to propagate data up the chain. | ||
|
|
||
| If the profiler flags are enabled the following will need to happen: | ||
|
|
||
| Transports will need to output the number of bytes being sent in the current tick and the latency. The data will be written out to a `ProfilerTickData` chunk. This data will be generic and transport agnostic. | ||
|
|
||
| MLAPI will need to output the number of networked objects and the amount of RPCs being sent. This data will be added to the `ProfilerTickData` chunk. | ||
|
|
||
| External Tools and APIs will be able to then get this fully written `ProfilerTickData` at the end of the frame for them to do what they wish (Render to Profiler Stats Overlay or Display/Record in a Profiler or file) | ||
|
|
||
| As this data is largely a `Dictionary`, accessing its data will be as simple as accessing things by specified fields. Additionally the rendering or logging of these statistics will fall on the user and not affect the Tick or be done at all within the MLAPI framework. | ||
|
|
||
| # Reference-level explanation | ||
| [reference-level-explanation]: #reference-level-explanation | ||
|
|
||
| In order to limit the interface, we are pushing the collected set or performance data to be obtainable only at the end of a tick. The following is roughly what this object could look like. Currently this object will be updated in `NetworkManager.LateUpdate()` for the MLAPI and `Transport.Send()` and other relevant transport method for the transport layer. At the end of the tick the user will be notified that the relevant data is complete and ready (both the MLAPI data and Transport data). Since MLAPI is responsible for collecting the data, and the data from transports will be generic and apply to all/most transports, MLAPI as the owner of the collection of performance data will call `GetTransportGetData()` on the transports through an interface in order to get the data exactly when it needs it in the pipeline before propagating the data out externally. Whether or not the data is collected per transport can also be toggled by a runtime flag `profilerEnabled`. | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Does this behavior vary whether we're a client or server? And if we're a server, is this data aggregated for all connected clients or separated out per client? For example, with AOI the amount of data replicated will vary by client connection. Would you want to be able to see that? Or is it enough to lump it into one pile for all connections/clients?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is dependent on what MLAPI wants to output. Currently i'll match 1-to-1 with existing Profiling which is all connections and server based. |
||
|
|
||
| ## Pseudo-code | ||
|
|
||
| ```cs | ||
| namespace MLAPI | ||
| { | ||
| class ProfilerConstants | ||
| { | ||
| public const string NumberOfServerRPCs = "numberOfServerRPCs"; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I recently went through an exercise to clean up the channel stuff that might be of use here. Check out Transport.cs circa line 9 / the 'Channel' definition. You can define each measurement as an enum, then if you need to output a string you can just call ToString() on it. Then you don't have to maintain the strings / pay for the string de-ref.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. From another comment by @becksebenius-unity : "Suggestion for the dictionary to use string keys instead of a predefined enum. unintuitively, it's actually better for performance (enums as dictionary keys actually get boxed 🤢). and if we're going to go with a flexible approach, lets maximize that instead of restricting them to predefined keys, that way users building their own tools for their own custom transport could extend. we can still provide predefined keys for the user in a constants class so that we can guarantee good data for our profiler."
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think he's not suggesting to change
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ah i see. I can update to that. in the actual PR |
||
| public const string TickDuration = "tickDuration"; | ||
| public const string NumberOfBuffMessagesOut = "numberOfBuffMessagesOut"; | ||
| public const string NumberOfBuffMessagesIn = "numberOfBuffMessagesIn"; | ||
| public const string NumberOfUnBuffMessagesOut = "numberOfUnBuffMessagesOut"; | ||
| public const string NumberOfUnBuffMessagesIn = "numberOfUnBuffMessagesIn"; | ||
| public const string NumberBytesSent = "numberBytesSent"; | ||
| public const string NumberBytesReceived = "numberBytesReceived"; | ||
| } | ||
|
|
||
| public class ProfilerTickData | ||
| { | ||
| public int tickID; | ||
|
|
||
| public readonly Dictionary<string, int> tickData = new Dictionary<string, int>(); | ||
| } | ||
|
|
||
| public interface ITransportProfilerData | ||
| { | ||
| Dictionary<string, int> GetTransportGetData(); | ||
| } | ||
| } | ||
|
|
||
| namespace MLAPI | ||
| { | ||
| class NetworkManager | ||
| { | ||
| private static int tickID = 0; | ||
|
|
||
| public delegate void PerfomanceDataEventHandler(ProfilerTickData profilerData); | ||
|
|
||
| event PerfomanceDataEventHandler PerfomanceDataEvent; | ||
|
|
||
| void LateUpdate() | ||
| { | ||
| // ... | ||
| ProfilerTickData profilerData = new ProfilerTickData(); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Probably should use a pool, (impl I know)
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Will likely use a static pool where the receiver (third party) can't keep a reference or the reference is readonly |
||
| profilerData.tickID = tickID++; | ||
| // ... | ||
| // Transports do their send and receives | ||
| // ... | ||
| profilerData.tickData[ProfilerConstants.NumberOfServerRPCs] += 1; | ||
| // ... | ||
| var profileTransport = NetworkConfig.NetworkTransport as ITransportProfilerData; | ||
| if (profileTransport != null) | ||
| { | ||
| var transportProfilerData = profileTransport.GetTransportGetData(); | ||
| transportProfilerData.ForEach(x => | ||
| { | ||
| if (!profilerData.tickData.ContainsKey(x.Key)) profilerData.tickData.Add(x.Key, x.Value); | ||
| }); | ||
| } | ||
|
|
||
| PerfomanceDataEvent?.Invoke(profilerData); | ||
| } | ||
| } | ||
| } | ||
|
|
||
|
|
||
| namespace MLAPI.Transport | ||
| { | ||
| class UNetTransport : Transport, ITransportProfilerData | ||
| { | ||
| private static Dictionary<string, int> transportProfilerData; | ||
| private static bool profilerEnabled; | ||
|
|
||
| void Init() | ||
| { | ||
| transportProfilerData = new Dictionary<string, int>(); | ||
| } | ||
|
|
||
| Dictionary<string, int> GetTransportGetData() | ||
| { | ||
| return transportProfilerData; | ||
| } | ||
|
|
||
| void Send(ulong clientId, ArraySegment<byte> data, string channelName) | ||
| { | ||
| if (profilerEnabled) | ||
| { | ||
| transportProfilerData[ProfilerConstants.NumberOfBuffMessagesOut] += 1; | ||
| } | ||
| } | ||
| } | ||
| } | ||
|
|
||
| namespace ThirdParty | ||
| { | ||
| class StatsRenderer | ||
| { | ||
| void Start() | ||
| { | ||
| NetworkManager.instance.PerfomanceDataEvent += OnPerformanceEvent; | ||
| } | ||
|
|
||
| void OnPerformanceEvent(ProfilerTickData profilerData) | ||
| { | ||
| IMGUI.textlabel("Num Server RPCs", profilerData.tickData[ProfilerConstants.NumberOfServerRPCs]); | ||
| } | ||
| } | ||
| } | ||
| ``` | ||
|
|
||
| # Drawbacks | ||
| [drawbacks]: #drawbacks | ||
|
|
||
| With a dictionary we occur the performance overhead of boxing somewhat. Additionally we require our callers to look up the appropriate types for the data that they are pulling out of the dictionary. | ||
|
|
||
| However, changes to the `ProfilerTickData` will not cause large scale changes to the transport, MLAPI, and all third party consumers. | ||
|
|
||
| Each transport is also required to implement any required changes needed to fill out the relevant transport-level information. Of course not filling out the fields simply means the Overlay, Profiler, Logger will just display default values for the field types. This means they can implement after an initial release of their transport, if need be. | ||
|
|
||
| # Rationale and alternatives | ||
| [rationale-and-alternatives]: #rationale-and-alternatives | ||
|
|
||
| This design of single memory block is the best for locality of data. | ||
|
|
||
| There is less room for user error since there is only one readonly object that they get access to. | ||
|
|
||
| There is no room for user injected data or exceptions since we aren’t using an event signaling or actions. All user interaction for rendering happens at end of tick so they can’t affect the time of profiling or cause unforeseen issues. | ||
|
|
||
| The impact of not doing a Network Tick and Profiler Notification System is that we’d loose the Profiling capabilities that existed in the vanilla Profiler with Mirror/UNET. | ||
|
|
||
| # Prior art | ||
| [prior-art]: #prior-art | ||
|
|
||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is already a stat gathering feature added to MLAPI that might be extendable / have things we can steal (see ProfilerStat / ProfilerStatManager). It does a moving average. In particular I had to put in some smarts in the 'Record' method to handle when tons of samples are taken in a short period of time. Though admittedly this samples / rate computation would live more naturally in 3rd party stuff, except that displaying this to users is so fundamental to letting them get started reverses my feeling to think we probably should keep including it.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That code will still be used. This should feed into this for the Profiler among other things. |
||
| Writing all pertinent data per tick is a similar method to what most rendering pipelines do, which is gather relevant data store in a buffer and render it. Since Multiplayer requires the same effective speed as our Render Pipelines we want to make sure it is as stable and robust as possible. | ||
|
|
||
| # Unresolved questions | ||
| [unresolved-questions]: #unresolved-questions | ||
|
|
||
| Some of the design involving `profilerEnabled` flag and using an interface is based upon the MultiPlex Transport which may use other transports underneath. Is this truly necessary? | ||
|
|
||
| # Future possibilities | ||
| [future-possibilities]: #future-possibilities | ||
|
|
||
| This is a low impact and visibility change. All in all, it's unlikely that there would be much in future possibilities or requests to this feature. | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As in the # of networked objects that are currently being managed? Subtle point, would the AOI-ness matter? Like would we count objects that are outside AOI for a client?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thats an example. but ultimately it would be up to how that data is captured by MLAPI and isn't dependent on whether or not we have this system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I interpreted this more as, how many objects sent state updates in this network tick. Both might be helpful though