Thread: oUF performance
09-27-18, 04:41 PM   #6
Blooblahguy
So I'll reference this document here, because it probably has better examples than what I'll list. What I've profiled so far is incomplete on its own, so I'll try to get more profiling done this weekend.

* Localizing variables outside of for loops
The size of the gain here depends on the complexity of the loop and the calls made from within it. Obviously the number of calls matters, but I think that factor is far outweighed by how expensive common WoW functions are.

I tested the following calls in separate loops of 10,000 iterations each. Depending on the layout, frequent health updates, and the number of units on screen, these calls frequently hit and exceed 10k in a given fight, so I thought it would be a good test number.
The first number is what each call clocked without localizing the API call first; the second is with the API call localized.

UnitIsConnected 1.775 -> 1.676
UnitExists 1.918 -> 1.826
UnitReaction 4.301 -> 4.254
UnitIsUnit 1.904 -> 1.874
UnitAura 1.925 -> 1.843
UnitIsPlayer 1.657 -> 1.589
UnitIsTapDenied 1.626 -> 1.607
UnitPlayerControlled 1.683 -> 1.596
UnitHealth 4.950 -> 4.872
UnitHealthMax 4.996 -> 4.913

total time: 26.735
total time optimized: 26.050
avg improvement: 2.62%

So granted, not large - but keep in mind this was to localize what is already a single reference, no table lookups or anything involved. Just making local UnitHealthMax = UnitHealthMax. I think total time is a really important stat here, but I'll touch back on that.
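For reference, the comparison above can be reproduced outside the game with a stand-in function. This is only a minimal sketch in plain Lua: `FakeUnitHealthMax` and `time_calls` are hypothetical names, `os.clock` stands in for whatever in-game timer produced the numbers above, and the absolute timings will not match WoW's.

Code:
```lua
-- Hypothetical stand-in for a WoW API global such as UnitHealthMax.
function FakeUnitHealthMax(unit)
	return 100
end

-- Hypothetical helper: runs fn once, returns elapsed CPU seconds and fn's result.
local function time_calls(fn)
	local start = os.clock()
	local result = fn()
	return os.clock() - start, result
end

local N = 10000

-- Version 1: the global is looked up on every iteration.
local t_global, sum_global = time_calls(function()
	local sum = 0
	for i = 1, N do
		sum = sum + FakeUnitHealthMax("player")
	end
	return sum
end)

-- Version 2: the global is copied into a local once, before the loop.
local t_local, sum_local = time_calls(function()
	local FakeUnitHealthMax = FakeUnitHealthMax
	local sum = 0
	for i = 1, N do
		sum = sum + FakeUnitHealthMax("player")
	end
	return sum
end)

print(("global: %.6f s, local: %.6f s"):format(t_global, t_local))
```

Both versions do the same work, so any difference comes from the lookup alone. With a stand-in this cheap the gap may be within timer noise; repeating the run several times gives a fairer picture.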

When we make the call include a lookup on a multidimensional table, things look a lot different. Let's analyze the health element, since basically every layout uses it. I can't do a 1:1 comparison right now, but even just looking at the lookup to unpack the reaction color we see a large improvement.

Before my profile I set this table:
Code:
local parent = {}
parent.colors = {}
parent.colors.reaction = {}
parent.colors.reaction[4] = {.1, .2, .3, 1}
Code:
profile("unitreaction_color", function()
	for i = 1, 100 do
		local unitreaction = UnitReaction('nameplate1', 'player')
		-- three table lookups (parent -> colors -> reaction) on every iteration
		local color = unpack(parent.colors.reaction[4])
	end
end)
Takes 0.0602
Code:
profile("optimized_unitreaction_color", function()
	-- localize the globals and the nested table once, before the loop
	local unpack, UnitReaction = unpack, UnitReaction
	local r_table = parent.colors.reaction
	for i = 1, 100 do
		local unitreaction = UnitReaction('nameplate1', 'player')
		local color = unpack(r_table[4])
	end
end)
Takes 0.0523
Improvement: 13%

That table is as simple as it gets. The difference gets more pronounced the bigger the table reference is and the more the function does. We unpack colors from the self element in these cases, and these self tables can often get really large, especially when layouts use many of the elements available in oUF. I tried unpacking a color from my bdCore library table, which is really pretty lean, and that increased the difference to 21%. I'll try to get exact stats on oUF layouts when I get home; right now I don't have an easy way to test.

Again, with all of the above in mind, I think it's important to note just how often these functions are called. Maybe not from just player, target, tot, and pet, but when you have raid frame and nameplate layouts, all of these call counts go up drastically.

* Creating tables or table templates (key sizing) outside of loops and just updating their reference inside of loops
This and some other points are just about potential optimizations, not that I saw it being done blatantly wrong at any point. oUF does seem to mostly create variables inside of loops, though.
Take the following code as an example

Code:
for i = 1, 1000000 do
	local a = {}
	a[1] = 1; a[2] = 2; a[3] = 3
end
Takes 52.240 seconds to run while
Code:
for i = 1, 1000000 do
	local a = {true, true, true}
	a[1] = 1; a[2] = 2; a[3] = 3
end
Only takes 20.98 seconds to run. That's about 60% less time. Obviously total time is exaggerated by the high iteration count, but I don't think implementing this practice would take much time, and the benefits start to add up.
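The other half of this point, reusing one table instead of creating a new one per iteration, could look something like the sketch below. It is plain Lua with hypothetical names (`scratch`, `fill_color`). It sidesteps stale keys by always overwriting the same three slots; in WoW, where leftover keys matter, you would likely clear the reused table with `wipe` instead.

Code:
```lua
-- Allocated once, outside the loop, with its three array slots presized.
local scratch = {0, 0, 0}

-- Hypothetical helper: overwrites the same slots instead of building a new table.
local function fill_color(t, r, g, b)
	t[1], t[2], t[3] = r, g, b
	return t
end

local first
for i = 1, 1000 do
	local color = fill_color(scratch, i, i * 2, i * 3)
	first = first or color
end

-- Every iteration handed back the same table, so the loop allocated nothing new.
print(first == scratch, scratch[1], scratch[2], scratch[3])
```

The trade-off is that nothing may hold on to the table between iterations, because its contents are overwritten on the next pass.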

* Memoizing
I'm not the biggest fan of this either, but referring back to the above about how long default calls can take over the course of a fight, the only real method of optimization is to call these functions and loops less. Just to pick on the health element again, its update color function could be memoized easily, because we know that given a certain set of inputs the element will always be colored the same way.

If we pass `UnitIsTapDenied(unit)`, `UnitIsPlayer(unit) and select(2, UnitClass(unit)) or false`, and `UnitReaction(unit, unit2)`, then we have a unique set of parameters that always returns the same colors, which we could cache and return the next time we call it. I've implemented this on my nameplates because UNIT_THREAT_LIST_UPDATE and UNIT_HEALTH fire so frequently. Memory is far cheaper than processing power, and that is especially true in the case of WoW. It is absolutely worth trading some off. We could further optimize this by storing self.class, self.reaction, and self.isplayer and updating those variables on the correct events, but that is definitely cumbersome.
It can't be used often though, since the whole job of oUF is to take a bunch of variable data and make it easily usable. But in the case of memory here, we're talking about creating around 100 KB of table caches to save hundreds if not thousands of CPU loops.
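A minimal sketch of such a cache, in plain Lua so it runs outside the game. Everything named here is hypothetical (`colors`, `compute_color`, `get_color`, the key format, the color values), and it is not how oUF does it. The tapped/class/reaction values are passed in as plain arguments where an addon would get them from `UnitIsTapDenied`, `UnitClass`, and `UnitReaction`.

Code:
```lua
-- Hypothetical color tables with made-up values.
local colors = {
	tapped = {0.6, 0.6, 0.6},
	class = { MAGE = {0.2, 0.7, 0.9} },
	reaction = { [4] = {0.1, 0.2, 0.3}, [5] = {0.3, 0.8, 0.3} },
}

local compute_calls = 0

-- The uncached path: decides which color applies to this combination of inputs.
local function compute_color(tapped, class, reaction)
	compute_calls = compute_calls + 1
	if tapped then return colors.tapped end
	if class and colors.class[class] then return colors.class[class] end
	return colors.reaction[reaction]
end

local cache = {}

-- Memoized wrapper: each distinct (tapped, class, reaction) triple is computed once.
local function get_color(tapped, class, reaction)
	local key = tostring(tapped) .. ":" .. tostring(class) .. ":" .. tostring(reaction)
	local color = cache[key]
	if color == nil then
		-- store false for "no color" so a miss is also cached
		color = compute_color(tapped, class, reaction) or false
		cache[key] = color
	end
	return color
end

for i = 1, 1000 do
	get_color(false, false, 4)   -- non-player unit, reaction 4
	get_color(false, "MAGE", 5)  -- player unit, colored by class
end
print(compute_calls)
```

Building the string key is not free either, so this only pays off when the uncached path costs more than the concatenation; nested tables keyed on each input would be one way to avoid it.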

* OnShow / OnUpdate improvements
Yeah, they exist for a reason, and I know it's not as simple as just disabling these things and hoping it all works out. I do think it could be good to give some elements an attribute to opt out of the UpdateAllElements function.

* DisableAddon Blizzard stuff
So I was actually wrong on part of this. I thought I had disabled Blizzard nameplates and relogged and was still able to use my oUF layout, but I must not have relogged or something. This definitely does not work. However, when tracking frame functions and addon CPU usage, Blizzard nameplates are still clocking in really high, and I think that the handle blizzard function inside of oUF may be missing something when it disables these frames. I'll investigate more this weekend. This happens a lot more with CompactRaidFrames, which definitely can be disabled without breaking raid frame layouts. Note that this is different from the RaidUI addon, which lets players place markers and whatnot. To me it seems reasonable to disable CompactRaidFrames when a raid layout is initialized, but if you feel that is overstepping then I can understand that.

I'll try and get more profiling numbers this weekend and really dig into some of the FPS problems people are reporting to me.