Checking Zero-Based Boundaries
👋 Introduction
What is boundary checking? Boundary checking is a safeguard used when accessing an integer-based indexer. It's for ensuring that the index is within boundary before doing the actual operation that the indexer is required for. In this post I'm only talking about a special type of boundary checking in which the lower boundary is always 0 and the index in question is of a signed integer type.
❓ How to check boundaries?
So let's say you have a class MyCustomCollection
having an indexer (int index
) that returns the index
th element in your collection. You know the length of your collection is stored in the variable int length
and indexing starts from 0. So the lower boundary is 0, and the upper is length
.
So how do you boundary check? I figure you simply check whether the index
is greater than or equal to zero and then check if it is less than length
. Or do you? Let's compare the below options.
⚖️ index >= 0 && index < length
vs (uint)index < (uint)length
Logically, there's no difference. Here's the explanation:
length
is never negative by definition,uint
conversion is needed so that the compiler knows it too. So, theuint
conversion has no effect on the actual value oflength
.index
, on the other hand can be negative, since it's a 'user-input'. Here, theuint
conversion ensures that when that's the case, the conversion results in a number greater thanint.MaxValue
(due to the intrinsics of storing a negative number), thus making the logical check's result false. The happy path ((index >= 0 && index < length) == true
) is not affected by the uint conversions.
❓ But how about the difference between performance?
🔍 JIT Output Analysis
index >= 0 && index < length
(Simple)
mov eax, [rcx+0x20]
test eax, eax
jl short L0011
cmp eax, [rcx+0x24]
setl al
movzx eax, al
ret // It was jitted as a function, so naturally it would be a jmp pointing right after xor eax, eax
xor eax, eax
ret
(uint)index < (uint)length
(Optimized)
mov eax, [rcx+0x20]
cmp eax, [rcx+0x24]
setb al
movzx eax, al
ret
You can see that the optimized one compiles into half the amount of instructions than the simple one does.
⏱️ Measurements
Method | Mean |
---|---|
Simple32 | 9.421 ns |
Simple64 | 9.702 ns |
Optimized32 | 5.964 ns |
Optimized64 | 4.433 ns |
SimpleIteration32 | 113,636.877 ns |
SimpleIteration64 | 115,620.033 ns |
OptimizedIteration32 | 88,971.130 ns |
OptimizedIteration64 | 87,281.202 ns |
📋 Summary
The optimized one seems to be almost twice as fast than the simple one. When iterating, we cannot see such a big difference, and we also cannot really see a difference between int
and long
. This is because when iterating, the execution of the condition (or in the latter case, the iteration part) is really just a small part of all the repeating operations (increase iterator, increase variable, jump to branch, reading the array element from memory, executing the condition).