Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Optimize range checks for a[i & C], a[i % c] and a[(i & c1)>>c2)] patterns #1644

Merged
merged 11 commits into from Mar 4, 2020

Conversation

EgorBo
Copy link
Member

@EgorBo EgorBo commented Jan 12, 2020

Optimize range checks for the following cases:

static ReadOnlySpan<byte> span => new byte[] { 1,2,3,4 };

byte Case1(int i)  => span[i & 2];
byte Case2(int i)  => span[i & (span.Length - 1)]; // "i & 3"
byte Case3(int i)  => span[(i & 10) >> 2]; // or <<
byte Case4(uint i) => span[(int)(i % 2)]; // only for unsigned i

See asm diff for the snippet above ^.

This PR currently handles only &, % and >> operators with constant op2 and ignores actual range of op1 (always sets 0 as a lower limit and op2->IconValue() as an upper limit) to keep this PR simple and harmless.

jit-diff:

Total bytes of diff: -133 (-0.00% of base)
    diff is an improvement.

Top file improvements by size (bytes):
         -69 : System.Private.CoreLib.dasm (-0.00% of base)
         -64 : System.Private.Uri.dasm (-0.07% of base)

2 total files with size differences (2 improved, 0 regressed), 107 unchanged.

Top method improvements by size (bytes):
         -69 (-8.61% of base) : System.Private.CoreLib.dasm - System.IO.Path:Populate83FileNameFromRandomBytes(long,int,System.Span`1[Char])
         -38 (-2.77% of base) : System.Private.Uri.dasm - System.UriHelper:EscapeStringToBuilder(System.ReadOnlySpan`1[Char],byref,System.ReadOnlySpan`1[Boolean],bool)
         -14 (-5.88% of base) : System.Private.Uri.dasm - System.UriHelper:EscapeAsciiChar(ushort,byref)
         -12 (-10.34% of base) : System.Private.Uri.dasm - <>c:<HexEscape>b__121_0(System.Span`1[Char],ushort):this

Top method improvements by size (percentage):
         -12 (-10.34% of base) : System.Private.Uri.dasm - <>c:<HexEscape>b__121_0(System.Span`1[Char],ushort):this
         -69 (-8.61% of base) : System.Private.CoreLib.dasm - System.IO.Path:Populate83FileNameFromRandomBytes(long,int,System.Span`1[Char])
         -14 (-5.88% of base) : System.Private.Uri.dasm - System.UriHelper:EscapeAsciiChar(ushort,byref)
         -38 (-2.77% of base) : System.Private.Uri.dasm - System.UriHelper:EscapeStringToBuilder(System.ReadOnlySpan`1[Char],byref,System.ReadOnlySpan`1[Boolean],bool)

4 total methods with size differences (4 improved, 0 regressed), 262165 unchanged.

The jit diff is quite small but there are few more places in the BCL where we can slightly tune and benefit from this opt, e.g.: here, also, see this comment. Also all [x % c] patterns require cast to uint.

PS: range checks are not eliminated when CNS is negative

@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 12, 2020
@EgorBo EgorBo changed the title JIT: Optimize range checks for array[i & C] pattern JIT: Optimize range checks for a[i & C], a[i % c] and a[(i & c1)>>c2)] patterns Jan 12, 2020
@sandreenko
Copy link
Contributor

PTAL @dotnet/jit-contrib

@benaadams
Copy link
Member

benaadams commented Jan 17, 2020

Will this work for a static span lifted to a local? Something like e.g.

private static ReadOnlySpan<byte> s_hexEncodeMap => new byte[] { (byte)'0', (byte)'1', (byte)'2', (byte)'3', (byte)'4', (byte)'5', (byte)'6', (byte)'7', (byte)'8', (byte)'9', (byte)'A', (byte)'B', (byte)'C', (byte)'D', (byte)'E', (byte)'F' };

/// <summary>
/// A faster version of String.Concat(<paramref name="str"/>, <paramref name="separator"/>, <paramref name="number"/>.ToString("X8"))
/// </summary>
/// <param name="str"></param>
/// <param name="separator"></param>
/// <param name="number"></param>
/// <returns></returns>
public static string ConcatAsHexSuffix(string str, char separator, uint number)
{
    var length = 1 + 8;
    if (str != null)
    {
        length += str.Length;
    }

    return string.Create(length, (str, separator, number), (buffer, tuple) =>
    {
        var (tupleStr, tupleSeparator, tupleNumber) = tuple;
        var hexEncodeMap = s_hexEncodeMap;

        var i = 0;
        if (tupleStr != null)
        {
            tupleStr.AsSpan().CopyTo(buffer);
            i = tupleStr.Length;
        }

        var number = (int)tupleNumber;
        buffer[i + 8] = (char)hexEncodeMap[number & 0xF];
        buffer[i + 7] = (char)hexEncodeMap[(number >> 4) & 0xF];
        buffer[i + 6] = (char)hexEncodeMap[(number >> 8) & 0xF];
        buffer[i + 5] = (char)hexEncodeMap[(number >> 12) & 0xF];
        buffer[i + 4] = (char)hexEncodeMap[(number >> 16) & 0xF];
        buffer[i + 3] = (char)hexEncodeMap[(number >> 20) & 0xF];
        buffer[i + 2] = (char)hexEncodeMap[(number >> 24) & 0xF];
        buffer[i + 1] = (char)hexEncodeMap[(number >> 28) & 0xF];
        buffer[i] = tupleSeparator;
    });
}

@EgorBo
Copy link
Member Author

@benaadams yeah should work (just checked - no range checks for hexEncodeMap accesses)

@benaadams
Copy link
Member

Added PR then 😄 dotnet/aspnetcore#18406

@briansull
Copy link
Contributor

How can you prove that the length of hexEncodeMap is 16 ?
The static s_hexEncodeMap can be re-assigned to be an array with a different length.

@benaadams
Copy link
Member

How can you prove that the length of hexEncodeMap is 16 ?

Its getter which is a look up into readonly data. Not sure it can be reassigned in anyway?

@briansull
Copy link
Contributor

Not sure that there is anything that guarantees it is read only.
I know that reflection can be used to change "read only" values.
The JIT can't afford to omit a bounds check at allow an out of bounds read or write operation.

@benaadams
Copy link
Member

Its a method not a value; with the following il

.method private hidebysig specialname static 
	valuetype [System.Private.CoreLib]System.ReadOnlySpan`1<uint8> get_s_hexEncodeMap () cil managed 
{
	// Method begins at RVA 0x2050
	// Code size 13 (0xd)
	.maxstack 8

	IL_0000: ldsflda valuetype '<PrivateImplementationDetails>'/'__StaticArrayInitTypeSize=16' '<PrivateImplementationDetails>'::CE27CB141098FEB00714E758646BE3E99C185B71
	IL_0005: ldc.i4.s 16
	IL_0007: newobj instance void valuetype [System.Private.CoreLib]System.ReadOnlySpan`1<uint8>::.ctor(void*, int32)
	IL_000c: ret
}

@benaadams
Copy link
Member

Or following asm

C.get_s_hexEncodeMap()
    L0000: mov rax, 0x2d1c3a60844
    L000a: mov [rcx], rax
    L000d: mov dword [rcx+0x8], 0x10
    L0014: mov rax, rcx
    L0017: ret

So the length is fixed at 0x10 or 16 bytes

@briansull
Copy link
Contributor

I see it is not an array but a [ReadOnly]Span which is a struct with a length field.

@briansull
Copy link
Contributor

And to get this optimization, the JIT probably has to inline: get_s_hexEncodeMap and the call it makes to the ReadOnlySpan`1:.ctor as well.

@benaadams
Copy link
Member

Not sure I understand, its the same pattern as in the summary of the PR

static ReadOnlySpan<byte> span => new byte[] { 1,2,3,4 };

byte Case1(int i)  => span[i & 2];

@EgorBo EgorBo reopened this Jan 18, 2020
@EgorBo
Copy link
Member Author

EgorBo commented Jan 18, 2020

How can you prove that the length of hexEncodeMap is 16 ?
The static s_hexEncodeMap can be re-assigned to be an array with a different length.

Well, jit already does that, see sharplab.io (thanks to Roslyn dotnet/roslyn#24621)
My PR only limits possible range of an indexer and doesn't remove anything by itself (it wouldn't remove anything if array's length was unknown)

@benaadams
Copy link
Member

Another improvement based on this change dotnet/aspnetcore#18450

@stephentoub
Copy link
Member

@EgorBo, @dotnet/jit-contrib, what are the next steps for this PR? It's green, existing feedback appears to be addressed, and there's been no activity on it for a few weeks. Thanks.

@EgorBo
Copy link
Member Author

@EgorBo, @dotnet/jit-contrib, what are the next steps for this PR? It's green, existing feedback appears to be addressed, and there's been no activity on it for a few weeks. Thanks.

it's ready for review from my end.
It handles the most common patterns I managed to find.

src/coreclr/src/jit/rangecheck.cpp Outdated Show resolved Hide resolved
src/coreclr/src/jit/rangecheck.cpp Outdated Show resolved Hide resolved
Copy link
Member

@erozenfeld erozenfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@erozenfeld
Copy link
Member

@dotnet/jit-contrib I think this can be merged. Anyone else wants to take a look at the final version?

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

@erozenfeld erozenfeld merged commit 5c68e94 into dotnet:master Mar 4, 2020
@erozenfeld
Copy link
Member

Thank you @EgorBo !

@EgorBo EgorBo deleted the rngchk-and branch May 25, 2020 11:57
@dotnet dotnet locked as resolved and limited conversation to collaborators Dec 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI optimization
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants