Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix method and basic block flags used by early opts. #2126

Merged
merged 3 commits into from Jan 27, 2020

Conversation

erozenfeld
Copy link
Member

Add missing BBF_HAS_NULLCHECK in loop test transformation.
This change had no diffs in frameworks and benchmark.

Add missing OMF_HAS_ARRAYREF and BBF_HAS_IDX_LEN in many places.
This change had diffs in frameworks and benchmark since the optimization
replacing GT_ARRLENGTH with a constant fires much more often.

Fix inlining to not lose basic block flags when the inlinee basic block
has a ret expr but no statements.

Change propGetType condition in optDoEarlyPropForFunc to check for OMF_HAS_NEWARRAY
in addition to OMF_HAS_NEWOBJ.

Check that methods and basic blocks that have calls to object allocation
helpers have OMF_HAS_NEWOBJ and BBF_HAS_NEWOBJ set.

Check that methods and basic blocks that have calls to array allocation
helpers have OMF_HAS_NEWARRAY and BBF_HAS_NEWARRAY set.

Check that methods and basic blocks that have GT_NULLCHECK nodes
helpers have OMF_HAS_NULLCHECK and BBF_HAS_NULLCHECK set.

Check that methods and basic blocks that access MethodTable
via GT_IND on a ref have OMF_HAS_VTABLEREF and BBF_HAS_VTABLEREF set.

Check that methods and basic blocks that have GT_ARRLENGTH
have OMF_HAS_ARRAYREF and BBF_HAS_IDX_LEN set.

Add missing BBF_HAS_NULLCHECK in loop test transformation.
This change had no diffs in frameworks and benchmark.

Add missing OMF_HAS_ARRAYREF and BBF_HAS_IDX_LEN in many places.
This change had diffs in frameworks and benchmark since the optimization
replacing GT_ARRLENGTH with a constant fires much more often.

Fix inlining to not lose basic block flags when the inlinee basic block
has a ret expr but no statements.

Change propGetType condition in optDoEarlyPropForFunc to check for OMF_HAS_NEWARRAY
in addition to OMF_HAS_NEWOBJ.

Check that methods and basic blocks that have calls to object allocation
helpers have OMF_HAS_NEWOBJ and BBF_HAS_NEWOBJ set.

Check that methods and basic blocks that have calls to array allocation
helpers have OMF_HAS_NEWARRAY and BBF_HAS_NEWARRAY set.

Check that methods and basic blocks that have GT_NULLCHECK nodes
helpers have OMF_HAS_NULLCHECK and BBF_HAS_NULLCHECK set.

Check that methods and basic blocks that access MethodTable
via GT_IND on a ref have OMF_HAS_VTABLEREF and BBF_HAS_VTABLEREF set.

Check that methods and basic blocks that have GT_ARRLENGTH
have OMF_HAS_ARRAYREF and BBF_HAS_IDX_LEN set.
@erozenfeld
Copy link
Member Author

x64 pmi benchmark diffs:

PMI CodeSize Diffs for benchstones and benchmarks game in f:\runtime\artifacts\tests\coreclr\Windows_NT.x64.Release for  default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: -1362 (-0.27% of base)
    diff is an improvement.
Top file improvements (bytes):
        -913 : Bytemark\Bytemark\Bytemark.dasm (-1.05% of base)
        -155 : SciMark\SciMark\SciMark.dasm (-0.87% of base)
         -69 : BenchmarksGame\spectralnorm\spectralnorm-3\spectralnorm-3.dasm (-3.70% of base)
         -63 : Benchstones\BenchI\QuickSort\QuickSort\QuickSort.dasm (-5.90% of base)
         -60 : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm (-0.20% of base)
         -35 : SIMD\RayTracer\RayTracer\RayTracer.dasm (-0.14% of base)
         -26 : Benchstones\BenchI\IniArray\IniArray\IniArray.dasm (-4.15% of base)
         -21 : Benchstones\BenchI\BubbleSort2\BubbleSort2\BubbleSort2.dasm (-2.33% of base)
         -15 : Benchstones\BenchI\Array1\Array1\Array1.dasm (-1.37% of base)
          -3 : Benchstones\BenchF\MatInv4\MatInv4\MatInv4.dasm (-0.07% of base)
          -2 : Benchstones\BenchF\Whetsto\Whetsto\Whetsto.dasm (-0.06% of base)
11 total files with Code Size differences (11 improved, 0 regressed), 71 unchanged.
Top method improvements (bytes):
        -263 (-12.11% of base) : Bytemark\Bytemark\Bytemark.dasm - AssignRect:second_assignments(ref,ref)
        -166 (-8.23% of base) : Bytemark\Bytemark\Bytemark.dasm - AssignJagged:second_assignments(ref,ref)
        -155 (-15.96% of base) : SciMark\SciMark\SciMark.dasm - kernel:benchSparseMult()
        -111 (-15.16% of base) : Bytemark\Bytemark\Bytemark.dasm - IDEAEncryption:Run():double:this
        -102 (-23.34% of base) : Bytemark\Bytemark\Bytemark.dasm - ByteMark:bench_with_confidence(int,byref,byref,byref):int
         -82 (-8.86% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloatClass:DivideInternalFPF(InternalFPF,InternalFPF,InternalFPF)
         -70 (-8.85% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloatClass:MultiplyInternalFPF(InternalFPF,InternalFPF,InternalFPF)
         -69 (-7.64% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloat:DivideInternalFPF(byref,byref,byref)
         -69 (-12.55% of base) : BenchmarksGame\spectralnorm\spectralnorm-3\spectralnorm-3.dasm - SpectralNorm_3:Bench(int):double
         -63 (-27.63% of base) : Benchstones\BenchI\QuickSort\QuickSort\QuickSort.dasm - QuickSort:Bench():bool
         -60 (-8.65% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - VectorHelper:Create(Func`2):Vector`1 (6 methods)
         -47 (-6.10% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloat:MultiplyInternalFPF(byref,byref,byref)
         -35 (-4.23% of base) : SIMD\RayTracer\RayTracer\RayTracer.dasm - Color:ChangeHue(float):this
         -26 (-23.64% of base) : Benchstones\BenchI\IniArray\IniArray\IniArray.dasm - IniArray:Bench():bool
         -21 (-7.84% of base) : Benchstones\BenchI\BubbleSort2\BubbleSort2\BubbleSort2.dasm - BubbleSort2:Bench():bool
         -15 (-8.88% of base) : Benchstones\BenchI\Array1\Array1\Array1.dasm - Array1:Bench():bool
          -3 (-0.47% of base) : Benchstones\BenchF\MatInv4\MatInv4\MatInv4.dasm - MatInv4:Bench():bool
          -2 (-0.27% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloatClass:DoEmFloatIteration(ref,ref,ref,int,int):long
          -2 (-0.09% of base) : Benchstones\BenchF\Whetsto\Whetsto\Whetsto.dasm - Whetsto:Bench():bool
          -1 (-0.21% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloat:DoEmFloatIteration(ref,ref,ref,int,int):long
Top method improvements (percentages):
         -63 (-27.63% of base) : Benchstones\BenchI\QuickSort\QuickSort\QuickSort.dasm - QuickSort:Bench():bool
         -26 (-23.64% of base) : Benchstones\BenchI\IniArray\IniArray\IniArray.dasm - IniArray:Bench():bool
        -102 (-23.34% of base) : Bytemark\Bytemark\Bytemark.dasm - ByteMark:bench_with_confidence(int,byref,byref,byref):int
        -155 (-15.96% of base) : SciMark\SciMark\SciMark.dasm - kernel:benchSparseMult()
        -111 (-15.16% of base) : Bytemark\Bytemark\Bytemark.dasm - IDEAEncryption:Run():double:this
         -69 (-12.55% of base) : BenchmarksGame\spectralnorm\spectralnorm-3\spectralnorm-3.dasm - SpectralNorm_3:Bench(int):double
        -263 (-12.11% of base) : Bytemark\Bytemark\Bytemark.dasm - AssignRect:second_assignments(ref,ref)
         -15 (-8.88% of base) : Benchstones\BenchI\Array1\Array1\Array1.dasm - Array1:Bench():bool
         -82 (-8.86% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloatClass:DivideInternalFPF(InternalFPF,InternalFPF,InternalFPF)
         -70 (-8.85% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloatClass:MultiplyInternalFPF(InternalFPF,InternalFPF,InternalFPF)
         -60 (-8.65% of base) : SIMD\ConsoleMandel\ConsoleMandel\ConsoleMandel.dasm - VectorHelper:Create(Func`2):Vector`1 (6 methods)
        -166 (-8.23% of base) : Bytemark\Bytemark\Bytemark.dasm - AssignJagged:second_assignments(ref,ref)
         -21 (-7.84% of base) : Benchstones\BenchI\BubbleSort2\BubbleSort2\BubbleSort2.dasm - BubbleSort2:Bench():bool
         -69 (-7.64% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloat:DivideInternalFPF(byref,byref,byref)
         -47 (-6.10% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloat:MultiplyInternalFPF(byref,byref,byref)
         -35 (-4.23% of base) : SIMD\RayTracer\RayTracer\RayTracer.dasm - Color:ChangeHue(float):this
          -3 (-0.47% of base) : Benchstones\BenchF\MatInv4\MatInv4\MatInv4.dasm - MatInv4:Bench():bool
          -2 (-0.27% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloatClass:DoEmFloatIteration(ref,ref,ref,int,int):long
          -1 (-0.21% of base) : Bytemark\Bytemark\Bytemark.dasm - EMFloat:DoEmFloatIteration(ref,ref,ref,int,int):long
          -2 (-0.09% of base) : Benchstones\BenchF\Whetsto\Whetsto\Whetsto.dasm - Whetsto:Bench():bool
20 total methods with Code Size differences (20 improved, 0 regressed), 1873 unchanged.

@erozenfeld
Copy link
Member Author

x64 pmi framework diffs:

PMI CodeSize Diffs for System.Private.CoreLib.dll, framework assemblies for  default jit
Summary of Code Size diffs:
(Lower is better)
Total bytes of diff: -1296 (-0.00% of base)
    diff is an improvement.
Top file regressions (bytes):
           6 : ILCompiler.Reflection.ReadyToRun.dasm (0.01% of base)
           6 : System.Data.SqlClient.dasm (0.00% of base)
           2 : System.Net.WebSockets.dasm (0.01% of base)
           2 : System.Net.WebSockets.WebSocketProtocol.dasm (0.01% of base)
           2 : System.Security.Cryptography.Algorithms.dasm (0.00% of base)
           2 : System.Security.Cryptography.Cng.dasm (0.00% of base)
Top file improvements (bytes):
        -508 : System.Private.DataContractSerialization.dasm (-0.07% of base)
        -322 : Microsoft.CodeAnalysis.CSharp.dasm (-0.01% of base)
        -197 : System.DirectoryServices.dasm (-0.04% of base)
         -77 : System.Data.OleDb.dasm (-0.03% of base)
         -54 : System.Runtime.Numerics.dasm (-0.08% of base)
         -42 : System.Text.Encoding.CodePages.dasm (-0.06% of base)
         -34 : System.Private.CoreLib.dasm (-0.00% of base)
         -26 : System.Data.Odbc.dasm (-0.01% of base)
         -16 : Newtonsoft.Json.dasm (-0.00% of base)
         -15 : System.Private.Xml.dasm (-0.00% of base)
         -14 : Microsoft.CodeAnalysis.VisualBasic.dasm (-0.00% of base)
          -6 : System.Data.Common.dasm (-0.00% of base)
          -4 : System.Security.Cryptography.Xml.dasm (-0.00% of base)
          -1 : Microsoft.Diagnostics.Tracing.TraceEvent.dasm (-0.00% of base)
20 total files with Code Size differences (14 improved, 6 regressed), 145 unchanged.
Top method regressions (bytes):
          54 ( 2.64% of base) : System.Data.Common.dasm - SqlDecimal:op_Multiply(SqlDecimal,SqlDecimal):SqlDecimal
           6 ( 0.33% of base) : ILCompiler.Reflection.ReadyToRun.dasm - UnwindInfo:.ctor(ref,int):this (4 methods)
           2 ( 0.18% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - SymbolReader:CopyStreamToFile(Stream,String,String,byref):int:this
           2 ( 0.15% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - EventPipeEventSource:.ctor(PinnedStreamReader,String):this
           2 ( 0.63% of base) : System.Data.Odbc.dasm - OdbcConnection:GetConnectAttr(int,int):int:this
           2 ( 1.19% of base) : System.Data.SqlClient.dasm - UShortNormalizer:DeNormalize(FieldInfo,Object,Stream):this
           2 ( 1.20% of base) : System.Data.SqlClient.dasm - UIntNormalizer:DeNormalize(FieldInfo,Object,Stream):this
           2 ( 1.19% of base) : System.Data.SqlClient.dasm - ULongNormalizer:DeNormalize(FieldInfo,Object,Stream):this
           2 ( 0.28% of base) : System.Net.WebSockets.dasm - ManagedWebSocket:.ctor(Stream,bool,String,TimeSpan):this
           2 ( 0.28% of base) : System.Net.WebSockets.WebSocketProtocol.dasm - ManagedWebSocket:.ctor(Stream,bool,String,TimeSpan):this
           2 ( 1.40% of base) : System.Private.CoreLib.dasm - TextReader:ReadToEnd():String:this
           2 ( 0.34% of base) : System.Private.Xml.dasm - XmlPreloadedResolver:Add(Uri,Stream):this
           2 ( 0.63% of base) : System.Security.Cryptography.Algorithms.dasm - CngCommon:HashData(Stream,HashAlgorithmName):ref
           2 ( 0.63% of base) : System.Security.Cryptography.Cng.dasm - CngCommon:HashData(Stream,HashAlgorithmName):ref
Top method improvements (bytes):
        -416 (-30.06% of base) : System.Private.DataContractSerialization.dasm - DataContract:ComputeHash(ref):ref
        -322 (-6.73% of base) : Microsoft.CodeAnalysis.CSharp.dasm - MethodBodySynthesizer:ConstructFieldLikeEventAccessorBody_Regular(SourceEventSymbol,bool,CSharpCompilation,DiagnosticBag):BoundBlock
        -197 (-34.74% of base) : System.DirectoryServices.dasm - TrustHelper:CreateTrustPassword():String
         -77 (-2.04% of base) : System.Data.OleDb.dasm - OleDbMetaDataFactory:GetDataTypesTable(OleDbConnection):DataTable:this
         -42 (-4.38% of base) : System.Text.Encoding.CodePages.dasm - SBCSCodePageEncoding:ReadBestFitTable():this
         -32 (-5.42% of base) : System.Runtime.Numerics.dasm - BigInteger:GreatestCommonDivisor(BigInteger,BigInteger):BigInteger
         -28 (-1.69% of base) : System.Private.DataContractSerialization.dasm - SchemaExporter:ExportCollectionDataContract(CollectionDataContract,XmlSchema):this
         -24 (-1.26% of base) : System.Data.Odbc.dasm - OdbcDataReader:RetrieveKeyInfoFromStatistics(QualifiedTableName,bool):int:this
         -21 (-1.40% of base) : System.Private.DataContractSerialization.dasm - SchemaExporter:ExportClassDataContract(ClassDataContract,XmlSchema):this
         -17 (-0.39% of base) : System.Data.Common.dasm - DbDataAdapter:Update(ref,DataTableMapping):int:this
         -16 (-1.27% of base) : Newtonsoft.Json.dasm - ConvertUtils:ToBigInteger(Object):BigInteger
         -16 (-5.69% of base) : System.Data.Common.dasm - SqlDecimal:DivByULong(int):int:this
         -16 (-1.45% of base) : System.Data.Common.dasm - BigIntegerStorage:ConvertToBigInteger(Object,IFormatProvider):BigInteger
         -16 (-7.48% of base) : System.Private.CoreLib.dasm - TimerQueue:CreateTimerQueues():ref (2 methods)
         -16 (-2.41% of base) : System.Private.DataContractSerialization.dasm - SchemaExporter:ExportXmlDataContract(XmlDataContract):this
         -12 (-0.32% of base) : Microsoft.CodeAnalysis.VisualBasic.dasm - Parser:ParseXmlDeclaration():XmlDeclarationSyntax:this
         -12 (-1.24% of base) : System.Private.CoreLib.dasm - PerCoreLockedStacks:.ctor():this (7 methods)
         -12 (-1.58% of base) : System.Private.DataContractSerialization.dasm - SchemaExporter:ExportISerializableDataContract(ClassDataContract,XmlSchema):this
         -12 (-7.59% of base) : System.Runtime.Numerics.dasm - BigInteger:op_Implicit(int):BigInteger (2 methods)
         -11 (-0.90% of base) : System.Private.DataContractSerialization.dasm - SchemaExporter:ExportEnumDataContract(EnumDataContract,XmlSchema):this
Top method regressions (percentages):
          54 ( 2.64% of base) : System.Data.Common.dasm - SqlDecimal:op_Multiply(SqlDecimal,SqlDecimal):SqlDecimal
           2 ( 1.40% of base) : System.Private.CoreLib.dasm - TextReader:ReadToEnd():String:this
           2 ( 1.20% of base) : System.Data.SqlClient.dasm - UIntNormalizer:DeNormalize(FieldInfo,Object,Stream):this
           2 ( 1.19% of base) : System.Data.SqlClient.dasm - UShortNormalizer:DeNormalize(FieldInfo,Object,Stream):this
           2 ( 1.19% of base) : System.Data.SqlClient.dasm - ULongNormalizer:DeNormalize(FieldInfo,Object,Stream):this
           2 ( 0.63% of base) : System.Data.Odbc.dasm - OdbcConnection:GetConnectAttr(int,int):int:this
           2 ( 0.63% of base) : System.Security.Cryptography.Algorithms.dasm - CngCommon:HashData(Stream,HashAlgorithmName):ref
           2 ( 0.63% of base) : System.Security.Cryptography.Cng.dasm - CngCommon:HashData(Stream,HashAlgorithmName):ref
           2 ( 0.34% of base) : System.Private.Xml.dasm - XmlPreloadedResolver:Add(Uri,Stream):this
           6 ( 0.33% of base) : ILCompiler.Reflection.ReadyToRun.dasm - UnwindInfo:.ctor(ref,int):this (4 methods)
           2 ( 0.28% of base) : System.Net.WebSockets.dasm - ManagedWebSocket:.ctor(Stream,bool,String,TimeSpan):this
           2 ( 0.28% of base) : System.Net.WebSockets.WebSocketProtocol.dasm - ManagedWebSocket:.ctor(Stream,bool,String,TimeSpan):this
           2 ( 0.18% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - SymbolReader:CopyStreamToFile(Stream,String,String,byref):int:this
           2 ( 0.15% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - EventPipeEventSource:.ctor(PinnedStreamReader,String):this
Top method improvements (percentages):
        -197 (-34.74% of base) : System.DirectoryServices.dasm - TrustHelper:CreateTrustPassword():String
        -416 (-30.06% of base) : System.Private.DataContractSerialization.dasm - DataContract:ComputeHash(ref):ref
          -8 (-8.25% of base) : System.Private.CoreLib.dasm - AsyncTaskCache:CreateInt32Tasks():ref
         -12 (-7.59% of base) : System.Runtime.Numerics.dasm - BigInteger:op_Implicit(int):BigInteger (2 methods)
         -16 (-7.48% of base) : System.Private.CoreLib.dasm - TimerQueue:CreateTimerQueues():ref (2 methods)
        -322 (-6.73% of base) : Microsoft.CodeAnalysis.CSharp.dasm - MethodBodySynthesizer:ConstructFieldLikeEventAccessorBody_Regular(SourceEventSymbol,bool,CSharpCompilation,DiagnosticBag):BoundBlock
         -16 (-5.69% of base) : System.Data.Common.dasm - SqlDecimal:DivByULong(int):int:this
         -32 (-5.42% of base) : System.Runtime.Numerics.dasm - BigInteger:GreatestCommonDivisor(BigInteger,BigInteger):BigInteger
         -42 (-4.38% of base) : System.Text.Encoding.CodePages.dasm - SBCSCodePageEncoding:ReadBestFitTable():this
          -5 (-4.24% of base) : Microsoft.Diagnostics.Tracing.TraceEvent.dasm - StreamUtilities:CopyStream(Stream,Stream):int
         -10 (-2.99% of base) : System.Runtime.Numerics.dasm - BigInteger:GreatestCommonDivisor(ref,ref):BigInteger
          -4 (-2.94% of base) : System.Private.Xml.dasm - Base64Decoder:ConstructMapBase64():ref
          -9 (-2.69% of base) : System.Data.Common.dasm - DbDataReader:GetStream(int):Stream:this
         -16 (-2.41% of base) : System.Private.DataContractSerialization.dasm - SchemaExporter:ExportXmlDataContract(XmlDataContract):this
         -77 (-2.04% of base) : System.Data.OleDb.dasm - OleDbMetaDataFactory:GetDataTypesTable(OleDbConnection):DataTable:this
         -28 (-1.69% of base) : System.Private.DataContractSerialization.dasm - SchemaExporter:ExportCollectionDataContract(CollectionDataContract,XmlSchema):this
         -12 (-1.58% of base) : System.Private.DataContractSerialization.dasm - SchemaExporter:ExportISerializableDataContract(ClassDataContract,XmlSchema):this
          -4 (-1.47% of base) : System.Private.DataContractSerialization.dasm - NamespaceManager:get_XmlNamespace():Namespace
         -16 (-1.45% of base) : System.Data.Common.dasm - BigIntegerStorage:ConvertToBigInteger(Object,IFormatProvider):BigInteger
         -21 (-1.40% of base) : System.Private.DataContractSerialization.dasm - SchemaExporter:ExportClassDataContract(ClassDataContract,XmlSchema):this
49 total methods with Code Size differences (35 improved, 14 regressed), 244509 unchanged.

@erozenfeld
Copy link
Member Author

Regressions in all methods except SqlDecimal:op_Multiply(SqlDecimal,SqlDecimal):SqlDecimal are because of a larger encoding for some instructions with consts. For example,
cmp eax, dword ptr [r14+8] is 4 bytes
while
cmp eax, 0x400 is 5 bytes.

In SqlDecimal:op_Multiply(SqlDecimal,SqlDecimal):SqlDecimal we do a transformation like this

-mov      rax, gword ptr [rsp+38H]
-cmp      dword ptr [rax+8], 4
-setge    r9b
-movzx    r9, r9b
-test     r8d, r9d
+test     r8b, 1

The successor basic blocks need gword ptr [rsp+38H] so they add the moves since the parent basic block no longer has it. The regression is exacerbated by instances of a register allocator issue https://github.com/dotnet/runtime/blob/master/docs/design/features/lsra-detail.md#avoid-spill-when-stack-copy-is-valid where we end up with redundant stores like this one:

G_M51661_IG40:
       movsxd   r8, ecx
       xor      r9d, r9d
       mov      rax, gword ptr [rsp+38H]
       mov      dword ptr [rax+4*r8+16], r9d
       inc      ecx
       cmp      ecx, 4
       mov      gword ptr [rsp+38H], rax
       jl       SHORT G_M51661_IG40

@erozenfeld
Copy link
Member Author

In good diffs sometimes we can avoid loop cloning because after we've replaced ARR_LENGTH with a const we know that array accesses are in bounds. Here is an example:

; Assembly listing for method IniArray:Bench():bool
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows
; optimized code
; rsp based frame
; fully interruptible
; Final local variable assignments
;
-;  V00 loc0         [V00,T01] (  5, 38   )     ref  ->  rax         class-hnd exact
+;  V00 loc0         [V00,T01] (  3, 18   )     ref  ->  rax         class-hnd exact
;   V01 loc1         [V01,T02] (  4, 13   )     int  ->  rdx        
-;  V02 loc2         [V02,T00] (  9,132   )     int  ->  rcx        
+;  V02 loc2         [V02,T00] (  5, 68   )     int  ->  rcx        
;   V03 OutArgs      [V03    ] (  1,  1   )  lclBlk (32) [rsp+0x00]   "OutgoingArgSpace"
;
; Lcl frame size = 40

G_M27841_IG01:
       sub      rsp, 40						
G_M27841_IG02:
       mov      rcx, 0xD1FFAB1E
       mov      edx, 16
       call     CORINFO_HELP_NEWARR_1_VC
       xor      edx, edx
G_M27841_IG03:
       xor      ecx, ecx
-      cmp      dword ptr [rax+8], 16
-      jl       SHORT G_M27841_IG06				
G_M27841_IG04:
       movsxd   r8, ecx
       mov      word  ptr [rax+2*r8+16], 32
       inc      ecx
       cmp      ecx, 16
       jl       SHORT G_M27841_IG04
G_M27841_IG05:
-      jmp      SHORT G_M27841_IG07
+      inc      edx
+      cmp      edx, 0xD1FFAB1E
+      jl       SHORT G_M27841_IG03
G_M27841_IG06:
-      movsxd   r8, ecx
-      mov      word  ptr [rax+2*r8+16], 32
-      inc      ecx
-      cmp      ecx, 16
-      jl       SHORT G_M27841_IG06
+      mov      rcx, 0xD1FFAB1E
+      mov      rdx, rax
+      call     CORINFO_HELP_CHECKED_ASSIGN_REF
+      mov      eax, 1
G_M27841_IG07:
-      inc      edx
-      cmp      edx, 0xD1FFAB1E
-      jl       SHORT G_M27841_IG03
-G_M27841_IG08:
-      mov      rcx, 0xD1FFAB1E
-      mov      rdx, rax
-      call     CORINFO_HELP_CHECKED_ASSIGN_REF
-      mov      eax, 1
-G_M27841_IG09:
       add      rsp, 40
       ret      

-; Total bytes of code 110, prolog size 4, PerfScore 131.00, (MethodHash=1c8e933f) for method IniArray:Bench():bool
+; Total bytes of code 84, prolog size 4, PerfScore 64.40, (MethodHash=1c8e933f) for method IniArray:Bench():bool

@erozenfeld
Copy link
Member Author

@dotnet/jit-contrib @AndyAyersMS PTAL

Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall, just one question...

@@ -18,18 +18,26 @@

bool Compiler::optDoEarlyPropForFunc()
{
#if DEBUG
return true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worry a little about this. I take it you're relying on the fact that any missing flag in debug/check build will lead to assert, so the fact that debug/check potentially diverges from release is ok.

But if the flag checking somehow misses a case....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a change where I added asserts right before we do transformations to make sure we'll get there in release as well. Does that address your concern?

I also fixed one more place where we were losing basic block flags. That was causing an assert in arm64 test run.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed one more change. I noticed that BBF_HAS_VTABREF was missing in BBF_COMPACT_UPD and BBF_SPLIT_GAINED so I added it. This wasn't caught by asserts and doesn't produce any new diffs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The run before the last commit was green. I think this is ready for another review. @AndyAyersMS PTAL.

Add asserts to make sure we don't have diverging codegen
in debug/checked/release because of early opts.
@sandreenko sandreenko added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 27, 2020
Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest version looks good.

Copy link
Contributor

@briansull briansull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@erozenfeld erozenfeld merged commit fad9312 into dotnet:master Jan 27, 2020
@dotnet dotnet locked as resolved and limited conversation to collaborators Dec 11, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants