Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

document.all willful violation #668

Closed
annevk opened this issue Aug 15, 2016 · 73 comments · Fixed by #673
Closed

document.all willful violation #668

annevk opened this issue Aug 15, 2016 · 73 comments · Fixed by #673

Comments

@annevk
Copy link
Member

annevk commented Aug 15, 2016

Other than some event loop mismatch which I think we'll be able to resolve more easily, HTML (and JavaScript engines) also contain(s) the document.all willful violation.

https://html.spec.whatwg.org/multipage/obsolete.html#dom-document-all

How much appetite is there to try to formalize this exception?

@erights
Copy link

@annevk Of all the outstanding web reality issues, this is one of the least appetizing. But like the others, if everyone's gonna do it and we can't kill it, we're better off trying to pin down the best precise semantics we can get away with.

I remember there was a subtle but important difference between how Firefox did this vs how either or both Safari and Chrome did this. I think I remember that the Firefox way seemed like a better first step towards a cleaner semantics which is still compat with web reality. But I no longer remember the details.

Anyone? @BrendanEich , I think you were the one who explained this to me, ages ago. Does this ring a bell?

@erights
Copy link

I think I also remember that the FF way, if we had been able to agree on it in time, would have allowed us to banish the weirdness in strict mode. However, we were not, and so this was a missed opportunity for a significant cleanup.

Does any of this ring a bell? I think this was only discussed in side conversations, and so was probably never captured in the notes of the main mtg.

@domenic
Copy link
Member

I'm not aware of any semantic differences between browsers on this feature, but I could be wrong...

My main conflict is: how would we spec this? Presumably we wouldn't have ES talk about a specific object called document.all. We'd instead define some kind of generic behavior (what V8 calls an "undetectable object") and then HTML could say "document.all is an undetectable object". But you'd never ever ever want people to use "undetectable object" in other cases.

I think with diligence we'd be able to prevent people from doing so (e.g. people might want to use it to deprecate features in the future, and we can tell them not to do that). But it still strikes me as weird to create this more generic interface that we insist must only be used once by one client. The current structure of one obviously-ugly willful violation seems more explicit about being a one-off, to me.

@bterlson
Copy link
Member

bterlson commented Aug 16, 2016

I'm not convinced we want the spec machinery for document.all's exotic behavior in ECMA262 - it'd be a fair amound of machinery and wouldn't actually be used in any normative spec text (because we certainly wouldn't define document.all in 262, right?) I'm personally ok with the current level of specification here.

@allenwb
Copy link
Member

If we think it was important to document, we could do so as an standalone Ecma Technical Note

@bterlson
Copy link
Member

@allenwb do you think the formalization we would add in the technical note would be beneficial beyond what is already mentioned in HTML?

@allenwb
Copy link
Member

@bterlson probably a question best answered by browser implementors. Do they think that the definition provided in the HTML spec is inadequate for somebody who is trying to create an interoperatable implementation? Is so, are the inadequacies of the sort that technically could be best addressed by a TC39 document?

@bterlson
Copy link
Member

@annevk / @domenic: ^ ?

@domenic
Copy link
Member

There have been no implementer complaints. This is more about us being concerned that we're bad citizens by monkeypatching ToBoolean/typeof/abstract equality comparison, which is "ES's territory".

@bterlson
Copy link
Member

The grossness of the spec seems commensurate with the grossness of the desired semantics. I see no problems here :)

@annevk
Copy link
Member Author

The other thing is that by not including it here you make it harder to reason about an implementation of the language and might miss something important. And indeed you miss opportunities such as disabling the functionality in certain contexts.

@ljharb
Copy link
Member

ljharb commented Aug 17, 2016

We do have Object.prototype as an "exotic immutable prototype object" (pending a formal proposal to make that not be exotic), would we want to define an "exotic falsy object", so that the WHATWG spec could describe document.all as that? We'd then be able to forbid "exotic falsy objects" in any contexts that document.all didn't need to exist, as @annevk pointed out.

@bterlson
Copy link
Member

@ljharb that's how I'd do it, probably, but I think the current spec text in HTML is straightforward (and unlikely to be broken by spec revisions) so I'm reluctant to add this machinery. Immutable prototype objects have a use case. Crazy undefined-seeming objects don't beyond this compat hack.

@evilpie
Copy link
Contributor

evilpie commented Aug 17, 2016

So for the Firefox history see bug 792108. Basically we used to have a special way of passing along a "lookup context" (detecting flags) when calling [[Get]]. So for example when executing typeof document.all we would pass along the detecting flag and [[Get]] would return undefined in that case. Having special flags for property lookups is just totally incompatible with ES and was removed completely.

Oh and the current approach shared with all other? browsers is basically a lot simpler and just needs a check in a handful of places. (ToBoolean, typeof, ?)

@annevk
Copy link
Member Author

FWIW, despite @bterlson being okay with HTML defining some of it, I would like TC39 to own up to the entirety of the JavaScript subsystem in browsers, including this bit.

I think whenever we uplifted such warts it has led to a greater shared understanding of the computing platform we offer to the world.

@erights
Copy link

@annevk , I would like this too, but not enough to put much time into it. If you'll take the lead, be the champion, write proposals, and take them to the committee, I could probably find the time to review.

@brabalan, @charguer, @totherme, @Mbodin, @sergiomaffeis, @edgemaster
In order for any formal proof about a formal model of JavaScript to be relevant to real-world JavaScript, the differences between the formal model of JavaScript and real-world JavaScript need to be minimized and understood. As explained at tc39/proposal-ses#4 (comment) (including the containing thread), the formal model can be of something much less than the full language and still have the proofs be relevant to actual JavaScript.

The key is that the formal model states invariants that are true for the whole language. Ideally, these invariants are themselves stated normatively in the spec, so any spec detail, say in Date, that violates these invariants creates a spec inconsistency, rather than invalidating the invariant. When there's no overriding issue to the contrary, we can expect the detail to be corrected to conform to the invariant, rather than the other way around. This leaves proofs that depend on those invariants intact.

This is why document.all is more important than, for example, getting the details of Date correct. We know document.all violates what would otherwise be invariants. Because everyone implements document.all, the "invariants" it violates are no longer invariants. They are at best "invariants but for one exception". This would not be the first. === is reflexive but for NaN. To know whether such an invariant violation can be leveraged into an attack requires one of the following two approaches:

  • knowing precisely in what way the invariant is violated, so we can reason about it
  • knowing we can deny any potential attackers access to the exceptional values -- those for which the invariants are suspended.

Fortunately for document.all, denial is a viable approach. Indeed, denial is the more attractive approach when the threat model allows it. (By contrast, no conceivable threat model for JS would allow denial of NaN.) Fortunately for the JS security work I've been doing --- Caja, SES, ocaps, https://github.com/FUDCo/proposal-frozen-realms , etc --- my threat model does allow reliable denial of document.all, and so my security work need not consider these as invariant violations. As far as my work in concerned, proofs relying on "invariants" broken only by document.all are still valid. Hence my lack of energy for this particular exception.

@evilpie
Copy link
Contributor

This is basically everything evilpie@0514f06 that is implemented differently by SpiderMonkey for undetectable objects.

@bterlson
Copy link
Member

@evilpie looks mostly good but doesn't handle how to compare two undefined-like objects.

@allenwb
Copy link
Member

If TC39 does anything to specify this behavior it should not be in the main body of the spec and it shouldn't use general purpose sounding language such as “undetectable object”. Much preferable would be something like “documentAllLegacyHack object” that makes it clear that the behavior exists for only one specific purpose.

I would also still prefer that it note be in ECMA-262 at all, not even Annex B. I still like the idea of a separate Ecma technical report that not only describes the spec. delta but also explains why it exists and exactly how and why it violates invariants.

@evilpie
Copy link
Contributor

@bterlson Thanks. Actually that is handled by Step 1, they still have both have an object type. And indeed in Firefox, Chrome and Edge window.all == window.all is true.

@bterlson
Copy link
Member

In that case its the same object reference, but what if there were two of these kinds of objects in existence?

@evilpie
Copy link
Contributor

  1. There won't be. (minus the cross-iframe case)
  2. The code you proposed wouldn't actually be hit.
  3. We only really care about people doing window.all == undefined or window.all == null.
  4. Don't do that.

@erights
Copy link

@allenwb All my reasons for wanting this are served at least as well by a tech report. Altogether, I agree that the tech report is better than anywhere in the spec proper.

@bterlson
Copy link
Member

@evilpie There are multiple of these but I now see that == returns false in these cases (eg. var f = document.createElement('iframe'); document.body.appendChild(f);f.contentWindow.document.all == document.all). Will fix your point 2.

@littledan
Copy link
Member

If we write this in a spec, I'm all for writing down that it's an unfortunate legacy decision. What are the advantages to a separate technical report vs Annex B? It's already confusing as an implementer to have things separated out in Annex B, but at least it's in the same document and it's easy to link between them; using a separate technical note would just make things harder. Does anyone on this thread think that web browsers could unship document.all falsy behavior without breaking the web?

@bterlson
Copy link
Member

Depends on how big of breakage :) But I haven't checked in a few years so perhaps things have improved since we were all forced to implement this?

The advantage of a TR is it doesn't cruft up 262 and ECMA owns the spec text. However the primary motivation for moving these semantics to 262 is discoverability and better spec factoring. I don't think either of these goals are achieved with a TR. I'd argue they're regressed. Personally, I think we should either have this in 262 or leave it in HTML.

@erights
Copy link

erights commented Aug 17, 2016

@littledan says:

Does anyone on this thread think that web browsers could unship document.all falsy behavior without breaking the web?

Browsers are not the only EcmaScript platform.

Putting it into Annex B implies that it is normative-optional. I would prefer some status weaker than that, in which case even Annex B is inappropriate.

Worse, Annex B implies that it is normative-mandatory for a platform that claims to be a web browser. This is definitely too strong. We should allow this one weirdness in this one special case. But we should never require it. If any browsers want to try killing it, we should not add to their costs for that brave experiment.

@domenic
Copy link
Member

Worse, Annex B implies that it is normative-mandatory for a platform that claims to be a web browser. This is definitely too strong. We should allow this one weirdness in this one special case. But we should never require it. If any browsers want to try killing it, we should not add to their costs for that brave experiment.

I don't think the presence or absence of things in Annex B has any impact on browsers' willingness to experiment. It is certainly not a cost.

@bterlson
Copy link
Member

bterlson commented Aug 18, 2016

Removing its special behavior is also unlikely to work - the reason to make document.all undefined-like wasn't just so people don't use it, it also made sure websites took the "modern" code path.

Upon reflection I mostly agree with @erights's point above. Today, document.all's behavior is non-standard and hopefully we're agreed that HTML requiring a non-compliant ECMAScript implementation is a problem. With #673 in place, the browser would be compliant. But so too would any ECMAScript implementation that includes all manner of ULEOs. To @domenic's point, it is within our purview to say a conformant ECMAScript implementation doesn't have any of these, or only has one of these.

The starting point above doesn't seem good, though. An ECMAScript implementation that isn't a browser that none-the-less has an intrinsic named "document" should not be allowed to have a ULEO. Perhaps we could instead say something like:

ECMAScript implementations that are part of a web browser may create an undefined-like exotic object for use as the document.all property as specified in HTML clause 15.2. Undefined-like exotic objects are not allowed in any other circumstances.

@claudepache
Copy link
Contributor

Per https://www.chromestatus.com/metrics/feature/timeline/popularity/83 usage is at 5% (threshold is 0.03%) which means it's basically impossible to ever remove.

What does the “DocumentAll” feature encompass? For example, are if(document.all) tests included (which won’t break if document.all is really undefined)?

@erights
Copy link

erights commented Aug 18, 2016

@bterlson I am at least as happy with the language you propose. However, it does raise a more general issue. What does

ECMAScript implementations that are part of a web browser

actually mean? The same issue arises in trying to pin down the meaning of Annex B's normative requirements.

The ECMAScript language syntax and semantics defined in this annex are required when the ECMAScript host is a web browser.

In both cases, I would prefer a rephrasing to the effect of "an ECMAScript host which claims to be a web browser". I'm sure this has problems as well. Ideas?

@erights
Copy link

Along the lines of the Annex B text I quote, the corresponding language would be

The ECMAScript language syntax and semantics defined in this document.all exemption is allowed when the ECMAScript host is a web browser.

@bterlson
Copy link
Member

Web browsers is just an easy prose way of saying "implements the HTML specification", IMO. I think the language I propose is fully unambiguous as it points to the spec text you must be trying to implement if you want to create a ULEO and otherwise says they are disallowed (even for web browsers). I'd also be ok with a purely Annex B style language but it's not very specific because what's a web browser?

@erights
Copy link

@bterlson good. I think both this issue and Annex B should state clearly "implements the HTML specification". The alternative easy prose in each case is unclear. Clarity on this matter is crucial.

For example, a frozen realm subsystem within a web browser would seem to qualify as "ECMAScript implementations that are part of a web browser" but it must not. OTOH, it does not implement (or claim to implement) the html spec, and its ECMAScript host is not a web browser. This suggests that both the Annex B language and your suggestion above are better phrasings than "ECMAScript implementations that are part of a web browser".

@bterlson
Copy link
Member

@domenic / @annevk any thoughts on updating the Annex B language (and ultimately the language in #673) to be more precise in this manner?

@erights
Copy link

(and @allenwb )

@annevk
Copy link
Member Author

"A web browser's ECMAScript implementation" and "an implementation of the HTML Standard" both seem fine as prerequisite to me. Or maybe go for "a web reality ECMAScript implementation". 😊

@erights
Copy link

"A web browser's ECMAScript implementation" is still unclear for subsystems within a browser that do implement ECMAScript but do not implement html. "an implementation of the HTML Standard" still seems fine to me.

Thanks for the smiley face on "a web reality ECMAScript implementation" -- I won't take that one seriously until someone else does ;).

@allenwb
Copy link
Member

I still don't believe that this should be in ECMA-262 at all, but if we reach consensus that it must then a reference to the HTML spec. along the lines that @bterlson and @erights are discussing should be all we have. We should not make any technical accommodation to theECMA-262 to accommodate this feature as any accommodations could be misused to taken as license for other hosts to do similar but different crazy things.

I don't really see why we are wasting time on this. This is an HTML features involving a HTML host object and it is up to HTML to decide how to specify it. The HTML spec. explicitly calls this feature out as a deviation from ECMA-262. That's good. We can't force platforms not to deviate from ECMA-262 but we should be pleased when platforms acknowledge that they have chosen to deviate for platform specific reasons.

This isn't the only deviation among platforms. Browser CSP deviates WRT eval (including syntactic direct eval) and Function. node.js completely eliminates the syntax and semantics of the Script parse goal uses its own non-conforming top level syntax and semantics. Many ES implementations targeting embedded environments have assorted deviations. It isn't our job to try to regularize or police all deviations.

@allenwb
Copy link
Member

@littledan

Does TC39 have an articulated mission written down somewhere?

see http://ecma-international.org/memento/TC39.htm

@annevk
Copy link
Member Author

@allenwb CSP is part of the language already. Hook for that landed some time ago.

@erights
Copy link

@allenwb I am conflicted about the explicit willful violation approach.

As @annevk says, CSP is already accommodated and so does not need to admit any violations.

Pro comment:

In general, for the spec to reflect reality well enough, we should try to keep such explicit violation as small as reasonably possible, as we generally try to do. But yes, I don't see why we would need this to be zero. document.all seems like a fine candidate for the first permanently unaccommodated willful violation.

Con comment:

To mix metaphors horribly, this escape hatch could easily become an avalanche. If it does, then tc39 loses control of real-world JavaScript. Let's revisit a good example of the potential problem: __proto__. At the time we faced this, the precise details of __proto__ across browsers was a complete mess. By bringing this issue into tc39, we made the spec much uglier, but we codified a single clear stance of __proto__ that all implementations then shifted to.

If we had the "explicit willful violation" escape hatch at the time, I think it is quite likely we would have swept this one under that rug. (Holy mixed metaphors batman!) The chaos of bizarrely different precise semantics for this misfeature would have then continued, which is still much worse than what we achieved by bringing this into tc39.

On __proto__, although we made the spec uglier, we made reality much cleaner.


Having written down both sides as I see it, I still lean against the explicit willful violation approach when we can reasonably avoid it. For document.all specifically, I still think the costs of bringing into tc39 -- _if we can agree to do so as the narrowest possible exemption consistent with web reality_ -- is less than the costs of letting it escape into the chaos of "explicit willful violation".

However, I don't take this as an absolute principle. I can imagine, in desperation, agreeing that other issues fall on the other side of this line.

@claudepache
Copy link
Contributor

Any way to get data about existing reliance of document.all's strange behavior? What would break if it were no longer non-conformant?

See: https://bugs.chromium.org/p/chromium/issues/detail?id=567998

@claudepache
Copy link
Contributor

claudepache commented Aug 19, 2016

Anyone have an adoptable interim mechanism to propose, analogous to poisoning, to discourage continued use of document.all?

The best strategy I can currently think of:

  • any property access on a documentAll object and any invocation of documentAll as function would produce a sore deprecation warning in the console;
  • monitor those uses.

Importantly, that wouldn’t affect uses of such as if(document.all) or var isIE = typeof document.all != "undefined", which won’t break when/if document.all is killed.

@littledan
Copy link
Member

@allenwb I think it would be great if the ECMAScript specification were something that web browsers could ship with 100% accuracy. The current specification doesn't offer an extension point for document.all--the specification is pretty explicit that all objects are truthy. Although the group's mission does not mention web browsers, ECMAScript has always been developed with web browsers as a primary embedding target.

@erights I'm pretty convinced by your "con" argument. I like the idea of "the narrowest possible exemption", but mentioning "document.all" by name (or part of the name) seems to be going a little too far. What if you wanted to implement a web browser from scratch today, based on specifications? Would you prefer to work with specifications which are explicitly designed to disagree with each other to create local, artificial cleanliness, or ones which layer properly and explain a coherent world together?

@Jack-Works
Copy link
Member

Per https://www.chromestatus.com/metrics/feature/timeline/popularity/83 usage is at 5% (threshold is 0.03%) which means it's basically impossible to ever remove.

(And we tried neglecting it for many many years already. Didn't work.)

image

Can anyone tell me what happened in 2017? The graph is weird

@devsnek
Copy link
Member

As it says right below the graph:

on 2017-10-26 the underlying metrics were switched over to a newer collection system which is more accurate. This is also the reason for the abrupt spike around 2017-10-26.

@Jack-Works
Copy link
Member

Oh thanks... I didn't mention that

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.