Page MenuHomePhabricator

Clean up implementation for "follow" cases
Closed, InvalidPublic

Description

While rewriting the Cite extension, we tiptoed around two edge cases for the follow feature (see manual). This is something used on Wikisource when splitting long footnotes broken across pages in the published document. The PHP implementation of that feature causes considerable technical debt in Cite, this task's scope is to cleaned up most of the debt, or documented a path to fixing in the future. Don't change behavior if possible, but if there are changes please announce them as part of this task.

Existing logic:

  • In normal usage, <ref name="first">First text</ref> will precede <ref follow="first">Second text</ref>, producing a footnote marker for only the first reference, and the text "First text Second text" in the references section.
  • If the two refs are swapped so that follow comes before the "main" ref, the output will be broken. We won't show a footnote mark in the article body, but the follow ref's text will appear at the top of the references section, without any red error indicators. This is different than how any other erroring ref is handled, it seems justifiable to at least paint as an error, or make fully consistent with the other error types.
  • There are a ton of smaller edge cases, which may or may not be worth listing here.
Decision

After the in-depth investigation in T240858#5777804 we believe the current behavior was never a formal requirement, but an arbitrary fallback the original developer came up with, but a fallback that's not even relevant in production. We decided to consider the behavior a bug and fix it. The misplaced follow will not be displayed any more on top of the references list, but instead show an error in-place.

TechNews

Suggested wording for the User-notice:

In 2010, a <ref follow="…"> syntax was introduced. It allows to merge footnotes that follow each other. It's meant to be used for digitized books on Wikisource. If the order of such consecutive references is wrong, no error was reported but the bad <ref> shown outside of the <references> list. This will change and a red error message be shown instead.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

@thiemowmde Is there a screenshot of what the error message would look like? That would be handy for the announcements.
Thanks!

@JStrodt_WMDE: Here is a screenshot of the message that now appears in the text, at the position of the broken <ref>:

Screenshot from 2020-01-28 12-34-36.png (88×1 px, 15 KB)

This was published in https://meta.wikimedia.org/wiki/Tech/News/2020/03 by the way, if that wasn't obvious from the moving the ticket to different columns on the User-notice board.

Change 559671 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Standardize "follow" validation

https://gerrit.wikimedia.org/r/559671

Change 559672 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Remove "follow" special case from ReferencesFormatter

https://gerrit.wikimedia.org/r/559672

Change 559673 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Remove broken "follow" special case from ReferenceStack

https://gerrit.wikimedia.org/r/559673

Change 569611 had a related patch set uploaded (by C. Scott Ananian; owner: C. Scott Ananian):
[mediawiki/services/parsoid@master] Sync parserTests with Cite, including new blacklist entries

https://gerrit.wikimedia.org/r/569611

Change 569611 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Sync parserTests with Cite, including new blacklist entries

https://gerrit.wikimedia.org/r/569611

awight moved this task from Demo to Done on the WMDE-QWERTY-Sprint-2020-01-21 board.

Is the error message hidden in Page namespace?

The whole point of the "follow" functionality is to be used in Page namespace when the first named part of the reference is on a different page. Therefore, it would be disadvantageous to display an error message in Page namespace when the "follow" functionality is being used as designed.

Ok, I was hoping to get clarification on actual behaviour before drawing any conclusions, but since this is scheduled to go out today in the ongoing deploy of MediaWiki 1.35/wmf.18…

Based on the available information this change will break the primary workflow on Wikisource for a significant subset of cases. The error message is bad enough, but it appears the change will also actually suppress the display of the reference fragment, making it impossible to accurately proofread (transcribe) it.

In other words, this change needs to be pulled from the current deploy of MediaWiki 1.35/wmf.18, or, if that is too late, there needs to be an out-of-cycle rollback of this change!

I'm super confused right now. Why do you think this change will break something? If the follow attribute is used as it is meant to be used, nothing will change. We don't expect any change on any of the Wikisource pages.

The only change will happen in cases where the follow attribute was already broken. Example:

<ref follow="a">This is broken</ref>
<ref name="a">This is fine</ref>
== References ==
<references />

In this example, the first <ref> is broken because it tries to follow a <ref> named "a" that does not exist in the text before. Previously, such an erroneous wikitext caused the broken <ref> to be misplaced between the "References" headline and the list of other references. With the change we did, this will display an actual error message instead.

I think the confusion comes from the message left at enWS Scriptorium by Max Klemm:

For the follow-attribute to work it is important that the name of the reference is defined in the text before the follow attribute is called. Unfortunately, if this is not the case, the follow reference is shown on top of the reference list, with no footnote marker and unattached to the predecessor which it's supposed to follow. In the past, there was no error message for this edge case.

This certainly sounds like it will break enWS usage. Imagine two pages, one with the named ref and one with a follow-ref:

Page 1:

Some text on page 1<ref name="p1_ref">A footnote, but it's really long and so</ref>

Page 2:

More text on page 2
<ref follow="p1_ref">it continues onto page 2, even though there is no superscript reference present on this page</ref>

If it is indeed " it is important that the name of the reference is defined in the text before the follow attribute is called", it sounds like page 2 will break, as there is no <ref name="p1_ref"> on page 2. The two parts only marry up into a complete set when page 1 and page 2 are transcluded together (usually into mainspace).

If this workflow will not break, then it's just a miscommunication of some kind. If it will break pages like page 2, it's also going to break thousands of Wikisource pages.

This is not a contrived example, it's exactly how the follow attribute has been used for almost a decade (or longer). Here's an example from 2011:

Page 1: https://en.wikisource.org/wiki/Page:Seventeen_lectures_on_the_study_of_medieval_and_modern_history_and_kindred_subjects.djvu/177
Page 2: https://en.wikisource.org/wiki/Page:Seventeen_lectures_on_the_study_of_medieval_and_modern_history_and_kindred_subjects.djvu/178

I'm super confused right now.

No worries; so are we! :)

Why do you think this change will break something?

Well, we're operating on second-hand information here, but based on your description we foresee problems in this scenario:

Page:The Bible/123: contains <ref name=a>
Page:The Bible/124: contains <ref follow=a>

Since the wikipage at "The Bible/124" in the Page: namespace does not contain a <ref name=a>, the <ref follow=a> there will not display the reference contents and instead display the error message.

This is the normal and intended way to use <ref follow=a> (its primary use case) and affects every wiki using the ProofreadPage extension.

Key information: the Wikisourcen (minus deWS, I'm told) transcribe books page by page in separate subpages in the Page: namespace, and then transclude them together in mainspace for presentation after the fact.

Change 570279 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] Revert "Remove broken "follow" special case from ReferenceStack"

https://gerrit.wikimedia.org/r/570279

Change 570280 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] Revert "Remove "follow" special case from ReferencesFormatter"

https://gerrit.wikimedia.org/r/570280

Change 570281 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] Revert "Standardize "follow" validation"

https://gerrit.wikimedia.org/r/570281

Change 570283 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@master] Revert "follow" standardization (backport)

https://gerrit.wikimedia.org/r/570283

Thanks for the timely and helpful notes about the Wikisource "follow" use case! I'm making sure our changes will not be deployed tonight, and hopefully we can come back around to find a correct implementation in the next month.

Change 570285 had a related patch set uploaded (by Awight; owner: Awight):
[mediawiki/extensions/Cite@wmf/1.35.0-wmf.18] Revert "follow" standardization (backport)

https://gerrit.wikimedia.org/r/570285

Change 570283 abandoned by Awight:
Revert "follow" standardization (backport)

Reason:
wrong branch

https://gerrit.wikimedia.org/r/570283

Change 570285 merged by jenkins-bot:
[mediawiki/extensions/Cite@wmf/1.35.0-wmf.18] Revert "follow" standardization (backport)

https://gerrit.wikimedia.org/r/570285

Mentioned in SAL (#wikimedia-operations) [2020-02-05T12:32:41Z] <awight@deploy1001> Synchronized php-1.35.0-wmf.18/extensions/Cite: SWAT: [[gerrit:570285|Revert follow standardization (T240858)]] (duration: 01m 13s)

awight changed the point value for this task from 8 to 3.Feb 5 2020, 12:46 PM

Our changes are safely backed out of MediaWiki ahead of tonight's train deployment, so Wikisource should not experience any glitches.

I'm leaving this task open because we didn't accomplish the stated goal. The remaining work is to reapply the cleanup patches, this time with a fix to cause the intended behavior when the first half of a follow reference is missing. We also have the choice to either leave the feature exactly as it has been, or make improvements such as prepending a snippet like "(continued)" to these lonely follow refs. (My thought is that it would be helpful in diagnosing legitimately broken refs, and a nice cue when proof reading.) I'll default to not making any behavior changes unless I hear otherwise.

Change 570335 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/Cite@master] Add two extreme follow edge cases back to parser tests

https://gerrit.wikimedia.org/r/570335

Change 570279 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Revert "Remove broken "follow" special case from ReferenceStack"

https://gerrit.wikimedia.org/r/570279

Change 570280 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Revert "Remove "follow" special case from ReferencesFormatter"

https://gerrit.wikimedia.org/r/570280

Change 570281 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Revert "Standardize "follow" validation"

https://gerrit.wikimedia.org/r/570281

An attempt at some usage / requirements type notes… (and pinging @Tpt as they may wish to be aware of this task)

  • In the NS_PROOFREAD_PAGE namespace (should be ns:250 everywhere in WMF-land; prefix "Page:" on enWS), instances of <ref follow="foo">…</ref> that have no accompanying <ref name="foo">…</ref> on the same wikipage are normal and should not generate an error.
  • Such <ref follow="foo">…</ref> elements are usually simply appended to the end of the wikimarkup in the page, but can, and sometimes do, appear anywhere in the page. Since they generate no visible footnote marker, other factors dictate where users choose to place them physically in the wikimarkup. (and we have a decade's worth of accumulated weirdness over 5k+ pages just on enWS)
  • There may be more than one instance of a follow footnote on a given wikipage. <ref follow="foo">…</ref> and <ref follow="bar">…</ref> should be displayed along with the other references in a page, but formatted without the footnote marker: pretty much exactly as if they were a normal footnote but the marker and backlink were invisible. The order of display should be the same as the order in which they appear in the physical wikimarkup, and a nested ref should display after the ref it is nested within. {{#tag:ref …}} syntax is often used for this.
  • A "(continued)" or similar marker would have some utility, but should not unduly affect formatting/layout of the footnote, as that is a primary concern when users are editing these pages. It should also be fairly subtle since it will exist inside the content (vs. chrome) area of the page, and may easily be mistaken for similar markings in the original book that is being transcribed. Somewhere out there I absolutely guarantee that there is a publisher that's tagged such continued footnotes with "(continued)", which would lead to our pages showing "(continued) (continued)". Not a huge problem, but sufficiently confusing to be worth keeping in mind.
  • follow footnotes can, and relatively often do, span multiple pages. Elswhere it was mentioned a case where a single footnote spanned 6 wikipages (one name=foo and five pages with follow=foo). Even nested footnotes can span multiple pages (yes, historically, printers and publishers were insane).
  • These wikipages (in the Page: namespace) are primarily a working area for Wikisource users, and are not primarily intended for readers; except to the degree that on a wiki all pages are in principle user-facing and should be suficiently accessible to permit a passerby to correct a spelling error, or even be recruited as a long-term contributor. The main presentation for readers is in mainspace (ns:0), and the Translation: namespace (don't have the ns number handy, sorry). In these namespaces, the individual wikipages from the Page: namespace are transcluded together using the special syntax provided by the ProofreadPage extension (including Labelled Section Transclusion). In these namespaces there is extensive use of subpages, typically one subpage per chapter of a book, or similar logical or practical considerations.
  • However, in these namespaces, I cannot think of any scenario in which it is necessary or desireable to either transclude only the follow ref without also transcluding the associated name ref, or to transclude them such that the follow ref logically precedes the associated name ref. There may be edge cases where funky stuff is done in order to compensate for broken scan or similar, but I don't think I could come up with an example of that (not even a contrived one).
  • In these reader-facing namespaces, having visible error messages and/or maintenance categories that flag genuinely incorrect usage would be very valuable. If you have a ref over three pages, and the middle one has an incorrect value in its follow parameter, you would have a hard time figuring out that one giant footnote was missing its middle part while both its beginning and end was actually present.
  • A maintenance category would be good to enable gnomes to patrol and fix such issues. A visible error message would be good in order to make contributors notice an issue and to easily find where in the page it exists; but if there are a lot of existing pages that will suddenly sprout reader-visible error messages (more than can be manually fixed by a small volunteer pool in relatuvely short order) that may be infeasible. If we expect a lot of historical cruft, a two-step process starting with a maintenance category to be knocked down to zero before turning on visible error messages is an option.

Change 570361 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/Cite@master] Don't talk about follow being "broken" but "incomplete"

https://gerrit.wikimedia.org/r/570361

@Xover Thank you for the ping and the great summary for the problem.
A minor detail: ns250 is used by ProofreadPage only on the new wikis. Sadly, older wikis uses different ids for historical reasons. But for all wikis with ProofreadPage installed the canonical name of the Page: namespace is "Page".

Thank you @awight and @thiemowmde for the quick reaction!

Change 570373 had a related patch set uploaded (by Thiemo Kreuz (WMDE); owner: Thiemo Kreuz (WMDE)):
[mediawiki/extensions/Cite@master] [POC] Render incomplete follow="…" not as misplaced <p>

https://gerrit.wikimedia.org/r/570373

I don't know why, but for some reason the message is gone. It was part of the revert https://gerrit.wikimedia.org/r/570285 and should not be gone. @Raymond, do you have an explanation for this?

As a quick workaround, you can create the page https://en.wikisource.org/wiki/MediaWiki:Cite_references_no_link and paste this: <p id="$1">$2</p>

@thiemowmde Theory: the initial change removed the message, which removed it from translatewiki.net. The deploy started to Group 0 on Tuesday, at which point messages were sync'ed from translatewik. The patch was reverted on Wednsday before deploy to Group 1, but still with the translatewiki sync from Tuesday. Translatewiki was updated with the restored key, but that change won't be sync'ed back until the next scheduled deploy on Tuesday next.

I'm trying to find someone who knows how this works on IRC to figure out if that holds water and how it can be fixed (manual sync and SWAT deploy of messages maybe?)

If that's the case this affects all language Wikisourcen (we'd need to create the message locally on all of them). In addition, due to… reasons… we have no local Interface Admins on enWS so we can't actually create that message.

The message cite_references_no_link is present in https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Cite/+/wmf/1.35.0-wmf.16/i18n/en.json#46 and also https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Cite/+/wmf/1.35.0-wmf.18/i18n/en.json#47 as well as in master, unless I'm blind. So AFAICT this is a Deployments issue, not a code issue. Probably best to wait for the current train upheaval to pass (T233866).

@Billinghurst I'm told there are other ongoing issues that may make getting the right eyes on this problem difficult with any urgency (cf. Nemo's message above). But as I recall you're a global interface admin: your thoughts on requesting the global interface admins on meta to mass add this interface message on all the affected projects? Details of what's needed in Thiemo's message above. We'd need to also remove it again once whatever is causing this is fixed, which I'm not sure whether the interface admin bit is sufficient for.

Just for completeness: I've verified that the problem occurs on plWS as well as enWS.

I'm assuming it hits all projects, but I can do more testing if any is needed.

thiemowmde triaged this task as Unbreak Now! priority.Feb 7 2020, 6:46 AM

As far as I know messages are never deleted on translatewiki: https://translatewiki.net/wiki/MediaWiki:Cite_references_no_link/en. It appears like some special thing should have been done to this backport nobody was aware of. I'm raising the priority of this now, but don't know how to tag this, or who we can ping.

I don't think this is just an i18n messages problem, the source code itself is not at the expected revision. On the Special:Version page I see that Cite doesn't match wmf/1.35.0-wmf.18, it doesn't include my revert. I must have botched this SWAT deployment! I have time to look at this in ~ 2 hours.

Okay, I'm on-board with the i18n message theory, up to a point. I see the correct version of Cite code everywhere on the deployment server, in /srv/mediawiki-staging and also in /srv/mediawiki. I wasn't able to log into an app server to check there because I can't remember how to do that at the moment :-/

The state reported in Special:Version is still scary, but that page is often cached or wrong. There, it reports the correct mw-core version which should have the patched Cite extension, but the Cite version line reports an older, unpatched version.

The problem we're seeing should only be possible with the newer, correctly reverted code. i18n messages should have been synced with the "full scap" of regular train deployment.

@awight Per Nemo above (and on IRC) there is some chaos surrounding this week's deploy train, and issues related to that which are keeping the key people busy. It sounds plausible that this and those issues may be related; but I have no details on what's going on beyond the task Nemo linked above (T233866, <s>which just looks like the deployment tracker to me</s> Nope, I see it's been updated with significant and relevant breakage.).

@awight: I can try to look on app servers to confirm what you're seeing. Which patch should be present?

In T240858#5858071, @Xover wrote:

@Billinghurst I'm told there are other ongoing issues that may make getting the right eyes on this problem difficult with any urgency (cf. Nemo's message above). But as I recall you're a global interface admin: your thoughts on requesting the global interface admins on meta to mass add this interface message on all the affected projects? Details of what's needed in Thiemo's message above. We'd need to also remove it again once whatever is causing this is fixed, which I'm not sure whether the interface admin bit is sufficient for.

IMO, creating this message on all Wikisources is a good workaround. I wanted to also add that there's no urgency in removing the message once it's no longer needed, maybe that makes this easier?

I'm slowly realizing that the problem comes from how I deployed on Wednesday, and bad assumptions I made. The l10n cache files were already built on Tuesday, when the "full scap sync" was performed. During group0 -> group1 and -> group2 deployment, a different and more limited script is run, deploy-promote. This in turn runs sync-wikiversions and simply pushes new configuration for the multiversion framework.

@awight: I can try to look on app servers to confirm what you're seeing. Which patch should be present?

Hi, thanks! I think the mysteries are all wrapped up now, I see that the correct code is running, but the messages are missing due to my misunderstandings about promotions are made to groups 1 and 2.

Special:Version would be more useful if we could make it trustworthy, but that's a different issue... If you're curious, the extension-Cite code that seems to be deployed includes patch I695bf3ecc003715466c45c0d04cc023acf6ebe7c or rECIT71c2f93adcc0, however Special:Version is still reporting that Cite is at the previous branch point, 7defec4d7a4c9890564e7373b5fdf609419c5a36 .

Special:Version would be more useful if we could make it trustworthy

I see what happened here, too: the cache/gitinfo files used by Special:Version are written during full scap, but not updated by scap sync-file. No big deal, I guess.

@awight Ok, so the summary seems to be: when the next planned "full scap sync" happens on Tuesday the l10n cache CDBs will get rebuilt with the correct messages, and either on Tuesday or Wednesday—depending on in which phase the caches get pushed out—the missing message key will become available on all the Group 1 wikis and this problem will go away. Is that roughly correct?

If that's correct, and given the technical and practical challenges in getting the necessary magic done out of cycle, we can probably downgrade this from "Unbreak now!" and let the communities figure out whether and how to implement the workaround.

Great work tracking this issue down btw!

awight lowered the priority of this task from Unbreak Now! to High.Feb 7 2020, 5:28 PM
In T240858#5859958, @Xover wrote:

@awight Ok, so the summary seems to be: when the next planned "full scap sync" happens on Tuesday the l10n cache CDBs will get rebuilt with the correct messages, and either on Tuesday or Wednesday—depending on in which phase the caches get pushed out—the missing message key will become available on all the Group 1 wikis and this problem will go away. Is that roughly correct?

Thanks for the nudge, this is exactly right. However, the main MediaWiki train won't refresh these caches magically until the next version is released, and meanwhile another issue is blocking further deployment of this (wmf.18) version. So what I should do is schedule a full l10n cache refresh for next Monday.

For any wikis where an interface admin is available, please also consider the workaround provided by Thiemo:

As a quick workaround, you can create the page https://en.wikisource.org/wiki/MediaWiki:Cite_references_no_link and paste this: <p id="$1">$2</p>

@awight It seems all wikis were rolled back to wmf.16 (including the l10n cache) for other reasons, so we're now back to the status quo and <ref follow="foo"> works as normal again. As I understand it they haven't decided yet whether next week's deploy will be retrying wmf.18 or moving on to wmf.19, but from my (very limited) understanding we should not run into this particular issue again either way.

I've made some simple checks (and will do a couple more to be sure) and modulo cached pages that will need purging, everything looks fine now.

thiemowmde raised the priority of this task from High to Needs Triage.Feb 7 2020, 9:50 PM

Change 570361 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Don't talk about follow being "broken" but "incomplete"

https://gerrit.wikimedia.org/r/570361

Good news: the wmf.18 version has been deployed to en.wikisource.org, and messages were rebuilt so the follow behavior has returned to normal. We'll continue working on the original purpose of this task, to clean up tech debt without any user-facing changes. Thanks again for all the help and attention to the bugs we caused!

Change 570335 merged by jenkins-bot:
[mediawiki/extensions/Cite@master] Add two extreme follow edge cases back to parser tests

https://gerrit.wikimedia.org/r/570335

thiemowmde removed the point value for this task.

@awight, I believe you will find https://gerrit.wikimedia.org/r/570376 quite fascinating, if not funny. As it turns out one of the chunks of code that motivated this ticket here is not even needed.

@awight, I believe you will find https://gerrit.wikimedia.org/r/570376 quite fascinating, if not funny. As it turns out one of the chunks of code that motivated this ticket here is not even needed.

Not sure of the meaning of your "incomplete" follow reference, whether you mean that someone hasn't got the syntax correct, or something else.

With regard to the follow component being top-located in reflist without a number, that is completely by design. In works of the 19th century being reproduced, the "follow"ing component of a footnote has been represented by the publishers as numberless, and usually the lead. Example at https://en.wikisource.org/wiki/Page:The_life_of_Matthew_Flinders.djvu/83

That's a wonderful example, thanks a lot for that! Whats currently written in this task is based on wrong assumptions. I created a new ticket T245549 to help us track the remaining Technical-Debt.

Change 570373 abandoned by Thiemo Kreuz (WMDE):
[mediawiki/extensions/Cite@master] [POC] Render incomplete follow="…" not as misplaced <p>

Reason:

https://gerrit.wikimedia.org/r/570373