2020-04-01
08:28 <vgutierrez@cumin1001> START - Cookbook sre.hosts.downtime [production]
08:28 <vgutierrez> depool & decommission cp2017 - T249084 [production]
08:21 <vgutierrez> pool cp2039 - T248816 [production]
08:09 <marostegui> Deploy schema change on db1138 (s4 primary master) [production]
08:06 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
08:04 <vgutierrez@cumin1001> START - Cookbook sre.hosts.downtime [production]
07:13 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db1121 after schema change', diff saved to https://phabricator.wikimedia.org/P10841 and previous config saved to /var/cache/conftool/dbconfig/20200401-071339-marostegui.json [production]
07:12 <vgutierrez> pool cp2038 - T248816 [production]
06:38 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
06:38 <vgutierrez@cumin1001> START - Cookbook sre.hosts.decommission [production]
06:36 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
06:36 <vgutierrez@cumin1001> START - Cookbook sre.hosts.downtime [production]
06:36 <vgutierrez> depool & decommission cp2012 - T249080 [production]
06:24 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
06:22 <vgutierrez@cumin1001> START - Cookbook sre.hosts.downtime [production]
05:39 <marostegui> Deploy schema change on db1121 (this will create lag on s4 labs) [production]
05:38 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1121 for schema change', diff saved to https://phabricator.wikimedia.org/P10840 and previous config saved to /var/cache/conftool/dbconfig/20200401-053827-marostegui.json [production]
00:39 <reedy@deploy1001> Synchronized docroot/mediawiki.org/xml/: Update http and prot rel links to https, fix link to sitelist in MW Core (duration: 01m 06s) [production]
00:12 <reedy@deploy1001> Synchronized docroot/mediawiki.org/xml/: Add export-0.11 (duration: 01m 05s) [production]
2020-03-31
22:23 <marxarelli> group0 to 1.35.0-wmf.26 (T247773); no rise in error rates following redeployment [production]
22:18 <marxarelli> group0 to 1.35.0-wmf.26 (T247773); no rise in error rates following redeployment [production]
22:13 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.26 [production]
22:07 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: testwiki to php-1.35.0-wmf.26 (T247773) [production]
21:54 <dduvall@deploy1001> sync aborted: testwiki to php-1.35.0-wmf.26 (T247773) (duration: 07m 31s) [production]
21:47 <dduvall@deploy1001> Started scap: testwiki to php-1.35.0-wmf.26 (T247773) [production]
21:46 <jforrester@deploy1001> Synchronized php-1.35.0-wmf.26/includes/user/UserNameUtils.php: T249045 Use wfMessage in UserNameUtils::isUsable for now (duration: 00m 58s) [production]
21:05 <eileen> process-control config revision is f80d248113 - (catch up dedupe now off - fyi MBeat ) [production]
20:59 <hashar> contint1001: manually reverted /lib/systemd/system/jenkins.service [production]
20:51 <hashar> Restarting Jenkins for new CSP rules # T245658 [production]
20:26 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: rolling back 1.35.0-wmf.26 testwiki deployment following significant increase in error rate (cc T247773) [production]
20:14 <marxarelli> correction: RequestContext::getLanguage errors are for testwiki deployment, pre group0 [production]
20:08 <marxarelli> a slew of "ErrorException from line 334 of /srv/mediawiki/php-1.35.0-wmf.26/includes/context/RequestContext.php: PHP Warning: Recursion detected in RequestContext::getLanguage" after group0 deployment (cc T247773) [production]
20:04 <dduvall@deploy1001> Finished scap: testwiki to php-1.35.0-wmf.26 and rebuild l10n cache (duration: 142m 48s) [production]
19:20 <ariel@deploy1001> Finished deploy [dumps/dumps@713c297]: more filelist methods cleanup, sort prefetch possible files properly (duration: 00m 04s) [production]
19:20 <ariel@deploy1001> Started deploy [dumps/dumps@713c297]: more filelist methods cleanup, sort prefetch possible files properly [production]
18:08 <ariel@deploy1001> Finished deploy [dumps/dumps@8376c62]: bring snapshot1010 up to date (duration: 00m 05s) [production]
18:07 <ariel@deploy1001> Started deploy [dumps/dumps@8376c62]: bring snapshot1010 up to date [production]
17:42 <dduvall@deploy1001> Started scap: testwiki to php-1.35.0-wmf.26 and rebuild l10n cache [production]
17:40 <dduvall@deploy1001> Pruned MediaWiki: 1.35.0-wmf.23 (duration: 26m 51s) [production]
17:38 <elukey> restart elasticsearch_6@cloudelastic-chi-eqiad.service on cloudelastic1001 to see if it recovers from a thrashing/gc state - T231517 [production]
16:30 <marxarelli> 1.35.0-wmf.26 was branched at bec758b668aaa57fc259a1d0ecf3b35340d2661b for T247773 [production]
16:24 <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 01m 00s) [production]
16:15 <vgutierrez> pool cp2037 - T248816 [production]
15:39 <vgutierrez@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:36 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
15:35 <mutante> decom mw1254 through mw1258 (last remaining old servers in rack D5, depooled a while ago and average response time is again under 200ms) T247780 [production]
15:33 <dzahn@cumin1001> START - Cookbook sre.hosts.decommission [production]
15:29 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
15:29 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
15:28 <vgutierrez@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]