2021-02-17
07:01 <marostegui> Restart db1103 (x1) primary master DONE - T273758 [production]
07:00 <marostegui> Restart db1103 (x1) primary master - T273758 [production]
06:39 <marostegui@cumin1001> dbctl commit (dc=all): 'Add db1172 to dbctl, but not pooled yet T258361', diff saved to https://phabricator.wikimedia.org/P14385 and previous config saved to /var/cache/conftool/dbconfig/20210217-063915-marostegui.json [production]
01:41 <mutante> mwdebug1001 - back on buster and pooled [production]
01:41 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mwdebug1001.eqiad.wmnet [production]
01:39 <mutante> mwdebug1001 - rebooting [production]
01:04 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1345.eqiad.wmnet [production]
01:04 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mw1351.eqiad.wmnet [production]
01:00 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mwdebug1001.eqiad.wmnet [production]
01:00 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mwdebug1001.eqiad.wmnet [production]
00:58 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1345.eqiad.wmnet [production]
00:49 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mw1351.eqiad.wmnet [production]
00:33 <mutante> mw1351 - powercycled [production]
00:27 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mwdebug1001.eqiad.wmnet [production]
00:17 <legoktm@deploy1001> Synchronized php-1.36.0-wmf.30/extensions/timeline/: Add $wgTimelineFontDirectory to be passed as GDFONTPATH (T274822) (duration: 01m 06s) [production]
00:15 <legoktm@deploy1001> Synchronized php-1.36.0-wmf.31/extensions/timeline/: Add $wgTimelineFontDirectory to be passed as GDFONTPATH (T274822) (duration: 01m 02s) [production]
00:13 <legoktm@deploy1001> Synchronized wmf-config/timeline.php: Set $wgTimelineFontDirectory (T274822) (duration: 01m 05s) [production]
00:04 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1345.eqiad.wmnet with reason: REIMAGE [production]
00:02 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1345.eqiad.wmnet with reason: REIMAGE [production]
2021-02-16
23:54 <mutante> puppetmaster1001 - puppet cert clean mwdebug1001, sign new request, initial puppet run, now on buster (T274023) [production]
23:54 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1351.eqiad.wmnet with reason: REIMAGE [production]
23:52 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mw1351.eqiad.wmnet with reason: REIMAGE [production]
23:44 <dzahn@cumin1001> conftool action : set/pooled=inactive; selector: name=mwdebug1001.eqiad.wmnet [production]
23:44 <mutante> reimaging mwdebug1001 with buster [production]
23:43 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mwdebug1001.eqiad.wmnet [production]
23:37 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwdebug1001.eqiad.wmnet with reason: OS upgrade [production]
23:37 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mwdebug1001.eqiad.wmnet with reason: OS upgrade [production]
23:09 <twentyafterfour@deploy1001> Synchronized php-1.36.0-wmf.30/includes/HookContainer/DeprecatedHooks.php: silence deprecation refs T274889 (duration: 01m 14s) [production]
22:52 <jgleeson> updated payments-wiki config to 3d1b4564a2 [production]
22:39 <gehel> restarting wdqs-updater on wdqs2001 [production]
22:35 <bstorm@cumin1001> END (FAIL) - Cookbook wmcs.wikireplicas.add_wiki (exit_code=99) [production]
22:23 <bstorm@cumin1001> START - Cookbook wmcs.wikireplicas.add_wiki [production]
22:22 <akosiaris> re-enable puppet and squid on install1003. wdqs seems to be mildly related to the outage, restart it [production]
22:09 <elukey@cumin1001> END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop analytics cluster [production]
21:45 <akosiaris> stop squid as a stopgap on install1003 and disable puppet so that it is not restarted while we figure out what wdqs updater is doing to cause issue to mediawiki [production]
20:47 <marxarelli> 1.36.0-wmf.31 rolled to group0. no new errors for wmf.31 (T271345) [production]
20:33 <dduvall@deploy1001> rebuilt and synchronized wikiversions files: group0 wikis to 1.36.0-wmf.31 [production]
20:20 <mutante> mwdebug1002 has been recreated on buster and has been repooled after scap pull - you can find a .tar.gz in your home with the contents of your home before reimaging, fingerprint at T274023#6835116 [production]
20:18 <legoktm@cumin1001> conftool action : set/pooled=yes; selector: name=mw1297.eqiad.wmnet [production]
20:18 <legoktm@cumin1001> conftool action : set/pooled=yes; selector: name=mw1290.eqiad.wmnet [production]
20:18 <legoktm@cumin1001> conftool action : set/pooled=yes; selector: name=mw1289.eqiad.wmnet [production]
20:18 <legoktm@cumin1001> conftool action : set/pooled=yes; selector: name=mw1288.eqiad.wmnet [production]
20:17 <dzahn@cumin1001> conftool action : set/pooled=yes; selector: name=mwdebug1002.eqiad.wmnet [production]
20:15 <dzahn@cumin1001> conftool action : set/pooled=no; selector: name=mwdebug1002.eqiad.wmnet [production]
20:04 <legoktm@cumin1001> conftool action : set/pooled=no; selector: name=mw1297.eqiad.wmnet [production]
20:04 <legoktm@cumin1001> conftool action : set/pooled=no; selector: name=mw1290.eqiad.wmnet [production]
20:04 <legoktm@cumin1001> conftool action : set/pooled=no; selector: name=mw1289.eqiad.wmnet [production]
20:03 <legoktm@cumin1001> conftool action : set/pooled=no; selector: name=mw1288.eqiad.wmnet [production]
19:58 <ryankemper> [WDQS] De-pooled `wdqs100[4,7]` to catch up on lag, and pooled `wdqs100[5,6]` [production]
19:09 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on mwdebug1002.eqiad.wmnet with reason: OS upgrade [production]