2021-05-26
04:34 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1106', diff saved to https://phabricator.wikimedia.org/P16211 and previous config saved to /var/cache/conftool/dbconfig/20210526-043439-marostegui.json [production]
04:34 <marostegui@cumin1001> dbctl commit (dc=all): 'db1160 (re)pooling @ 25%: Repool db1160', diff saved to https://phabricator.wikimedia.org/P16210 and previous config saved to /var/cache/conftool/dbconfig/20210526-043424-root.json [production]
03:29 <eileen> process-control config revision is 7b646533da [production]
02:43 <wm-bot> <legoktm> Shutdown freenode version [tools.wikibugs]
02:06 <wm-bot> <bd808> Shutdown freenode bot [tools.jouncebot]
02:05 <wm-bot> <bd808> Shutdown freenode bot [tools.stashbot]
01:58 <wm-bot> <bd808> Disabled all freenode connections [tools.bridgebot]
00:47 <eileen> civicrm revision changed from 584b96452a to eac772e9c9, config revision is 2ca92c3c3c [production]
00:27 <mutante> phab2001 - restarted apache2 [production]
2021-05-25
23:09 <razzi@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) [production]
22:39 <razzi@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters [production]
22:21 <razzi@cumin1001> END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) [production]
22:21 <razzi@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters [production]
22:21 <razzi@cumin1001> END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) [production]
22:21 <razzi@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters [production]
22:04 <razzi@cumin1001> END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) [production]
22:04 <razzi@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters [production]
21:58 <razzi@cumin1001> END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) [production]
21:58 <razzi@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters [production]
21:13 <razzi@cumin1001> END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) [production]
21:13 <razzi@cumin1001> START - Cookbook sre.hadoop.roll-restart-masters [production]
21:13 <razzi@cumin1001> END (ERROR) - Cookbook sre.hadoop.roll-restart-workers (exit_code=97) [production]
21:13 <razzi@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers [production]
20:40 <razzi@cumin1001> END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) [production]
20:28 <razzi@cumin1001> START - Cookbook sre.hadoop.roll-restart-workers [production]
20:00 <twentyafterfour@deploy1002> rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.7 [production]
19:20 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
19:17 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
19:17 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
19:12 <twentyafterfour@deploy1002> Finished scap: testwikis wikis to 1.37.0-wmf.7 (duration: 33m 29s) [production]
19:12 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
18:38 <twentyafterfour@deploy1002> Started scap: testwikis wikis to 1.37.0-wmf.7 [production]
18:16 <razzi> sudo systemctl start all failed units from `systemctl list-units --state=failed` on an-launcher1002 [analytics]
18:14 <razzi> sudo systemctl start eventlogging_to_druid_navigationtiming_hourly.service [analytics]
18:08 <krinkle@deploy1002> Synchronized wmf-config/CommonSettings.php: I2ebe9674fb109f (duration: 00m 56s) [production]
18:01 <razzi> manually edit /etc/hadoop/conf/capacity-scheduler.xml to make queues running and sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues [analytics]
17:52 <razzi> sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues on an-master1001 and an-master1002 [analytics]
17:34 <Krinkle> mwmaint1002: Running purge-parsercache-now.php on server 2/4 (pc1007, depooled spare). Ref P16060, T280605, T282761. [production]
17:30 <marostegui@cumin1001> dbctl commit (dc=all): 'db1164 (re)pooling @ 100%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16207 and previous config saved to /var/cache/conftool/dbconfig/20210525-173031-root.json [production]
17:28 <razzi> sudo systemctl restart refine_eventlogging_legacy [analytics]
17:28 <razzi> sudo -u yarn kerberos-run-command yarn yarn rmadmin -refreshQueues to enable submitting jobs once again [analytics]
17:22 <effie> disable puppet on mc2019 (for tests) [production]
17:15 <marostegui@cumin1001> dbctl commit (dc=all): 'db1164 (re)pooling @ 75%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16206 and previous config saved to /var/cache/conftool/dbconfig/20210525-171527-root.json [production]
17:14 <andrewbogott> deleting old ingress controllers toolsbeta-test-k8s-ingress-1 and toolsbeta-test-k8s-ingress-2 [toolsbeta]
17:13 <andrewbogott> created two new ingress nodes, toolsbeta-test-k8s-ingress-4 and toolsbeta-test-k8s-ingress-5 [toolsbeta]
17:07 <razzi> re-enabled puppet on an-masters and an-launcher [analytics]
17:04 <razzi> sudo -u hdfs kerberos-run-command hdfs hdfs dfsadmin -safemode leave [analytics]
17:03 <razzi> sudo -u hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet [analytics]
17:00 <marostegui@cumin1001> dbctl commit (dc=all): 'db1164 (re)pooling @ 50%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16205 and previous config saved to /var/cache/conftool/dbconfig/20210525-170024-root.json [production]
16:45 <marostegui@cumin1001> dbctl commit (dc=all): 'db1164 (re)pooling @ 25%: Repool db1164', diff saved to https://phabricator.wikimedia.org/P16203 and previous config saved to /var/cache/conftool/dbconfig/20210525-164520-root.json [production]