2851-2900 of 10000 results (61ms)
2022-08-22 §
00:30 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
00:25 <tstarling@deploy1002> Synchronized php-1.39.0-wmf.25/includes/objectcache/SqlBagOStuff.php: fix modtoken comparison T315271 (duration: 03m 45s) [production]
2022-08-21 §
21:12 <andrewbogott> restarted neutron-dhcp-agent on cloudnet1003. it was claiming to be unable to contact Rabbit but seems happy after a restart [admin]
14:36 <Krinkle> krinkle@mwmaint1002 foreachwikiindblist 'all - small' deleteEqualMessages.php [production]
14:33 <Krinkle> krinkle@mwmaint1002 foreachwikiindblist 'small - closed' deleteEqualMessages.php [production]
13:07 <Reedy> looks live various CI jobs (coverage etc) have been stuck for about 8.5 hours [releng]
13:00 <Reedy> Reloading Zuul to deploy https://gerrit.wikimedia.org/r/824862 [releng]
12:36 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db[1111,1127,1132].eqiad.wmnet with reason: 10.6 being 10.6 [production]
12:36 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on db[1111,1127,1132].eqiad.wmnet with reason: 10.6 being 10.6 [production]
12:30 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depool 10.6 hosts', diff saved to https://phabricator.wikimedia.org/P32649 and previous config saved to /var/cache/conftool/dbconfig/20220821-123038-ladsgroup.json [production]
12:11 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1132', diff saved to https://phabricator.wikimedia.org/P32648 and previous config saved to /var/cache/conftool/dbconfig/20220821-121140-root.json [production]
09:27 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1156 (T314041)', diff saved to https://phabricator.wikimedia.org/P32647 and previous config saved to /var/cache/conftool/dbconfig/20220821-092727-ladsgroup.json [production]
09:12 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P32646 and previous config saved to /var/cache/conftool/dbconfig/20220821-091221-ladsgroup.json [production]
08:57 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P32645 and previous config saved to /var/cache/conftool/dbconfig/20220821-085716-ladsgroup.json [production]
08:42 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1156 (T314041)', diff saved to https://phabricator.wikimedia.org/P32644 and previous config saved to /var/cache/conftool/dbconfig/20220821-084209-ladsgroup.json [production]
04:24 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1156 (T314041)', diff saved to https://phabricator.wikimedia.org/P32643 and previous config saved to /var/cache/conftool/dbconfig/20220821-042415-ladsgroup.json [production]
04:24 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
04:23 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance [production]
04:23 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance [production]
04:23 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1156.eqiad.wmnet with reason: Maintenance [production]
03:30 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32642 and previous config saved to /var/cache/conftool/dbconfig/20220821-033020-ladsgroup.json [production]
03:15 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P32641 and previous config saved to /var/cache/conftool/dbconfig/20220821-031514-ladsgroup.json [production]
03:00 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312', diff saved to https://phabricator.wikimedia.org/P32640 and previous config saved to /var/cache/conftool/dbconfig/20220821-030008-ladsgroup.json [production]
02:45 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32639 and previous config saved to /var/cache/conftool/dbconfig/20220821-024502-ladsgroup.json [production]
01:35 <rzl@cumin2002> dbctl commit (dc=all): 'Depool db1143', diff saved to https://phabricator.wikimedia.org/P32638 and previous config saved to /var/cache/conftool/dbconfig/20220821-013504-rzl.json [production]
2022-08-20 §
23:20 <wm-bot> <legoktm> Updated mjolnir from 1.2.1 to 1.5.0, now running on node16 [tools.mjolnir]
22:18 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1146:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32637 and previous config saved to /var/cache/conftool/dbconfig/20220820-221826-ladsgroup.json [production]
22:18 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance [production]
22:18 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1146.eqiad.wmnet with reason: Maintenance [production]
17:41 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 9 hosts with reason: Maintenance [production]
17:40 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 9 hosts with reason: Maintenance [production]
17:40 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance [production]
17:40 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2104.codfw.wmnet with reason: Maintenance [production]
17:37 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32636 and previous config saved to /var/cache/conftool/dbconfig/20220820-173723-ladsgroup.json [production]
17:22 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P32635 and previous config saved to /var/cache/conftool/dbconfig/20220820-172217-ladsgroup.json [production]
17:07 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P32634 and previous config saved to /var/cache/conftool/dbconfig/20220820-170711-ladsgroup.json [production]
16:52 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1105:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32633 and previous config saved to /var/cache/conftool/dbconfig/20220820-165203-ladsgroup.json [production]
11:58 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1105:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32632 and previous config saved to /var/cache/conftool/dbconfig/20220820-115816-ladsgroup.json [production]
11:58 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance [production]
11:58 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance [production]
11:57 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182 (T314041)', diff saved to https://phabricator.wikimedia.org/P32631 and previous config saved to /var/cache/conftool/dbconfig/20220820-115755-ladsgroup.json [production]
11:42 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P32630 and previous config saved to /var/cache/conftool/dbconfig/20220820-114249-ladsgroup.json [production]
11:27 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P32629 and previous config saved to /var/cache/conftool/dbconfig/20220820-112744-ladsgroup.json [production]
11:12 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1182 (T314041)', diff saved to https://phabricator.wikimedia.org/P32628 and previous config saved to /var/cache/conftool/dbconfig/20220820-111238-ladsgroup.json [production]
08:04 <dcaro_away> after cloudvirt1023 reboot, the vm irc-buster shows as running, but even after restart is not responsive through ssh nor console (T315718) [dwl]
07:55 <dcaro_away> after cloudvirt1023 reboot, the vm irc-buster does not seem to have rebooted correctly (no ssh, no console), rebooting (T315718) [dwl]
07:44 <dcaro_away> all k8s nodes ready now \o/ (T315718) [tools]
07:43 <dcaro_away> rebooted tools-k8s-control-2, seemed stuck trying to wait for tools home (nfs?), after reboot came back up (T315718) [tools]
07:41 <dcaro_away> cloudvirt1023 down took out 3 workers, 1 control, and a grid exec and a weblight, they are taking long to restart, looking (T315718) [tools]
07:39 <dcaro_away> cloudvirt1023 is back up, VMs are starting to recover (T315718) [admin]