2024-06-10
09:57 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 5:00:00 on 26 hosts with reason: Issue from T367019 [production]
09:54 <arnaudb@cumin1002> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 5:00:00 on 870 hosts with reason: Issue from T367019 [production]
09:54 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 5:00:00 on 870 hosts with reason: Issue from T367019 [production]
09:53 <jayme@deploy1002> helmfile [staging] DONE helmfile.d/services/thumbor: apply [production]
09:53 <jayme@deploy1002> helmfile [staging] START helmfile.d/services/thumbor: apply [production]
09:47 <fabfur@cumin1002> conftool action : set/pooled=yes; selector: name=cp4048.ulsfo.wmnet [production]
09:37 <godog> roll upgrade prometheus-statsd-exporter to baremetal - T302373 [production]
09:34 <taavi@deploy1002> Finished scap: Backport for [[gerrit:1040222|Reapply "wikitech: Replace OSM class in Gerrit blocking hook"]] (duration: 11m 17s) [production]
09:25 <taavi@deploy1002> taavi: Continuing with sync [production]
09:25 <taavi@deploy1002> taavi: Backport for [[gerrit:1040222|Reapply "wikitech: Replace OSM class in Gerrit blocking hook"]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
09:24 <volans@cumin1002> END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox [production]
09:24 <volans@cumin1002> START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox [production]
09:22 <taavi@deploy1002> Started scap: Backport for [[gerrit:1040222|Reapply "wikitech: Replace OSM class in Gerrit blocking hook"]] [production]
09:22 <volans@cumin1002> END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary [production]
09:22 <volans@cumin1002> START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary [production]
09:16 <marostegui@cumin1002> dbctl commit (dc=all): 'Depooling db1173 (T364069)', diff saved to https://phabricator.wikimedia.org/P64517 and previous config saved to /var/cache/conftool/dbconfig/20240610-091631-marostegui.json [production]
09:16 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance [production]
09:16 <marostegui@cumin1002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance [production]
09:16 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1168 (T364069)', diff saved to https://phabricator.wikimedia.org/P64516 and previous config saved to /var/cache/conftool/dbconfig/20240610-091606-marostegui.json [production]
09:15 <arnaudb@cumin1002> dbctl commit (dc=all): 'Promote db2207 to s2 primary T367019', diff saved to https://phabricator.wikimedia.org/P64515 and previous config saved to /var/cache/conftool/dbconfig/20240610-091506-arnaudb.json [production]
09:14 <arnaudb> Starting s2 codfw failover from db2204 to db2207 - T367019 [production]
09:01 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2015.codfw.wmnet [production]
09:01 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2015.codfw.wmnet [production]
09:01 <godog> upload prometheus-statsd-exporter 0.26.1-1 to apt - T302373 [production]
09:00 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P64514 and previous config saved to /var/cache/conftool/dbconfig/20240610-090058-marostegui.json [production]
09:00 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1013.eqiad.wmnet [production]
09:00 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1013.eqiad.wmnet [production]
08:57 <arnaudb@cumin1002> dbctl commit (dc=all): 'Set db2207 with weight 0 T367019', diff saved to https://phabricator.wikimedia.org/P64513 and previous config saved to /var/cache/conftool/dbconfig/20240610-085721-arnaudb.json [production]
08:57 <arnaudb@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary switchover s2 T367019 [production]
08:56 <arnaudb@cumin1002> START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary switchover s2 T367019 [production]
08:55 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2207 (re)pooling @ 100%: post maintenance repool', diff saved to https://phabricator.wikimedia.org/P64512 and previous config saved to /var/cache/conftool/dbconfig/20240610-085548-arnaudb.json [production]
08:54 <godog> upgrade prometheus-statsd-exporter on webperf - T302373 [production]
08:53 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti1013.eqiad.wmnet [production]
08:53 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host ganeti2015.codfw.wmnet [production]
08:51 <cmooney@cumin1002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
08:50 <cmooney@cumin1002> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for cr2-codfw peering to ssw1-d8-codfw - cmooney@cumin1002" [production]
08:50 <cmooney@cumin1002> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new entries for cr2-codfw peering to ssw1-d8-codfw - cmooney@cumin1002" [production]
08:48 <fabfur@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4048.ulsfo.wmnet [production]
08:47 <cmooney@cumin1002> START - Cookbook sre.dns.netbox [production]
08:46 <jmm@cumin2002> START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1013.eqiad.wmnet [production]
08:46 <jmm@cumin2002> START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2015.codfw.wmnet [production]
08:45 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P64511 and previous config saved to /var/cache/conftool/dbconfig/20240610-084550-marostegui.json [production]
08:41 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2014.codfw.wmnet [production]
08:41 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2014.codfw.wmnet [production]
08:41 <jmm@cumin2002> END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1012.eqiad.wmnet [production]
08:41 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1012.eqiad.wmnet [production]
08:40 <arnaudb@cumin1002> dbctl commit (dc=all): 'db2207 (re)pooling @ 75%: post maintenance repool', diff saved to https://phabricator.wikimedia.org/P64510 and previous config saved to /var/cache/conftool/dbconfig/20240610-084042-arnaudb.json [production]
08:39 <fabfur@cumin1002> START - Cookbook sre.hosts.reboot-single for host cp4048.ulsfo.wmnet [production]
08:39 <fabfur@cumin1002> conftool action : set/pooled=no; selector: name=cp4048.ulsfo.wmnet [production]
08:36 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ping1004.eqiad.wmnet with OS bookworm [production]