2023-03-28
09:28 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main1001.eqiad.wmnet with reason: stop kafka and dist-upgrade [production]
09:28 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main1001.eqiad.wmnet with reason: stop kafka and dist-upgrade [production]
09:12 <jbond@cumin1001> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Nicolas Fraison out of all services on: 2048 hosts [production]
09:11 <jbond@cumin1001> START - Cookbook sre.idm.logout Logging Nicolas Fraison out of all services on: 2048 hosts [production]
09:11 <jbond@cumin1001> END (ERROR) - Cookbook sre.idm.logout (exit_code=97) Logging Nicolas Fraison out of systemdlogoutd on: 2048 hosts [production]
09:11 <jbond@cumin1001> START - Cookbook sre.idm.logout Logging Nicolas Fraison out of systemdlogoutd on: 2048 hosts [production]
08:58 <vgutierrez> restart ipmiseld on cp2035 [production]
08:50 <aborrero@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.wikimedia.org [production]
08:49 <ayounsi@deploy1002> helmfile [eqiad] DONE helmfile.d/admin 'apply'. [production]
08:48 <AndyRussG> update payments.wiki config 65bedd4a -> e31ffd7d, payments (automatic updates only) a6c6c2b1 -> f5ec2677 [production]
08:45 <ayounsi@deploy1002> helmfile [eqiad] START helmfile.d/admin 'apply'. [production]
08:43 <ayounsi@deploy1002> helmfile [codfw] DONE helmfile.d/admin 'apply'. [production]
08:42 <aborrero@cumin2002> START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.wikimedia.org [production]
08:39 <ayounsi@deploy1002> helmfile [codfw] START helmfile.d/admin 'apply'. [production]
08:37 <ayounsi@deploy1002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
08:35 <ayounsi@deploy1002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
08:34 <ayounsi@deploy1002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'. [production]
08:32 <ayounsi@deploy1002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'. [production]
08:32 <phedenskog@deploy2002> Finished deploy [performance/navtiming@e757bdf]: (no justification provided) (duration: 00m 06s) [production]
08:32 <phedenskog@deploy2002> Started deploy [performance/navtiming@e757bdf]: (no justification provided) [production]
08:31 <ayounsi@deploy1002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'. [production]
08:29 <ayounsi@deploy1002> helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'. [production]
08:25 <ayounsi@deploy1002> helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
08:21 <ayounsi@deploy1002> helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
08:14 <ayounsi@deploy1002> helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'. [production]
08:11 <oblivian@deploy2002> Finished scap: Backport for [[gerrit:903209|Failover statsd to graphite2004 (T330165)]] (duration: 08m 48s) [production]
08:08 <ayounsi@deploy1002> helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'. [production]
08:06 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on 16 hosts with reason: Switch maintenance [production]
08:05 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on 16 hosts with reason: Switch maintenance [production]
08:05 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on 21 hosts with reason: Switch maintenance [production]
08:05 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on 21 hosts with reason: Switch maintenance [production]
08:04 <oblivian@deploy2002> oblivian and filippo: Backport for [[gerrit:903209|Failover statsd to graphite2004 (T330165)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet [production]
08:03 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on es[1020-1022].eqiad.wmnet with reason: Switch maintenance [production]
08:03 <ayounsi@deploy1002> helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'. [production]
08:03 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 8:00:00 on es[1020-1022].eqiad.wmnet with reason: Switch maintenance [production]
08:02 <oblivian@deploy2002> Started scap: Backport for [[gerrit:903209|Failover statsd to graphite2004 (T330165)]] [production]
08:02 <ayounsi@deploy1002> helmfile [staging-eqiad] START helmfile.d/admin 'apply'. [production]
08:00 <ayounsi@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
08:00 <godog> move graphite reads to codfw - T330165 [production]
07:56 <jayme@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
07:56 <jayme@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
07:56 <ayounsi@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
07:54 <root@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
07:54 <root@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
07:51 <ayounsi@deploy1002> helmfile [staging-codfw] DONE helmfile.d/admin 'apply'. [production]
07:51 <ayounsi@deploy1002> helmfile [staging-codfw] START helmfile.d/admin 'apply'. [production]
07:31 <marostegui@cumin1001> dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45965 and previous config saved to /var/cache/conftool/dbconfig/20230328-073122-root.json [production]
07:28 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'clear' for AS: 17806 [production]
07:27 <ayounsi@cumin1001> START - Cookbook sre.network.peering with action 'clear' for AS: 17806 [production]
07:20 <kartik@deploy2002> Finished scap: Backport for [[gerrit:903003|Enable Section Translation on some wikis while Content Translation remains in beta (T308834)]] (duration: 12m 05s) [production]