2022-09-15
07:46 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 7:00:00 on db[2133,2160].codfw.wmnet with reason: reboot [production]
07:46 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2134,2160].codfw.wmnet with reason: reboot [production]
07:46 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 7:00:00 on db[2134,2160].codfw.wmnet with reason: reboot [production]
07:46 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2135,2160].codfw.wmnet with reason: reboot [production]
07:46 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 7:00:00 on db[2135,2160].codfw.wmnet with reason: reboot [production]
07:43 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
07:40 <marostegui@cumin1001> dbctl commit (dc=all): 'db2115 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34755 and previous config saved to /var/cache/conftool/dbconfig/20220915-074026-root.json [production]
07:35 <marostegui@cumin1001> dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34754 and previous config saved to /var/cache/conftool/dbconfig/20220915-073557-root.json [production]
07:25 <marostegui@cumin1001> dbctl commit (dc=all): 'db2115 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34753 and previous config saved to /var/cache/conftool/dbconfig/20220915-072520-root.json [production]
07:20 <marostegui@cumin1001> dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34752 and previous config saved to /var/cache/conftool/dbconfig/20220915-072053-root.json [production]
07:17 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2151.codfw.wmnet with reason: reboot [production]
07:17 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db2151.codfw.wmnet with reason: reboot [production]
07:14 <moritzm> installing zlib security updates [production]
07:13 <jmm@cumin2002> END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry [production]
07:13 <mwdebug-deploy@deploy1002> helmfile [codfw] DONE helmfile.d/services/mwdebug: apply [production]
07:12 <mwdebug-deploy@deploy1002> helmfile [codfw] START helmfile.d/services/mwdebug: apply [production]
07:12 <mwdebug-deploy@deploy1002> helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply [production]
07:11 <jmm@cumin2002> START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry [production]
07:10 <marostegui@cumin1001> dbctl commit (dc=all): 'db2115 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34751 and previous config saved to /var/cache/conftool/dbconfig/20220915-071015-root.json [production]
07:09 <jmm@cumin2002> END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry [production]
07:06 <jmm@cumin2002> START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry [production]
07:05 <marostegui@cumin1001> dbctl commit (dc=all): 'db2105 (re)pooling @ 5%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34750 and previous config saved to /var/cache/conftool/dbconfig/20220915-070548-root.json [production]
07:05 <mwdebug-deploy@deploy1002> helmfile [eqiad] START helmfile.d/services/mwdebug: apply [production]
06:55 <marostegui@cumin1001> dbctl commit (dc=all): 'db2115 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34749 and previous config saved to /var/cache/conftool/dbconfig/20220915-065510-root.json [production]
06:50 <marostegui@cumin1001> dbctl commit (dc=all): 'db2105 (re)pooling @ 3%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34748 and previous config saved to /var/cache/conftool/dbconfig/20220915-065043-root.json [production]
06:47 <marostegui@cumin1001> dbctl commit (dc=all): 'Give some weight to db2096 T317842', diff saved to https://phabricator.wikimedia.org/P34747 and previous config saved to /var/cache/conftool/dbconfig/20220915-064750-marostegui.json [production]
06:46 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2115 T317842', diff saved to https://phabricator.wikimedia.org/P34746 and previous config saved to /var/cache/conftool/dbconfig/20220915-064635-marostegui.json [production]
06:45 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db2096 to x1 primary and set section read-write T317842', diff saved to https://phabricator.wikimedia.org/P34745 and previous config saved to /var/cache/conftool/dbconfig/20220915-064525-root.json [production]
06:44 <marostegui> Starting x1 codfw failover from db2115 to db2096 - T317842 [production]
06:40 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T317842 [production]
06:40 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db2096 with weight 0 T317842', diff saved to https://phabricator.wikimedia.org/P34744 and previous config saved to /var/cache/conftool/dbconfig/20220915-064014-root.json [production]
06:40 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T317842 [production]
06:35 <marostegui@cumin1001> dbctl commit (dc=all): 'db2105 (re)pooling @ 1%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34743 and previous config saved to /var/cache/conftool/dbconfig/20220915-063538-root.json [production]
06:14 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db2105 T317839', diff saved to https://phabricator.wikimedia.org/P34742 and previous config saved to /var/cache/conftool/dbconfig/20220915-061421-root.json [production]
06:13 <marostegui@cumin1001> dbctl commit (dc=all): 'Promote db2127 to s3 codfw T317839', diff saved to https://phabricator.wikimedia.org/P34741 and previous config saved to /var/cache/conftool/dbconfig/20220915-061317-marostegui.json [production]
06:12 <marostegui> Starting s3 codfw failover from db2105 to db2127 - T317839 [production]
06:03 <marostegui@cumin1001> dbctl commit (dc=all): 'Set db2127 with weight 0 T317839', diff saved to https://phabricator.wikimedia.org/P34740 and previous config saved to /var/cache/conftool/dbconfig/20220915-060307-root.json [production]
06:02 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Codfw switchover s3 T317839 [production]
06:02 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Codfw switchover s3 T317839 [production]
05:32 <marostegui@cumin1001> END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down T317662 [production]
05:32 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down T317662 [production]
05:12 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down T317662 [production]
05:12 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down T317662 [production]
2022-09-14
22:08 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Depooling db1190 (T314041)', diff saved to https://phabricator.wikimedia.org/P34739 and previous config saved to /var/cache/conftool/dbconfig/20220914-220822-ladsgroup.json [production]
22:08 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance [production]
22:08 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance [production]
22:08 <ladsgroup@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
22:07 <ladsgroup@cumin1001> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance [production]
22:07 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1149 (T314041)', diff saved to https://phabricator.wikimedia.org/P34738 and previous config saved to /var/cache/conftool/dbconfig/20220914-220744-ladsgroup.json [production]
21:52 <ladsgroup@cumin1001> dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P34737 and previous config saved to /var/cache/conftool/dbconfig/20220914-215238-ladsgroup.json [production]