2025-11-25
16:46 <cgoubert@cumin1003> START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-ctrl1003.eqiad.wmnet [production]
16:46 <daniel@deploy2002> helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply [production]
16:46 <daniel@deploy2002> helmfile [codfw] START helmfile.d/services/rest-gateway: apply [production]
16:45 <cgoubert@deploy2002> Locking from deployment [MediaWiki]: Depooling wikikube-ctrl1003 [production]
16:44 <jforrester@deploy2002> Finished scap sync-world: Backport for [[gerrit:1211139|Select zid after highest if latest zid insertion is taken (T410895)]] (duration: 12m 07s) [production]
16:43 <daniel@deploy2002> helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply [production]
16:42 <daniel@deploy2002> helmfile [eqiad] START helmfile.d/services/rest-gateway: apply [production]
16:39 <jforrester@deploy2002> jforrester: Continuing with sync [production]
16:37 <jhathaway@cumin1003> START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm [production]
16:37 <jhathaway@cumin1003> END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sretest1005.eqiad.wmnet with OS bookworm [production]
16:37 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P85649 and previous config saved to /var/cache/conftool/dbconfig/20251125-163700-marostegui.json [production]
16:36 <jforrester@deploy2002> jforrester: Backport for [[gerrit:1211139|Select zid after highest if latest zid insertion is taken (T410895)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
16:32 <jforrester@deploy2002> Started scap sync-world: Backport for [[gerrit:1211139|Select zid after highest if latest zid insertion is taken (T410895)]] [production]
16:28 <daniel@deploy2002> helmfile [staging] DONE helmfile.d/services/rest-gateway: apply [production]
16:27 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2011.codfw.wmnet [production]
16:26 <daniel@deploy2002> helmfile [staging] START helmfile.d/services/rest-gateway: apply [production]
16:26 <jnuche@deploy2002> Finished scap sync-world: Backport for [[gerrit:1211158|Add the full set of post-processing options to the ParserOptions array (T411017)]] (duration: 08m 42s) [production]
16:24 <jhathaway@cumin1003> START - Cookbook sre.hosts.reimage for host sretest1005.eqiad.wmnet with OS bookworm [production]
16:23 <jhathaway@cumin2002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
16:22 <jnuche@deploy2002> jnuche: Continuing with sync [production]
16:21 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P85646 and previous config saved to /var/cache/conftool/dbconfig/20251125-162152-marostegui.json [production]
16:21 <jnuche@deploy2002> jnuche: Backport for [[gerrit:1211158|Add the full set of post-processing options to the ParserOptions array (T411017)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
16:21 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host maps2011.codfw.wmnet [production]
16:20 <ayounsi@cumin1003> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps2011.codfw.wmnet on all recursors [production]
16:20 <ayounsi@cumin1003> START - Cookbook sre.dns.wipe-cache maps2011.codfw.wmnet on all recursors [production]
16:19 <ayounsi@cumin1003> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
16:19 <ayounsi@cumin1003> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2011 - ayounsi@cumin1003" [production]
16:19 <ayounsi@cumin1003> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add AAAA for maps2011 - ayounsi@cumin1003" [production]
16:17 <jnuche@deploy2002> Started scap sync-world: Backport for [[gerrit:1211158|Add the full set of post-processing options to the ParserOptions array (T411017)]] [production]
16:16 <jhathaway@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on sretest1005.eqiad.wmnet with reason: sleep test [production]
16:16 <jhathaway@cumin2002> START - Cookbook sre.hosts.provision for host sretest1005.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
16:16 <robh@cumin2002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:08:00 on kafka-main1009.eqiad.wmnet with reason: C/D Migration [production]
16:15 <ayounsi@cumin1003> START - Cookbook sre.dns.netbox [production]
16:14 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2012.codfw.wmnet [production]
16:12 <hnowlan@deploy1003> helmfile [codfw] DONE helmfile.d/services/thumbor: apply [production]
16:12 <hnowlan@deploy1003> helmfile [codfw] START helmfile.d/services/thumbor: apply [production]
16:10 <vgutierrez> repool cp7001 [production]
16:09 <ayounsi@cumin1003> END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) maps2012.codfw.wmnet on all recursors [production]
16:09 <ayounsi@cumin1003> START - Cookbook sre.dns.wipe-cache maps2012.codfw.wmnet on all recursors [production]
16:09 <moritzm> installing glibc security updates [production]
16:08 <fceratto@cumin1003> START - Cookbook sre.mysql.pool db1187 gradually with 4 steps - Repooling due to T410508 [production]
16:07 <jmm@cumin2002> START - Cookbook sre.hosts.reboot-single for host maps2012.codfw.wmnet [production]
16:07 <aikochou@deploy2002> helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' . [production]
16:07 <hnowlan@deploy1003> helmfile [eqiad] DONE helmfile.d/services/thumbor: apply [production]
16:07 <hnowlan@deploy1003> helmfile [eqiad] START helmfile.d/services/thumbor: apply [production]
16:06 <claime> Eviction partition leadership from kafka-main1009 - T405950 [production]
16:06 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2223 (T410531)', diff saved to https://phabricator.wikimedia.org/P85644 and previous config saved to /var/cache/conftool/dbconfig/20251125-160646-marostegui.json [production]
16:05 <aikochou@deploy2002> helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' . [production]
16:02 <fceratto@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance [production]
16:02 <marostegui@cumin1003> dbctl commit (dc=all): 'Depooling db2223 (T410531)', diff saved to https://phabricator.wikimedia.org/P85643 and previous config saved to /var/cache/conftool/dbconfig/20251125-160208-marostegui.json [production]