2025-04-14
16:03 <vgutierrez@cumin1002> END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-text_ulsfo and not P{cp4037.ulsfo.wmnet} and A:cp [production]
15:59 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P74970 and previous config saved to /var/cache/conftool/dbconfig/20250414-155925-fceratto.json [production]
15:58 <vriley@cumin1002> START - Cookbook sre.hosts.provision for host an-worker1181.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL [production]
15:57 <vriley@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1181.eqiad.wmnet with OS bullseye [production]
15:56 <fceratto@dns1004> END - running authdns-update [production]
15:53 <fceratto@dns1004> START - running authdns-update [production]
15:45 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P74969 and previous config saved to /var/cache/conftool/dbconfig/20250414-154419-fceratto.json [production]
15:44 <fceratto@cumin1002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
15:43 <fceratto@cumin1002> START - Cookbook sre.dns.netbox [production]
15:40 <fceratto@cumin1002> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
15:37 <urandom> bootstrapping Cassandra/restbase1044-a — T389423 [production]
15:37 <fceratto@cumin1002> START - Cookbook sre.dns.netbox [production]
15:32 <eevans@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on restbase1044.eqiad.wmnet with reason: Bootstrapping — T389423 [production]
15:30 <vgutierrez@cumin1002> END (PASS) - Cookbook sre.cdn.roll-upgrade-varnish (exit_code=0) rolling upgrade of Varnish on A:cp-upload_ulsfo and not P{cp4047.ulsfo.wmnet} and not P{cp4045.ulsfo.wmnet} and A:cp [production]
15:29 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2149 (T391056)', diff saved to https://phabricator.wikimedia.org/P74968 and previous config saved to /var/cache/conftool/dbconfig/20250414-152911-fceratto.json [production]
15:26 <volans> deployed homer v0.9.0 to cumin hosts [production]
15:25 <vriley@cumin1002> START - Cookbook sre.hosts.reimage for host an-worker1181.eqiad.wmnet with OS bullseye [production]
15:25 <volans@cumin1002> END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.9.0 - volans@cumin1002 [production]
15:24 <jiji@cumin1002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet [production]
15:23 <volans@cumin1002> START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1002.eqiad.wmnet with reason: Release v0.9.0 - volans@cumin1002 [production]
15:15 <jiji@cumin1002> START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet [production]
15:13 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db2149 (T391056)', diff saved to https://phabricator.wikimedia.org/P74967 and previous config saved to /var/cache/conftool/dbconfig/20250414-151316-fceratto.json [production]
15:13 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2149.codfw.wmnet with reason: Maintenance [production]
15:07 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance [production]
15:02 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1240.eqiad.wmnet with reason: Maintenance [production]
15:02 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1212 (T391056)', diff saved to https://phabricator.wikimedia.org/P74966 and previous config saved to /var/cache/conftool/dbconfig/20250414-150200-fceratto.json [production]
14:46 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P74965 and previous config saved to /var/cache/conftool/dbconfig/20250414-144653-fceratto.json [production]
14:31 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P74964 and previous config saved to /var/cache/conftool/dbconfig/20250414-143146-fceratto.json [production]
14:26 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cirrussearch2104.codfw.wmnet with OS bullseye [production]
14:16 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1212 (T391056)', diff saved to https://phabricator.wikimedia.org/P74963 and previous config saved to /var/cache/conftool/dbconfig/20250414-141639-fceratto.json [production]
14:12 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db1212 (T391056)', diff saved to https://phabricator.wikimedia.org/P74962 and previous config saved to /var/cache/conftool/dbconfig/20250414-141227-fceratto.json [production]
14:12 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance [production]
14:12 <fceratto@cumin1002> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1212.eqiad.wmnet with reason: Maintenance [production]
14:11 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1198 (T391056)', diff saved to https://phabricator.wikimedia.org/P74961 and previous config saved to /var/cache/conftool/dbconfig/20250414-141148-fceratto.json [production]
14:04 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cirrussearch2104.codfw.wmnet with reason: host reimage [production]
14:01 <godog> temp disable "backend time" panel using unaggregated big mediawiki metric on "reading web performance" dashboard - T391677 [production]
14:01 <bking@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on cirrussearch2104.codfw.wmnet with reason: host reimage [production]
13:57 <vriley@cumin1002> END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1178.eqiad.wmnet with OS bullseye [production]
13:56 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P74960 and previous config saved to /var/cache/conftool/dbconfig/20250414-135640-fceratto.json [production]
13:47 <arnaudb@cumin1002> END (ERROR) - Cookbook sre.gerrit.failover (exit_code=97) from gerrit1003.wikimedia.org to gerrit2003.wikimedia.org [production]
13:47 <arnaudb@cumin1002> START - Cookbook sre.gerrit.failover from gerrit1003.wikimedia.org to gerrit2003.wikimedia.org [production]
13:41 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P74956 and previous config saved to /var/cache/conftool/dbconfig/20250414-134132-fceratto.json [production]
13:41 <TheresNoTime> UTC afternoon backport window done [production]
13:40 <samtar@deploy1003> Finished scap sync-world: Backport for [[gerrit:1135850|Enable SUL3 on most remaining beta cluster wikis]], [[gerrit:1136104|punjabiwikimedia, maiwikimedia: fix tagline (T348611)]] (duration: 12m 00s) [production]
13:38 <sukhe> reprepro -C component/nginx-ech include bookworm-wikimedia nginx_1.22.1-9+deb12u1+ech2_amd64.changes: T205378 [production]
13:33 <samtar@deploy1003> matmarex, anzx, samtar: Continuing with sync [production]
13:33 <samtar@deploy1003> matmarex, anzx, samtar: Backport for [[gerrit:1135850|Enable SUL3 on most remaining beta cluster wikis]], [[gerrit:1136104|punjabiwikimedia, maiwikimedia: fix tagline (T348611)]] synced to the testservers (https://wikitech.wikimedia.org/wiki/Mwdebug) [production]
13:30 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host cirrussearch2104 [production]
13:30 <bking@cumin2002> START - Cookbook sre.hosts.move-vlan for host cirrussearch2104 [production]
13:30 <bking@cumin2002> START - Cookbook sre.hosts.reimage for host cirrussearch2104.codfw.wmnet with OS bullseye [production]