2025-10-14
05:53 <marostegui@cumin1003> START - Cookbook sre.mysql.depool es1032 - Depool es1032.eqiad.wmnet to then clone it to es1055.eqiad.wmnet - marostegui@cumin1003 [production]
05:53 <marostegui@cumin1003> START - Cookbook sre.mysql.clone_es of es1032.eqiad.wmnet onto es1055.eqiad.wmnet [production]
05:52 <marostegui@cumin1003> dbctl commit (dc=all): 'db1244 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P83840 and previous config saved to /var/cache/conftool/dbconfig/20251014-055206-root.json [production]
05:46 <marostegui@cumin1003> dbctl commit (dc=all): 'db1221 (re)pooling @ 100%: 10', diff saved to https://phabricator.wikimedia.org/P83839 and previous config saved to /var/cache/conftool/dbconfig/20251014-054631-root.json [production]
05:44 <marostegui@cumin1003> dbctl commit (dc=all): 'es1053 (re)pooling @ 5%: Host provisioned T406488', diff saved to https://phabricator.wikimedia.org/P83838 and previous config saved to /var/cache/conftool/dbconfig/20251014-054432-root.json [production]
05:43 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1244.eqiad.wmnet with reason: Maintenance [production]
05:42 <marostegui@cumin1003> dbctl commit (dc=all): 'Depool db1244 T407176', diff saved to https://phabricator.wikimedia.org/P83837 and previous config saved to /var/cache/conftool/dbconfig/20251014-054200-marostegui.json [production]
05:41 <marostegui@cumin1003> dbctl commit (dc=all): 'Promote db1160 to s4 primary T407176', diff saved to https://phabricator.wikimedia.org/P83836 and previous config saved to /var/cache/conftool/dbconfig/20251014-054118-marostegui.json [production]
05:41 <marostegui> Starting s4 eqiad failover from db1244 to db1160 - T407176 [production]
05:40 <marostegui@cumin1003> dbctl commit (dc=all): 'es1050 (re)pooling @ 5%: Host provisioned T406488', diff saved to https://phabricator.wikimedia.org/P83835 and previous config saved to /var/cache/conftool/dbconfig/20251014-054014-root.json [production]
05:37 <root@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 32 hosts with reason: Primary switchover s4 T407176 [production]
05:36 <marostegui@cumin1003> dbctl commit (dc=all): 'Set db1160 with weight 0 T407176', diff saved to https://phabricator.wikimedia.org/P83834 and previous config saved to /var/cache/conftool/dbconfig/20251014-053654-marostegui.json [production]
05:31 <marostegui@cumin1003> dbctl commit (dc=all): 'db1221 (re)pooling @ 75%: 10', diff saved to https://phabricator.wikimedia.org/P83833 and previous config saved to /var/cache/conftool/dbconfig/20251014-053125-root.json [production]
05:29 <marostegui@cumin1003> dbctl commit (dc=all): 'es1053 (re)pooling @ 1%: Host provisioned T406488', diff saved to https://phabricator.wikimedia.org/P83832 and previous config saved to /var/cache/conftool/dbconfig/20251014-052926-root.json [production]
05:27 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.depool (exit_code=0) es1031 - Depool es1031.eqiad.wmnet to then clone it to es1054.eqiad.wmnet - marostegui@cumin1003 [production]
05:26 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.clone_es (exit_code=0) of es1033.eqiad.wmnet onto es1056.eqiad.wmnet [production]
05:26 <marostegui@cumin1003> END (PASS) - Cookbook sre.mysql.pool (exit_code=0) es1033 gradually with 4 steps - Pool es1033.eqiad.wmnet in after cloning [production]
05:25 <marostegui@cumin1003> dbctl commit (dc=all): 'es1050 (re)pooling @ 1%: Host provisioned T406488', diff saved to https://phabricator.wikimedia.org/P83830 and previous config saved to /var/cache/conftool/dbconfig/20251014-052508-root.json [production]
05:20 <marostegui@cumin1003> START - Cookbook sre.mysql.depool es1031 - Depool es1031.eqiad.wmnet to then clone it to es1054.eqiad.wmnet - marostegui@cumin1003 [production]
05:20 <marostegui@cumin1003> START - Cookbook sre.mysql.clone_es of es1031.eqiad.wmnet onto es1054.eqiad.wmnet [production]
05:16 <marostegui@cumin1003> dbctl commit (dc=all): 'db1221 (re)pooling @ 50%: 10', diff saved to https://phabricator.wikimedia.org/P83828 and previous config saved to /var/cache/conftool/dbconfig/20251014-051619-root.json [production]
05:14 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on es[1031-1032].eqiad.wmnet with reason: Cloning [production]
05:01 <marostegui@cumin1003> dbctl commit (dc=all): 'db1221 (re)pooling @ 25%: 10', diff saved to https://phabricator.wikimedia.org/P83826 and previous config saved to /var/cache/conftool/dbconfig/20251014-050113-root.json [production]
04:53 <marostegui@cumin1003> dbctl commit (dc=all): 'Depool db1221 for migration to mariadb 10.11', diff saved to https://phabricator.wikimedia.org/P83824 and previous config saved to /var/cache/conftool/dbconfig/20251014-045305-marostegui.json [production]
04:53 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db1221.eqiad.wmnet with reason: Maintenance [production]
04:52 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Upgrading [production]
04:41 <marostegui@cumin1003> START - Cookbook sre.mysql.pool es1033 gradually with 4 steps - Pool es1033.eqiad.wmnet in after cloning [production]
04:02 <mwpresync@deploy2002> Pruned MediaWiki: 1.45.0-wmf.20 (duration: 02m 42s) [production]
03:48 <mwpresync@deploy2002> Finished scap sync-world: testwikis to 1.45.0-wmf.23 refs T405679 (duration: 45m 02s) [production]
03:03 <mwpresync@deploy2002> Started scap sync-world: testwikis to 1.45.0-wmf.23 refs T405679 [production]
02:24 <denisse@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf2003.codfw.wmnet [production]
02:20 <denisse@cumin2002> START - Cookbook sre.hosts.reboot-single for host webperf2003.codfw.wmnet [production]
02:09 <denisse@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host webperf1003.eqiad.wmnet [production]
02:05 <denisse@cumin2002> START - Cookbook sre.hosts.reboot-single for host webperf1003.eqiad.wmnet [production]
01:58 <denisse@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog2002.codfw.wmnet [production]
01:52 <denisse@cumin2002> START - Cookbook sre.hosts.reboot-single for host mwlog2002.codfw.wmnet [production]
01:45 <denisse@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwlog1002.eqiad.wmnet [production]
01:39 <denisse@cumin2002> START - Cookbook sre.hosts.reboot-single for host mwlog1002.eqiad.wmnet [production]
01:14 <mwpresync@deploy2002> Finished scap build-images: Publishing wmf/next image (duration: 13m 20s) [production]
01:00 <mwpresync@deploy2002> Started scap build-images: Publishing wmf/next image [production]
2025-10-13
23:50 <musikanimal@deploy2002> Finished scap sync-world: Backport for [[gerrit:1195756|Add 'accepted' status (T406674)]] (duration: 40m 01s) [production]
23:38 <musikanimal@deploy2002> musikanimal: Continuing with sync [production]
23:36 <musikanimal@deploy2002> musikanimal: Backport for [[gerrit:1195756|Add 'accepted' status (T406674)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
23:29 <btullis@cumin1003> END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes [production]
23:10 <musikanimal@deploy2002> Started scap sync-world: Backport for [[gerrit:1195756|Add 'accepted' status (T406674)]] [production]
22:34 <denisse@cumin2002> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet [production]
22:30 <denisse@cumin2002> START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet [production]
22:01 <btullis@cumin1003> START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes [production]
22:01 <btullis@deploy2002> helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'. [production]
22:01 <btullis@cumin1003> END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes [production]