2026-01-19
11:35 <marostegui@cumin1003> dbctl commit (dc=all): 'Promote db1244 to s4 primary T414542', diff saved to https://phabricator.wikimedia.org/P87747 and previous config saved to /var/cache/conftool/dbconfig/20260119-113518-marostegui.json [production]
11:34 <marostegui> Starting s4 eqiad failover from db1160 to db1244 - T414542 [production]
11:34 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply [production]
11:34 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply [production]
11:33 <btullis@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1187.eqiad.wmnet [production]
11:31 <vgutierrez@cumin1003> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp7004.magru.wmnet [production]
11:31 <vgutierrez@cumin1003> START - Cookbook sre.hosts.remove-downtime for cp7004.magru.wmnet [production]
11:30 <vgutierrez@cumin1003> END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 111 hosts [production]
11:29 <vgutierrez@cumin1003> START - Cookbook sre.hosts.remove-downtime for 111 hosts [production]
11:28 <marostegui@cumin1003> dbctl commit (dc=all): 'Set db1244 with weight 0 T414542', diff saved to https://phabricator.wikimedia.org/P87746 and previous config saved to /var/cache/conftool/dbconfig/20260119-112825-marostegui.json [production]
11:28 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 42 hosts with reason: Primary switchover s4 T414542 [production]
11:26 <btullis@cumin1003> START - Cookbook sre.hosts.reboot-single for host an-worker1187.eqiad.wmnet [production]
10:58 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply [production]
10:56 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply [production]
10:55 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
10:54 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
10:52 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
10:51 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
10:39 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1262 (T413525)', diff saved to https://phabricator.wikimedia.org/P87745 and previous config saved to /var/cache/conftool/dbconfig/20260119-103917-marostegui.json [production]
10:29 <Emperor> restart apus rgws in eqiad [production]
10:29 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P87744 and previous config saved to /var/cache/conftool/dbconfig/20260119-102909-marostegui.json [production]
10:24 <ayounsi@cumin1003> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2003.codfw.wmnet [production]
10:19 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P87743 and previous config saved to /var/cache/conftool/dbconfig/20260119-101901-marostegui.json [production]
10:12 <ayounsi@cumin1003> START - Cookbook sre.hosts.reboot-single for host sretest2003.codfw.wmnet [production]
10:11 <marostegui@cumin1003> dbctl commit (dc=all): 'Depooling db2240 (T413525)', diff saved to https://phabricator.wikimedia.org/P87742 and previous config saved to /var/cache/conftool/dbconfig/20260119-101136-marostegui.json [production]
10:11 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance [production]
10:11 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2248 (T413525)', diff saved to https://phabricator.wikimedia.org/P87741 and previous config saved to /var/cache/conftool/dbconfig/20260119-101111-marostegui.json [production]
10:08 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db1262 (T413525)', diff saved to https://phabricator.wikimedia.org/P87740 and previous config saved to /var/cache/conftool/dbconfig/20260119-100852-marostegui.json [production]
10:01 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P87739 and previous config saved to /var/cache/conftool/dbconfig/20260119-100103-marostegui.json [production]
09:50 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P87738 and previous config saved to /var/cache/conftool/dbconfig/20260119-095055-marostegui.json [production]
09:40 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2248 (T413525)', diff saved to https://phabricator.wikimedia.org/P87737 and previous config saved to /var/cache/conftool/dbconfig/20260119-094048-marostegui.json [production]
09:35 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply [production]
09:35 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply [production]
09:16 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
09:15 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
08:58 <physikerwelt> running `mkdir -p /srv/qlever2` cd [wikiqlever]
08:50 <dpogorzelski@deploy2002> helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. [production]
08:49 <dpogorzelski@deploy2002> helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. [production]
08:47 <XioNoX> continue asw1-b12-drmrs troubleshooting - T413181 [production]
08:46 <dpogorzelski@deploy2002> helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'. [production]
08:45 <dpogorzelski@deploy2002> helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'. [production]
08:20 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'. [production]
08:19 <brouberol@deploy2002> helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'. [production]
08:16 <brouberol@dns1004> END - running authdns-update [production]
08:15 <brouberol@dns1004> START - running authdns-update [production]
08:11 <brouberol@dns1004> START - running authdns-update [production]
07:44 <bwojtowicz@deploy2002> helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' . [production]
06:45 <marostegui@cumin1003> dbctl commit (dc=all): 'Depooling db2176 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87736 and previous config saved to /var/cache/conftool/dbconfig/20260119-064555-marostegui.json [production]
06:45 <marostegui@cumin1003> DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db2176.codfw.wmnet with reason: Maintenance [production]
06:45 <marostegui@cumin1003> dbctl commit (dc=all): 'Repooling after maintenance db2174 (T411163 T411164)', diff saved to https://phabricator.wikimedia.org/P87735 and previous config saved to /var/cache/conftool/dbconfig/20260119-064531-marostegui.json [production]