2025-06-16
14:53 <vgutierrez@cumin1003> START - Cookbook sre.loadbalancer.admin pooling P{lvs7002.magru.wmnet} and A:liberica [production]
14:53 <vgutierrez@cumin1003> END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P{lvs7002.magru.wmnet} and A:liberica [production]
14:53 <vgutierrez@cumin1003> START - Cookbook sre.loadbalancer.admin depooling P{lvs7002.magru.wmnet} and A:liberica [production]
14:53 <vgutierrez@cumin1003> START - Cookbook sre.loadbalancer.upgrade upgradeing P{lvs7002.magru.wmnet} and A:liberica (T397036) [production]
14:50 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P78060 and previous config saved to /var/cache/conftool/dbconfig/20250616-145032-fceratto.json [production]
14:50 <andrew@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephmon2004-dev.codfw.wmnet with OS bullseye [production]
14:49 <mszabo@deploy1003> mszabo: Continuing with sync [production]
14:48 <mszabo@deploy1003> mszabo: Backport for [[gerrit:1159438|Add missing labels for email confirmation reminder preferences (T58074)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there. [production]
14:41 <marostegui@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance [production]
14:41 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1226 (T396130)', diff saved to https://phabricator.wikimedia.org/P78059 and previous config saved to /var/cache/conftool/dbconfig/20250616-144127-marostegui.json [production]
14:36 <jmm@cumin1003> END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of prometheus7002.magru.wmnet to drbd [production]
14:35 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P78058 and previous config saved to /var/cache/conftool/dbconfig/20250616-143525-fceratto.json [production]
14:32 <andrew@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephmon2004-dev.codfw.wmnet with reason: host reimage [production]
14:31 <James_F> Docker: [quibble-bullseye] Switch MariaDB to 10.6 Wikimedia package, again, for T366646 [releng]
14:29 <vgutierrez@cumin1003> END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) upgradeing P{lvs1013.eqiad.wmnet} and A:liberica (T397036) [production]
14:28 <vgutierrez@cumin1003> START - Cookbook sre.loadbalancer.upgrade upgradeing P{lvs1013.eqiad.wmnet} and A:liberica (T397036) [production]
14:28 <vgutierrez> upgrade to liberica 0.19 in lvs1013 - T397036 [production]
14:26 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P78057 and previous config saved to /var/cache/conftool/dbconfig/20250616-142620-marostegui.json [production]
14:25 <andrew@cumin1002> START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephmon2004-dev.codfw.wmnet with reason: host reimage [production]
14:24 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:23 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:21 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:21 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:20 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2159 (T395241)', diff saved to https://phabricator.wikimedia.org/P78056 and previous config saved to /var/cache/conftool/dbconfig/20250616-142017-fceratto.json [production]
14:19 <vgutierrez> upload liberica 0.19 to apt.wm.o (bookworm-wikimedia) - T397036 [production]
14:17 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:17 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:13 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:13 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:11 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P78055 and previous config saved to /var/cache/conftool/dbconfig/20250616-141113-marostegui.json [production]
14:10 <fceratto@cumin1002> dbctl commit (dc=all): 'Depooling db2159 (T395241)', diff saved to https://phabricator.wikimedia.org/P78054 and previous config saved to /var/cache/conftool/dbconfig/20250616-141044-fceratto.json [production]
14:10 <fceratto@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2159.codfw.wmnet with reason: Maintenance [production]
14:10 <mszabo@deploy1003> Started scap sync-world: Backport for [[gerrit:1159438|Add missing labels for email confirmation reminder preferences (T58074)]] [production]
14:10 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2150 (T395241)', diff saved to https://phabricator.wikimedia.org/P78053 and previous config saved to /var/cache/conftool/dbconfig/20250616-141016-fceratto.json [production]
14:09 <andrew@cumin1002> START - Cookbook sre.hosts.reimage for host cloudcephmon2004-dev.codfw.wmnet with OS bullseye [production]
14:08 <marostegui@cumin1002> dbctl commit (dc=all): 'es1026 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P78052 and previous config saved to /var/cache/conftool/dbconfig/20250616-140807-root.json [production]
14:05 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:05 <brouberol@deploy1003> helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply [production]
14:05 <jelto@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab-runner2002.codfw.wmnet with OS bookworm [production]
14:01 <andrew@cumin1002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephmon2004-dev.codfw.wmnet with OS bullseye [production]
13:57 <vgutierrez> use Google Trust Services (GTS) unified TLS certificate on esams - T395131 [production]
13:56 <marostegui@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db1226 (T396130)', diff saved to https://phabricator.wikimedia.org/P78051 and previous config saved to /var/cache/conftool/dbconfig/20250616-135605-marostegui.json [production]
13:55 <fceratto@cumin1002> dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P78050 and previous config saved to /var/cache/conftool/dbconfig/20250616-135507-fceratto.json [production]
13:54 <phuedx@deploy1003> Finished scap sync-world: Backport for [[gerrit:1159444|Try subresource JS autologin on SUL3 domain first if configured (T391284)]], [[gerrit:1159446|Fix adding warnings to ParserOutput (T396768)]] (duration: 13m 09s) [production]
13:53 <marostegui@cumin1002> dbctl commit (dc=all): 'es1026 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P78049 and previous config saved to /var/cache/conftool/dbconfig/20250616-135301-root.json [production]
13:52 <jynus@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on ms-backup2001.codfw.wmnet with reason: Maintenance and reboot [production]
13:47 <phuedx@deploy1003> phuedx, matmarex: Continuing with sync [production]
13:47 <sukhe> enable puppet and run agent on cephosd1001 [production]
13:46 <wmbot~dcaro@acme> END (PASS) - Cookbook wmcs.toolforge.component.deploy (exit_code=0) for component components-api [toolsbeta]
13:45 <jelto@cumin1002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab-runner2002.codfw.wmnet with reason: host reimage [production]