2023-02-07
11:40 <jiji@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2044.codfw.wmnet with reason: host reimage [production]
11:37 <jiji@cumin1001> START - Cookbook sre.hosts.downtime for 2:00:00 on mc2044.codfw.wmnet with reason: host reimage [production]
11:33 <moritzm> installing imagemagick security updates on buster [production]
11:29 <jiji@cumin1001> START - Cookbook sre.hosts.reimage for host mc1041.eqiad.wmnet with OS bullseye [production]
11:21 <jiji@cumin1001> START - Cookbook sre.hosts.reimage for host mc2044.codfw.wmnet with OS bullseye [production]
10:51 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons. [production]
10:49 <elukey@cumin1001> END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons. [production]
10:19 <oblivian@cumin2002> END (PASS) - Cookbook sre.discovery.datacenter-route (exit_code=0) pool all active/active services in eqiad: Pooling eqiad for codfw depool today [production]
10:19 <oblivian@cumin2002> START - Cookbook sre.discovery.datacenter-route pool all active/active services in eqiad: Pooling eqiad for codfw depool today [production]
10:17 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast1003.wikimedia.org with OS bullseye [production]
10:13 <oblivian@cumin2002> END (FAIL) - Cookbook sre.discovery.datacenter-route (exit_code=93) pool all active/active services in eqiad: Pooling eqiad for codfw depool today [production]
10:12 <oblivian@cumin2002> START - Cookbook sre.discovery.datacenter-route pool all active/active services in eqiad: Pooling eqiad for codfw depool today [production]
10:01 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast1003.wikimedia.org with reason: host reimage [production]
09:56 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on bast1003.wikimedia.org with reason: host reimage [production]
09:44 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host bast1003.wikimedia.org with OS bullseye [production]
09:42 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2002.wikimedia.org with OS bullseye [production]
09:24 <akosiaris@deploy1002> helmfile [eqiad] DONE helmfile.d/services/changeprop: sync [production]
09:23 <akosiaris@deploy1002> helmfile [eqiad] START helmfile.d/services/changeprop: sync [production]
09:22 <jmm@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2002.wikimedia.org with reason: host reimage [production]
09:20 <akosiaris@deploy1002> helmfile [codfw] DONE helmfile.d/services/changeprop: sync [production]
09:20 <akosiaris@deploy1002> helmfile [codfw] START helmfile.d/services/changeprop: sync [production]
09:20 <akosiaris> add wiktionary to mobile-sections rerenders. T226931 [production]
09:19 <jmm@cumin2002> START - Cookbook sre.hosts.downtime for 2:00:00 on bast2002.wikimedia.org with reason: host reimage [production]
09:19 <akosiaris@deploy1002> helmfile [staging] DONE helmfile.d/services/changeprop: sync [production]
09:19 <akosiaris@deploy1002> helmfile [staging] START helmfile.d/services/changeprop: sync [production]
09:08 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons. [production]
09:02 <jmm@cumin2002> START - Cookbook sre.hosts.reimage for host bast2002.wikimedia.org with OS bullseye [production]
08:50 <vgutierrez> rolling upgrade to HAProxy 2.4.21 in cp nodes [production]
08:48 <kostajh> UTC morning deploys done [production]
08:48 <kharlan@deploy1002> Finished scap: Backport for [[gerrit:883236|[Growth] Remove mentor list variables (T321501)]], [[gerrit:883153|Remove GEMentorProvider (T321501)]] (duration: 12m 48s) [production]
08:37 <kharlan@deploy1002> urbanecm and kharlan: Backport for [[gerrit:883236|[Growth] Remove mentor list variables (T321501)]], [[gerrit:883153|Remove GEMentorProvider (T321501)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet [production]
08:35 <kharlan@deploy1002> Started scap: Backport for [[gerrit:883236|[Growth] Remove mentor list variables (T321501)]], [[gerrit:883153|Remove GEMentorProvider (T321501)]] [production]
08:30 <moritzm> installing imagemagick security updates on Thumbor T328901 [production]
08:28 <kharlan@deploy1002> Finished scap: Backport for [[gerrit:886343|GrowthExperiments: Disable leveling up features in production (T328757)]] (duration: 12m 11s) [production]
08:18 <kharlan@deploy1002> kharlan: Backport for [[gerrit:886343|GrowthExperiments: Disable leveling up features in production (T328757)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet [production]
08:16 <kharlan@deploy1002> Started scap: Backport for [[gerrit:886343|GrowthExperiments: Disable leveling up features in production (T328757)]] [production]
08:14 <kharlan@deploy1002> backport aborted: (duration: 00m 07s) [production]
07:00 <marostegui> Failover m3 from db1159 to db1164 - T328404 [production]
06:31 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db2110 in API', diff saved to https://phabricator.wikimedia.org/P43758 and previous config saved to /var/cache/conftool/dbconfig/20230207-063147-root.json [production]
06:28 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1187', diff saved to https://phabricator.wikimedia.org/P43757 and previous config saved to /var/cache/conftool/dbconfig/20230207-062826-root.json [production]
04:58 <mwpresync@deploy1002> Pruned MediaWiki: 1.40.0-wmf.20 (duration: 02m 20s) [production]
04:55 <mwpresync@deploy1002> Finished scap: testwikis wikis to 1.40.0-wmf.22 refs T325585 (duration: 53m 11s) [production]
04:02 <mwpresync@deploy1002> Started scap: testwikis wikis to 1.40.0-wmf.22 refs T325585 [production]
2023-02-06
23:17 <pt1979@cumin2002> END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2421.mgmt.codfw.wmnet with reboot policy FORCED [production]
23:01 <pt1979@cumin2002> START - Cookbook sre.hosts.provision for host mw2421.mgmt.codfw.wmnet with reboot policy FORCED [production]
22:55 <ryankemper> T327925 Depooled codfw wdqs hosts: `ryankemper@cumin2002:~$ sudo -E cumin -b 3 'wdqs[2003-2004,2009]*' 'sudo depool'` [production]
22:51 <bking@cumin2002> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 13 hosts with reason: switch upgrade [production]
22:51 <bking@cumin2002> START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 13 hosts with reason: switch upgrade [production]
22:48 <ryankemper> T327925 Banned `elastic[2037-2040,2055-2056,2061-2062,2069,2073-2076]` on codfw elastic [production]
22:42 <inflatador> bking@cumin2002 banning Elastic nodes from cluster in preparation for T327925 [production]