2023-04-24
13:24 <eoghan@cumin1001> START - Cookbook sre.gitlab.failover Failover of gitlab from gitlab1003.wikimedia.org to gitlab1004.wikimedia.org [production]
13:24 <cgoubert@deploy2002> helmfile [codfw] DONE helmfile.d/services/push-notifications: apply [production]
13:23 <cgoubert@deploy2002> helmfile [codfw] START helmfile.d/services/push-notifications: apply [production]
13:20 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:910723|Update InterwikiSortOrders (T335019)]] [production]
13:15 <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:910018|Disable wmgNewUserMessageOnAutoCreate from Extension:NewUserMessage on knwikisource (T335090)]] (duration: 11m 02s) [production]
13:14 <cgoubert@deploy2002> helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply [production]
13:14 <cgoubert@deploy2002> helmfile [eqiad] START helmfile.d/services/push-notifications: apply [production]
13:13 <claime> Deploying push-notifications production for switch to mw-api-int - T334061 [production]
13:05 <urbanecm@deploy2002> urbanecm and anzx: Backport for [[gerrit:910018|Disable wmgNewUserMessageOnAutoCreate from Extension:NewUserMessage on knwikisource (T335090)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet [production]
13:04 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:910018|Disable wmgNewUserMessageOnAutoCreate from Extension:NewUserMessage on knwikisource (T335090)]] [production]
12:56 <bking@cumin1001> START - Cookbook sre.wdqs.data-transfer [production]
12:29 <cgoubert@deploy2002> helmfile [staging] DONE helmfile.d/services/push-notifications: apply [production]
12:28 <cgoubert@deploy2002> helmfile [staging] START helmfile.d/services/push-notifications: apply [production]
12:28 <claime> Deploying push-notifications staging for switch to mw-api-int - T334061 [production]
11:23 <cgoubert@cumin1001> conftool action : set/weight=30; selector: dc=codfw,cluster=api_appserver,service=canary [production]
11:21 <cgoubert@cumin1001> conftool action : set/weight=25; selector: dc=codfw,cluster=appserver,service=canary [production]
11:19 <cgoubert@cumin1001> conftool action : set/weight=30; selector: dc=eqiad,cluster=appserver,service=canary [production]
11:18 <cgoubert@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
11:17 <cgoubert@cumin1001> START - Cookbook sre.dns.netbox [production]
11:14 <cgoubert@cumin1001> conftool action : set/weight=10; selector: dc=codfw,cluster=parsoid,service=canary [production]
11:13 <cgoubert@cumin1001> conftool action : set/weight=10; selector: dc=eqiad,cluster=parsoid,service=canary [production]
11:13 <claime> Fixing appserver clusters canary weights [production]
10:56 <jynus> deployed new ssh key for jcrespo on production cluster [production]
10:29 <claime> Datacenter switchover live testing setting db to read-only and back in eqiad successful - T327920 [production]
10:29 <cgoubert@cumin1001> END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) [production]
10:29 <cgoubert@cumin1001> START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite [production]
10:29 <cgoubert@cumin1001> END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) [production]
10:29 <cgoubert@cumin1001> START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly [production]
10:27 <claime> Datacenter switchover live testing setting db to read-only and back in eqiad - T327920 [production]
10:26 <jmm@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Ilooremeta out of all services on: 801 hosts [production]
10:26 <jmm@cumin2002> START - Cookbook sre.idm.logout Logging Ilooremeta out of all services on: 801 hosts [production]
10:24 <jmm@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Ilooremeta out of all services on: 1262 hosts [production]
10:22 <jmm@cumin2002> START - Cookbook sre.idm.logout Logging Ilooremeta out of all services on: 1262 hosts [production]
10:22 <jmm@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hghani out of all services on: 1262 hosts [production]
10:20 <jmm@cumin2002> START - Cookbook sre.idm.logout Logging Hghani out of all services on: 1262 hosts [production]
10:18 <jmm@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hghani out of all services on: 801 hosts [production]
10:18 <jmm@cumin2002> START - Cookbook sre.idm.logout Logging Hghani out of all services on: 801 hosts [production]
10:17 <jmm@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hibashaath out of all services on: 801 hosts [production]
10:17 <jmm@cumin2002> START - Cookbook sre.idm.logout Logging Hibashaath out of all services on: 801 hosts [production]
10:16 <jmm@cumin2002> END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Hibashaath out of all services on: 1262 hosts [production]
10:14 <jmm@cumin2002> START - Cookbook sre.idm.logout Logging Hibashaath out of all services on: 1262 hosts [production]
10:11 <marostegui> Enable replication eqiad -> codfw on s1 dbmaint eqiad T335266 [production]
10:10 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 38 hosts with reason: Enabling replication T335266 [production]
10:09 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 0:15:00 on 38 hosts with reason: Enabling replication T335266 [production]
10:08 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 35 hosts with reason: Enabling replication T335266 [production]
10:07 <marostegui> Enable replication eqiad -> codfw on s4 dbmaint eqiad T335266 [production]
10:07 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 0:15:00 on 35 hosts with reason: Enabling replication T335266 [production]
10:07 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on 24 hosts with reason: Enabling replication T335266 [production]
10:06 <marostegui> Enable replication eqiad -> codfw on s3 dbmaint eqiad T335266 [production]
10:06 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 0:15:00 on 24 hosts with reason: Enabling replication T335266 [production]