2023-03-27
10:03 <Emperor> depool ms-fe2009 [production]
09:47 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main1003.eqiad.wmnet with reason: stop kafka and dist-upgrade [production]
09:47 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main1003.eqiad.wmnet with reason: stop kafka and dist-upgrade [production]
09:45 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45295 [production]
09:44 <ayounsi@cumin1001> START - Cookbook sre.network.peering with action 'email' for AS: 45295 [production]
09:41 <cgoubert@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
09:39 <cgoubert@cumin1001> START - Cookbook sre.dns.netbox [production]
08:58 <cgoubert@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
08:58 <cgoubert@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for mw-api-int - cgoubert@cumin1001" [production]
08:57 <cgoubert@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for mw-api-int - cgoubert@cumin1001" [production]
08:55 <cgoubert@cumin1001> START - Cookbook sre.dns.netbox [production]
08:47 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons. [production]
08:39 <ladsgroup@deploy1002> Finished scap: Backport for [[gerrit:903186|EntityUsageTable: Mark query as read-only (T332941)]] (duration: 18m 15s) [production]
08:30 <ladsgroup@deploy1002> ladsgroup: Backport for [[gerrit:903186|EntityUsageTable: Mark query as read-only (T332941)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
08:28 <jynus> restarting bacula at backup1001 T331510 [production]
08:25 <urbanecm@deploy2002> Synchronized wmf-config/InitialiseSettings.php: 63dd23b5ceaba35c8d9682493dd21d99a20fc8f7: [Growth] eswiki: Enable mentorship for 50% of newcomers (T332737, T285235) (duration: 06m 09s) [production]
08:20 <ladsgroup@deploy1002> Started scap: Backport for [[gerrit:903186|EntityUsageTable: Mark query as read-only (T332941)]] [production]
08:18 <urbanecm@deploy2002> Backport cancelled. [production]
08:06 <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:902734|GrowthMentors.json: Add a write-only username field (T331444)]] (duration: 07m 52s) [production]
08:03 <marostegui> Failover m1 from db1164 to db1101 - T331510 [production]
07:59 <urbanecm@deploy2002> urbanecm: Backport for [[gerrit:902734|GrowthMentors.json: Add a write-only username field (T331444)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
07:58 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:902734|GrowthMentors.json: Add a write-only username field (T331444)]] [production]
07:55 <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:902741|SpecialWikiSets: Avoid calling WikiSet::getId on null (T333075)]] (duration: 16m 45s) [production]
07:52 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45949 and previous config saved to /var/cache/conftool/dbconfig/20230327-075206-root.json [production]
07:48 <urbanecm@deploy2002> urbanecm: Backport for [[gerrit:902741|SpecialWikiSets: Avoid calling WikiSet::getId on null (T333075)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
07:39 <jynus> disabling puppet and shutting down bacula at backup1001 T331510 [production]
07:38 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:902741|SpecialWikiSets: Avoid calling WikiSet::getId on null (T333075)]] [production]
07:37 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45948 and previous config saved to /var/cache/conftool/dbconfig/20230327-073701-root.json [production]
07:21 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45947 and previous config saved to /var/cache/conftool/dbconfig/20230327-072156-root.json [production]
07:06 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45946 and previous config saved to /var/cache/conftool/dbconfig/20230327-070651-root.json [production]
06:51 <marostegui> dbmaint s3 eqiad Rename flaggedrevs tables on db1123 ptwikisource T332594 [production]
06:51 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P45945 and previous config saved to /var/cache/conftool/dbconfig/20230327-065147-root.json [production]
06:40 <marostegui> Rename flaggedrevs tables on db1123 ptwikisource T332594 [production]
06:36 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P45944 and previous config saved to /var/cache/conftool/dbconfig/20230327-063642-root.json [production]
05:40 <kart_> Updated cxserver to 2023-03-17-133444-production (T332379 + build changes) [production]
05:38 <kartik@deploy2002> helmfile [codfw] DONE helmfile.d/services/cxserver: apply [production]
05:37 <kartik@deploy2002> helmfile [codfw] START helmfile.d/services/cxserver: apply [production]
05:28 <kartik@deploy2002> helmfile [eqiad] DONE helmfile.d/services/cxserver: apply [production]
05:28 <kartik@deploy2002> helmfile [eqiad] START helmfile.d/services/cxserver: apply [production]
05:24 <kartik@deploy2002> helmfile [staging] DONE helmfile.d/services/cxserver: apply [production]
05:23 <kartik@deploy2002> helmfile [staging] START helmfile.d/services/cxserver: apply [production]
05:19 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1120 T332292', diff saved to https://phabricator.wikimedia.org/P45942 and previous config saved to /var/cache/conftool/dbconfig/20230327-051941-root.json [production]
05:14 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2132,2160].codfw.wmnet,db[1101,1117,1164].eqiad.wmnet with reason: m1 master switch T331510 [production]
05:14 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db[2132,2160].codfw.wmnet,db[1101,1117,1164].eqiad.wmnet with reason: m1 master switch T331510 [production]
2023-03-25
07:54 <hashar@deploy2002> Finished deploy [integration/docroot@ab848e3]: build: Updating eslint-config-wikimedia to 0.24.0 (duration: 00m 08s) [production]
07:54 <hashar@deploy2002> Started deploy [integration/docroot@ab848e3]: build: Updating eslint-config-wikimedia to 0.24.0 [production]
00:59 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doc1002.eqiad.wmnet with reason: WIP-known-to-be-debugged-new-host [production]
00:58 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on doc1002.eqiad.wmnet with reason: WIP-known-to-be-debugged-new-host [production]
00:57 <mutante> doc1002 - issue is mismatched UIDs again, most likely. doc-uploader's UID is assigned to debmonitor on the new host [production]
00:56 <mutante> doc1002 - manually running rsync to doc2002 - which failed with status 23 when started by timer [production]