2023-03-27
08:57 <cgoubert@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for mw-api-int - cgoubert@cumin1001" [production]
08:55 <cgoubert@cumin1001> START - Cookbook sre.dns.netbox [production]
08:47 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons. [production]
08:39 <ladsgroup@deploy1002> Finished scap: Backport for [[gerrit:903186|EntityUsageTable: Mark query as read-only (T332941)]] (duration: 18m 15s) [production]
08:30 <ladsgroup@deploy1002> ladsgroup: Backport for [[gerrit:903186|EntityUsageTable: Mark query as read-only (T332941)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
08:28 <jynus> restarting bacula at backup1001 T331510 [production]
08:25 <urbanecm@deploy2002> Synchronized wmf-config/InitialiseSettings.php: 63dd23b5ceaba35c8d9682493dd21d99a20fc8f7: [Growth] eswiki: Enable mentorship for 50% of newcomers (T332737, T285235) (duration: 06m 09s) [production]
08:20 <ladsgroup@deploy1002> Started scap: Backport for [[gerrit:903186|EntityUsageTable: Mark query as read-only (T332941)]] [production]
08:18 <urbanecm@deploy2002> Backport cancelled. [production]
08:06 <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:902734|GrowthMentors.json: Add a write-only username field (T331444)]] (duration: 07m 52s) [production]
08:03 <marostegui> Failover m1 from db1164 to db1101 - T331510 [production]
07:59 <urbanecm@deploy2002> urbanecm: Backport for [[gerrit:902734|GrowthMentors.json: Add a write-only username field (T331444)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
07:58 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:902734|GrowthMentors.json: Add a write-only username field (T331444)]] [production]
07:55 <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:902741|SpecialWikiSets: Avoid calling WikiSet::getId on null (T333075)]] (duration: 16m 45s) [production]
07:52 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45949 and previous config saved to /var/cache/conftool/dbconfig/20230327-075206-root.json [production]
07:48 <urbanecm@deploy2002> urbanecm: Backport for [[gerrit:902741|SpecialWikiSets: Avoid calling WikiSet::getId on null (T333075)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
07:39 <jynus> disabling puppet and shutting down bacula at backup1001 T331510 [production]
07:38 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:902741|SpecialWikiSets: Avoid calling WikiSet::getId on null (T333075)]] [production]
07:37 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45948 and previous config saved to /var/cache/conftool/dbconfig/20230327-073701-root.json [production]
07:21 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45947 and previous config saved to /var/cache/conftool/dbconfig/20230327-072156-root.json [production]
07:06 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45946 and previous config saved to /var/cache/conftool/dbconfig/20230327-070651-root.json [production]
06:51 <marostegui> dbmaint s3 eqiad Rename flaggedrevs tables on db1123 ptwikisource T332594 [production]
06:51 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P45945 and previous config saved to /var/cache/conftool/dbconfig/20230327-065147-root.json [production]
06:40 <marostegui> Rename flaggedrevs tables on db1123 ptwikisource T332594 [production]
06:36 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P45944 and previous config saved to /var/cache/conftool/dbconfig/20230327-063642-root.json [production]
05:40 <kart_> Updated cxserver to 2023-03-17-133444-production (T332379 + build changes) [production]
05:38 <kartik@deploy2002> helmfile [codfw] DONE helmfile.d/services/cxserver: apply [production]
05:37 <kartik@deploy2002> helmfile [codfw] START helmfile.d/services/cxserver: apply [production]
05:28 <kartik@deploy2002> helmfile [eqiad] DONE helmfile.d/services/cxserver: apply [production]
05:28 <kartik@deploy2002> helmfile [eqiad] START helmfile.d/services/cxserver: apply [production]
05:24 <kartik@deploy2002> helmfile [staging] DONE helmfile.d/services/cxserver: apply [production]
05:23 <kartik@deploy2002> helmfile [staging] START helmfile.d/services/cxserver: apply [production]
05:19 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1120 T332292', diff saved to https://phabricator.wikimedia.org/P45942 and previous config saved to /var/cache/conftool/dbconfig/20230327-051941-root.json [production]
05:14 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2132,2160].codfw.wmnet,db[1101,1117,1164].eqiad.wmnet with reason: m1 master switch T331510 [production]
05:14 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db[2132,2160].codfw.wmnet,db[1101,1117,1164].eqiad.wmnet with reason: m1 master switch T331510 [production]
2023-03-25
07:54 <hashar@deploy2002> Finished deploy [integration/docroot@ab848e3]: build: Updating eslint-config-wikimedia to 0.24.0 (duration: 00m 08s) [production]
07:54 <hashar@deploy2002> Started deploy [integration/docroot@ab848e3]: build: Updating eslint-config-wikimedia to 0.24.0 [production]
00:59 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doc1002.eqiad.wmnet with reason: WIP-known-to-be-debugged-new-host [production]
00:58 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on doc1002.eqiad.wmnet with reason: WIP-known-to-be-debugged-new-host [production]
00:57 <mutante> doc1002 - issue is mismatched UIDs again, most likely. doc-uploader is debmonitor on new host [production]
00:56 <mutante> doc1002 - manually running rsync to doc2002 - which failed with status 23 when started by timer [production]
00:09 <tzatziki> removing 2 files for legal compliance [production]
2023-03-24
23:58 <denisse@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "doc2002 - denisse@cumin1001 - T332819" [production]
23:57 <denisse@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "doc2002 - denisse@cumin1001 - T332819" [production]
23:50 <tzatziki> removing 1 file for legal compliance [production]
21:08 <mutante> mwmaint1002 ferm rules for rsyncd_access from miscweb removed by puppet after I4fe17f397856361 which reverted a8af0339bde14018e8. manually deleted rsyncd config and stopped rsync service. complete noop on mwmaint2002 which is currently the active mwmaint server. T328907 [production]
18:50 <ebernhardson@deploy2002> Finished deploy [airflow-dags/search@fc69bf4]: Make mw rev recommendation create start_date configurable (duration: 00m 13s) [production]
18:50 <ebernhardson@deploy2002> Started deploy [airflow-dags/search@fc69bf4]: Make mw rev recommendation create start_date configurable [production]
18:30 <ebernhardson@deploy2002> Finished deploy [airflow-dags/search@220221d]: set start dates from transfer_to_es dags (duration: 00m 16s) [production]
18:30 <ebernhardson@deploy2002> Started deploy [airflow-dags/search@220221d]: set start dates from transfer_to_es dags [production]