2023-03-27
10:03 <Emperor> depool ms-fe2009 [production]
09:47 <elukey@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on kafka-main1003.eqiad.wmnet with reason: stop kafka and dist-upgrade [production]
09:47 <elukey@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on kafka-main1003.eqiad.wmnet with reason: stop kafka and dist-upgrade [production]
09:45 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 45295 [production]
09:44 <ayounsi@cumin1001> START - Cookbook sre.network.peering with action 'email' for AS: 45295 [production]
09:41 <cgoubert@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
09:39 <cgoubert@cumin1001> START - Cookbook sre.dns.netbox [production]
08:58 <cgoubert@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
08:58 <cgoubert@cumin1001> END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for mw-api-int - cgoubert@cumin1001" [production]
08:57 <cgoubert@cumin1001> START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for mw-api-int - cgoubert@cumin1001" [production]
08:55 <cgoubert@cumin1001> START - Cookbook sre.dns.netbox [production]
08:47 <elukey@cumin1001> START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons. [production]
08:39 <ladsgroup@deploy1002> Finished scap: Backport for [[gerrit:903186|EntityUsageTable: Mark query as read-only (T332941)]] (duration: 18m 15s) [production]
08:30 <ladsgroup@deploy1002> ladsgroup: Backport for [[gerrit:903186|EntityUsageTable: Mark query as read-only (T332941)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
08:28 <jynus> restarting bacula at backup1001 T331510 [production]
08:25 <urbanecm@deploy2002> Synchronized wmf-config/InitialiseSettings.php: 63dd23b5ceaba35c8d9682493dd21d99a20fc8f7: [Growth] eswiki: Enable mentorship for 50% of newcomers (T332737, T285235) (duration: 06m 09s) [production]
08:20 <ladsgroup@deploy1002> Started scap: Backport for [[gerrit:903186|EntityUsageTable: Mark query as read-only (T332941)]] [production]
08:18 <urbanecm@deploy2002> Backport cancelled. [production]
08:06 <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:902734|GrowthMentors.json: Add a write-only username field (T331444)]] (duration: 07m 52s) [production]
08:03 <marostegui> Failover m1 from db1164 to db1101 - T331510 [production]
07:59 <urbanecm@deploy2002> urbanecm: Backport for [[gerrit:902734|GrowthMentors.json: Add a write-only username field (T331444)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet [production]
07:58 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:902734|GrowthMentors.json: Add a write-only username field (T331444)]] [production]
07:55 <urbanecm@deploy2002> Finished scap: Backport for [[gerrit:902741|SpecialWikiSets: Avoid calling WikiSet::getId on null (T333075)]] (duration: 16m 45s) [production]
07:52 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 100%: Repooling', diff saved to https://phabricator.wikimedia.org/P45949 and previous config saved to /var/cache/conftool/dbconfig/20230327-075206-root.json [production]
07:48 <urbanecm@deploy2002> urbanecm: Backport for [[gerrit:902741|SpecialWikiSets: Avoid calling WikiSet::getId on null (T333075)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet [production]
07:39 <jynus> disabling puppet and shutting down bacula at backup1001 T331510 [production]
07:38 <urbanecm@deploy2002> Started scap: Backport for [[gerrit:902741|SpecialWikiSets: Avoid calling WikiSet::getId on null (T333075)]] [production]
07:37 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 75%: Repooling', diff saved to https://phabricator.wikimedia.org/P45948 and previous config saved to /var/cache/conftool/dbconfig/20230327-073701-root.json [production]
07:21 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 50%: Repooling', diff saved to https://phabricator.wikimedia.org/P45947 and previous config saved to /var/cache/conftool/dbconfig/20230327-072156-root.json [production]
07:06 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 25%: Repooling', diff saved to https://phabricator.wikimedia.org/P45946 and previous config saved to /var/cache/conftool/dbconfig/20230327-070651-root.json [production]
06:51 <marostegui> dbmaint s3 eqiad Rename flaggedrevs tables on db1123 ptwikisource T332594 [production]
06:51 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 10%: Repooling', diff saved to https://phabricator.wikimedia.org/P45945 and previous config saved to /var/cache/conftool/dbconfig/20230327-065147-root.json [production]
06:40 <marostegui> Rename flaggedrevs tables on db1123 ptwikisource T332594 [production]
06:36 <marostegui@cumin1001> dbctl commit (dc=all): 'db1120 (re)pooling @ 5%: Repooling', diff saved to https://phabricator.wikimedia.org/P45944 and previous config saved to /var/cache/conftool/dbconfig/20230327-063642-root.json [production]
05:40 <kart_> Updated cxserver to 2023-03-17-133444-production (T332379 + build changes) [production]
05:38 <kartik@deploy2002> helmfile [codfw] DONE helmfile.d/services/cxserver: apply [production]
05:37 <kartik@deploy2002> helmfile [codfw] START helmfile.d/services/cxserver: apply [production]
05:28 <kartik@deploy2002> helmfile [eqiad] DONE helmfile.d/services/cxserver: apply [production]
05:28 <kartik@deploy2002> helmfile [eqiad] START helmfile.d/services/cxserver: apply [production]
05:24 <kartik@deploy2002> helmfile [staging] DONE helmfile.d/services/cxserver: apply [production]
05:23 <kartik@deploy2002> helmfile [staging] START helmfile.d/services/cxserver: apply [production]
05:19 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1120 T332292', diff saved to https://phabricator.wikimedia.org/P45942 and previous config saved to /var/cache/conftool/dbconfig/20230327-051941-root.json [production]
05:14 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2132,2160].codfw.wmnet,db[1101,1117,1164].eqiad.wmnet with reason: m1 master switch T331510 [production]
05:14 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime for 1:00:00 on db[2132,2160].codfw.wmnet,db[1101,1117,1164].eqiad.wmnet with reason: m1 master switch T331510 [production]
2023-03-25
07:54 <hashar@deploy2002> Finished deploy [integration/docroot@ab848e3]: build: Updating eslint-config-wikimedia to 0.24.0 (duration: 00m 08s) [production]
07:54 <hashar@deploy2002> Started deploy [integration/docroot@ab848e3]: build: Updating eslint-config-wikimedia to 0.24.0 [production]
00:59 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doc1002.eqiad.wmnet with reason: WIP-known-to-be-debugged-new-host [production]
00:58 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on doc1002.eqiad.wmnet with reason: WIP-known-to-be-debugged-new-host [production]
00:57 <mutante> doc1002 - issue is mismatched UIDs again, most likely. doc-uploader's UID is assigned to debmonitor on the new host [production]
00:56 <mutante> doc1002 - manually running rsync to doc2002 - which failed with status 23 when started by timer [production]