2021-08-31
14:19 <ottomata> merged a change to service_auto_restart.pp that makes service name matching more explicit. Tested in deployment-prep and nothing bad happened; logging here in case something bad does happen in prod. https://gerrit.wikimedia.org/r/c/operations/puppet/+/697605 [production]
14:09 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
14:09 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
14:07 <otto@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
14:05 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' . [production]
14:05 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' . [production]
14:03 <otto@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' . [production]
14:03 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
14:02 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
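The eventgate entries above are the deployment tooling's record of per-environment syncs. A minimal sketch of the underlying invocation, assuming the standard per-service helmfile directory on the deploy host (the path is an assumption, not taken from the log):

    # Sketch only: the chart directory path is assumed; one environment at a time.
    cd /srv/deployment-charts/helmfile.d/services/eventgate-analytics
    helmfile -e staging sync
    helmfile -e eqiad sync
    helmfile -e codfw sync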
14:02 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on puppetdb2002.codfw.wmnet with reason: puppetdb maintenance - T289779 [production]
14:02 <jbond@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on puppetdb2002.codfw.wmnet with reason: puppetdb maintenance - T289779 [production]
14:02 <jbond@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on puppetdb1002.eqiad.wmnet with reason: puppetdb maintenance - T289779 [production]
14:02 <jbond@cumin1001> START - Cookbook sre.hosts.downtime for 4:00:00 on puppetdb1002.eqiad.wmnet with reason: puppetdb maintenance - T289779 [production]
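These downtime entries come from the sre.hosts.downtime cookbook, run from a cumin host; a hedged sketch of one such call, with option spellings assumed rather than copied from the log:

    # Assumed flags for duration, reason and task id; the last argument is the target host query.
    sudo cookbook sre.hosts.downtime --hours 4 \
        -r 'puppetdb maintenance' -t T289779 \
        'puppetdb1002.eqiad.wmnet'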
14:01 <otto@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
14:00 <otto@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [production]
13:47 <jbond> disable puppet fleet-wide to perform puppetdb maintenance T263578 [production]
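Fleet-wide puppet disables are usually driven through cumin calling the host-side disable-puppet wrapper; a sketch under that assumption (the 'A:all' alias and the message text are illustrative):

    # Assumption: 'A:all' selects every host; disable-puppet records the given message as the reason.
    sudo cumin 'A:all' 'disable-puppet "puppetdb maintenance - T263578"'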
13:41 <otto@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [production]
13:37 <urbanecm> Start `mwscript extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php --wiki=nlwiki --verbose` in a tmux session at mwmaint2002 [production]
13:28 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=maps1010.eqiad.wmnet [production]
13:06 <dzahn@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' . [production]
13:04 <jayme@deploy1002> helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [production]
12:59 <urbanecm> [urbanecm@mwmaint2002 ~]$ sudo -u www-data kill 133282 # stop updateMenteeData.php at frwiki [production]
12:52 <jelto> run kubectl scale deployments.apps -n ci mediawiki-bruce --replicas=0 to stop image pulling and reduce I/O on kubestage1001 [production]
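For the scale-down above, the matching check-and-restore steps look roughly like this (the restore replica count of 1 is an assumption, not taken from the log):

    kubectl -n ci get deployment mediawiki-bruce                # note the current replica count
    kubectl -n ci scale deployment mediawiki-bruce --replicas=0
    # later, once kubestage1001 has recovered:
    kubectl -n ci scale deployment mediawiki-bruce --replicas=1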
12:42 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue [production]
12:42 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue [production]
11:38 <jbond> sudo gnt-instance modify --disk add:size=100G puppetdb2002.codfw.wmnet T263578 [production]
11:38 <jbond> sudo gnt-instance modify --disk add:size=100G puppetdb1002.eqiad.wmnet T263578 [production]
11:37 <jbond> sudo gnt-instance modify --disk add:size=100G puppetdb2002.codfw.wmnet [production]
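After a Ganeti disk add like the ones above, the instance has to pick up the new device before it can be partitioned and mounted; a hedged verification sketch (guest-side device naming is an assumption):

    # On the Ganeti master: confirm the extra 100G disk is attached to the instance.
    sudo gnt-instance info puppetdb1002.eqiad.wmnet
    # On the guest, typically after a reboot, the new disk appears as an extra block device.
    lsblk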
11:35 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
11:33 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
11:31 <urbanecm@deploy1002> Synchronized php-1.37.0-wmf.20/extensions/GrowthExperiments/maintenance/updateMenteeData.php: 53a1856128edb4ec3a5ea8840fb6755a1703f7ac: updateMenteeData: Send timing to statsd (T278971) (duration: 00m 57s) [production]
11:11 <mwdebug-deploy@deploy1002> helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
11:09 <mwdebug-deploy@deploy1002> helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . [production]
11:07 <urbanecm> EU B&C window done [production]
11:06 <urbanecm@deploy1002> Synchronized wmf-config/InitialiseSettings.php: eb482e3fa88a87166b990fd9b87d0ccbbf971290: Offer the DiscussionTools reply tool as opt-out setting at 21 phase 2 Wikipedias (T288483) (duration: 00m 57s) [production]
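The 'Synchronized ...' lines are emitted by scap; a minimal sketch of the command behind them, run from the staging checkout on the deploy host (only the file path and log message are taken from the entry above):

    # scap sync-file <path relative to the staging area> '<log message>'
    scap sync-file wmf-config/InitialiseSettings.php \
        'Offer the DiscussionTools reply tool as opt-out setting at 21 phase 2 Wikipedias (T288483)'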
10:38 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1010.eqiad.wmnet with reason: Resyncing from master [production]
10:38 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime for 5:00:00 on maps1010.eqiad.wmnet with reason: Resyncing from master [production]
10:23 <hnowlan@puppetmaster1001> conftool action : set/pooled=no; selector: name=maps1010.eqiad.wmnet [production]
10:23 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=maps1008.eqiad.wmnet [production]
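The conftool actions above map onto confctl select/set calls; roughly:

    # Depool maps1010 for the resync, pool maps1008, then verify the resulting state.
    sudo confctl select 'name=maps1010.eqiad.wmnet' set/pooled=no
    sudo confctl select 'name=maps1008.eqiad.wmnet' set/pooled=yes
    sudo confctl select 'name=maps1010.eqiad.wmnet' get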
10:14 <marostegui> Optimize huwiki.flaggedtemplates T290057 [production]
10:11 <hnowlan@puppetmaster1001> conftool action : set/pooled=yes; selector: name=maps1008.eqiad.wmnet [production]
08:39 <marostegui> Optimize plwiki.flaggedtemplates T290057 [production]
08:18 <marostegui> Optimize cewiki.flaggedtemplates T290057 [production]
08:05 <marostegui> Optimize plwiktionary.flaggedtemplates T290057 [production]
07:44 <marostegui> Optimize ruwiki.flaggedtemplates T290057 [production]
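Each 'Optimize <wiki>.flaggedtemplates' entry corresponds to an OPTIMIZE TABLE run against that wiki's database; a sketch, with the target host written as a placeholder since the log does not record it:

    # db1xxx is a placeholder host; OPTIMIZE TABLE rebuilds the table and reclaims space.
    mysql -h db1xxx.eqiad.wmnet huwiki -e "OPTIMIZE TABLE flaggedtemplates;"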
07:01 <XioNoX> drain eqsin-codfw link [production]
06:56 <marostegui@cumin1001> dbctl commit (dc=all): 'db2110 (re)pooling @ 100%: Slowly repool after reimage T288803', diff saved to https://phabricator.wikimedia.org/P17113 and previous config saved to /var/cache/conftool/dbconfig/20210831-065600-root.json [production]
06:40 <marostegui@cumin1001> dbctl commit (dc=all): 'db2110 (re)pooling @ 75%: Slowly repool after reimage T288803', diff saved to https://phabricator.wikimedia.org/P17112 and previous config saved to /var/cache/conftool/dbconfig/20210831-064056-root.json [production]
06:25 <marostegui@cumin1001> dbctl commit (dc=all): 'db2110 (re)pooling @ 50%: Slowly repool after reimage T288803', diff saved to https://phabricator.wikimedia.org/P17111 and previous config saved to /var/cache/conftool/dbconfig/20210831-062553-root.json [production]
06:10 <marostegui@cumin1001> dbctl commit (dc=all): 'db2110 (re)pooling @ 25%: Slowly repool after reimage T288803', diff saved to https://phabricator.wikimedia.org/P17110 and previous config saved to /var/cache/conftool/dbconfig/20210831-061049-root.json [production]
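Read bottom-up, the db2110 entries are a staged repool: 25% -> 50% -> 75% -> 100%, with a config commit at each step. A hedged sketch of a single step, assuming dbctl subcommand and flag spellings that match the logged messages:

    # One step of a slow repool; repeat with a higher percentage after checking replication and load.
    sudo dbctl instance db2110 pool -p 25
    sudo dbctl config commit -m 'db2110 (re)pooling @ 25%: Slowly repool after reimage T288803'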