production SAL

2021-07-20
11:06	<oblivian@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
11:03	<oblivian@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
10:58	<oblivian@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
10:57	<oblivian@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
10:53	<oblivian@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .	[production]
10:43	<hnowlan@puppetmaster1001>	conftool action : set/weight=10; selector: name=maps100[79].eqiad.wmnet	[production]
10:35	<hnowlan@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=maps100[79].eqiad.wmnet	[production]
10:11	<jgiannelos@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .	[production]
09:39	<kormat@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Deploying schema change to s6 T281058	[production]
09:39	<kormat@cumin1001>	START - Cookbook sre.hosts.downtime for 1:00:00 on 14 hosts with reason: Deploying schema change to s6 T281058	[production]
08:27	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mw2352.codfw.wmnet	[production]
08:21	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host mw2352.codfw.wmnet	[production]
08:02	<btullis>	racadm serveraction powercycle on an-worker1106 due to CPU soft lock-ups on host	[production]
07:54	<jmm@cumin2002>	END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host idp-test2001.wikimedia.org	[production]
07:50	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host idp-test2001.wikimedia.org	[production]
07:10	<jmm@puppetmaster1001>	conftool action : set/pooled=no; selector: name=ldap-replica1004.wikimedia.org	[production]
03:17	<eileen>	civicrm revision changed from 20e9ef6bbb to 819c11307d, config revision is bb405c5232	[production]
2021-07-19
20:48	<urbanecm>	Deploy security patch for T286884	[production]
20:29	<vgutierrez>	pool text@codfw - T286921	[production]
20:23	<volans@cumin2002>	END (PASS) - Cookbook sre.dns.netbox (exit_code=0)	[production]
20:18	<volans@cumin2002>	START - Cookbook sre.dns.netbox	[production]
20:08	<dancy@deploy1002>	Synchronized php-1.37.0-wmf.14/includes/export/WikiExporter.php: Backport: [[gerrit:705467|prevent PageIdentity checks in RevisionStore from breaking xml dumps (T286877)]] (duration: 00m 58s)	[production]
19:21	<Jeff_Green>	authdns-update to remove payments100[1-4].frack.eqiad.wmnet	[production]
19:14	<dancy@deploy1002>	Synchronized php-1.37.0-wmf.14/includes/Revision/RevisionStore.php: Backport: [[gerrit:705448|Add sanity check to newRevisionFromRowAndSlots. (T286877)]] (duration: 00m 57s)	[production]
18:53	<vgutierrez>	running puppet and restarting pybal on lvs2009 - T286921	[production]
18:46	<topranks>	Running homer to re-enable port xe-2/0/43 on asw2-a2-codfw (lvs2009) - T286921	[production]
18:46	<brennen>	gerrit1001: restarting gerrit	[production]
18:40	<vgutierrez>	stop pybal on lvs2009 - T286921	[production]
18:38	<brennen>	re-enabling puppet on gerrit1001	[production]
18:35	<vgutierrez>	running puppet and restarting pybal on lvs2010 - T286921	[production]
18:27	<ryankemper>	T264053 Deploying fix for timer issue on relforge: `ryankemper@cumin1001:~$ sudo cumin -b 2 'P{relforge*}' 'sudo systemctl stop elasticsearch-disable-readahead.timer && sudo systemctl disable elasticsearch-disable-readahead.timer && rm -fv /etc/systemd/system/elasticsearch-disable-readahead.timer && rm -fv /usr/lib/systemd/system/elasticsearch-disable-readahead.timer && sudo run-puppet-agent'`	[production]
18:27	<topranks>	Running homer to re-enable port xe-2/0/44 on asw2-a2-codfw (lvs2010)	[production]
18:27	<ryankemper>	T264053 Deploying fix for timer issue on cloudelastic: `ryankemper@cumin1001:~$ sudo cumin -b 6 'P{cloudelastic*}' 'sudo systemctl stop elasticsearch-disable-readahead.timer && sudo systemctl disable elasticsearch-disable-readahead.timer && rm -fv /etc/systemd/system/elasticsearch-disable-readahead.timer && rm -fv /usr/lib/systemd/system/elasticsearch-disable-readahead.timer && sudo run-puppet-agent'`	[production]
18:22	<vgutierrez>	disable puppet & stop pybal on lvs2010 - T286921	[production]
18:20	<vgutierrez>	enabling pybal on lvs2007 - T286921	[production]
18:19	<ryankemper>	T264053 Deploying fix for timer issue: `ryankemper@cumin1001:~$ sudo cumin -b 36 'P{elastic*}' 'sudo systemctl stop elasticsearch-disable-readahead.timer && sudo systemctl disable elasticsearch-disable-readahead.timer && rm -fv /etc/systemd/system/elasticsearch-disable-readahead.timer && rm -fv /usr/lib/systemd/system/elasticsearch-disable-readahead.timer && sudo run-puppet-agent'`	[production]
18:14	<topranks>	Running homer to re-enable asw-a2-codfw xe-2/0/45 port [lvs2007]	[production]
18:06	<dancy@deploy1002>	Synchronized .pipeline: Config: [[gerrit:705437|pipeline: Perform mergeMessageFileList and rebuildLocalisationCache separately]] (duration: 00m 56s)	[production]
17:54	<mbsantos@deploy1002>	Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 22s)	[production]
17:54	<mbsantos@deploy1002>	Started deploy [tilerator/deploy@82e5f94]: (no justification provided)	[production]
17:53	<mbsantos@deploy1002>	Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 22s)	[production]
17:53	<mbsantos@deploy1002>	Started deploy [tilerator/deploy@82e5f94]: (no justification provided)	[production]
17:53	<mbsantos@deploy1002>	Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 21s)	[production]
17:53	<mbsantos@deploy1002>	Started deploy [tilerator/deploy@82e5f94]: (no justification provided)	[production]
17:52	<mbsantos@deploy1002>	Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 15s)	[production]
17:52	<mbsantos@deploy1002>	Started deploy [tilerator/deploy@82e5f94]: (no justification provided)	[production]
17:52	<mbsantos@deploy1002>	Finished deploy [tilerator/deploy@82e5f94]: (no justification provided) (duration: 00m 16s)	[production]
17:51	<mbsantos@deploy1002>	Started deploy [tilerator/deploy@82e5f94]: (no justification provided)	[production]
17:42	<ryankemper>	[Elastic] Noted `Jul 16 18:31:20 elastic2038 elasticsearch[957]: 2021-07-16 18:31:20,657 main ERROR Unknown GELF server hostname:udp:logstash.svc.eqiad.wmnet` in elasticsearch service logs (unit had been running for 2 days) thus the restart of the elasticsearch service	[production]
17:41	<ryankemper>	[Elastic] Restarted elasticsearch services on `elastic2038`; afterwards restarted prometheus exporters; no units failed any longer	[production]