production SAL

2021-07-26
08:31	<jmm@cumin2002>	END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet	[production]
08:27	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet	[production]
08:26	<jmm@cumin2002>	END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host sretest1001.eqiad.wmnet	[production]
08:11	<jmm@cumin2002>	START - Cookbook sre.hosts.reboot-single for host sretest1001.eqiad.wmnet	[production]
07:18	<_joe_>	docker-image prune on deneb T287222	[production]
07:17	<_joe_>	manage-production-images prune on deneb, T287222	[production]
07:08	<marostegui>	Optimize dewiki.logging in eqiad (there will be lag)	[production]
06:39	<moritzm>	installing krb5 security updates	[production]
05:55	<Amir1>	start cleaning up auto-review flagged revs logs in plwiki	[production]
2021-07-24
11:04	<urbanecm>	[urbanecm@mwmaint2002 ~]$ mwscript extensions/Translate/scripts/moveTranslatablePage.php --wiki=commonswiki --reason='OTRS -> VRTS renaming process; see [[Phab:T280392]] and [[Phab:T280397]]' --move-subpages 'Commons:OTRS' 'Commons:Volunteer Response Team' 'Martin Urbanec' # T287321	[production]
2021-07-23
19:11	<topranks>	Successfully re-pooled eqiad - reversed change from yesterday after successful line card replacement in cr2-codfw - T287110	[production]
19:02	<topranks>	De-pooling eqiad again after successful replacement of linecard in cr2-codfw T287110	[production]
18:26	<legoktm@deploy1002>	helmfile [codfw] Ran 'sync' command on namespace 'shellbox' for release 'main' .	[production]
18:24	<legoktm@deploy1002>	helmfile [eqiad] Ran 'sync' command on namespace 'shellbox' for release 'main' .	[production]
18:14	<topranks>	Turning up et-0/0/[0-1] and et-0/2/[0-1] interfaces on cr2-codfw after line card replacement slot 0.	[production]
18:12	<legoktm@deploy1002>	helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .	[production]
16:15	<effie>	enable puppet on mc-gp* hosts	[production]
15:47	<papaul>	powerdown wdqs2002 for IDRAC reset	[production]
15:45	<elukey@deploy1002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
15:44	<elukey@deploy1002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
15:11	<elukey>	stop ml-serve-ctrl1001 + gnt-instance modify -t plain ml-serve-ctrl1001.eqiad.wmnet on ganeti1009 + start instance back - T287238	[production]
14:36	<_joe_>	rebuilding httpd-fcgi, mediawiki-http fixing logging T285384	[production]
14:16	<brennen>	gitlab1001: running ansible to deploy [[gerrit:707236|fix puma exporter listen address]] (T275170)	[production]
13:35	<otto@deploy1002>	Finished deploy [analytics/refinery@15521b3]: Add property disabling gobblin lock - T271232 (duration: 03m 32s)	[production]
13:31	<otto@deploy1002>	Started deploy [analytics/refinery@15521b3]: Add property disabling gobblin lock - T271232	[production]
12:16	<jelto@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on mw[1440-1442].eqiad.wmnet with reason: setup new canary mw api servers in eqiad D8 https://phabricator.wikimedia.org/T279309	[production]
12:16	<jelto@cumin1001>	START - Cookbook sre.hosts.downtime for 3:00:00 on mw[1440-1442].eqiad.wmnet with reason: setup new canary mw api servers in eqiad D8 https://phabricator.wikimedia.org/T279309	[production]
12:15	<jelto@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on mw1439.eqiad.wmnet with reason: setup new canary mw api servers in eqiad D8 https://phabricator.wikimedia.org/T279309	[production]
12:15	<jelto@cumin1001>	START - Cookbook sre.hosts.downtime for 3:00:00 on mw1439.eqiad.wmnet with reason: setup new canary mw api servers in eqiad D8 https://phabricator.wikimedia.org/T279309	[production]
11:50	<marostegui>	Change innodb_checksum_algorithm to full_crc32 on pc1011-1014 and pc2011-2014 - T287244	[production]
11:17	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1446.eqiad.wmnet	[production]
11:17	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1445.eqiad.wmnet	[production]
11:11	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1443.eqiad.wmnet	[production]
11:11	<dzahn@cumin1001>	conftool action : set/weight=30; selector: name=mw144[3-6].eqiad.wmnet	[production]
11:00	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[1443,1445-1446].eqiad.wmnet with reason: new host	[production]
11:00	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[1443,1445-1446].eqiad.wmnet with reason: new host	[production]
10:58	<arturo>	adding packages to buster-wikimedia/thirdparty/kubeadm-k8s-1-19 @ apt1001	[production]
10:02	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1442.eqiad.wmnet	[production]
09:57	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1441.eqiad.wmnet	[production]
09:49	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1440.eqiad.wmnet	[production]
09:47	<dzahn@cumin1001>	conftool action : set/pooled=yes; selector: name=mw1439.eqiad.wmnet	[production]
09:20	<hashar@deploy1002>	Finished deploy [integration/docroot@edae2b4]: doc: add footer link to wikitech documentation (duration: 00m 11s)	[production]
09:20	<hashar@deploy1002>	Started deploy [integration/docroot@edae2b4]: doc: add footer link to wikitech documentation	[production]
08:59	<dzahn@cumin1001>	conftool action : set/weight=30; selector: name=mw144[0-2].eqiad.wmnet	[production]
08:58	<dzahn@cumin1001>	conftool action : set/weight=30; selector: name=mw1439.eqiad.wmnet	[production]
08:56	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on mw[1439-1442].eqiad.wmnet with reason: new host	[production]
08:56	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on mw[1439-1442].eqiad.wmnet with reason: new host	[production]
08:24	<elukey>	run 'gnt-instance modify -t plain ml-serve-ctrl1002.eqiad.wmnet' on ganeti1009 as test to track down latency/perf issues with kubelets	[production]
03:11	<ryankemper>	T287223 Installed `nginx-light` on all of `cloudelastic*`, and it looks like `relforge` didn't need the upgrade. This operation is done.	[production]
03:09	<ryankemper>	T287223 Installed `nginx-light` on all of `elastic1*` (eqiad)	[production]