production SAL

2020-05-21
21:10	<foks>	removing two files for legal compliance	[production]
20:44	<bstorm_>	labstore1005 is now running stretch and drbd devices are resyncing after several reboots and some significant effort T224582	[production]
18:24	<twentyafterfour>	restarting phabricator on phab1001 to deploy https://phabricator.wikimedia.org/rPHEX2687d08786a9dadcbaa96709de991f471f239830	[production]
17:24	<bblack>	anycast experiment done, all back to normal	[production]
17:20	<bblack>	anycast experimentation commencing in ulsfo (test route withdrawal)...	[production]
17:04	<bstorm_>	starting labstore1005 upgrades T224582	[production]
16:14	<andrew@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
16:12	<andrew@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
16:04	<sbassett@deploy1001>	Synchronized private/PrivateSettings.php: Update mitigations for T250887 (duration: 01m 08s)	[production]
15:48	<andrewbogott>	rebuilding cloudnet1003.eqiad.wmnet with Debian Buster for T253124	[production]
15:22	<XioNoX>	Add BGP between cr1/2-eqiad and authdns1001 - T253196	[production]
15:09	<dzahn@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
15:09	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
15:08	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
15:08	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
15:07	<dzahn@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
15:07	<dzahn@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
14:59	<dzahn@cumin1001>	conftool action : set/pooled=inactive; selector: name=mw217[0-2].codfw.wmnet	[production]
14:59	<dzahn@cumin1001>	conftool action : set/pooled=inactive; selector: name=mw216[0-9].codfw.wmnet	[production]
14:58	<dzahn@cumin1001>	conftool action : set/pooled=inactive; selector: name=mw215[8-9].codfw.wmnet	[production]
14:50	<bblack@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:47	<bblack@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
14:44	<akosiaris@deploy1001>	helmfile [CODFW] Ran 'sync' command on namespace 'mathoid' for release 'canary' .	[production]
14:33	<akosiaris>	upload helmfile 0.109.0 to apt.wikimedia.org/buster-wikimedia and stretch-wikimedia, component main	[production]
13:51	<vgutierrez>	depool cp4032 for some ats tests	[production]
13:22	<mutante>	cloudnet1004 - reboot to test PXE boot	[production]
12:44	<andrewbogott>	reimaging cloudnet1004.eqiad.wmnet for T253124	[production]
12:29	<elukey>	roll restart druid-public cluster (druid100[4-6], backend for the AQS API) to apply new settings + openjdk upgrade - T252771	[production]
12:13	<mutante>	depooled mw2158 through mw2172 to make room again in C3 as planned (T247018)	[production]
12:12	<marostegui>	Repool labsdb1011 into the analytics role 🤞- T249188	[production]
12:12	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw217[0-2].codfw.wmnet	[production]
12:10	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw216[0-9].codfw.wmnet	[production]
12:05	<marostegui@cumin1001>	dbctl commit (dc=all): 'Fully repool db1143 and db1091', diff saved to https://phabricator.wikimedia.org/P11270 and previous config saved to /var/cache/conftool/dbconfig/20200521-120555-marostegui.json	[production]
12:05	<dzahn@cumin1001>	conftool action : set/pooled=no; selector: name=mw215[8-9].codfw.wmnet	[production]
11:18	<hnowlan>	Removed changeprop from scb hosts	[production]
11:04	<vgutierrez>	rolling restart of ncredir servers for kernel update	[production]
10:17	<vgutierrez>	restart of acme-chief servers for kernel update	[production]
10:13	<jbond42>	deploy CI for puppet private repo	[production]
10:11	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1143 and db1091', diff saved to https://phabricator.wikimedia.org/P11268 and previous config saved to /var/cache/conftool/dbconfig/20200521-101100-marostegui.json	[production]
10:07	<mutante>	replaced backend of people.wikimedia.org - people1001 will be inaccessible, replaced with people1002 on buster. all home dirs have been synced over, there should be no difference except you have to use people1002 now for uploads (T247649)	[production]
10:06	<godog>	test adding --sni to check_http -S on icinga2001 - T253292	[production]
09:51	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1143 and db1091', diff saved to https://phabricator.wikimedia.org/P11267 and previous config saved to /var/cache/conftool/dbconfig/20200521-095100-marostegui.json	[production]
09:28	<mutante>	deneb - sudo systemctl reset-failed to clear Icinga alerts about systemd degraded state	[production]
09:12	<marostegui@cumin1001>	dbctl commit (dc=all): 'Slowly repool db1143 and db1091', diff saved to https://phabricator.wikimedia.org/P11266 and previous config saved to /var/cache/conftool/dbconfig/20200521-091245-marostegui.json	[production]
09:01	<mutante>	LDAP - added lmata to wmf group (T253277)	[production]
08:55	<XioNoX>	Advertise Anycast 198.35.27.0/24 from esams - T253196	[production]
08:52	<XioNoX>	Advertise Anycast 198.35.27.0/24 from eqsin - T253196	[production]
08:49	<marostegui@cumin1001>	dbctl commit (dc=all): 'Pool db1143 with minimal weight for the first time T252512', diff saved to https://phabricator.wikimedia.org/P11265 and previous config saved to /var/cache/conftool/dbconfig/20200521-084933-marostegui.json	[production]
08:47	<XioNoX>	Advertise Anycast 198.35.27.0/24 from eqiad/eqord - T253196	[production]
08:42	<marostegui@cumin1001>	dbctl commit (dc=all): 'Add db1143 to the list of s4 hosts, depooled - T252512', diff saved to https://phabricator.wikimedia.org/P11264 and previous config saved to /var/cache/conftool/dbconfig/20200521-084226-marostegui.json	[production]