production SAL

2023-04-04
12:57	<steve_munene>	putting pdfs into safe mode as part of T331882	[production]
12:52	<ayounsi@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on 228 hosts with reason: eqiad row C upgrade	[production]
12:52	<ayounsi@cumin1001>	START - Cookbook sre.hosts.downtime for 2:00:00 on 228 hosts with reason: eqiad row C upgrade	[production]
12:44	<akosiaris@cumin1001>	START - Cookbook sre.discovery.datacenter depool all active/active services in eqiad: eqiad row C switches upgrade - T331882	[production]
12:43	<Emperor>	depool thanos-fe1003 re T331882	[production]
12:38	<Emperor>	depool ms-fe1011 re T331882	[production]
12:32	<sukhe>	[finished] run authdns-update for CR: 905603 depool eqiad	[production]
12:31	<ladsgroup@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on 38 hosts with reason: Row c switch maint T331882	[production]
12:31	<sukhe>	run authdns-update for CR: 905603 depool eqiad	[production]
12:31	<ladsgroup@cumin1001>	START - Cookbook sre.hosts.downtime for 6:00:00 on 38 hosts with reason: Row c switch maint T331882	[production]
12:28	<stevemunene@puppetmaster1001>	conftool action : set/pooled=no; selector: name=aqs1018.eqiad.wmnet	[production]
12:28	<volans@cumin1001>	END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling update on A:netbox	[production]
12:28	<stevemunene@puppetmaster1001>	conftool action : set/pooled=no; selector: name=aqs1013.eqiad.wmnet	[production]
12:28	<volans@cumin1001>	START - Cookbook sre.netbox.update-extras rolling update on A:netbox	[production]
12:28	<stevemunene@puppetmaster1001>	conftool action : set/pooled=no; selector: name=aqs1012.eqiad.wmnet	[production]
12:28	<volans@cumin1001>	END (FAIL) - Cookbook sre.netbox.update-extras (exit_code=1) rolling update on A:netbox-canary	[production]
12:27	<volans@cumin1001>	START - Cookbook sre.netbox.update-extras rolling update on A:netbox-canary	[production]
12:26	<stevemunene@puppetmaster1001>	conftool action : set/pooled=no; selector: name=datahubsearch1003.eqiad.wmnet	[production]
12:24	<TimStarling>	I noticed that mw2382 was still talking to mwlog1002. It still had old php-fpm7.4 processes despite the scap. So I manually restarted php-fpm on it.	[production]
12:17	<tstarling@deploy2002>	Synchronized src/Profiler.php: T331882 disable profiling for switch maintenance (duration: 05m 58s)	[production]
11:35	<hnowlan@puppetmaster1001>	conftool action : set/pooled=inactive; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet	[production]
11:24	<moritzm>	installing joblib security updates	[production]
10:17	<hnowlan@puppetmaster1001>	conftool action : set/pooled=yes:weight=5; selector: service=thumbor,name=kubernetes101[0123].eqiad.wmnet	[production]
09:51	<hashar@deploy2002>	rebuilt and synchronized wikiversions files: Revert "group0 wikis to 1.41.0-wmf.3" | T330209	[production]
09:42	<hashar@deploy2002>	rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.3 refs T330209	[production]
09:20	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1201 (T333332)', diff saved to https://phabricator.wikimedia.org/P46025 and previous config saved to /var/cache/conftool/dbconfig/20230404-091639-ladsgroup.json	[production]
09:19	<hashar@deploy2002>	Pruned MediaWiki: 1.41.0-wmf.1 (duration: 02m 16s)	[production]
09:12	<hashar@deploy2002>	Finished scap: testwikis wikis to 1.41.0-wmf.3 refs T330209 (duration: 40m 20s)	[production]
09:09	<moritzm>	installing libmicrohttpd security updates	[production]
09:07	<moritzm>	installing libdatetime-timezone-perl updates	[production]
09:04	<akosiaris@deploy2002>	helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'sync'.	[production]
09:04	<akosiaris@deploy2002>	helmfile [aux-k8s-eqiad] START helmfile.d/admin 'sync'.	[production]
09:04	<akosiaris@deploy2002>	helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.	[production]
09:04	<akosiaris@deploy2002>	helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.	[production]
09:03	<akosiaris@deploy2002>	helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.	[production]
09:03	<akosiaris@deploy2002>	helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.	[production]
09:03	<akosiaris@deploy2002>	helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.	[production]
09:03	<akosiaris@deploy2002>	helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.	[production]
09:02	<akosiaris@deploy2002>	helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.	[production]
09:02	<akosiaris@deploy2002>	helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.	[production]
09:02	<akosiaris@deploy2002>	helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.	[production]
09:02	<akosiaris@deploy2002>	helmfile [staging-codfw] START helmfile.d/admin 'sync'.	[production]
09:02	<akosiaris@deploy2002>	helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.	[production]
09:02	<akosiaris@deploy2002>	helmfile [staging-eqiad] START helmfile.d/admin 'sync'.	[production]
09:01	<akosiaris@deploy2002>	helmfile [codfw] DONE helmfile.d/admin 'sync'.	[production]
09:01	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'Repooling after maintenance db1201', diff saved to https://phabricator.wikimedia.org/P46024 and previous config saved to /var/cache/conftool/dbconfig/20230404-090133-ladsgroup.json	[production]
09:01	<akosiaris@deploy2002>	helmfile [codfw] START helmfile.d/admin 'sync'.	[production]
08:55	<ladsgroup@cumin1001>	dbctl commit (dc=all): 'db1162 (re)pooling @ 100%: Maint over', diff saved to https://phabricator.wikimedia.org/P46023 and previous config saved to /var/cache/conftool/dbconfig/20230404-085553-ladsgroup.json	[production]
08:55	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad	[production]
08:53	<vgutierrez@cumin1001>	END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad	[production]