2019-05-06
17:11 <jynus> restart dbprov* hosts, in sequence, for kernel upgrade [production]
16:42 <jynus> restart db1114 mysql for upgrade testing [production]
16:38 <andrewbogott> re-imaging cloudvirt1024 [production]
16:34 <jynus> restart db2102 mysql for upgrade testing [production]
16:11 <hashar> CI queue drained. Should be working fine again now [production]
15:57 <hashar> CI / Zuul is slowed down and the cause is being investigated [production]
15:48 <moritzm> updating firmware-bnx2x (from stretch point release; this is a NOP, the source package firmware-nonfree was updated for various WiFi chipsets we don't use, double-checked by comparing checksums for old and new bnx2x firmware) [production]
15:37 <moritzm> updating firmware-bnx2 (from stretch point release; this is a NOP, the source package firmware-nonfree was updated for various WiFi chipsets we don't use, double-checked by comparing checksums for old and new bnx2 firmware) [production]
15:35 <papaul> shutting down elastic2038 for DIMM swap [production]
15:30 <moritzm> updating base-files from recent stretch point release [production]
15:14 <ema> pool cp4026 w/ ATS backend T219967 [production]
14:57 <godog> capture strace / core for rsyslog on wezen / lithium and restart - T199406 [production]
14:42 <ema> powercycle cp1083 [production]
14:41 <ema@puppetmaster1001> conftool action : set/pooled=no; selector: name=cp1083.eqiad.wmnet [production]
14:35 <godog> swift eqiad-prod: finish decom ms-be101[45] - T220590 [production]
14:25 <moritzm> installing vips security updates [production]
14:19 <ema> depool cp4026 and reimage as upload_ats T219967 [production]
14:11 <otto@deploy1001> scap-helm eventgate-analytics finished [production]
14:11 <otto@deploy1001> scap-helm eventgate-analytics cluster staging completed [production]
14:11 <otto@deploy1001> scap-helm eventgate-analytics upgrade analytics -f analytics/staging-values.yaml --reset-values stable/eventgate [namespace: eventgate-analytics, clusters: staging] [production]
14:09 <hashar> CI workflow fixed by reverting a change deployed around 10:00 UTC # T222614 [production]
14:03 <ema> cp3038: restart varnish-be [production]
13:56 <otto@deploy1001> scap-helm eventgate-analytics finished [production]
13:56 <otto@deploy1001> scap-helm eventgate-analytics cluster staging completed [production]
13:56 <otto@deploy1001> scap-helm eventgate-analytics install -n analytics -f analytics/staging-values.yaml stable/eventgate [namespace: eventgate-analytics, clusters: staging] [production]
13:54 <moritzm> installing zziplib security updates [production]
13:52 <hashar> CI sometimes does not run, for some reason ... https://phabricator.wikimedia.org/T222614 :( [production]
13:22 <moritzm> installing audiofile security updates [production]
13:20 <moritzm> installing unzip security updates [production]
12:43 <moritzm> installing rsync security updates [production]
12:24 <moritzm> installing golang security updates on jessie [production]
11:44 <Amir1> EU SWAT is done [production]
11:40 <ladsgroup@deploy1001> Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:508303|Enable Suggestion Constraint Status on Wikidata]] (duration: 00m 52s) [production]
11:32 <arturo> reverting puppet change to the sudo module [production]
11:17 <arturo> merging puppet change to the sudo module https://gerrit.wikimedia.org/r/c/operations/puppet/+/507376 [production]
10:59 <ema> manual puppet-merge $sha on failed puppetmasters https://phabricator.wikimedia.org/P8477 [production]
10:44 <jdrewniak@deploy1001> Synchronized portals: Wikimedia Portals Update: [[gerrit:508302| Bumping portals to master (T128546)]] (duration: 00m 51s) [production]
10:43 <jdrewniak@deploy1001> Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:508302| Bumping portals to master (T128546)]] (duration: 00m 52s) [production]
10:05 <arturo> upgrade udev in cloudservices2002-dev [production]
09:59 <arturo> T222148 upgrade udev & libudev1 on cloudvirt[1001-1003,1005].eqiad.wmnet [production]
09:35 <elukey> restart netbox on netmon1002 (trying to reproduce the segfault) - T212697 [production]
09:03 <godog> upgrade labmon1001 to prometheus 2 - T187987 [production]
06:01 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Give some API traffic to db1093 (duration: 00m 52s) [production]
05:08 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Give some weight to db1093 (duration: 00m 58s) [production]
04:08 <ariel@deploy1001> Finished deploy [dumps/dumps@b4b7733]: reduce sleep time more between wikis for incrs (duration: 00m 05s) [production]
04:08 <ariel@deploy1001> Started deploy [dumps/dumps@b4b7733]: reduce sleep time more between wikis for incrs [production]
2019-05-05
14:42 <elukey> restart pdfrender on scb1004 [production]
03:10 <chaomodus> FYI: scb* flapping on some endpoints seems to be just noise; there is high load from mobileapi, but things otherwise appear to be operating normally. Several boxes are in the process of checking md, which may account for the service lags [production]
02:40 <andrewbogott> restarting mariadb on cloudservices1003 [production]
2019-05-04
22:20 <reedy@deploy1001> Synchronized docroot/mediawiki/xml/index.html: Add extra xml namespace links (duration: 01m 06s) [production]