__all__ SAL

2019-04-17 §
14:34	<otto@deploy1001>	scap-helm eventgate-analytics cluster codfw completed	[production]
14:34	<otto@deploy1001>	scap-helm eventgate-analytics upgrade production -f eventgate-analytics-codfw-values.yaml --reset-values stable/eventgate-analytics [namespace: eventgate-analytics, clusters: codfw]	[production]
14:13	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
14:12	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
13:56	<otto@deploy1001>	scap-helm eventgate-analytics finished	[production]
13:56	<otto@deploy1001>	scap-helm eventgate-analytics cluster staging completed	[production]
13:56	<otto@deploy1001>	scap-helm eventgate-analytics upgrade staging -f eventgate-analytics-staging-values.yaml --reset-values stable/eventgate-analytics [namespace: eventgate-analytics, clusters: staging]	[production]
13:52	<elukey>	upgrading hadoop cdh distribution to 5.16.1 on all the Hadoop-related nodes - T218343	[production]
13:52	<Lucas_WMDE>	wikidata-shex sudo systemctl restart apache2	[wikidata-dev]
13:48	<elukey@cumin1001>	END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)	[production]
13:48	<elukey@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
13:47	<godog>	reimage prometheus2004 - T187987	[production]
13:41	<Lucas_WMDE>	wikidata-shex vagrant provision (T221231)	[wikidata-dev]
13:40	<Lucas_WMDE>	wikidata-shex `git pull` in /srv/mediawiki-vagrant, then `vagrant reload` (T221231)	[wikidata-dev]
12:57	<filippo@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=prometheus1004.eqiad.wmnet	[production]
12:44	<godog>	bounce prometheus instances on prometheus[12]003 after https://gerrit.wikimedia.org/r/c/operations/puppet/+/499742	[production]
12:33	<moritzm>	running some ferm tests on graphite2002	[production]
12:10	<godog>	briefly stop all prometheus on prometheus1003 to finish metrics rsync - T187987	[production]
12:08	<arturo>	T221225 rebooting bastions to clean sssd. We are back to nscd/nslcd until we figure out what's wrong here	[tools]
11:58	<arturo>	T221205 sssd was deployed successfully into all webgrid nodes	[tools]
11:39	<Lucas_WMDE>	EU SWAT done	[production]
11:39	<arturo>	deploy sssd to tools-sge-services-03/04 (includes reboot)	[tools]
11:38	<lucaswerkmeister-wmde@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:504380|Enable suggestion constraint status on testwikidata (T221108, T204439)]] (duration: 01m 01s)	[production]
11:31	<arturo>	reboot bastions for sssd deployment	[tools]
11:30	<arturo>	deploy sssd to bastions	[tools]
11:24	<arturo>	disable puppet in bastions to deploy sssd	[tools]
10:58	<volans@deploy1001>	Finished deploy [debmonitor/deploy@f049b3b]: Deploy Debmonitor v0.1.9 (duration: 01m 00s)	[production]
10:57	<volans@deploy1001>	Started deploy [debmonitor/deploy@f049b3b]: Deploy Debmonitor v0.1.9	[production]
10:40	<moritzm>	installing Java security updates on kafka/analytics cluster	[production]
09:52	<arturo>	T221205 tools-sgewebgrid-lighttpd-0915 requires some manual intervention because issues in the dpkg database prevent deleting nscd/nslcd packages	[tools]
09:45	<arturo>	T221205 tools-sgewebgrid-lighttpd-0913 requires some manual intervention because unconfigured packages prevent a clean puppet agent run	[tools]
09:17	<godog>	swift eqiad-prod continue ms-be1013 decom - T220590	[production]
09:12	<arturo>	T221205 start deploying sssd to sgewebgrid nodes	[tools]
09:09	<elukey>	restart eventlogging on eventlog1002 due to errors in processors and consumer lag accumulated after the last Kafka Jumbo roll restart	[production]
09:06	<elukey>	restart eventlogging on eventlog1002 due to errors in processors and consumer lag accumulated after the last Kafka Jumbo roll restart	[analytics]
09:00	<arturo>	T221205 add `profile::ldap::client::labs::client_stack: sssd` in horizon for the puppet prefixes `tools-sgewebgrid-lighttpd` and `tools-sgewebgrid-generic`	[tools]
08:56	<arturo>	T221205 disable puppet in all tools-sgewebgrid-* nodes	[tools]
08:47	<godog>	reimage prometheus1004 - T187987	[production]
08:38	<jynus@deploy1001>	Synchronized wmf-config/db-eqiad.php: Repool db1078 fully (duration: 01m 00s)	[production]
08:29	<moritzm>	installing ghostscript security updates	[production]
07:51	<gilles@deploy1001>	Synchronized php-1.33.0-wmf.25/extensions/NavigationTiming: T216597 Event timing support (duration: 01m 01s)	[production]
07:45	<gilles@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: T216597 Enable Event Timing origin trial on ruwiki and eswiki (duration: 01m 04s)	[production]
07:21	<jynus@deploy1001>	Synchronized wmf-config/db-eqiad.php: Repool db1078 with low load (duration: 01m 18s)	[production]
07:07	<moritzm>	rolling reboots of Swift backends in codfw for combined kernel/glibc/OpenSSL update	[production]
2019-04-16 §
23:42	<ebernhardson@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: Return CirrusSearch to standard execution against eqiad cluster (duration: 01m 00s)	[production]
23:37	<ebernhardson@deploy1001>	Synchronized php-1.33.0-wmf.25/extensions/CirrusSearch/includes/: Fix fatals on malformed search queries against overridden clusters (duration: 01m 06s)	[production]
22:55	<bd808>	cloudcontrol2003-dev: added `exit 0` to /etc/cron.hourly/keystone to stop cron spam on partially configured cluster	[admin]
22:42	<thcipriani>	gerrit back	[production]
22:39	<thcipriani>	restarting gerrit for configuration update https://gerrit.wikimedia.org/r/504448	[production]
22:24	<jforrester@deploy1001>	Synchronized wmf-config/InitialiseSettings.php: T165795 Give bureaucrats the usermerge right (duration: 00m 59s)	[production]