production SAL

2019-08-19
10:32	<elukey@cumin1001>	END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)	[production]
10:22	<elukey@cumin1001>	START - Cookbook sre.ganeti.makevm	[production]
09:57	<jbond42>	add mapped ipv6 to conf200* servers https://gerrit.wikimedia.org/r/c/operations/puppet/+/528475	[production]
09:26	<marostegui@cumin1001>	END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)	[production]
09:24	<marostegui@cumin1001>	START - Cookbook sre.hosts.downtime	[production]
08:57	<godog>	add 100G to graphite1004 / graphite2003 /srv LVs	[production]
07:59	<onimisionipe>	shutdown elastic2050 to prepare for mgmt reset - T230597	[production]
07:40	<marostegui>	Redact napwikisource on db1124 and db2094 - T210762	[production]
07:19	<moritzm>	installing golang-1.11 security updates on buster	[production]
07:08	<moritzm>	installing ffmpeg security updates on buster	[production]
06:37	<vgutierrez>	upgrading acme-chief to version 0.20 on production servers - T229096	[production]
06:30	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=ncredir1001.eqiad.wmnet	[production]
06:29	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=no; selector: name=ncredir1001.eqiad.wmnet	[production]
06:28	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=ncredir1002.eqiad.wmnet	[production]
06:27	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=no; selector: name=ncredir1002.eqiad.wmnet	[production]
06:26	<moritzm>	installing ghostscript security updates on scb/proton/notebook* hosts	[production]
06:25	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=ncredir2001.codfw.wmnet	[production]
06:25	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=no; selector: name=ncredir2001.codfw.wmnet	[production]
06:24	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=yes; selector: name=ncredir2002.codfw.wmnet	[production]
06:22	<vgutierrez@puppetmaster1001>	conftool action : set/pooled=no; selector: name=ncredir2002.codfw.wmnet	[production]
06:21	<vgutierrez>	rolling upgrade of nginx in ncredir hosts	[production]
06:03	<moritzm>	installing php5 security updates	[production]
05:51	<marostegui@deploy1001>	Synchronized wmf-config/db-eqiad.php: Remove db2067 from config T230705 (duration: 00m 47s)	[production]
05:50	<marostegui@deploy1001>	Synchronized wmf-config/db-codfw.php: Remove db2067 from config T230705 (duration: 00m 50s)	[production]
05:46	<marostegui@cumin1001>	dbctl commit (dc=all): 'Depool db2067, will be moved to m1 T230705', diff saved to https://phabricator.wikimedia.org/P8930 and previous config saved to /var/cache/conftool/dbconfig/20190819-054606-marostegui.json	[production]
05:29	<elukey>	reboot cp2004 due to bnx2x crash (kern.log saved into my home on the host if needed)	[production]
2019-08-18
08:28	<onimisionipe>	running `_cluster/reroute?pretty&explain=true&retry_failed` on eqiad production-search cluster to force allocation of shards	[production]
2019-08-16
19:48	<sbassett>	Deployed security patch for T230576 (ex:MobileFrontend)	[production]
18:57	<@>	helmfile [STAGING] Ran 'apply' command on namespace 'sessionstore' for release 'staging' .	[production]
16:38	<XioNoX>	add BGP sessions to Scaleway (AS12876) in esams	[production]
16:12	<elukey>	upload prometheus-druid-exporter 0.7-1 to stretch/buster-wikimedia	[production]
15:42	<elukey>	roll restart of druid broker/historicals to pick up new logging/metrics settings	[production]
14:39	<onimisionipe>	run `bmc-device --cold-reset; echo $?` in elastic2050 hoping it resets mgmt interface -T230597	[production]
14:24	<gehel>	rolling reboot of cloudelastic	[production]
13:52	<mholloway-shell@deploy1001>	Synchronized wmf-config/InitialiseSettings-labs.php: MachineVision (beta): Request labels targeting Beta Wikidata (duration: 00m 50s)	[production]
08:18	<_joe_>	stopping php on phab1003, to restart it with systemd	[production]
06:50	<_joe_>	upgrading envoyproxy across production (http2 CVEs)	[production]
02:51	<vgutierrez>	repooling cp5002, running compress.so experiment	[production]
2019-08-15
23:35	<smalyshev@deploy1001>	Finished deploy [wdqs/wdqs@b4da6e4]: Rollback blazegraph due to T230588 (duration: 09m 48s)	[production]
23:25	<smalyshev@deploy1001>	Started deploy [wdqs/wdqs@b4da6e4]: Rollback blazegraph due to T230588	[production]
21:54	<smalyshev@deploy1001>	Finished deploy [wdqs/wdqs@fce8177]: Weekly deploy (duration: 25m 28s)	[production]
21:28	<smalyshev@deploy1001>	Started deploy [wdqs/wdqs@fce8177]: Weekly deploy	[production]
21:27	<ebernhardson>	finish restarting cloudelastic-chi-eqiad with -XX:NewRatio=3	[production]
21:18	<ebernhardson>	increase cloudelastic indices.recovery.max_bytes_per_sec from 40mbit to 512mbit as these have 10G networking	[production]
21:07	<ebernhardson>	restart cloudelastic1002 with -XX:NewRatio=3 to match cloudelastic1001	[production]
20:22	<gehel@cumin1001>	END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)	[production]
19:37	<ema>	depool cp5002 during the EU night, running compress.so experiment	[production]
19:28	<gehel@cumin1001>	END (PASS) - Cookbook sre.wdqs.reboot-wdqs (exit_code=0)	[production]
19:19	<sbassett>	Deployed security patch for T230402 (1.34.0-wmf.17)	[production]
19:18	<gehel@cumin1001>	START - Cookbook sre.wdqs.data-transfer	[production]