production SAL

2017-08-07
14:51	<gehel>	reducing elasticsearch eqiad concurrent rebalance to 4 (from 8)	[production]
14:38	<elukey>	updated librdkafka1 and ++1 to 0.9.4.1 on hafnium	[production]
14:32	<mutante>	phab2001 - stopping Apache, scheduling downtime for http and puppet	[production]
14:22	<herron>	mx[1,2]001, fermium: Installed libmail-dkim-perl and restarted spamassassin service - T172689	[production]
13:15	<jynus>	reboot db1098	[production]
12:39	<_joe_>	restarting pdfrender on scb1001, T159922	[production]
12:39	<elukey>	restart kafka on kafka1018 to force it out of the kafka topic leaders - T172681	[production]
12:26	<marostegui@tin>	Synchronized wmf-config/db-codfw.php: Repool db2074 - T171321 (duration: 00m 45s)	[production]
12:08	<gehel>	deploying https://gerrit.wikimedia.org/r/#/c/299825/ - some logs will be lost during logstash restart	[production]
10:02	<marostegui>	Add dbstore2002:3313 to tendril - T171321	[production]
09:47	<jynus>	stopping db1050's mysql and cloning it to db1089	[production]
09:06	<elukey>	set net.netfilter.nf_conntrack_tcp_timeout_time_wait=65 (was 120) on all the analytics kafka brokers - T136094	[production]
09:03	<marostegui@tin>	Synchronized wmf-config/db-codfw.php: Repool db2065 after fixing: linter, page and watchlist tables (duration: 00m 47s)	[production]
08:12	<marostegui>	Force BBU re-learn on db1016 - T166344	[production]
07:02	<marostegui>	Stop replication on db2065 to reimport: page, linter and watchlist tables	[production]
07:02	<marostegui@tin>	Synchronized wmf-config/db-codfw.php: Depool db2065 to reimport: page, linter and watchlist tables (duration: 00m 47s)	[production]
06:38	<marostegui>	Stop MySQL on db2074 - T171321	[production]
06:37	<marostegui@tin>	Synchronized wmf-config/db-codfw.php: Depool db2074 - T171321 (duration: 00m 46s)	[production]
06:33	<marostegui>	Stop replication on db2075 - T170662	[production]
06:27	<marostegui@tin>	Synchronized wmf-config/db-codfw.php: Repool db2073 - T171321 (duration: 00m 47s)	[production]
06:20	<marostegui>	Force BBU re-learn on db1016 - T166344	[production]
02:57	<l10nupdate@tin>	ResourceLoader cache refresh completed at Mon Aug 7 02:57:42 UTC 2017 (duration 6m 42s)	[production]
02:51	<l10nupdate@tin>	scap sync-l10n completed (1.30.0-wmf.12) (duration: 07m 56s)	[production]
02:30	<l10nupdate@tin>	scap sync-l10n completed (1.30.0-wmf.11) (duration: 10m 16s)	[production]
2017-08-06
13:17	<elukey>	powercycle mw2256 - com2 frozen - T163346	[production]
13:13	<elukey>	restart pdfrender on scb1002	[production]
06:18	<ebernhardson@tin>	Synchronized wmf-config/PoolCounterSettings.php: T169498: Reduce cirrus search pool counter to 200 parallel requests cluster wide (duration: 02m 54s)	[production]
01:28	<chasemp>	conf2002:~# service etcdmirror-conftool-eqiad-wmnet restart (not sure what else to do the service failed)	[production]
2017-08-05
14:40	<Reedy>	created oauth tables on foundationwiki T172591	[production]
14:13	<reedy@tin>	Synchronized php-1.30.0-wmf.12/extensions/WikimediaMaintenance/createExtensionTables.php: add oauth (duration: 00m 48s)	[production]
2017-08-04
23:51	<mutante>	phab2001 - removed outdated /etc/hosts entries, that fixed rsync, syncing /srv/repos/ from phab1001	[production]
23:35	<mutante>	phab2001 rebooting	[production]
23:35	<mutante>	phab2001 - installing various package upgrades, apt-get autoremove old kernel images	[production]
23:12	<mutante>	"reserved" UID 498 for phd on https://wikitech.wikimedia.org/wiki/UID | phab2001: find -exec chown to fix all the files, restart cron	[production]
23:04	<mutante>	phab2001 - changing UID/GID for phd user from 997:997 to 498:498 to make it match phab1001, to fix rsync breaking permissions (rsync forces --numeric-ids when fetching from an rsyncd configured with chroot=yes). chown -R phw:www-data /srv/repos/	[production]
22:37	<ejegg>	restarted donations and refund queue consumers	[production]
21:44	<ejegg>	stopped donations and refund queue consumers	[production]
21:24	<urandom>	T172384: Disabling Puppet in dev environment to prevent unattended Cassandra restarts	[production]
20:19	<mutante>	renewing SSL cert for status.wm.org (just like wikitech-static, but that one didn't have monitoring?)	[production]
20:02	<mutante>	wikitech-static-ord - apt-get install certbot	[production]
19:41	<ejegg>	updated CiviCRM from f1fd7f0f9e89f59a8fc4daaa5e95803a2f60acbb to f24ba787f711ed38029594f3f3049bd79221ddd7	[production]
19:38	<mutante>	renaming graphite varnish director/fixing config, running puppet on cache misc, tested on cp1045	[production]
18:17	<andrewbogott>	switched most cloud instances to new puppetmasters, as per https://phabricator.wikimedia.org/T171786	[production]
11:46	<marostegui>	Deploy schema change directly on s3 master for maiwikimedia - T172485	[production]
11:30	<marostegui>	Deploy schema change directly on s3 master for kbpwiki - T172485	[production]
11:14	<marostegui>	Deploy schema change directly on s3 master for dinwiki - T172485	[production]
10:14	<marostegui>	Deploy schema change directly on s3 master for atjwiki - T172485	[production]
10:05	<marostegui>	Stop replication on db2073 for maintenance	[production]
09:22	<marostegui>	Add dbstore2002 to tendril - T171321	[production]
09:19	<marostegui>	Deploy schema change directly on s3 master for techconductwiki - T172485	[production]