2016-07-25
18:31 <ottomata> upgrading kafka to 0.9 in main-codfw, first kafka2001 then 2002 [production]
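(A minimal sketch of the rolling-upgrade pattern this implies, one broker at a time; package, service, and ZooKeeper details are assumptions, not the exact commands used.)
  # On kafka2001 first, then kafka2002 once replication has caught up:
  sudo service kafka stop
  sudo apt-get install kafka            # pull in the 0.9 broker package (name assumed)
  sudo service kafka start
  # Wait until no partitions are under-replicated before touching the next broker:
  kafka-topics.sh --zookeeper localhost:2181 --describe --under-replicated-partitions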
18:15 <mutante> ytterbium - revoke puppet cert, delete salt-key, remove from icinga [production]
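(Roughly the commands behind a decommission like this; the exact invocations and the Icinga step are assumptions.)
  # On the puppetmaster: revoke and clean the host's certificate
  sudo puppet cert revoke ytterbium.wikimedia.org
  sudo puppet cert clean ytterbium.wikimedia.org
  # On the salt master: drop the minion key
  sudo salt-key -d ytterbium.wikimedia.org
  # Icinga config is generated from puppet data, so the host drops out of monitoring on a later run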
16:16 <urandom> T134016: Restarting Cassandra to apply stream timeout (restbase1013-b.eqiad.wmnet) [production]
16:10 <urandom> T134016: Restarting Cassandra to apply stream timeout (restbase1013-a.eqiad.wmnet) [production]
16:06 <urandom> T140825, T134016: Restarting Cassandra to apply stream timeout and disable trickle_fsync (restbase1012-c.eqiad.wmnet) [production]
16:02 <urandom> T140825, T134016: Restarting Cassandra to apply stream timeout and disable trickle_fsync (restbase1012-b.eqiad.wmnet) [production]
15:54 <urandom> T140825, T134016: Restarting Cassandra to apply stream timeout and disable trickle_fsync (restbase1012-a.eqiad.wmnet) [production]
15:53 <urandom> T140825: Setting vm.dirty_background_bytes=24M on restbase1012.eqiad.wmnet [production]
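(vm.dirty_background_bytes takes a raw byte count, so 24M presumably means the value below; the exact number applied is an assumption.)
  # 24M = 24 * 1024 * 1024 = 25165824 bytes; the sysctl takes no unit suffixes:
  sudo sysctl -w vm.dirty_background_bytes=25165824
  # Persisting it would use the same key, e.g. in a sysctl.d snippet (path hypothetical):
  echo 'vm.dirty_background_bytes = 25165824' | sudo tee /etc/sysctl.d/70-cassandra.conf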
15:43 <urandom> T140825, T134016: Restarting Cassandra to apply stream timeout and 8MB trickle_fsync (restbase1008-c.eqiad.wmnet) [production]
15:39 <urandom> T140825, T134016: Restarting Cassandra to apply stream timeout and 8MB trickle_fsync (restbase1008-b.eqiad.wmnet) [production]
15:34 <urandom> T140825, T134016: Restarting Cassandra to apply stream timeout and 8MB trickle_fsync (restbase1008-a.eqiad.wmnet) [production]
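(The cassandra.yaml options these restarts apply, as named in Cassandra 2.x; apart from the 8MB interval and the trickle_fsync on/off split logged above, the values shown are assumptions.)
  # /etc/cassandra/cassandra.yaml:
  #   streaming_socket_timeout_in_ms: 3600000   # the stream timeout; value assumed
  #   trickle_fsync: true                       # restbase1008 keeps it on, with...
  #   trickle_fsync_interval_in_kb: 8192        # ...the 8MB interval logged above
  #   (restbase1012 instead sets trickle_fsync: false)
  # These only take effect on restart, hence one instance at a time:
  sudo service cassandra restart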
15:28 <elukey> Standardized the jmxtrans GC metric names to automatically pick up variations in settings. This introduces metric name changes in Hadoop, Zookeeper, and Kafka. (https://gerrit.wikimedia.org/r/#/c/299118/) [production]
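(What "pick up variations automatically" likely means here: the GC MBean names depend on the collector in use, and a wildcard query keys metrics by whichever one the JVM runs. A sketch of the idea, not the merged change.)
  # GC MBeans differ per collector, e.g.
  #   java.lang:type=GarbageCollector,name=PS Scavenge
  #   java.lang:type=GarbageCollector,name=G1 Young Generation
  # A jmxtrans query on java.lang:type=GarbageCollector,name=* with
  # typeNames: ["name"] derives the metric name from the collector, so
  # per-service hardcoding goes away; existing metric names change as a result.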
12:53 <moritzm> installing squid security updates [production]
10:10 <_joe_> remove spurious puppet facts [production]
10:04 <moritzm> installing Django security updates [production]
09:18 <godog> swift eqiad-prod: ms-be102[3456] weight 1500 [production]
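(Presumably applied with swift-ring-builder per ring; the builder file, device id, and rebalance step are assumptions.)
  # Raise each new backend's device weight, then rebalance the ring:
  swift-ring-builder object.builder set_weight d100 1500   # hypothetical device id
  swift-ring-builder object.builder rebalance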
03:26 <hashar> scandium: migrating zuul-merger repos from lead to gerrit.wikimedia.org: find /srv/ssd/zuul/git -path '*/.git/config' -print -execdir sed -i -e 's/lead.wikimedia.org/gerrit.wikimedia.org/' config \; [production]
02:28 <l10nupdate@tin> ResourceLoader cache refresh completed at Mon Jul 25 02:28:21 UTC 2016 (duration 5m 52s) [production]
02:22 <mwdeploy@tin> scap sync-l10n completed (1.28.0-wmf.11) (duration: 09m 09s) [production]
02:03 <ostriches> gerrit: reindexing lucene now that we have new data. searches/dashboards may look a tad weird for a bit [production]
01:53 <hashar> starting Zuul [production]
01:51 <mutante> restarted grrrit-wm [production]
01:39 <ostriches> lead: turning puppet back on, here we go [production]
01:38 <jynus> m2 replication on db2011 stopped, master binlog pos: db1020-bin.000968:1013334195 [production]
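(Recording the master's binlog position before stopping the replica keeps the cutover reversible; a sketch, with client invocation details assumed.)
  # On the replica (db2011):
  sudo mysql -e 'STOP SLAVE'
  # On the master (db1020), the position logged above comes from:
  sudo mysql -e 'SHOW MASTER STATUS'   # File: db1020-bin.000968, Position: 1013334195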
01:37 <hashar> scandium: restarted zuul-merger [production]
01:36 <ostriches> ytterbium: Stopped puppet, stopped gerrit process. [production]
01:34 <mutante> switched gerrit-new to gerrit in DNS [production]
01:30 <ostriches> lead: stopped puppet for a few minutes [production]
01:17 <hashar> scandium: migrating zuul-merger repos to lead: find /srv/ssd/zuul/git -path '*/.git/config' -print -execdir sed -i -e 's/ytterbium.wikimedia.org/lead.wikimedia.org/' config \; [production]
01:10 <hashar> stopping CI [production]
01:09 <jynus> reviewdb backup finished, available on db1020:/srv/tmp/2016-07-25_00-54-31/ [production]
01:02 <ostriches> rsyncing latest git data from ytterbium to lead [production]
00:57 <mutante> manually deleted reviewer-counts cron from gerrit2 user; it runs as root, and puppet does not remove crons unless ensure=>absent [production]
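(Puppet only removes a cron resource that is explicitly declared absent; a hypothetical resource for this case, plus the manual cleanup it otherwise takes.)
  # In puppet, removal has to be declared, e.g.:
  #   cron { 'reviewer-counts': ensure => absent, user => 'root' }
  # Until then the stale entry stays, hence the one-off deletion (crontab owner per the log):
  sudo crontab -u root -l | grep -v reviewer-counts | sudo crontab -u root -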
00:55 <jynus> starting hot backup of db1020's reviewdb [production]
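(Hot InnoDB backups are typically taken with xtrabackup/innobackupex; the tool and flags here are assumptions, while the output path matches the 01:09 entry above.)
  # Non-blocking backup of the reviewdb schema on db1020:
  sudo innobackupex --databases=reviewdb /srv/tmp/
  # innobackupex writes into a timestamped directory, e.g. /srv/tmp/2016-07-25_00-54-31/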
2016-07-23
15:38 <godog> stop swift in esams test cluster, lots of logging from there [production]
15:37 <godog> lithium: sudo lvextend --size +10G -r /dev/mapper/lithium--vg-syslog [production]
04:58 <ori> Gerrit is back up after service restart; was unavailable between ~04:29 and 04:57 UTC [production]
04:56 <ori> Restarting Gerrit on ytterbium [production]
04:48 <ori> Users report Gerrit is down; on ytterbium java is occupying two cores at 100% [production]
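(The standard way to pin down which threads are burning those two cores; the process match and service user are assumptions.)
  # Per-thread CPU for the Gerrit JVM, then a matching stack dump:
  top -H -p "$(pgrep -f gerrit)"
  sudo -u gerrit2 jstack "$(pgrep -f gerrit)" > /tmp/gerrit-threads.txt
  # Convert a hot thread id to hex and find its nid= entry in the dump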
03:48 <chasemp> gnt-instance reboot seaborgium.wikimedia.org [production]
02:26 <l10nupdate@tin> ResourceLoader cache refresh completed at Sat Jul 23 02:26:49 UTC 2016 (duration 5m 41s) [production]
02:21 <mwdeploy@tin> scap sync-l10n completed (1.28.0-wmf.11) (duration: 08m 24s) [production]
01:02 <tgr@tin> Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthPlugin.php: T141160 (duration: 00m 29s) [production]
01:01 <tgr@tin> Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthHooks.php: T141160 (duration: 00m 27s) [production]
01:00 <tgr@tin> Synchronized php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthPrimaryAuthenticationProvider.php: T141160 (duration: 00m 28s) [production]
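(Each "Synchronized" line above is the output of a per-file sync from the deploy host; a sketch of the command shape, since the exact scap invocation is an assumption.)
  # From tin, push one changed file to the cluster with a log message:
  scap sync-file php-1.28.0-wmf.11/extensions/CentralAuth/includes/CentralAuthHooks.php 'T141160'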
00:37 <tgr> doing an emergency deploy of https://gerrit.wikimedia.org/r/#/c/300679 for T141160; the bug leaves dozens of new users per hour unattached on loginwiki, which probably has weird consequences [production]