2015-11-30
21:55 <mutante> re-wrote l10nupdate cron; restarted cron service on tin [production]
20:05 <apergos> re-enabled puppet on neodymium, minion testing concluded for now [production]
19:47 <gwicke> running `nodetool decommission` on restbase1009 in preparation for the conversion to the multi-instance setup, per https://phabricator.wikimedia.org/T95253# [production]
19:31 <demon@tin> Synchronized wmf-config/InitialiseSettings.php: rm deprecated/unused rate limit log config (duration: 00m 28s) [production]
17:27 <demon@tin> Synchronized php-1.27.0-wmf.7/extensions/WikimediaMaintenance/: need maint script everywhere (duration: 00m 28s) [production]
16:51 <thcipriani@tin> Synchronized php-1.27.0-wmf.7/extensions/ContentTranslation/modules/draft/ext.cx.draft.js: SWAT: Add some extra information to save failure logging [[gerrit:255956]] (duration: 00m 28s) [production]
16:38 <thcipriani@tin> Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable QuickSurveys reader segmentation survey [[gerrit:255448]] (duration: 00m 28s) [production]
16:30 <paravoid> mw1002 service hhvm restart [production]
16:17 <paravoid> rolling back to kernel 3.19 on lvs2001/2/3 [production]
15:29 <paravoid> stopping pybal on lvs2001/2/3 [production]
15:21 <paravoid> switching lvs2004/5/6 traffic back to lvs2001/2/3 [production]
15:13 <paravoid> switching lvs2001/2/3 traffic to lvs2004/5/6 and upgrading kernels [production]
15:12 <_joe_> restarting HHVM on mw1147 too, same reason as mw1114 [production]
15:10 <_joe_> restarting hhvm on mw1114, stuck in __pthread_cond_wait () [folly::EventBase::runInEventBaseThreadAndWait ()], apparently blocked in writing to stdout [production]
15:02 <paravoid> switching traffic from lvs4002 to lvs4004; upgrading lvs4002's kernel [production]
15:02 <paravoid> switching traffic back to lvs4001 [production]
14:57 <paravoid> switching traffic from lvs4001 to lvs4003; upgrading lvs4001's kernel [production]
14:45 <paravoid> switching traffic from lvs3001 to lvs3003; upgrading lvs3001's kernel [production]
14:38 <paravoid> switching traffic back to lvs3002 [production]
14:31 <paravoid> switching traffic from lvs3002 to lvs3004; upgrading lvs3002's kernel [production]
14:07 <bblack> upgrading varnishkafka package on all caches [production]
13:52 <bblack> updating varnishkafka on cp1065 [production]
11:03 <godog> upgrade python-statsd to 3.0.1 in eqiad [production]
10:59 <godog> upgrade python-statsd to 3.0.1 in codfw [production]
10:15 <godog> reenable puppet on graphite1001 [production]
10:10 <paravoid> re-enabling OSPF over cr2-eqiad:xe-5/2/2 <-> cr1-ulsfo:xe-0/0/3.538 [production]
10:09 <paravoid> re-enabling cr2-eqiad:xe-5/2/0 and xe-5/2/1 [production]
10:01 <jynus> performing schema change on db1046 (analytics master) [production]
09:32 <jynus> removing old snapshots from db1046 [production]
06:38 <ori> Restarted statsv on hafnium [production]
02:00 <l10nupdate@tin> LocalisationUpdate failed: git pull of core failed [production]
01:56 <gwicke> started `nodetool cleanup` on restbase1002 to get rid of unnecessary data from earlier 1001 decommission attempt [production]
01:05 <bd808@tin> sync-l10n completed (1.27.0-wmf.7) (duration: 01m 19s) [production]
01:04 <bd808> testing l10n cache rebuild as l10nupdate user (take 2) [production]
00:57 <Krenair> test [production]
00:49 <bd808@tin> sync-l10nupdate completed (1.27.0-wmf.7) (duration: 04m 37s) [production]
00:45 <bd808> testing l10n cache rebuild as l10nupdate user [production]
00:01 <bd808> Tried to update scap to 1879fd4 (Add sync-l10n command for l10nupdate); trebuchet reported 0/483 minions completing fetch and 3/483 minions completing checkout [production]
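A minimal shell sketch of the Cassandra node operations referenced in the 19:47 and 01:56 entries above (decommission prep on restbase1009, cleanup on restbase1002); the host placement comments and the progress checks are assumptions for illustration, not the exact commands that were run:
  # On the node being removed (e.g. restbase1009): stream its data to the
  # rest of the ring and leave the cluster.
  nodetool decommission
  # Monitor streaming progress and ring state from any node.
  nodetool netstats
  nodetool status
  # On remaining nodes (e.g. restbase1002): drop data for token ranges the
  # node no longer owns after ring changes.
  nodetool cleanup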
2015-11-29
21:25 <jynus> importing user.user_touched (s7) from dbstore1002 to sanitarium. s7 lag on labs replicas will be higher for some minutes. [production]
20:51 <jynus> importing user.user_touched (s6) from dbstore1002 to sanitarium. s6 lag on labs replicas will be higher for some minutes. [production]
20:28 <jynus> importing user.user_touched (s5) from dbstore1002 to sanitarium. s5 lag on labs replicas will be higher for some minutes. [production]
19:51 <jynus> importing user.user_touched (s4) from dbstore1002 to sanitarium. s4 lag on labs replicas will be higher for some minutes. [production]
04:50 <gwicke> restarted cassandra on restbase1009 to avoid it running out of disk space; a large compaction (~2TB) was at 80% with only 64G of disk space left [production]
03:01 <YuviPanda> ran chown -R l10nupdate: /var/lib/l10nupdate/mediawiki for Reedy on tin [production]
02:28 <Reedy> l10nupdate failed because some git objects were owned by 997:l10nupdate [production]
02:00 <l10nupdate@tin> LocalisationUpdate failed: git pull of core failed [production]
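A hedged shell sketch of the l10nupdate ownership issue and fix logged at 02:28 and 03:01 above; only the chown line appears in the log, the find check is an illustrative assumption:
  # Locate git objects not owned by the l10nupdate user (illustrative check).
  find /var/lib/l10nupdate/mediawiki/.git -not -user l10nupdate -ls
  # Reset ownership so the l10nupdate user's git pull can succeed (as logged at 03:01).
  chown -R l10nupdate: /var/lib/l10nupdate/mediawiki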
2015-11-28
22:48 <bd808@tin> Synchronized php-1.27.0-wmf.5/cache/l10n: bd808 testing l10nupdate sync-dir using stale branch (duration: 01m 29s) [production]
20:49 <l10nupdate@tin> LocalisationUpdate failed: Failed to sync-dir 'php-1.27.0-wmf.7/cache/l10n' [production]
20:49 <krenair@tin> Synchronized php-1.27.0-wmf.7/cache/l10n: l10nupdate for 1.27.0-wmf.7 (duration: 07m 11s) [production]
20:35 <ori@tin> Synchronized wmf-config/InitialiseSettings.php: Ie33ae3b6a: Increase $wgCopyUploadTimeout to 90 seconds (from default 25) (T118887) (duration: 00m 27s) [production]