2015-12-22 §
15:04 <mutante> enabling puppet on labtestcontrol2001 [production]
15:04 <akosiaris> upgraded cassandra on maps-test2004 [production]
11:54 <apergos> salt packages with wmf packages precise running on ms-{bf}e* in esams; trusty running on analytics103* in eqiad; jessie running on restbase2* in codfw [production]
11:43 <godog> restart cassandra bootstrap on restbase1004 [production]
10:09 <jynus> online resizing /srv/postgres on labsdb1006 +100GB [production]
10:06 <hashar> Restarting Jenkins [production]
09:54 <apergos> precise and trusty salt packages with wmf patches deployed manually on dataset1001 and analytics1001, seem to work fine [production]
08:42 <jynus> restarting and reconfiguring mysql at db2036 [production]
02:30 <l10nupdate@tin> ResourceLoader cache refresh completed at Tue Dec 22 02:30:28 UTC 2015 (duration 6m 54s) [production]
02:23 <mwdeploy@tin> sync-l10n completed (1.27.0-wmf.9) (duration: 09m 47s) [production]
00:29 <krenair@tin> Synchronized php-1.27.0-wmf.9/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/260492/ (duration: 00m 32s) [production]
00:22 <krenair@tin> Synchronized php-1.27.0-wmf.9/extensions/SyntaxHighlight_GeSHi/modules/ve-syntaxhighlight/ve.ui.MWSyntaxHighlightDialogTool.js: https://gerrit.wikimedia.org/r/#/c/260429/ (duration: 00m 30s) [production]
2015-12-21 §
20:49 <godog> restbase1004 bootstrap failed, restbase1007-a is down java.lang.RuntimeException: A node required to move the data consistently is down (/10.64.0.230). [production]
19:27 <legoktm> running checkLocalUser.php --delete=1 for real this time on terbium [production]
19:22 <godog> reimage restbase1004 [production]
19:14 <paravoid> powercycling mw1011 [production]
19:11 <paravoid> rolling restart of hhvm on the eqiad jobrunners [production]
18:47 <jynus> common-sync: Copying to mw1016.eqiad.wmnet from tin.eqiad.wmnet [production]
18:35 <ori> correction: previous log message was for mw1015, not mw1017 [production]
18:27 <ori> mw1017: enabled jemalloc profiling, restarted hhvm, now running hhvm-collect-heaps [production]
17:48 <akosiaris> restarted hhvm on mw1012.eqiad.wmnet [production]
16:57 <thcipriani> timeout on sync-file to mw1016.eqiad.wmnet [production]
16:56 <thcipriani@tin> Synchronized php-1.27.0-wmf.9/extensions/Popups/Popups.hooks.php: SWAT: Use ExtensionRegistry to determine whether TextExtracts is installed [[gerrit:260346]] (duration: 02m 48s) [production]
16:34 <jynus> sync-common to mw1085 [production]
16:26 <jynus> powercycling mw1085.eqiad.wmnet [production]
16:22 <thcipriani> mw1085.eqiad.wmnet times out on SSH connection [production]
16:19 <godog> reboot restbase1007, load through the roof [production]
16:18 <thcipriani@tin> Synchronized php-1.27.0-wmf.9/extensions/CentralNotice/resources/subscribing/ext.centralNotice.geoIP.js: SWAT: Update CentralNotice [[gerrit:260316]] (duration: 03m 03s) [production]
16:08 <godog> depool restbase1007 [production]
16:01 <apergos> jessie packages for salt with local patches deployed on restbase1001, looks fine but just in case. [production]
15:44 <godog> adding new 1TB disk to restbase1007 [production]
14:22 <andrewbogott> disabling puppet on labnet1002 for dnsmasq tests [production]
14:07 <MaxSem> me and yurik are nuking old maps data and reimporting planet [production]
13:46 <jynus> extending online s2-master data disk by +100GB [production]
13:15 <akosiaris> disabled puppet on maps-test2001 and commented out osmupdater crontab entry until we fix the sync process [production]
11:02 <jynus> emergency restart of db1047's mysql [production]
09:54 <jynus> reenabling semisync replication on s3 [production]
09:07 <godog> stop cassandra on restbase1004, decommissioned [production]
02:29 <l10nupdate@tin> ResourceLoader cache refresh completed at Mon Dec 21 02:29:51 UTC 2015 (duration 6m 47s) [production]
02:23 <mwdeploy@tin> sync-l10n completed (1.27.0-wmf.9) (duration: 09m 45s) [production]
02:20 <andrewbogott> disabling puppet on labnet1002 to mess with dnsmasq [production]
01:44 <andrewbogott> disabled puppet on holmium and labservices1001 to control roll-out of https://gerrit.wikimedia.org/r/#/c/260037/ [production]
2015-12-20 §
23:24 <Reedy> Katie and Jeff paged about bellatrix [production]
18:46 <andrewbogott> graceful restart of zuul as per https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Restart [production]
18:31 <andrewbogott> restarting stuck Jenkins [production]
17:47 <reedy@tin> Purged l10n cache for 1.27.0-wmf.6 [production]
17:11 <godog> depool mw1228, reported ro fs [production]
15:53 <reedy@tin> Synchronized README: noop (duration: 00m 32s) [production]
15:50 <Reedy> reedy@tin Purged l10n cache for 1.27.0-wmf.6 (hanging due to mw1228 issue) [production]
15:42 <Reedy> mw1228 reporting readonly fs [production]