2015-12-22
|
15:53 |
<mutante> |
kafka1001,1002 - crit - eventlogging not running (?) |
[production] |
15:52 |
<mutante> |
restbase1003 - disk space, restbase1008 - disk space, restbase1004 - cassandra cql refused |
[production] |
15:23 |
<akosiaris> |
upgrade cassandra on maps-test2003 |
[production] |
15:06 |
<jynus> |
restarting and reconfiguring mysql at dbstore2001 |
[production] |
15:06 |
<mutante> |
labtestcontrol2001 - puppet had not been running for a while; a bunch of changes have been applied, incl. keys and passwords |
[production] |
15:04 |
<mutante> |
enabling puppet on labtestcontrol2001 |
[production] |
15:04 |
<akosiaris> |
upgraded cassandra on maps-test2004 |
[production] |
11:54 |
<apergos> |
salt packages with wmf patches: precise running on ms-{bf}e* in esams; trusty running on analytics103* in eqiad; jessie running on restbase2* in codfw |
[production] |
11:43 |
<godog> |
restart cassandra bootstrap on restbase1004 |
[production] |
10:09 |
<jynus> |
online resizing /srv/postgres on labsdb1006 +100GB |
[production] |
10:06 |
<hashar> |
Restarting Jenkins |
[production] |
09:54 |
<apergos> |
precise and trusty salt packages with wmf patches deployed manually on dataset1001 and analytics1001, seem to work fine |
[production] |
08:42 |
<jynus> |
restarting and reconfiguring mysql at db2036 |
[production] |
02:30 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Tue Dec 22 02:30:28 UTC 2015 (duration 6m 54s) |
[production] |
02:23 |
<mwdeploy@tin> |
sync-l10n completed (1.27.0-wmf.9) (duration: 09m 47s) |
[production] |
00:29 |
<krenair@tin> |
Synchronized php-1.27.0-wmf.9/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/260492/ (duration: 00m 32s) |
[production] |
00:22 |
<krenair@tin> |
Synchronized php-1.27.0-wmf.9/extensions/SyntaxHighlight_GeSHi/modules/ve-syntaxhighlight/ve.ui.MWSyntaxHighlightDialogTool.js: https://gerrit.wikimedia.org/r/#/c/260429/ (duration: 00m 30s) |
[production] |
2015-12-21
|
20:49 |
<godog> |
restbase1004 bootstrap failed, restbase1007-a is down: java.lang.RuntimeException: A node required to move the data consistently is down (/10.64.0.230). |
[production] |
19:27 |
<legoktm> |
running checkLocalUser.php --delete=1 for real this time on terbium |
[production] |
19:22 |
<godog> |
reimage restbase1004 |
[production] |
19:14 |
<paravoid> |
powercycling mw1011 |
[production] |
19:11 |
<paravoid> |
rolling restart of hhvm on the eqiad jobrunners |
[production] |
18:47 |
<jynus> |
common-sync: Copying to mw1016.eqiad.wmnet from tin.eqiad.wmnet |
[production] |
18:35 |
<ori> |
correction: previous log message was for mw1015, not mw1017 |
[production] |
18:27 |
<ori> |
mw1017: enabled jemalloc profiling, restarted hhvm, now running hhvm-collect-heaps |
[production] |
17:48 |
<akosiaris> |
restarted hhvm on mw1012.eqiad.wmnet |
[production] |
16:57 |
<thcipriani> |
timeout on sync-file to mw1016.eqiad.wmnet |
[production] |
16:56 |
<thcipriani@tin> |
Synchronized php-1.27.0-wmf.9/extensions/Popups/Popups.hooks.php: SWAT: Use ExtensionRegistry to determine whether TextExtracts is installed [[gerrit:260346]] (duration: 02m 48s) |
[production] |
16:34 |
<jynus> |
sync-common to mw1085 |
[production] |
16:26 |
<jynus> |
powercycling mw1085.eqiad.wmnet |
[production] |
16:22 |
<thcipriani> |
mw1085.eqiad.wmnet times out on SSH connection |
[production] |
16:19 |
<godog> |
reboot restbase1007, load through the roof |
[production] |
16:18 |
<thcipriani@tin> |
Synchronized php-1.27.0-wmf.9/extensions/CentralNotice/resources/subscribing/ext.centralNotice.geoIP.js: SWAT: Update CentralNotice [[gerrit:260316]] (duration: 03m 03s) |
[production] |
16:08 |
<godog> |
depool restbase1007 |
[production] |
16:01 |
<apergos> |
jessie packages for salt with local patches deployed on restbase1001, looks fine but just in case. |
[production] |
15:44 |
<godog> |
adding new 1TB disk to restbase1007 |
[production] |
14:22 |
<andrewbogott> |
disabling puppet on labnet1002 for dnsmasq tests |
[production] |
14:07 |
<MaxSem> |
me and yurik are nuking old maps data and reimporting planet |
[production] |
13:46 |
<jynus> |
extending online s2-master data disk by +100GB |
[production] |
13:15 |
<akosiaris> |
disabled puppet on maps-test2001 and commented out osmupdater crontab entry until we fix the sync process |
[production] |
11:02 |
<jynus> |
emergency restart of db1047's mysql |
[production] |
09:54 |
<jynus> |
reenabling semisync replication on s3 |
[production] |
09:07 |
<godog> |
stop cassandra on restbase1004, decommissioned |
[production] |
02:29 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Mon Dec 21 02:29:51 UTC 2015 (duration 6m 47s) |
[production] |
02:23 |
<mwdeploy@tin> |
sync-l10n completed (1.27.0-wmf.9) (duration: 09m 45s) |
[production] |
02:20 |
<andrewbogott> |
disabling puppet on labnet1002 to mess with dnsmasq |
[production] |
01:44 |
<andrewbogott> |
disabled puppet on holmium and labservices1001 to control roll-out of https://gerrit.wikimedia.org/r/#/c/260037/ |
[production] |