2016-01-29
|
23:53 |
<jynus> |
restarted db1018 replication (and its codfw slaves) after a (somewhat) failed maintenance |
[production] |
23:41 |
<mutante> |
ruthenium - restart parsoid-rt-client, parsoid-vd-client |
[production] |
23:37 |
<mutante> |
ruthenium - git pull origin in /srv/visualdiff/ |
[production] |
23:22 |
<bd808@mira> |
Synchronized php-1.27.0-wmf.11/includes/session/SessionBackend.php: Testing proposed fix for T125267 (duration: 01m 26s) |
[production] |
22:50 |
<jynus> |
powercycling cp3042 to test it is really the broken one |
[production] |
22:37 |
<jynus> |
powercycle cp3049, not 42 |
[production] |
22:37 |
<jynus> |
powercycle cp3042 |
[production] |
22:27 |
<mutante> |
cp3042 - md0: unknown partition table |
[production] |
22:23 |
<mutante> |
powercycled cp1049 |
[production] |
22:06 |
<mutante> |
powercycle cp3049 |
[production] |
21:13 |
<mutante> |
bromine - stop and remove rsync service |
[production] |
20:16 |
<aaron@mira> |
Synchronized wmf-config/CommonSettings.php: Use the logical redis definition for GettingStarted (duration: 01m 26s) |
[production] |
19:36 |
<jynus> |
reinstall db1018 |
[production] |
18:11 |
<jynus> |
creating special partitioning for db2037 and db2044 (ETA:5 days, lag) |
[production] |
18:01 |
<jynus> |
creating special partitioning for db2034 and db2042 (ETA:5 days, lag) |
[production] |
17:51 |
<bd808@mira> |
Synchronized wmf-config/InitialiseSettings.php: Stop the first survey in fawiki and eswiki (f89621d) (duration: 01m 25s) |
[production] |
17:44 |
<bd808@mira> |
Synchronized php-1.27.0-wmf.11/includes/api/ApiMain.php: Log user-agents that are using HTTP when HTTPS is preferred (55ac0b7) (duration: 01m 26s) |
[production] |
17:41 |
<bd808@mira> |
Synchronized wmf-config/CommonSettings.php: Grant autocreateaccount to anons on loginwiki (d916008) (duration: 01m 27s) |
[production] |
17:39 |
<bd808@mira> |
Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthSessionProvider.php: CentralAuth: Take auto-creation into account (f526ef1) (duration: 01m 28s) |
[production] |
17:35 |
<bd808@mira> |
Synchronized php-1.27.0-wmf.11/includes/session/SessionBackend.php: SessionManager: Save user name to metadata even if the user doesn't exist locally (a39b4ac) (duration: 01m 29s) |
[production] |
17:01 |
<jynus> |
restarting mysql at db1018 |
[production] |
16:50 |
<robh> |
parsoid-vd restart was due to subbu irc request (i wasn't just randomly restarting things ;) |
[production] |
16:47 |
<robh> |
restarting parsoid-vd & parsoid-vd-client on ruthenium |
[production] |
16:33 |
<ottomata> |
uninstalling impala in analytics cluster |
[production] |
15:45 |
<bblack> |
upgrade packages (incl kernel) on eqiad caches hosts (cp1xxx) |
[production] |
15:37 |
<jynus@mira> |
Synchronized wmf-config/db-eqiad.php: Depool db1018 for maintenance (duration: 01m 49s) |
[production] |
15:32 |
<akosiaris> |
remove all networking configuration from asw-b-eqiad switch for nas1001-a, nas1001-b. Leave just descriptions |
[production] |
15:21 |
<bblack> |
upgrading packages (incl kernel) on esams cache hosts (cp3xxx) (codfw, ulsfo already done) |
[production] |
15:11 |
<akosiaris> |
powering off nas1001-a.eqiad.wmnet. https://phabricator.wikimedia.org/T124156 |
[production] |
15:08 |
<akosiaris> |
powering off nas1001-b.eqiad.wmnet. https://phabricator.wikimedia.org/T124156 |
[production] |
15:01 |
<elukey> |
re-enabled puppet on analytics1027 |
[production] |
14:39 |
<elukey> |
stopped kafka (service) on kafka1012 (the host that caused the outage) |
[production] |
14:24 |
<moritzm> |
rebooting bohrium for kernel update |
[production] |
14:03 |
<_joe_> |
installing the new hhvm package on all the codfw appservers |
[production] |
13:43 |
<_joe_> |
installing the new HHVM package to the canary appservers (main and api) |
[production] |
12:30 |
<paravoid> |
force-rebooting pollux |
[production] |
11:43 |
<_joe_> |
uploaded hhvm_3.6.5+dfsg1-1+wm8 to trusty-wikimedia |
[production] |
11:22 |
<moritzm> |
rolling restart of swift in codfw |
[production] |
11:14 |
<elukey> |
disabled puppet on analytics1027 due to issues with Camus and HDFS |
[production] |
10:17 |
<moritzm> |
rolling restart of swift in esams |
[production] |
02:32 |
<l10nupdate@tin> |
ResourceLoader cache refresh completed at Fri Jan 29 02:32:56 UTC 2016 (duration 7m 28s) |
[production] |
02:25 |
<mwdeploy@tin> |
sync-l10n completed (1.27.0-wmf.11) (duration: 10m 40s) |
[production] |
01:31 |
<ori@mira> |
Synchronized wmf-config: I83da57cf: Enable persistent redis connections for job runners (duration: 01m 11s) |
[production] |
01:03 |
<krenair@mira> |
Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/267186/ (duration: 01m 09s) |
[production] |
01:01 |
<krenair@mira> |
Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/265292/ (duration: 01m 14s) |
[production] |
00:57 |
<krenair@mira> |
Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/267071/ (duration: 01m 11s) |
[production] |
00:53 |
<krenair@mira> |
Synchronized wmf-config/CirrusSearch-production.php: https://gerrit.wikimedia.org/r/#/c/266995/ (duration: 01m 11s) |
[production] |