2016-01-29
|
23:53 |
<jynus> |
restarted db1018 replication (and its codfw slaves) after a (somewhat) failed maintenance |
[production] |
23:41 |
<mutante> |
ruthenium - restart parsoid-rt-client, parsoid-vd-client |
[production] |
23:37 |
<mutante> |
ruthenium - git pull origin in /srv/visualdiff/ |
[production] |
23:22 |
<bd808@mira> |
Synchronized php-1.27.0-wmf.11/includes/session/SessionBackend.php: Testing proposed fix for T125267 (duration: 01m 26s) |
[production] |
22:50 |
<jynus> |
powercycling cp3042 to test it is really the broken one |
[production] |
22:37 |
<jynus> |
powercycle cp3049, not 42 |
[production] |
22:37 |
<jynus> |
powercycle cp3042 |
[production] |
22:27 |
<mutante> |
cp3042 - md0: unknown partition table |
[production] |
22:23 |
<mutante> |
powercycled cp1049 |
[production] |
22:06 |
<mutante> |
powercycle cp3049 |
[production] |
21:25 |
<YuviPanda> |
restarted image-resize-calc manually, no service.manifest file |
[tools] |
21:13 |
<mutante> |
bromine - stop and remove rsync service |
[production] |
20:16 |
<aaron@mira> |
Synchronized wmf-config/CommonSettings.php: Use the logical redis definition for GettingStarted (duration: 01m 26s) |
[production] |
19:36 |
<jynus> |
reinstall db1018 |
[production] |
18:42 |
<thcipriani> |
updated scap on beta |
[releng] |
18:11 |
<jynus> |
creating special partitioning for db2037 and db2044 (ETA:5 days, lag) |
[production] |
18:01 |
<jynus> |
creating special partitioning for db2034 and db2042 (ETA:5 days, lag) |
[production] |
17:51 |
<bd808@mira> |
Synchronized wmf-config/InitialiseSettings.php: Stop the first survey in fawiki and eswiki (f89621d) (duration: 01m 25s) |
[production] |
17:44 |
<bd808@mira> |
Synchronized php-1.27.0-wmf.11/includes/api/ApiMain.php: Log user-agents that are using HTTP when HTTPS is preferred (55ac0b7) (duration: 01m 26s) |
[production] |
17:41 |
<bd808@mira> |
Synchronized wmf-config/CommonSettings.php: Grant autocreateaccount to anons on loginwiki (d916008) (duration: 01m 27s) |
[production] |
17:39 |
<bd808@mira> |
Synchronized php-1.27.0-wmf.11/extensions/CentralAuth/includes/session/CentralAuthSessionProvider.php: CentralAuth: Take auto-creation into account (f526ef1) (duration: 01m 28s) |
[production] |
17:35 |
<bd808@mira> |
Synchronized php-1.27.0-wmf.11/includes/session/SessionBackend.php: SessionManager: Save user name to metadata even if the user doesn't exist locally (a39b4ac) (duration: 01m 29s) |
[production] |
17:01 |
<jynus> |
restarting mysql at db1018 |
[production] |
16:50 |
<robh> |
parsoid-vd restart was due to subbu's IRC request (I wasn't just randomly restarting things ;) |
[production] |
16:47 |
<robh> |
restarting parsoid-vd & parsoid-vd-client on ruthenium |
[production] |
16:44 |
<thcipriani> |
deployed scap updates on beta |
[releng] |
16:33 |
<ottomata> |
uninstalling impala in analytics cluster |
[production] |
15:51 |
<ottomata> |
restarting eventlogging |
[analytics] |
15:45 |
<bblack> |
upgrade packages (incl kernel) on eqiad caches hosts (cp1xxx) |
[production] |
15:37 |
<jynus@mira> |
Synchronized wmf-config/db-eqiad.php: Depool db1018 for maintenance (duration: 01m 49s) |
[production] |
15:32 |
<akosiaris> |
remove all networking configuration from asw-b-eqiad switch for nas1001-a, nas1001-b. Leave just descriptions |
[production] |
15:21 |
<bblack> |
upgrading packages (incl kernel) on esams cache hosts (cp3xxx) (codfw, ulsfo already done) |
[production] |
15:11 |
<akosiaris> |
powering off nas1001-a.eqiad.wmnet. https://phabricator.wikimedia.org/T124156 |
[production] |
15:08 |
<akosiaris> |
powering off nas1001-b.eqiad.wmnet. https://phabricator.wikimedia.org/T124156 |
[production] |
15:01 |
<elukey> |
re-enabled puppet on analytics1027 |
[production] |
14:59 |
<elukey> |
analytics1027 - puppet re-enabled, camus restarted |
[analytics] |
14:45 |
<elukey> |
disabled icinga on kafka1012 until Feb 07 |
[analytics] |
14:39 |
<joal> |
launching manual runs of camus to try to fix state |
[analytics] |
14:39 |
<elukey> |
stopped kafka (service) on kafka1012 (the host that caused the outage) |
[production] |
14:38 |
<elukey> |
stopped kafka on kafka1012 |
[analytics] |
14:24 |
<moritzm> |
rebooting bohrium for kernel update |
[production] |
14:03 |
<_joe_> |
installing the new HHVM package on all the codfw appservers |
[production] |
13:43 |
<_joe_> |
installing the new HHVM package to the canary appservers (main and api) |
[production] |
12:52 |
<elukey> |
analytics1027 - disabled puppet and camus for webrequest log |
[analytics] |
12:30 |
<paravoid> |
force-rebooting pollux |
[production] |
11:58 |
<_joe_> |
upgraded hhvm to 3.6 wm8 in deployment-prep |
[releng] |
11:43 |
<_joe_> |
uploaded hhvm_3.6.5+dfsg1-1+wm8 to trusty-wikimedia |
[production] |
11:22 |
<moritzm> |
rolling restart of swift in codfw |
[production] |