2018-04-15
|
20:38 |
<elukey> |
restart hhvm on mw12[22,79,82] - high load |
[production] |
20:32 |
<elukey> |
restart hhvm on mw12[32-35] - high load |
[production] |
20:24 |
<elukey> |
restart hhvm on mw1229-31 - high load |
[production] |
20:24 |
<_joe_> |
restarted mw1280-4, high load |
[production] |
20:17 |
<elukey> |
restart hhvm on mw122[6-8] - high load |
[production] |
20:05 |
<elukey> |
restart hhvm on mw122[3,4] - high load |
[production] |
13:42 |
<elukey> |
restart hhvm on mw1227 due to high load (hhvm dump debug in /tmp/hhvm.44071.bt) |
[production] |
10:53 |
<elukey> |
powercycle mw1272 - not responsive to ssh, mgmt com2 console showing "[OK" and no tty |
[production] |
2018-04-13
|
20:44 |
<imarlier@tin> |
Finished deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active) (duration: 00m 02s) |
[production] |
20:44 |
<imarlier@tin> |
Started deploy [performance/navtiming@8b6ab4e]: initial attempt to deploy navtiming via scap (will not be active) |
[production] |
20:00 |
<demon@tin> |
Pruned MediaWiki: 1.31.0-wmf.28 [keeping static files] (duration: 01m 34s) |
[production] |
19:23 |
<demon@tin> |
Pruned MediaWiki: 1.31.0-wmf.25 (duration: 05m 03s) |
[production] |
17:17 |
<andrewbogott> |
upgraded packages on all labvirts and restarted nova-compute |
[production] |
16:55 |
<arturo> |
enable puppet in labstore1005 |
[production] |
16:42 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Give db1104 original main traffic weight (duration: 01m 00s) |
[production] |
16:34 |
<andrewbogott> |
upgrading packages on labvirt1016 and rebooting (1016 is a spare server that won't affect VPS users) |
[production] |
16:26 |
<arturo> |
disable puppet in labstore1005 to hot-test https://gerrit.wikimedia.org/r/#/c/426103/ |
[production] |
16:24 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Give db1104 some main traffic - T191996 (duration: 01m 00s) |
[production] |
16:04 |
<hashar> |
cleaning up lost instances in nodepool (nodepool delete XXXXX) |
[production] |
15:50 |
<andrewbogott> |
upgrading lots of packages and rebooting labservices1001 and 1002 |
[production] |
15:43 |
<andrewbogott> |
restarting nodepool on labnodepool1001 |
[production] |
15:27 |
<andrewbogott> |
upgrading lots of packages and rebooting labnet1001 and labnet1002 for T145919 |
[production] |
15:14 |
<bd808> |
wiki replicas: added page_assessments views for frwiki & huwiki |
[production] |
15:09 |
<chasemp> |
labstore1004 stop nfs-exportd, cp export.bak to export.d, exportfs -ra (all exports were wiped out) |
[production] |
14:59 |
<andrewbogott> |
rebooting labcontrol1001 |
[production] |
14:42 |
<andrewbogott> |
upgrading lots of packages on labcontrol1001 and 1002 and rebooting. T145919 |
[production] |
14:38 |
<andrewbogott> |
stopping puppet and nodepool on labnodepool1001 |
[production] |
14:28 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Repool db1104 - T191996 (duration: 01m 07s) |
[production] |
14:22 |
<XioNoX> |
enable flow control on db1114's switch port - T191996 |
[production] |
14:16 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Depool db1104 - T191996 (duration: 00m 59s) |
[production] |
14:13 |
<andrewbogott> |
disabling puppet on labcontrol*, labnet*, labservices*, labvirt* before beginning T145919 |
[production] |
14:13 |
<moritzm> |
installing apache security updates on contint1001 |
[production] |
14:09 |
<andrewbogott> |
silencing alerts for labcontrol*, labnet*, labservices*, labvirt* before beginning T145919 |
[production] |
14:06 |
<moritzm> |
uploaded ivy-debian-helper to apt.wikimedia.org/jessie (needed for zookeeper backport) |
[production] |
13:52 |
<elukey> |
roll restart druid + zookeeper daemons on druid100[123] for openjdk-7 updates |
[production] |
13:49 |
<jynus@tin> |
Synchronized wmf-config/db-eqiad.php: Repool es1013 with full weight (duration: 01m 00s) |
[production] |
13:32 |
<elukey> |
restart druid and zookeeper daemons on druid100[456] for openjdk-7 updates |
[production] |
13:29 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Repool db1104 after alter table (duration: 01m 02s) |
[production] |
13:18 |
<urandom> |
increasing heap size to 16G -- T186751 |
[production] |
12:37 |
<moritzm> |
installing apache security updates on mendelevium (otrs) |
[production] |
12:36 |
<moritzm> |
installing apache security updates on bohrium (piwik) |
[production] |
11:58 |
<vgutierrez@neodymium> |
conftool action : set/pooled=yes; selector: name=maerlant.wikimedia.org,service=pdns_recursor |
[production] |
11:56 |
<jynus@tin> |
Synchronized wmf-config/db-eqiad.php: Repool es1013 with low load (duration: 01m 04s) |
[production] |
10:59 |
<moritzm> |
reimaging mw1261-mw1264 to stretch (T174431) |
[production] |
10:40 |
<vgutierrez@neodymium> |
conftool action : set/pooled=no; selector: name=maerlant.wikimedia.org,service=pdns_recursor |
[production] |
10:38 |
<vgutierrez> |
Depool and reimage maerlant.wikimedia.org as stretch |
[production] |
10:16 |
<vgutierrez@neodymium> |
conftool action : set/pooled=yes; selector: name=nescio.wikimedia.org,service=pdns_recursor |
[production] |
10:01 |
<moritzm> |
installing java security updates on meitnerium/archive.wikimedia.org |
[production] |
09:33 |
<jynus> |
start reimage of es1013 |
[production] |
09:03 |
<moritzm> |
reimaging mw1276-mw1278 to stretch (T174431) |
[production] |