2017-06-21
§
|
13:39 |
<ema> |
reboot lvs[2001-2003] (codfw primaries) for kernel update |
[production] |
13:29 |
<elukey> |
reboot aqs1007 for kernel update |
[production] |
13:22 |
<marostegui> |
Deploy alter table on s7 - directly on codfw master (db2029) - this will generate lag on codfw - T166208 |
[production] |
13:21 |
<elukey> |
reboot kafka1013 for kernel updates |
[production] |
13:16 |
<marostegui> |
Deploy alter table s5 - labsdb1001 - T166207 |
[production] |
13:15 |
<marostegui> |
Deploy alter table s5 - db1045 - T166207 |
[production] |
13:14 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Depool db1045 - T166207 (duration: 00m 44s) |
[production] |
13:08 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Repool db1070 - T166207 (duration: 00m 46s) |
[production] |
13:05 |
<elukey> |
reboot analytics1003 (Hue, Camus, Oozie, Hive master) for kernel upgrade |
[production] |
12:32 |
<gehel> |
deploying T167871 and restarting kartotherian / tilerator on maps eqiad |
[production] |
12:32 |
<moritzm> |
rebooting mw1189-mw1199 for kernel update |
[production] |
12:10 |
<akosiaris@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=sca1004.eqiad.wmnet |
[production] |
12:09 |
<akosiaris@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=mwdebug1002.eqiad.wmnet |
[production] |
11:59 |
<moritzm> |
rebooting mw1209-mw1220 for kernel update |
[production] |
11:45 |
<moritzm> |
rebooting mediawiki api servers in codfw for kernel update |
[production] |
11:42 |
<akosiaris> |
rollback change in asw-a-eqiad for ganeti interface range due to alerts |
[production] |
11:23 |
<akosiaris> |
reboot ganeti1007 for insertion into ganeti cluster |
[production] |
11:14 |
<elukey> |
reboot aqs1006 for kernel update |
[production] |
11:04 |
<moritzm> |
rebooting mw1180-mw1188 for kernel update |
[production] |
11:02 |
<akosiaris> |
starting up all instances on ganeti01.svc.codfw.wmnet |
[production] |
11:01 |
<godog> |
reimage ms-be1018 / 1019 with stretch |
[production] |
10:58 |
<ema> |
reboot lvs[2004-2006] (codfw secondaries) for kernel update |
[production] |
10:50 |
<akosiaris> |
rebooting all ganeti200X nodes |
[production] |
10:47 |
<akosiaris> |
shutdown all VMs on the ganeti01.svc.codfw.wmnet cluster |
[production] |
10:43 |
<elukey> |
reboot analytics1001 (Hadoop master) for kernel update |
[production] |
10:35 |
<akosiaris> |
rebooting the entire codfw ganeti cluster for kernel upgrades. Silenced hosts in icinga already. T167643 |
[production] |
10:30 |
<moritzm> |
rebooting bast4001 for kernel update |
[production] |
10:21 |
<ema> |
reboot lvs[1001-1003] (eqiad primaries) for kernel update |
[production] |
10:17 |
<elukey> |
running a script called "check" in tmux on rdb[12]003 to periodically dump LLEN enwiki:jobqueue:enqueue:l-unclaimed; stopped the one on rdb2004 |
[production] |
10:07 |
<ema> |
reboot lvs[1004-1006] (eqiad secondaries) for kernel update |
[production] |
10:01 |
<elukey> |
reboot analytics1002 (Hadoop master standby) for kernel update |
[production] |
10:01 |
<moritzm> |
rebooting auth* servers for kernel update |
[production] |
09:48 |
<ema> |
reboot lvs[1010-1012] for kernel update |
[production] |
09:48 |
<elukey> |
reboot aqs1005 for kernel update |
[production] |
09:10 |
<elukey> |
reboot kafka2001 for kernel update (eventbus codfw) |
[production] |
09:06 |
<moritzm> |
rebooting restbase1017 for kernel update |
[production] |
08:52 |
<oblivian@puppetmaster1001> |
conftool action : set/pooled=inactive; selector: name=restbase2001.codfw.wmnet,dc=codfw,service=restbase |
[production] |
08:49 |
<_joe_> |
correction: restarting pybal |
[production] |
08:49 |
<_joe_> |
restarting etcd on lvs2003/2006, connection lost to etcd |
[production] |
08:34 |
<elukey> |
reboot kafka1012 for kernel upgrades |
[production] |
08:34 |
<marostegui> |
Deploy alter table db1070 s5 - T166207 |
[production] |
08:33 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Depool db1070 - T166207 (duration: 00m 44s) |
[production] |
08:27 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Repool db1082 - T166207 (duration: 00m 45s) |
[production] |
08:26 |
<godog> |
reimage ms-be1014 / 1015 with jessie |
[production] |
07:37 |
<marostegui> |
Stop and reset slave s5 on dbstore2001 - T168354 |
[production] |
06:23 |
<mutante> |
planet2001 wget missing unpuppetized logo file from https://en.planet.wikimedia.org/images/planet-wm2.png - should fix puppet run |
[production] |
06:19 |
<marostegui> |
Stop replication and puppet on db2066 for maintenance - T168354 |
[production] |
06:18 |
<marostegui@tin> |
Synchronized wmf-config/db-codfw.php: Depool db2066 - T168354 (duration: 00m 43s) |
[production] |
06:08 |
<elukey> |
reboot thorium for kernel upgrades (outage to all the analytics websites) |
[production] |
06:05 |
<marostegui> |
Deploy alter table s5 - db1082 - T166207 |
[production] |