2016-10-21
|
23:45 |
<mutante> |
depooling maps1002 (by running "depool" on the server itself) |
[production] |
23:35 |
<yurik> |
maps1002.eqiad is running older/incorrect/misbehaving software for some reason, restart didn't help. Need to depool |
[production] |
22:17 |
<mutante> |
cp4006,cp4014 gzipped some logs in home for disk space |
[production] |
22:08 |
<mutante> |
cp4006, cp4014 were running out of disk, apt-get clean |
[production] |
21:40 |
<mutante> |
phab2001 that IP was also on iridium/phab1001, it should not be hardcoded in puppet, causing issues in T143363 |
[production] |
21:37 |
<mutante> |
phab2001 - ip addr del 10.64.32.186/21 dev eth0 |
[production] |
21:06 |
<bblack> |
restarting varnish backends (depooled, etc) for eqiad cache_upload: cp1049, cp1072, cp1074 |
[production] |
19:50 |
<cmjohnson1> |
dataset1001 array 1 swap failed disk slot 4 |
[production] |
19:40 |
<cmjohnson1> |
labvirt1005 swapping disk 0 |
[production] |
19:40 |
<gehel> |
routing traffic for cache-maps in codfw -> eqiad |
[production] |
19:29 |
<gehel> |
running puppet on eqiad cache nodes to activate maps traffic redirection |
[production] |
19:06 |
<gehel> |
shutting down cassandra on maps2004, seems to have lost data |
[production] |
18:22 |
<ejegg> |
updated SmashPig from d1ca0632d00dfb608f70ca4b70251a5ba49f4411 to e28b2cd9f0c1429acdd2a08c68f95884dbffb594 |
[production] |
16:45 |
<ejegg> |
updated fundraising tools from 09ae6e24d8ca8350dc099d63a6ca0d9ec9fdef2b to f83e39291adc55677fc4b49307dc4807eba18019 |
[production] |
16:33 |
<mutante> |
rebooting planet1001 - *.planet.wm.org will be right back |
[production] |
16:30 |
<mutante> |
rebooting planet2001 |
[production] |
16:05 |
<elukey> |
reimaging mc1021 with wmf-auto-reimage (T137345) |
[production] |
15:28 |
<elukey> |
reimaging mc1019 with wmf-auto-reimage (T137345) |
[production] |
14:50 |
<elukey> |
reimaging mc1020 with wmf-auto-reimage (T137345) |
[production] |
14:31 |
<_joe_> |
rebooting all kubernetes worker nodes in production |
[production] |
14:31 |
<moritzm> |
rolling reboot of thumbor* for kernel update |
[production] |
14:29 |
<marostegui> |
Stopping replication on db2055 to use it to clone another host - T146261 |
[production] |
13:55 |
<bblack> |
restart isc-dhcp-server on carbon |
[production] |
13:55 |
<moritzm> |
rolling reboot of thumbor* for kernel update |
[production] |
13:40 |
<moritzm> |
completed rolling reboot of restbase in codfw |
[production] |
13:14 |
<marostegui> |
Deploying schema change S6 ruwiki for table ores_model - T147734 |
[production] |
12:24 |
<moritzm> |
rebooting ruthenium for kernel update |
[production] |
12:02 |
<moritzm> |
rebooting bromine for kernel update |
[production] |
11:28 |
<gehel> |
starting rolling restart of elasticsearch eqiad cluster |
[production] |
11:04 |
<moritzm> |
rebooting hafnium for kernel update |
[production] |
10:49 |
<jynus@mira> |
Synchronized wmf-config/db-eqiad.php: mariadb: pool db1053 as the new rc special slave after maintenance (duration: 01m 00s) |
[production] |
10:36 |
<marostegui> |
Deploying schema change S2 several wikis for table ores_model - T147734 |
[production] |
10:28 |
<bblack> |
rebooting radon (ns0) |
[production] |
10:22 |
<moritzm> |
rolling reboot of restbase in codfw for kernel update |
[production] |
10:09 |
<marostegui> |
Deploying schema change S7 fawiki.ores_model - T147734 |
[production] |
10:04 |
<moritzm> |
rebooting seaborgium (labs LDAP server) for kernel update |
[production] |
09:51 |
<marostegui> |
Deploying schema change S5 wikidatawiki.ores_model - T147734 |
[production] |
09:48 |
<moritzm> |
rebooting neon (icinga host) for kernel update |
[production] |
09:35 |
<marostegui> |
Deploying schema change S1 enwiki.ores_model in eqiad - T147734 |
[production] |
09:32 |
<elukey> |
rebooting kafka100[12] for kernel upgrades (EventBus hosts) |
[production] |
09:26 |
<moritzm> |
rebooting krypton for kernel update |
[production] |