2016-10-20
§
|
17:42 |
<urandom> |
T133395, T113805: Starting a primary-range, incremental repair of local_group_wiktionary_T_parsoid_html.data on restbase2001.codfw.wmnet |
[production] |
17:38 |
<mutante> |
rebooting kraz - short downtime of irc.wikimedia.org; please prepare to reconnect your clients if they don't do so automatically |
[production] |
17:35 |
<apergos> |
reboot of last few stragglers for mw* hosts in codfw/eqiad: mw2152 mw2079 mw1239 |
[production] |
17:29 |
<mutante> |
rebooting install2001 |
[production] |
17:00 |
<apergos> |
rolling reboot of video scalers in codfw/eqiad: mw1259 mw1260 mw2152 mw2246 |
[production] |
16:48 |
<apergos> |
rolling reboot of testservers in codfw/eqiad: mw1017 mw1099 mw2017 mw2099 |
[production] |
16:45 |
<mutante> |
rebooting install1001 |
[production] |
16:44 |
<gehel@puppetmaster1001> |
conftool action : set/pooled=yes; selector: dc=eqiad,cluster=logstash,service=kibana |
[production] |
16:34 |
<godog> |
reboot graphite1001 for kernel upgrade |
[production] |
16:30 |
<apergos> |
rolling reboots for jobrunners in eqiad: mw1161-1169, mw1299-1306 |
[production] |
16:26 |
<gehel> |
deploying new LVS service for kibana - T132458 |
[production] |
16:25 |
<godog> |
reboot graphite1003 for kernel upgrade |
[production] |
16:08 |
<moritzm> |
bounced ntp on mw2089/mw2241 (XFAC state) |
[production] |
15:59 |
<mutante> |
short downtime of ganglia web ui |
[production] |
15:59 |
<mutante> |
rebooting uranium |
[production] |
15:36 |
<apergos> |
rolling reboots for jobrunners in codfw: mw2080-2085, mw2153-mw2162, mw2247-2250 |
[production] |
15:14 |
<apergos> |
rolling reboot of image scalers for codfw, eqiad: mw2086-2089, mw2148-2151, mw1293-1298 |
[production] |
15:10 |
<ottomata> |
restarted statsv on hafnium |
[production] |
14:55 |
<moritzm> |
bounced ntp on mw2196/mw2197 (XFAC state) |
[production] |
14:34 |
<moritzm> |
rebooting rutherfordium for kernel update |
[production] |
14:27 |
<filippo@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=prometheus1001.eqiad.wmnet |
[production] |
14:26 |
<filippo@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=prometheus1002.eqiad.wmnet |
[production] |
14:24 |
<akosiaris> |
bounce ntpd on bast4001 |
[production] |
14:20 |
<moritzm> |
rebooting auth* servers |
[production] |
14:20 |
<ottomata> |
starting rolling restart of analytics-eqiad kafka brokers to apply kernel update |
[production] |
14:18 |
<filippo@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=prometheus2001.codfw.wmnet |
[production] |
14:18 |
<filippo@puppetmaster1001> |
conftool action : set/pooled=yes; selector: name=prometheus2002.codfw.wmnet |
[production] |
14:17 |
<apergos> |
rolling reboot of remaining app servers in codfw: mw2221-2245, and in eqiad: mw1261-1275 |
[production] |
14:11 |
<jmm@puppetmaster1001> |
conftool action : set/pooled=inactive; selector: mw2098.codfw.wmnet |
[production] |
14:09 |
<jynus@mira> |
Synchronized wmf-config/db-eqiad.php: mariadb: move db1053 from s1 to s4 (duration: 02m 06s) |
[production] |
13:38 |
<moritzm> |
restarting mx1001 for kernel update |
[production] |
13:22 |
<moritzm> |
restarting francium for kernel update |
[production] |
13:15 |
<godog> |
rolling reboot of prometheus machines for kernel update |
[production] |
13:14 |
<moritzm> |
restarting ms1001 for kernel update |
[production] |
13:10 |
<elukey> |
force failover from temporary Hadoop Master node (an1002) to its standby (an1001) to restore the standard configuration |
[production] |
13:05 |
<elukey> |
correction: force failover for Hadoop Master node (an1001) to its standby (an1002) and rebooting an1001 for kernel upgrades |
[production] |
12:59 |
<elukey> |
force failover for Hadoop Master node (an1002) to its standby (an1002) and rebooting an1001 for kernel upgrades |
[production] |
12:59 |
<moritzm> |
ferm on baham (failed to start due to failing DNS resolution in early boot) |
[production] |
12:52 |
<moritzm> |
restarting mx2001 for kernel update |
[production] |
12:48 |
<moritzm> |
bounced ntp on mw2116 (XFAC state) |
[production] |
12:39 |
<elukey> |
restarting an1003 for kernel upgrades (oozie/hive master) |
[production] |
12:35 |
<moritzm> |
bounced ntp on baham (was stuck in INIT phase) |
[production] |
12:31 |
<apergos> |
more app server rolling restarts for codfw: mw2163-2199 |
[production] |
12:29 |
<apergos> |
more API server rolling restarts for eqiad: mw1221-1235, 1276-1290 |
[production] |
12:27 |
<apergos> |
more APP server rolling restarts for eqiad: mw1209-1216, 128-1220, 1236-38, 1240-1258 |
[production] |
12:12 |
<moritzm> |
restarting bast2001 for kernel update |
[production] |
12:11 |
<apergos> |
retraction: those are app servers, not starting them yet |
[production] |
12:10 |
<apergos> |
more api server rolling restarts for eqiad: mw1209-1216, 128-1220, 1236-38, 1240-1258 |
[production] |
12:08 |
<moritzm> |
bounced ntp on mw2206 (XFAC state) |
[production] |
12:05 |
<bblack> |
correction: rebooting baham / ns1.wikimedia.org for kernel |
[production] |