2018-04-11
|
13:09 |
<marostegui> |
Drop prefstats table on s3 codfw master - db2043 (this might generate lag on codfw) - T154490 |
[production] |
13:01 |
<vgutierrez> |
Reimage lvs4007 as stretch |
[production] |
13:00 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Repool es2012 (duration: 01m 00s) |
[production] |
12:39 |
<mobrovac@tin> |
Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 (retry #2) (duration: 01m 01s) |
[production] |
12:32 |
<mobrovac@tin> |
Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 (retry) - T190327 (duration: 01m 00s) |
[production] |
12:21 |
<mobrovac@tin> |
Synchronized wmf-config/InitialiseSettings.php: Switch a bulk of low-traffic jobs to EventBus for testwikis, file 1/2 - T190327 (duration: 01m 01s) |
[production] |
12:21 |
<moritzm> |
enable production traffic for mw1265 (stretch app server) for a brief test period |
[production] |
12:09 |
<jynus> |
start reimage of es2012 |
[production] |
12:05 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Repool es2011, depool es2012 (duration: 01m 01s) |
[production] |
11:47 |
<jynus> |
start reimage of es2011 |
[production] |
11:09 |
<ema> |
start pybal on lvs5001, test completed on lvs5003 |
[production] |
11:04 |
<marostegui> |
Drop table prefstats in s7 - T154490 |
[production] |
10:59 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Repool es2015, depool es2011 (duration: 00m 59s) |
[production] |
10:56 |
<ema> |
stop pybal on lvs5001 to test requests through lvs5003, reimaged as stretch T191897 |
[production] |
10:50 |
<moritzm> |
installing openssl updates |
[production] |
10:43 |
<marostegui> |
Drop table prefstats in s2 - T154490 |
[production] |
10:33 |
<marostegui> |
Drop table prefstats in s4 - T154490 |
[production] |
10:31 |
<marostegui> |
Drop table prefstats in s6 - T154490 |
[production] |
10:28 |
<marostegui> |
Drop table prefstats in s5 - T154490 |
[production] |
10:04 |
<jynus> |
start reimage of es2015 |
[production] |
10:00 |
<moritzm> |
installing java security updates on kafka/jumbo cluster |
[production] |
09:57 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Repool es2014, depool es2015 (duration: 01m 02s) |
[production] |
09:52 |
<moritzm> |
installing java security updates on kafka/analytics cluster |
[production] |
09:29 |
<arturo> |
doing some testing in labtestvirt2001 mounting instance's qcow2 files into /home/aborrero/mnt |
[production] |
09:17 |
<jynus> |
start reimage of es2014 |
[production] |
09:08 |
<jynus@tin> |
Synchronized wmf-config/db-codfw.php: Depool es2014 (duration: 01m 03s) |
[production] |
09:03 |
<ema> |
restart pybal on lvs1003 for UDP monitoring config changes https://gerrit.wikimedia.org/r/#/c/425251/ |
[production] |
08:59 |
<moritzm> |
reimaging mw1265 to stretch (T174431) |
[production] |
08:18 |
<jynus> |
rerunning eqiad misc backups |
[production] |
08:03 |
<marostegui@tin> |
Synchronized wmf-config/db-codfw.php: Repool db2069 as candidate master for x1 - T191275 (duration: 01m 03s) |
[production] |
07:45 |
<ema> |
cp2022: restart varnish-be due to child process crash https://phabricator.wikimedia.org/P6979 T191229 |
[production] |
07:27 |
<marostegui> |
Stop MySQL on db2033 to copy its data away before reimaging - T191275 |
[production] |
07:08 |
<vgutierrez> |
Reimaging lvs5003.eqsin as stretch (2nd attempt) |
[production] |
06:49 |
<elukey> |
restart Yarn Resource Manager daemons on analytics100[12] to pick up the new Prometheus configuration file |
[production] |
06:20 |
<marostegui> |
Stop MySQL on db2033 to clone db2069 - T191275 |
[production] |
06:17 |
<marostegui@tin> |
Synchronized wmf-config/db-eqiad.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 03s) |
[production] |
06:15 |
<marostegui@tin> |
Synchronized wmf-config/db-codfw.php: Add db2069 to the config as depooled x1 slave - T191275 (duration: 01m 01s) |
[production] |
05:28 |
<Krinkle> |
manual coal back-fill still running with the normal coal disabled via systemd. Will restore normal coal when I wake up. |
[production] |
05:22 |
<marostegui> |
Deploy schema change on codfw s8 master (db2045) with replication enabled (this will generate lag on codfw) - T187089 T185128 T153182 |
[production] |
05:17 |
<marostegui> |
Reload haproxy on dbprox1010 to repool labsdb1010 |
[production] |
02:36 |
<l10nupdate@tin> |
scap sync-l10n completed (1.31.0-wmf.28) (duration: 05m 41s) |
[production] |
00:12 |
<bstorm_> |
Updated views and indexes on labsdb1011 |
[production] |
2018-04-10
|
23:32 |
<XioNoX> |
depooled eqsin due to router issue |
[production] |
23:04 |
<Krinkle> |
Seemingly from 22:53 to 23:03 global traffic dropped by 30-60%, presumably due to issues in eqiad, where 10 Gbits dropped to 3 Gbits more sharply than ever before. |
[production] |
22:49 |
<joal@tin> |
Finished deploy [analytics/refinery@33448cd]: Deploying fixes after today's deploy errors (duration: 04m 46s) |
[production] |
22:45 |
<joal@tin> |
Started deploy [analytics/refinery@33448cd]: Deploying fixes after today's deploy errors |
[production] |
21:18 |
<sbisson@tin> |
Finished deploy [kartotherian/deploy@8f3a903]: Rollback kartotherian to v0.0.35 (duration: 06m 27s) |
[production] |
21:12 |
<sbisson@tin> |
Started deploy [kartotherian/deploy@8f3a903]: Rollback kartotherian to v0.0.35 |
[production] |
20:41 |
<sbisson@tin> |
Finished deploy [kartotherian/deploy@bdf70ed]: Deploying kartotherian pre-i18n everywhere (downgrade snapshot) (duration: 03m 45s) |
[production] |
20:37 |
<sbisson@tin> |
Started deploy [kartotherian/deploy@bdf70ed]: Deploying kartotherian pre-i18n everywhere (downgrade snapshot) |
[production] |