2019-10-22
|
17:57 |
<sbassett> |
Deployed security fix for T234450 to wmf.2 |
[production] |
17:57 |
<mholloway-shell@deploy1001> |
Finished deploy [mobileapps/deploy@b4c484a]: Build structured talk pages by walking the DOM (T235213) (duration: 05m 14s) |
[production] |
17:54 |
<mutante> |
restarting gerrit to disable jgit gc (T236114) |
[production] |
17:51 |
<mholloway-shell@deploy1001> |
Started deploy [mobileapps/deploy@b4c484a]: Build structured talk pages by walking the DOM (T235213) |
[production] |
17:37 |
<arlolra> |
Updated Parsoid to cf01d91 (T234057, T234768, T235296, T235684, T235563) |
[production] |
17:26 |
<arlolra@deploy1001> |
Finished deploy [parsoid/deploy@4c64c9c]: Updating Parsoid to cf01d91 (duration: 07m 37s) |
[production] |
17:20 |
<bblack> |
geodns: re-pooling esams (at this point, we're entirely back in our "normal" state of affairs) |
[production] |
17:19 |
<arlolra@deploy1001> |
Started deploy [parsoid/deploy@4c64c9c]: Updating Parsoid to cf01d91 |
[production] |
16:51 |
<bblack> |
geodns: moving all "normal" eqiad traffic back to eqiad (in addition to the esams-diverted traffic which is still pointed mostly at eqiad right now) |
[production] |
16:21 |
<mutante> |
running puppet on deployment servers |
[production] |
16:20 |
<thcipriani> |
restarting gerrit |
[production] |
16:14 |
<thcipriani> |
stopping gerrit to run a fix for T222391 |
[production] |
15:58 |
<bblack> |
depooling esams temporarily to test traffic scenario on lvs1014 |
[production] |
15:47 |
<bblack> |
enable pybal+puppet on rebooted lvs1014 |
[production] |
15:40 |
<bblack> |
rebooting lvs1014 |
[production] |
15:28 |
<liw@deploy1001> |
Finished scap: testwiki to php-1.35.0-wmf.3 and rebuild l10n cache (duration: 37m 39s) |
[production] |
15:26 |
<XioNoX> |
repool esams |
[production] |
15:20 |
<XioNoX> |
rollback ns2 redirect |
[production] |
15:13 |
<bblack> |
re-disabling lvs1014 ... |
[production] |
15:10 |
<bblack> |
re-enabling lvs1014 pybal/puppet |
[production] |
15:03 |
<moritzm> |
rebooting kafka-main1005 for microcode debugging |
[production] |
15:01 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
15:01 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
14:52 |
<bblack> |
stopping puppet and pybal on lvs1014 (upload+maps traffic to 1016) |
[production] |
14:50 |
<liw@deploy1001> |
Started scap: testwiki to php-1.35.0-wmf.3 and rebuild l10n cache |
[production] |
14:45 |
<mbsantos@deploy1001> |
Finished deploy [kartotherian/deploy@85ea6e1]: Deploy kartotherian 1.1.5-wmf.0 (duration: 02m 44s) |
[production] |
14:42 |
<mbsantos@deploy1001> |
Started deploy [kartotherian/deploy@85ea6e1]: Deploy kartotherian 1.1.5-wmf.0 |
[production] |
14:13 |
<XioNoX> |
restart asw-esams for onsite work |
[production] |
13:52 |
<andrewbogott> |
restarted slapd on ldap-eqiad-replica01 |
[production] |
13:38 |
<gehel> |
silencing LVS check for kartotherian (we know there is an issue) - T236163 |
[production] |
13:35 |
<liw@deploy1001> |
scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="labtestwiki" --outdir="/tmp/scap_l10n_2419219323" --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 06m 40s) |
[production] |
13:28 |
<liw@deploy1001> |
Started scap: testwiki to php-1.34.0-wmf.3 and rebuild l10n cache |
[production] |
13:13 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
13:13 |
<ayounsi@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
13:06 |
<XioNoX> |
depool esams for onsite work - T235805 |
[production] |
13:05 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool db1096:3316 db1105:3311 db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9434 and previous config saved to /var/cache/conftool/dbconfig/20191022-130556-marostegui.json |
[production] |
12:54 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'More traffic to db1096:3316 db1105:3311 instance db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9433 and previous config saved to /var/cache/conftool/dbconfig/20191022-125435-marostegui.json |
[production] |
12:46 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'More traffic to db1096:3316 db1105:3311 instance db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9432 and previous config saved to /var/cache/conftool/dbconfig/20191022-124607-marostegui.json |
[production] |
12:37 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1096:3316 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9431 and previous config saved to /var/cache/conftool/dbconfig/20191022-123757-marostegui.json |
[production] |
12:32 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Slowly repool db1105:3312 and db1105:3311 after on-site maintenance T235877', diff saved to https://phabricator.wikimedia.org/P9430 and previous config saved to /var/cache/conftool/dbconfig/20191022-123257-marostegui.json |
[production] |
12:30 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db2089:3315', diff saved to https://phabricator.wikimedia.org/P9429 and previous config saved to /var/cache/conftool/dbconfig/20191022-123032-marostegui.json |
[production] |
12:29 |
<moritzm> |
rebooting miscweb2001 for some microcode tests |
[production] |
12:28 |
<marostegui> |
Compress db1096:3315 |
[production] |
12:27 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
12:27 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
12:25 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Repool pc1007 after PDU maintenance T227142 (duration: 00m 50s) |
[production] |
12:14 |
<jynus> |
reimage to buster dbmonitor2001.wikimedia.org T224589 |
[production] |
11:57 |
<liw> |
starting to cut branch for train 1.35-wmf.3 |
[production] |
11:51 |
<hashar> |
Restarted CI Jenkins on contint1001 |
[production] |
11:35 |
<marostegui> |
Stop MySQL on db1105:3311, db1105:3312 for firmware upgrade - T235877 |
[production] |