2019-10-22
17:57 <sbassett> Deployed security fix for T234450 to wmf.2 [production]
17:57 <mholloway-shell@deploy1001> Finished deploy [mobileapps/deploy@b4c484a]: Build structured talk pages by walking the DOM (T235213) (duration: 05m 14s) [production]
17:54 <mutante> restarting gerrit to disable jgit gc (T236114) [production]
17:51 <mholloway-shell@deploy1001> Started deploy [mobileapps/deploy@b4c484a]: Build structured talk pages by walking the DOM (T235213) [production]
17:37 <arlolra> Updated Parsoid to cf01d91 (T234057, T234768, T235296, T235684, T235563) [production]
17:26 <arlolra@deploy1001> Finished deploy [parsoid/deploy@4c64c9c]: Updating Parsoid to cf01d91 (duration: 07m 37s) [production]
17:20 <bblack> geodns: re-pooling esams (at this point, we're entirely back in our "normal" state of affairs) [production]
17:19 <arlolra@deploy1001> Started deploy [parsoid/deploy@4c64c9c]: Updating Parsoid to cf01d91 [production]
16:51 <bblack> geodns: moving all "normal" eqiad traffic back to eqiad (in addition to the esams-diverted traffic which is still pointed mostly at eqiad right now) [production]
16:21 <mutante> running puppet on deployment servers [production]
16:20 <thcipriani> restarting gerrit [production]
16:14 <thcipriani> stopping gerrit to run a fix for T222391 [production]
15:58 <bblack> depooling esams temporarily to test traffic scenario on lvs1014 [production]
15:47 <bblack> enable pybal+puppet on rebooted lvs1014 [production]
15:40 <bblack> rebooting lvs1014 [production]
15:28 <liw@deploy1001> Finished scap: testwiki to php-1.35.0-wmf.3 and rebuild l10n cache (duration: 37m 39s) [production]
15:26 <XioNoX> repool esams [production]
15:20 <XioNoX> rollback ns2 redirect [production]
15:13 <bblack> re-disabling lvs1014 ... [production]
15:10 <bblack> re-enabling lvs1014 pybal/puppet [production]
15:03 <moritzm> rebooting kafka-main1005 for microcode debugging [production]
15:01 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
15:01 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
14:52 <bblack> stopping puppet and pybal on lvs1014 (upload+maps traffic to 1016) [production]
14:50 <liw@deploy1001> Started scap: testwiki to php-1.35.0-wmf.3 and rebuild l10n cache [production]
14:45 <mbsantos@deploy1001> Finished deploy [kartotherian/deploy@85ea6e1]: Deploy kartotherian 1.1.5-wmf.0 (duration: 02m 44s) [production]
14:42 <mbsantos@deploy1001> Started deploy [kartotherian/deploy@85ea6e1]: Deploy kartotherian 1.1.5-wmf.0 [production]
14:13 <XioNoX> restart asw-esams for onsite work [production]
13:52 <andrewbogott> restarted slapd on ldap-eqiad-replica01 [production]
13:38 <gehel> silencing LVS check for kartotherian (we know there is an issue) - T236163 [production]
13:35 <liw@deploy1001> scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="labtestwiki" --outdir="/tmp/scap_l10n_2419219323" --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 06m 40s) [production]
13:28 <liw@deploy1001> Started scap: testwiki to php-1.34.0-wmf.3 and rebuild l10n cache [production]
13:13 <ayounsi@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
13:13 <ayounsi@cumin1001> START - Cookbook sre.hosts.downtime [production]
13:06 <XioNoX> depool esams for onsite work - T235805 [production]
13:05 <marostegui@cumin1001> dbctl commit (dc=all): 'Fully repool db1096:3316 db1105:3311 db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9434 and previous config saved to /var/cache/conftool/dbconfig/20191022-130556-marostegui.json [production]
12:54 <marostegui@cumin1001> dbctl commit (dc=all): 'More traffic to db1096:3316 db1105:3311 instance db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9433 and previous config saved to /var/cache/conftool/dbconfig/20191022-125435-marostegui.json [production]
12:46 <marostegui@cumin1001> dbctl commit (dc=all): 'More traffic to db1096:3316 db1105:3311 instance db1105:3312 after PDU and on-site maintenance', diff saved to https://phabricator.wikimedia.org/P9432 and previous config saved to /var/cache/conftool/dbconfig/20191022-124607-marostegui.json [production]
12:37 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1096:3316 after PDU maintenance', diff saved to https://phabricator.wikimedia.org/P9431 and previous config saved to /var/cache/conftool/dbconfig/20191022-123757-marostegui.json [production]
12:32 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1105:3312 and db1105:3311 after on-site maintenance T235877', diff saved to https://phabricator.wikimedia.org/P9430 and previous config saved to /var/cache/conftool/dbconfig/20191022-123257-marostegui.json [production]
12:30 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db2089:3315', diff saved to https://phabricator.wikimedia.org/P9429 and previous config saved to /var/cache/conftool/dbconfig/20191022-123032-marostegui.json [production]
12:29 <moritzm> rebooting miscweb2001 for some microcode tests [production]
12:28 <marostegui> Compress db1096:3315 [production]
12:27 <jmm@cumin2001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
12:27 <jmm@cumin2001> START - Cookbook sre.hosts.downtime [production]
12:25 <marostegui@deploy1001> Synchronized wmf-config/db-eqiad.php: Repool pc1007 after PDU maintenance T227142 (duration: 00m 50s) [production]
12:14 <jynus> reimage to buster dbmonitor2001.wikimedia.org T224589 [production]
11:57 <liw> starting to cut branch for train 1.35-wmf.3 [production]
11:51 <hashar> Restarted CI Jenkins on contint1001 [production]
11:35 <marostegui> Stop MySQL on db1105:3311, db1105:3312 for firmware upgrade - T235877 [production]