2020-02-26
§
|
18:09 |
<bstorm_> |
downtimed labstore1004/5, cloudstore1008/9 and cloudbackup1001/2 for merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/571821 |
[production] |
18:05 |
<mutante> |
phab1001 - manually running community_metrics and project_changes scripts (crons) (T244677) |
[production] |
17:49 |
<Amir1> |
setting cache type of mwdebug1001 to LCStoreStaticArray, this would break group1 and group2 in that node (T99740) |
[production] |
17:42 |
<XioNoX> |
remove ns2 redirect to eqiad on cr3-knams |
[production] |
17:40 |
<XioNoX> |
re-enable transits on cr3-esams |
[production] |
17:09 |
<robh> |
cr2-esams work done, cr3-esams linecard swap starting now via T245825 |
[production] |
16:39 |
<robh> |
please note cr2-esams work is ongoing via T246009 and its downtime is expected |
[production] |
16:00 |
<jynus> |
deploy new grants to phabricator stats user to database on m3 T246105 |
[production] |
15:51 |
<jynus> |
starting s2, s3 eqiad backup source data check; expect increased read traffic on db1095:3313, db1140:3312, db1078, db1090:3312 T244958 |
[production] |
15:25 |
<addshore> |
addshore@mwmaint1002:~$ time mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki=wikidatawiki --batch-size=50 --sleep=1 --file=20to30holes-25feb2229 # T219123 |
[production] |
15:19 |
<volans@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
15:17 |
<volans@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:54 |
<volans@cumin2001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) |
[production] |
14:54 |
<volans@cumin2001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:51 |
<volans@cumin2001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) |
[production] |
14:46 |
<volans@cumin2001> |
START - Cookbook sre.ganeti.makevm |
[production] |
14:19 |
<volans@cumin2001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
14:19 |
<volans@cumin2001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:12 |
<volans@cumin2001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
14:11 |
<volans@cumin2001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:05 |
<gehel> |
restart of elasticsearch on cloudelastic for JVM upgrade completed |
[production] |
14:03 |
<XioNoX> |
deactivate BGP to AS23930 on cr1-eqsin, will re-enable when their technical issues are fixed and they notify us |
[production] |
14:00 |
<elukey> |
run apt-get clean on notebook1004 to free some space - T224682 |
[production] |
13:45 |
<XioNoX> |
ganeti2001:~$ sudo gnt-instance shutdown apt2001.wikimedia.org - T224576 |
[production] |
12:26 |
<jmm@cumin2001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
12:26 |
<jmm@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
12:24 |
<kartik@deploy1001> |
Synchronized wmf-config/CommonSettings.php: SWAT: [[gerrit|416973|ContentTranslation: Set cookieDomain for Production]] (duration: 01m 04s) |
[production] |
12:11 |
<kartik@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit|574469|Enable CX out of beta in eu, sw, and ta Wikipedias (T245446, T245447, T245448)]] take II (duration: 01m 05s) |
[production] |
12:10 |
<kartik@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit|574469|Enable CX out of beta in eu, sw, and ta Wikipedias (T245446, T245447, T245448)]] (duration: 01m 15s) |
[production] |
12:05 |
<volans> |
uploaded spicerack_0.0.31-1_amd64.deb to apt.wikimedia.org stretch-wikimedia |
[production] |
11:45 |
<jbond42> |
changing uid/gid of reprepro affects release[12]001/install[12]002 |
[production] |
11:05 |
<moritzm> |
rolling out remaining PHP 7.0 security updates |
[production] |
10:57 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) |
[production] |
10:52 |
<moritzm> |
installing clamav security updates on mendelevium (ticket.wikimedia.org) |
[production] |
10:03 |
<elukey> |
upgrade prometheus-mcrouter-exporter 0.1.0+git20200225-1 to all cumin alias parsoid/deployment-servers/mw-maintenance |
[production] |
09:54 |
<elukey> |
upgrade prometheus-mcrouter-exporter 0.1.0+git20200225-1 to all cumin alias all-mw-eqiad |
[production] |
09:37 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-workers |
[production] |
09:34 |
<elukey> |
roll restart the Hadoop Analytics workers for openjdk upgrades |
[production] |
09:32 |
<elukey> |
upgrade prometheus-mcrouter-exporter 0.1.0+git20200225-1 to all cumin alias all-mw-codfw |
[production] |
09:18 |
<gehel> |
restarting elasticsearch on cloudelastic for JVM upgrade |
[production] |
08:51 |
<elukey> |
upload prometheus-mcrouter-exporter 0.1.0+git20200225-1 to stretch-wikimedia |
[production] |
08:38 |
<elukey> |
upgrade prometheus-mcrouter-exporter on mwdebug1001 to test the new version |
[production] |
06:19 |
<marostegui> |
Stop MySQL and poweroff db1084 for BBU replacement - T245647 |
[production] |
06:17 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Fully repool es1019 after on-site maintenance T243963', diff saved to https://phabricator.wikimedia.org/P10530 and previous config saved to /var/cache/conftool/dbconfig/20200226-061710-marostegui.json |
[production] |
06:16 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Restore es1017 (master) original weight (0) T243963', diff saved to https://phabricator.wikimedia.org/P10529 and previous config saved to /var/cache/conftool/dbconfig/20200226-061640-marostegui.json |
[production] |
06:09 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1084 for BBU replacement - T245647', diff saved to https://phabricator.wikimedia.org/P10528 and previous config saved to /var/cache/conftool/dbconfig/20200226-060906-marostegui.json |
[production] |
05:41 |
<kart_> |
Updated cxserver to 2020-02-24-110149-production (T227183) |
[production] |
05:35 |
<kartik@deploy1001> |
helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' . |
[production] |
05:31 |
<kartik@deploy1001> |
helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' . |
[production] |
05:29 |
<kartik@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' . |
[production] |