2020-02-26
§
|
22:51 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
22:49 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
22:48 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
22:47 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
22:44 |
<foks> |
removing one file for legal compliance |
[production] |
22:27 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
22:25 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
22:19 |
<pt1979@cumin2001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
22:16 |
<pt1979@cumin2001> |
START - Cookbook sre.hosts.downtime |
[production] |
21:51 |
<Urbanecm> |
Password reset for User:Joax (T242941) |
[production] |
21:28 |
<mutante> |
ganeti - shutting apt2001 down again |
[production] |
21:17 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: [[gerrit:574454|Decrease the reads for term store for clients down to Q2Mio (T219123)]], take II (duration: 01m 04s) |
[production] |
21:15 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: [[gerrit:574454|Decrease the reads for term store for clients down to Q2Mio (T219123)]] (duration: 01m 04s) |
[production] |
21:15 |
<mutante> |
ganeti - re-starting apt2001 which is mysteriously broken and "half up" ..as in you can't ssh to it and don't get console but it does cause icinga alerts |
[production] |
20:35 |
<ladsgroup@deploy1001> |
Synchronized php-1.35.0-wmf.21/extensions/Wikibase/lib/includes/Store/Sql/Terms: SWAT: [[gerrit:575055|Do prefetching entity ids on batches of 20 entity per query (T246159)]] (duration: 01m 04s) |
[production] |
20:20 |
<jhuneidi@deploy1001> |
Synchronized php: group1 wikis to 1.35.0-wmf.21 refs T233869 (duration: 01m 04s) |
[production] |
20:19 |
<jhuneidi@deploy1001> |
rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.21 refs T233869 |
[production] |
20:18 |
<otto@deploy1001> |
helmfile [STAGING] Ran 'apply' command on namespace 'eventstreams' for release 'production' . |
[production] |
20:10 |
<XioNoX> |
add BGP to AS4780 in Equinix Palo Alto |
[production] |
20:09 |
<XioNoX> |
add BGP to AS8859 in AMS-IX |
[production] |
20:00 |
<Amir1> |
Morning SWAT is done |
[production] |
19:58 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:574454|Increase the reads for term store for clients for up to Q6Mio (T219123)]], take II (duration: 01m 04s) |
[production] |
19:56 |
<ladsgroup@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:574454|Increase the reads for term store for clients for up to Q6Mio (T219123)]] (duration: 01m 02s) |
[production] |
18:37 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
18:37 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
18:37 |
<dzahn@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
18:37 |
<dzahn@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
18:09 |
<bstorm_> |
downtimed labstore1004/5, cloudstore1008/9 and cloudbackup1001/2 for merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/571821 |
[production] |
18:05 |
<mutante> |
phab1001 - manually running community_metrics and project_changes scripts (crons) (T244677) |
[production] |
17:49 |
<Amir1> |
setting cache type of mwdebug1001 to LCStoreStaticArray, this would break group1 and group2 in that node (T99740) |
[production] |
17:42 |
<XioNoX> |
remove ns2 redirect to eqiad on cr3-knams |
[production] |
17:40 |
<XioNoX> |
re-enable transits on cr3-esams |
[production] |
17:09 |
<robh> |
cr2-esams work done, cr3-esams linecard swap starting now via T245825 |
[production] |
16:39 |
<robh> |
please note cr2-esams work is ongoing via T246009 and its downtime is expected |
[production] |
16:00 |
<jynus> |
deploy new grants to phabricator stats user to database on m3 T246105 |
[production] |
15:51 |
<jynus> |
starting s2, s3 eqiad backup source data check; expect increase read traffic on db1095:3313, db1140:3312, db1078, db1090:3312 T244958 |
[production] |
15:25 |
<addshore> |
addshore@mwmaint1002:~$ time mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki=wikidatawiki --batch-size=50 --sleep=1 --file=20to30holes-25feb2229 # T219123 |
[production] |
15:19 |
<volans@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
15:17 |
<volans@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:54 |
<volans@cumin2001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) |
[production] |
14:54 |
<volans@cumin2001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:51 |
<volans@cumin2001> |
END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) |
[production] |
14:46 |
<volans@cumin2001> |
START - Cookbook sre.ganeti.makevm |
[production] |
14:19 |
<volans@cumin2001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
14:19 |
<volans@cumin2001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:12 |
<volans@cumin2001> |
END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) |
[production] |
14:11 |
<volans@cumin2001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:05 |
<gehel> |
restart of elasticsearch on cloudelastic for JVM upgrade completed |
[production] |
14:03 |
<XioNoX> |
deactivate BGP to AS23930 on cr1-eqsin, will re-enable when their technical issues are fixed and they notify us |
[production] |
14:00 |
<elukey> |
run apt-get clean on notebook1004 to free some space - T224682 |
[production] |