2021-07-12
|
11:38 |
<hnowlan> |
adjusting weights of codfw maps servers to reduce load on older spec machines |
[production] |
11:37 |
<hnowlan@puppetmaster1001> |
conftool action : set/weight=6; selector: name=maps2004.codfw.wmnet |
[production] |
11:34 |
<urbanecm@deploy1002> |
Synchronized wmf-config/logos.php: 773c956811cba5c3a2cbba32bc1e1a536dbd9f0b: Revert "Use ptwiki 20th anniversary logos" (T286380) (duration: 00m 57s) |
[production] |
11:34 |
<hnowlan@puppetmaster1001> |
conftool action : set/weight=6; selector: name=maps2003.codfw.wmnet |
[production] |
11:33 |
<hnowlan@puppetmaster1001> |
conftool action : set/weight=6; selector: name=maps2001.codfw.wmnet |
[production] |
11:33 |
<urbanecm@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: cd5f5375b4f712c56e9396cc550078272ef668de: Revert "ptwiki: Use celebration logos in new vector" (T286380) (duration: 00m 57s) |
[production] |
11:26 |
<wmde-fisch@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:702761|Add 'editautoreviewprotected' protection level to hewikisource (T275076)]] (duration: 00m 57s) |
[production] |
11:20 |
<hnowlan@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=maps2010.codfw.wmnet |
[production] |
11:19 |
<hnowlan> |
testing a depool of maps2010 to ensure kartotherian load can cope with two fewer nodes |
[production] |
11:12 |
<wmde-fisch@deploy1002> |
Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:703568|Enable transclusion back button on first wikis (T284553)]] (duration: 00m 58s) |
[production] |
11:01 |
<hnowlan@puppetmaster1001> |
conftool action : set/pooled=no; selector: name=maps2008.codfw.wmnet |
[production] |
10:58 |
<hnowlan> |
testing a depool of maps2008 to ensure kartotherian load can cope with one fewer node |
[production] |
10:30 |
<moritzm> |
installing apache updates on an-tool* hosts (affects Turnilo, Yarn, Superset, Hue) briefly |
[production] |
10:11 |
<elukey> |
add 10g disk to ml-serve-ctrl[12]00[12] for T285927 |
[production] |
10:07 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb1009.eqiad.wmnet |
[production] |
10:05 |
<mutante> |
planet - deleting state files, manually running update for all 161 en feeds - T285251 |
[production] |
10:03 |
<effie> |
depool mw2383 |
[production] |
10:02 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host rdb1009.eqiad.wmnet |
[production] |
10:01 |
<godog> |
test thanos-compact upload with smaller part size - T285835 |
[production] |
09:53 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb1010.eqiad.wmnet |
[production] |
09:50 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host rdb1010.eqiad.wmnet |
[production] |
09:18 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb1006.eqiad.wmnet |
[production] |
09:12 |
<kormat@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1183.eqiad.wmnet with reason: REIMAGE |
[production] |
09:10 |
<kormat@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on db1183.eqiad.wmnet with reason: REIMAGE |
[production] |
09:07 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.reboot-single for host rdb1006.eqiad.wmnet |
[production] |
09:07 |
<godog> |
repool thanos-fe2002 - T285835 |
[production] |
08:38 |
<godog> |
test a single frontend for thanos-swift / thanos-query to test "bad host" theory - T285835 |
[production] |
08:26 |
<ladsgroup@deploy1002> |
Synchronized php-1.37.0-wmf.12/extensions/Wikibase/client: Backport: [[gerrit:703890|Remove subscribing to other aspect for entity usage (T286193)]] (duration: 00m 59s) |
[production] |
07:44 |
<jynus> |
restart db1102:x1 mariadb instance |
[production] |
07:01 |
<moritzm> |
installing apache2 security updates |
[production] |
05:14 |
<Amir1> |
start of mwscript refreshImageMetadata.php --wiki=commonswiki --mediatype=OFFICE --batch-size=10 --verbose --mime="application/pdf" --force --sleep 5 on screen - It will take days / week to finish (T275268) |
[production] |
05:06 |
<ladsgroup@deploy1002> |
Synchronized wmf-config/filebackend.php: Config: [[gerrit:703951|Enable json image metadata everywhere (T275268)]] (duration: 01m 05s) |
[production] |
04:56 |
<ladsgroup@deploy1002> |
Synchronized php-1.37.0-wmf.12/maintenance/refreshImageMetadata.php: Backport: [[gerrit:703891|Add --sleep option to refreshImageMetadata.php]] (duration: 01m 04s) |
[production] |
04:10 |
<Amir1> |
mwscript refreshImageMetadata.php --wiki=testcommonswiki --mediatype=OFFICE --batch-size=20 --verbose --mime="application/pdf" --force (T275268) |
[production] |
04:08 |
<ladsgroup@deploy1002> |
Synchronized wmf-config/filebackend.php: Config: [[gerrit:703950|Set testcommonswiki to use json image metadata (T275268)]] (duration: 01m 10s) |
[production] |
2021-07-09
|
23:28 |
<legoktm@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'shellbox' for release 'main' . |
[production] |
23:27 |
<legoktm@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'shellbox' for release 'main' . |
[production] |
22:36 |
<legoktm> |
running benchmarking scripts against shellbox |
[production] |
14:49 |
<otto@deploy1002> |
Finished deploy [analytics/refinery@cdb3fc5] (hadoop-test): Deploy for finalize event_default_test gobblin job in hadoop test - T271232 (duration: 03m 08s) |
[production] |
14:46 |
<otto@deploy1002> |
Started deploy [analytics/refinery@cdb3fc5] (hadoop-test): Deploy for finalize event_default_test gobblin job in hadoop test - T271232 |
[production] |
11:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db1118', diff saved to https://phabricator.wikimedia.org/P16809 and previous config saved to /var/cache/conftool/dbconfig/20210709-115609-marostegui.json |
[production] |
11:40 |
<_joe_> |
deleting coredns pod in codfw, potentially causing T286360 |
[production] |
10:13 |
<_joe_> |
recreated all pods for zotero in codfw |
[production] |
00:47 |
<legoktm> |
zotero rolling restart didn't help, filed T286360 for DNS issues |
[production] |
00:39 |
<legoktm> |
doing a rolling restart of zotero in codfw to hopefully fix DNS ENOTFOUND issues |
[production] |