2021-10-08
|
20:10 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE |
[production] |
20:08 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE |
[production] |
20:08 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE |
[production] |
20:06 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE |
[production] |
20:05 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE |
[production] |
19:46 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE |
[production] |
19:45 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE |
[production] |
19:43 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE |
[production] |
19:42 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE |
[production] |
19:42 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE |
[production] |
19:39 |
<cmjohnson@cumin1001> |
START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE |
[production] |
18:15 |
<cstone> |
civicrm revision changed from 5cb7d487cb to 598b59b0ee |
[production] |
16:19 |
<urbanecm> |
[urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=enwiki --force # to measure performance on a large wiki |
[production] |
15:48 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'. |
[production] |
15:48 |
<elukey@deploy1002> |
helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'. |
[production] |
15:29 |
<jelto> |
enable puppet on gitlab1001 again for T283076 |
[production] |
14:05 |
<jiji@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
14:01 |
<jiji@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
09:49 |
<Amir1> |
wikiadmin@10.64.16.85(wikidatawiki)> delete from wb_changes_subscription where cs_subscriber_id in ('testcommonswiki', 'mowiki'); |
[production] |
09:39 |
<Emperor> |
installing stress on ms-be2045 given recent h/w issues T290881 |
[production] |
08:20 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
08:12 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
08:04 |
<urbanecm> |
[urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=frwiki --force |
[production] |
07:43 |
<Emperor> |
reboot ms-be2045 T290881 |
[production] |
07:41 |
<gehel> |
manually resuming the data reloads on wdqs1009 and wdqs2008 |
[production] |
06:42 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.network.cf (exit_code=0) |
[production] |
06:42 |
<ayounsi@cumin1001> |
START - Cookbook sre.network.cf |
[production] |
06:28 |
<ayounsi@cumin2002> |
END (PASS) - Cookbook sre.network.cf (exit_code=0) |
[production] |
06:28 |
<ayounsi@cumin2002> |
START - Cookbook sre.network.cf |
[production] |
05:35 |
<ryankemper@cumin1001> |
END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - T292814 |
[production] |
04:56 |
<ryankemper> |
[WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there are no relevant criticals in Icinga, and Grafana looks good |
[production] |
04:32 |
<ryankemper> |
T292814 Beginning rolling restart of `cloudelastic`: `sudo -i cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic restart" --nodes-per-run 1 --start-datetime 2021-10-08T03:53:49 --task-id T292814` on `ryankemper@cumin1001` tmux `elastic` |
[production] |
04:31 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - T292814 |
[production] |
04:29 |
<ryankemper> |
[WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'` |
[production] |
04:28 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-categories` across both test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` |
[production] |
04:28 |
<ryankemper> |
[WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` |
[production] |
04:23 |
<ryankemper@deploy1002> |
Finished deploy [wdqs/wdqs@8f57a56]: 0.3.89 (duration: 08m 22s) |
[production] |
04:20 |
<ryankemper@cumin1001> |
END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - T292814 |
[production] |
04:20 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - T292814 |
[production] |
04:18 |
<gehel@cumin1001> |
END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) |
[production] |
04:17 |
<gehel@cumin1001> |
END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99) |
[production] |
04:15 |
<ryankemper> |
[WDQS Deploy] Tests passing following deploy of `0.3.89` on canary `wdqs1003`; proceeding to rest of fleet |
[production] |
04:14 |
<ryankemper@deploy1002> |
Started deploy [wdqs/wdqs@8f57a56]: 0.3.89 |
[production] |
04:14 |
<ryankemper> |
[WDQS Deploy] Gearing up for deploy of wdqs `0.3.89`. Pre-deploy tests passing on canary `wdqs1003` |
[production] |
03:58 |
<ryankemper@cumin1001> |
END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - T292814 |
[production] |
03:58 |
<ryankemper@cumin1001> |
START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - T292814 |
[production] |
02:04 |
<Krinkle> |
krinkle@deploy1002$ echo 'https://en.wikipedia.org/static/images/project-logos/jvwiktionary.png' | mwscript purgeList.php , ref T287425, T292810 |
[production] |
00:07 |
<tgr_> |
deploy window over |
[production] |
00:05 |
<tgr@deploy1002> |
Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments: Backport: [[gerrit:727498|Mentee overview: Make UncachedMenteeOverviewDataProvider::getBlocksForUsers faster (T290609)]] (duration: 00m 56s) |
[production] |