2020-10-29
04:06 <ryankemper@cumin1001> END (PASS) - Cookbook sre.elasticsearch.rolling-restart (exit_code=0) [production]
01:41 <mutante> scandium reimaged a second time after making puppet changes to ensure nodejs/npm is NOT installed anymore (T257906) [production]
01:17 <ryankemper> T266492 Beginning rolling restart of eqiad cirrus cluster, 3 nodes at a time, on `ryankemper@cumin1001` tmux session `elasticsearch_restart_eqiad` [production]
01:16 <ryankemper@cumin1001> START - Cookbook sre.elasticsearch.rolling-restart [production]
00:51 <ryankemper> Finished restart of wdqs categories across production hosts; wdqs deploy is complete and the service is healthy [production]
00:14 <Amir1> rolling restart of ores [production]
00:12 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
00:10 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
00:04 <ryankemper> Beginning restart of wdqs categories across production hosts, one at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 60 && systemctl restart wdqs-categories && sleep 30 && pool'` [production]
00:03 <ryankemper> Restarted wdqs categories across test hosts: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'` [production]
00:03 <ryankemper> Restarted wdqs updater across all hosts: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'` [production]
00:02 <ryankemper> Following wdqs deploy, https://query.wikidata.org successfully responds to an example query [production]
00:01 <ryankemper@deploy1001> Finished deploy [wdqs/wdqs@8c97b17]: 0.3.53 (duration: 09m 29s) [production]
2020-10-28
23:54 <ryankemper> Canary `wdqs1003` tests pass, proceeding with wdqs deploy to rest of fleet [production]
23:52 <ryankemper@deploy1001> Started deploy [wdqs/wdqs@8c97b17]: 0.3.53 [production]
23:52 <ryankemper@deploy1001> deploy aborted: 0.3.53 (duration: 00m 00s) [production]
23:52 <ryankemper@deploy1001> Started deploy [wdqs/wdqs@8c97b17]: 0.3.53 [production]
22:54 <mutante> scandium - scap pull after reinstalling OS [production]
22:14 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
22:12 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
21:41 <ryankemper> Disabled elasticsearch "saneitizer" systemd timer in eqiad due to checker jobs falling behind: `sudo systemctl disable mediawiki_job_cirrus_sanitize_jobs.timer` on `mwmaint1002` [production]
21:22 <herron@cumin1001> END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) [production]
21:05 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
21:05 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime [production]
20:50 <herron@cumin1001> START - Cookbook sre.ganeti.makevm [production]
20:22 <ladsgroup@deploy1001> Synchronized static/images/project-logos: Changing logo of Wikidata for the birthday (duration: 00m 58s) [production]
19:56 <jgleeson> updated Smashpig from 2246685626 to 09f29c1da5 [production]
19:53 <herron@cumin1001> END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) [production]
19:53 <herron@cumin1001> START - Cookbook sre.ganeti.makevm [production]
19:50 <herron@cumin1001> END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) [production]
19:36 <herron@cumin1001> START - Cookbook sre.ganeti.makevm [production]
19:36 <herron@cumin1001> END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) [production]
19:36 <herron@cumin1001> START - Cookbook sre.ganeti.makevm [production]
19:30 <hnowlan@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
19:30 <hnowlan@cumin1001> START - Cookbook sre.hosts.downtime [production]
19:22 <dzahn@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
19:20 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
18:56 <tgr_> Morning deploys done [production]
18:55 <tgr@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:636983|Temporary enable 'editpage' warn logging (T251023)]] (duration: 00m 57s) [production]
18:51 <dzahn@cumin1001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
18:51 <dzahn@cumin1001> START - Cookbook sre.hosts.downtime [production]
18:47 <volans@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
18:46 <tgr@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:636791|Revert "cirrus: Hardcode more_like to codfw cirrus cluster"]] (duration: 00m 56s) [production]
18:45 <tgr@deploy1001> Synchronized wmf-config/PoolCounterSettings.php: Config: [[gerrit:636956|Revert "Revert "Increase cirrus morelike pool counter by 20%"" ()]] (duration: 00m 57s) [production]
18:43 <volans@cumin1001> START - Cookbook sre.dns.netbox [production]
18:40 <tgr@deploy1001> Synchronized php-1.36.0-wmf.14/extensions/GrowthExperiments/includes/HomepageModules/SuggestedEdits.php: Backport: [[gerrit:636787|Suggested edits: Include page ID with task preview data (T266600)]] (duration: 00m 59s) [production]
18:19 <tgr@deploy1001> Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:619880|Removing obsolete license definition]] (duration: 01m 00s) [production]
18:11 <cmjohnson@cumin1001> END (PASS) - Cookbook sre.dns.netbox (exit_code=0) [production]
18:07 <cmjohnson@cumin1001> START - Cookbook sre.dns.netbox [production]
18:06 <cmjohnson@cumin1001> END (FAIL) - Cookbook sre.dns.netbox (exit_code=99) [production]