2020-03-25
09:14 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1137', diff saved to https://phabricator.wikimedia.org/P10759 and previous config saved to /var/cache/conftool/dbconfig/20200325-091421-marostegui.json [production]
09:02 <marostegui@cumin1001> dbctl commit (dc=all): 'Slowly repool db1137', diff saved to https://phabricator.wikimedia.org/P10758 and previous config saved to /var/cache/conftool/dbconfig/20200325-090227-marostegui.json [production]
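(The two "Slowly repool" commits above are stages of a gradual repool: after maintenance a replica is returned to traffic at increasing weight rather than all at once, with a dbctl commit after each step. A minimal sketch of that workflow, assuming standard dbctl usage; the exact percentages used here are not recorded in the log:

    # Raise the pooled percentage in steps, committing after each change
    dbctl instance db1137 pool -p 25
    dbctl config commit -m 'Slowly repool db1137'
    # ...watch replication lag and error rates, then continue...
    dbctl instance db1137 pool -p 75
    dbctl config commit -m 'Slowly repool db1137'
)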
08:55 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
08:53 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
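(START/END pairs like the two entries above are emitted automatically by spicerack cookbooks. A sketch of how the downtime cookbook is typically invoked before a reimage; the host, duration, and flags here are assumptions, not taken from the log:

    # Silence Icinga alerting for the host for the maintenance window
    sudo cookbook sre.hosts.downtime --hours 4 -r 'reimage' db1137.eqiad.wmnet
)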
08:38 <marostegui> Reimage db1137 [production]
08:18 <marostegui> Reboot db1117 for full-upgrade [production]
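("full-upgrade" refers to a Debian dist-upgrade of all pending packages, including kernel updates that require the reboot. A minimal sketch of the usual sequence; the exact invocation is an assumption:

    sudo apt update
    sudo apt full-upgrade   # pulls in kernel and library updates
    sudo reboot
)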
08:15 <oblivian@deploy1001> helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' . [production]
08:15 <oblivian@deploy1001> helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-main' for release 'production' . [production]
08:14 <_joe_> upgrading all eventgate-main to envoy 1.13.1 T246868 [production]
08:12 <marostegui> Stop all mysql daemons on db1117 [production]
07:50 <oblivian@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'canary' . [production]
07:50 <oblivian@deploy1001> helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'production' . [production]
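(The helmfile entries above record the usual Kubernetes rollout order for the envoy 1.13.1 upgrade (T246868): staging first, then the canary and production releases in each datacenter. A sketch of the commands behind these log lines; the chart path is an assumption based on the per-service helmfile layout:

    cd /srv/deployment-charts/helmfile.d/services/eventgate-main
    helmfile -e staging apply   # staging canary + production releases
    helmfile -e codfw apply     # then one datacenter at a time
)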
07:42 <XioNoX> reboot scs-eqsin due to high CPU usage [production]
07:20 <jmm@cumin2001> START - Cookbook sre.ganeti.makevm [production]
07:09 <marostegui@cumin1001> dbctl commit (dc=all): 'Depool db1137 for upgrade', diff saved to https://phabricator.wikimedia.org/P10757 and previous config saved to /var/cache/conftool/dbconfig/20200325-070946-marostegui.json [production]
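(Depooling is the inverse of the staged repool near the top of this page: the instance is dropped from the database load-balancer config before maintenance starts. A sketch, again assuming standard dbctl usage:

    dbctl instance db1137 depool
    dbctl config commit -m 'Depool db1137 for upgrade'
)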
06:57 <marostegui> Deploy schema change on db2129 (s6 codfw master) [production]
06:15 <marostegui> Rename tables in the nova_api database on db1133 (m5 master) - T248313 [production]
06:13 <marostegui> Remove grants 'nova'@'208.80.154.23' on nova.* - T248313 [production]
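(The two entries above are cleanup on the m5 master for T248313. A hedged sketch of what such statements look like in MySQL; the table names below are hypothetical and the actual statements are not in the log:

    sudo mysql -e "RENAME TABLE nova_api.instances TO nova_api.instances_old;"   # hypothetical table name
    sudo mysql -e "REVOKE ALL PRIVILEGES ON nova.* FROM 'nova'@'208.80.154.23';"
)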
2020-03-24
20:53 <cdanis> repool eqsin [production]
20:52 <jforrester@deploy1001> Synchronized wmf-config/CommonSettings.php: Don't hard-set wgTmhUseBetaFeatures to true, let it vary by wiki (duration: 01m 07s) [production]
20:50 <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Touch and secondary sync of IS for cache-busting (duration: 01m 07s) [production]
20:49 <jforrester@deploy1001> Synchronized wmf-config/InitialiseSettings.php: Set wgTmhUseBetaFeatures to vary by wiki (duration: 01m 06s) [production]
20:35 <twentyafterfour@deploy1001> rebuilt and synchronized wikiversions files: Attempt #2: group0 wikis to 1.35.0-wmf.25 refs T233873 [production]
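("rebuilt and synchronized wikiversions files" is the log line scap emits for a wikiversions sync, the step that actually moves a wiki group onto a new MediaWiki branch during the train. A sketch of the command behind this entry, with the commit message taken from the log:

    scap sync-wikiversions 'Attempt #2: group0 wikis to 1.35.0-wmf.25 refs T233873'
)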
20:32 <twentyafterfour@deploy1001> Synchronized wmf-config: Now touch and sync again because of settings cache race condition. refs T248409 (duration: 00m 59s) [production]
20:31 <cdanis> rebooting cr2-eqsin T248394 [production]
20:30 <twentyafterfour@deploy1001> Synchronized wmf-config: Now sync InitialiseSettings* refs T248409 (duration: 00m 59s) [production]
20:28 <twentyafterfour@deploy1001> Synchronized wmf-config/CommonSettings.php: sync CommonSettings before InitialiseSettings refs T248409 (duration: 00m 58s) [production]
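(Reading the three twentyafterfour syncs above from the bottom up shows the workaround for the settings-cache race condition (T248409): CommonSettings.php has to land everywhere before InitialiseSettings.php, and a final touch-and-resync busts the stale cache. A sketch of the sequence, assuming scap's per-file sync command:

    scap sync-file wmf-config/CommonSettings.php 'sync CommonSettings before InitialiseSettings refs T248409'
    scap sync-file wmf-config/InitialiseSettings.php 'Now sync InitialiseSettings* refs T248409'
    touch wmf-config/InitialiseSettings.php   # change mtime to invalidate the settings cache
    scap sync-file wmf-config/InitialiseSettings.php 'Touch and sync again refs T248409'
)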
20:27 <volans> force rebooting analytics1044 from console, host down and unreachable (ping, ssh, console) [production]
20:26 <cdanis> commit flow-table-size on cr2-eqsin T248394 [production]
20:19 <cdanis> eqsin depooled for router maintenance at 16:15 [production]
19:29 <twentyafterfour@deploy1001> scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) [production]
19:29 <twentyafterfour> rolling back to wmf.24 due to high error rate refs T233873 [production]
19:28 <twentyafterfour@deploy1001> scap failed: average error rate on 7/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details) [production]
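(When scap's canary check trips, as in the two failures above, the deploy stops before reaching the full fleet; the operator can inspect the linked logstash dashboard and either roll back, as happened here, or rerun the same command with --force if the new errors are judged unrelated. A sketch; the exact command that was rerun is not logged:

    scap sync-wikiversions --force 'group0 wikis to 1.35.0-wmf.25 refs T233873'
)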
18:49 <gehel> repooling wdqs1006, caught up on lag [production]
17:12 <hashar@deploy1001> Finished scap: testwiki to 1.35.0-wmf.25 and rebuild l10n cache # T233873 (duration: 77m 52s) [production]
17:10 <ebernhardson> update cloudelastic-chi replica counts from 2 to 1 T231517 [production]
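(The replica count of an Elasticsearch index is a live setting, so dropping cloudelastic-chi from 2 replicas to 1 needs no restart. A minimal sketch using the index settings API; the host, port, and index scope are assumptions:

    curl -XPUT 'http://cloudelastic1001.wikimedia.org:9200/_all/_settings' \
         -H 'Content-Type: application/json' \
         -d '{"index": {"number_of_replicas": 1}}'
)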
16:41 <moritzm> installing linux-perf updates on stretch [production]
16:31 <moritzm> installing linux-perf-4.19 updates on buster [production]
15:58 <mutante> installing OS on otrs1001.eqiad.wmnet (T248028) [production]
15:54 <hashar@deploy1001> Started scap: testwiki to 1.35.0-wmf.25 and rebuild l10n cache # T233873 [production]
15:35 <hnowlan@deploy1001> helmfile [STAGING] Ran 'sync' command on namespace 'changeprop' for release 'staging' . [production]
15:31 <hashar@deploy1001> Pruned MediaWiki: 1.35.0-wmf.22 (duration: 02m 02s) [production]
15:29 <hashar@deploy1001> Pruned MediaWiki: 1.35.0-wmf.21 (duration: 24m 00s) [production]
15:17 <hashar> Cleaning old MediaWiki deployments # T233873 [production]
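(Cleaning removes MediaWiki branches no longer referenced by any wiki, freeing space on the deployment and application servers; "Pruned MediaWiki: ..." above is the log line that cleanup emits. A sketch of the likely invocations, assuming scap's clean command takes the branch name:

    scap clean 1.35.0-wmf.21
    scap clean 1.35.0-wmf.22
)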
15:03 <hashar> Applied patches to 1.35.0-wmf.25 # T233873 [production]
14:59 <hashar> scap prep 1.35.0-wmf.25 # T233873 [production]
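(scap prep prepares the checkout of the newly cut branch on the deployment host, the step between branching (14:26, below) and applying the security patches (15:03, above). The invocation is given by the log line itself:

    scap prep 1.35.0-wmf.25
)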
14:55 <gehel> depooling wdqs1006 to catch up on lag [production]
14:28 <marostegui> Deploy schema change on db2117 (s6) [production]
14:26 <hashar> Branching wmf/1.35.0-wmf.25 # T233873 [production]
13:22 <moritzm> installing glib2.0 updates from Stretch point release [production]