2020-06-11
20:31 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
20:15 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
20:13 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
20:00 <pt1979@cumin2001> END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) [production]
19:59 <jhuneidi@deploy1001> rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.36 [production]
19:58 <pt1979@cumin2001> START - Cookbook sre.hosts.downtime [production]
19:33 <akosiaris> apply emergency sessionstore fixes in codfw as well [production]
19:32 <akosiaris@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'sessionstore' for release 'production' . [production]
19:20 <gilles@deploy1001> Finished deploy [performance/asoranking@0a096c4]: T252424 (duration: 00m 47s) [production]
19:19 <gilles@deploy1001> Started deploy [performance/asoranking@0a096c4]: T252424 [production]
19:12 <akosiaris> repool eqiad for sessionstore [production]
19:12 <akosiaris@cumin1001> conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=sessionstore [production]
19:10 <akosiaris> remove the podaffinity restrictions for sessionstore in eqiad [production]
19:10 <akosiaris@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'sessionstore' for release 'production' . [production]
19:07 <akosiaris> increase memory limits for sessionstore in eqiad to 400Mi from 300Mi [production]
19:07 <akosiaris@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'sessionstore' for release 'production' . [production]
19:00 <akosiaris> increase sessionstore capacity in codfw from 4 pods to 6 [production]
19:00 <akosiaris@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'sessionstore' for release 'production' . [production]
18:59 <akosiaris> depool eqiad, switch to codfw [production]
18:58 <akosiaris@cumin1001> conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=sessionstore [production]
18:08 <ppchelko@deploy1001> Synchronized wmf-config/reverse-proxy-staging.php: Beta: Switch from HTCP purging to kafka purging gerrit:603530, reverse-proxy-staging.php (duration: 01m 06s) [production]
18:06 <ppchelko@deploy1001> Synchronized wmf-config/InitialiseSettings-labs.php: Beta: Switch from HTCP purging to kafka purging gerrit:603530, IS-labs.php (duration: 01m 06s) [production]
17:29 <mbsantos@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'proton' for release 'production' . [production]
17:26 <mbsantos@deploy1001> helmfile [CODFW] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [production]
17:22 <mbsantos@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'proton' for release 'production' . [production]
17:19 <mbsantos@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [production]
17:12 <bstorm_> reboot for stretch upgrade on labstore1004 T224582 [production]
16:49 <bstorm_> doing stretch upgrade for labstore1004 T224582 [production]
16:36 <bstorm_> rebooting labstore1004 for upgrades T224582 [production]
16:12 <bstorm_> downtimed labstore1005 for upgrades on 1004 since that will alert as well T224582 [production]
16:10 <bstorm_> downtimed labstore1004 for upgrades T224582 [production]
15:50 <cstone> SmashPig revision changed from b9de3c7aac to 2246685626 [production]
15:34 <jmm@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
15:31 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
15:25 <moritzm> installing buster kernel security updates (no reboots yet) [production]
15:04 <jmm@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
15:04 <mforns@deploy1001> Finished deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966] (duration: 01m 39s) [production]
15:04 <root@cumin1001> END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99) [production]
15:04 <root@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
15:02 <mforns@deploy1001> Started deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966] [production]
15:02 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
14:56 <herron> bounced elasticsearch on logstash1012 [production]
14:41 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
14:40 <akosiaris@cumin1001> START - Cookbook sre.hosts.decommission [production]
14:37 <herron> enabled VO incident resolution notification in global settings [production]
14:34 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
14:31 <akosiaris@cumin1001> START - Cookbook sre.hosts.decommission [production]
14:30 <godog> bounce logstash on logstash1009, apparent GC death spiral [production]
14:03 <jmm@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
14:03 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]