2020-06-11 §
16:12 <bstorm_> downtimed labstore1005 for upgrades on 1004 since that will alert as well T224582 [production]
16:10 <bstorm_> downtimed labstore1004 for upgrades T224582 [production]
15:50 <cstone> SmashPig revision changed from b9de3c7aac to 2246685626 [production]
15:34 <jmm@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
15:31 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
15:25 <moritzm> installing buster kernel security updates (no reboots yet) [production]
15:04 <jmm@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
15:04 <mforns@deploy1001> Finished deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966] (duration: 01m 39s) [production]
15:04 <root@cumin1001> END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99) [production]
15:04 <root@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
15:02 <mforns@deploy1001> Started deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966] [production]
15:02 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
14:56 <herron> bounced elasticsearch on logstash1012 [production]
14:41 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
14:40 <akosiaris@cumin1001> START - Cookbook sre.hosts.decommission [production]
14:37 <herron> enabled VO incident resolution notification in global settings [production]
14:34 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
14:31 <akosiaris@cumin1001> START - Cookbook sre.hosts.decommission [production]
14:30 <godog> bounce logstash on logstash1009, apparent GC death spiral [production]
14:03 <jmm@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
14:03 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
14:03 <jmm@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) [production]
14:03 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
13:35 <filippo@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=thanos-query,name=eqiad [production]
13:35 <filippo@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=thanos-swift,name=eqiad [production]
12:39 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0) [production]
12:36 <elukey> updated pcc facts [production]
12:28 <jayme@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]
12:28 <jayme@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
12:28 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
12:25 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]
12:15 <jayme@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . [production]
12:15 <jayme@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . [production]
12:04 <jforrester@deploy1001> Synchronized php-1.35.0-wmf.36/includes/title/NamespaceInfo.php: T253098 NamespaceInfo::makeValidNamespace: Don't throw for -1 or -2 (duration: 01m 06s) [production]
12:03 <marostegui> Reimage es2023 (es5 codfw master) [production]
11:54 <marostegui@cumin1001> dbctl commit (dc=all): 'Repool db2075 T254139', diff saved to https://phabricator.wikimedia.org/P11469 and previous config saved to /var/cache/conftool/dbconfig/20200611-115430-marostegui.json [production]
11:46 <marostegui> Deploy schema change on s6 codfw - T250066 [production]
11:44 <volans@deploy1001> Finished deploy [homer/deploy@df83901]: Release v0.2.3 (duration: 00m 25s) [production]
11:44 <volans@deploy1001> Started deploy [homer/deploy@df83901]: Release v0.2.3 [production]
11:36 <ayounsi@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
11:36 <matthiasmullie> EU BACON done [production]
11:35 <mlitn@deploy1001> Synchronized php-1.35.0-wmf.36/extensions/GrowthExperiments: Help panel: Update guidance behavior rules (duration: 01m 06s) [production]
11:34 <jayme@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' . [production]
11:34 <jayme@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' . [production]
11:28 <kartik@deploy1001> Synchronized php-1.35.0-wmf.36/extensions/ContentTranslation/modules/tools/mw.cx.tools.IssueTrackingTool.js: Backport: [[gerrit|604587|IssueTrackingTool: Fix js error in getCurrentNodeId method (T254965)]] (duration: 01m 07s) [production]
11:08 <jayme@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' . [production]
11:04 <mlitn@deploy1001> Synchronized php-1.35.0-wmf.36/extensions/MachineVision: $aliases should be an array of strings, not AliasGroup objects (duration: 01m 07s) [production]
10:47 <moritzm> repooling mw1318,mw2139,mw2145,mw2147,mw2221,mw2219,mw2250,mw2350 (these were depooled, but seem all fine in Icinga and were probably just forgotten) [production]
10:41 <filippo@cumin1001> conftool action : set/pooled=yes; selector: cluster=thanos,service=thanos-swift [production]
10:40 <filippo@cumin1001> conftool action : set/pooled=yes; selector: cluster=thanos,service=thanos-query [production]