2020-06-11
17:22 <bstorm_> delaying failback labstore1004 for drive syncs T224582 [admin]
17:19 <mbsantos@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'mobileapps' for release 'production' . [production]
17:17 <bstorm_> failing NFS back to labstore1004 to complete the upgrade process T224582 [admin]
17:12 <bstorm_> reboot for stretch upgrade on labstore1004 T224582 [production]
16:49 <bstorm_> doing stretch upgrade for labstore1004 T224582 [production]
16:36 <bstorm_> rebooting labstore1004 for upgrades T224582 [production]
16:15 <bstorm_> failing over NFS for labstore1004 to labstore1005 T224582 [admin]
16:12 <bstorm_> downtimed labstore1005 for upgrades on 1004 since that will alert as well T224582 [production]
16:10 <bstorm_> downtimed labstore1004 for upgrades T224582 [production]
15:50 <cstone> SmashPig revision changed from b9de3c7aac to 2246685626 [production]
15:34 <jmm@cumin1001> END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) [production]
15:31 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
15:25 <moritzm> installing buster kernel security updates (no reboots yet) [production]
15:04 <jmm@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
15:04 <mforns@deploy1001> Finished deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966] (duration: 01m 39s) [production]
15:04 <root@cumin1001> END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99) [production]
15:04 <root@cumin1001> START - Cookbook sre.network.prepare-upgrade [production]
15:02 <mforns@deploy1001> Started deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966] [production]
15:02 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
15:01 <mforns> started refinery deploy for v0.0.126 [analytics]
14:58 <mforns> deployed refinery-source v0.0.126 [analytics]
14:56 <herron> bounced elasticsearch on logstash1012 [production]
14:44 <Reedy> rm -rf doc1001:/srv/docroot/org/wikimedia/doc/mediawiki-libs-PasswordBlacklist T254799 [releng]
14:41 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
14:40 <Reedy> Reloading Zuul to deploy https://gerrit.wikimedia.org/r/604707 [releng]
14:40 <akosiaris@cumin1001> START - Cookbook sre.hosts.decommission [production]
14:37 <herron> enabled VO incident resolution notification in global settings [production]
14:34 <akosiaris@cumin1001> END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) [production]
14:31 <akosiaris@cumin1001> START - Cookbook sre.hosts.decommission [production]
14:30 <godog> bounce logstash on logstash1009, apparent GC death spiral [production]
14:03 <jmm@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) [production]
14:03 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
14:03 <jmm@cumin1001> END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) [production]
14:03 <jmm@cumin1001> START - Cookbook sre.hosts.reboot-single [production]
13:57 <ottomata> removed accidentally added page_restrictions column(s) on Hive table event.mediawiki_user_blocks_change after an incorrect schema change was merged (no data was ever set in this column) [analytics]
13:45 <RhinosF1> added sitemap.xml to search console [tools.zppixbot]
13:37 <Reedy> Reloading Zuul to deploy https://gerrit.wikimedia.org/r/604706 T254799 [releng]
13:35 <filippo@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=thanos-query,name=eqiad [production]
13:35 <filippo@cumin1001> conftool action : set/pooled=false; selector: dnsdisc=thanos-swift,name=eqiad [production]
13:33 <wm-bot> <zppixbot> auto-update@website: Synced website repo in 95.s [tools.zppixbot]
13:16 <wm-bot> <zppixbot> auto-update@website: Synced website repo in 45.s [tools.zppixbot]
12:42 <arturo> introduce puppet profile 'toolsbeta-docker-registry' and relocate some hiera config there [toolsbeta]
12:39 <ayounsi@cumin1001> END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0) [production]
12:39 <arturo> for the record, k8s etcd servers certificate changed (puppet based) and k8s just kept working [toolsbeta]
12:36 <elukey> updated pcc facts [production]
12:35 <arturo> according to `aborrero@cloud-cumin-01:~$ sudo cumin --force -x 'O{project:toolsbeta}' 'run-puppet-agent'` we are mostly back in business [toolsbeta]
12:28 <jayme@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . [production]
12:28 <jayme@deploy1001> helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . [production]
12:28 <marostegui@cumin1001> END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) [production]
12:25 <marostegui@cumin1001> START - Cookbook sre.hosts.downtime [production]