2020-06-11
|
15:04 |
<jmm@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) |
[production] |
15:04 |
<mforns@deploy1001> |
Finished deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966] (duration: 01m 39s) |
[production] |
15:04 |
<root@cumin1001> |
END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99) |
[production] |
15:04 |
<root@cumin1001> |
START - Cookbook sre.network.prepare-upgrade |
[production] |
15:02 |
<mforns@deploy1001> |
Started deploy [analytics/refinery@c969b56]: Regular analytics weekly train [analytics/refinery@c969b56afae1b2532e07f0ff699c2ce161360966] |
[production] |
15:02 |
<jmm@cumin1001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
15:01 |
<mforns> |
started refinery deploy for v0.0.126 |
[analytics] |
14:58 |
<mforns> |
deployed refinery-source v0.0.126 |
[analytics] |
14:56 |
<herron> |
bounced elasticsearch on logstash1012 |
[production] |
14:44 |
<Reedy> |
rm -rf doc1001:/srv/docroot/org/wikimedia/doc/mediawiki-libs-PasswordBlacklist T254799 |
[releng] |
14:41 |
<akosiaris@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
14:40 |
<Reedy> |
Reloading Zuul to deploy https://gerrit.wikimedia.org/r/604707 |
[releng] |
14:40 |
<akosiaris@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:37 |
<herron> |
enabled VO incident resolution notification in global settings |
[production] |
14:34 |
<akosiaris@cumin1001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
14:31 |
<akosiaris@cumin1001> |
START - Cookbook sre.hosts.decommission |
[production] |
14:30 |
<godog> |
bounce logstash on logstash1009, apparent GC death spiral |
[production] |
14:03 |
<jmm@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) |
[production] |
14:03 |
<jmm@cumin1001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
14:03 |
<jmm@cumin1001> |
END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) |
[production] |
14:03 |
<jmm@cumin1001> |
START - Cookbook sre.hosts.reboot-single |
[production] |
13:57 |
<ottomata> |
removed accidentally added page_restrictions column(s) on Hive table event.mediawiki_user_blocks_change after an incorrect schema change was merged (no data was ever set in this column) |
[analytics] |
13:45 |
<RhinosF1> |
added sitemap.xml to search console |
[tools.zppixbot] |
13:37 |
<Reedy> |
Reloading Zuul to deploy https://gerrit.wikimedia.org/r/604706 T254799 |
[releng] |
13:35 |
<filippo@cumin1001> |
conftool action : set/pooled=false; selector: dnsdisc=thanos-query,name=eqiad |
[production] |
13:35 |
<filippo@cumin1001> |
conftool action : set/pooled=false; selector: dnsdisc=thanos-swift,name=eqiad |
[production] |
13:33 |
<wm-bot> |
<zppixbot> auto-update@website: Synced website repo in 95.s |
[tools.zppixbot] |
13:16 |
<wm-bot> |
<zppixbot> auto-update@website: Synced website repo in 45.s |
[tools.zppixbot] |
12:42 |
<arturo> |
introduce puppet profile 'toolsbeta-docker-registry' and relocate some hiera config there |
[toolsbeta] |
12:39 |
<ayounsi@cumin1001> |
END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0) |
[production] |
12:39 |
<arturo> |
for the record, k8s etcd servers certificate changed (puppet based) and k8s just kept working |
[toolsbeta] |
12:36 |
<elukey> |
updated pcc facts |
[production] |
12:35 |
<arturo> |
according to `aborrero@cloud-cumin-01:~$ sudo cumin --force -x 'O{project:toolsbeta}' 'run-puppet-agent'` we are mostly back in business |
[toolsbeta] |
12:28 |
<jayme@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' . |
[production] |
12:28 |
<jayme@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-main' for release 'production' . |
[production] |
12:28 |
<marostegui@cumin1001> |
END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) |
[production] |
12:25 |
<marostegui@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
12:15 |
<jayme@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' . |
[production] |
12:15 |
<jayme@deploy1001> |
helmfile [EQIAD] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' . |
[production] |
12:14 |
<arturo> |
try switching all VMs to toolsbeta-puppetmaster-04 |
[toolsbeta] |
12:14 |
<arturo> |
poweroff toolsbeta-puppetmaster-03 |
[toolsbeta] |
12:12 |
<arturo> |
copy over labs/private from toolsbeta-puppetmaster-03 to toolsbeta-puppetmaster-04 |
[toolsbeta] |
12:04 |
<jforrester@deploy1001> |
Synchronized php-1.35.0-wmf.36/includes/title/NamespaceInfo.php: T253098 NamespaceInfo::makeValidNamespace: Don't throw for -1 or -2 (duration: 01m 06s) |
[production] |
12:03 |
<marostegui> |
Reimage es2023 (es5 codfw master) |
[production] |
11:54 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Repool db2075 T254139', diff saved to https://phabricator.wikimedia.org/P11469 and previous config saved to /var/cache/conftool/dbconfig/20200611-115430-marostegui.json |
[production] |
11:53 |
<arturo> |
create VM toolsbeta-puppetmaster-04 |
[toolsbeta] |
11:46 |
<marostegui> |
Deploy schema change on s6 codfw - T250066 |
[production] |
11:44 |
<volans@deploy1001> |
Finished deploy [homer/deploy@df83901]: Release v0.2.3 (duration: 00m 25s) |
[production] |
11:44 |
<volans@deploy1001> |
Started deploy [homer/deploy@df83901]: Release v0.2.3 |
[production] |
11:36 |
<ayounsi@cumin1001> |
START - Cookbook sre.network.prepare-upgrade |
[production] |