2019-08-13
§
|
14:31 |
<gehel@cumin2001> |
START - Cookbook sre.elasticsearch.rolling-reboot |
[production] |
14:29 |
<XioNoX> |
rollback: disable all peering and transit on cr2-eqdfw |
[production] |
14:18 |
<XioNoX> |
reboot cr2-eqdfw for software upgrade - T227886 |
[production] |
14:14 |
<XioNoX> |
disable all peering and transit on cr2-eqdfw |
[production] |
14:04 |
<volans@cumin2001> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) |
[production] |
14:04 |
<volans@cumin2001> |
START - Cookbook sre.hosts.decommission |
[production] |
13:20 |
<jbond42> |
rolling update of postgresql-9.6 |
[production] |
13:07 |
<jijiki> |
rolling restart hhvm on api servers in eqiad |
[production] |
12:57 |
<jijiki> |
Restart hhvm on mw1235 |
[production] |
12:17 |
<fsero@puppetmaster1001> |
conftool action : set/pooled=true; selector: dnsdisc=sessionstore|citoid|cxserver|eventgate-analytics|eventgate-main|termbox|blubberoid|mathoid|zotero,name=eqiad |
[production] |
12:08 |
<_joe_> |
restarted php-fpm on mw1221 |
[production] |
12:03 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'sessionstore' for release 'production' . |
[production] |
12:00 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' . |
[production] |
11:56 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' . |
[production] |
11:56 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' . |
[production] |
11:49 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'blubberoid' for release 'production' . |
[production] |
11:44 |
<fsero> |
recreating cxserver blubber and sessionstore namespace - T228836 |
[production] |
11:39 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'mathoid' for release 'production' . |
[production] |
11:35 |
<gehel> |
restart wdqs-blazegraph on wdqs2001 |
[production] |
11:34 |
<gehel> |
restart wdqs-updater on wdqs2001 |
[production] |
11:30 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-main' for release 'main' . |
[production] |
11:29 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' . |
[production] |
11:25 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'citoid' for release 'production' . |
[production] |
11:21 |
<fsero> |
recreating citoid eventgate-analytics eventgate-main mathoid namespace - T228836 |
[production] |
11:20 |
<fsero@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'termbox' for release 'production' . |
[production] |
11:18 |
<raynor> |
EU SWAT finished |
[production] |
11:15 |
<pmiazga@deploy1001> |
Synchronized wmf-config/InitialiseSettings.php: SWAT: [[gerrit:529925|Undeploy editor gender surveys (T227793)]] (duration: 00m 48s) |
[production] |
11:13 |
<fsero> |
recreating termbox namespace - T228836 |
[production] |
11:06 |
<oblivian@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'zotero' for release 'production' . |
[production] |
11:04 |
<fsero> |
resetting net.netfilter.nf_conntrack_tcp_timeout_time_wait to 65 in kubernetes2006 |
[production] |
10:59 |
<_joe_> |
[eqiad] downtiming zotero on icinga for 10 minutes while recreating the deployment with helmfile |
[production] |
10:57 |
<oblivian@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
10:57 |
<oblivian@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
10:56 |
<oblivian@cumin1001> |
END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) |
[production] |
10:56 |
<oblivian@cumin1001> |
START - Cookbook sre.hosts.downtime |
[production] |
10:49 |
<oblivian@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'coredns' . |
[production] |
10:44 |
<oblivian@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'coredns' . |
[production] |
10:39 |
<oblivian@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' . |
[production] |
10:39 |
<_joe_> |
recreating rbac roles via helmfile |
[production] |
10:32 |
<oblivian@> |
helmfile [EQIAD] Ran 'apply' command on namespace 'kube-system' for release 'calico-policy-controller' . |
[production] |
10:29 |
<_joe_> |
deleting calico deploy and configmap in kubernetes in eqiad, recreating with helmfile |
[production] |
10:25 |
<jbond42> |
rolling update of ghostscript |
[production] |
10:23 |
<fsero@puppetmaster1001> |
conftool action : set/pooled=false; selector: dnsdisc=sessionstore|citoid|cxserver|eventgate-analytics|eventgate-main|termbox|blubberoid|mathoid|zotero,name=eqiad |
[production] |
10:10 |
<fsero> |
initialize_cluster.sh kube-system kubemaster.svc.eqiad.wmnet 6443 - T228836 |
[production] |
10:10 |
<fsero> |
creating tiller in kube-system for helmfile T228836 |
[production] |
09:58 |
<vgutierrez> |
upgrading the rest of cache@upload to 8.0.3-1wm3 - T221594 |
[production] |
08:49 |
<marostegui> |
Stop MySQL on db2057 - T230394 |
[production] |
08:48 |
<marostegui> |
Remove db2057 from tendril and zarcillo T230394 |
[production] |
07:55 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-eqiad.php: Remove db2057 from config T230394 (duration: 00m 47s) |
[production] |
07:54 |
<marostegui@deploy1001> |
Synchronized wmf-config/db-codfw.php: Remove db2057 from config T230394 (duration: 00m 48s) |
[production] |