2021-09-24
§
|
15:52 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:46 |
<elukey@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:23 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:17 |
<elukey@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:09 |
<elukey> |
sudo cumin -m async -b2 "c:profile::analytics::cluster::hdfs_mount" "umount /mnt/hdfs" "mount /mnt/hdfs" - T288625 |
[production] |
14:32 |
<bd808@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' . |
[production] |
14:07 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
14:03 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
13:31 |
<Amir1> |
start of rebuilding metadata of images in commons to make them use json |
[production] |
13:24 |
<elukey@cumin1001> |
START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
11:58 |
<effie> |
upgrading scap on canaries - T291095 |
[production] |
11:39 |
<jiji@cumin1001> |
conftool action : set/pooled=true; selector: dnsdisc=tegola-vector-tiles |
[production] |
11:32 |
<effie> |
uploading scap-4.0.0 to buster-wikimedia and stretch-wikimedia |
[production] |
11:17 |
<effie> |
restart pybal in low traffic load balancers |
[production] |
10:44 |
<jynus> |
corrupting and fixing image metadata on testwiki before running script on commons T290462 |
[production] |
10:16 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
10:11 |
<btullis@cumin1001> |
END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001 |
[production] |
09:39 |
<jynus> |
upgrade and restart db2099 |
[production] |
09:32 |
<btullis@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001 |
[production] |
09:29 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001 |
[production] |
09:25 |
<marostegui> |
Rename flaggedimages on db1096(ruwiki) and db1098(arwiki) T290340 |
[production] |
09:25 |
<elukey@cumin1001> |
START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
09:09 |
<jynus> |
upgrade and restart db2139, db2101 |
[production] |
09:03 |
<btullis@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001 |
[production] |
08:35 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
08:22 |
<jynus> |
upgrade and restart db2098 T290868 |
[production] |
08:20 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
08:08 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx2002.wikimedia.org |
[production] |
07:59 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.decommission for hosts mx2002.wikimedia.org |
[production] |
07:42 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx1002.wikimedia.org |
[production] |
07:34 |
<elukey@cumin1001> |
START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
07:17 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
07:11 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.decommission for hosts mx1002.wikimedia.org |
[production] |
07:01 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
07:01 |
<elukey@cumin1001> |
END (ERROR) - Cookbook sre.hadoop.roll-restart-workers (exit_code=97) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
07:00 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
06:55 |
<elukey@cumin1001> |
START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
06:53 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid test cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
06:44 |
<elukey@cumin1001> |
START - Cookbook sre.druid.roll-restart-workers for Druid test cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
06:41 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons. - elukey@cumin1001 |
[production] |
06:30 |
<elukey@cumin1001> |
START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons. - elukey@cumin1001 |
[production] |
06:26 |
<elukey> |
restart archiva on archiva1002 to pick up new openjdk upgrades |
[production] |
06:11 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: After fixing some indexes T291584', diff saved to https://phabricator.wikimedia.org/P17324 and previous config saved to /var/cache/conftool/dbconfig/20210924-061105-root.json |
[production] |
05:56 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: After fixing some indexes T291584', diff saved to https://phabricator.wikimedia.org/P17323 and previous config saved to /var/cache/conftool/dbconfig/20210924-055601-root.json |
[production] |
05:40 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: After fixing some indexes T291584', diff saved to https://phabricator.wikimedia.org/P17322 and previous config saved to /var/cache/conftool/dbconfig/20210924-054057-root.json |
[production] |
05:25 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: After fixing some indexes T291584', diff saved to https://phabricator.wikimedia.org/P17321 and previous config saved to /var/cache/conftool/dbconfig/20210924-052554-root.json |
[production] |
05:10 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'db1177 (re)pooling @ 10%: After fixing some indexes T291584', diff saved to https://phabricator.wikimedia.org/P17320 and previous config saved to /var/cache/conftool/dbconfig/20210924-051050-root.json |
[production] |
05:07 |
<marostegui@cumin1001> |
dbctl commit (dc=all): 'Depool db1177 T291584', diff saved to https://phabricator.wikimedia.org/P17319 and previous config saved to /var/cache/conftool/dbconfig/20210924-050739-marostegui.json |
[production] |
01:27 |
<mwdebug-deploy@deploy1002> |
helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |
01:23 |
<mwdebug-deploy@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' . |
[production] |