2021-09-24
§
|
17:02 |
<elukey@cumin1001> |
START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
16:35 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
15:59 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:53 |
<elukey@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:52 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:46 |
<elukey@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-public cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:41 |
<dpifke> |
Cherry-picking https://gerrit.wikimedia.org/r/c/performance/coal/+/722948 and latest https://gerrit.wikimedia.org/r/c/operations/puppet/+/721047 in deployment-prep. Should only affect deployment-webperf11. |
[releng] |
15:23 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0) for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:17 |
<elukey@cumin1001> |
START - Cookbook sre.zookeeper.roll-restart-zookeeper for Zookeeper A:zookeeper-druid-analytics cluster: Roll restart of jvm daemons. - elukey@cumin1001 |
[production] |
15:09 |
<elukey> |
sudo cumin -m async -b2 "c:profile::analytics::cluster::hdfs_mount" "umount /mnt/hdfs" "mount /mnt/hdfs" - T288625 |
[production] |
15:06 |
<btullis> |
btullis@cumin1001:~$ sudo cumin --mode async 'aqs100[4,7].eqiad.wmnet' 'nodetool-a snapshot -t T291469' 'nodetool-b snapshot -t T291469' |
[analytics] |
14:47 |
<btullis> |
btullis@aqs1007:~$ sudo nodetool-a repair --full local_group_default_T_mediarequest_per_file data |
[analytics] |
14:46 |
<dcaro> |
Created new project (T290768) |
[wikiwho] |
14:32 |
<bd808@deploy1002> |
helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' . |
[production] |
14:21 |
<dcaro> |
Created new project (T290098) |
[fr-tech-dev] |
14:07 |
<cmjohnson@cumin1001> |
END (PASS) - Cookbook sre.dns.netbox (exit_code=0) |
[production] |
14:03 |
<cmjohnson@cumin1001> |
START - Cookbook sre.dns.netbox |
[production] |
13:31 |
<Amir1> |
start of rebuilding metadata of images in commons to make them use json |
[production] |
13:24 |
<elukey@cumin1001> |
START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-jumbo-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
13:02 |
<arturo> |
[codfw1dev] create VM manila-share-controller-01 on cloudinfra-codfw1dev |
[admin] |
13:00 |
<arturo> |
[codfw1dev] rebase labs/private.git on cloudinfra-puppetmaster-01, had merge conflict |
[admin] |
11:58 |
<effie> |
upgrading scap on canaries - T291095 |
[production] |
11:39 |
<jiji@cumin1001> |
conftool action : set/pooled=true; selector: dnsdisc=tegola-vector-tiles |
[production] |
11:32 |
<effie> |
uploading scap-4.0.0 to buster-wikimedia and stretch-wikimedia |
[production] |
11:17 |
<effie> |
restart pybal in low traffic load balancers |
[production] |
11:02 |
<btullis> |
btullis@an-master1001:~$ sudo systemctl restart hadoop-mapreduce-historyserver |
[analytics] |
10:47 |
<btullis> |
btullis@an-master1002:~$ sudo systemctl restart hadoop-hdfs-namenode |
[analytics] |
10:47 |
<btullis> |
btullis@an-master1002:~$ sudo systemctl restart hadoop-hdfs-zkfc |
[analytics] |
10:44 |
<jynus> |
corrupting and fixing image metadata on testwiki before running script on commons T290462 |
[production] |
10:35 |
<btullis> |
btullis@an-master1001:~$ sudo -u hdfs kerberos-run-command hdfs /usr/bin/hdfs haadmin -failover an-master1002-eqiad-wmnet an-master1001-eqiad-wmnet |
[analytics] |
10:16 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid public cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
10:11 |
<btullis@cumin1001> |
END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99) restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001 |
[production] |
10:07 |
<btullis> |
btullis@an-launcher1002:~$ sudo -u analytics kerberos-run-command analytics /usr/local/bin/refine_eventlogging_legacy --ignore_failure_flag=true --table_include_regex='centralnoticeimpression' --since='2021-09-23T04:00:00.000Z' --until='2021-09-24T05:00:00.000Z' |
[analytics] |
09:39 |
<jynus> |
upgrade and restart db2099 |
[production] |
09:32 |
<btullis@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001 |
[production] |
09:29 |
<btullis@cumin1001> |
END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001 |
[production] |
09:25 |
<marostegui> |
Rename flaggedimages on db1096(ruwiki) and db1098(arwiki) T290340 |
[production] |
09:25 |
<elukey@cumin1001> |
START - Cookbook sre.druid.roll-restart-workers for Druid public cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
09:09 |
<jynus> |
upgrade and restart db2139, db2101 |
[production] |
09:03 |
<btullis@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001 |
[production] |
08:35 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-test-eqiad cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
08:22 |
<jynus> |
upgrade and restart db2098 T290868 |
[production] |
08:20 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.druid.roll-restart-workers (exit_code=0) for Druid analytics cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
08:08 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx2002.wikimedia.org |
[production] |
07:59 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.decommission for hosts mx2002.wikimedia.org |
[production] |
07:42 |
<jmm@cumin2002> |
END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mx1002.wikimedia.org |
[production] |
07:34 |
<elukey@cumin1001> |
START - Cookbook sre.druid.roll-restart-workers for Druid analytics cluster: Roll restart of Druid jvm daemons. - elukey@cumin1001 |
[production] |
07:17 |
<elukey@cumin1001> |
END (PASS) - Cookbook sre.hadoop.roll-restart-workers (exit_code=0) restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |
07:11 |
<jmm@cumin2002> |
START - Cookbook sre.hosts.decommission for hosts mx1002.wikimedia.org |
[production] |
07:01 |
<elukey@cumin1001> |
START - Cookbook sre.hadoop.roll-restart-workers restart workers for Hadoop test cluster: Roll restart of jvm daemons for openjdk upgrade. - elukey@cumin1001 |
[production] |